[Rcpp-devel] Question concerning use of Raster in C vi Rcpp plugin and inline

Romain Francois romain at r-enthusiasts.com
Wed Nov 17 10:55:38 CET 2010


Le 17/11/10 10:44, Rainer M Krug a écrit :
> -----BEGIN PGP SIGNED MESSAGE-----
> Hash: SHA1
>
> On 11/17/2010 10:26 AM, Romain Francois wrote:
>> Le 17/11/10 10:17, Rainer M Krug a écrit :
>>> On 11/16/2010 05:49 PM, Douglas Bates wrote:
>>> Hi Douglas,
>>>
>>>> I enclose a rewrite which compiles and executes without changing the
>>>> arguments.
>>>
>>> That looks much better and "C like" - thanks a lot.
>>>
>>>> I am not entirely sure that it is doing the correct
>>>> calculation.  I tried to compare the results from calls like
>>>>
>>>> print(range(sd2D))
>>>> set.seed(1234)
>>>> print(system.time(dispSeedsC<- disperse(seeds.m, sd2D,
>>>> dispRparForSugarOrg)))
>>>> set.seed(1234)
>>>> print(system.time(dispSeedsR<- disperse(seeds.m, sd2D, dispRparFor)))
>>>> print(range(sd2D))
>>>>
>>>> and all.equal(dispSeedsC, dispSeedsR) fails, so I may have introduced
>>>> errors.
>>>
>>> I will look at that - and I am optimistic that I will sort that out now.
>>>
>>>> I am still somewhat confused by the logic.
>>>
>>> What I want to calculate is the dispersal of seeds into each cell from
>>> the neighboring cells, defined in sd2D. sd2D is a probability density
>>> function, and seeds.m is the initial seed distribution of to be
>>> dispersed seeds.
>>>
>>> Well - the basic idea is of a mowing-window calculation over seeds.m,
>>> where the window is defined by sd2D, and the calculation is based on the
>>> rbinom numbers drawn from all cells around a particular cell (x,y),
>>> covered y sd2D; the prob of rbinom is the value of sd2D, the size is the
>>> value of the underlying cell in seeds.m, n is 1. All numbers drawn are
>>> added up and stored in the cell (x,y). And this is repeated for all
>>> cells.
>>>
>>> Now, I need the values for all cells, even those where sd2D does not
>>> completely overlap with seeds.m, and this is why I buffer seeds.m with
>>> NAs to get seeds, on which the calculations are now performed.
>>>
>>> Hope this clarifies my problem, and my somehow strange approach (I am
>>> thinking of skipping the buffering and to check for overlap in the loop
>>> itself).
>>>
>>> A procedural question:
>>>
>>> You attached below a file named disp.cpp, which indicates to me that
>>> your c code is in a separate file - is this the case? And if yes, how
>>> can I do that with the inline command? Or do I have to compile
>>> externally and use Rcpp directly?
>>
>> Ultimately, making a package is the best action.
>
> True - But I haven't looked at that option yet, as I was happy with
> sourcing the code, but I will for my next project.
>
>>
>> Otherwise, I guess you could do something like this:
>>
>> f<- cxxfunction( signature(....),
>>      paste( readLines( "disp.cpp" ), collapse = "\n" ),
>>          plugin = "Rcpp" )
>
> Sounds good - would there be an option of including an alternative
> argument to "body", i.e. "file", of which exactly one has to be supplied?

Perhaps. Feel free to submit a patch to the maintainer of inline.

>> I'll look at the rest of the thread and try to provide some help later.
>
> Thanks - I'll work on it today again, and post (hopefully) some progress
> later today.
>
> Rainer
>
>>
>> Romain
>>
>>> Cheers and thanks a lot for your help,
>>>
>>> Rainer
>>>
>>>>
>>>> On Tue, Nov 16, 2010 at 10:15 AM, Douglas
>>>> Bates<bates-GX8I/T4BApV4piUD7e9S/g at public.gmane.org>   wrote:
>>>>> On Tue, Nov 16, 2010 at 9:43 AM, Rainer M
>>>>> Krug<Rainer-vfylz/Ys1k4 at public.gmane.org>   wrote:
>>>>>> On 11/16/2010 03:16 PM, Douglas Bates wrote:
>>>>>>> On Tue, Nov 16, 2010 at 4:31 AM, Rainer M
>>>>>>> Krug<r.m.krug-Re5JQEeQqe8AvxtiuMwx3w-XMD5yJDbdMReXY1tMh2IBg at public.gmane.org>
>>>>>>> wrote:
>>>>>>> Hi
>>>>>>>
>>>>>>> I am new to C, and my question might be basic, but I am struggling
>>>>>>> with it:
>>>>>>>
>>>>>>> I have about 10 lines of code in a simulation, which take up about
>>>>>>> 50%
>>>>>>> of the whole simulation, As the simulation, takes several hours to
>>>>>>> run,
>>>>>>> I want to make the code faster. After trying different approaches
>>>>>>> in R,
>>>>>>> with different degrees of success, I decided to try out inline and
>>>>>>> Rcpp
>>>>>>> for this.
>>>>>>>
>>>>>>> I did some progress (using sugar), and the net-result is so far,
>>>>>>> that it
>>>>>>> is as fast / slow as R; but as I said, this is a first step and I am
>>>>>>> happy so far. My problem now is, in the R code, I have the following
>>>>>>> (the complete code is below):
>>>>>>>
>>>>>>> s<- seeds[x:(x+dx2), y:(y+dy2)]
>>>>>>> dispSeeds[x,y]<- sum(
>>>>>>>                                  rbinom(
>>>>>>>                                         s,
>>>>>>>                                         s,
>>>>>>>                                         sd2D
>>>>>>>                                         ),
>>>>>>>                                  na.rm = TRUE
>>>>>>>                                  )
>>>>>>>
>>>>>>>> I think it unlikely that you will be able to write C/C++ code that
>>>>>>>> runs much faster than that expression in R.  The sum and rbinom
>>>>>>>> functions are vectorized and other than a small interpretation
>>>>>>>> overhead not much can be saved by going to compiled code.
>>>>>>
>>>>>> The main reason why I am trying to use C are the loops around the code
>>>>>> above. See attached code example.
>>>>>>
>>>>>>>
>>>>>>> As the rbinom part is only using the vector representation of s, I
>>>>>>> have
>>>>>>> already translated it into C.
>>>>>>>
>>>>>>> So my problem / question is: can I use a similar syntax for s in
>>>>>>> inline
>>>>>>> using sugar?
>>>>>>>
>>>>>>> As the code above sits in two large loops, I would like to
>>>>>>> translate the
>>>>>>> whole loops into C.
>>>>>>>
>>>>>>> Below the code (R and R / inline):
>>>>>>>
>>>>>>> maxx, maxy, dx, dy: scalar integers
>>>>>>> seeds: integer matrix
>>>>>>> dispSeeds: integer matrix
>>>>>>>
>>>>>>> The pure R code:
>>>>>>>
>>>>>>> #+BEGIN_SRC R
>>>>>>>    dispRparFor<- function(
>>>>>>>                            maxx  = maxx,
>>>>>>>                            maxy  = maxy,
>>>>>>>                            seeds = seeds,
>>>>>>>                            sd2D  = sd2D,
>>>>>>>                            dx    = dx,
>>>>>>>                            dy    = dy,
>>>>>>>                            dispSeeds = dispSeeds # the return value
>>>>>>>                            ) {
>>>>>>>      dx2<- 2*dx
>>>>>>>      dy2<- 2*dy
>>>>>>>      for ( y in 1:maxy ) {
>>>>>>>        cat(y, " ")
>>>>>>>        for (x in 1:maxx) {
>>>>>>>          s<- seeds[x:(x+dx2), y:(y+dy2)]
>>>>>>>          if (all(is.na(s))) {
>>>>>>>            dispSeeds[x,y]<- NA
>>>>>>>          } else {
>>>>>>>            dispSeeds[x,y]<- sum(
>>>>>>>                                  rbinom(
>>>>>>>                                         s,
>>>>>>>                                         s,
>>>>>>>                                         sd2D
>>>>>>>                                         ),
>>>>>>>                                  na.rm = TRUE
>>>>>>>                                  )
>>>>>>>          }
>>>>>>>        }
>>>>>>>      }
>>>>>>>      return(dispSeeds)
>>>>>>>    }
>>>>>>> #+END_SRC
>>>>>>>
>>>>>>>
>>>>>>> How far I got with using inline and Rcpp / sugar:
>>>>>>>
>>>>>>>
>>>>>>> #+BEGIN_SRC R
>>>>>>>    dispRparForSugar<- function(
>>>>>>>                                 maxx  = maxx,
>>>>>>>                                 maxy  = maxy,
>>>>>>>                                 seeds = seeds,
>>>>>>>                                 sd2D  = sd2D,
>>>>>>>                                 dx    = dx,
>>>>>>>                                 dy    = dy,
>>>>>>>                                 dispSeeds = dispSeeds # the return
>>>>>>> value
>>>>>>>                                 ) {
>>>>>>>      library(inline)
>>>>>>>      library(Rcpp)
>>>>>>>      dx2<- 2*dx
>>>>>>>      dy2<- 2*dy
>>>>>>>
>>>>>>>
>>>>>>>      fx2<- cxxfunction(
>>>>>>>                         sig = signature(N = "integer", SIZE =
>>>>>>> "integer",
>>>>>>> PROB = "double"),
>>>>>>>                         body = '
>>>>>>>                                     Rcpp::IntegerVector n (N);
>>>>>>>                                     Rcpp::IntegerVector size (SIZE);
>>>>>>>                                     Rcpp::NumericVector prob (PROB);
>>>>>>>
>>>>>>>                                     int res;
>>>>>>>                                     res = 0;
>>>>>>>
>>>>>>>                                     for( int i=0; i<n.size(); i++){
>>>>>>>                                          if (size[i]>0) {
>>>>>>>                                            res += rbinom( 1, size[i],
>>>>>>> prob[i] )[0];
>>>>>>>                                          }
>>>>>>>                                     }
>>>>>>>
>>>>>>>                                     return wrap( res );
>>>>>>>                                   ',
>>>>>>>                         plugin = "Rcpp",
>>>>>>>                         verbose = TRUE
>>>>>>>                         )
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>      for ( y in 1:maxy ) {
>>>>>>>        cat(y, " ")
>>>>>>>        for (x in 1:maxx) {
>>>>>>>          s<- seeds[x:(x+dx2), y:(y+dy2)]
>>>>>>>          if (all(is.na(s))) {
>>>>>>>            dispSeeds[x,y]<- NA
>>>>>>>          } else {
>>>>>>>            dispSeeds[x,y]<- fx2(s, s, sd2D)
>>>>>>>          }
>>>>>>>        }
>>>>>>>      }
>>>>>>>      return(dispSeeds)
>>>>>>>    }
>>>>>>> #+END_SRC
>>>>>>>
>>>>>>>> It is peculiar to call cxxfunction within another function if the
>>>>>>>> code
>>>>>>>> being compiled is fixed.  cxxfunction is a function generator and
>>>>>>>> one
>>>>>>>> usually regards a call to cxxfunction as the equivalent of
>>>>>>>> entering a
>>>>>>>> function definition.
>>>>>>
>>>>>> I changed that - yes, it was peculiar but based on my experimentation.
>>>>>>
>>>>>>>
>>>>>>>> It doesn't appear that you are using n in your C++ code (other than
>>>>>>>> taking its length) and your C++ code seems to be evaluating
>>>>>>>> sum(rbinom(1, s, sd2D)) rather than sum(rbinom(s, s, sd2D)).  Is
>>>>>>>> that
>>>>>>>> intentional?
>>>>>>
>>>>>> Left over - changed as well. Thanks.
>>>>>>
>>>>>>>
>>>>>>>> And if your fx2 is always to be called as fx2(s, s, sd2D) then
>>>>>>>> why not
>>>>>>>> just pass s and sd2D?
>>>>>>
>>>>>> Done (partly).
>>>>>>
>>>>>>
>>>>>> To add some more strange things:
>>>>>>
>>>>>> I attach the code in a working example as an R script file which will
>>>>>> show something strange (for me):
>>>>>>
>>>>>> the function overwrites a variable, namely sd2D. Am I doing something
>>>>>> wrong or do I underestimate the dangers of using Rcpp and inline and C
>>>>>> in general?
>>>>>
>>>>> Yes, it will.  You are assigning the NumericVector's s and sd2D to use
>>>>> the storage of the SD2D argument so when you assign s[indS] in the
>>>>> first loop you are overwriting the contents of SD2D.
>>>>>
>>>>> I am testing a rewrite now.
>>>>>
>>>>>> In addition, the code is not doing what it is supposed to be doing,
>>>>>> but
>>>>>> that is a different question at the moment.
>>>>>>
>>>>>> Cheers,
>>>>>>
>>>>>> Rainer
>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> Ultimately, I would like to have the complete dispRparFor function
>>>>>>> in C.
>>>>>>>
>>>>>>>
>>>>>>> For the ones interested, I could send a working example with data.
>>>>>>>
>>>>>>> Cheers,
>>>>>>>
>>>>>>> Rainer
>>>>>>>
>>>>>> _______________________________________________
>>>>>> Rcpp-devel mailing list
>>>>>> Rcpp-devel-Z+qqJ2/841dDXCDGMXqaGq2UG9VpUWMKQH7oEaQurus-XMD5yJDbdMReXY1tMh2IBg at public.gmane.org
>>>>>>
>>>>>> https://lists.r-forge.r-project.org/cgi-bin/mailman/listinfo/rcpp-devel
>>>>>>
>>>>>>>>
>>>>>>
>>>>>>
>>>>>> --
>>>>>> Rainer M. Krug, PhD (Conservation Ecology, SUN), MSc (Conservation
>>>>>> Biology, UCT), Dipl. Phys. (Germany)
>>>>>>
>>>>>> Centre of Excellence for Invasion Biology
>>>>>> Natural Sciences Building
>>>>>> Office Suite 2039
>>>>>> Stellenbosch University
>>>>>> Main Campus, Merriman Avenue
>>>>>> Stellenbosch
>>>>>> South Africa
>>>>>>
>>>>>> Tel:        +33 - (0)9 53 10 27 44
>>>>>> Cell:       +27 - (0)8 39 47 90 42
>>>>>> Fax (SA):   +27 - (0)8 65 16 27 82
>>>>>> Fax (D) :   +49 - (0)3 21 21 25 22 44
>>>>>> Fax (FR):   +33 - (0)9 58 10 27 44
>>>>>> email:      Rainer-vfylz/Ys1k4 at public.gmane.org
>>>>>>
>>>>>> Skype:      RMkrug
>>>>>>
>>>>>> _______________________________________________
>>>>>> Rcpp-devel mailing list
>>>>>> Rcpp-devel-Z+qqJ2/841dDXCDGMXqaGq2UG9VpUWMKQH7oEaQurus at public.gmane.org
>>>>>>
>>>>>> https://lists.r-forge.r-project.org/cgi-bin/mailman/listinfo/rcpp-devel
>>>>>>
>>>>>>
>>>>>>
>>>>>
>>>>>
>>>>>
>>>>> _______________________________________________
>>>>> Rcpp-devel mailing list
>>>>> Rcpp-devel-Z+qqJ2/841dDXCDGMXqaGq2UG9VpUWMKQH7oEaQurus at public.gmane.org
>>>>> https://lists.r-forge.r-project.org/cgi-bin/mailman/listinfo/rcpp-devel
>>>
>>> _______________________________________________
>>> Rcpp-devel mailing list
>>> Rcpp-devel at lists.r-forge.r-project.org
>>> https://lists.r-forge.r-project.org/cgi-bin/mailman/listinfo/rcpp-devel
>>>
>>
>>
>
>
> - --
> Rainer M. Krug, PhD (Conservation Ecology, SUN), MSc (Conservation
> Biology, UCT), Dipl. Phys. (Germany)
>
> Centre of Excellence for Invasion Biology
> Natural Sciences Building
> Office Suite 2039
> Stellenbosch University
> Main Campus, Merriman Avenue
> Stellenbosch
> South Africa
>
> Tel:        +33 - (0)9 53 10 27 44
> Cell:       +27 - (0)8 39 47 90 42
> Fax (SA):   +27 - (0)8 65 16 27 82
> Fax (D) :   +49 - (0)3 21 21 25 22 44
> Fax (FR):   +33 - (0)9 58 10 27 44
> email:      Rainer at krugs.de
>
> Skype:      RMkrug
> -----BEGIN PGP SIGNATURE-----
> Version: GnuPG v1.4.10 (GNU/Linux)
> Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org/
>
> iEYEARECAAYFAkzjpBYACgkQoYgNqgF2egrPFQCcDhoEaLbiV87gQLV+i21BSsbO
> cSoAoIsUsUx6CCLZBIYVZYnzY2ry9TxC
> =mnzJ
> -----END PGP SIGNATURE-----
>


-- 
Romain Francois
Professional R Enthusiast
+33(0) 6 28 91 30 30
http://romainfrancois.blog.free.fr
|- http://bit.ly/9VOd3l : ZAT! 2010
|- http://bit.ly/c6DzuX : Impressionnism with R
`- http://bit.ly/czHPM7 : Rcpp Google tech talk on youtube




More information about the Rcpp-devel mailing list