[Rcpp-devel] Question concerning use of Raster in C vi Rcpp plugin and inline

Rainer M Krug r.m.krug at gmail.com
Wed Nov 17 10:44:54 CET 2010


-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

On 11/17/2010 10:26 AM, Romain Francois wrote:
> Le 17/11/10 10:17, Rainer M Krug a écrit :
>> On 11/16/2010 05:49 PM, Douglas Bates wrote:
>> Hi Douglas,
>>
>>> I enclose a rewrite which compiles and executes without changing the
>>> arguments.
>>
>> That looks much better and "C like" - thanks a lot.
>>
>>> I am not entirely sure that it is doing the correct
>>> calculation.  I tried to compare the results from calls like
>>>
>>> print(range(sd2D))
>>> set.seed(1234)
>>> print(system.time(dispSeedsC<- disperse(seeds.m, sd2D,
>>> dispRparForSugarOrg)))
>>> set.seed(1234)
>>> print(system.time(dispSeedsR<- disperse(seeds.m, sd2D, dispRparFor)))
>>> print(range(sd2D))
>>>
>>> and all.equal(dispSeedsC, dispSeedsR) fails, so I may have introduced
>>> errors.
>>
>> I will look at that - and I am optimistic that I will sort that out now.
>>
>>> I am still somewhat confused by the logic.
>>
>> What I want to calculate is the dispersal of seeds into each cell from
>> the neighboring cells, defined in sd2D. sd2D is a probability density
>> function, and seeds.m is the initial seed distribution of to be
>> dispersed seeds.
>>
>> Well - the basic idea is of a mowing-window calculation over seeds.m,
>> where the window is defined by sd2D, and the calculation is based on the
>> rbinom numbers drawn from all cells around a particular cell (x,y),
>> covered y sd2D; the prob of rbinom is the value of sd2D, the size is the
>> value of the underlying cell in seeds.m, n is 1. All numbers drawn are
>> added up and stored in the cell (x,y). And this is repeated for all
>> cells.
>>
>> Now, I need the values for all cells, even those where sd2D does not
>> completely overlap with seeds.m, and this is why I buffer seeds.m with
>> NAs to get seeds, on which the calculations are now performed.
>>
>> Hope this clarifies my problem, and my somehow strange approach (I am
>> thinking of skipping the buffering and to check for overlap in the loop
>> itself).
>>
>> A procedural question:
>>
>> You attached below a file named disp.cpp, which indicates to me that
>> your c code is in a separate file - is this the case? And if yes, how
>> can I do that with the inline command? Or do I have to compile
>> externally and use Rcpp directly?
> 
> Ultimately, making a package is the best action.

True - But I haven't looked at that option yet, as I was happy with
sourcing the code, but I will for my next project.

> 
> Otherwise, I guess you could do something like this:
> 
> f <- cxxfunction( signature(....),
>     paste( readLines( "disp.cpp" ), collapse = "\n" ),
>         plugin = "Rcpp" )

Sounds good - would there be an option of including an alternative
argument to "body", i.e. "file", of which exactly one has to be supplied?

> 
> 
> I'll look at the rest of the thread and try to provide some help later.

Thanks - I'll work on it today again, and post (hopefully) some progress
later today.

Rainer

> 
> Romain
> 
>> Cheers and thanks a lot for your help,
>>
>> Rainer
>>
>>>
>>> On Tue, Nov 16, 2010 at 10:15 AM, Douglas
>>> Bates<bates-GX8I/T4BApV4piUD7e9S/g at public.gmane.org>  wrote:
>>>> On Tue, Nov 16, 2010 at 9:43 AM, Rainer M
>>>> Krug<Rainer-vfylz/Ys1k4 at public.gmane.org>  wrote:
>>>>> On 11/16/2010 03:16 PM, Douglas Bates wrote:
>>>>>> On Tue, Nov 16, 2010 at 4:31 AM, Rainer M
>>>>>> Krug<r.m.krug-Re5JQEeQqe8AvxtiuMwx3w-XMD5yJDbdMReXY1tMh2IBg at public.gmane.org> 
>>>>>> wrote:
>>>>>> Hi
>>>>>>
>>>>>> I am new to C, and my question might be basic, but I am struggling
>>>>>> with it:
>>>>>>
>>>>>> I have about 10 lines of code in a simulation, which take up about
>>>>>> 50%
>>>>>> of the whole simulation, As the simulation, takes several hours to
>>>>>> run,
>>>>>> I want to make the code faster. After trying different approaches
>>>>>> in R,
>>>>>> with different degrees of success, I decided to try out inline and
>>>>>> Rcpp
>>>>>> for this.
>>>>>>
>>>>>> I did some progress (using sugar), and the net-result is so far,
>>>>>> that it
>>>>>> is as fast / slow as R; but as I said, this is a first step and I am
>>>>>> happy so far. My problem now is, in the R code, I have the following
>>>>>> (the complete code is below):
>>>>>>
>>>>>> s<- seeds[x:(x+dx2), y:(y+dy2)]
>>>>>> dispSeeds[x,y]<- sum(
>>>>>>                                 rbinom(
>>>>>>                                        s,
>>>>>>                                        s,
>>>>>>                                        sd2D
>>>>>>                                        ),
>>>>>>                                 na.rm = TRUE
>>>>>>                                 )
>>>>>>
>>>>>>> I think it unlikely that you will be able to write C/C++ code that
>>>>>>> runs much faster than that expression in R.  The sum and rbinom
>>>>>>> functions are vectorized and other than a small interpretation
>>>>>>> overhead not much can be saved by going to compiled code.
>>>>>
>>>>> The main reason why I am trying to use C are the loops around the code
>>>>> above. See attached code example.
>>>>>
>>>>>>
>>>>>> As the rbinom part is only using the vector representation of s, I
>>>>>> have
>>>>>> already translated it into C.
>>>>>>
>>>>>> So my problem / question is: can I use a similar syntax for s in
>>>>>> inline
>>>>>> using sugar?
>>>>>>
>>>>>> As the code above sits in two large loops, I would like to
>>>>>> translate the
>>>>>> whole loops into C.
>>>>>>
>>>>>> Below the code (R and R / inline):
>>>>>>
>>>>>> maxx, maxy, dx, dy: scalar integers
>>>>>> seeds: integer matrix
>>>>>> dispSeeds: integer matrix
>>>>>>
>>>>>> The pure R code:
>>>>>>
>>>>>> #+BEGIN_SRC R
>>>>>>   dispRparFor<- function(
>>>>>>                           maxx  = maxx,
>>>>>>                           maxy  = maxy,
>>>>>>                           seeds = seeds,
>>>>>>                           sd2D  = sd2D,
>>>>>>                           dx    = dx,
>>>>>>                           dy    = dy,
>>>>>>                           dispSeeds = dispSeeds # the return value
>>>>>>                           ) {
>>>>>>     dx2<- 2*dx
>>>>>>     dy2<- 2*dy
>>>>>>     for ( y in 1:maxy ) {
>>>>>>       cat(y, " ")
>>>>>>       for (x in 1:maxx) {
>>>>>>         s<- seeds[x:(x+dx2), y:(y+dy2)]
>>>>>>         if (all(is.na(s))) {
>>>>>>           dispSeeds[x,y]<- NA
>>>>>>         } else {
>>>>>>           dispSeeds[x,y]<- sum(
>>>>>>                                 rbinom(
>>>>>>                                        s,
>>>>>>                                        s,
>>>>>>                                        sd2D
>>>>>>                                        ),
>>>>>>                                 na.rm = TRUE
>>>>>>                                 )
>>>>>>         }
>>>>>>       }
>>>>>>     }
>>>>>>     return(dispSeeds)
>>>>>>   }
>>>>>> #+END_SRC
>>>>>>
>>>>>>
>>>>>> How far I got with using inline and Rcpp / sugar:
>>>>>>
>>>>>>
>>>>>> #+BEGIN_SRC R
>>>>>>   dispRparForSugar<- function(
>>>>>>                                maxx  = maxx,
>>>>>>                                maxy  = maxy,
>>>>>>                                seeds = seeds,
>>>>>>                                sd2D  = sd2D,
>>>>>>                                dx    = dx,
>>>>>>                                dy    = dy,
>>>>>>                                dispSeeds = dispSeeds # the return
>>>>>> value
>>>>>>                                ) {
>>>>>>     library(inline)
>>>>>>     library(Rcpp)
>>>>>>     dx2<- 2*dx
>>>>>>     dy2<- 2*dy
>>>>>>
>>>>>>
>>>>>>     fx2<- cxxfunction(
>>>>>>                        sig = signature(N = "integer", SIZE =
>>>>>> "integer",
>>>>>> PROB = "double"),
>>>>>>                        body = '
>>>>>>                                    Rcpp::IntegerVector n (N);
>>>>>>                                    Rcpp::IntegerVector size (SIZE);
>>>>>>                                    Rcpp::NumericVector prob (PROB);
>>>>>>
>>>>>>                                    int res;
>>>>>>                                    res = 0;
>>>>>>
>>>>>>                                    for( int i=0; i<n.size(); i++){
>>>>>>                                         if (size[i]>0) {
>>>>>>                                           res += rbinom( 1, size[i],
>>>>>> prob[i] )[0];
>>>>>>                                         }
>>>>>>                                    }
>>>>>>
>>>>>>                                    return wrap( res );
>>>>>>                                  ',
>>>>>>                        plugin = "Rcpp",
>>>>>>                        verbose = TRUE
>>>>>>                        )
>>>>>>
>>>>>>
>>>>>>
>>>>>>     for ( y in 1:maxy ) {
>>>>>>       cat(y, " ")
>>>>>>       for (x in 1:maxx) {
>>>>>>         s<- seeds[x:(x+dx2), y:(y+dy2)]
>>>>>>         if (all(is.na(s))) {
>>>>>>           dispSeeds[x,y]<- NA
>>>>>>         } else {
>>>>>>           dispSeeds[x,y]<- fx2(s, s, sd2D)
>>>>>>         }
>>>>>>       }
>>>>>>     }
>>>>>>     return(dispSeeds)
>>>>>>   }
>>>>>> #+END_SRC
>>>>>>
>>>>>>> It is peculiar to call cxxfunction within another function if the
>>>>>>> code
>>>>>>> being compiled is fixed.  cxxfunction is a function generator and
>>>>>>> one
>>>>>>> usually regards a call to cxxfunction as the equivalent of
>>>>>>> entering a
>>>>>>> function definition.
>>>>>
>>>>> I changed that - yes, it was peculiar but based on my experimentation.
>>>>>
>>>>>>
>>>>>>> It doesn't appear that you are using n in your C++ code (other than
>>>>>>> taking its length) and your C++ code seems to be evaluating
>>>>>>> sum(rbinom(1, s, sd2D)) rather than sum(rbinom(s, s, sd2D)).  Is
>>>>>>> that
>>>>>>> intentional?
>>>>>
>>>>> Left over - changed as well. Thanks.
>>>>>
>>>>>>
>>>>>>> And if your fx2 is always to be called as fx2(s, s, sd2D) then
>>>>>>> why not
>>>>>>> just pass s and sd2D?
>>>>>
>>>>> Done (partly).
>>>>>
>>>>>
>>>>> To add some more strange things:
>>>>>
>>>>> I attach the code in a working example as an R script file which will
>>>>> show something strange (for me):
>>>>>
>>>>> the function overwrites a variable, namely sd2D. Am I doing something
>>>>> wrong or do I underestimate the dangers of using Rcpp and inline and C
>>>>> in general?
>>>>
>>>> Yes, it will.  You are assigning the NumericVector's s and sd2D to use
>>>> the storage of the SD2D argument so when you assign s[indS] in the
>>>> first loop you are overwriting the contents of SD2D.
>>>>
>>>> I am testing a rewrite now.
>>>>
>>>>> In addition, the code is not doing what it is supposed to be doing,
>>>>> but
>>>>> that is a different question at the moment.
>>>>>
>>>>> Cheers,
>>>>>
>>>>> Rainer
>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>> Ultimately, I would like to have the complete dispRparFor function
>>>>>> in C.
>>>>>>
>>>>>>
>>>>>> For the ones interested, I could send a working example with data.
>>>>>>
>>>>>> Cheers,
>>>>>>
>>>>>> Rainer
>>>>>>
>>>>> _______________________________________________
>>>>> Rcpp-devel mailing list
>>>>> Rcpp-devel-Z+qqJ2/841dDXCDGMXqaGq2UG9VpUWMKQH7oEaQurus-XMD5yJDbdMReXY1tMh2IBg at public.gmane.org
>>>>>
>>>>> https://lists.r-forge.r-project.org/cgi-bin/mailman/listinfo/rcpp-devel
>>>>>
>>>>>>>
>>>>>
>>>>>
>>>>> -- 
>>>>> Rainer M. Krug, PhD (Conservation Ecology, SUN), MSc (Conservation
>>>>> Biology, UCT), Dipl. Phys. (Germany)
>>>>>
>>>>> Centre of Excellence for Invasion Biology
>>>>> Natural Sciences Building
>>>>> Office Suite 2039
>>>>> Stellenbosch University
>>>>> Main Campus, Merriman Avenue
>>>>> Stellenbosch
>>>>> South Africa
>>>>>
>>>>> Tel:        +33 - (0)9 53 10 27 44
>>>>> Cell:       +27 - (0)8 39 47 90 42
>>>>> Fax (SA):   +27 - (0)8 65 16 27 82
>>>>> Fax (D) :   +49 - (0)3 21 21 25 22 44
>>>>> Fax (FR):   +33 - (0)9 58 10 27 44
>>>>> email:      Rainer-vfylz/Ys1k4 at public.gmane.org
>>>>>
>>>>> Skype:      RMkrug
>>>>>
>>>>> _______________________________________________
>>>>> Rcpp-devel mailing list
>>>>> Rcpp-devel-Z+qqJ2/841dDXCDGMXqaGq2UG9VpUWMKQH7oEaQurus at public.gmane.org
>>>>>
>>>>> https://lists.r-forge.r-project.org/cgi-bin/mailman/listinfo/rcpp-devel
>>>>>
>>>>>
>>>>>
>>>>
>>>>
>>>>
>>>> _______________________________________________
>>>> Rcpp-devel mailing list
>>>> Rcpp-devel-Z+qqJ2/841dDXCDGMXqaGq2UG9VpUWMKQH7oEaQurus at public.gmane.org
>>>> https://lists.r-forge.r-project.org/cgi-bin/mailman/listinfo/rcpp-devel
>>
>> _______________________________________________
>> Rcpp-devel mailing list
>> Rcpp-devel at lists.r-forge.r-project.org
>> https://lists.r-forge.r-project.org/cgi-bin/mailman/listinfo/rcpp-devel
>>
> 
> 


- -- 
Rainer M. Krug, PhD (Conservation Ecology, SUN), MSc (Conservation
Biology, UCT), Dipl. Phys. (Germany)

Centre of Excellence for Invasion Biology
Natural Sciences Building
Office Suite 2039
Stellenbosch University
Main Campus, Merriman Avenue
Stellenbosch
South Africa

Tel:        +33 - (0)9 53 10 27 44
Cell:       +27 - (0)8 39 47 90 42
Fax (SA):   +27 - (0)8 65 16 27 82
Fax (D) :   +49 - (0)3 21 21 25 22 44
Fax (FR):   +33 - (0)9 58 10 27 44
email:      Rainer at krugs.de

Skype:      RMkrug
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.10 (GNU/Linux)
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org/

iEYEARECAAYFAkzjpBYACgkQoYgNqgF2egrPFQCcDhoEaLbiV87gQLV+i21BSsbO
cSoAoIsUsUx6CCLZBIYVZYnzY2ry9TxC
=mnzJ
-----END PGP SIGNATURE-----


More information about the Rcpp-devel mailing list