[Rcpp-devel] Question concerning use of Raster in C vi Rcpp plugin and inline

Romain Francois romain at r-enthusiasts.com
Wed Nov 17 10:26:21 CET 2010


Le 17/11/10 10:17, Rainer M Krug a écrit :
> On 11/16/2010 05:49 PM, Douglas Bates wrote:
> Hi Douglas,
>
>> I enclose a rewrite which compiles and executes without changing the
>> arguments.
>
> That looks much better and "C like" - thanks a lot.
>
>> I am not entirely sure that it is doing the correct
>> calculation.  I tried to compare the results from calls like
>>
>> print(range(sd2D))
>> set.seed(1234)
>> print(system.time(dispSeedsC<- disperse(seeds.m, sd2D, dispRparForSugarOrg)))
>> set.seed(1234)
>> print(system.time(dispSeedsR<- disperse(seeds.m, sd2D, dispRparFor)))
>> print(range(sd2D))
>>
>> and all.equal(dispSeedsC, dispSeedsR) fails, so I may have introduced
>> errors.
>
> I will look at that - and I am optimistic that I will sort that out now.
>
>> I am still somewhat confused by the logic.
>
> What I want to calculate is the dispersal of seeds into each cell from
> the neighboring cells, defined in sd2D. sd2D is a probability density
> function, and seeds.m is the initial seed distribution of to be
> dispersed seeds.
>
> Well - the basic idea is of a mowing-window calculation over seeds.m,
> where the window is defined by sd2D, and the calculation is based on the
> rbinom numbers drawn from all cells around a particular cell (x,y),
> covered y sd2D; the prob of rbinom is the value of sd2D, the size is the
> value of the underlying cell in seeds.m, n is 1. All numbers drawn are
> added up and stored in the cell (x,y). And this is repeated for all cells.
>
> Now, I need the values for all cells, even those where sd2D does not
> completely overlap with seeds.m, and this is why I buffer seeds.m with
> NAs to get seeds, on which the calculations are now performed.
>
> Hope this clarifies my problem, and my somehow strange approach (I am
> thinking of skipping the buffering and to check for overlap in the loop
> itself).
>
> A procedural question:
>
> You attached below a file named disp.cpp, which indicates to me that
> your c code is in a separate file - is this the case? And if yes, how
> can I do that with the inline command? Or do I have to compile
> externally and use Rcpp directly?

Ultimately, making a package is the best action.

Otherwise, I guess you could do something like this:

f <- cxxfunction( signature(....),
	paste( readLines( "disp.cpp" ), collapse = "\n" ),
         plugin = "Rcpp" )


I'll look at the rest of the thread and try to provide some help later.

Romain

> Cheers and thanks a lot for your help,
>
> Rainer
>
>>
>> On Tue, Nov 16, 2010 at 10:15 AM, Douglas Bates<bates-GX8I/T4BApV4piUD7e9S/g at public.gmane.org>  wrote:
>>> On Tue, Nov 16, 2010 at 9:43 AM, Rainer M Krug<Rainer-vfylz/Ys1k4 at public.gmane.org>  wrote:
>>>> On 11/16/2010 03:16 PM, Douglas Bates wrote:
>>>>> On Tue, Nov 16, 2010 at 4:31 AM, Rainer M Krug<r.m.krug-Re5JQEeQqe8AvxtiuMwx3w-XMD5yJDbdMReXY1tMh2IBg at public.gmane.org>  wrote:
>>>>> Hi
>>>>>
>>>>> I am new to C, and my question might be basic, but I am struggling with it:
>>>>>
>>>>> I have about 10 lines of code in a simulation, which take up about 50%
>>>>> of the whole simulation, As the simulation, takes several hours to run,
>>>>> I want to make the code faster. After trying different approaches in R,
>>>>> with different degrees of success, I decided to try out inline and Rcpp
>>>>> for this.
>>>>>
>>>>> I did some progress (using sugar), and the net-result is so far, that it
>>>>> is as fast / slow as R; but as I said, this is a first step and I am
>>>>> happy so far. My problem now is, in the R code, I have the following
>>>>> (the complete code is below):
>>>>>
>>>>> s<- seeds[x:(x+dx2), y:(y+dy2)]
>>>>> dispSeeds[x,y]<- sum(
>>>>>                                 rbinom(
>>>>>                                        s,
>>>>>                                        s,
>>>>>                                        sd2D
>>>>>                                        ),
>>>>>                                 na.rm = TRUE
>>>>>                                 )
>>>>>
>>>>>> I think it unlikely that you will be able to write C/C++ code that
>>>>>> runs much faster than that expression in R.  The sum and rbinom
>>>>>> functions are vectorized and other than a small interpretation
>>>>>> overhead not much can be saved by going to compiled code.
>>>>
>>>> The main reason why I am trying to use C are the loops around the code
>>>> above. See attached code example.
>>>>
>>>>>
>>>>> As the rbinom part is only using the vector representation of s, I have
>>>>> already translated it into C.
>>>>>
>>>>> So my problem / question is: can I use a similar syntax for s in inline
>>>>> using sugar?
>>>>>
>>>>> As the code above sits in two large loops, I would like to translate the
>>>>> whole loops into C.
>>>>>
>>>>> Below the code (R and R / inline):
>>>>>
>>>>> maxx, maxy, dx, dy: scalar integers
>>>>> seeds: integer matrix
>>>>> dispSeeds: integer matrix
>>>>>
>>>>> The pure R code:
>>>>>
>>>>> #+BEGIN_SRC R
>>>>>   dispRparFor<- function(
>>>>>                           maxx  = maxx,
>>>>>                           maxy  = maxy,
>>>>>                           seeds = seeds,
>>>>>                           sd2D  = sd2D,
>>>>>                           dx    = dx,
>>>>>                           dy    = dy,
>>>>>                           dispSeeds = dispSeeds # the return value
>>>>>                           ) {
>>>>>     dx2<- 2*dx
>>>>>     dy2<- 2*dy
>>>>>     for ( y in 1:maxy ) {
>>>>>       cat(y, " ")
>>>>>       for (x in 1:maxx) {
>>>>>         s<- seeds[x:(x+dx2), y:(y+dy2)]
>>>>>         if (all(is.na(s))) {
>>>>>           dispSeeds[x,y]<- NA
>>>>>         } else {
>>>>>           dispSeeds[x,y]<- sum(
>>>>>                                 rbinom(
>>>>>                                        s,
>>>>>                                        s,
>>>>>                                        sd2D
>>>>>                                        ),
>>>>>                                 na.rm = TRUE
>>>>>                                 )
>>>>>         }
>>>>>       }
>>>>>     }
>>>>>     return(dispSeeds)
>>>>>   }
>>>>> #+END_SRC
>>>>>
>>>>>
>>>>> How far I got with using inline and Rcpp / sugar:
>>>>>
>>>>>
>>>>> #+BEGIN_SRC R
>>>>>   dispRparForSugar<- function(
>>>>>                                maxx  = maxx,
>>>>>                                maxy  = maxy,
>>>>>                                seeds = seeds,
>>>>>                                sd2D  = sd2D,
>>>>>                                dx    = dx,
>>>>>                                dy    = dy,
>>>>>                                dispSeeds = dispSeeds # the return value
>>>>>                                ) {
>>>>>     library(inline)
>>>>>     library(Rcpp)
>>>>>     dx2<- 2*dx
>>>>>     dy2<- 2*dy
>>>>>
>>>>>
>>>>>     fx2<- cxxfunction(
>>>>>                        sig = signature(N = "integer", SIZE = "integer",
>>>>> PROB = "double"),
>>>>>                        body = '
>>>>>                                    Rcpp::IntegerVector n (N);
>>>>>                                    Rcpp::IntegerVector size (SIZE);
>>>>>                                    Rcpp::NumericVector prob (PROB);
>>>>>
>>>>>                                    int res;
>>>>>                                    res = 0;
>>>>>
>>>>>                                    for( int i=0; i<n.size(); i++){
>>>>>                                         if (size[i]>0) {
>>>>>                                           res += rbinom( 1, size[i],
>>>>> prob[i] )[0];
>>>>>                                         }
>>>>>                                    }
>>>>>
>>>>>                                    return wrap( res );
>>>>>                                  ',
>>>>>                        plugin = "Rcpp",
>>>>>                        verbose = TRUE
>>>>>                        )
>>>>>
>>>>>
>>>>>
>>>>>     for ( y in 1:maxy ) {
>>>>>       cat(y, " ")
>>>>>       for (x in 1:maxx) {
>>>>>         s<- seeds[x:(x+dx2), y:(y+dy2)]
>>>>>         if (all(is.na(s))) {
>>>>>           dispSeeds[x,y]<- NA
>>>>>         } else {
>>>>>           dispSeeds[x,y]<- fx2(s, s, sd2D)
>>>>>         }
>>>>>       }
>>>>>     }
>>>>>     return(dispSeeds)
>>>>>   }
>>>>> #+END_SRC
>>>>>
>>>>>> It is peculiar to call cxxfunction within another function if the code
>>>>>> being compiled is fixed.  cxxfunction is a function generator and one
>>>>>> usually regards a call to cxxfunction as the equivalent of entering a
>>>>>> function definition.
>>>>
>>>> I changed that - yes, it was peculiar but based on my experimentation.
>>>>
>>>>>
>>>>>> It doesn't appear that you are using n in your C++ code (other than
>>>>>> taking its length) and your C++ code seems to be evaluating
>>>>>> sum(rbinom(1, s, sd2D)) rather than sum(rbinom(s, s, sd2D)).  Is that
>>>>>> intentional?
>>>>
>>>> Left over - changed as well. Thanks.
>>>>
>>>>>
>>>>>> And if your fx2 is always to be called as fx2(s, s, sd2D) then why not
>>>>>> just pass s and sd2D?
>>>>
>>>> Done (partly).
>>>>
>>>>
>>>> To add some more strange things:
>>>>
>>>> I attach the code in a working example as an R script file which will
>>>> show something strange (for me):
>>>>
>>>> the function overwrites a variable, namely sd2D. Am I doing something
>>>> wrong or do I underestimate the dangers of using Rcpp and inline and C
>>>> in general?
>>>
>>> Yes, it will.  You are assigning the NumericVector's s and sd2D to use
>>> the storage of the SD2D argument so when you assign s[indS] in the
>>> first loop you are overwriting the contents of SD2D.
>>>
>>> I am testing a rewrite now.
>>>
>>>> In addition, the code is not doing what it is supposed to be doing, but
>>>> that is a different question at the moment.
>>>>
>>>> Cheers,
>>>>
>>>> Rainer
>>>>
>>>>>
>>>>>
>>>>>
>>>>> Ultimately, I would like to have the complete dispRparFor function in C.
>>>>>
>>>>>
>>>>> For the ones interested, I could send a working example with data.
>>>>>
>>>>> Cheers,
>>>>>
>>>>> Rainer
>>>>>
>>>> _______________________________________________
>>>> Rcpp-devel mailing list
>>>> Rcpp-devel-Z+qqJ2/841dDXCDGMXqaGq2UG9VpUWMKQH7oEaQurus-XMD5yJDbdMReXY1tMh2IBg at public.gmane.org
>>>> https://lists.r-forge.r-project.org/cgi-bin/mailman/listinfo/rcpp-devel
>>>>>>
>>>>
>>>>
>>>> --
>>>> Rainer M. Krug, PhD (Conservation Ecology, SUN), MSc (Conservation
>>>> Biology, UCT), Dipl. Phys. (Germany)
>>>>
>>>> Centre of Excellence for Invasion Biology
>>>> Natural Sciences Building
>>>> Office Suite 2039
>>>> Stellenbosch University
>>>> Main Campus, Merriman Avenue
>>>> Stellenbosch
>>>> South Africa
>>>>
>>>> Tel:        +33 - (0)9 53 10 27 44
>>>> Cell:       +27 - (0)8 39 47 90 42
>>>> Fax (SA):   +27 - (0)8 65 16 27 82
>>>> Fax (D) :   +49 - (0)3 21 21 25 22 44
>>>> Fax (FR):   +33 - (0)9 58 10 27 44
>>>> email:      Rainer-vfylz/Ys1k4 at public.gmane.org
>>>>
>>>> Skype:      RMkrug
>>>>
>>>> _______________________________________________
>>>> Rcpp-devel mailing list
>>>> Rcpp-devel-Z+qqJ2/841dDXCDGMXqaGq2UG9VpUWMKQH7oEaQurus at public.gmane.org
>>>> https://lists.r-forge.r-project.org/cgi-bin/mailman/listinfo/rcpp-devel
>>>>
>>>>
>>>
>>>
>>>
>>> _______________________________________________
>>> Rcpp-devel mailing list
>>> Rcpp-devel-Z+qqJ2/841dDXCDGMXqaGq2UG9VpUWMKQH7oEaQurus at public.gmane.org
>>> https://lists.r-forge.r-project.org/cgi-bin/mailman/listinfo/rcpp-devel
>
> _______________________________________________
> Rcpp-devel mailing list
> Rcpp-devel at lists.r-forge.r-project.org
> https://lists.r-forge.r-project.org/cgi-bin/mailman/listinfo/rcpp-devel
>


-- 
Romain Francois
Professional R Enthusiast
+33(0) 6 28 91 30 30
http://romainfrancois.blog.free.fr
|- http://bit.ly/9VOd3l : ZAT! 2010
|- http://bit.ly/c6DzuX : Impressionnism with R
`- http://bit.ly/czHPM7 : Rcpp Google tech talk on youtube




More information about the Rcpp-devel mailing list