[Rcpp-devel] calling R function triggers tryCatch() ???
Romain Francois
romain at r-enthusiasts.com
Mon Oct 14 13:33:26 CEST 2013
Hello,
I've added some of my spices in
https://gist.github.com/romainfrancois/6974012
rowApply3 creates a NumericVector of the appropriate dimensions and
assign data to it in the loop. Since the dimensions match, the
assignment operator will not change the underlying SEXP so the call made
by `Language call(FUN, row);` does not need to be updated by the mean of
the proxy as in rowApply2. In theory, this should lead to less
allocations so better performance. In rowApply2 we allocated a vector at
each iteration of the loop. Now we don't see much of a performance
boost. This might be related to the implementation of MatrixRow.
Now, the second potential source of performance improvement is with
using mean_ instead of mean :
mean_ <- function(.) .Internal(mean(.))
mean gives you a lot of nice things like dispatching and additional
arguments to handle trimming, etc ... but you have to pay for them even
when you don't use them.
rowApply4 uses less syntactic sugar, so it is less nice to look at. The
idea was to see what was the cost of MatrixRow.
Here are the results of the benchmarks on my machine on the 30x50 example:
Unit: microseconds
expr min lq median uq max neval
rowMeans(M) 10.7 14.4 15.1 15.7 21.1 100
apply(M, 1L, mean) 350.6 362.0 371.2 382.7 2225.8 100
rowApply0(M, mean) 635.8 655.2 672.3 716.8 2595.1 100
rowApply1(M, mean) 225.1 233.8 239.7 249.8 2212.1 100
rowApply2(M, mean) 221.3 229.4 236.6 260.2 329.0 100
rowApply3(M, mean) 217.0 223.9 229.1 247.7 2077.2 100
rowApply3(M, mean_) 34.3 38.2 39.3 41.1 47.0 100
rowApply4(M, mean_) 34.1 36.7 38.0 39.3 47.1 100
rowMeans is better because:
- it does not involve calling the interpreter in the loop. even with
fast_eval, this has cost.
- all is done internally with a single loop looking like this pseudo code:
NumericVector means( ncol ) ;
for( int j=0, k=0; j=ncol; j++ )
for(int i=0; i<nrow; i++)
means[i] += x[k] ;
means /= ncol ;
However rowMeans can only do means. Our rowApply can apply any R function.
One strategy we are using dplyrRcpp (soon to be merged into dplyr) is
recognize known patterns and then decide whether we want to use the R
interpreter or some optimized code. Here we would recognize "mean" and
not use the R interpreter at all.
This is a nice game as it involves non standard evaluation, etc ... so
that sort of goes beyond the scope of this thread, but using the
internal dplyr tools, we can definitely make a fast flexible rowApply
that would call optimized code when recognizing known patterns and fall
back to the R interpreter otherwise.
Romain
Le 11/10/13 16:27, Hadley Wickham a écrit :
> Interestingly, that shows that rowApply2 is actually slightly _slower_
> than rowApply1.
>
> Hadley
>
> On Fri, Oct 11, 2013 at 9:24 AM, Hadley Wickham <h.wickham at gmail.com> wrote:
>> FYI, I recommend using microbenchmark which uses a much higher
>> precision timer so you can see the variability as well as the mean
>> times. Also note that you can use /*** R */ to include R code that's
>> automatically run when benchmarking:
>> https://gist.github.com/hadley/6935459
>>
>> That gives:
>>
>>> microbenchmark(rowMeans(M),
>> + apply(M, 1L, mean),
>> + rowApply0(M, mean),
>> + rowApply1(M, mean),
>> + rowApply2(M, mean))
>> Unit: microseconds
>> expr min lq median uq max neval
>> rowMeans(M) 5.31 7.11 7.54 7.84 13.9 100
>> apply(M, 1L, mean) 46.23 49.46 50.78 53.06 136.9 100
>> rowApply0(M, mean) 50.87 53.81 55.36 57.54 85.3 100
>> rowApply1(M, mean) 18.92 20.14 21.20 22.44 36.9 100
>> rowApply2(M, mean) 18.73 20.18 21.27 22.40 34.0 100
>>
>>> M <- matrix(rnorm(1500L), nrow=30L);
>>
>>> microbenchmark(rowMeans(M),
>> + apply(M, 1L, mean),
>> + rowApply0(M, mean),
>> + rowApply1(M, mean),
>> + rowApply2(M, mean))
>> Unit: microseconds
>> expr min lq median uq max neval
>> rowMeans(M) 9.86 12 13 14.1 21.9 100
>> apply(M, 1L, mean) 265.28 288 308 349.8 475.2 100
>> rowApply0(M, mean) 506.54 541 575 604.9 5469.3 100
>> rowApply1(M, mean) 174.37 185 206 232.4 334.0 100
>> rowApply2(M, mean) 171.55 183 223 245.8 4244.7 100
>>
>> On Fri, Oct 4, 2013 at 11:36 AM, Thomas Tse <tommy_228_228 at yahoo.com.hk> wrote:
>>> The next test compares the speeds:
>>>
>>> // [[Rcpp::export]]
>>> NumericVector rowApply0(NumericMatrix& x, const Function& FUN)
>>> {
>>> int n = x.nrow();
>>> NumericVector result = no_init(n);
>>>
>>> for (int r = 0; r < n; r++) {
>>> result[r] = as<double>(FUN(x(r, _) ) );
>>> }
>>> return result;
>>> }
>>>
>>> // [[Rcpp::export]]
>>> NumericVector rowApply1(NumericMatrix& x, const Function& FUN)
>>> {
>>> int n = x.nrow();
>>> NumericVector result = no_init(n);
>>>
>>> for (int r = 0; r < n; r++) {
>>> Language call(FUN, x(r, _)) ;
>>> result[r] = as<double>(call.fast_eval() );
>>> }
>>> return result;
>>> }
>>>
>>> // [[Rcpp::export]]
>>> NumericVector rowApply2(NumericMatrix& x, const Function& FUN)
>>> {
>>> int n = x.nrow();
>>> NumericVector result = no_init(n);
>>>
>>> Language call(FUN, R_NilValue);
>>> Language::Proxy proxy(call, 1);
>>> for (int r = 0; r < n; r++) {
>>> proxy = x(r, _) ;
>>> result[r] = as<double>(call.fast_eval() );
>>> }
>>> return result;
>>> }
>>>
>>>> M <- matrix(rnorm(15L), nrow=3L);
>>>> identical(rowMeans(M), apply(M, 1L, mean));
>>> [1] TRUE
>>>> identical(rowMeans(M), rowApply0(M, mean));
>>> [1] TRUE
>>>> identical(rowMeans(M), rowApply1(M, mean));
>>> [1] TRUE
>>>> identical(rowMeans(M), rowApply2(M, mean));
>>> [1] TRUE
>>>> benchmark(rowMeans(M),
>>> + apply(M, 1L, mean),
>>> + rowApply0(M, mean),
>>> + rowApply1(M, mean),
>>> + rowApply2(M, mean),
>>> + replications=300000L);
>>> test replications elapsed relative user.self sys.self
>>> user.child sys.child
>>> 3 rowApply0(M, mean) 300000 23.26 8.551 23.14 0
>>> NA NA
>>> 2 apply(M, 1, mean) 300000 21.30 7.831 21.13 0
>>> NA NA
>>> 4 rowApply1(M, mean) 300000 10.23 3.761 10.22 0
>>> NA NA
>>> 5 rowApply2(M, mean) 300000 9.87 3.629 9.82 0
>>> NA NA
>>> 1 rowMeans(M) 300000 2.72 1.000 2.72 0
>>> NA NA
>>>
>>>> M <- matrix(rnorm(1500L), nrow=30L);
>>>> benchmark(rowMeans(M),
>>> + apply(M, 1L, mean),
>>> + rowApply0(M, mean),
>>> + rowApply1(M, mean),
>>> + rowApply2(M, mean),
>>> + replications=30000L);
>>> test replications elapsed relative user.self sys.self
>>> user.child sys.child
>>> 3 rowApply0(M, mean) 30000 21.32 48.455 21.30 0
>>> NA NA
>>> 2 apply(M, 1, mean) 30000 11.17 25.386 11.17 0
>>> NA NA
>>> 4 rowApply1(M, mean) 30000 8.24 18.727 8.19 0
>>> NA NA
>>> 5 rowApply2(M, mean) 30000 8.10 18.409 8.08 0
>>> NA NA
>>> 1 rowMeans(M) 30000 0.44 1.000 0.43 0
>>> NA NA
>>>
>>>> M <- matrix(rnorm(150000L), nrow=300L);
>>>> benchmark(rowMeans(M),
>>> + apply(M, 1L, mean),
>>> + rowApply0(M, mean),
>>> + rowApply1(M, mean),
>>> + rowApply2(M, mean),
>>> + replications=3000L);
>>> test replications elapsed relative user.self sys.self
>>> user.child sys.child
>>> 3 rowApply0(M, mean) 3000 26.65 18.379 26.41 0.01
>>> NA NA
>>> 2 apply(M, 1, mean) 3000 19.76 13.628 19.71 0.00
>>> NA NA
>>> 4 rowApply1(M, mean) 3000 12.79 8.821 12.78 0.00
>>> NA NA
>>> 5 rowApply2(M, mean) 3000 12.69 8.752 12.68 0.00
>>> NA NA
>>> 1 rowMeans(M) 3000 1.45 1.000 1.45 0.00
>>> NA NA
>>>
>>> again, as Romain suggested, implementation rowApply2 is faster than
>>> rowApply1, and rowApply0 (which triggers tryCatch) is even slower than R's
>>> apply.
>>>
>>> Thanks Romain !
>>>
>>>
>>> _______________________________________________
>>> Rcpp-devel mailing list
>>> Rcpp-devel at lists.r-forge.r-project.org
>>> https://lists.r-forge.r-project.org/cgi-bin/mailman/listinfo/rcpp-devel
>>
>>
>>
>> --
>> Chief Scientist, RStudio
>> http://had.co.nz/
>
>
>
--
Romain Francois
Professional R Enthusiast
+33(0) 6 28 91 30 30
More information about the Rcpp-devel
mailing list