[Rcpp-devel] Struggling with Rcpp sugar
Romain Francois
romain at r-enthusiasts.com
Sat Nov 17 17:57:50 CET 2012
Here is another version, with mapply. We have to use rep here, since our mapply only
works on vector expressions.
inline double distance(double y, double x){ return pow( (y-x), 2.0 ) ; }

// [[Rcpp::export]]
NumericVector pdist6(double x, NumericVector ys) {
    return mapply( ys, rep(x,ys.size()), distance ) ;
}
All this will become more fun with C++11 lambdas.
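To make the lambda remark concrete, here is a minimal sketch in plain C++11, outside Rcpp. It assumes std::vector<double> as a stand-in for NumericVector and std::transform as a stand-in for sapply and mapply; the names pdist_lambda and pdist_binary are made up for illustration and are not part of Rcpp. It says nothing about whether sugar's sapply accepts a lambda directly.

```cpp
#include <algorithm>
#include <cmath>
#include <vector>

// Hypothetical stand-in for the sapply + Distance functor version:
// the lambda captures x, so no hand-written functor class is needed.
std::vector<double> pdist_lambda(double x, const std::vector<double>& ys) {
    std::vector<double> out(ys.size());
    std::transform(ys.begin(), ys.end(), out.begin(),
                   [x](double y) { return std::pow(y - x, 2.0); });
    return out;
}

// Hypothetical stand-in for the mapply version: the operation is binary,
// so the scalar x is replicated to the length of ys first (the role rep()
// plays in pdist6).
std::vector<double> pdist_binary(double x, const std::vector<double>& ys) {
    const std::vector<double> xs(ys.size(), x);  // ys.size() copies of x
    std::vector<double> out(ys.size());
    std::transform(ys.begin(), ys.end(), xs.begin(), out.begin(),
                   [](double y, double xi) { return std::pow(y - xi, 2.0); });
    return out;
}
```

Both functions return the same values; the unary form just avoids building the replicated vector.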
The mapply version (pdist6) does not perform as well as sapply:
Unit: microseconds
             expr     min       lq   median       uq     max
1 pdist1(0.5, ys)  24.692  25.0625  27.8755  28.1225 366.786
2 pdist2(0.5, ys)  30.595  30.9850  31.1210  31.4635 341.235
3 pdist3(0.5, ys) 262.306 262.7620 262.9740 263.8560 565.902
4 pdist4(0.5, ys) 264.561 264.9355 265.1850 266.8125 858.453
5 pdist5(0.5, ys)  15.700  16.1570  16.4030  17.2045 318.126
6 pdist6(0.5, ys)  31.264  31.5755  31.7225  32.3770 332.139
Romain
On 17/11/12 17:40, Romain Francois wrote:
> Hi,
>
> While we are at it, consider this version based on sapply:
>
>
> class Distance {
> public:
>     typedef double result_type ;
>     Distance( double x_ ) : x(x_){}
>
>     inline double operator()(double y) const { return pow( (y-x), 2.0 ) ; }
>
> private:
>     double x;
> } ;
>
> // [[Rcpp::export]]
> NumericVector pdist5(double x, NumericVector ys) {
>     return sapply( ys, Distance(x) ) ;
> }
>
>
> which here gives me quite good performance:
>
> Unit: microseconds
>              expr     min       lq   median       uq     max
> 1 pdist1(0.5, ys)  24.542  26.4825  27.8405  28.2305 326.597
> 2 pdist2(0.5, ys)  30.628  31.0030  31.1695  31.8715 608.207
> 3 pdist3(0.5, ys) 262.371 262.6280 262.9140 263.7840 563.796
> 4 pdist4(0.5, ys) 264.667 265.0150 265.2025 266.1770 580.343
> 5 pdist5(0.5, ys)  15.715  16.1375  16.3385  17.2195 318.412
>
> Romain
>
>
> On 17/11/12 14:42, Hadley Wickham wrote:
>> Hi all,
>>
>> I've included what seems to be a simple application of Rcpp sugar
>> below, but I'm getting some very strange results. Any help would be
>> much appreciated!
>>
>> Thanks,
>>
>> Hadley
>>
>> library(Rcpp)
>> library(microbenchmark)
>>
>> # Compute distance between single point and vector of points
>> pdist1 <- function(x, ys) {
>>   (x - ys) ^ 2
>> }
>>
>> cppFunction('
>>   NumericVector pdist2(double x, NumericVector ys) {
>>     int n = ys.size();
>>     NumericVector out(n);
>>
>>     for(int i = 0; i < n; ++i) {
>>       out[i] = pow(ys[i] - x, 2);
>>     }
>>     return out;
>>   }
>> ')
>>
>> ys <- runif(1e4)
>> all.equal(pdist1(0.5, ys), pdist2(0.5, ys))
>>
>> library(microbenchmark)
>> microbenchmark(
>>   pdist1(0.5, ys),
>>   pdist2(0.5, ys)
>> )
>> # C++ version about twice as fast, presumably because it avoids a
>> # complete vector allocation.
>>
>>
>> # Sugar version:
>> cppFunction('
>>   NumericVector pdist3(double x, NumericVector ys) {
>>     return pow((x - ys), 2);
>>   }
>> ')
>> all.equal(pdist1(0.5, ys), pdist3(0.5, ys))
>>
>> microbenchmark(
>>   pdist1(0.5, ys),
>>   pdist2(0.5, ys),
>>   pdist3(0.5, ys)
>> )
>> # 10-fold slower?? Maybe it's because I'm using a double instead of
>> # a numeric vector?
>>
>> cppFunction('
>>   NumericVector pdist4(NumericVector x, NumericVector ys) {
>>     return pow((x - ys), 2);
>>   }
>> ')
>> all.equal(pdist1(0.5, ys), pdist4(0.5, ys))
>>
>> # Is this a bug in sugar? Should recycle to length of longest vector.
>> # Let's try flipping the order of operations:
>>
>> cppFunction('
>>   NumericVector pdist5(NumericVector x, NumericVector ys) {
>>     return pow((ys - x), 2);
>>   }
>> ')
>> all.equal(pdist1(0.5, ys), pdist5(0.5, ys))
>> # Where are the missing values coming from??
>>
>>
>
>
--
Romain Francois
Professional R Enthusiast
+33(0) 6 28 91 30 30
R Graph Gallery: http://gallery.r-enthusiasts.com
`- http://bit.ly/SweN1Z : SuperStorm Sandy
blog: http://romainfrancois.blog.free.fr
|- http://bit.ly/RE6sYH : OOP with Rcpp modules
`- http://bit.ly/Thw7IK : Rcpp modules more flexible