[Rcpp-devel] R.e. Speed gain assistance (Wray, Christopher)
Christian Gunning
xian at unm.edu
Fri Aug 19 02:48:23 CEST 2011
On Thu, Aug 18, 2011 at 7:57 AM, Hadley Wickham <hadley at rice.edu> wrote:
>> Take a look at this (unweighted) sample() function. It's giving R a
>> run for it's money, and is pretty fast even for very large n, and it
>> looks statistically correct (not sure if I'm glossing over ugly,
>> machine-specific details of double->int conversion here). Does this
>> shed any light on your question?
>> mysample<-cxxfunction( signature(x='numeric', n="numeric"), src1, plugin='Rcpp')
>>
>> system.time(result <- mysample(1:50, 1e7))
>> system.time(resultR <- sample(1:50, 1e7, replace=T))
>
> Don't forget about sample.int:
>
>> system.time(result <- mysample(1:50, 1e7))
> user system elapsed
> 0.493 0.064 0.557
>> system.time(resultR <- sample(1:50, 1e7, replace=T))
> user system elapsed
> 0.872 0.089 0.962
>> system.time(resultR <- sample.int(1:50, 1e7, replace=T))
> user system elapsed
> 0.209 0.001 0.212
>
> Hadley
So, I can't figure out how to import and use sample.int into C++. I
keep getting the following when I exchange sample.int for sample,
below. I'm guess it's the interaction between IntegerVector and R??
Not sure it matters, but I'm surprised/confused:
cpp_exception("invalid first argument", "Rcpp::eval_error")
src2<-'
NumericVector xx(x);
int xx_sz = xx.size(); // index of last xx
// generate 0:(length(xx)-1)
IntegerVector index(xx_sz, 1.0); // cant use 1 here, 1.0 still required
std::partial_sum(index.begin(), index.end(), index.begin());
index = index -1;
// sample out of index
int nn=as<int>(n);
IntegerVector sindex(nn);
NumericVector ret(nn);
//Function sample("sample.int"); // doesnt work
Function sample("sample");
RNGScope scope;
// sindex has length nn
sindex = sample(index, _["size"]=nn, _["replace"] = true);
for (int i=0; i<nn; i++) {
ret[i] = xx[ sindex[i] ];
};
return(ret);
'
mysample2<-cxxfunction( signature(n="numeric", x='numeric'), src2,
plugin='Rcpp')
print(system.time(result <- mysample2( 1e7, 50:1)))
I can shave a little time off by changing everything to NumericVectors
and using ret in place of sindex (and thus avoiding an extra Vector
allocation of size nn), but then I don't expect sample.int to work
anyway.
Moral: In this case, a pure C++ loop is cleaner *and* faster. Not to
beat a dead horse or anything...
-Christian
--
A man, a plan, a cat, a ham, a yak, a yam, a hat, a canal – Panama!
More information about the Rcpp-devel
mailing list