[Rcpp-devel] [R] Find number of elements less than some number: Elegant/fastsolution needed
Douglas Bates
bates at stat.wisc.edu
Fri Apr 15 14:34:01 CEST 2011
On Thu, Apr 14, 2011 at 11:47 PM, Christian Gunning <xian at unm.edu> wrote:
> On Thu, Apr 14, 2011 at 7:02 PM,
> <rcpp-devel-request at r-forge.wu-wien.ac.at> wrote:
>
>> I was able to write a very short C++ function using the Rcpp package
>> that provided about a 1000-fold increase in speed relative to the best
>> I could do in R. I don't have the script on this computer so I will
>> post it tomorrow when I am back on the computer at the office.
>>
>> Apologies for cross-posting to the Rcpp-devel list but I am doing so
>> because this might make a good example of the usefulness of Rcpp and
>> inline.
>
> And RcppArmadillo, as the case may be.
>
> This is a cool little problem. In the examples given, I'd caution
> people against comparing apples and durian. The sort(x) is a cost
> that should be considered *within* each implementation. I used
> Armadillo to sort (src, f4), and get another 100% worth of speedup
> that I can't reproducing using R's sort (src1, f1-f3). If i modify
> SEXP in-place (and this always confuses me, so I tend to avoid it),
> I'm seeing an additional ~5-10% speed gain (src2, f5) -- the advantage
> of this last seems to be primarily in memory-constrained applications.
>
> On to the code!
>
> src = '
> NumericVector xx_(clone(x)), yy_(clone(y));
> int nxx = xx_.size();
> int nyy = yy_.size();
> arma::vec xx(xx_), yy(yy_);
> yy = sort(yy);
> xx = sort(xx);
> //
> //
> int j = 0; //gt index for yy
> for (int i=0; i < nxx; i++) {
> while ((j < nyy) && ( xx(i) > yy(j) ) ) {
> j++;
> }
> xx_(i) = j;
> }
> return (xx_);
> '
>
> src1 = '
> NumericVector xx_(clone(x)), yy_(clone(y));
> // assumes x & y are already sorted
> arma::vec xx(xx_), yy(yy_);
> int nxx = xx.n_elem;
> int nyy = yy.n_elem;
> int j = 0; //gt index for yy
> for (int i=0; i < nxx; i++) {
> while ((j < nyy) && ( xx(i) > yy(j) ) ) {
> j++;
> }
> xx_(i) = j;
> }
> return (xx_);
> '
>
> src2 = '
> NumericVector xx_(x), yy_(y); //kinda scary
> int nxx = xx_.size();
> int nyy = yy_.size();
> arma::vec xx(xx_.begin(), nxx, false), yy(yy_.begin(), nyy, false);
> //really kinda scary
> yy = sort(yy);
> xx = sort(xx);
> //
> //
> int j = 0; //gt index for yy
> for (int i=0; i < nxx; i++) {
> while ((j < nyy) && ( xx(i) > yy(j) ) ) {
> j++;
> }
> xx_(i) = j;
> }
> return (xx_);
> '
>
> require(inline)
> require(RcppArmadillo)
> f1 <- function(x, y) { sort(length(y) - findInterval(-x, rev(-sort(y))));}
> f2 <- function(x, y) {x = sort(x); length(y) - findInterval(-x, rev(-sort(y)))}
> f3.a <- cxxfunction(signature(x="numeric", y="numeric"), src1,
> plugin='RcppArmadillo')
> f3 <- function(x,y) {
> x <- sort(x)
> y <- sort(y)
> return(f3.a(x,y))
> }
> f4 <- cxxfunction(signature(x="numeric", y="numeric"), src,
> plugin='RcppArmadillo')
> ## danger -- violates R semantics
> f5 <- cxxfunction(signature(x="numeric", y="numeric"), src2,
> plugin='RcppArmadillo')
>
>
> ## this is a really ugly test. ygwypf, i suppose :)
>
> for (i in 1:5) {
> x1 <- x <- rnorm(5e6)
> y1 <- y <- rnorm(5e6)
> print( cbind(
> r1=system.time(r1 <- f1(x,y)),
> r2=system.time(r2 <- f2(x,y)), r3=system.time(r3 <- f3(x1,y1)),
> r4 = system.time(r4 <- f4(x,y)), r5 = system.time(r5 <- f5(x,y))
> ))
> }
> print(all.equal(r1, r2))
> print(all.equal(r1, r3))
> print(all.equal(r1, r4))
> print(all.equal(r1, r5))
I agree that the cost of sorting should be taken into account but I
don't think you need to go to the RcppArmadillo package to get a sort
function. Why not use std::sort?
Also, I did sequential comparisons as shown in your code but after
reading Bill Dunlap's response and looking at the documentation for
the findInterval function in R I smacked myself on the forehead and
thought "Duh - binary search, of course".
I haven't looked at the C code underlying the findInterval function
yet so I don't know if Martin has clever tricks for sorted x and y.
However the documentation for the std::upper_bound template at
cplusplus.com shows how to use that for the case here. The best I can
think of for sorted x and y is to pass the upper bound from x[i] as
the first argument in the call to std::upper_bound for x[i+1].
Unfortunately I am staring at a series of deadlines today so
implementations and comparisons may need to wait until tomorrow.
P.S. to Christian: Check the archives for several of Dirk's posts to
the rcpp-devel list where he has used the rbenchmark package to
produce clean output from comparisons of implementations of
algorithms.
More information about the Rcpp-devel
mailing list