[Rcpp-devel] R.e. First foray into Rcpp and comparison of speed with R

Dirk Eddelbuettel edd at debian.org
Fri Sep 2 02:13:29 CEST 2011


On 1 September 2011 at 16:51, Christian Gunning wrote:
| On Thu, Sep 1, 2011 at 9:19 AM, Dirk Eddelbuettel <edd at debian.org> wrote:
| >
| >
| > I think (but feel to free me wrong by looking at headers and code :-) that
| > operator() and operator[] do the same thing.
| 
| I see a huge difference in this set of
| examples ( which are painfully repetitive on many levels)
| that are identical except for [] vs. (). This is true whether
| or not ret appears on the RHS. Am I missing something here?
| 
| library(inline);
| library(Rcpp);
| library(rbenchmark);
| src1<-'
| int nn=as<int>(n);
| NumericVector ret(nn);
| for (int i=0; i<nn; i++) {
|     for (int j=0; j<nn; j++) {
|         ret[i] = ret[i]+1;
|         // alternately,
|         // ret[i] = 1;
|     }
| };
| return(ret);
| '
| 
| src1a<-'
| int nn=as<int>(n);
| NumericVector ret(nn);
| for (int i=0; i<nn; i++) {
|     for (int j=0; j<nn; j++) {
|         ret(i) = ret(i)+1;
|         // alternately,
|         // ret(i) = 1;
|     }
| };
| return(ret);
| '
| 
| 
| mysample<-cxxfunction( signature(n="numeric"), src1, plugin='Rcpp')
| mysamplea<-cxxfunction( signature(n="numeric"), src1a, plugin='Rcpp')
| benchmark(
|     bracket = mysample( 1e4),
|     parens = mysamplea( 1e4),
|     replications = 10
| )
| 
|      test replications elapsed relative user.self sys.self user.child sys.child
| 1 bracket           10   1.132 1.000000      1.13        0          0         0
| 2  parens           10  10.190 9.001767     10.18        0          0

Uh-oh. Factor 10. Wow.

So I had to check sources and indeed, for () we seem to bounds checking:


    inline Proxy operator[]( int i ){ return cache.ref(i) ; }
    inline Proxy operator[]( int i ) const { return cache.ref(i) ; }
    inline Proxy operator()( const size_t& i) {
        return cache.ref( offset(i) ) ;
    }
    // TODO: should it be const_Proxy
    inline Proxy operator()( const size_t& i) const {
        return cache.ref( offset(i) ) ;
    }

where the () variants call this:

    /**
     * one dimensional offset doing bounds checking to ensure
     * it is valid
     */
    size_t offset(const size_t& i) const {
        if( static_cast<R_len_t>(i) >= ::Rf_length(RObject::m_sexp) ) throw index_out_of_bounds() ;
        return i ;
    }

I am still surprised it is that much of a difference maker...
 

| > There is a noNA() wrapper for Rcpp sugar to push performance -- NA checking
| > is implemented on access 'because that is how R does' (and our first task to
| > reproduce numbers you'd get at the R prompt) but if you know what you are
| > doing and are aware of possible pitfalls you can skip this.
| 
| Ah, so that's for sugar.  Thanks.
| So I *can* do linear access on a NumericMatrix, ala
| myMatrix[ computed_index ], which goes in column-major order.
| Curiously, I don't see any timing difference here.  Now I'm
| confused...
| 
| src2<-'
| int index, nn=as<int>(n);
| NumericMatrix ret(nn, nn);
| for (int i=0; i<nn; i++) {
|     for (int j=0; j<nn; j++) {
|         index = i+j*nn;
|         ret[index] = ret[index] + index;
|     }
| };
| return(ret);
| '
| 
| src2a<-'
| int index, nn=as<int>(n);
| NumericMatrix ret(nn, nn);
| for (int i=0; i<nn; i++) {
|     for (int j=0; j<nn; j++) {
|         index = i+j*nn;
|         ret(i, j) = ret(i, j) + index;
|     }
| };
| return(ret);
| '
| matfil<-cxxfunction( signature(n="numeric"), src2, plugin='Rcpp')
| matfila<-cxxfunction( signature(n="numeric"), src2a, plugin='Rcpp')
| 
| benchmark(
|     bracket = matfil( 1e4),
|     parens = matfila( 1e4),
|     replications = 10
| )
| 
|      test replications elapsed relative user.self sys.self user.child sys.child
| 1 bracket           10  17.495 1.000000     13.14     4.24          0         0
| 2  parens           10  17.514 1.001086     13.10     4.31          0         0


Matrix is different because we cannot do [i, j] for C legacy reasons. So both
variants may be fast...

Dirk


-- 
Two new Rcpp master classes for R and C++ integration scheduled for 
New York (Sep 24) and San Francisco (Oct 8), more details are at
http://dirk.eddelbuettel.com/blog/2011/08/04#rcpp_classes_2011-09_and_2011-10
http://www.revolutionanalytics.com/products/training/public/rcpp-master-class.php


More information about the Rcpp-devel mailing list