[Rcpp-devel] Forcing a shallow versus deep copy

Romain Francois romain at r-enthusiasts.com
Fri Sep 13 17:56:28 CEST 2013


On 13/09/13 14:00, JJ Allaire wrote:
>     Is it a big deal that we would cheat on what reference passing means?
>
>
> If you want to implement these sort of semantics I think at a _minimum_
> the type should be const & (otherwise it looks like you are going to
> actually modify the matrix in place which would appear to bypass the
> implicit memory barrier of SEXP). Realize that you won't actually bypass
> the memory barrier but it sure looks like you intend to for a reader of
> the code.
>
>              Rcpp::RNGScope __rngScope;
>              arma::mat& m = Rcpp::as<arma::mat& >(mSEXP);
>              test_ref(m);
>
>
> It looks like this behavior changed as of rev 4400 when the full_name()
> method was introduced. I may not understand the mechanism you
> established 100% but to me this generated code looks potentially
> problematic if you are taking a reference to a stack variable established
> within the as<> method. My guess is that you have something more
> sophisticated going on here and there is no memory problem, however I'd
> love to understand things a bit better to be 100% sure there isn't
> something to drill into further.

Here is where I am now. To wrap this function:

// [[Rcpp::export]]
void test_const_ref( const arma::mat& m ){}

This code gets created by the attributes parser:

RcppExport SEXP sourceCpp_71975_test_const_ref(SEXP mSEXP) {
BEGIN_RCPP
     {
         Rcpp::RNGScope __rngScope;
         Rcpp::InputParameter< const arma::mat&> m(mSEXP );
         test_const_ref(m);
     }
     return R_NilValue;
END_RCPP
}

The difference is this line:

Rcpp::InputParameter< const arma::mat&> m(mSEXP );

instead of this line:

const arma::mat& m = Rcpp::as< const arma::mat& >( mSEXP ) ;



The InputParameter template class needs to be able to take a SEXP as input 
and have a conversion operator to the requested type. So the default 
implementation obviously uses Rcpp::as. This is how the default class is 
implemented:

     template <typename T>
     class InputParameter {
     public:
         InputParameter(SEXP x_) : x(x_){}

         inline operator T() { return as<T>(x) ; }

     private:
         SEXP x ;
     } ;

So we get exactly the same as before. What we gain however is that we 
can redefine InputParameter for other types and we can take advantage of 
its destructor to do something when the InputParameter object goes out 
of scope. Here is how I implemented a custom version for const reference 
to arma::Mat :

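     template <typename T>
     class InputParameter< const arma::Mat<T>& > {
     public:
         typedef const typename arma::Mat<T>& const_reference ;

         // wrap the SEXP as an Rcpp matrix, then build the arma matrix on
         // the same memory (copy_aux_mem = false, so nothing is copied)
         InputParameter( SEXP x_ ) :
             m(x_), mat( m.begin(), m.nrow(), m.ncol(), false ){}

         // hand out a reference to the member, which lives as long as
         // the InputParameter object itself
         inline operator const_reference(){
             return mat ;
         }

     private:
         Rcpp::Matrix< Rcpp::traits::r_sexptype_traits<T>::rtype > m ;
         arma::Mat<T> mat ;
     } ;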

The arma::Mat is a member of InputParameter, constructed via the 
advanced constructor so that it uses the same memory as the R object, and 
we retrieve a reference to it through operator const_reference.
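
To make the mechanism concrete without pulling in Rcpp or Armadillo, here is a minimal standalone sketch. Buffer, Mat, as and the two helper functions are hypothetical stand-ins (Buffer plays the role of the SEXP, Mat the role of arma::Mat<double> with its auxiliary-memory constructor). Only the shape of InputParameter follows the code above.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for an R object: it owns the matrix storage.
struct Buffer {
    std::vector<double> data;
    std::size_t nrow;
    std::size_t ncol;
};

// Hypothetical stand-in for arma::Mat<double>. It either copies the data
// or borrows the pointer, depending on copy_mem.
class Mat {
public:
    Mat(double* mem, std::size_t nrow, std::size_t ncol, bool copy_mem)
        : owned_(copy_mem ? std::vector<double>(mem, mem + nrow * ncol)
                          : std::vector<double>()),
          mem_(copy_mem ? owned_.data() : mem),
          nrow_(nrow), ncol_(ncol) {
        assert(mem != nullptr || nrow * ncol == 0);
    }

    // Copying a Mat always makes a deep copy, so a copy never aliases.
    Mat(const Mat& other)
        : owned_(other.mem_, other.mem_ + other.nrow_ * other.ncol_),
          mem_(owned_.data()), nrow_(other.nrow_), ncol_(other.ncol_) {}
    Mat& operator=(const Mat&) = delete;

    const double* memptr() const { return mem_; }

private:
    std::vector<double> owned_;  // empty when the memory is borrowed
    double* mem_;
    std::size_t nrow_;
    std::size_t ncol_;
};

// Stand-in for Rcpp::as<T>: the generic conversion copies the data.
template <typename T> T as(Buffer& x);
template <> inline Mat as<Mat>(Buffer& x) {
    return Mat(x.data.data(), x.nrow, x.ncol, true);
}

// Generic proxy: converts on demand through as<T>.
template <typename T>
class InputParameter {
public:
    explicit InputParameter(Buffer& x) : x_(x) {}
    operator T() { return as<T>(x_); }
private:
    Buffer& x_;
};

// Specialisation for const references: the proxy owns a Mat that borrows
// the buffer's memory and hands out a reference to that member.
template <>
class InputParameter<const Mat&> {
public:
    explicit InputParameter(Buffer& x)
        : mat_(x.data.data(), x.nrow, x.ncol, false) {}
    operator const Mat&() { return mat_; }
private:
    Mat mat_;
};

// By value: the callee receives a copy with its own storage.
bool by_value_shares(Mat m, const Buffer& b) {
    return m.memptr() == b.data.data();
}

// By const reference: the callee sees the borrowed storage.
bool by_const_ref_shares(const Mat& m, const Buffer& b) {
    return m.memptr() == b.data.data();
}
```

Passing an InputParameter<Mat> to by_value_shares goes through the copying conversion, while passing an InputParameter<const Mat&> to by_const_ref_shares binds straight to the member that aliases the buffer.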


This is simple and elegant. And now we can pass down references and 
const references of armadillo matrices from R without performance penalty.

This makes using RcppArmadillo even more compelling.

It leaves the issue of what happens when we return an armadillo matrix. 
At the moment, this still makes a copy of the data. I don't see a way 
around that just yet. If we want to avoid making a copy, we need to 
construct the arma::mat out of R memory and return that R object.
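
One possible shape for such a no-copy return, sketched with plain C++ stand-ins rather than the real Rcpp or Armadillo types: allocate the result storage first, compute into it through a non-owning view, and hand back the owner. Buffer, View, add_into and plus_no_copy are hypothetical names, and this is only an assumption about how such a scheme could look, not what RcppArmadillo does.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for an R numeric matrix that owns its storage.
struct Buffer {
    std::vector<double> data;
    std::size_t nrow;
    std::size_t ncol;
    Buffer(std::size_t r, std::size_t c)
        : data(r * c, 0.0), nrow(r), ncol(c) {}
};

// Hypothetical non-owning view, standing in for a matrix built on
// auxiliary memory without copying.
struct View {
    double* mem;
    std::size_t n;
    explicit View(Buffer& b) : mem(b.data.data()), n(b.data.size()) {}
};

// Writes a + b element-wise into out, all through borrowed pointers.
void add_into(const View& a, const View& b, View& out) {
    assert(a.n == b.n && a.n == out.n);
    for (std::size_t i = 0; i < out.n; ++i) {
        out.mem[i] = a.mem[i] + b.mem[i];
    }
}

// Allocates the result storage up front, fills it through a view, and
// returns the owning object. Returning it moves the vector (or elides
// the move), so the elements are not copied again.
Buffer plus_no_copy(Buffer& x, Buffer& y) {
    assert(x.nrow == y.nrow && x.ncol == y.ncol);
    Buffer result(x.nrow, x.ncol);
    View out(result);
    add_into(View(x), View(y), out);
    return result;
}
```

The trade-off is that the computation has to be written against pre-allocated output rather than returning a fresh matrix expression.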

I also have to deal with references and const references of other arma 
types (arma::rowvec, etc ...).

I'm happy to discuss the changes I've made in Rcpp and RcppArmadillo for 
this. For now I've included the version for non-const references too; 
maybe I should not, although it does work perfectly. This is much 
better than what we used to have, where we would allow passing 
references but still make lots of data copies, which rather defeats the 
point of using references. When I see a function that takes an object by 
reference, I tend to think that calling the function is cheap. Now it is.


I'd specifically like to hear from Gabor and Baptiste about the 
simplification of being able to just use (const) references as inputs 
and have RcppArmadillo simply borrow memory from the R object :

// [[Rcpp::export]]
arma::mat plus( const arma::mat& m1, const arma::mat& m2){
     return m1 + m2 ;
}

Romain

-- 
Romain Francois
Professional R Enthusiast
+33(0) 6 28 91 30 30


