[Rcpp-devel] Accumulating results in an Rcpp::List

Romain Francois romain at r-enthusiasts.com
Sun Jun 20 08:24:09 CEST 2010


On 20/06/10 03:32, Douglas Bates wrote:
>
> I want to return a named list of results from a call using Rcpp but
> the components of the list are accumulated in stages.  At the C++
> level I have a templated class that contains several data members that
> are themselves instances of classes.
>
> What I have been doing is defining methods for each of the subclasses
> that update the list and return it.  The signatures look like
>
> Rcpp::List updateList(Rcpp::List&)
>
> This particular signature reflects my R background where I might
> write something like
>
> ans<- updateList(comp3, updateList(comp2, updateList(comp1, list())))
>
> However, it occurs to me that it might be unnecessary to return the
> list in C++ because I am passing a reference to the list so updates
> will change the original argument.
>
> Is this a good idea and does it work as long as I pass by reference?
> (I'm still a little vague on the distinction between passing a
> reference and passing a classed object.)

Hi,

Passing by reference is a very good idea. There is an example of this in my 
response to the R-help thread: 
http://news.gmane.org/gmane.comp.lang.r.general

Passing or returning by value invokes the copy constructor. For example:

typedef std::vector<double> vec ;

vec dostuff( vec& x ){
	x.push_back( 1.0 ) ;
	return x ;
}

The return statement invokes the copy constructor of std::vector<double>, 
which copies all elements. This example is particularly dangerous because:

vec x( 5 ) ;
vec y = dostuff( x ) ;
y.push_back( 10.0 ) ;

dostuff will modify x, but then return a copy of it. This is bad, very 
bad. Right after the call to dostuff, x and y contain the same elements, 
but they are distinct objects, so the push_back on y does not affect x. 
This is almost surely going to lead to problems that will be hard to debug.
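
To make the divergence concrete, here is a small self-contained sketch in 
plain C++ (no Rcpp needed). It reuses vec and dostuff from above; the helper 
copies_diverge is a made-up name for illustration only.

```cpp
#include <cassert>
#include <vector>

typedef std::vector<double> vec;

// Same shape as dostuff above: modifies x through the reference,
// then returns it by value, which copy-constructs a new vector.
vec dostuff(vec& x) {
    x.push_back(1.0);
    return x;
}

// Hypothetical helper (illustration only): returns true when x and y
// hold the same elements right after the call and then drift apart.
bool copies_diverge() {
    vec x(5);
    vec y = dostuff(x);   // x was modified, y is a copy of it
    assert(x == y);       // same elements right after the call
    assert(&x != &y);     // but two distinct objects
    y.push_back(10.0);    // only y grows
    return x.size() == 6 && y.size() == 7;
}
```

Calling copies_diverge() from a main function should return true: the second 
push_back lands in y only, which is exactly the silent split described above.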



Now the situation might be even more problematic with Rcpp API classes, 
for which the copy constructor is cheap (i.e. it just copies the underlying 
SEXP):


require( inline )
require( Rcpp )

inc <- '
List dostuff( List& x ){
	x.push_back( 1.0 ) ;
	return x ;
}'

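code <- '
List x = List::create( _["x"] = 1, _["y"] = "foo" ) ;
List y = dostuff( x ) ;
Rprintf( "SEXP(x) = <%p>, SEXP(y) = <%p>\\n", (SEXP)x, (SEXP)y ) ;

// update components that already exist (illustrative values)
x[ "x" ] = 2 ;
y[ "y" ] = "bar" ;
Rprintf( "SEXP(x) = <%p>, SEXP(y) = <%p>\\n", (SEXP)x, (SEXP)y ) ;

// add new components, which changes the underlying SEXP
x[ "z" ] = 10 ;
y[ "foo" ] = "bar" ;
Rprintf( "SEXP(x) = <%p>, SEXP(y) = <%p>\\n", (SEXP)x, (SEXP)y ) ;

return R_NilValue ;
'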
fx <- cxxfunction( , code, include = inc, plugin = "Rcpp" )
fx()


I get :

SEXP(x) = <0x100b72fb0>, SEXP(y) = <0x100b72fb0>
SEXP(x) = <0x100b72fb0>, SEXP(y) = <0x100b72fb0>
SEXP(x) = <0x100b73118>, SEXP(y) = <0x100b731a8>


The first time the pointers are printed out, we can see that this is the 
same SEXP, even though x and y are distinct objects (because the 
Rcpp::List copy constructor has been used).

Then we update existing components of x and y. Since the components 
already exist in the list, only those components are modified and the 
SEXP stays the same.

Then we make changes that update the SEXP: we add new data to the list. 
This requires modification of the underlying SEXP. Because x and y are 
distinct objects, their SEXPs are modified independently, resulting in 
the third line showing different SEXPs.
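
The three stages can be mimicked with a toy model in plain C++. This is only 
a sketch of the behaviour described above, not how Rcpp is implemented: 
Handle, set, address and stages_match are made-up names, and the assumption 
that growing rebinds only the handle being grown is taken from the output 
shown earlier.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <vector>

// Hypothetical stand-in for an Rcpp API class: a thin handle whose
// copy constructor only copies a pointer to shared storage (the "SEXP").
class Handle {
public:
    explicit Handle(std::size_t n)
        : data_(std::make_shared<std::vector<double> >(n)) {}

    // Updating an existing slot writes into the shared storage in place.
    void set(std::size_t i, double v) { (*data_)[i] = v; }

    // Growing allocates new storage and rebinds this handle only
    // (assumption of this sketch, mirroring the third output line).
    void push_back(double v) {
        auto grown = std::make_shared<std::vector<double> >(*data_);
        grown->push_back(v);
        data_ = grown;
    }

    double get(std::size_t i) const { return (*data_)[i]; }
    std::size_t size() const { return data_->size(); }
    const void* address() const { return data_.get(); }

private:
    std::shared_ptr<std::vector<double> > data_;
};

// Same shape as the List version of dostuff: grow, then return by value.
Handle dostuff(Handle& x) {
    x.push_back(1.0);
    return x;
}

// Replays the three stages; returns true if the handles end up apart.
bool stages_match() {
    Handle x(2);
    Handle y = dostuff(x);
    assert(x.address() == y.address());  // stage 1: shared after the copy
    x.set(0, 2.0);
    y.set(1, 3.0);
    assert(x.address() == y.address());  // stage 2: in-place updates still share
    x.push_back(10.0);
    y.push_back(20.0);
    return x.address() != y.address();   // stage 3: each handle was rebound
}
```

In this model an in-place update made through x is visible through y, while 
anything appended through one handle is invisible to the other. That mix is 
what makes the by-value version so hard to reason about.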



I'm not sure I have made my point clear, but I think the key is to avoid 
passing Rcpp API objects by value and always pass them by reference. 
This is somewhat of a weakness of the API.

So in your example, I would use this signature:

void updateList(Rcpp::List&)

or you can even return the reference if you want to chain operations, 
like in your R example:

Rcpp::List& updateList(Rcpp::List&)
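
As a rough sketch of the chaining variant, here is a version that compiles 
without Rcpp. It assumes a plain std::map as a stand-in for Rcpp::List, and 
Comp1, Comp2, accumulate and chain_updates_original are made-up names; only 
the shape of the updateList signature comes from the discussion above.

```cpp
#include <cassert>
#include <map>
#include <string>

// Hypothetical stand-in for Rcpp::List so the sketch builds without Rcpp.
typedef std::map<std::string, double> NamedList;

// Hypothetical component classes, each adding its own entry.
struct Comp1 {
    double value;
    NamedList& updateList(NamedList& ans) const {
        ans["comp1"] = value;
        return ans;  // the caller's object itself, no copy
    }
};

struct Comp2 {
    double value;
    NamedList& updateList(NamedList& ans) const {
        ans["comp2"] = value;
        return ans;
    }
};

// Chained like the R example: the inner call runs first and hands
// the same list on to the outer call.
NamedList& accumulate(const Comp1& c1, const Comp2& c2, NamedList& ans) {
    return c2.updateList(c1.updateList(ans));
}

// Returns true when the chain filled the caller's list in place.
bool chain_updates_original() {
    NamedList ans;
    Comp1 c1 = {1.5};
    Comp2 c2 = {2.5};
    NamedList& out = accumulate(c1, c2, ans);
    assert(&out == &ans);  // the chain never copied the list
    return ans.size() == 2;
}
```

Because every step returns a reference, the caller has to bind the result to 
a reference as well (NamedList& out = ...). Writing NamedList out = ... would 
bring the copy back.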

Romain

-- 
Romain Francois
Professional R Enthusiast
+33(0) 6 28 91 30 30
http://romainfrancois.blog.free.fr
|- http://bit.ly/98Uf7u : Rcpp 0.8.1
|- http://bit.ly/c6YnCi : graph gallery collage
`- http://bit.ly/bZ7ltC : inline 0.3.5


