[Rcpp-devel] Returning a named (and classed?) list from an RcppExport SEXP function

Romain Francois romain at r-enthusiasts.com
Fri Mar 12 21:23:46 CET 2010


Le 12/03/10 18:23, Douglas Bates a écrit :
>
> On Fri, Mar 12, 2010 at 10:47 AM, Dirk Eddelbuettel<edd at debian.org>  wrote:
>>
>> Hi Doug,
>>
>> Welcome! We're honoured to have you here. And glad you picked the list.
>>
>> On 12 March 2010 at 10:14, Douglas Bates wrote:
>> | I have looked at your (i.e. Dirk and Roman's) submitted Rjournal
>> | article on Rcpp and decided that I should take the plunge.
>> |
>> | To get my feet wet I am starting with a relatively simple example
>> | interfacing to (God help us) 1960's style Fortran written by Mike
>> | Powell and incorporated in the minqa package.
>> |
>> | The value to be returned is a named list of heterogeneous values.  I
>> | can see how to construct such a list in the classic Rcpp API using a
>> | RcppResultSet object but I don't know the best way to accomplish this
>> | in the more modern Rcpp namespace.  Can someone point me in the right
>> | direction?  Also, is there a convenient idiom for attaching an S3
>> | class name to a list to be returned?
>>
>> Yes. See eg the examples I put on my blog, eg inside of RcppArmadillo we
>> simply do (including actual fastLm source here modulu two ifdefs dealing with
>> older versions -- which means you need Armadillo 0.9.0 or later)
>>
>> #include<RcppArmadillo.h>
>>
>> extern "C" SEXP fastLm(SEXP ys, SEXP Xs) {
>>
>>     Rcpp::NumericVector yr(ys);                 // creates Rcpp vector from SEXP
>>     Rcpp::NumericMatrix Xr(Xs);                 // creates Rcpp matrix from SEXP
>>     int n = Xr.nrow(), k = Xr.ncol();
>>
>>     arma::mat X(Xr.begin(), n, k, false);       // reuses memory and avoids extra copy
>>     arma::colvec y(yr.begin(), yr.size(), false);
>>
>>     arma::colvec coef = solve(X, y);            // fit model y ~ X
>>
>>     arma::colvec resid = y - X*coef;            // residuals
>>
>>     double sig2 = arma::as_scalar( trans(resid)*resid/(n-k) );
>>                                                 // std.error of estimate
>>     arma::colvec stderrest = sqrt( sig2 * diagvec( arma::inv(arma::trans(X)*X)) );
>>
>>     Rcpp::Pairlist res(Rcpp::Named( "coefficients", coef),
>>                        Rcpp::Named( "stderr", stderrest));
>>     return res;
>> }
>>
>> [ By the way, I still think I should try to implement all four of your "how
>> to do lm" solutions from the older R News / R Journal piece in Armadillo one
>> day :-)   And see what the timing impacts would be. ]
>>
>> That works "on the fly" with Pairlist and named.  We also have Rcpp::List
>> examples, but AFAICR you need to 'predeclare' how long your list is and the
>> set each element.  The 200+ unit tests will have examples. Better docs will
>> appear one day...
>
> Nice approach.  I presume that the Pairlist object is converted to an
> VECSXP in the process of being packaged for return.

No, they are shipped to R as a LISTSXP.

> I will play
> around a bit with expressions like
>
> Rcpp::List res(Rcpp::PairList(Rcpp::Named(...), ...))

This should work since the List( SEXP ) constructor uses coercion. Let 
us know otherwise.

> I would like to have the capability of attaching an attribute named
> "class" to the VECSXP before returning it.

res.attr( "class" ) = "foo" ;

>> | An unrelated question -- has anyone looked at accessing S4 classed
>> | objects in Rcpp?  I have had considerable experience with S4 classes
>> | and objects and could keep such a project in mind as I begin to
>> | explore Rcpp.
>>
>> I'd be happy to go there (time permitting) but very gently as I am still much
>> more firmly in S3 land.  Romain may be more eager :)
>
> One of the nice properties of S4 classed objects is that you can
> define validity checks for classes and bypass some of the code that
> does validity checking of arguments.

For now the only things we have in the api to deal with S4'ness is these 
methods : "isS4", "slot", "hasSlot".

Not sure right now how we could formalize S4 objects in the api, but 
this would definitely be a fine addition.

>> | Yet another question, if i may.   Does coercion take place in
>> | constructing an object like
>> |
>> | Rcpp::NumericVector aa(a);
>> |
>> | where a is an SEXP, say an SEXP argument to a C++ function, where I
>> | accidentally passed an integer vector instead of a double vector?  If
>>
>> Very much so.  Again, see my blog and the RcppVectorExample code.  Int in,
>> Double out. All magic hidden.
>>
>> And of course you can just try it:
>>
>> R>  library(RcppExamples)
>> Loading required package: Rcpp
>> R>  RcppVectorExample(c(1,4,9), api="new")
>> $result
>> [1] 1 2 3
>>
>> $original
>> [1] 1 4 9
>>
>> R>  RcppVectorExample(c(2,5,10), api="new")
>> $result
>> [1] 1.414 2.236 3.162
>>
>> $original
>> [1]  2  5 10
>>
>> R>
>
> Perhaps you are assuming that c(1,4,9) generates an integer vector in
> this example.  In fact, c(1,4,9) is a numeric (in the sense of double
> precision) vector.  You need to write c(1L, 4L, 9L) to ensure you have
> an integer vector or coerce it with as.integer().
>
>> At this level, behaviour is the same with 'classic' API but we know of at
>> least one case where 'classic' doesn't cast a Matrix right and 'new' does.
>>
>> | so, what happens if there is no coercion?  Are the contents of the
>> | SEXP copied to newly allocated storage or not?
>>
>> It depends. In the new API we do fewer copies.
>>
>> | The reason that I ask is because there are some places in the lme4
>> | code where I want to make sure that I have a pointer to the contents
>> | of the original vector because I am want to change the contents
>> | without copying.
>>
>> I think we do that. If you look at the vignette and the timing example -- but
>> using pointer accessors (eg double *x = foo.begin()) and pointer ops (eg x++)
>> we get essentially the same speed as pure and ugly C code.  Because we do
>> away with the copy, and the operator[], for purest speed.
>
> Yes, I like that idea of being able to use foo.begin() to access the
> pointer.  Part of my question related to whether I could count on
> getting the pointer to the original contents from R.

Yes. The objects of the new api have the SEXP as their data member and 
methods applied on objects are relayed back to the SEXP in terms of 
calls t the R API. We use R memory management all the way.

There are some times though where memory is reallocated, because there 
is no other way. for example if you add values to a vector (push_back, 
push_front, etc ...)

  In some way I
> would need to check whether the R object had been coerced, and hence
> allocated new storage for its contents, or not.  That's not an
> immediate problem, though.

There is something that might be helpful (or not). In the file 
RcppCommon.h, we have the RCPP_RETURN_VECTOR macro, this is an initial 
attempt to avoid writing the same switch( TYPEOF(x), case REALSXP: ... 
code over and over again. There is an example use in RcppCommon.cpp


> Thanks for the response.
>
>> Hope that answer the first batch of questions.  Keep'em coming!
>>
>> Dirk

-- 
Romain Francois
Professional R Enthusiast
+33(0) 6 28 91 30 30
http://romainfrancois.blog.free.fr
|- http://tr.im/OIXN : raster images and RImageJ
|- http://tr.im/OcQe : Rcpp 0.7.7
`- http://tr.im/O1wO : highlight 0.1-5



More information about the Rcpp-devel mailing list