[Rcpp-devel] Access Rcpp::List element

Simon Zehnder szehnder at uni-bonn.de
Sun May 26 18:41:03 CEST 2013


Dear Dirk,

Thanks for your fast reply to my questions.

I apologize for being imprecise in my second question. Let me rephrase it here so that more members can follow:

2'. Performance always matters in statistical computing, especially in simulations, so it is of particular interest to know how results are stored fastest (reading from and writing to memory) inside a loop. A common problem in programming is that work done through an interface (here, e.g., to R or to a database) becomes slow when the interface is opened and closed very often. My question concerned the interface between C++ and R. As a very simple example, consider the following:


EX1 

library(Rcpp)
library(inline)
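src <- '
    // Wrap the matrix passed in from R (no copy for a double matrix).
    NumericMatrix numM(m);
    // nrow() returns int, so use a signed loop index.
    for (int i = 0; i < numM.nrow(); ++i) {
        Rcpp::RNGScope scope;
        numM.row(i) = Rcpp::rnorm(numM.ncol());
    }
    return numM;'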
randomN <- cxxfunction(signature(m = "matrix"), body = src, plugin = "Rcpp", verbose = TRUE)
m <- matrix(0, 100000, 10000)
system.time(m <- randomN(m))


vs.

EX2

library(Rcpp)
library(inline)
src <- '
    int n = as<int>(N);
    int k = as<int>(K);
    // Allocate the n x k matrix from within C++.
    NumericMatrix numM(n, k);
    // nrow() returns int, so use a signed loop index.
    for (int i = 0; i < numM.nrow(); ++i) {
        Rcpp::RNGScope scope;
        numM.row(i) = Rcpp::rnorm(numM.ncol());
    }
    return numM;'
randomN2 <- cxxfunction(signature(N = "numeric", K = "numeric"), body = src, plugin = "Rcpp", verbose = TRUE)
system.time(m <- randomN2(100000,10000))

In the first function we allocate memory in R and fill it in C++ inside the loop. In the second example we allocate memory in C++, fill it inside the loop and assign the matrix to an R object. 

My question was whether there is a performance difference between using R memory and using C++ memory that is then assigned to R. My concern was the interface between C++ and R, which is used very often inside the loop of EX1. But it seems that both versions perform quite similarly. The only difference we can see comes from adding the time for allocating the R matrix in EX1:

timing <- function() {
	m <- matrix(0, 100000,1000)   # allocation in R is now part of the timed work
	m <- randomN(m)
}
system.time(timing())

So the result would be: for the fastest performance in these examples, use EX2. Create a NumericMatrix in C++, fill it inside the loop, and assign it to an R object.
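
As a rough illustration of why both variants behave alike, here is a plain C++ sketch without Rcpp. It is only an analogy under stated assumptions: std::vector stands in for NumericMatrix, std::normal_distribution for Rcpp::rnorm, and the function names fill_in_place and allocate_and_fill are made up for this example. In both cases the loop writes directly into one contiguous buffer; the variants differ only in who allocates that buffer.

```cpp
#include <cassert>
#include <cstddef>
#include <random>
#include <vector>

// Hypothetical stand-in for EX1: the caller owns the storage and the
// function only writes into it (no allocation inside the function).
void fill_in_place(std::vector<double>& buf, std::size_t rows, std::size_t cols,
                   std::mt19937& rng) {
    assert(buf.size() == rows * cols);  // caller must pre-allocate
    std::normal_distribution<double> norm(0.0, 1.0);
    for (std::size_t i = 0; i < rows; ++i) {
        for (std::size_t j = 0; j < cols; ++j) {
            buf[i * cols + j] = norm(rng);  // direct write, row by row
        }
    }
}

// Hypothetical stand-in for EX2: allocate inside the function, fill the
// buffer with the same loop, and hand it back to the caller.
std::vector<double> allocate_and_fill(std::size_t rows, std::size_t cols,
                                      std::mt19937& rng) {
    std::vector<double> buf(rows * cols);
    fill_in_place(buf, rows, cols, rng);
    return buf;  // returned by move, no element-wise copy
}
```

With the same seed, both functions produce identical contents, so in this analogy any timing difference comes from where the allocation happens and not from the writes inside the loop.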

Now to your little note at the end of your mail:

I agree with you that my questions sometimes lack an adequate follow-up, in the form of solutions and/or a dialogue on the issues raised. I will give more feedback in the future. In my defence, I am very often occupied with a bunch of work at my institute that is (regrettably) not related to Rcpp. I really want to find more time to work in the fields I like the most (programming and statistics). As it is, it often takes me much longer to read into some code, because other duties keep me from doing things immediately. Hence, my disappearance was never meant to be disrespectful to the members who make the effort to answer my questions and contribute to the list. I also have some things on my to-do list that I want to pass on to the community, and it bothers me that I do not find the time to do so immediately (I also do not understand how you find the time for all of this, as I guess you are working for a company).

Best

Simon




On May 26, 2013, at 3:42 PM, Dirk Eddelbuettel <edd at debian.org> wrote:

> 
> Simon,
> 
> On 26 May 2013 at 15:20, Simon Zehnder wrote:
> | Dear Rcpp::Devels and Rcpp::Users,
> | 
> | I have maybe some trivial questions. 
> | 
> | 1. If I use 
> | 
> | 	Rcpp::NumericVector A(someVector),
> |     
> |     does it reuse memory from someVector for A? From http://cran.r-project.org/web/packages/Rcpp/vignettes/Rcpp-quickref.pdf I would say it does, whereas 
> | 	
> | 	Rcpp::NumericVector A(clone(someVector)) 
> | 
> |     does not.
> 
> Yes and yes.  This is all documented.
> 
> The first (standard) form uses the "proxy object" instantiation which uses
> the _exact same memory used by the object you instantiate from_.  This is
> typically done when coming from R with a SEXP, but should work the same for
> all other objects.  So someVector and A now use the same "representation" and
> underlying memory: change one, and you changed the other.  This possible side
> effect is the price of the fastest possible instantiation, which is also the
> most lightweight.
> 
> And hence the need for clone() to create deep copies.
> 
> | 2. Let us assume I am right in point 1; then my next question concerns reusing memory from R: Does it make a difference in C++ if I create a NumericVector with its own memory and fill it in a loop, or if I reuse memory from R and fill it in a loop? Again, I would say no: if memory is reused, we usually pass a pointer to the memory to the Rcpp object, which then has direct access to this memory (of course this will be different when using parallel code, where memory allocation is essential). 
> 
> I don't think I fully understand the question. But let me answer what I am
> guesstimating you asked: It should not matter.
> 
> Besides trying to formulate more concise questions (1. above was good), you
> could simply try to __profile__ such suspicions.  That is just about the best
> way to settle this.
> 
> | 3. Imagine I use now an S4 Object from R:
> | 
> | 	// [[Rcpp::export]]
> | 	SEXP fName (SEXP data_S4) {
> | 	
> | 		Rcpp::S4 dataS4 (data_S4);
> | 		return R_NilValue;   // placeholder, the original snippet had no return
> | 	}
> | 
> |    If data_S4 contains a list, has the command Rcpp::S4 dataS4(data_S4) already created an Rcpp::List Object from it? If not, how do I proceed to get such a Rcpp::List Object? 
> 
> Rcpp::S4 offers you slots. See the documentation, see eg the working examples
> in the unit tests and see this example 
>   http://gallery.rcpp.org/articles/armadillo-sparse-matrix/
> which turns a sparse matrix (an S4 object from the Matrix package) into an
> Armadillo sparse matrix:
> 
> 
> void convertSparse(S4 mat) {         // slight improvement with two non-nested loops
>    IntegerVector dims = mat.slot("Dim");
>    IntegerVector i = mat.slot("i");
>    IntegerVector p = mat.slot("p");
>    NumericVector x = mat.slot("x");
>    [...]
> 
> You could pick off lists the exact same way. And Lists inside Lists inside S4
> elements ...  SEXP nesting works as it does in R.
> 
> Finally, if I may: You have a bit of a tendency to come to the list, ask a
> question, and then to disappear [eg your recent Armadillo RNG question].  
> 
> For more mutually beneficial discourse, I would appreciate it if you could
> reply, extend the examples or generally clarify wherever you find something
> too brief.  Of course, I don't mean to suggest that a one-line 'me too' is
> beneficial, but it would be nice to turn this into a dialogue rather than a
> free answering system for whatever question bothers you right now :)  Also
> feel free to help others on the list.  You obviously already understand a lot
> of the material rather well...
> 
> Regards, Dirk
> 
> -- 
> Dirk Eddelbuettel | edd at debian.org | http://dirk.eddelbuettel.com


