[Rcpp-devel] Access Rcpp::List element

Dirk Eddelbuettel edd at debian.org
Sun May 26 19:28:44 CEST 2013


On 26 May 2013 at 18:41, Simon Zehnder wrote:
| Apologies for being imprecise in my second question. I will try to rephrase it here so that more members can understand:
| 
| 2'. As performance always matters in statistical computations,
|  especially in simulations, it is of special interest to know how storing
| results in a loop (reading from and writing to memory) works fastest.

We have a rather detailed (if dated) benchmark in the Rcpp package, see
examples/ConvolveBenchmarks/

| Very often in programming, work that goes through an interface (here
| e.g. to R or to a database) becomes slow when that interface is opened
| and closed very often. My question concerned the interface between C++
| and R. As a very simple example for my question, consider the
| following:
| 
| EX1 
| 
| library(Rcpp)
| library(inline)
| src <- 'NumericMatrix numM(m); 
| 	    for(unsigned int i = 0; i < numM.nrow(); ++i) {
| 		Rcpp::RNGScope scope;

That RNGScope instance _obviously_ belongs outside the loop.

| 		numM.row(i) = Rcpp::rnorm(numM.ncol());
| 	    }
| 	   return numM;'
| randomN <- cxxfunction(signature(m = "matrix"), body = src, plugin = "Rcpp", verbose = TRUE)
| m <- matrix(0, 100000, 10000)
| system.time(m <- randomN(m))

Don't use system.time(); use either the rbenchmark package or the
microbenchmark package, both of which _sample over repeated calls_.

Numerous examples for either are in the list archives.
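
As a rough illustration of what sampling over repeated calls means, here is
a minimal sketch in plain C++ (standard library only, no Rcpp). The helper
names sample_timings() and median() are made up for this example and are
not part of rbenchmark, microbenchmark or Rcpp; those packages do
considerably more.

```cpp
#include <algorithm>
#include <cassert>
#include <chrono>
#include <cstddef>
#include <vector>

// Hypothetical helper: call f() `reps` times and return the elapsed
// wall-clock time of each individual call, in seconds.
template <typename F>
std::vector<double> sample_timings(F f, std::size_t reps) {
    assert(reps > 0);
    std::vector<double> secs;
    secs.reserve(reps);
    for (std::size_t r = 0; r < reps; ++r) {
        auto t0 = std::chrono::steady_clock::now();
        f();
        auto t1 = std::chrono::steady_clock::now();
        secs.push_back(std::chrono::duration<double>(t1 - t0).count());
    }
    return secs;
}

// Median of the samples: one unusually slow run barely moves it,
// whereas a single measurement would consist of nothing but that run.
double median(std::vector<double> v) {
    assert(!v.empty());
    std::sort(v.begin(), v.end());
    std::size_t mid = v.size() / 2;
    return (v.size() % 2 == 1) ? v[mid] : 0.5 * (v[mid - 1] + v[mid]);
}
```

Something like median(sample_timings(f, 100)) then gives a figure that a
single stray slow run cannot distort, which one system.time() call cannot
promise.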

| vs.
| 
| EX 2
| 
| library(Rcpp)
| library(inline)
| src <- 'int n = as<int>(N); 
|             int k = as<int>(K); 
| 	    NumericMatrix numM(n,k); 

That is a new allocation that the code above does not do. How could this be faster?

| 	    for(unsigned int i = 0; i < numM.nrow(); ++i) {
| 		Rcpp::RNGScope scope;

As above.

| 	        numM.row(i) = Rcpp::rnorm(numM.ncol());
| 	    }
| 	    return numM;'
| randomN2 <- cxxfunction(signature(N = "numeric", K = "numeric"), body = src, plugin = "Rcpp", verbose = TRUE)
| system.time(m <- randomN2(100000,10000))

As above.

In either case I would expect the _fixed cost of the rnorm() call_ to dominate
your timings.  

If you want to time matrix access, time matrix access. Do not confound it
with a second, expensive operation.
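
A hedged sketch of how the two costs could be pulled apart, in plain C++
with the standard library only (not Rcpp): the same row-wise loop is run
once writing a constant and once writing a normal draw. The Matrix struct,
fill_constant(), fill_normal() and seconds() are hypothetical stand-ins
invented for this illustration, and std::normal_distribution is used in
place of R's rnorm().

```cpp
#include <cassert>
#include <chrono>
#include <cstddef>
#include <random>
#include <vector>

// Plain column-major n x k buffer, the same layout R uses for a matrix.
// A stand-in for Rcpp::NumericMatrix, not the Rcpp class itself.
struct Matrix {
    std::size_t nrow, ncol;
    std::vector<double> data;
    Matrix(std::size_t n, std::size_t k) : nrow(n), ncol(k), data(n * k, 0.0) {}
    double& operator()(std::size_t i, std::size_t j) {
        assert(i < nrow && j < ncol);
        return data[i + j * nrow];
    }
};

// Row-wise writes of a constant: this measures matrix access only.
void fill_constant(Matrix& m, double value) {
    for (std::size_t i = 0; i < m.nrow; ++i)
        for (std::size_t j = 0; j < m.ncol; ++j)
            m(i, j) = value;
}

// Same access pattern, but every element also costs one normal draw.
void fill_normal(Matrix& m, std::mt19937& gen) {
    std::normal_distribution<double> dist(0.0, 1.0);
    for (std::size_t i = 0; i < m.nrow; ++i)
        for (std::size_t j = 0; j < m.ncol; ++j)
            m(i, j) = dist(gen);
}

// Wall-clock seconds for one call of f(); repeat and summarise in real use.
template <typename F>
double seconds(F f) {
    auto t0 = std::chrono::steady_clock::now();
    f();
    auto t1 = std::chrono::steady_clock::now();
    return std::chrono::duration<double>(t1 - t0).count();
}
```

Timing fill_constant() and fill_normal() on the same Matrix, each over
repeated calls, shows how much of the total is the random number generation
and how much is the write into the matrix.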
 
| In the first function we allocate memory in R and fill it in C++ inside the loop. In the second example we allocate memory in C++, fill it inside the loop and assign the matrix to an R object. 
| 
| My question was whether there is a performance difference between using R memory and using C++ memory that is then assigned to R. My concern was the interface between C++ and R, which is used very often inside the loop of EX1. But it seems that both versions perform quite similarly. The only difference we can see comes from adding the time for allocating the R matrix in EX1:
| 
| timing <- function() {
| 	m <- matrix(0, 100000,1000)
| 	m <- randomN(m)
| }
| 
| So the result would be: for the fastest performance in these examples, use EX2. Create a NumericMatrix in C++, fill it inside the loop and assign it to an R object.

Now you are conflating an R+C++ operation with a pure C++ operation. Should you
not compare, time _and report here_ the timings of both?
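
One way such a comparison could be set up, sketched in plain C++ with the
standard library only. It assumes a std::vector<double> as a stand-in for
the matrix storage; fill_in_place(), allocate_and_fill(), Timings and
compare() are hypothetical names for this example. The only point is that
both variants are charged for one allocation and one fill, and that both
totals are reported side by side. It does not measure the overhead of the
call from R into C++.

```cpp
#include <cassert>
#include <chrono>
#include <cstddef>
#include <vector>

// Variant A (as in EX1): the caller owns the storage, we only overwrite it.
void fill_in_place(std::vector<double>& buf) {
    for (std::size_t i = 0; i < buf.size(); ++i)
        buf[i] = static_cast<double>(i);
}

// Variant B (as in EX2): allocate n * k doubles here, fill and return them.
std::vector<double> allocate_and_fill(std::size_t n, std::size_t k) {
    std::vector<double> buf(n * k);
    fill_in_place(buf);
    return buf;
}

struct Timings {
    double caller_allocates;  // total seconds, allocation done by the caller
    double callee_allocates;  // total seconds, allocation done in the function
    double checksum;          // uses the results so the work is not optimised away
};

// Time both variants `reps` times, allocation included in each.
Timings compare(std::size_t n, std::size_t k, int reps) {
    assert(n > 0 && k > 0 && reps > 0);
    using clk = std::chrono::steady_clock;
    using sec = std::chrono::duration<double>;
    Timings t{0.0, 0.0, 0.0};
    for (int r = 0; r < reps; ++r) {
        auto t0 = clk::now();
        std::vector<double> a(n * k);  // the allocation EX1 leaves to the caller
        fill_in_place(a);
        auto t1 = clk::now();
        std::vector<double> b = allocate_and_fill(n, k);
        auto t2 = clk::now();
        t.caller_allocates += sec(t1 - t0).count();
        t.callee_allocates += sec(t2 - t1).count();
        t.checksum += a.back() + b.back();
    }
    return t;
}
```

Printing caller_allocates next to callee_allocates for a few sizes is the
kind of table that would let readers of the list judge the claim
themselves.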

| Now to your little note at the end of your mail:
| 
| I agree that my questions sometimes lack adequate follow-up, in the form of solutions and/or a dialogue on the issues raised. I will give more feedback in the future. In my defence I would like to say that I am very often occupied with a bunch of work at my institute that is (regrettably) not related to Rcpp. I really want to find more time to work in the fields I like the most (programming and statistics). So it often takes me much longer to read into some code, as other duties keep me from doing things immediately. Hence, my disappearance was never meant to be disrespectful to the members who make an effort to answer my questions and contribute to the list. I also have some things on my todo-list which I want to pass on to the community, and it bothers me that I do not find the time to do it immediately (I also do not understand how you find the time for all of this, as I guess you are working for a company).

It was just a hint, you can of course do as you please. 

But if over time your only interaction with the list is "taking", readers may
well be less and less inclined to solve your problems for you when they never
get anything back in return. Time will tell.

Dirk 

-- 
Dirk Eddelbuettel | edd at debian.org | http://dirk.eddelbuettel.com

