[Rcpp-devel] How to modifying the elements of a list in the global environment?

Dirk Eddelbuettel edd at debian.org
Fri Oct 19 13:42:14 CEST 2012


On 18 October 2012 at 21:54, Davor Cubranic wrote:
| OK, 'sweep' is a fair bit (15x) slower than the Rcpp code Giovanni 
| wrote, but you still should be using the return value of the C++ 
| function to get the result back to R, rather than writing directly into 
| function arguments.
| 
| Here is a simple benchmark, comparing Giovanni's 'f_cpp' function with 
| my solution using 'sweep', with N=10000, M=500, K=10:
| 
|  > XX <- array(x, dim=c(N, M, K))
|  > benchmark(f_cpp=f_cpp(x, mu), sweep=sweep(XX, 2:3, mu),
|              columns=c('test', 'relative', 'elapsed'),
|              order='relative')
|     test relative  elapsed
| 1 f_cpp    1.000   69.306
| 2 sweep   15.269 1058.241
| 
| I would have also used f_R, but it runs out of memory with these dimensions.

Good idea to benchmark. My "cleaner" approach doesn't shine, it is slower as
Giovanni feared but I still think his (faster!!) approach is distasteful.

R> benchmark(f_cpp=f_cpp(x, mu), sweep=sweep(XX, 2:3, mu), frcpp=f(x, mu, workspace),
+           columns=c('test', 'relative', 'elapsed'),
+           order='relative')
   test relative elapsed
1 f_cpp    1.000   8.947
3 frcpp    2.372  21.222
2 sweep   33.478 299.525
R> 

What I would really do here is to create a simple struct or rather class that
upon initialization creates the workspace and holds it. 

Doesn't fit as easily in the inline paradigm though.

Dirk
 
| Davor
| 
| 
| On 12-10-18 08:02 PM, Davor Cubranic wrote:
| > Giovanni,
| >
| > Anything you pass from R into your C++ routine should be considered
| > read-only. It's typically a bad idea to do in-place updates of the R
| > object you passed in as an argument.
| >
| > In addition, I doubt that the code you wrote (addition/subtraction of
| > matrix columns) is going to be that much faster in C++ over R simply
| > because R already does those operations with optimized C-level code.
| > Maybe not as you wrote it, but you can use available R's functionality a
| > lot more effectively with something like this:
| >
| >    sweep(array(x, dim=c(N, M, K)), 2:3, mu)
| >
| > This should do exactly what you're trying to do with your C++ function,
| > except storing it as a 3D array instead of a list of 2D matrices. (It's
| > more efficient and works better with everything else in R.)
| >
| > If it's the large number of iterations in your Gibbs sampler that you're
| > worried about, then write the entire algorithm in C++ and return the
| > final result.
| >
| > Davor
| >
| >
| > On 12-10-18 07:28 PM, Giovanni Petris wrote:
| >> Thanks Dirk!
| >>
| >> By the way, I just figured out a working solution for what I wanted to
| >> achieve - 'code_f' below.
| >>
| >> Best,
| >> Giovanni
| >> =====
| >>
| >> ### Small reproducible example
| >> K <- 2; N <- 5; M <- 3
| >> x <- matrix(rnorm(N * M), N, M)
| >> mu <- matrix(rnorm(M * K, mean = 1 : K), M, K, TRUE)
| >>
| >> ### What I would do in R
| >> f_R <- function() {
| >>      ## center observations in 'x' using columns of 'mu'
| >>      w <- lapply(1L:K, function(i) x - rep(mu[, i], each = N))
| >>      ## eventually more stuff here...
| >>      w
| >> }
| >>
| >> ### For the Rcpp part, set aside working space in the global environment
| >> workspace <- lapply(1:K, function(k) matrix(0.0, N, M))
| >>
| >> code_f <- '
| >>      Rcpp::NumericMatrix cpp_x(x);
| >>      Rcpp::NumericMatrix cpp_mu(mu);
| >>      int K = cpp_mu.ncol();
| >>      int M = cpp_mu.nrow();
| >>      int N = cpp_x.nrow();
| >>      Environment glob = Environment::global_env();
| >>      List cpp_work(glob.get("workspace"));
| >>      NumericMatrix tmpMatrix;
| >>
| >>      for (int k=0; k < K; k++) {
| >>          tmpMatrix = as<SEXP>(cpp_work[k]);
| >>          for (int i=0; i < M; i++)
| >>              tmpMatrix(_, i) = cpp_x(_, i) - cpp_mu(i, k);
| >>      }
| >>
| >>      return R_NilValue;
| >> '
| >> ## library(Rcpp); library(inline)
| >> f_cpp <- cxxfunction(signature(x = "numeric", mu = "numeric"), code_f,
| >> plugin = "Rcpp")
| >>
| >> ## try it
| >> f_cpp(x, mu)
| >> all.equal(workspace, f_R())
| >>
| >>
| >> ________________________________________
| >> From: Dirk Eddelbuettel [edd at debian.org]
| >> Sent: Thursday, October 18, 2012 9:15 PM
| >> To: Giovanni Petris
| >> Cc: Dirk Eddelbuettel; rcpp-devel at lists.r-forge.r-project.org
| >> Subject: RE: [Rcpp-devel] How to modifying the elements of a list in
| >> the        global  environment?
| >>
| >> On 19 October 2012 at 01:32, Giovanni Petris wrote:
| >> | Hi Dirk,
| >> |
| >> | Thank you for the quick reply. I will look for more examples on the
| >> net.
| >> |
| >> | About your suggestion of allocating scrap space inside the C++
| >> routine, am I wrong to think that when the matrices are large and the
| >> function is called repeatedly within a Gibbs sampler loop, this is not
| >> a very efficient approach?
| >>
| >> Yes, if you want to use it several times you can pass it around for
| >> reuse.
| >> As I mentioned, a reference is just a pointer and hence unrelated to
| >> the size
| >> of the object.
| >>
| >> Dirk
| >>
| >> | Thanks again!
| >> |
| >> | Best,
| >> | Giovanni
| >> |
| >> | ________________________________________
| >> | From: Dirk Eddelbuettel [edd at debian.org]
| >> | Sent: Thursday, October 18, 2012 8:06 PM
| >> | To: Dirk Eddelbuettel
| >> | Cc: Giovanni Petris; rcpp-devel at lists.r-forge.r-project.org
| >> | Subject: Re: [Rcpp-devel] How to modifying the elements of a list in
| >> the        global  environment?
| >> |
| >> | Giovanni,
| >> |
| >> | On 18 October 2012 at 17:46, Dirk Eddelbuettel wrote:
| >> | | | > code_f <- '
| >> | | | +     Rcpp::NumericMatrix cpp_x(x);
| >> | | | +     Rcpp::NumericMatrix cpp_mu(mu);
| >> | | | +     int K = cpp_mu.ncol();
| >> | | | +     int M = cpp_mu.nrow();
| >> | | | +     int N = cpp_x.nrow();
| >> | | | +     Environment glob = Environment::global_env();
| >> | | | +     List cpp_work(glob.get("workspace"));
| >> | | | +
| >> | | | +     for (int k=0; k < K; k++) {
| >> | | | +         cpp_work[k] = clone(cpp_x);
| >> | | | +         for (int i=0; i < M; i++)
| >> | | | +             cpp_work(_, i) = cpp_work(_, i) - cpp_mu(i, k);
| >> |
| >> | That can't work.  You have the types confused.
| >> |
| >> | You could just pass the list object down, and then pick from it.
| >> Working
| >> | examples for that are in the list archives, in the RcppExamples
| >> package, and
| >> | in other places on the net.
| >> |
| >> | Or, as I suggested, just allocate the scrap space inside the C++
| >> routine.
| >> |
| >> | Hth,  Dirk
| >> |
| >> | --
| >> | Dirk Eddelbuettel | edd at debian.org | http://dirk.eddelbuettel.com
| >>
| >> --
| >> Dirk Eddelbuettel | edd at debian.org | http://dirk.eddelbuettel.com
| >> _______________________________________________
| >> Rcpp-devel mailing list
| >> Rcpp-devel at lists.r-forge.r-project.org
| >> https://lists.r-forge.r-project.org/cgi-bin/mailman/listinfo/rcpp-devel
| >>
| >
| > _______________________________________________
| > Rcpp-devel mailing list
| > Rcpp-devel at lists.r-forge.r-project.org
| > https://lists.r-forge.r-project.org/cgi-bin/mailman/listinfo/rcpp-devel
| 
| _______________________________________________
| Rcpp-devel mailing list
| Rcpp-devel at lists.r-forge.r-project.org
| https://lists.r-forge.r-project.org/cgi-bin/mailman/listinfo/rcpp-devel

-- 
Dirk Eddelbuettel | edd at debian.org | http://dirk.eddelbuettel.com  


More information about the Rcpp-devel mailing list