[Rcpp-devel] receiving R matrices: question about RcppArmadillo example fastlm

Douglas Bates bates at stat.wisc.edu
Thu Jan 10 18:19:55 CET 2013


The important question is whether the as<arma::mat> copies the contents of
the matrix from the SEXP passed by R.  Creating an Rcpp::NumericMatrix does
not copy the contents and converting the NumericMatrix to an arma::mat with
that trailing 'false' also just copies the pointer to the contents, not the
contents itself.  I'm not sure what the "as" method does.

I prefer the approach in RcppEigen where you have two different types of
objects with "as" methods, the Eigen::MatrixXd object (copies the contents)
and the Eigen::Map<Eigen::MatrixXd> object which only copies the pointer.
 I generally declare the objects created from SEXP's as

const Eigen::Map<Eigen::MatrixXd>  X(Rcpp::as< ...>(Xs));

so the compiler can take responsibility for ensuring that I don't
unintentionally modify the contents.

There are several examples in the RcppEigen vignette that Dirk and I wrote.




On Wed, Jan 9, 2013 at 11:58 PM, Paul Johnson <pauljohn32 at gmail.com> wrote:

> Hello, Rcpp land.
>
> I'm still collecting idioms and examples for the usage of Rcpp. Today
> I'm looking at ways that people manage the importation of R matrices.
>
> Consider fasLm.r file in the RcppArmadillo package's examples folder.
> It uses this this three-step process to bring in a matrix from R. This
> leads up to the fLmTwoCasts function:
>
> src <- '
>     Rcpp::NumericMatrix Xr(Xs);
>     Rcpp::NumericVector yr(ys);
>     int n = Xr.nrow(), k = Xr.ncol();
>     arma::mat X(Xr.begin(), n, k, false);
>     arma::colvec y(yr.begin(), yr.size(), false);
>    ...
>
> It seems to waste some effort. It allocates Xr, then an iterator, and
> then writes it into X. 3 objects to get one?
>
> If I were writing this, I would take the more direct Rcpp::as route,
> like the varSimulation.r example does
> in the same folder:
>
> src <- '
>     arma::mat X = Rcpp::as<arma::mat>(Xs);
>     arma::colvec y = Rcpp::as<arma::colvec>(ys);
>     int n = X.n_rows; int k = X.n_cols;
> ...
>
> I tested this, it gives the exact same answers.
>
> I was certain it would be quicker. It creates fewer objects, it is
> more direct.  But, as far as I can see in a simple test, the Rcpp::as
> way is not faster.  If anything, it is slower, by a minute fraction.
>
> I really thought it would be faster, though, and lighter on memory
> usage. What do you think? I'm looking at as.h in the Rcpp code to try
> to figure it out. But, wow! It that complicated, or what?
>
>
> Here's my benchmark example, which I include only as evidence that I
> really tried before asking here.
>
> ## built from RcppArmadillo examples/fastLm.r
> library(inline)
> library(rbenchmark)
>
> src <- '
>     Rcpp::NumericMatrix Xr(Xs);
>     Rcpp::NumericVector yr(ys);
>     int n = Xr.nrow(), k = Xr.ncol();
>     arma::mat X(Xr.begin(), n, k, false);
>     arma::colvec y(yr.begin(), yr.size(), false);
>     int df = n - k;
>
>     // fit model y ~ X, extract residuals
>     arma::colvec coef = arma::solve(X, y);
>     arma::colvec res  = y - X*coef;
>
>     double s2 = std::inner_product(res.begin(), res.end(),
>                                    res.begin(), 0.0)/df;
>     // std.errors of coefficients
>     arma::colvec sderr = arma::sqrt(s2 *
>        arma::diagvec(arma::pinv(arma::trans(X)*X)));
>
>     return Rcpp::List::create(Rcpp::Named("coefficients")=coef,
>                               Rcpp::Named("stderr")      =sderr,
>                               Rcpp::Named("df")          =df);
> '
>
> fLmTwoCasts <- cxxfunction(signature(Xs="numeric", ys="numeric"),
>                            src, plugin="RcppArmadillo")
>
>
> src <- '
>     arma::mat X = Rcpp::as<arma::mat>(Xs);
>     arma::colvec y = Rcpp::as<arma::colvec>(ys);
>     int n = X.n_rows; int k = X.n_cols;
>
>     int df = n - k;
>
>     // fit model y ~ X, extract residuals
>     arma::colvec coef = arma::solve(X, y);
>     arma::colvec res  = y - X*coef;
>
>     double s2 = std::inner_product(res.begin(), res.end(),
>                                    res.begin(), 0.0)/df;
>     // std.errors of coefficients
>     arma::colvec sderr = arma::sqrt(s2 *
>        arma::diagvec(arma::pinv(arma::trans(X)*X)));
>
>     return Rcpp::List::create(Rcpp::Named("coefficients")=coef,
>                               Rcpp::Named("stderr")      =sderr,
>                               Rcpp::Named("df")          =df);
> '
>
> pjTwoCasts <- cxxfunction(signature(Xs="numeric", ys="numeric"),
>                            src, plugin="RcppArmadillo")
>
>
> N <- 10000
>
> X <- cbind(1, rnorm(N), rpois(N, lambda=0.8), rnorm(N), rnorm(N),
> rnorm(N), rnorm(N), rnorm(N), rnorm(N), rnorm(N), rnorm(N), rnorm(N),
> rnorm(N), rnorm(N))
> y <- rnorm(N)
>
> res <- benchmark(
>     fLmTwoCasts(X, y),
>     pjTwoCasts(X, y),
>     columns = c("test", "replications", "relative",
>     "elapsed", "user.self", "sys.self"),
>     order="relative",
>     replications=5000)
>
> print(res)
>
> --
> Paul E. Johnson
> Professor, Political Science      Assoc. Director
> 1541 Lilac Lane, Room 504      Center for Research Methods
> University of Kansas                 University of Kansas
> http://pj.freefaculty.org               http://quant.ku.edu
> _______________________________________________
> Rcpp-devel mailing list
> Rcpp-devel at lists.r-forge.r-project.org
> https://lists.r-forge.r-project.org/cgi-bin/mailman/listinfo/rcpp-devel
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.r-forge.r-project.org/pipermail/rcpp-devel/attachments/20130110/40748167/attachment.html>


More information about the Rcpp-devel mailing list