[Rcpp-devel] receiving R matrices: question about RcppArmadillo example fastlm
Paul Johnson
pauljohn32 at gmail.com
Thu Jan 10 06:58:56 CET 2013
Hello, Rcpp land.
I'm still collecting idioms and examples for the usage of Rcpp. Today
I'm looking at ways that people manage the importation of R matrices.
Consider fasLm.r file in the RcppArmadillo package's examples folder.
It uses this this three-step process to bring in a matrix from R. This
leads up to the fLmTwoCasts function:
src <- '
Rcpp::NumericMatrix Xr(Xs);
Rcpp::NumericVector yr(ys);
int n = Xr.nrow(), k = Xr.ncol();
arma::mat X(Xr.begin(), n, k, false);
arma::colvec y(yr.begin(), yr.size(), false);
...
It seems to waste some effort. It allocates Xr, then an iterator, and
then writes it into X. 3 objects to get one?
If I were writing this, I would take the more direct Rcpp::as route,
like the varSimulation.r example does
in the same folder:
src <- '
arma::mat X = Rcpp::as<arma::mat>(Xs);
arma::colvec y = Rcpp::as<arma::colvec>(ys);
int n = X.n_rows; int k = X.n_cols;
...
I tested this, it gives the exact same answers.
I was certain it would be quicker. It creates fewer objects, it is
more direct. But, as far as I can see in a simple test, the Rcpp::as
way is not faster. If anything, it is slower, by a minute fraction.
I really thought it would be faster, though, and lighter on memory
usage. What do you think? I'm looking at as.h in the Rcpp code to try
to figure it out. But, wow! It that complicated, or what?
Here's my benchmark example, which I include only as evidence that I
really tried before asking here.
## built from RcppArmadillo examples/fastLm.r
library(inline)
library(rbenchmark)
src <- '
Rcpp::NumericMatrix Xr(Xs);
Rcpp::NumericVector yr(ys);
int n = Xr.nrow(), k = Xr.ncol();
arma::mat X(Xr.begin(), n, k, false);
arma::colvec y(yr.begin(), yr.size(), false);
int df = n - k;
// fit model y ~ X, extract residuals
arma::colvec coef = arma::solve(X, y);
arma::colvec res = y - X*coef;
double s2 = std::inner_product(res.begin(), res.end(),
res.begin(), 0.0)/df;
// std.errors of coefficients
arma::colvec sderr = arma::sqrt(s2 *
arma::diagvec(arma::pinv(arma::trans(X)*X)));
return Rcpp::List::create(Rcpp::Named("coefficients")=coef,
Rcpp::Named("stderr") =sderr,
Rcpp::Named("df") =df);
'
fLmTwoCasts <- cxxfunction(signature(Xs="numeric", ys="numeric"),
src, plugin="RcppArmadillo")
src <- '
arma::mat X = Rcpp::as<arma::mat>(Xs);
arma::colvec y = Rcpp::as<arma::colvec>(ys);
int n = X.n_rows; int k = X.n_cols;
int df = n - k;
// fit model y ~ X, extract residuals
arma::colvec coef = arma::solve(X, y);
arma::colvec res = y - X*coef;
double s2 = std::inner_product(res.begin(), res.end(),
res.begin(), 0.0)/df;
// std.errors of coefficients
arma::colvec sderr = arma::sqrt(s2 *
arma::diagvec(arma::pinv(arma::trans(X)*X)));
return Rcpp::List::create(Rcpp::Named("coefficients")=coef,
Rcpp::Named("stderr") =sderr,
Rcpp::Named("df") =df);
'
pjTwoCasts <- cxxfunction(signature(Xs="numeric", ys="numeric"),
src, plugin="RcppArmadillo")
N <- 10000
X <- cbind(1, rnorm(N), rpois(N, lambda=0.8), rnorm(N), rnorm(N),
rnorm(N), rnorm(N), rnorm(N), rnorm(N), rnorm(N), rnorm(N), rnorm(N),
rnorm(N), rnorm(N))
y <- rnorm(N)
res <- benchmark(
fLmTwoCasts(X, y),
pjTwoCasts(X, y),
columns = c("test", "replications", "relative",
"elapsed", "user.self", "sys.self"),
order="relative",
replications=5000)
print(res)
--
Paul E. Johnson
Professor, Political Science Assoc. Director
1541 Lilac Lane, Room 504 Center for Research Methods
University of Kansas University of Kansas
http://pj.freefaculty.org http://quant.ku.edu
More information about the Rcpp-devel
mailing list