[Rcpp-devel] Significant difference in performance when computing Ax
Avraham Adler
avraham.adler at gmail.com
Sun Apr 27 20:34:10 CEST 2014
I feel foolish now. I forgot to note that I build R using a Win64
version of OpenBLAS (see
<http://www.avrahamadler.com/2014/04/20/r-3-1-0-openblas-speed-comparisons/>
for more). I'm certain that the speed in tcrossprod is coming from an
optimized compile of R and BLAS. My apologies for any confusion.
Avi
On Sun, Apr 27, 2014 at 12:44 PM, Søren Højsgaard <sorenh at math.aau.dk> wrote:
> I also tried the tcrossprod but with results very different from what you
> report. I wonder why???
>
> I also used RcppArmadillo (mvprod3 below) which performs similar to mvprod2.
>
>
>
> Cheers
>
> Søren
>
>
>
>> all.equal(mvprod1(A,x), mvprod2(A,x), mvprod3(A,x),
>
> + A%*%x, tcrossprod(x, A))
>
> [1] TRUE
>
>
>
>> library(microbenchmark)
>
>
>
>> microbenchmark(mvprod1(A,x), mvprod2(A,x), mvprod3(A,x), A%*%x,
>
> + tcrossprod(x,A))
>
> Unit: milliseconds
>
> expr min lq median uq max neval
>
> mvprod1(A, x) 35.854102 36.151495 36.362352 36.658112 38.680147 100
>
> mvprod2(A, x) 4.409340 4.429632 4.451791 4.522232 4.975201 100
>
> mvprod3(A, x) 2.965528 2.977657 2.988386 3.043900 3.270617 100
>
> A %*% x 8.506591 8.566303 8.637910 8.752902 9.233395 100
>
> tcrossprod(x, A) 36.787099 37.232138 37.695370 41.378139 43.208443 100
>
>
>
>
>
> // [[Rcpp::export]]
>
> arma::vec mvprod3(arma::mat& A, arma::vec& x ){
>
> return A*x;
>
> }
>
>
>
>> sessionInfo()
>
> R version 3.1.0 Patched (2014-04-15 r65398)
>
> Platform: x86_64-w64-mingw32/x64 (64-bit)
>
>
>
> locale:
>
> [1] LC_COLLATE=Danish_Denmark.1252 LC_CTYPE=Danish_Denmark.1252
> LC_MONETARY=Danish_Denmark.1252
>
> [4] LC_NUMERIC=C LC_TIME=Danish_Denmark.1252
>
>
>
> attached base packages:
>
> [1] stats graphics grDevices utils datasets methods base
>
>
>
> other attached packages:
>
> [1] microbenchmark_1.3-0 RcppArmadillo_0.4.200.0 Rcpp_0.11.1
> devtools_1.5
>
> [5] shTools_1.0 markdown_0.6.5 knitr_1.5
>
>
>
> loaded via a namespace (and not attached):
>
> [1] compiler_3.1.0 digest_0.6.4 evaluate_0.5.3 formatR_0.10 httr_0.3
> memoise_0.1 parallel_3.1.0
>
> [8] RCurl_1.95-4.1 stringr_0.6.2 tools_3.1.0 whisker_0.3-2
>
>
>
>
>
>
>
>
>
>
>
> From: Avraham Adler [mailto:avraham.adler at gmail.com]
> Sent: 25. april 2014 19:43
> To: Søren Højsgaard
> Cc: rcpp-devel at lists.r-forge.r-project.org
> Subject: Re: [Rcpp-devel] Significant difference in performance when
> computing Ax
>
>
>
> There is actually one other command to test, `crossprod`, or in this case
> `tcrossprod(x, A)`.
>
> Following your code, first run and ensure TRUE is returned.
>
> d <- 2000
> A <- matrix(1.0*(1:d^2),nrow=d)
> x <- 1.0*(1:d)
> all.equal(A%*%x, tcrossprod(x, A), mvprod1(A,x), mvprod2(A,x))
>
>
>
> Now, using an i7-3740QM 2.7Ghz CPU with 8GB of RAM as a testbed on Windows 7
> (with R & Rcpp compiled from source throwing "-march=core-avx-i -O3
> -std=gnu++0x --param l1-cache-line-size=64 --param l1-cache-size=32 --param
> l2-cache-size=256" FWIW) I get the following timings:
>
> microbenchmark(A%*%x, tcrossprod(x, A), mvprod1(A, x), mvprod2(A, x), times
> = 1000L, control = list(order = 'block'))
>
>
>
>
>
>
> Unit: milliseconds
>
> expr min lq median uq max neval
>
> A %*% x 13.183992 13.242965 13.269027 13.317917 16.680117 1000
>
> tcrossprod(x, A) 6.967139 6.991869 7.004424 7.020214 9.795921 1000
>
> mvprod1(A, x) 43.089952 43.134848 43.163953 43.489824 45.129644 1000
>
> mvprod2(A, x) 4.564480 4.883693 4.911848 4.928208 7.435495 1000
>
>
>
> So the crossprod function is decently optimized as it is, but the prod2 is
> still faster.
>
> Thanks for idea and post!
>
> Avi
>
>
>
>
>
> On Fri, Apr 25, 2014 at 12:47 PM, Søren Højsgaard <sorenh at math.aau.dk>
> wrote:
>
> Dear all,
>
> When forming the matrix-vector-product one can form the inner products
> between As rows and x (mvprod1 below) or form a linear combination of As
> columns (mvprod2 below). The difference in computing time is striking (at
> least for me) for large matrices and it really illustrates the “caching
> issue” (I assume):
>
>
>
>> microbenchmark(mvprod1(A,x), mvprod2(A,x), A%*%x)
>
> Unit: milliseconds
>
> expr min lq median
> uq max neval
>
> mvprod1(A, x) 35.783812 36.310956 36.600186 36.991113 38.922889 100
>
> mvprod2(A, x) 4.408892 4.436882 4.475368 4.568668 5.477176 100
>
> A %*% x 8.514091 8.592230 8.734978 8.811951 9.738653
> 100
>
>
>
> Just wanted to share this!
>
> Cheers
>
> Søren
>
>
>
>
>
>
>
> #include <Rcpp.h>
>
> using namespace Rcpp;
>
>
>
> //[[Rcpp::export]]
>
> NumericVector mvprod1 (NumericMatrix A, NumericVector x){
>
> int nrA=A.nrow(), ncA=A.ncol(), i, j;
>
> NumericVector y(nrA);
>
> for (i=0; i<nrA; ++i){
>
> for (j=0; j<ncA; ++j){
>
> y[i] += A(i,j)*x[j];
>
> }
>
> }
>
> return y;
>
> }
>
>
>
> //[[Rcpp::export]]
>
> NumericVector mvprod2 (NumericMatrix A, NumericVector x){
>
> int nrA=A.nrow(), ncA=A.ncol(), i, j;
>
> NumericVector y(nrA);
>
> for (j=0; j<ncA; ++j){
>
> for (i=0; i<nrA; ++i){
>
> y[i] += A(i,j)*x[j];
>
> }
>
> }
>
> return y;
>
> }
>
>
>
>
>
>
>
> /*** R
>
> d <- 5
>
> A <- matrix(1.0*(1:d^2),nrow=d)
>
> x <- 1.0*(1:d)
>
>
>
> A%*%x
>
> mvprod1(A,x)
>
> mvprod2(A,x)
>
>
>
> d <- 2000
>
> A <- matrix(1.0*(1:d^2),nrow=d)
>
> x <- 1.0*(1:d)
>
>
>
> library(microbenchmark)
>
> microbenchmark(mvprod1(A,x), mvprod2(A,x), A%*%x)
>
> */
>
>
> _______________________________________________
> Rcpp-devel mailing list
> Rcpp-devel at lists.r-forge.r-project.org
> https://lists.r-forge.r-project.org/cgi-bin/mailman/listinfo/rcpp-devel
>
>
More information about the Rcpp-devel
mailing list