# [Rcpp-devel] How to increase the coding efficiency

Honglang Wang wanghonglang2008 at gmail.com
Wed Dec 5 20:09:40 CET 2012

The following is a full example although I don't know whether it's minimal
or not:

library(Rcpp)
sourceCpp("betahat_mod.cpp")

#The following is data generation.
n=200
m=20
p=2
t=runif(m*n,min=0, max=1)
X1=rnorm(m*n,0,1)
X1=as.matrix(1+2*exp(t)+X1)
X2=rnorm(m*n,0,1)
X2=as.matrix(3-4*t^2+X2)
X=cbind(X1,X2)
beta1=0.5*sin(t)
beta2=beta1
rho=0.2
sig2=1

eps=unlist(lapply(1:n,function(x){as.matrix(arima.sim(list(ar=rho),m,sd=sqrt(sig2*(1-rho^2))))}))
y=as.matrix(X1*beta1+X2*beta2+eps)

#A simple kernel function.
ker=function(x,h)
{
ans=x
lo=(abs(x)<h)
ans[lo]=(3/4)*(1-(x[lo]/h)^2)/h
ans[!lo]=0
return(ans)
}

h=0.3

#assess the time for evaluating bethat of 100 t's:
system.time((betahatt=t(apply(as.matrix(t[1:100]),1,function(x)
betahat(ker,x,X,y,t,h,m)\$betahat))))

And the .cpp file is the following:

using namespace Rcpp;

// [[Rcpp::export]]
List betahat(Function ker, double t0, NumericMatrix Xr, NumericMatrix yr,
NumericVector tr, double h, int m) {
int n = Xr.nrow(), p = Xr.ncol();
arma::mat X(Xr.begin(), n, p, false);
arma::mat y(yr.begin(), n, 1, false);
arma::colvec t(tr.begin(), tr.size(), false);
arma::mat T = X;
T.each_col() %= (t-t0)/h;
arma::mat D = arma::join_rows(X,T);
arma::vec kk =as<arma::vec>(ker(tr-t0,h));
arma::mat W = (arma::diagmat(kk))/m;
arma::mat Inv = arma::trans(D)*W*D;
arma::vec betahat = arma::inv(Inv)*arma::trans(D)*W*y;
arma::colvec betahat0(betahat.begin(),betahat.size()/2,false);
return List::create(Named("betahat") = betahat0);
}

Best wishes!

Honglang Wang

Office C402 Wells Hall
Department of Statistics and Probability
Michigan State University
1579 I Spartan Village, East Lansing, MI 48823
wangho16 at msu.edu

On Wed, Dec 5, 2012 at 4:07 AM, Christian Gunning <xian at unm.edu> wrote:

> Can you post a minimal full example?
> -Christian
>
> On Tue, Dec 4, 2012 at 8:39 PM, Honglang Wang
> <wanghonglang2008 at gmail.com> wrote:
> > Yes, the main issue for my coding is the allocation of memory.  And I
> have
> > fixed one of the biggest memory allocation issue: 4000 by 4000 diagonal
> > matrix. And since I am not familiar with Rcpp and RcppArmadillo, I have
> no
> > idea how to reuse the memory. I hope I can have some materials to learn
> > this. Thanks.
> >
> >
> >>
> >> What exactly do these timings show?  A single call you your function?
> >> How many calls?
> >>
> > Here I called my function for 100 times.
> >
> >>
> >> Building on Romain's point: -- a portion of your function's runtime is
> >> in memory allocation
> >> (and you have a lot of allocations here).
> >> If you're calling your function thousands or millions of times, then
> >> it might pay to closely
> >> examine your memory allocation strategies and figure out what's
> >> temporary, for example.
> >> It looks like you're already using  copy_aux_mem = false in a number
> >> of places, but you're
> >> allocating a lot of objects -- of approx what size?
> >>
> >> For example, wouldn't this work just as well with one less allocation?
> >> arma::vec kk = t;
> >> arma::uvec q1 = arma::find(arma::abs(tp)<h);
> >> kk.elem(q1) = ((1-arma::pow(tp.elem(q1)/h,2))/h)*0.75;
> >> // done with q1.  let's reuse it.
> >> q1 = arma::find(arma::abs(tp)>=h);
> >> // was q2
> >> kk.elem(q1).zeros();
> >>
> >> You could potentially allocate memory for temporary working space in
> >> R, grab it with copy_aux_mem = false, write your temp results there,
> >> and reuse these objects in subsequent function calls.  It doesn't make
> >> sense to go to this trouble, though, if your core algorithm consumes
> >> the bulk of runtime.
> >>
> >> Have you looked on the armadillo notes r.e. inv?  Matrix inversion has
> >> O(>n^2).  You may be aided by pencil-and-paper math here.
> >> http://arma.sourceforge.net/docs.html#inv
> >>
> > Here my matrix for inverse is only 4 by 4, so I think it's ok.
> >
> >>
> >> best,
> >> Christian
> >>
> >> > Dear All,
> >> > I have tried out the first example by using RcppArmadillo, but I am
> not
> >> > sure whether the code is efficient or not. And I did the comparison of
> >> > the
> >> > computation time.
> >> >
> >> > 1) R code using for loop in R: 87.22s
> >> > 2) R code using apply: 77.86s
> >> > 3) RcppArmadillo by using for loop in C++: 53.102s
> >> > 4) RcppArmadillo together with apply in R: 47.310s
> >> >
> >> > It is kind of not so big increase. I am wondering whether I used an
> >> > inefficient way for the C++ coding:
> >>
> >>
> >>
> >> --
> >> A man, a plan, a cat, a ham, a yak, a yam, a hat, a canal – Panama!
> >
> >
>
>
>
> --
> A man, a plan, a cat, a ham, a yak, a yam, a hat, a canal – Panama!
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.r-forge.r-project.org/pipermail/rcpp-devel/attachments/20121205/c36a832e/attachment.html>