[Rcpp-devel] optimizing rcpp iterating code

Steve Bellan steve.bellan at gmail.com
Wed Jul 16 18:31:42 CEST 2014


Hi all,

I've been transferring a Markov Chain model in R to Rcpp. It's a fairly elaborate model and I've successfully replicated the results in R. But despite doing my best to read up on Rcpp and C++ to squeeze the most speed, I am not getting the gains I expected. When numcouples = 10,100, 1000, or 10^4, the increase of speed from R to Rcpp is 100X, 15X, 4X, and 2.5X faster, respectively. This seems to be because C++ time increased linearly with numcouples whereas R did not.

I thought it might have to do with the passing of a lot of data from R to C++; the real code (from which the below example was diluted) passes 2 Lists and 7 NumericMatrices all of which are very large. But to test this, I added a loop inside cplC() that just does the for(int tt) loop a 100 times in a row so that the overhead from passing objects is minimized. But this didn't help speed things up at all. 

Are there other things I can do to speed this up? Sorry for the vagueness of the question, I'm very new to both Rcpp & C++.

Thanks for your help!

Steve


// [[Rcpp::export]]
NumericMatrix cplC(const int & max_bd,
		       const NumericMatrix & pre_fprev, const NumericMatrix & pre_msurv, 
		       const List & PreActiveList,	
		       const double &  bmb) {
  // Column names for sero
  int s__ = 0; int mb_a1A = 1; int mb_a2A = 2; // 53 state variables for each individual
  int numcouples = pre_fprev.nrow(); // get number of couples from one of the input matrices
  NumericMatrix seros(numcouples, 3); // Initialize couples by state matrix
  for(int jj = 0; jj < numcouples; ++jj) seros(jj,s__) = 1; // All couples start in first state
  // Initialize transmission hazards & probabilities & joint probability of transmission & survival (*a*live)
  int numactive(1);	      // # of active couples (changes between time steps)
  NumericVector m_haz(numcouples); // intermediate couple-specific parameters
  NumericVector pmb(numcouples);
  NumericVector pmbA(numcouples);
// For each time step up until max time 
  for(int tt = 0; tt < max_bd; ++tt) { 
    NumericMatrix serosO = clone(seros); // Copy old seros (last state from which to iterate forward)
    IntegerVector active = PreActiveList[tt]; // Currently active couples,
    numactive = active.size();		 // Number of active couples
    if (min(active) < 0) stop("active < 0"); // Check indices are contained within arrays
    if (max(active) > numcouples) stop("max(active) > numcouples");
    // For each active couple
    for(IntegerVector::iterator jj = active.begin(); jj != active.end(); ++jj) {
      // ///////////////////////////////////////////////////
      // Assign temporary parameters: about 40 lines of code like these 3 lines
      m_haz(*jj) = bmb*pre_fprev(*jj,tt); // temporary parameter 1
      pmb(*jj) = 1 - exp(-m_haz(*jj)); // temporary parameter 2
      pmbA(*jj) = pmb(*jj)*pre_msurv(*jj,tt); // temporary parameter 3
      // /////////////////////////////////////////////////
      // Update state variables: about 120 lines of code like these 3 lines
      seros(*jj,s__)    = serosO(*jj,s__)* (1-pmbA(*jj));
      seros(*jj,mb_a1A)  = serosO(*jj,s__)*pmbA(*jj)*(1-pmb(*jj));
      seros(*jj,mb_a2A) = serosO(*jj,mb_a1A)*pmbA(*jj)*pmb(*jj);
    } // end couple iteration
  }   // end month or pre-couple duration iteration
return seros;
}


/*** R

numcouples <- 10^c(2:5)
times <- numeric(length(numcouples))
for(cc in 1:length(numcouples)) {
    nn <- numcouples[cc]
    maxT <- 500
    examplist <- list(NA)
    for(ii in 1:maxT) examplist[[ii]] <- sample(0:(nn-1), max(101-ii,1))
    pre_fprev <- matrix(runif(nn*maxT),nn,maxT)
    pre_msurv <- matrix(runif(nn*maxT),nn,maxT)
    bmb <-  .05
    times[cc] <- system.time(cplC(max_bd = maxT, pre_fprev = pre_fprev, pre_msurv = pre_msurv, PreActiveList = examplist, bmb = bmb))[3]
}
cbind(numcouples,times, times/numcouples)

*/




More information about the Rcpp-devel mailing list