[Rcpp-commits] r356 - papers/rjournal

noreply at r-forge.r-project.org
Tue Jan 12 19:05:46 CET 2010


Author: romain
Date: 2010-01-12 19:05:46 +0100 (Tue, 12 Jan 2010)
New Revision: 356

Modified:
   papers/rjournal/EddelbuettelFrancois.tex
Log:
embryo for performance section

Modified: papers/rjournal/EddelbuettelFrancois.tex
===================================================================
--- papers/rjournal/EddelbuettelFrancois.tex	2010-01-12 16:10:54 UTC (rev 355)
+++ papers/rjournal/EddelbuettelFrancois.tex	2010-01-12 18:05:46 UTC (rev 356)
@@ -1,9 +1,10 @@
 \title{Mesh R and C++ with Rcpp}
-\author{by Romain Franc\c{c}ois and Dirk Eddelbuettel}
+\author{by Dirk Eddelbuettel and Romain Fran\c{c}ois}
 
 \maketitle
 
-\abstract{TBD}
+\abstract{
+}
 
 \section{Introduction}
 
@@ -25,19 +26,24 @@
 \pkg{Rcpp} package, in particular with respect to existing solutions
 that deal with coupling R and C++ and shows several use cases.
 
-Writing R Extensions \citep{R:exts} provides extensive documentation about the 
-various ways to couple R with code written in C. 
+Writing R Extensions \citep{R:exts} provides extensive 
+documentation about the ways to couple R with code written in C. 
 Writing such code requires both expertise and discipline from the 
-programmer. Discipline, with a large amount of bookkeeping 
-%% FIXME:  The two sentences need a rewrite
-duties around the \code{PROTECT}/\code{UNPROTECT} dance one 
-has to master the steps. Expertise, to learn and use efficiently 
-the set of macros offered by R headers. 
+programmer.
 
+% Expertise, to learn and use efficiently 
+% the set of macros offered by R headers. 
+% Discipline, with a large amount of bookkeeping 
+% %% FIXME:  The two sentences need a rewrite
+% duties around the \code{PROTECT}/\code{UNPROTECT} dance one 
+% has to master the steps. 
+
 The \pkg{Rcpp} package makes extensive use of C++ features (encapsulation, 
 constructors, destructors, operator overloading, templates) in order
 to hide the complexity of the R API --- without losing its 
-efficiency --- under the carpet of object orientation. 
+efficiency --- under the carpet of object orientation. In addition, 
+\pkg{Rcpp} takes advantage of some features of the forthcoming \code{C++0x} 
+standard, already supported by recent versions of GCC.
 
 \subsection{Background}
 
@@ -313,7 +319,7 @@
 code can be trimmed to the essential :
 
 \begin{example}
-CharacterVector ab = {"foo","bar"};
+CharacterVector ab = \{"foo","bar"\};
 \end{example}
 
 
@@ -463,9 +469,10 @@
 \ \ R_FindNamespace( mkString("stats") ) ) ;
 SEXP rnorm = PROTECT( 
 \ \ findVarInFrame( stats, install("rnorm") ) ) ;
-SEXP call  = PROTECT( LCONS( rnorm, 
-\ \ CONS(ScalarInteger(10), 
-\ \ \ \ CONS(ScalarReal(100.0), R_NilValue)))) ;
+SEXP call  = PROTECT( 
+\ \ LCONS( rnorm, 
+\ \ \ \ CONS(ScalarInteger(10), 
+\ \ \ \ \ \ CONS(ScalarReal(100.0), R_NilValue)))) ;
 SET_TAG( CDDR(call), install("sd") ) ;
 SEXP res = PROTECT( eval( call, R_GlobalEnv ) );
 UNPROTECT(4) ;
@@ -487,16 +494,115 @@
 \end{example}
 
 
+\section{Performance/Limitations}
 
-\section{Performance}
+In this section, we show that the convenience of C++ features can come
+at a price in terms of performance. As users of \pkg{Rcpp}, we do not
+want to trade performance for comfort.
 
+As part of the redesign of \pkg{Rcpp}, data copying is kept to the
+absolute minimum: the \code{RObject} class and all its derived
+classes are just containers for a \code{SEXP}. We let R perform
+all memory management and access data through the macros or functions
+offered by the standard R API. In contrast, some data structures
+of the classic \pkg{Rcpp} interface, such as the templated 
+\code{RcppVector}, used containers offered by the Standard Template
+Library to hold the data, which required copying the data 
+from R to C++ and back.
 
-\section{Summary}
+We now illustrate how to get the best performance out of \pkg{Rcpp}.
+The classic \pkg{Rcpp} translation of the convolve example 
+from \cite{R:exts} appears in section~\ref{sec:classic_rcpp}. 
 
+With the new API, the code can be written as follows: 
 
+\begin{example}
+#include <Rcpp.h>
 
+RcppExport SEXP convolve3cpp(SEXP a, SEXP b)\{
+    Rcpp::NumericVector xa(a);
+    Rcpp::NumericVector xb(b);
+    int n_xa = xa.size() ;
+    int n_xb = xb.size() ;
+    int nab = n_xa + n_xb - 1;
+    Rcpp::NumericVector xab(nab);
 
+    for (int i = 0; i < nab; i++) xab[i] = 0.0;
+    for (int i = 0; i < n_xa; i++)
+        for (int j = 0; j < n_xb; j++) 
+            xab[i + j] += xa[i] * xb[j];
 
+    return xab ;
+\}
+\end{example}
+
+At first sight, this code looks as efficient as it can be. 
+However, consider the implementation of \code{operator[]}
+for the \code{NumericVector} class: 
+
+\begin{example}
+inline double& operator[]( const int& i ) \{ 
+    return REAL(m_sexp)[i];
+\}
+\end{example}
+
+Each call to \code{operator[]} on a \code{NumericVector}
+calls the \code{REAL} macro of the R API to retrieve the pointer to the
+underlying array of \code{double}. The code in \cite{R:exts} is much 
+more parsimonious, with exactly 3 calls to the \code{REAL} macro, 
+delegating element access to pointer arithmetic, which is usually much more 
+efficient. 
+
+The \code{NumericVector} class provides two member functions, \code{begin}
+and \code{end}, that can be used to retrieve, respectively, 
+the pointer to the first element and to the element after the last element
+of the underlying array. We can revisit the code to take advantage
+of \code{begin}: 
+
+\begin{example}
+#include <Rcpp.h>
+
+RcppExport SEXP convolve4cpp(SEXP a, SEXP b) \{
+    Rcpp::NumericVector xa(a);
+    Rcpp::NumericVector xb(b);
+    int n_xa = xa.size() ;
+    int n_xb = xb.size() ;
+    int nab = n_xa + n_xb - 1;
+    Rcpp::NumericVector xab(nab);
+    
+    double* pa = xa.begin() ;
+    double* pb = xb.begin() ;
+    double* pab = xab.begin() ;
+    int i,j=0; 
+    for (i = 0; i < nab; i++) pab[i] = 0.0;
+    for (i = 0; i < n_xa; i++)
+        for (j = 0; j < n_xb; j++) 
+            pab[i + j] += pa[i] * pb[j];
+
+    return xab ;
+\}
+\end{example}
+
+The following table shows the elapsed time (in milliseconds) 
+for 1000 replicates of each function, with \code{a} and 
+\code{b} each containing 100 elements.
+    
+\begin{center}
+\begin{tabular}{cc}
+Method & elapsed time (ms) \\ 
+\hline
+R API & 34 \\
+\code{RcppVector<double>} & 353 \\
+\code{NumericVector::operator[]} & 245 \\
+\code{NumericVector::begin} & 36 \\
+\hline
+\end{tabular}
+\end{center}
+
+\section{Summary}
+
+
+
 \bibliography{EddelbuettelFrancois}
 
 \address{Dirk Eddelbuettel\\
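
Note on the performance section added in this revision: the contrast between calling operator[] for every element and fetching the pointer once through begin() can be tried without R or Rcpp. The sketch below is only an illustration under stated assumptions. MockNumericVector, convolve_indexed and convolve_hoisted are hypothetical names that do not exist in Rcpp, and the private data() member merely stands in for the REAL() lookup. It shows that both access patterns compute the same convolution. It says nothing about timings, which depend on the compiler and on the real cost of REAL().

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for Rcpp::NumericVector, NOT the real class.
// data() plays the role of the REAL() lookup: operator[] goes through it
// on every access, while begin() lets the caller fetch the pointer once.
class MockNumericVector {
public:
    explicit MockNumericVector(std::size_t n) : storage_(n, 0.0) {}
    explicit MockNumericVector(const std::vector<double>& v) : storage_(v) {}

    std::size_t size() const { return storage_.size(); }

    // Re-derives the base pointer on each call, like REAL(m_sexp)[i].
    double& operator[](std::size_t i) { return data()[i]; }

    // Returns the base pointer so that a loop can hoist the lookup.
    double* begin() { return data(); }

private:
    double* data() { return storage_.data(); }
    std::vector<double> storage_;
};

// Pattern of convolve3cpp: every element access goes through operator[].
std::vector<double> convolve_indexed(MockNumericVector& a, MockNumericVector& b) {
    assert(a.size() > 0 && b.size() > 0);
    const std::size_t na = a.size(), nb = b.size();
    MockNumericVector ab(na + nb - 1);  // elements start at 0.0
    for (std::size_t i = 0; i < na; i++)
        for (std::size_t j = 0; j < nb; j++)
            ab[i + j] += a[i] * b[j];
    return std::vector<double>(ab.begin(), ab.begin() + ab.size());
}

// Pattern of convolve4cpp: pointers are fetched once, then plain
// pointer arithmetic is used inside the loops.
std::vector<double> convolve_hoisted(MockNumericVector& a, MockNumericVector& b) {
    assert(a.size() > 0 && b.size() > 0);
    const std::size_t na = a.size(), nb = b.size();
    MockNumericVector ab(na + nb - 1);  // elements start at 0.0
    double* pa = a.begin();
    double* pb = b.begin();
    double* pab = ab.begin();
    for (std::size_t i = 0; i < na; i++)
        for (std::size_t j = 0; j < nb; j++)
            pab[i + j] += pa[i] * pb[j];
    return std::vector<double>(pab, pab + ab.size());
}
```

Calling either function on the inputs (1, 2, 3) and (0, 1, 0.5) yields (0, 1, 2.5, 4, 1.5), so the hoisted version changes how elements are reached, not what is computed.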


