[Rcpp-commits] r356 - papers/rjournal
noreply at r-forge.r-project.org
Tue Jan 12 19:05:46 CET 2010
Author: romain
Date: 2010-01-12 19:05:46 +0100 (Tue, 12 Jan 2010)
New Revision: 356
Modified:
papers/rjournal/EddelbuettelFrancois.tex
Log:
embryo for performance section
Modified: papers/rjournal/EddelbuettelFrancois.tex
===================================================================
--- papers/rjournal/EddelbuettelFrancois.tex 2010-01-12 16:10:54 UTC (rev 355)
+++ papers/rjournal/EddelbuettelFrancois.tex 2010-01-12 18:05:46 UTC (rev 356)
@@ -1,9 +1,10 @@
\title{Mesh R and C++ with Rcpp}
-\author{by Romain Franc\c{c}ois and Dirk Eddelbuettel}
+\author{by Dirk Eddelbuettel and Romain Franc\c{c}ois}
\maketitle
-\abstract{TBD}
+\abstract{
+}
\section{Introduction}
@@ -25,19 +26,24 @@
\pkg{Rcpp} package, in particular with respect to existing solutions
that deal with coupling R and C++ and shows several use cases.
-Writing R Extensions \citep{R:exts} provides extensive documentation about the
-various ways to couple R with code written in C.
+Writing R Extensions \citep{R:exts} provides extensive
+documentation about the ways to couple R with code written in C.
Writing such code requires both expertise and discipline from the
-programmer. Discipline, with a large amount of bookkeeping
-%% FIXME: The two sentences need a rewrite
-duties around the \code{PROTECT}/\code{UNPROTECT} dance one
-has to master the steps. Expertise, to learn and use efficiently
-the set of macros offered by R headers.
+programmer.
+% Expertise, to learn and use efficiently
+% the set of macros offered by R headers.
+% Discipline, with a large amount of bookkeeping
+% %% FIXME: The two sentences need a rewrite
+% duties around the \code{PROTECT}/\code{UNPROTECT} dance one
+% has to master the steps.
+
The \pkg{Rcpp} package makes extensive use of C++ features (encapsulation,
constructors, destructors, operator overloading, templates) in order
to hide the complexity of the R API --- without losing its
-efficiency --- under the carpet of object orientation.
+efficiency --- under the carpet of object orientation. In addition,
+\pkg{Rcpp} takes advantage of some features of the forthcoming \code{C++0x}
+standard, already supported by recent versions of the GCC.
\subsection{Background}
@@ -313,7 +319,7 @@
code can be trimmed to the essential :
\begin{example}
-CharacterVector ab = {"foo","bar"};
+CharacterVector ab = \{"foo","bar"\};
\end{example}
@@ -463,9 +469,10 @@
\ \ R_FindNamespace( mkString("stats") ) ) ;
SEXP rnorm = PROTECT(
\ \ findVarInFrame( stats, install("rnorm") ) ) ;
-SEXP call = PROTECT( LCONS( rnorm,
-\ \ CONS(ScalarInteger(10),
-\ \ \ \ CONS(ScalarReal(100.0), R_NilValue)))) ;
+SEXP call = PROTECT(
+\ \ LCONS( rnorm,
+\ \ \ \ CONS(ScalarInteger(10),
+\ \ \ \ \ \ CONS(ScalarReal(100.0), R_NilValue)))) ;
SET_TAG( CDDR(call), install("sd") ) ;
SEXP res = PROTECT( eval( call, R_GlobalEnv ) );
UNPROTECT(4) ;
@@ -487,16 +494,115 @@
\end{example}
+\section{Performance/Limitations}
-\section{Performance}
+In this section, we illustrate that C++ features come with a price
+in terms of performance. As users of \pkg{Rcpp}, we do not want
+to trade performance for comfort.
+As part of the redesign of \pkg{Rcpp}, data copy is kept to the
+absolute minimum: the \code{RObject} class and all its derived
+classes are just containers for a \code{SEXP}. We let R perform
+all memory management and access data through the macros or functions
+offered by the standard R API. In contrast, some data structures
+of the classic \pkg{Rcpp} interface, such as the templated
+\code{RcppVector}, used containers offered by the standard template
+library to hold the data, requiring a copy of the data
+from R to C++ and back.
-\section{Summary}
+In this section, we illustrate how to take advantage of \pkg{Rcpp}
+to get the best out of it. The classic Rcpp translation of the convolve example
+from \cite{R:exts} appears in section~\ref{sec:classic_rcpp}.
+With the new API, the code can be written as follows:
+\begin{example}
+#include <Rcpp.h>
+RcppExport SEXP convolve3cpp(SEXP a, SEXP b)\{
+ Rcpp::NumericVector xa(a);
+ Rcpp::NumericVector xb(b);
+ int n_xa = xa.size() ;
+ int n_xb = xb.size() ;
+ int nab = n_xa + n_xb - 1;
+ Rcpp::NumericVector xab(nab);
+ for (int i = 0; i < nab; i++) xab[i] = 0.0;
+ for (int i = 0; i < n_xa; i++)
+ for (int j = 0; j < n_xb; j++)
+ xab[i + j] += xa[i] * xb[j];
+ return xab ;
+\}
+\end{example}
+
+This code looks as efficient as it can be.
+However, consider the implementation of \code{operator[]}
+for the \code{NumericVector} class:
+
+\begin{example}
+inline double& operator[]( const int& i ) \{
+ return REAL(m_sexp)[i];
+\}
+\end{example}
+
+Each call to \code{operator[]} on a \code{NumericVector}
+calls the \code{REAL} macro of the R API to retrieve the pointer to the
+underlying array of \code{double}. The code in \cite{R:exts} is much
+more parsimonious, with exactly 3 calls to the \code{REAL} macro,
+delegating extraction to pointer arithmetic, which is usually much more
+efficient.
+
+The \code{NumericVector} class provides two member functions, \code{begin}
+and \code{end}, that can be used to retrieve, respectively,
+the pointer to the first element and to the element after the last element
+of the underlying array. We can revisit the code to take advantage
+of \code{begin}:
+
+\begin{example}
+#include <Rcpp.h>
+
+RcppExport SEXP convolve4cpp(SEXP a, SEXP b) \{
+ Rcpp::NumericVector xa(a);
+ Rcpp::NumericVector xb(b);
+ int n_xa = xa.size() ;
+ int n_xb = xb.size() ;
+ int nab = n_xa + n_xb - 1;
+ Rcpp::NumericVector xab(nab);
+
+ double* pa = xa.begin() ;
+ double* pb = xb.begin() ;
+ double* pab = xab.begin() ;
+ int i,j=0;
+ for (i = 0; i < nab; i++) pab[i] = 0.0;
+ for (i = 0; i < n_xa; i++)
+ for (j = 0; j < n_xb; j++)
+ pab[i + j] += pa[i] * pb[j];
+
+ return xab ;
+\}
+\end{example}
+
+The following timings show the time taken (in milliseconds)
+by 1000 replicates of each function with \code{a} and
+\code{b} containing 100 elements.
+
+\begin{center}
+\begin{tabular}{cc}
+Method & elapsed time (ms) \\
+\hline
+R API & 34 \\
+\code{RcppVector<double>} & 353 \\
+\code{NumericVector::operator[]} & 245 \\
+\code{NumericVector::begin} & 36 \\
+\hline
+\end{tabular}
+\end{center}
+
+\section{Summary}
+
+
+
\bibliography{EddelbuettelFrancois}
\address{Dirk Eddelbuettel\\
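
The pointer-hoisting idea behind convolve4cpp in the hunk above can be
tried outside R. The following is a minimal sketch and not code from
this commit: it assumes std::vector<double> as a stand-in for
Rcpp::NumericVector, with data() playing the role of the single REAL
lookup, and the function name convolve_ptr is hypothetical. It only
illustrates the loop structure; it says nothing about the timings
reported in the table.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for convolve4cpp: std::vector<double> replaces
// Rcpp::NumericVector, and data() replaces the one-off REAL() lookup.
std::vector<double> convolve_ptr(const std::vector<double>& a,
                                 const std::vector<double>& b) {
    // The result length na + nb - 1 is only meaningful for non-empty inputs.
    assert(!a.empty() && !b.empty());
    const std::size_t na = a.size();
    const std::size_t nb = b.size();

    // Zero-initialised result, so no separate clearing loop is needed.
    std::vector<double> ab(na + nb - 1, 0.0);

    // Fetch each base pointer exactly once, outside the loops.
    const double* pa = a.data();
    const double* pb = b.data();
    double* pab = ab.data();

    // Inner loops use plain pointer indexing only.
    for (std::size_t i = 0; i < na; ++i)
        for (std::size_t j = 0; j < nb; ++j)
            pab[i + j] += pa[i] * pb[j];

    return ab;
}
```

Calling convolve_ptr({1.0, 2.0, 3.0}, {0.0, 1.0, 0.5}) yields a vector
of length 5, matching the n_xa + n_xb - 1 sizing used in the patch.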