[Rcpp-commits] r345 - papers/rjournal

noreply at r-forge.r-project.org noreply at r-forge.r-project.org
Mon Jan 11 21:35:22 CET 2010


Author: romain
Date: 2010-01-11 21:35:19 +0100 (Mon, 11 Jan 2010)
New Revision: 345

Added:
   papers/rjournal/FrancoisEddelbuettel.bib
Modified:
   papers/rjournal/
   papers/rjournal/FrancoisEddelbuettel.tex
   papers/rjournal/Makefile
Log:
added some content


Property changes on: papers/rjournal
___________________________________________________________________
Name: svn:ignore
   - RJwrapper.out
RJwrapper.aux
RJwrapper.log
RJwrapper.pdf


   + RJwrapper.out
RJwrapper.aux
RJwrapper.log
RJwrapper.pdf
RJwrapper.bbl
RJwrapper.blg



Added: papers/rjournal/FrancoisEddelbuettel.bib
===================================================================
--- papers/rjournal/FrancoisEddelbuettel.bib	                        (rev 0)
+++ papers/rjournal/FrancoisEddelbuettel.bib	2010-01-11 20:35:19 UTC (rev 345)
@@ -0,0 +1,31 @@
+@String{CRAN = "http://cran.r-project.org/" }
+@String{manuals = CRAN # "doc/manuals/" }
+@String{RCoreTeam = "{R Development Core Team}" }
+@String{RFoundation = "R Foundation for Statistical Computing" }
+
+@manual{R:exts,
+	author = RCoreTeam,
+    organization = RFoundation,
+    address = {Vienna, Austria},
+    year = {2009},
+	title = "Writing R extensions",
+	url = manuals # "R-exts.html"
+}
+
+@manual{R:ints,
+	author = RCoreTeam,
+    organization = RFoundation,
+    address = {Vienna, Austria},
+    year = {2009},
+	title = "R internals",
+	url = manuals # "R-ints.html"
+}
+
+@Manual{cran:inline,
+    title = {inline: Inline C, C++, Fortran function calls from R},
+    author = {Oleg Sklyar and Duncan Murdoch and Mike Smith and Dirk Eddelbuettel},
+    year = {2009},
+    note = {R package version 0.3.4},
+    url = {http://CRAN.R-project.org/package=inline},
+  }
+

Modified: papers/rjournal/FrancoisEddelbuettel.tex
===================================================================
--- papers/rjournal/FrancoisEddelbuettel.tex	2010-01-11 16:24:09 UTC (rev 344)
+++ papers/rjournal/FrancoisEddelbuettel.tex	2010-01-11 20:35:19 UTC (rev 345)
@@ -1,58 +1,331 @@
-\title{Capitalized Title Here}
+\title{Mesh R and C++ with Rcpp}
 \author{by Romain Fran\c{c}ois and Dirk Eddelbuettel}
 
 \maketitle
 
 \abstract{
-An abstract of less than 150 words.
+The \pkg{Rcpp} package provides a consistent and comprehensive set 
+of C++ classes designed to ease the coupling of highly efficient C++ code
+with R. The \code{RObject} class takes responsibility for 
+protecting its encapsulated R object (\code{SEXP}) from garbage
+collection and for releasing that protection. The \code{wrap} set of 
+functions allows wrapping many C++ built-in types and data structures 
+from the Standard Template Library into R objects. Similarly, the 
+\code{as} set of templated functions allows conversion of R objects 
+back into C++ types, such as \code{std::string}. With recent additions to the 
+\pkg{inline} package \citep{cran:inline}, 
+C++ code using the classes of the 
+\pkg{Rcpp} package can be inlined, compiled, loaded and wrapped 
+into an R function without leaving the R console. 
+This article reviews some of the design choices of the
+\pkg{Rcpp} package, in particular with respect to existing solutions
+that deal with coupling R and C++, and shows several use cases.
 }
 
-Introductory section which may include references in parentheses, say
-\citep{R:Ihaka+Gentleman:1996} or cite a reference such as
-\citet{R:Ihaka+Gentleman:1996} in the text.
+Writing R Extensions \citep{R:exts} provides extensive documentation about the 
+various ways to couple R with code written in C. 
+Writing such code requires both discipline and expertise from the 
+programmer: discipline, because of the bookkeeping required by the 
+\code{PROTECT}/\code{UNPROTECT} dance, whose steps one has to master; 
+expertise, because one has to learn the set of macros offered by the 
+R headers and use them efficiently. 
 
-\section{Section title in sentence case}
+The \pkg{Rcpp} package makes extensive use of C++ features (encapsulation, 
+constructors, destructors, operator overloading, templates) in order
+to hide the complexity of the R API --- without losing its 
+efficiency --- under the carpet of object orientation. 
 
-This section may contain a figure such as Figure \ref{figure:onecolfig}.
+\section{\pkg{Rcpp} C++ classes}
 
-\begin{figure}
-\vspace*{.1in}
-\framebox[\textwidth]{\hfill \raisebox{-.45in}{\rule{0in}{1in}}
-                      A picture goes here \hfill}
-\caption{\label{figure:onecolfig}
-A normal figure only occupies one column.}
-\end{figure}
+\subsection{The RObject class}
 
-\section{Another section}
+The \code{RObject} class is the base class of all objects in the 
+API of the \pkg{Rcpp} package. An \code{RObject} has only one 
+data member, the protected \code{SEXP} it encapsulates. 
+The \code{RObject} treats the \code{SEXP} as a resource, following the
+RAII (resource acquisition is initialization) pattern. As long as the 
+\code{RObject} instance is alive, its underlying \code{SEXP} remains 
+protected from garbage collection. When the \code{RObject} goes out 
+of scope (function return, exceptions), it removes the protection so that 
+if the \code{SEXP} is not otherwise protected it becomes subject to 
+garbage collection. 
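+
+The mechanism can be illustrated with a simplified, hypothetical 
+\code{Guard} class. This is only a sketch of the RAII idea and not the 
+actual implementation of \code{RObject}: the names \code{Guard}, 
+\code{preserve} and \code{release} are assumptions, and the counter 
+merely stands in for the protection bookkeeping done by R.
+
+\begin{example}
+#include <cassert>
+
+// stand-in for R's bookkeeping (assumption)
+static int n_protected = 0;
+void preserve() \{ ++n_protected; \}
+void release()  \{ --n_protected; \}
+
+// hypothetical RAII wrapper: the constructor
+// acquires the protection, the destructor
+// gives it back
+class Guard \{
+public:
+    Guard()  \{ preserve(); \}
+    ~Guard() \{ release(); \}
+private:
+    // not copyable (declared, not defined)
+    Guard(const Guard&);
+    Guard& operator=(const Guard&);
+\};
+
+int main() \{
+    \{
+        Guard g;   // protected from here on
+        assert(n_protected == 1);
+    \}              // g goes out of scope
+    assert(n_protected == 0);
+    return 0;
+\}
+\end{example}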
 
-There will likely be several sections, perhaps including code snippets, such
-as
+Garbage collection is only mentioned here to illustrate the basic design
+of the \code{RObject} class. Users of \pkg{Rcpp} need not concern 
+themselves with such matters and can instead focus on the problem
+that they are solving.
+
+The \code{RObject} class also defines a set of member functions that
+can be used on any R object, regardless of its type.
+
+\begin{center}
+\begin{small}
+\begin{tabular}{cc}
+method & action \\
+\hline
+\code{isNULL} & is the object \code{NULL}\\
+\hline
+\code{attributeNames} & the names of its attributes\\
+\code{hasAttribute} & does it have a given attribute\\
+\code{attr} & retrieve or set an attribute \\
+\hline
+\code{isS4} & is it an S4 object \\
+\code{hasSlot} & if S4, does it have the given slot\\
+\code{slot} & retrieve a given slot \\
+\hline
+\end{tabular}
+\end{small} 
+\end{center}
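+
+As an illustration, these member functions can be combined to test 
+whether an arbitrary R object carries a \code{names} attribute. This 
+is a sketch only: the function name \code{has\_names} is hypothetical, 
+and the example assumes that an \code{RObject} can be constructed 
+directly from a \code{SEXP}.
+
+\begin{example}
+#include <Rcpp.h>
+
+// hypothetical entry point, for illustration only
+extern "C" SEXP has_names(SEXP xs) \{
+    Rcpp::RObject x(xs);
+    bool res = !x.isNULL() &&
+               x.hasAttribute("names");
+    return Rcpp::wrap(res);
+\}
+\end{example}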
+
+\subsection{Derived classes}
+
+Internally, an R object must have one type from the set of 
+predefined types, commonly referred to as SEXP types. R Internals
+\citep{R:ints} documents the various types. \pkg{Rcpp} associates
+a C++ class with most SEXP types.
+        
+\begin{center}
+\begin{small}
+\begin{tabular}{cc}
+SEXP type &  \pkg{Rcpp} class \\
+\hline 
+\code{NILSXP} &  	\\
+\code{SYMSXP} &	 \code{Symbol} \\
+\code{LISTSXP} & \code{Pairlist} \\
+\code{CLOSXP} &	 \code{Function} \\
+\code{ENVSXP} &	 \code{Environment} \\
+\code{PROMSXP} & \code{Promise} \\
+\code{LANGSXP} & \code{Language} \\
+\code{SPECIALSXP} & \code{Function} \\
+\code{BUILTINSXP} & \code{Function} \\
+\code{CHARSXP} & \\
+\code{LGLSXP} &	 \code{LogicalVector} \\
+\code{INTSXP} &	 \code{IntegerVector} \\
+\code{REALSXP} & \code{NumericVector} \\
+\code{CPLXSXP} & \code{ComplexVector}\\
+\code{STRSXP} &	 \code{CharacterVector} \\
+\code{DOTSXP} &	 \code{Pairlist} \\
+\code{ANYSXP} &	 \\
+\code{VECSXP} &	 \code{List} \\
+\code{EXPRSXP} & \code{ExpressionVector}\\
+\code{BCODESXP} & \\
+\code{EXTPTRSXP} & \code{XPtr<T>}\\
+\code{WEAKREFSXP} & \code{WeakReference}\\
+\code{RAWSXP} &	 \code{RawVector}\\
+\code{S4SXP} & \\
+\hline
+\end{tabular}
+\end{small}
+\end{center}
+
+Some types do not have their own C++ class. The functionality of 
+\code{NILSXP} and \code{S4SXP} is covered by the \code{RObject}
+class; \code{ANYSXP} is just a placeholder to facilitate S4 dispatch, 
+and no object in R has this type; and \code{BCODESXP} is not currently 
+used.
+
+Each class contains functionality that is relevant to the R object
+that it encapsulates. For example, \code{Environment} contains 
+member functions to query the list of objects in the associated environment, 
+and the classes whose names end in \code{Vector} overload \code{operator[]} 
+in order to extract or modify the value at a given position in the vector.
+
+The rest of this section presents example uses of \pkg{Rcpp} classes. 
+
+\subsection{Numeric vectors}
+
+The following code snippet is taken from Writing R Extensions
+\citep{R:exts}. It creates a \code{numeric} vector of two elements 
+and assigns some values to it. 
+
 \begin{example}
-  x <- 1:10
-  result <- myFunction(x)
+#include <R.h>
+#include <Rinternals.h>
+
+SEXP ab;
+  ....
+PROTECT(ab = allocVector(REALSXP, 2));
+REAL(ab)[0] = 123.45;
+REAL(ab)[1] = 67.89;
+UNPROTECT(1);
 \end{example}
 
+Although this is one of the simplest examples in Writing R Extensions, 
+it is verbose, and it is not obvious at first sight what is happening:
+\begin{itemize}
+\item \code{allocVector} is used to allocate memory. We must supply 
+the type of data (\code{REALSXP}) and the number of elements.
+\item Once allocated, the \code{ab} object must be protected from
+garbage collection. Since garbage collection can happen at any time, 
+not protecting an object means its memory might be reclaimed before we are
+finished with it.
+\item The \code{REAL} macro returns a pointer to the beginning of the 
+actual array. 
+\end{itemize}
+
+Using the \code{Rcpp::NumericVector} class, the code can be rewritten as: 
+
+\begin{example}
+#include <Rcpp.h>
+using namespace Rcpp;
+NumericVector ab(2) ;
+ab[0] = 123.45;
+ab[1] = 67.89;
+\end{example}
+
+The code contains far less boilerplate. Here are the steps involved: 
+\begin{itemize}
+\item The \code{NumericVector} constructor is given the number
+of elements the vector contains (2). This hides a call to the 
+\code{allocVector} function we saw previously. 
+\item Also hidden is the protection of the 
+object from garbage collection, a behavior that \code{NumericVector}
+inherits from \code{RObject}.
+\item Values are assigned to the first and second elements of the vector. 
+This is possible because \code{NumericVector} overloads \code{operator[]}.
+\end{itemize}
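+
+To show the fragment in context, it can be placed in a function 
+callable from R through \code{.Call}. This is a minimal sketch: the 
+function name \code{make\_ab} is hypothetical, and the example relies on 
+the implicit conversion of \code{NumericVector} to \code{SEXP}.
+
+\begin{example}
+#include <Rcpp.h>
+
+// hypothetical entry point, for illustration only
+extern "C" SEXP make_ab() \{
+    // allocates and protects the vector
+    Rcpp::NumericVector ab(2);
+    ab[0] = 123.45;
+    ab[1] = 67.89;
+    // converted to SEXP on return; the protection
+    // is released when ab is destroyed
+    return ab;
+\}
+\end{example}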
+
+With recent compilers (e.g.\ GCC $\geq$ 4.4) that implement the 
+initializer lists of the forthcoming C++ standard (C++0x), the 
+previous code may even be reduced to the following:
+
+\begin{example}
+#include <Rcpp.h>
+using namespace Rcpp;
+NumericVector ab = \{123.45, 67.89\};
+\end{example}
+
+\subsection{Character vectors}
+
+A second example deals with character vectors and emulates this R code:
+
+\begin{example}
+> x <- c("foo", "bar")
+\end{example}
+
+Using the traditional R API, the vector can be allocated and filled as follows:
+
+\begin{example}
+SEXP ab;
+PROTECT(ab = allocVector(STRSXP, 2));
+SET_STRING_ELT( ab, 0, mkChar("foo") );
+SET_STRING_ELT( ab, 1, mkChar("bar") );
+UNPROTECT(1);
+\end{example}
+
+Using the \code{Rcpp::CharacterVector} class, we can express this code as: 
+
+\begin{example}
+CharacterVector ab(2) ;
+ab[0] = "foo" ;
+ab[1] = "bar" ;
+\end{example}
+
+Additionally, if the compiler implements C++0x initializer lists, the 
+code can be trimmed to the essentials:
+
+\begin{example}
+CharacterVector ab = \{"foo","bar"\};
+\end{example}
+
+
+\section{wrap and as}
+
+Besides classes, the \pkg{Rcpp} package also contains utilities allowing
+conversion from R objects to C++ types and vice versa. Through 
+function overloading, the \code{wrap} set of functions can be used to wrap 
+a C++ data structure into an \code{RObject} instance. In total, the 
+\pkg{Rcpp} package defines 23 different \code{wrap} functions, listed below:
+
+\begin{small}
+\begin{center}
+\begin{tabular}{cc}
+C++ type & \pkg{Rcpp} type \\
+\hline
+\code{SEXP} & $\star$ \\
+\hline
+\code{bool} & \code{LogicalVector} \\
+\code{double} & \code{NumericVector}  \\
+\code{int} & \code{IntegerVector}  \\
+\code{size\_t} & \code{IntegerVector}  \\
+\code{unsigned char} & \code{RawVector}  \\
+\code{string} & \code{CharacterVector}  \\
+\code{char*} & \code{CharacterVector}  \\
+\hline
+\code{vector<int>} & \code{IntegerVector}  \\
+\code{vector<double>} & \code{NumericVector}  \\
+\code{vector<unsigned char>} & \code{RawVector}  \\
+\code{vector<bool>} & \code{LogicalVector}  \\
+\code{vector<string>} & \code{CharacterVector}  \\
+\hline
+\code{set<int>} & \code{IntegerVector}  \\
+\code{set<double>} & \code{NumericVector}  \\
+\code{set<unsigned char>} & \code{RawVector}  \\
+\code{set<string>} & \code{CharacterVector}  \\
+\hline
+\code{initializer\_list<int>} & \code{IntegerVector}  \\
+\code{initializer\_list<double>} & \code{NumericVector}  \\
+\code{initializer\_list<unsigned char>} & \code{RawVector}  \\
+\code{initializer\_list<bool>} & \code{LogicalVector}  \\
+\code{initializer\_list<string>} & \code{CharacterVector}  \\
+\code{initializer\_list<RObject>} & \code{List} \\
+\hline
+\end{tabular}
+\begin{small}$\star$ : depends on the type of the \code{SEXP}\end{small}
+\end{center}
+\end{small}
+
+Here are a few examples of \code{wrap} calls: 
+
+\begin{example}
+LogicalVector x1 = wrap( false ); 
+IntegerVector x2 = wrap( 1 ) ;    
+
+vector<double> v ; 
+v.push_back(0.0); v.push_back( 1.0 ); 
+NumericVector x3 = wrap( v ) ;  
+
+// initializer list (only on GCC >= 4.4)
+LogicalVector x4 = wrap( \{ false, true\} );
+CharacterVector x5 = wrap( \{"foo", "bar"\} );
+\end{example}
+
+Similarly, converting an R object to a C++ standard type is implemented
+by variations on the \code{as} template function. In this case, we must 
+use angle brackets to specify which version of \code{as} we want to use. 
+
+\begin{example}
+// s1, s2 and s3 are SEXP values coming from R
+bool b = as<bool>(s1) ;
+double d = as<double>(s2) ;
+vector<int> v = as< vector<int> >(s3) ;
+\end{example}
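+
+Both directions can be combined in one function. The following is a 
+sketch only: the function name \code{double\_it} is hypothetical, the 
+argument is assumed to be a numeric vector, and error handling is 
+omitted.
+
+\begin{example}
+#include <Rcpp.h>
+#include <cstddef>
+#include <vector>
+
+// hypothetical example: doubles each element
+extern "C" SEXP double_it(SEXP xs) \{
+    // R object -> C++ type
+    std::vector<double> v =
+        Rcpp::as< std::vector<double> >(xs);
+    for (std::size_t i = 0; i < v.size(); i++) \{
+        v[i] *= 2.0;
+    \}
+    // C++ type -> R object
+    return Rcpp::wrap(v);
+\}
+\end{example}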
+
+\section{external pointers}
+
+factor this out from romain's blog
+
+\section{inline code}
+
+dirk ?
+
+\section{others}
+
+CXXR
+Rserve C++ client
+RcppTemplate
+rcppbind
+...
+
+\section{Rcpp vintage api}
+
+
 \section{Summary}
 
-This file is only a basic article template. For full details of \emph{The R Journal}
-style and information on how to prepare your article for submission, see the
-\href{http://journal.r-project.org/latex/RJauthorguide.pdf}{Instructions for Authors}.
 
-%\bibliography{example}
 
-\begin{thebibliography}{1}
-\expandafter\ifx\csname natexlab\endcsname\relax\def\natexlab#1{#1}\fi
-\expandafter\ifx\csname url\endcsname\relax
-  \def\url#1{{\tt #1}}\fi
 
-\bibitem[Ihaka and Gentleman(1996)]{R:Ihaka+Gentleman:1996}
-R.~Ihaka and R.~Gentleman.
-\newblock R: A language for data analysis and graphics.
-\newblock {\em Journal of Computational and Graphical Statistics}, 5\penalty0
-  (3):\penalty0 299--314, 1996.
-\newblock URL \url{http://www.amstat.org/publications/jcgs/}.
 
-\end{thebibliography}
+\bibliography{FrancoisEddelbuettel}
 
 \address{Romain Fran\c{c}ois\\
   Professional R Enthusiast\\

Modified: papers/rjournal/Makefile
===================================================================
--- papers/rjournal/Makefile	2010-01-11 16:24:09 UTC (rev 344)
+++ papers/rjournal/Makefile	2010-01-11 20:35:19 UTC (rev 345)
@@ -8,5 +8,7 @@
 
 RJwrapper.pdf: RJwrapper.tex FrancoisEddelbuettel.tex RJournal.sty
 	pdflatex RJwrapper.tex
+	bibtex RJwrapper
 	pdflatex RJwrapper.tex
+	pdflatex RJwrapper.tex
 


