[Rcpp-devel] [Rcpp-commits] r354 - papers/rjournal
noreply at r-forge.r-project.org
Tue Jan 12 16:30:21 CET 2010
Author: romain
Date: 2010-01-12 16:30:21 +0100 (Tue, 12 Jan 2010)
New Revision: 354
Added:
papers/rjournal/EddelbuettelFrancois.bib
papers/rjournal/EddelbuettelFrancois.tex
Removed:
papers/rjournal/FrancoisEddelbuettel.bib
papers/rjournal/FrancoisEddelbuettel.tex
Modified:
papers/rjournal/Makefile
papers/rjournal/RJwrapper.tex
Log:
use alphabetical order
Copied: papers/rjournal/EddelbuettelFrancois.bib (from rev 353, papers/rjournal/FrancoisEddelbuettel.bib)
===================================================================
--- papers/rjournal/EddelbuettelFrancois.bib (rev 0)
+++ papers/rjournal/EddelbuettelFrancois.bib 2010-01-12 15:30:21 UTC (rev 354)
@@ -0,0 +1,63 @@
+@String{CRAN = "http://cran.r-project.org/" }
+@String{manuals = CRAN # "doc/manuals/" }
+@String{RCoreTeam = "{R Development Core Team}" }
+@String{RFoundation = "R Foundation for Statistical Computing" }
+
+@manual{R:exts,
+ author = RCoreTeam,
+ organization = RFoundation,
+ address = {Vienna, Austria},
+ year = {2009},
+ title = "Writing R extensions",
+ url = manuals # "R-exts.html"
+}
+
+@manual{R:ints,
+ author = RCoreTeam,
+ organization = RFoundation,
+ address = {Vienna, Austria},
+ year = {2009},
+ title = "R internals",
+ url = manuals # "R-ints.html"
+}
+
+@Manual{cran:inline,
+ title = {inline: Inline C, C++, Fortran function calls from R},
+ author = {Oleg Sklyar and Duncan Murdoch and Mike Smith and Dirk Eddelbuettel},
+ year = {2009},
+ note = {R package version 0.3.4},
+ url = {http://CRAN.R-project.org/package=inline},
+ }
+
+@Manual{cran:Rserve,
+ title = {Rserve: Binary R server},
+ author = {Simon Urbanek},
+ note = {R package version 0.6-1},
+ url = {http://www.rforge.net/Rserve/},
+ }
+
+@InProceedings{batesdebroy01:cppclasses,
+ author = {Douglas M. Bates and Saikat DebRoy},
+ title = {{C++} Classes for {R} Objects},
+ booktitle = {Proceedings of the 2nd International Workshop on Distributed Statistical Computing},
+ year = 2001,
+ editor = {Kurt Hornik and Friedrich Leisch},
+ address = {TU Vienna, Austria}
+}
+
+@Unpublished{javagailemanly07:r_cpp,
+ author = {James J. Java and Daniel P. Gaile and Kenneth E. Manly},
+ title = {R/Cpp: Interface Classes to Simplify Using R Objects in C++ Extensions},
+ note = {Unpublished manuscript, University of Buffalo},
+ month = {July},
+ year = 2007
+}
+
+@InProceedings{runnalls09:cxxr,
+ author= {Andrew Runnalls},
+ title = {Aspects of CXXR internals},
+ booktitle = {Directions in Statistical Computing},
+ address = {University of Copenhagen, Denmark},
+ year= 2009
+ }
+
Copied: papers/rjournal/EddelbuettelFrancois.tex (from rev 353, papers/rjournal/FrancoisEddelbuettel.tex)
===================================================================
--- papers/rjournal/EddelbuettelFrancois.tex (rev 0)
+++ papers/rjournal/EddelbuettelFrancois.tex 2010-01-12 15:30:21 UTC (rev 354)
@@ -0,0 +1,510 @@
+\title{Mesh R and C++ with Rcpp}
+\author{by Romain Fran\c{c}ois and Dirk Eddelbuettel}
+
+\maketitle
+
+\abstract{TBD}
+
+\section{Introduction}
+
+\subsection{Overview}
+The \pkg{Rcpp} package provides a consistent and comprehensive set
+of C++ classes designed to ease coupling of C++ code
+with R. The \code{RObject} class is responsible for
+protecting and releasing its encapsulated R object (\code{SEXP})
+from garbage collection. The \code{wrap} set of functions allows
+wrapping many C++ built-in types and data structures from the standard
+template library into R objects. Similarly, the \code{as} set of
+templated functions allows conversion of R objects back into C++
+types, such as \code{std::string}. With recent additions to the
+\pkg{inline} package \citep{cran:inline},
+C++ code using the classes of the
+\pkg{Rcpp} package can be inlined, compiled, loaded and wrapped
+into an R function without leaving the R console.
+This article reviews some of the design choices of the
+\pkg{Rcpp} package, in particular with respect to existing solutions
+that deal with coupling R and C++, and shows several use cases.
+
+Writing R Extensions \citep{R:exts} provides extensive documentation about the
+various ways to couple R with code written in C.
+Writing such code requires both discipline and expertise from the
+programmer: discipline, because of the bookkeeping
+%% FIXME: The two sentences need a rewrite
+required by the \code{PROTECT}/\code{UNPROTECT} dance, whose steps one
+has to master; and expertise, to learn and use efficiently
+the set of macros offered by the R headers.
+
+The \pkg{Rcpp} package makes extensive use of C++ features (encapsulation,
+constructors, destructors, operator overloading, templates) in order
+to hide the complexity of the R API --- without losing its
+efficiency --- under the carpet of object orientation.
+
+\subsection{Background}
+
+The first public version of \pkg{Rcpp} was released in 2005 as a contribution
+to the \pkg{RQuantLib} package. \pkg{Rcpp} was then released in a package of
+the same name in early 2006, which was followed by several releases. It was
+then renamed to \pkg{RcppTemplate} and had several more releases during 2006.
+However, no releases or updates were made during 2007 and 2008.
+
+Given the continued use of the package, it was revived using the former name
+\pkg{Rcpp}. New releases started in November 2008 and include an improved
+build and distribution process, additional documentation, and new
+functionality, all while retaining the existing interface. This constitutes the
+`classic \pkg{Rcpp}' interface (see section FOO) which will be provided for
+the foreseeable future.
+
+Yet C++ coding standards continued to evolve. So, in late 2009 the codebase
+was significantly extended and numerous new features were added. Several of
+these are described below in section BAR. This constitutes the `enhanced
+\pkg{Rcpp}' interface which we also intend to support going forward.
+
+\subsection{Comparison}
+
+Integration of C++ and R has been addressed by several authors starting with
+\cite{batesdebroy01:cppclasses}. \cite{javagailemanly07:r_cpp}, in an
+unpublished paper, express several ideas that are close to some of our
+approaches, though not yet fully fleshed out.
+
+Rserve \citep{cran:Rserve} was early with C++ use in one
+of its clients and also wrapped SEXP objects.
+% FIXME: the Rserve client does not know about SEXP, it defines
+% Rexp, something that looks like SEXP, but isn't;
+% R does not have to be on the client side
+
+CXXR \citep{runnalls09:cxxr} comes from the other side
+and aims to rebuild R using C++. If moving from C to C++ is to be
+compared with improving someone's house, \pkg{Rcpp} is repainting the
+walls where CXXR rebuilds from the foundations.
+% maybe we do a bit more than the walls ?
+% new more comfortable furnitures ?
+
+rcppbind ...
+
+Whit A ...
+
+RcppTemplate recently decided to break with the `classic \pkg{Rcpp}' API.
+
+
+\section{Classic Rcpp}
+
+\pkg{Rcpp} is focused on functions in the standard sense of returning one (or
+more) results given inputs. An illustration can be provided using the
+time-tested example of a convolution of two vectors \citep{R:exts} but now
+rewritten using \pkg{Rcpp}.
+
+\begin{example}
+#include <Rcpp.h>
+
+RcppExport SEXP convolve2cpp(SEXP a, SEXP b) \{
+    RcppVector<double> xa(a);
+    RcppVector<double> xb(b);
+    int nab = xa.size() + xb.size() - 1;
+
+    RcppVector<double> xab(nab);
+    for (int i = 0; i < nab; i++) xab(i) = 0.0;
+
+    for (int i = 0; i < xa.size(); i++)
+        for (int j = 0; j < xb.size(); j++)
+            xab(i + j) += xa(i) * xb(j);
+
+    RcppResultSet rs;
+    rs.add("ab", xab);
+    return rs.getReturnList();
+\}
+\end{example}
+
+\section{inline code}
+
+TBD (Dirk), maybe also something about deployment (update.packages() brings new
+versions, Rcpp:::CxxFlags() and friends, dynamic linking)
+
+\section{\pkg{Rcpp} C++ classes}
+
+\subsection{The RObject class}
+
+The \code{RObject} class is the base class of all objects in the
+API of the \pkg{Rcpp} package. An \code{RObject} has only one
+data member, the protected \code{SEXP} it encapsulates.
+The \code{RObject} treats the \code{SEXP} as a resource, following the
+RAII (resource acquisition is initialization) pattern. As long as the
+\code{RObject} instance is alive, its underlying \code{SEXP} remains
+protected from garbage collection. When the \code{RObject} goes out
+of scope (function return, exceptions), it removes the protection so that
+the \code{SEXP}, if it is not otherwise protected, becomes subject to
+garbage collection.
+
+% FIXME: Shorten and make a footnote?
+Garbage collection is only mentioned here to illustrate the basic design
+of the \code{RObject} class; users of \pkg{Rcpp} need not concern
+themselves with such matters and can instead focus on the problem
+that they are solving.
+
+The \code{RObject} class also defines a set of member functions that
+can be used on any R object, regardless of its type.
+
+\begin{center}
+\begin{small}
+\begin{tabular}{cc}
+method & action \\
+\hline
+\code{isNULL} & is the object \code{NULL}\\
+\hline
+\code{attributeNames} & the names of its attributes\\
+\code{hasAttribute} & does it have a given attribute\\
+\code{attr} & retrieve or set an attribute \\
+\hline
+\code{isS4} & is it an S4 object \\
+\code{hasSlot} & if S4, does it have the given slot\\
+\code{slot} & retrieve a given slot \\
+\hline
+\end{tabular}
+\end{small}
+\end{center}
+
+\subsection{Derived classes}
+
+Internally, an R object must have one type amongst the set of
+predefined types, commonly referred to as SEXP types. R internals
+\citep{R:ints} documents the various types. \pkg{Rcpp} associates
+a C++ class with most SEXP types.
+
+\begin{center}
+\begin{small}
+\begin{tabular}{cc}
+SEXP type & \pkg{Rcpp} class \\
+\hline
+\code{NILSXP} & \\
+\code{SYMSXP} & \code{Symbol} \\
+\code{LISTSXP} & \code{Pairlist} \\
+\code{CLOSXP} & \code{Function} \\
+\code{ENVSXP} & \code{Environment} \\
+\code{PROMSXP} & \code{Promise} \\
+\code{LANGSXP} & \code{Language} \\
+\code{SPECIALSXP} & \code{Function} \\
+\code{BUILTINSXP} & \code{Function} \\
+\code{CHARSXP} & \\
+\code{LGLSXP} & \code{LogicalVector} \\
+\code{INTSXP} & \code{IntegerVector} \\
+\code{REALSXP} & \code{NumericVector} \\
+\code{CPLXSXP} & \code{ComplexVector}\\
+\code{STRSXP} & \code{CharacterVector} \\
+\code{DOTSXP} & \code{Pairlist} \\
+\code{ANYSXP} & \\
+\code{VECSXP} & \code{List} \\
+\code{EXPRSXP} & \code{ExpressionVector}\\
+\code{BCODESXP} & \\
+\code{EXTPTRSXP} & \code{XPtr<T>}\\
+\code{WEAKREFSXP} & \code{WeakReference}\\
+\code{RAWSXP} & \code{RawVector}\\
+\code{S4SXP} & \\
+\hline
+\end{tabular}
+\end{small}
+\end{center}
+
+Some types do not have their own C++ class. \code{NILSXP} and
+\code{S4SXP} have their functionality covered by the \code{RObject}
+class; \code{ANYSXP} is just a placeholder to facilitate S4 dispatch
+and no object in R has this type; and \code{BCODESXP} is not currently
+used.
+
+Each class contains functionality that is relevant to the R object
+that it encapsulates. For example, \code{Environment} contains
+member functions to query the list of objects in the associated environment,
+classes with the \code{Vector} suffix overload \code{operator[]} in order
+to extract or modify values at the given position in the vector, and so on.
+
+The rest of this section presents example uses of \pkg{Rcpp} classes.
+
+\subsection{numeric vector}
+
+The following code snippet is extracted from Writing R extensions
+\citep{R:exts}. It creates a \code{numeric} vector of two elements
+and assigns some values to it.
+
+\begin{example}
+#include <R.h>
+#include <Rinternals.h>
+
+SEXP ab;
+ ....
+PROTECT(ab = allocVector(REALSXP, 2));
+REAL(ab)[0] = 123.45;
+REAL(ab)[1] = 67.89;
+UNPROTECT(1);
+\end{example}
+
+Although this is one of the simplest examples in Writing R extensions,
+it seems verbose and it is not obvious at first sight what is happening.
+\begin{itemize}
+\item \code{allocVector} is used to allocate memory. We must supply to it
+the type of data (\code{REALSXP}) and the number of elements.
+\item Once allocated, the \code{ab} object must be protected from
+garbage collection. Since garbage collection can happen at any time,
+not protecting an object means its memory might be reclaimed before we are
+finished with it.
+\item The \code{REAL} macro returns a pointer to the beginning of the
+actual array; its indexing does not resemble either R or C++.
+\end{itemize}
+
+Using the \code{Rcpp::NumericVector} class, the code can be rewritten as:
+
+\begin{example}
+#include <Rcpp.h>
+using namespace Rcpp;
+NumericVector ab(2) ;
+ab[0] = 123.45;
+ab[1] = 67.89;
+\end{example}
+
+The code contains far fewer idiomatic decorations. Here are the steps involved:
+\begin{itemize}
+\item The \code{NumericVector} constructor is given the number
+of elements the vector contains (2); this hides a call to the
+\code{allocVector} we saw previously.
+\item Also hidden is protection of the
+object from garbage collection, which is a behavior that \code{NumericVector}
+inherits from \code{RObject}.
+\item Values are assigned to the first and second elements of the vector.
+This is possible because \code{NumericVector} overloads \code{operator[]}.
+\end{itemize}
+
+With recent compilers (e.g. GCC >= 4.4) implementing the forthcoming
+C++ standard (C++0x), the previous code may even be reduced
+to the following:
+
+\begin{example}
+#include <Rcpp.h>
+using namespace Rcpp;
+NumericVector ab = \{123.45, 67.89\};
+\end{example}
+
+\subsection{character vectors}
+
+A second example deals with character vectors and emulates this R code:
+
+\begin{example}
+> x <- c("foo", "bar")
+\end{example}
+
+Using the traditional R API, the vector can be allocated and filled as follows:
+
+\begin{example}
+SEXP ab;
+PROTECT(ab = allocVector(STRSXP, 2));
+SET_STRING_ELT( ab, 0, mkChar("foo") );
+SET_STRING_ELT( ab, 1, mkChar("bar") );
+UNPROTECT(1);
+\end{example}
+
+Using the \code{Rcpp::CharacterVector} class, we can express this code as:
+
+\begin{example}
+CharacterVector ab(2) ;
+ab[0] = "foo" ;
+ab[1] = "bar" ;
+\end{example}
+
+Additionally, if the compiler implements C++0x initializer lists, the
+code can be trimmed to the essentials:
+
+\begin{example}
+CharacterVector ab = \{"foo","bar"\};
+\end{example}
+
+
+\section{wrap and as}
+
+Besides classes, the \pkg{Rcpp} package also contains utilities allowing
+conversion from R objects to C++ types and vice-versa. Through
+polymorphism, the \code{wrap} set of functions can be used to wrap
+some data structure into an \code{RObject} instance.
+
+In total, \pkg{Rcpp} defines 23 different \code{wrap}
+functions, including:
+\begin{itemize}
+\item \code{SEXP}
+\item primitive types: \code{bool}, \code{int}, \code{double},
+\code{size\_t}, \code{unsigned char} (byte), \code{std::string} and
+\code{char*}
+\item STL vectors of these types: \code{vector<int>},
+\code{vector<double>}, \code{vector<bool>}, \code{vector<unsigned char>},
+\code{vector<string>}
+\item STL sets: \code{set<int>}, \code{set<double>}, \code{set<unsigned char>},
+\code{set<string>}
+\item initializer lists (only available with GCC >= 4.4).
+\end{itemize}
+
+Each type is wrapped in the most sensible class, e.g. \code{vector<double>}
+is wrapped into a \code{NumericVector} object, which in turn encapsulates
+a numeric vector (a \code{SEXP} of type \code{REALSXP}).
+Here are a few examples of \code{wrap} calls:
+
+\begin{example}
+LogicalVector x1 = wrap( false );
+IntegerVector x2 = wrap( 1 ) ;
+
+vector<double> v ;
+v.push_back(0.0); v.push_back( 1.0 );
+NumericVector x3 = wrap( v ) ;
+
+// initializer list (only on GCC >= 4.4)
+LogicalVector x4 = wrap( \{ false, true\} );
+CharacterVector x5 = wrap( \{"foo", "bar"\} );
+\end{example}
+
+Similarly, converting an R object to a C++ standard type is implemented
+by variations on the \code{as} template function. In this case, we must
+use the angle brackets to specify which version of \code{as} we want to use.
+
+\begin{example}
+bool b = as<bool>(x) ;
+double d = as<double>(x) ;
+vector<int> v = as< vector<int> >(x) ;
+\end{example}
+
+\section{external pointers}
+
+In addition to primitive data types, R can handle arbitrary pointers
+by encapsulating the pointer in a special R object, the external
+pointer. \cite{R:exts} documents the available API R has to offer to
+deal with external pointers.
+
+\pkg{Rcpp} takes advantage of C++ templates and smart pointers and
+defines the templated class \code{XPtr} that acts as a smart
+pointer to the underlying C++ object.
+
+Assuming we get from R an external pointer to a \code{std::vector<int>}
+C++ object, we can manipulate it as such using the \code{XPtr} class:
+
+\begin{example}
+// xp is an external pointer
+// to a std::vector<int>
+XPtr< std::vector<int> > p(xp) ;
+p->push\_back(1) ;
+p->push\_back(2) ;
+p->size() ;
+\end{example}
+
+The \code{XPtr} class directly derives from the \code{RObject} class.
+Thanks to its template parameter and overloading of the \code{->}
+and \code{*} operators, objects of the \code{XPtr<Foo>} generated
+class look and feel like raw pointers (\code{Foo*}).
+
+Making an external pointer from a raw pointer is equally easy using
+another constructor.
+
+\begin{example}
+std::vector<int> *pv = new std::vector<int> ;
+XPtr< std::vector<int> > p(pv,true) ;
+\end{example}
+
+The creation of the instance of the \code{XPtr< std::vector<int> >}
+smart external pointer to a \code{std::vector<int>} hides the
+R API that is typically used for external pointers, including registration
+of a finalizer to be executed to free the memory of the vector when the
+external pointer is garbage collected.
+
+\section{other examples}
+
+The last example shows how to use \pkg{Rcpp} to emulate the R code below.
+For more examples, the reader is invited to
+refer to the comprehensive documentation included in \pkg{Rcpp}
+as well as the many examples that the package contains as part of
+its unit tests.
+
+\begin{example}
+> rnorm( 10L, sd = 100.0 )
+\end{example}
+
+The code can be expressed in several ways in \pkg{Rcpp}. The first version
+shows the use of the \code{Environment} and \code{Function} classes.
+
+\begin{example}
+Environment stats("package:stats") ;
+Function rnorm = stats.get("rnorm") ;
+return rnorm(10, Named("sd", 100.0) ) ;
+\end{example}
+
+We first pull out the \code{rnorm} function from the environment
+called \samp{package:stats} in the search path, then call the function
+using syntax similar to calling the function in R. The \code{Named}
+class is a utility class that helps emulate the use of
+named arguments.
+
+The second version shows the use of the \code{Language} class, which
+manages calls (\code{LANGSXP}).
+
+\begin{example}
+Language call("rnorm", 10, Named("sd", 100 ) ) ;
+call.eval() ;
+\end{example}
+
+%TODO: implement Language::eval( ) !!
+
+In this version, we first create a call to the symbol \samp{rnorm} and
+evaluate the call in the global environment; this is similar to the
+R code:
+
+\begin{example}
+> eval( call( "rnorm", 10L, sd = 100 ) )
+\end{example}
+
+Using the R API, the first example, using the actual
+\code{rnorm} function,
+translates to:
+
+\begin{example}
+SEXP stats = PROTECT(
+\ \ R_FindNamespace( mkString("stats") ) ) ;
+SEXP rnorm = PROTECT(
+\ \ findVarInFrame( stats, install("rnorm") ) ) ;
+SEXP call = PROTECT( LCONS( rnorm,
+\ \ CONS(ScalarInteger(10),
+\ \ \ \ CONS(ScalarReal(100.0), R_NilValue)))) ;
+SET_TAG( CDDR(call), install("sd") ) ;
+SEXP res = PROTECT( eval( call, R_GlobalEnv ) );
+UNPROTECT(4) ;
+return res ;
+\end{example}
+
+and the second example, using the \samp{rnorm} symbol, and therefore
+involving implicit lookup in the search path, can be written as:
+
+\begin{example}
+SEXP call = PROTECT(
+\ \ LCONS( install("rnorm"),
+\ \ \ \ CONS(ScalarInteger(10),
+\ \ \ \ \ \ CONS(ScalarReal(100.0), R_NilValue)))) ;
+SET_TAG( CDDR(call), install("sd") ) ;
+SEXP res = PROTECT( eval( call, R_GlobalEnv ) );
+UNPROTECT(2) ;
+return res ;
+\end{example}
+
+
+
+\section{Performance}
+
+
+\section{Summary}
+
+
+
+
+
+\bibliography{EddelbuettelFrancois}
+
+\address{Dirk Eddelbuettel\\
+ Debian Project\\
+ Chicago, IL\\
+ USA}\\
+\email{edd at debian.org}
+
+\address{Romain Fran\c{c}ois\\
+ Professional R Enthusiast\\
+ 3 rue Emile Bonnet, 34 090 Montpellier\\
+ FRANCE}\\
+\email{francoisromain at free.fr}
+
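The RObject subsection of the added paper describes protection that follows object lifetime (RAII): the wrapped object is protected while the wrapper is alive and released when the wrapper goes out of scope, including on exceptions. The sketch below illustrates that idea in plain C++ and is not code from this commit or from Rcpp. `MockObject`, `MockRegistry` and `Guard` are hypothetical stand-ins (for an R object, for R's set of protected objects, and for `RObject` respectively), chosen so the example compiles without R headers.

```cpp
#include <cassert>
#include <cstddef>
#include <set>
#include <stdexcept>

// Hypothetical stand-in for an R object; this is NOT the real SEXP.
struct MockObject { double value; };
typedef MockObject* MockSexp;

// Hypothetical registry playing the role of R's set of protected objects.
class MockRegistry {
public:
    static void preserve(MockSexp x)    { protectedSet().insert(x); }
    static void release(MockSexp x)     { protectedSet().erase(x); }
    static bool isProtected(MockSexp x) { return protectedSet().count(x) > 0; }
    static std::size_t size()           { return protectedSet().size(); }
private:
    static std::set<MockSexp>& protectedSet() {
        static std::set<MockSexp> s;
        return s;
    }
};

// RAII wrapper (assumed design): protect in the constructor,
// release in the destructor.
class Guard {
public:
    explicit Guard(MockSexp x) : obj_(x) { MockRegistry::preserve(obj_); }
    ~Guard() { MockRegistry::release(obj_); }
    MockSexp get() const { return obj_; }
private:
    Guard(const Guard&);            // non-copyable, so one guard owns one protection
    Guard& operator=(const Guard&);
    MockSexp obj_;
};

// True if the object is protected inside the scope and released after it.
bool protectionFollowsScope() {
    MockObject obj = { 123.45 };
    bool during = false;
    {
        Guard g(&obj);
        during = MockRegistry::isProtected(&obj);
    }   // g is destroyed here, which releases the protection
    bool after = MockRegistry::isProtected(&obj);
    return during && !after;
}

// True if stack unwinding after an exception also releases the protection.
bool releasedOnException() {
    MockObject obj = { 0.0 };
    try {
        Guard g(&obj);
        throw std::runtime_error("simulated failure");
    } catch (const std::runtime_error&) {
        // g was destroyed during unwinding
    }
    return !MockRegistry::isProtected(&obj);
}
```

Both `protectionFollowsScope()` and `releasedOnException()` are expected to return true. In real code the registry role would be played by R's own protection mechanism, which this sketch deliberately does not call.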
Deleted: papers/rjournal/FrancoisEddelbuettel.bib
Deleted: papers/rjournal/FrancoisEddelbuettel.tex
[TRUNCATED]
To get the complete diff run:
svnlook diff /svnroot/rcpp -r 354