[Rcpp-commits] r2074 - papers/rjournal

noreply at r-forge.r-project.org noreply at r-forge.r-project.org
Sat Sep 4 19:05:51 CEST 2010


Author: edd
Date: 2010-09-04 19:05:50 +0200 (Sat, 04 Sep 2010)
New Revision: 2074

Modified:
   papers/rjournal/EddelbuettelFrancois.tex
Log:
committing a bunch of changes and rewrites
- discussion of classic/new issue
- removal 'classic api' and 'new api' section, while
  * trying to retain some of the useful general paragraphs
  * switched to initial example with new API
- lots of work left to do


Modified: papers/rjournal/EddelbuettelFrancois.tex
===================================================================
--- papers/rjournal/EddelbuettelFrancois.tex	2010-09-04 16:59:33 UTC (rev 2073)
+++ papers/rjournal/EddelbuettelFrancois.tex	2010-09-04 17:05:50 UTC (rev 2074)
@@ -32,17 +32,24 @@
 % [romain] : removing this paragraph from the introduction. 
 %            The article is about today's Rcpp, the only place where the classic
 %            api should be is in the hist(e|o)rical section. 
+% [dirk] :   the classic API will be supported, is the past of Rcpp and may
+%            as well be mentioned. We have nothing to hide.
+%            Please see the next paragraph, where I now talk about classic
+%            but also note that it is deprecated and link to the recommended new API
+
+%The current version of 
+The \pkg{Rcpp} package combines two distinct
+APIs. The first, which we call the `classic \pkg{Rcpp} API', has existed since 
+the first version of \pkg{Rcpp}. While still contained in the package to
+ensure compatibility, its use is otherwise deprecated. All new development should
+use the newer and richer second API. It is enclosed in the \code{Rcpp} C++ 
+namespace, and corresponds to the newer redesigned codebase. 
+% which we started to develop more recently. -- [dirk] start no longer recent
 %
-% The current version of \pkg{Rcpp} combines two distinct
-% APIs. The first---which we call `classic \pkg{Rcpp} API'---exists since 
-% the first version of \pkg{Rcpp}. The second API, enclosed in the 
-% \code{Rcpp} C++ namespace, is a newer codebase which we started to develop
-% more recently. 
-This article 
-highlights some of the key design and implementation choices: 
-lightweight encapsulation of R objects in C++ classes, automatic
-garbage collection strategy, code inlining, data interchange between 
-R and C++ and error handling. 
+This article highlights some of the key design and implementation choices of
+the new API: lightweight encapsulation of R objects in C++ classes, automatic
+garbage collection strategy, code inlining, data interchange between R and
+C++ and error handling.
 
 Several examples are included to illustrate the benefits of using \pkg{Rcpp}
 as opposed to the traditional R API. Many more examples are available within
@@ -65,16 +72,25 @@
 (not described in this article) which will be maintained for the foreseeable future.
 
 Yet C++ coding standards continued to evolve \citep{meyers:effectivecplusplus}.
-In 2009, Eddelbuettel and Fran\c{c}ois started to significantly extend the codebase and numerous new
-features were added.  Several of these are described below in the section on
-the `new \pkg{Rcpp}' interface, as well as in the seven vignettes included
-with the package. This new API is our current focus, and we
-intend to both extend and support it in future development of the package.
+In 2009, Eddelbuettel and Fran\c{c}ois started a significant redesign of the
+codebase which added numerous new features.  Several of these are described
+below in the section on the \pkg{Rcpp} API, as well as in the
+seven vignettes included with the package. This new API is our current focus,
+and we intend to both extend and support it in future development of the
+package. 
 %% TODO Should we talk about RcppExamples and/or RcppArmadillo here?
 % [romain] I don't like the word "New" anymore. Was certainly appropriate in 
 %          february, but no longer is now.
-%          Furthermore, I don't think we did extend the codebase per se, we rather
-%          started more or less from scratch. I'd like something like "Rcpp was redesigned" ...
+%          Furthermore, I don't think we did extend the codebase per se, 
+%          we rather started more or less from scratch. I'd like something
+%          like "Rcpp was redesigned" ... 
+% [dirk]   'new' should stay as we use the term consistently to contrast
+%          with 'classic'. We support both APIs, we need labels for both
+%          'new' and 'classic' work
+%          OTOH "Rcpp was redesigned" could work instead of 'extend'
+%          I gave it a spin above (and sorry about the reindent)
+%          [ minutes later ] 
+%          ok, one 'new' is gone above as the corresponding section title is gone
 
 \subsection{Comparison}
 
@@ -83,7 +99,7 @@
 An unpublished paper by \cite{javagailemanly07:r_cpp} expresses several ideas
 that are close to some of our approaches, though not yet fully fleshed out.
 %
-The \pkg{Rserve} package \citep{cran:Rserve} was another early approach,
+The \pkg{Rserve} package \citep{urbanek2003:rserve,cran:Rserve} was another early approach,
 going back to 2002. On the server side, \pkg{Rserve} translates R data
 structures into a binary serialization format and uses TCP/IP for
 transfer. On the client side, objects are reconstructed as instances of Java
@@ -94,7 +110,7 @@
 \citep{armstrong09:RObjects} are all implemented using C++ templates.
 However, neither has matured to the point of a CRAN release and it is
 unclear how much usage these packages are seeing beyond their own authors.
-
+%
 CXXR \citep{runnalls09:cxxr} comes to this topic from the other side: 
 its aim is to completely refactor R on a stronger C++ foundation. 
 CXXR is therefore concerned with all aspects of the R interpreter,
@@ -102,12 +118,12 @@
 part. A similar approach is discussed by \cite{templelang09:modestproposal}
 who suggests making low-level internals extensible by package developers in
 order to facilitate extending \R.
-
+%
 Another slightly different angle is offered by
 \cite{templelang09:rgcctranslationunit} who uses compiler output for
 references on the code in order to add bindings and wrappers.
 %
-Lastly, the \pkg{cxxPack} package \citep{cran:cxxPack} builds on top of
+The \pkg{cxxPack} package \citep{cran:cxxPack} builds on top of
 \pkg{Rcpp} and adds a small collection of diverse functions.
 
 %DE: Removed per editor  
@@ -115,8 +131,14 @@
 %API features, performance, usability and documentation would be a welcome
 %addition to the literature, but is beyond the scope of this article.
 
-% FIXME: this section is now irrelevant. it needs to go. 
-\section{Classic Rcpp API}
+% [romain] FIXME: this section is now irrelevant. it needs to go. 
+% [dirk]   the first two paragraphs are generic and still true for the new
+%          API (irrespective of the fact that it does more too)
+%          Maybe we make it a 'subsection' and part of the 'Introduction'
+%          section?
+
+%\section{Classic Rcpp API}
+\subsection{Rcpp Use Cases}  % or some such
 \label{sec:classic_rcpp}
 
 The core focus of \pkg{Rcpp}---particularly for the earlier API described in
@@ -138,37 +160,39 @@
 and parameters are passed via \pkg{Rcpp} to a function set-up to call code
 from an external library.  
 
-An illustration can be provided using the time-tested example of a
-convolution of two vectors. This example is shown in sections 5.2 (for the
-\code{.C()} interface) and 5.9 (for the \code{.Call()} interface) of 'Writing
-R Extensions' \citep{R:exts}. We have rewritten it here using classes of the
-classic \pkg{Rcpp} API:
+TODO: Wrap this so that it ties in better with what follows
 
-\begin{example}
-#include <Rcpp.h>
+% An illustration can be provided using the time-tested example of a
+% convolution of two vectors. This example is shown in sections 5.2 (for the
+% \code{.C()} interface) and 5.9 (for the \code{.Call()} interface) of 'Writing
+% R Extensions' \citep{R:exts}. We have rewritten it here using classes of the
+% classic \pkg{Rcpp} API:
 
-RcppExport SEXP convolve2cpp(SEXP a,SEXP b) \{
-  RcppVector<double> xa(a);
-  RcppVector<double> xb(b);
-  int nab = xa.size() + xb.size() - 1;
+% \begin{example}
+% #include <Rcpp.h>
 
-  RcppVector<double> xab(nab);
-  for (int i = 0; i < nab; i++) xab(i) = 0.0;
+% RcppExport SEXP convolve2cpp(SEXP a,SEXP b) \{
+%   RcppVector<double> xa(a);
+%   RcppVector<double> xb(b);
+%   int nab = xa.size() + xb.size() - 1;
 
-  for (int i = 0; i < xa.size(); i++)
-    for (int j = 0; j < xb.size(); j++) 
-       xab(i + j) += xa(i) * xb(j);
+%   RcppVector<double> xab(nab);
+%   for (int i = 0; i < nab; i++) xab(i) = 0.0;
 
-  RcppResultSet rs;
-  rs.add("ab", xab);
-  return rs.getReturnList();
-\}
-\end{example}
+%   for (int i = 0; i < xa.size(); i++)
+%     for (int j = 0; j < xb.size(); j++) 
+%        xab(i + j) += xa(i) * xb(j);
 
-We can highlight several aspects. First, only a single header file
-\code{Rcpp.h} is needed to use the \pkg{Rcpp} API.  Second, given two
-\code{SEXP} types, a third is returned.  Third, both inputs are converted to
-templated.
+%   RcppResultSet rs;
+%   rs.add("ab", xab);
+%   return rs.getReturnList();
+% \}
+% \end{example}
+
+% We can highlight several aspects. First, only a single header file
+% \code{Rcpp.h} is needed to use the \pkg{Rcpp} API.  Second, given two
+% \code{SEXP} types, a third is returned.  Third, both inputs are converted to
+% templated.
 % \footnote{C++ templates allow functions or classes to be written
 %   somewhat independently from the template parameter. The actual class is
 %   instantiated by the compiler by replacing occurrences of the templated
@@ -178,27 +202,28 @@
 %   `templated' type $T$, the compiler will create a concrete instance using an
 %   \texttt{int} or \texttt{double} type dependent on the context is which the
 %   code is called.}  
-C++ vector types, here a standard \code{double} type is
-used to create a vector of doubles from the template type.  Fourth, the
-usefulness of these classes can be seen when we query the vectors directly
-for their size---using the \code{size()} member function---in order to
-reserve a new result type of appropriate length whereas use based on C arrays
-would have required additional parameters for the length of vectors $a$ and
-$b$, leaving open the possibility of mismatches between the actual length and
-the length reported by the programmer.  Fifth, the computation itself is
-straightforward embedded looping just as in the original examples in the
-'Writing R Extensions' manual \citep{R:exts}.  Sixth, a return type
-(\code{RcppResultSet}) is prepared as a named object which is then converted
-to a list object that is returned.  We should note that the
-\code{RcppResultSet} supports the return of numerous (named) objects which
-can also be of different types.
+% C++ vector types, here a standard \code{double} type is
+% used to create a vector of doubles from the template type.  Fourth, the
+% usefulness of these classes can be seen when we query the vectors directly
+% for their size---using the \code{size()} member function---in order to
+% reserve a new result type of appropriate length whereas use based on C arrays
+% would have required additional parameters for the length of vectors $a$ and
+% $b$, leaving open the possibility of mismatches between the actual length and
+% the length reported by the programmer.  Fifth, the computation itself is
+% straightforward embedded looping just as in the original examples in the
+% 'Writing R Extensions' manual \citep{R:exts}.  Sixth, a return type
+% (\code{RcppResultSet}) is prepared as a named object which is then converted
+% to a list object that is returned.  We should note that the
+% \code{RcppResultSet} supports the return of numerous (named) objects which
+% can also be of different types.
 
-We argue that this usage is already much easier to read, write and debug than the
-C macro-based approach supported by R itself. Possible performance issues and
-other potential limitations will be discussed throughout the article and
-reviewed at the end.
+% We argue that this usage is already much easier to read, write and debug than the
+% C macro-based approach supported by R itself. Possible performance issues and
+% other potential limitations will be discussed throughout the article and
+% reviewed at the end.
 
-\section{New \pkg{Rcpp} API}
+%\section{New \pkg{Rcpp} API}
+\section{The \pkg{Rcpp} API}
 \label{sec:new_rcpp}
 
 More recently, the \pkg{Rcpp} API has been dramatically extended, leading to a 
@@ -210,6 +235,49 @@
 of R in a C++ applications and \code{RProtoBuf} \citep{cran:rprotobuf} 
 that interfaces with the protocol buffers library. 
 
+\subsection{A First Example}
+
+\begin{example}
+#include <Rcpp.h>
+
+RcppExport SEXP convolve3cpp(SEXP a, SEXP b) \{
+    Rcpp::NumericVector xa(a);
+    Rcpp::NumericVector xb(b);
+    int n_xa = xa.size(), n_xb = xb.size();
+    int nab = n_xa + n_xb - 1;
+    Rcpp::NumericVector xab(nab);
+
+    for (int i = 0; i < n_xa; i++)
+        for (int j = 0; j < n_xb; j++) 
+            xab[i + j] += xa[i] * xb[j];
+
+    return xab;
+\}
+\end{example}
+
+We can highlight several aspects. First, only a single header file
+\code{Rcpp.h} is needed to use the \pkg{Rcpp} API.  Second, given two
+\code{SEXP} types, a third is returned.  Third, both inputs are 
+converted to C++ vector types provided by \pkg{Rcpp} (and we have more to say
+about these conversions below).  Fourth, the
+usefulness of these classes can be seen when we query the vectors directly
+for their size, using the \code{size()} member function, in order to
+allocate a new result vector of appropriate length, whereas use based on C arrays
+would have required additional parameters for the length of vectors $a$ and
+$b$, leaving open the possibility of mismatches between the actual length and
+the length reported by the programmer.  Fifth, the computation itself is
+a straightforward nested loop just as in the original examples in the
+'Writing R Extensions' manual \citep{R:exts}.  Sixth, the return conversion
+is also automatic from the \code{NumericVector} to the \code{SEXP} type.
+
+We argue that this usage is much easier to read, write and debug than the
+C macro-based approach supported by R itself. %Possible performance issues and
+%other potential limitations will be discussed throughout the article and
+%reviewed at the end.
+
+% [dirk]  TODO maybe another sentence to tie into the next segment
+
+
 \subsection{Rcpp Class hierarchy}
 
 The \code{Rcpp::RObject} class is the basic class of the new \pkg{Rcpp} API. 
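
Note on the convolve3cpp example added in this revision: the loop can be
checked outside of R with a small stand-alone sketch. This is not part of the
commit and does not use the Rcpp API. It assumes that std::vector<double> is
an acceptable stand-in for Rcpp::NumericVector, and the function name
convolve_std is hypothetical, chosen only for this illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for convolve3cpp: the same nested loop, but on
// std::vector<double> so that it compiles without R or Rcpp headers.
std::vector<double> convolve_std(const std::vector<double>& xa,
                                 const std::vector<double>& xb) {
    // The result length n_xa + n_xb - 1 only makes sense for non-empty inputs.
    assert(!xa.empty() && !xb.empty());

    const std::size_t n_xa = xa.size();
    const std::size_t n_xb = xb.size();

    // Zero-initialised result vector, sized from the inputs themselves.
    std::vector<double> xab(n_xa + n_xb - 1, 0.0);

    // Each product xa[i] * xb[j] contributes to output position i + j.
    for (std::size_t i = 0; i < n_xa; i++)
        for (std::size_t j = 0; j < n_xb; j++)
            xab[i + j] += xa[i] * xb[j];

    return xab;
}
```

As in the version in the diff, the output length is derived from the inputs,
so no separate length arguments are passed. Calling
convolve_std({1.0, 2.0, 3.0}, {0.0, 1.0, 0.5}) yields the five values
0, 1, 2.5, 4, 1.5.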


