[Rcpp-commits] r2144 - papers/rjournal
noreply at r-forge.r-project.org
Thu Sep 23 15:39:10 CEST 2010
Author: romain
Date: 2010-09-23 15:39:09 +0200 (Thu, 23 Sep 2010)
New Revision: 2144
Modified:
papers/rjournal/EddelbuettelFrancois.tex
Log:
first pass after revising the exception section
Modified: papers/rjournal/EddelbuettelFrancois.tex
===================================================================
--- papers/rjournal/EddelbuettelFrancois.tex 2010-09-22 11:09:28 UTC (rev 2143)
+++ papers/rjournal/EddelbuettelFrancois.tex 2010-09-23 13:39:09 UTC (rev 2144)
@@ -36,6 +36,7 @@
% as well be mentioned. We have nothing to hide
% please see the next paragraph where I now talk about classic
% but also note deprecated and link to the recommended new API
+% [romain] : alright
%The current version of
The \pkg{Rcpp} package combines two distinct
@@ -91,6 +92,7 @@
% I gave it a spin above (and sorry about the reindent)
% [ minutes later ]
% ok, one 'new' is gone above as the corresponding section title is gone
+% [romain] fine
\subsection{Comparison}
@@ -125,6 +127,7 @@
%
The \pkg{cxxPack} package \citep{cran:cxxPack} builds on top of
\pkg{Rcpp} and adds a small collection of diverse functions.
+% [romain] So what ? Is this the mention you want to remove ? Go right ahead !
%DE: Removed per editor
%A critical comparison of these packages that addresses relevant aspects such
@@ -141,10 +144,11 @@
\subsection{Rcpp Use Cases} % or some such
\label{sec:classic_rcpp}
-The core focus of \pkg{Rcpp}---particularly for the earlier API described in
-this section---has always been on allowing the programmer to add C++-based
-functions. We use this term in the standard mathematical sense of providing
-results (output) given a set of parameters or data (input). This was
+The core focus of \pkg{Rcpp} has always been on allowing the
+programmer to add C++-based functions.
+We use this term in the standard mathematical sense of providing
+results (output) given a set of parameters or data (input).
+This was
facilitated from the earliest releases using C++ classes for receiving
various types of R objects, converting them to C++ objects and allowing the
programmer to return the results to R with relative ease.
@@ -160,69 +164,9 @@
and parameters are passed via \pkg{Rcpp} to a function set-up to call code
from an external library.
-TODO: Wrap this this so that it ties in better with what follows
+% TODO: Wrap this so that it ties in better with what follows
+% [romain] : should this section be merged with the next one? It looks odd on its own.
-% An illustration can be provided using the time-tested example of a
-% convolution of two vectors. This example is shown in sections 5.2 (for the
-% \code{.C()} interface) and 5.9 (for the \code{.Call()} interface) of 'Writing
-% R Extensions' \citep{R:exts}. We have rewritten it here using classes of the
-% classic \pkg{Rcpp} API:
-
-% \begin{example}
-% #include <Rcpp.h>
-
-% RcppExport SEXP convolve2cpp(SEXP a,SEXP b) \{
-% RcppVector<double> xa(a);
-% RcppVector<double> xb(b);
-% int nab = xa.size() + xb.size() - 1;
-
-% RcppVector<double> xab(nab);
-% for (int i = 0; i < nab; i++) xab(i) = 0.0;
-
-% for (int i = 0; i < xa.size(); i++)
-% for (int j = 0; j < xb.size(); j++)
-% xab(i + j) += xa(i) * xb(j);
-
-% RcppResultSet rs;
-% rs.add("ab", xab);
-% return rs.getReturnList();
-% \}
-% \end{example}
-
-% We can highlight several aspects. First, only a single header file
-% \code{Rcpp.h} is needed to use the \pkg{Rcpp} API. Second, given two
-% \code{SEXP} types, a third is returned. Third, both inputs are converted to
-% templated.
-% \footnote{C++ templates allow functions or classes to be written
-% somewhat independently from the template parameter. The actual class is
-% instantiated by the compiler by replacing occurrences of the templated
-% parameter(s). A simple example would be a templated function
-% \texttt{abs(T)} which returns the negative of the template argument $T$
-% when $T<0$ and $T$ otherwise. While the source code is written with a
-% `templated' type $T$, the compiler will create a concrete instance using an
-% \texttt{int} or \texttt{double} type dependent on the context is which the
-% code is called.}
-% C++ vector types, here a standard \code{double} type is
-% used to create a vector of doubles from the template type. Fourth, the
-% usefulness of these classes can be seen when we query the vectors directly
-% for their size---using the \code{size()} member function---in order to
-% reserve a new result type of appropriate length whereas use based on C arrays
-% would have required additional parameters for the length of vectors $a$ and
-% $b$, leaving open the possibility of mismatches between the actual length and
-% the length reported by the programmer. Fifth, the computation itself is
-% straightforward embedded looping just as in the original examples in the
-% 'Writing R Extensions' manual \citep{R:exts}. Sixth, a return type
-% (\code{RcppResultSet}) is prepared as a named object which is then converted
-% to a list object that is returned. We should note that the
-% \code{RcppResultSet} supports the return of numerous (named) objects which
-% can also be of different types.
-
-% We argue that this usage is already much easier to read, write and debug than the
-% C macro-based approach supported by R itself. Possible performance issues and
-% other potential limitations will be discussed throughout the article and
-% reviewed at the end.
-
-%\section{New \pkg{Rcpp} API}
\section{The \pkg{Rcpp} API}
\label{sec:new_rcpp}
@@ -262,10 +206,16 @@
conversions below). Fourth, the
usefulness of these classes can be seen when we query the vectors directly
for their size---using the \code{size()} member function---in order to
-reserve a new result type of appropriate length whereas use based on C arrays
-would have required additional parameters for the length of vectors $a$ and
-$b$, leaving open the possibility of mismatches between the actual length and
-the length reported by the programmer. Fifth, the computation itself is
+reserve a new result type of appropriate length
+% whereas use based on C arrays
+% would have required additional parameters for the length of vectors $a$ and
+% $b$, leaving open the possibility of mismatches between the actual length and
+% the length reported by the programmer.
+% [romain] : hmmm. There is no need for extra parameters if you use .Call
+% with the R API. I don't think the point is valid.
+and with the use of the
+\verb|operator[]| to extract and set individual elements of the vector.
+Fifth, the computation itself is
straightforward embedded looping just as in the original examples in the
'Writing R Extensions' manual \citep{R:exts}. Sixth, the return conversion
is also automatic from the \code{NumericVector} to the \code{SEXP} type.
@@ -330,8 +280,8 @@
member functions to manage objects in the associated environment.
Similarly, classes related to vectors (\code{IntegerVector}, \code{NumericVector},
\code{RawVector}, \code{LogicalVector}, \code{CharacterVector},
-\code{GenericVector} and \code{ExpressionVector}) expose functionality
-to extract and set values from the vectors.
+\code{GenericVector} (also known as \code{List}) and \code{ExpressionVector})
+expose functionality to extract and set values from the vectors.
The following sub-sections present typical uses of \pkg{Rcpp} classes in
comparison with the same code expressed using functions of the R API.
@@ -377,14 +327,29 @@
the first and second elements of the vector as \code{NumericVector} overloads
the \code{operator[]}.
-With the most recent compilers (e.g. GNU g++ >= 4.4) which already implement
-parts of the next C++ standard (C++0x) currently being drafted, the preceding
-code may even be reduced to this:
+% With the most recent compilers (e.g. GNU g++ >= 4.4) which already implement
+% parts of the next C++ standard (C++0x) currently being drafted, the preceding
+% code may even be reduced to this:
+%
+% \begin{example}
+% Rcpp::NumericVector ab = \{123.45, 67.89\};
+% \end{example}
+% [romain] I'm trading this for the use of create, as this always works
+% so that we don't confuse readers because if you have gcc 4.4
+% you don't get this automatically, you have to enable it, etc ...
+The snippet can also be written more concisely using the \code{create}
+static member function of the \code{NumericVector} class:
+
\begin{example}
-Rcpp::NumericVector ab = \{123.45, 67.89\};
+Rcpp::NumericVector ab =
+ Rcpp::NumericVector::create( 123.45, 67.89 );
\end{example}
+Note that although the copy constructor of the
+\code{NumericVector} class is used, this does not copy the
+underlying array; only the \code{SEXP} is copied.
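The shallow-copy behaviour described in the added paragraph can be sketched without R at all. The following is only an analogy in plain C++: `Handle` is a hypothetical class (not part of Rcpp) whose copy constructor copies a shared pointer rather than the data, in the way that copying a `NumericVector` copies the `SEXP` rather than the underlying array.

```cpp
#include <cstddef>
#include <memory>
#include <utility>
#include <vector>

// Hypothetical handle type (an assumption for illustration, not Rcpp code):
// copying a Handle copies the pointer, not the array it refers to.
class Handle {
public:
    explicit Handle(std::vector<double> values)
        : data_(std::make_shared<std::vector<double>>(std::move(values))) {}

    // Element access goes through the shared storage.
    double& operator[](std::size_t i) { return (*data_)[i]; }

    // Address of the underlying array, to check that copies share it.
    const double* address() const { return data_->data(); }

private:
    std::shared_ptr<std::vector<double>> data_;
};
```

Under these assumptions, after `Handle b = a;` a write through `b[0]` is visible through `a[0]`, because both handles refer to the same storage.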
+
\subsection{Character vectors}
A second example deals with character vectors and emulates this R code
@@ -439,7 +404,7 @@
object and converts this object into a \code{SEXP}, which is what R expects.
Currently wrappable types are :
\begin{itemize}
-\item primitive types, \code{int}, \code{double}, ... which are converted
+\item primitive types: \code{int}, \code{double}, ... which are converted
into the corresponding atomic R vectors;
\item \code{std::string} which are converted to R atomic character vectors;
\item STL containers such as \code{std::vector<T>} or \code{std::list<T>},
@@ -449,16 +414,14 @@
the type \code{T} is wrappable;
\item any type that implements implicit conversion to \code{SEXP} through the
\code{operator SEXP()};
-\item any type for which the \code{wrap} template is partially or fully
-specialized.
+\item any type for which the \code{wrap} template is % partially or [romain] partially is not true anymore
+fully specialized.
\end{itemize}
-%One example for the specialisation of the templated \code{wrap} function is
-%provided in \pkg{RInside} \citep{cran:rinside} by \code{vector< vector<
-% double > >} and \code{vector< vector< int > >} which are used for
-%representing numeric matrices.
Wrappability of an object type is resolved at compile time using
-modern techniques of template meta programming and class traits.
+modern techniques of template meta programming and class traits. The
+\code{Rcpp-extending} vignette discusses in depth how to extend \code{wrap}
+to third-party types, and the \pkg{RcppArmadillo} package features several examples.
The following code snippet illustrates that the design allows
composition:
@@ -490,7 +453,7 @@
\code{Rcpp::as} template whose signature is:
\begin{example}
template <typename T>
-T as(SEXP x);
+T as(SEXP x) throw(not_compatible) ;
\end{example}
It offers less flexibility and currently
@@ -534,16 +497,19 @@
\end{example}
In the first part of the example, the code extracts a
-\code{std::vector<double>} from the global environment. This is
-achieved by the templated \code{operator[]} of \code{Environment}
-that first extracts the requested object from the environment as a \code{SEXP},
-and then outsources to \code{Rcpp::as} the creation of the
-requested type.
+\code{std::vector<double>} from the global environment. In order to achieve this,
+the \code{operator[]} of \code{Environment} uses the proxy pattern to distinguish
+between left-hand side (LHS) and right-hand side (RHS) use.
+% [TODO] : reference (meyers more effective C++ I think?)
+The output of the operator is an instance of the nested class
+\code{Environment::Binding}, which defines a templated implicit conversion
+operator that allows a \code{Binding} to be assigned to any type that
+\code{Rcpp::as} is able to handle.
-In the second part of the example, the \code{operator[]}
-delegates to \code{wrap} the production of an R object based on the
-type that is passed in (\code{std::map<std::string,std::string>}),
-and then assigns the object to the requested name.
+In the second part of the example, LHS use of the \code{Binding} instance is
+implemented through its assignment operator, which is also templated and uses
+\code{Rcpp::wrap} to perform the conversion to a \code{SEXP} that can be
+assigned to the requested symbol in the global environment.
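The proxy pattern described in the two revised paragraphs can be illustrated with a small sketch in plain C++. It does not use Rcpp: `Env` is a hypothetical stand-in for an environment that stores doubles, and its nested `Binding` class only mirrors the role described for `Environment::Binding`. The `static_cast` calls stand in for the conversions that `Rcpp::as` and `Rcpp::wrap` would perform.

```cpp
#include <map>
#include <string>

// Hypothetical stand-in for an environment: names bound to doubles.
class Env {
public:
    // Proxy object returned by operator[].
    class Binding {
    public:
        Binding(Env& env, const std::string& name) : env_(env), name_(name) {}

        // LHS use, e.g. env["x"] = 42;
        // the templated assignment converts the value and stores it.
        template <typename T>
        Binding& operator=(const T& value) {
            env_.store_[name_] = static_cast<double>(value);
            return *this;
        }

        // RHS use, e.g. int n = env["x"];
        // the templated conversion operator produces the requested type.
        template <typename T>
        operator T() const {
            return static_cast<T>(env_.store_.at(name_));
        }

    private:
        Env& env_;
        std::string name_;
    };

    // operator[] never touches the data itself; it only hands out a proxy.
    Binding operator[](const std::string& name) { return Binding(*this, name); }

private:
    std::map<std::string, double> store_;
};
```

With this sketch, `env["x"] = 42;` goes through the assignment operator of the proxy, while `int n = env["x"];` goes through its conversion operator, so a single `operator[]` serves both directions.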
The same mechanism is used throughout the API. Examples include access/modification
of object attributes, slots, elements of generic vectors (lists),
@@ -627,8 +593,8 @@
that is easier to read, write and maintain. More examples are available as
part of the documentation included in the \pkg{Rcpp} package, as well as
among its over one hundred and ninety unit tests.
+% TODO: bump this up to the current test count
-
\section{Using code `inline'}
\label{sec:inline}
@@ -668,8 +634,7 @@
\pkg{inline} as shown below. The function body is provided by the character
variable \code{src}, the function header is defined by the argument
\code{signature}---and we only need to enable \code{plugin="Rcpp"} to obtain a
-new function \code{fun} based on the C++ code in \code{src} where we also
-switched from the classic \pkg{Rcpp} API to the new one:
+new function \code{fun} based on the C++ code in \code{src}:
\begin{example}
> src <- '
@@ -686,17 +651,23 @@
> fun <- cxxfunction(
+ \ \ \ \ signature(a="numeric", b="numeric"),
+ \ \ \ \ src, plugin="Rcpp")
+> fun( 1:3, 1:4 )
+[1] 1 4 10 16 17 12
\end{example}
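The output added to the example above can be checked independently of R. The sketch below is not the code held in `src` (most of which is elided by this hunk); it is a plain STL version of the same convolution, and the function name `convolve` is chosen here for illustration only.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper (not part of Rcpp): convolution of two non-empty vectors.
std::vector<double> convolve(const std::vector<double>& a,
                             const std::vector<double>& b) {
    assert(!a.empty() && !b.empty());
    // The result is value-initialised to 0.0, much as numeric(n) is in R.
    std::vector<double> ab(a.size() + b.size() - 1);
    for (std::size_t i = 0; i < a.size(); ++i)
        for (std::size_t j = 0; j < b.size(); ++j)
            ab[i + j] += a[i] * b[j];
    return ab;
}
```

Calling `convolve` on the vectors (1, 2, 3) and (1, 2, 3, 4) yields (1, 4, 10, 16, 17, 12), which matches the output of `fun( 1:3, 1:4 )` shown in the example.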
-The main difference to the previous solution is that the input parameters are
-directly passed to types \code{Rcpp::NumericVector}, and that the return
-vector is automatically converted to a \code{SEXP} type through implicit
-conversion. Also in this version, the vector \code{xab} is not
-initialized because the constructor already performs initialization
-to match the behavior of the R function \code{numeric}.
+% The main difference to the previous solution is that the input parameters are
+% directly passed to types \code{Rcpp::NumericVector}, and that the return
+% vector is automatically converted to a \code{SEXP} type through implicit
+% conversion.
+% Also in this version, the vector \code{xab} is not
+% initialized because the constructor already performs initialization
+% to match the behavior of the R function \code{numeric}.
\section{Using STL algorithms}
+% [romain] hmmmm. we do now have sapply and lapply. I think we should mention
+% them here.
+
% This is taken from :
% http://www.cplusplus.com/reference/algorithm/
@@ -714,7 +685,6 @@
ellipsis (\code{...}).} version of \code{lapply}
using the \code{transform} algorithm from the STL.
-% [Romain] does the code need comments ?
\begin{example}
> src <- '
+ Rcpp::List input(data);
@@ -759,38 +729,76 @@
\subsection{C++ exceptions in R}
-The traditional way of dealing with C++ exceptions in R is to
-catch them through explicit try/catch blocks and
-convert this exception into an R error manually.
+The internals of the R condition mechanism and the implementation of
+C++ exceptions are both based on a layer above POSIX jumps. These layers
+both assume total control over the call stack and should not be used together
+without extra precaution. \pkg{Rcpp} contains facilities to combine both systems
+so that a C++ exception is caught and recycled into the R condition
+mechanism.
-In C++, when an application throws an exception that is not caught,
-a special function (called the terminate handler) is invoked. This typically causes
-the program to abort. \pkg{Rcpp} takes advantage of this mechanism
-and installs its own terminate handler which translates C++
-exceptions into R conditions. The following code gives an illustration.
+\pkg{Rcpp} defines the \code{BEGIN\_RCPP} and \code{END\_RCPP} macros that should
+be used to bracket code that might throw C++ exceptions.
\begin{example}
-> fun <- cxxfunction(signature(x = "integer"), '
-+ int dx = as<int>(x);
-+ if( dx > 10 )
-+ throw std::range_error("too big") ;
-+ return wrap(dx*dx);
-+ ', plugin="Rcpp",
-+ includes = "using namespace Rcpp;" )
-> tryCatch( fun(12),
-+ "std::range_error" = function(e){
-+ writeLines( conditionMessage(e) )
-+ } )
-too big
+RcppExport SEXP fun( SEXP x )\{
+BEGIN_RCPP
+ int dx = Rcpp::as<int>(x);
+ if( dx > 10 )
+ throw std::range_error("too big") ;
+ return Rcpp::wrap( dx * dx) ;
+END_RCPP
+\}
\end{example}
+The macros are defined simply to avoid code repetition; they expand to
+simple try/catch blocks:
+
+\begin{example}
+RcppExport SEXP fun( SEXP x )\{
+ try\{
+ int dx = Rcpp::as<int>(x);
+ if( dx > 10 )
+ throw std::range_error("too big") ;
+ return Rcpp::wrap( dx * dx) ;
+ \} catch( std::exception& __ex__ )\{
+ forward_exception_to_r( __ex__ ) ;
+ \} catch(...)\{
+ ::Rf_error( "c++ exception (unknown reason)" ) ;
+ \}
+\}
+\end{example}
+
+Using \code{BEGIN\_RCPP} and \code{END\_RCPP} --- or the expanded versions ---
+guarantees that the stack is first unwound in terms of C++ exceptions, before
+the problem is converted to the standard R error management system (\code{Rf\_error}).
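The unwinding guarantee stated above can be made visible with a sketch that needs neither R nor Rcpp. Everything in it is an assumption made for illustration: `BEGIN_GUARD` and `END_GUARD` are hypothetical macros written in the spirit of `BEGIN_RCPP` and `END_RCPP`, and `report_error` is a stand-in for the hand-over to the R error system.

```cpp
#include <exception>
#include <stdexcept>
#include <string>

// Hypothetical stand-ins for the R side: the reported message, and whether
// the local object had already been destroyed when the error was reported.
std::string reported_message;
bool local_destroyed = false;
bool destroyed_before_report = false;

void report_error(const std::string& message) {
    reported_message = message;
    destroyed_before_report = local_destroyed;
}

// A local object whose destructor records that it ran.
struct Local {
    ~Local() { local_destroyed = true; }
};

// Hypothetical macros (not the Rcpp ones) that bracket a function body.
#define BEGIN_GUARD try {
#define END_GUARD \
    } \
    catch (std::exception& ex) { report_error(ex.what()); } \
    catch (...) { report_error("c++ exception (unknown reason)"); } \
    return -1;

int square_if_small(int dx) {
BEGIN_GUARD
    Local local;  // destroyed during unwinding if an exception is thrown
    if (dx > 10) throw std::range_error("too big");
    return dx * dx;
END_GUARD
}
```

In this sketch the destructor of `local` runs before either handler body is entered, so `report_error` always observes `local_destroyed == true` when an exception was thrown. That is the ordering the paragraph above describes: C++ unwinding first, error reporting second.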
+
+The \code{forward\_exception\_to\_r} function uses run-time type information to
+extract information about the class of the C++ exception and its message, so that
+dedicated handlers can be installed on the R side.
+
+\begin{example}
+> f <- function(x) .Call( "fun", x )
+> tryCatch( f( 12 ),
++ "std::range_error" = function(e) \{
++ conditionMessage( e )
++ \} )
+[1] "too big"
+> tryCatch( f( 12 ),
++ "std::range_error" = function(e) \{
++ class( e )
++ \} )
+[1] "std::range_error" "C++Error"
+[3] "error" "condition"
+\end{example}
+
\subsection{R error in C++}
R currently does not offer C-level mechanisms to deal with errors. To
overcome this problem, \pkg{Rcpp} uses the \code{Rcpp::Evaluator}
class to evaluate an expression with an R-level \code{tryCatch}
block. The error, if any, that occurs while evaluating the
-function is then translated into an C++ exception.
+function is then translated into a C++ exception that can be dealt with using
+regular C++ try/catch syntax.
\section{Performance comparison}
@@ -809,9 +817,7 @@
from R to C++ and back.
Here we illustrate how to take advantage of \code{Rcpp} to get
-the best of both worlds. The classic \pkg{Rcpp} translation of the convolve example from
-\cite{R:exts} appears twice above where the second example showed the use
-with the new API.
+the best of both worlds.
The implementation of the \code{operator[]} is designed as
efficiently as possible, using both inlining and caching,