[Rcpp-commits] r2768 - pkg/RcppDE/inst/doc
noreply at r-forge.r-project.org
Sat Dec 11 20:35:15 CET 2010
Author: edd
Date: 2010-12-11 20:35:14 +0100 (Sat, 11 Dec 2010)
New Revision: 2768
Added:
pkg/RcppDE/inst/doc/Rcpp.Rnw
Removed:
pkg/RcppDE/inst/doc/RcppDE.tex
Log:
RcppDE.tex is now RcppDE.Rnw
Copied: pkg/RcppDE/inst/doc/Rcpp.Rnw (from rev 2767, pkg/RcppDE/inst/doc/RcppDE.tex)
===================================================================
--- pkg/RcppDE/inst/doc/Rcpp.Rnw (rev 0)
+++ pkg/RcppDE/inst/doc/Rcpp.Rnw 2010-12-11 19:35:14 UTC (rev 2768)
@@ -0,0 +1,2258 @@
+%% use JSS class but for now with nojss option
+\documentclass[nojss,shortnames,article]{jss}
+\usepackage{rotating}
+%\usepackage{float}
+\usepackage{flafter}
+\usepackage{booktabs}
+
+\author{Dirk Eddelbuettel\\Debian Project} % \And Second Author\\Plus Affiliation}
+\title{From \pkg{DEoptim} to \pkg{RcppDE}: \\
+ A case study in porting from \proglang{C} to \proglang{C++} \\
+ using \pkg{Rcpp} and \pkg{RcppArmadillo}}
+
+\Plainauthor{Dirk Eddelbuettel} % , Second Author} %% comma-separated
+\Plaintitle{DEoptim: A case study in porting to C++ and Rcpp}
+\Shorttitle{A case study in porting to C++ and Rcpp}
+
+\Abstract{
+ \noindent
+ \pkg{DEoptim} \citep{MullenArdiaEtAl:2009:DEoptim,ArdiaBoudtCarlEtAl:2010:DEoptim,CRAN:DEoptim}
+ provides differential evolution optimisation for
+ \proglang{R}. It is based on an implementation by Storn
+ \citep{PriceStornLampinen:2006:DE} and was originally implemented as an
+ interpreted \proglang{R} script. It was then rewritten in ANSI \proglang{C}, which
+ resulted in much improved performance.
+
+ The present paper introduces another implementation. This version is
+ written in \proglang{C++} based on the \pkg{Rcpp} package \citep{CRAN:Rcpp}
+ which provides tools for a more direct integration of \proglang{R} objects at the \proglang{C++}
+ level---and vice versa. It also uses the \pkg{RcppArmadillo} package \citep{CRAN:RcppArmadillo}
+ which provides an interface from \proglang{R} to the \pkg{Armadillo} linear algebra
+ package written in \proglang{C++} by Sanderson \citep{Sanderson:2010:Armadillo}.
+}
+
+\Keywords{\pkg{Rcpp}, \pkg{RcppArmadillo}, \pkg{DEoptim}, differential
+ evolution, genetic algorithm} %% at least one keyword must be supplied
+\Plainkeywords{Rcpp, RcppArmadillo, DEoptim, differential evolution, genetic algorithm} %% without formatting
+
+%% publication information
+%% NOTE: Typically, this can be left commented and will be filled out by the technical editor
+%% \Volume{13}
+%% \Issue{9}
+%% \Month{September}
+%% \Year{2004}
+%% \Submitdate{2004-09-29}
+%% \Acceptdate{2004-09-29}
+
+\Address{
+ Dirk Eddelbuettel \\
+ Debian Project \\
+ River Forest, IL, USA\\
+ %% Telephone: +43/1/31336-5053
+ %% Fax: +43/1/31336-734
+ E-mail: \email{edd at debian.org}\\
+ URL: \url{http://dirk.eddelbuettel.com}
+}
+
+%% need no \usepackage{Sweave.sty}
+
+% ------------------------------------------------------------------------
+
+\begin{document}
+
+\section{Introduction}
+
+\pkg{DEoptim}
+\citep{MullenArdiaEtAl:2009:DEoptim,ArdiaBoudtCarlEtAl:2010:DEoptim,CRAN:DEoptim}
+provides differential evolution optimisation for the \proglang{R} language
+and statistical environment. Differential evolution is one of several
+evolutionary computing approaches; genetic algorithms and simulated annealing
+are two other ones. Differential evolution is reasonably close to genetic
+algorithms but differs in one key aspect: parameter values are encoded as
+floating point values (rather than sequences of binary digits) which makes it
+particularly suitable for real-valued optimisation problems.
+
+\pkg{DEoptim} is based on an implementation by Storn
+\citep{PriceStornLampinen:2006:DE}. It was originally implemented as an
+(interpreted) \proglang{R} script before being rewritten in (compiled)
+\proglang{C}, which resulted in much improved performance. \pkg{DEoptim} is
+being used to solve problems from a wide range of domains, ranging
+from crystallography \citep{MullenKrayzmanLevin:2010:Atomic} to agricultural
+economics \citep{BoernerHigginsKantelhardt:2007:Rainfall} and computational
+finance \citep{BoudtPetersonCarl:2008:HFPortfolio}. It is also being used by
+two other CRAN packages for R: \pkg{micEconCES} \citep{CRAN:micEconCES} and
+\pkg{selectMeta} \citep{CRAN:selectMeta}.
+
+The present paper introduces the \proglang{R} package \pkg{RcppDE}. It
+provides another iteration as far as implementations of differential
+evolution go. This new version is based very closely on \pkg{DEoptim} but
+written in \proglang{C++}. The implementation employs the \pkg{Rcpp} package
+\citep{CRAN:Rcpp} which provides tools for a more direct integration of
+\proglang{R} objects at the \proglang{C++} level---and vice versa. It also
+uses the \pkg{RcppArmadillo} package \citep{CRAN:RcppArmadillo} which
+provides an interface from \proglang{R} to the \pkg{Armadillo} linear algebra
+package written in \proglang{C++} by Sanderson
+\citep{Sanderson:2010:Armadillo}.
+
+The code structure descends directly from the current \pkg{DEoptim} by
+\cite{CRAN:DEoptim}. The conversion to \proglang{C++} was undertaken to see
+whether one or more of the goals \textsl{shorter}, \textsl{easier} and
+\textsl{faster} could be achieved by switching the implementation
+language. These goals were loosely defined as follows:
+\begin{itemize}
+\item[shorter] replacing code that is by necessity somewhat verbose when
+ written in \proglang{C} with more compact code written in \proglang{C++}:
+ an example would be copying of a matrix which is implemented as a dual loop
+ copying each element---whereas \proglang{C++} allows us to use a single
+ (overloaded) \verb|=| operator and hence a single statement;
+\item[easier] this may appear as a corollary to the previous point but really
+ covers other aspects such as the automatic type conversion offered by
+ \pkg{Rcpp} as well as the automatic memory management: by replacing
+ allocation and freeing of heap-based dynamic memory, a consistent source of
+ programmer error would be eliminated---plus we are not aiming for `short and
+ incomprehensible' in the APL-sense but for possible improvements on
+ \textsl{both} the length and the ease of comprehension without trading one
+ off against the other;
+\item[faster] this may be a bit more of a conjecture as ultimately,
+ \proglang{C++} and \proglang{C} can be expected to be roughly equivalent
+ given matching compiler versions etc; however gains may be expected from
+ replacing a copying operation of a block of adjacent memory cells with a
+ single \verb|memcpy()| call done behind the scenes; \pkg{RcppArmadillo}
+ also offers further possible gains from template metaprogramming which can
+ result in the elimination of temporary objects in complex expressions where,
+ loosely speaking, compile-time effort is traded for later run-time
+ performance.
+\end{itemize}
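+
+As a rough illustration of the \textsl{shorter} goal, the sketch below
+contrasts an element-wise copy in two nested loops with a single assignment.
+It uses only the \proglang{C++} standard library: the type \texttt{Matrix} and
+both function names are hypothetical stand-ins chosen for this example and are
+not taken from \pkg{DEoptim} or \pkg{RcppDE}.
+

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for a dense matrix: a vector of rows.
using Matrix = std::vector<std::vector<double>>;

// C-style copy: two nested loops over storage that is already sized.
void copyWithLoops(const Matrix& src, Matrix& dst) {
    assert(src.size() == dst.size());
    for (std::size_t i = 0; i < src.size(); ++i) {
        assert(src[i].size() == dst[i].size());
        for (std::size_t j = 0; j < src[i].size(); ++j) {
            dst[i][j] = src[i][j];
        }
    }
}

// C++-style copy: the overloaded assignment operator copies every element
// (and resizes the destination if needed).
void copyWithAssignment(const Matrix& src, Matrix& dst) {
    dst = src;
}
```

+
+Both functions leave the destination equal to the source; the second one
+additionally removes the need to size the destination beforehand.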
+
+This paper is organised as follows. The next section describes the structure
+of \pkg{DEoptim} which \pkg{RcppDE} shadows closely. The following two
+sections compare differences at the \proglang{C++} and \proglang{R} level,
+respectively. Next, changes in auxiliary files are discussed before a
+short section reviews performance changes. A summary concludes. The
+appendix contains a list of figures contrasting the two implementations.
+
+\section[DEoptim structure]{\pkg{DEoptim} structure}
+
+\pkg{DEoptim} is a straightforward and well-implemented package. Its
+functionality is provided by three \proglang{R} files, as well as three
+\proglang{C} files.
+
+In the transition from \pkg{DEoptim} to \pkg{RcppDE} many more changes were
+made to the \proglang{C} files: besides the obvious porting from \proglang{C}
+to \proglang{C++}, several internal code changes were made. We discuss these
+changes below. An important point to note is that the overall architecture
+and API remain as unchanged as possible.
+%
+On the other hand, very few changes were required at the \proglang{R}
+level. The user-facing side of \pkg{DEoptim} persists virtually unchanged
+(with one or two changes discussed below).
+
+Because of the dominant number of changes at the level of the compiled
+languages, we discuss the structure, and later on changes, of this part first
+before turning to the \proglang{R} side.
+
+\subsection[C / C++ structure and changes]{\proglang{C} / \proglang{C++} structure and changes}
+
+\begin{table}[htb]
+ \begin{center}
+ \begin{tabular}{lrclr}
+ \toprule
+ \multicolumn{2}{c}{\pkg{DEoptim}} & & \multicolumn{2}{c}{\pkg{RcppDE}} \\
+ File & Functions & & File & Functions\\
+ % \cmidrule{1-2,3-4} \\
+ \midrule
+ \verb|de4_0.c| & \verb|DEoptimC()| & & \verb|deoptim.cpp| & \verb|DEoptim()| \\
+ & \verb|devol()| & & \verb|devol.cpp| & \verb|devol()| \\
+ & \verb|permute()| & & \verb|permute.cpp| & \verb|permute()| \\[6pt]
+ \verb|evaluate.c|& \verb|evaluate()| & & & \\[6pt]
+ & & & \verb|evaluate.h|& \phantom{X} \verb|EvalBase class| \\[6pt]
+ \verb|get_element.c|\phantom{X} & \verb|getListElement()| & \phantom{X} & & \\
+ \bottomrule
+ \end{tabular}
+ \caption{Source file organisation for \proglang{C} files in \pkg{DEoptim}
+ and \proglang{C++} files in \pkg{RcppDE}}
+ \label{tab:Cfiles}
+ \end{center}
+\end{table}
+
+Table~\ref{tab:Cfiles} lists the \proglang{C} and \proglang{C++}
+files in \pkg{DEoptim} and \pkg{RcppDE}, respectively.
+The large file \verb|de4_0.c| has been split into three files: one each for
+the core functions \verb|DEoptim()| (which is called from \proglang{R}),
+\verb|devol()| (which is the core differential evolution optimisation
+routine) and \verb|permute()| (which is a helper function used to shuffle
+indices).
+
+The evaluation function has been replaced by an abstract base class and two
+derived classes. These can now make use of an objective function written in \proglang{R}
+(as in \pkg{DEoptim}) as well as one written in \proglang{C++} which can lead
+to substantial speed improvements. Section~\ref{sec:Cppchanges} discusses
+these changes in more detail.
+
+
+\subsection[R structure and changes]{\proglang{R} structure and changes}
+
+Table~\ref{tab:Rfiles} lists the files and corresponding key functions. Very
+few changes had to be made for \pkg{RcppDE} as keeping the interface compatible
+was an important goal. As can be seen from table~\ref{tab:Rfiles}, no files
+or functions were added. A more detailed comparison follows below in
+section~\ref{sec:Rchanges}.
+
+\begin{table}[tb]
+ \begin{center}
+ \begin{tabular}{lr}
+ \toprule
+ File & Functions \\
+ \midrule
+% \cmidrule{2}
+ \verb|DEoptim.R| \phantom{XXXXX} & \verb|DEoptim()| \\
+ & \verb|DEoptim.control()| \\[6pt]
+ \verb|methods.R| & \verb|summary.DEoptim()| \\
+ & \verb|plot.DEoptim()| \\[6pt]
+ \verb|zzz.R| & \verb|.onLoad()| \\
+ \bottomrule
+ \end{tabular}
+ \caption{Source file organisation for \proglang{R} files in \pkg{DEoptim}
+ and \pkg{RcppDE}}
+ \label{tab:Rfiles}
+ \end{center}
+\end{table}
+
+
+\section[C / C++ changes]{\proglang{C} / \proglang{C++} changes}
+\label{sec:Cppchanges}
+
+In this section, we will look at the changes at the \proglang{C} /
+\proglang{C++} level. Figures~\ref{fig:deoptim_start} to
+\ref{fig:deoptim_end} contain the code of the highest-level \proglang{C++}
+function: \verb|DEoptim()| (which we renamed from \verb|DEoptimC()| as there
+is no need for a different name at the \proglang{C} level relative to
+\proglang{R}). This is followed by figures~\ref{fig:devol_start} to
+\ref{fig:devol_return} on the main worker function \verb|devol()| before
+figure~\ref{fig:evaluate_fun} compares the objective function evaluation
+as the last element at the \proglang{C} / \proglang{C++} level.
+
+\subsection[de4_0.c and deoptim.cpp]{\code{de4\_0.c} and \code{deoptim.cpp}}
+
+The \verb|DEoptim()| function is the entry point
+from \proglang{R}. It receives parameters, sets up the call of \verb|devol()|
+and then prepares the return values.
+
+\paragraph{Part 1: Start of \texttt{DEoptim()}} The first part concerns
+itself with receiving parameters from \proglang{R};
+figure~\ref{fig:deoptim_start} displays this. The pure mechanics of passing and
+receiving parameters from \proglang{R} are easier thanks to logic provided by
+the \pkg{Rcpp} package:
+\begin{enumerate}
+\item Figure~\ref{fig:deoptim_start} illustrates this point: Panel B (with code
+ using \proglang{C++}) appears to be about half the size of panel A but this
+ is due in part to bringing comments on the same line as code. On the other
+ hand, we save for example the declaration of ten \texttt{SEXP} variables as
+ \pkg{Rcpp} objects can be converted directly to \texttt{SEXP} type.
+\item Instead of using a mix of macros like \verb|NUMERIC_VALUE|,
+ \verb|INTEGER_VALUE|, \texttt{NUMERIC\_POINTER} and so on, we have a
+ consistent use of the \pkg{Rcpp} template function \verb|as| with template
+ types corresponding to base types \verb|int|, \verb|double| etc. Also of
+ note is how one matrix object (\texttt{initialpopm} for seeding a first
+ population of parameter values) is initialized directly from a parameter.
+\item Parameter lookup is by a string value but done using the \pkg{Rcpp} lookup
+ of elements in the \verb|list| type (which corresponds to the \proglang{R}
+ list passed in) rather than via a (functionally similar but ad-hoc) function
+ \verb|getListElement| that hence is no longer needed in \pkg{RcppDE}.
+\item Here as in later code examples, care was taken to ensure that variable
+ names and types correspond closely between both variants.
+\end{enumerate}
+
+\paragraph{Part 2: Middle of \texttt{DEoptim()}} The second part,
+displayed in figure~\ref{fig:deoptim_memory}, allocates dynamic memory for
+both parameters returned to \proglang{R} as well as for temporary objects
+required to store the results of intermediate computations. Again, panel A
+shows the \proglang{C} code from \pkg{DEoptim} whereas panel B displays the
+\proglang{C++} code from \pkg{RcppDE}. One difference becomes immediately apparent: the lack
+of proper matrix or vector types in \proglang{C}. We use the classes from the
+\pkg{Armadillo} \proglang{C++} library written by
+\cite{Sanderson:2010:Armadillo} and provided via the \proglang{R} package
+\pkg{RcppArmadillo} by \citet{CRAN:RcppArmadillo}.
+\begin{enumerate}
+\item Matrix objects are created in \proglang{C} by first allocating a vector
+ of pointers to pointers, which is followed by a loop in which each
+ column is allocated as a vector of appropriate length.
+\item In \proglang{C++}, allocating a matrix is a single statement. Memory is
+ managed by reference counting and is freed when objects go out of
+ scope. This removes a \textsl{significant} portion of programmer errors.
+\item Another subtle difference is in the allocations of the container
+ holding different population snapshots, here called \texttt{d\_storepop}:
+ \pkg{Rcpp} lets us create a list object in which we store matrices, just as
+ we would in \proglang{R} whereas the \proglang{C} construct is much more
+ complicated as we will see below.
+\item A subtle point discussed more below is that \pkg{RcppDE} stores
+ population members column-wise rather than row-wise. Whereas matrices on the
+ left in panel A have dimension $n \times k$, we allocate them as $k \times
+ n$ matrices in panel B.
+\end{enumerate}
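+
+The first two points can be sketched with the standard library alone. The
+function names below are hypothetical, \texttt{std::calloc} stands in for
+whatever allocator the \proglang{C} code actually uses, and
+\texttt{std::vector} stands in for the \pkg{Armadillo} matrix class; the
+sketch only aims to show the difference in bookkeeping.
+

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <vector>

// C-style: allocate a vector of pointers, then each slice in a loop.
// Assumes nrow > 0 and ncol > 0.
double** allocMatrixC(std::size_t nrow, std::size_t ncol) {
    double** m = static_cast<double**>(std::calloc(nrow, sizeof(double*)));
    assert(m != nullptr);
    for (std::size_t i = 0; i < nrow; ++i) {
        m[i] = static_cast<double*>(std::calloc(ncol, sizeof(double)));
        assert(m[i] != nullptr);
    }
    return m;
}

// Every manual allocation needs a matching release, in reverse order.
void freeMatrixC(double** m, std::size_t nrow) {
    for (std::size_t i = 0; i < nrow; ++i) std::free(m[i]);
    std::free(m);
}

// C++-style: a single statement; the storage is released automatically
// when the returned object goes out of scope.
std::vector<std::vector<double>> allocMatrixCpp(std::size_t nrow,
                                                std::size_t ncol) {
    return std::vector<std::vector<double>>(nrow,
                                            std::vector<double>(ncol, 0.0));
}
```

+
+Forgetting the call to \texttt{freeMatrixC()} leaks memory, whereas the
+second variant has no corresponding release step that could be forgotten.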
+
+\paragraph{Part 3: End of \texttt{DEoptim()}} The third and last part of
+\verb|DEoptim()| covers the actual call of the worker function \verb|devol()|
+and the preparation of return values for \proglang{R}. As
+figure~\ref{fig:deoptim_end} shows, this section realized a significant
+reduction in source code size.
+
+\begin{enumerate}
+\item The \verb|devol()| function is called: as we aim to maintain
+ interfaces, the call is unchanged between both approaches shown in
+ figure~\ref{fig:deoptim_end}.
+\item The code following the function call is very different. The new
+ version is shorter for a number of reasons:
+ \begin{enumerate}
+ \item No need to create new temporary variables just to convert to
+ \texttt{SEXP} types for return to \proglang{R} as the \pkg{Rcpp} package takes
+ care of this: seamless conversion back to \proglang{R} is a key feature.
+ \item No need to allocate memory for new temporary variables (as we do not
+ need these variables, and even if we did memory allocation would be implicit).
+ \item No need to \texttt{PROTECT} and later \texttt{UNPROTECT} such dynamic
+ memory allocations (because this is handled automatically behind the scenes).
+ \item No need for an explicit new list object to hold the eight return variables.
+ \item No need to explicitly assign names for these eight return
+ variables; this is done implicitly while we create the returned list object.
+ \end{enumerate}
+\item Rather, a mere two statements are executed: the call to \verb|devol()|
+ followed by a single call to create a return object as a list with named
+ elements which are simply inserted---just like we would in \proglang{R} itself.
+\item The remaining code takes care of exception handling by providing two
+ \verb|catch()| branches. These either forward a recognised exception to
+ \proglang{R}, or (in the case of an unrecognised exception) signal a
+ generic error.
+\end{enumerate}
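+
+The two-branch exception handling mentioned in the last point follows a
+general \proglang{C++} pattern, sketched here without any \proglang{R}
+dependency. The function \texttt{runGuarded()} and the text of the generic
+message are assumptions made for this example; in an \proglang{R} extension
+the message would be handed to the \proglang{R} error mechanism rather than
+returned as a string.
+

```cpp
#include <cassert>
#include <exception>
#include <functional>
#include <stdexcept>
#include <string>

// Runs `work` and converts any exception into an error message.
// Returns an empty string if no exception was thrown.
std::string runGuarded(const std::function<void()>& work) {
    try {
        work();
    } catch (const std::exception& ex) {
        // Recognised exception: keep its own message.
        return ex.what();
    } catch (...) {
        // Anything else: fall back to a generic message.
        return "unknown C++ exception";
    }
    return "";
}
```

+
+The order of the branches matters: the catch-all branch has to come last, as
+it would otherwise also intercept the recognised exceptions.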
+
+In sum, we see how a number of (possibly small) enhancements taken together
+permit us to write a function which is considerably shorter and easier to read, yet
+fully equivalent in terms of its functionality.
+
+\subsection[de4_0.c and devol.cpp]{\code{de4\_0.c} and \code{devol.cpp}}
+
+The \verb|devol()| function is the key part of the \pkg{DEoptim}
+implementation. It is also by far the largest function. We will discuss it
+again in different sections, each corresponding to one figure ranging from
+figure~\ref{fig:devol_start} to figure~\ref{fig:devol_return}.
+
+\paragraph{Part 1: Start of \texttt{devol()}} The first part concerns the
+beginning of \verb|devol()|. The display (in
+figure~\ref{fig:devol_start}) of panels A and B differs mostly in minor
+aspects:
+\begin{enumerate}
+\item The \proglang{C} version contains declarations of a number of loop
+ variables that are either not needed at all in the \proglang{C++} version,
+ or declared locally.
+\item The urn depth is defined as a \proglang{C} macro and a constant
+ variable, respectively.
+\item The \proglang{C++} version has an additional short block to set up the
+ proper evaluation class for the user supplied function, depending on
+ whether an external pointer object is passed (in which case we expect a
+ compiled function) or not (in which case an \proglang{R} routine is used,
+ just like in \pkg{DEoptim}).
+\item The \texttt{sortIndex} vector is filled with index values only in case
+ strategy six has been selected as it is not used otherwise.
+\end{enumerate}
+
+\paragraph{Part 2: Initializations in \texttt{devol()}}
+
+The second part of \verb|devol()| deals with the creation and initialization
+of a number of variables. The \proglang{C} language code in panel A is
+clearly more verbose and longer than the \proglang{C++} code in panel B. As
+shown in figure~\ref{fig:devol_init}, key differences are:
+\begin{enumerate}
+\item Initialization of matrices to zero values uses two explicit loops in
+ the \proglang{C} version.\footnote{The \texttt{memset()} function could be
+ used in the \proglang{C} version to avoid the loops for a minor
+ performance gain.} In \proglang{C++}, we simply use the member function
+ \verb|zeros()| provided by the \pkg{Armadillo} library.
+\item In panel B, the initial population in variable \texttt{initialpopm}
+ is transposed. We keep each population member as a \textsl{column} rather
+ than a \textsl{row} as memory can generally be accessed faster column-wise.
+\item The actual initialization of the first population is very comparable;
+ in particular the \proglang{R} random number generator is called in the
+ exact same sequence all throughout \pkg{RcppDE} so that results are in fact
+ identical to those obtained from \pkg{DEoptim}.
+\item The initial population evaluation occurs with a call to
+ \verb|evaluate()| in the original version, and with a call of the member
+ function of the evaluation class in the new version, which will call either
+ the supplied compiled function or the supplied \proglang{R} function.
+\end{enumerate}
+
+\paragraph{Part 3: Iteration loop setup and start of population loop in \texttt{devol()}}
+
+The next part of \verb|devol()|, shown in figure~\ref{fig:devol_iter}, starts
+both the main outer loop over all iterations as well as the main inner loop
+over all population elements. Similar to the discussion in the
+preceding paragraph, the new code is shorter in large part because of more compact
+matrix expressions. Other differences are:
+\begin{enumerate}
+\item Intermediate populations are stored directly in a list, after being
+ transposed to account for our design choice of operating column-wise. In
+ the \proglang{C} code, the matrices are somewhat awkwardly `serialised'
+ into a single vector using the counter \texttt{popcnt} that is incremented
+ position by position.
+\item Several other vector copies are each executed in a single statement rather
+ than in an explicit loop.
+\item At the beginning of the population loop, a vector is once more stored
+ in a temporary variable and the permutation algorithm is called to pick
+ suitable indices which will be used next.
+\end{enumerate}
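+
+The difference between `serialising' population snapshots and storing them in
+a list can be sketched as follows. Only the counter name \texttt{popcnt} is
+taken from the text above; the type \texttt{Matrix}, both function names and
+the use of \texttt{std::vector} in place of an \proglang{R} list are
+assumptions made for this example.
+

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for a population matrix.
using Matrix = std::vector<std::vector<double>>;

// Serialised storage: write every element of `pop` into one long buffer,
// advancing a running counter position by position.
void storeSerialised(const Matrix& pop, std::vector<double>& buffer,
                     std::size_t& popcnt) {
    for (const auto& slice : pop) {
        for (double value : slice) {
            assert(popcnt < buffer.size());
            buffer[popcnt++] = value;
        }
    }
}

// List storage: each snapshot is kept as a matrix in its own right.
void storeInList(const Matrix& pop, std::vector<Matrix>& snapshots) {
    snapshots.push_back(pop);
}
```

+
+With the serialised variant the caller has to size the buffer in advance and
+later recover the matrix boundaries from the counter; with the list variant
+each snapshot keeps its own dimensions.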
+
+\paragraph{Part 4 and 5: Population strategies in \texttt{devol()}}
+
+Evaluating each population member based on the user-selected strategies is
+detailed in both figures~\ref{fig:devol_first_four} and
+\ref{fig:devol_other_three} covering the six available strategies as well as
+the default case. There are only fairly minor differences between both
+versions as shown by panels A and B of both figures:
+\begin{enumerate}
+\item Instead of \verb|if/else| branches, the new version uses a
+ \verb|switch| statement. This change can be beneficial as it may lead to fewer
+ comparisons, depending on the chosen strategy; although the inner loop is
+ executed many times, the overall benefit is still likely to be small.
+\item The case-invariant initialization of \verb|k| has been moved before the
+ block.
+\item The code for the different strategies differs very little between the
+ initial \proglang{C} implementation and the newer \proglang{C++} code.
+\end{enumerate}
+
+\paragraph{Part 6: End of population loop in \texttt{devol()}}
+
+Figure~\ref{fig:devol_end_pop} contains two fairly short segments that are
+entered once within each outer iteration after the loop over all population
+elements has finished. The two code segments in panels A and B of
+figure~\ref{fig:devol_end_pop} are fairly close, with the one difference being
+once again the element-by-element copy of vector elements (in \proglang{C})
+versus the single statement using \proglang{C++} objects.
+
+\paragraph{Part 7: Special case of \texttt{bs} flag in \texttt{devol()}}
+
+Similarly, figure~\ref{fig:devol_bs_flag} once more shows differences chiefly
+due to the way interim solutions are copied.
+\begin{enumerate}
+\item Panel A has a full nine loops for copying vector or matrix elements
+ which are not needed in panel B.
+\item Panel A has a somewhat elaborate segment to use a loop to copy a first
+ population vector to a temporary vector, copy a second into the place of
+ the first before then copying the content of the temporary vector into the
+ second (and likewise for the evaluation score of these vectors). In Panel
+ B, we simply use a single call of the \texttt{swap()} member function for each of
+ the population vectors and their fitness.
+\end{enumerate}
+We should note that this code is executed only when the user has changed the
+default value of false for the \texttt{bs} option in the control list for
+\texttt{DEoptim()}.
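+
+The exchange described in the second point can be sketched with standard
+containers. The function names are hypothetical and \texttt{std::vector}
+stands in for the \pkg{Armadillo} vector type, so this is only an
+approximation of the code in panel B.
+

```cpp
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// Loop-based exchange: copy through a temporary, element by element.
void swapWithLoops(std::vector<double>& a, std::vector<double>& b) {
    assert(a.size() == b.size());
    std::vector<double> tmp(a.size());
    for (std::size_t i = 0; i < a.size(); ++i) tmp[i] = a[i];
    for (std::size_t i = 0; i < a.size(); ++i) a[i] = b[i];
    for (std::size_t i = 0; i < a.size(); ++i) b[i] = tmp[i];
}

// Swap-based exchange: one call for the members, one for their fitness.
void swapMembers(std::vector<double>& a, double& fitA,
                 std::vector<double>& b, double& fitB) {
    a.swap(b);
    std::swap(fitA, fitB);
}
```

+
+Both variants have the same effect on the vectors; the swap-based one also
+avoids the temporary copy.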
+
+\paragraph{Part 8: End of \texttt{devol()}}
+
+Finally, figure~\ref{fig:devol_return} contains the final portion of the
+\verb|devol()| function. The population and its fitness value are saved. If
+the \texttt{checkWinner} option of the control structure has been changed by
+the user from the default value of false, a possible re-evaluation of the
+best population occurs and values are updated.
+
+Next, if tracing is enabled and the iteration counter has a value which
+signals that tracing display should occur, then updates are printed before a
+few state variables are updated.
+%
+The \texttt{devol()} then finishes right after restoring the state of the
+random number generator.
+
+
+\subsection[Evaluation functions in R and C++]{Evaluation functions in \proglang{R} and \proglang{C++}}
+
+Figure~\ref{fig:evaluate_fun} details the code used to evaluate the
+user-supplied objective function. This figure is an exception: the code from
+\pkg{RcppDE} is much longer than the code in \pkg{DEoptim}. This is due to a
+key extension in \pkg{RcppDE}: the ability to use not only an
+\proglang{R} function to describe the objective function to be
+minimised---but also a compiled function.
+
+This is implemented by means of a common \proglang{C++} idiom: an abstract base
+class, here called \texttt{EvalBase}. This class contains no implementation
+code but provides an interface consisting of two public functions
+\texttt{eval()} and \texttt{getNbEvals()} which are \textsl{virtual}: they
+declare the interface, but provide no implementation. The implementation is provided by two
+classes deriving from the abstract base class: one each for evaluating the
+\proglang{R} and the \proglang{C++} function.
+
+The class \texttt{EvalStandard} in panel B corresponds most closely to the
+normal \texttt{evaluate()} in panel A. A function call with a set of
+parameters is prepared and then evaluated in an environment. Here, the
+function and the environment are supplied once at the beginning---and hence
+used to instantiate the class. Each evaluation then brings a new parameter
+vector.
+
+The class \texttt{EvalCompiled} does the same, but now for the compiled
+function that we access via an external pointer. The support for external
+pointer types via the \texttt{XPtr} class in \pkg{Rcpp} was instrumental in
+implementing this. Similar to the standard case, the function is supplied at
+the beginning to instantiate the class. Later, on each evaluation call a new
+parameter vector is supplied.
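+
+The idiom can be sketched in plain \proglang{C++} as follows. Only the name
+\texttt{EvalBase} and its two member functions are taken from the text; the
+derived classes \texttt{EvalCallable} and \texttt{EvalPointer}, the type
+\texttt{Params} and the objective \texttt{sumOfSquares()} are hypothetical
+stand-ins that avoid any dependency on \proglang{R} or \pkg{Rcpp}, and the
+exact signatures in \pkg{RcppDE} may differ.
+

```cpp
#include <cassert>
#include <functional>
#include <utility>
#include <vector>

using Params = std::vector<double>;

// Abstract base class: declares the interface, implements nothing.
class EvalBase {
public:
    virtual ~EvalBase() = default;
    virtual double eval(const Params& par) = 0;
    virtual unsigned long getNbEvals() const = 0;
};

// Stand-in for the interpreted case: wraps an arbitrary callable.
class EvalCallable : public EvalBase {
public:
    explicit EvalCallable(std::function<double(const Params&)> fn)
        : fn_(std::move(fn)) {}
    double eval(const Params& par) override { ++neval_; return fn_(par); }
    unsigned long getNbEvals() const override { return neval_; }
private:
    std::function<double(const Params&)> fn_;
    unsigned long neval_ = 0;
};

// Stand-in for the compiled case: wraps a plain function pointer.
class EvalPointer : public EvalBase {
public:
    using FunPtr = double (*)(const Params&);
    explicit EvalPointer(FunPtr fn) : fn_(fn) {}
    double eval(const Params& par) override { ++neval_; return fn_(par); }
    unsigned long getNbEvals() const override { return neval_; }
private:
    FunPtr fn_;
    unsigned long neval_ = 0;
};

// Example objective function: sum of squares.
double sumOfSquares(const Params& par) {
    double s = 0.0;
    for (double x : par) s += x * x;
    return s;
}
```

+
+The calling code only holds a pointer or reference to \texttt{EvalBase}, so
+the optimisation loop does not need to know which kind of objective function
+it is evaluating.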
+
+
+\section[R changes]{\proglang{R} changes}
+\label{sec:Rchanges}
+
+Figures~\ref{fig:fig_R_DEoptim1} and \ref{fig:fig_R_DEoptim2} display the
+main \proglang{R} function \texttt{DEoptim()} which provides the interface
+the user of these packages employs. A few changes have been made:
+\begin{enumerate}
+\item \pkg{DEoptim} supports variable arguments in the \proglang{R} function,
+ which follows the standard set by other optimisation functions. For
+ symmetry with the compiled function, we support just a standard vector.
+ However, the environment in which the function and parameters are evaluated
+ can also be supplied by the user (whereas \pkg{DEoptim} always creates a
+ new environment). The use of the environment then permits us to pass
+ auxiliary arguments to the function in the same way the variable arguments
+ would.
+\item \pkg{RcppDE} therefore has an additional argument \texttt{env} for the
+ user-supplied environment, as well as an additional creation of a default
+ environment if none was supplied.
+\item Population matrices are passed from \proglang{C++} to \proglang{R} as
+ matrix objects; no copy or rearrangement has to be undertaken. This saves
+ a block of code at the top of panel B in figure~\ref{fig:fig_R_DEoptim2}.
+ Similarly, we do not have to cast the population matrix as we already obtain a
+ matrix.
+\end{enumerate}
+
+None of the other functions from the files listed in table~\ref{tab:Rfiles}
+were changed (apart from a trivial startup message in the \texttt{.onLoad()}
+function in file \verb|zzz.R|). In other words, the control options for
+\texttt{DEoptim()} are unchanged between both versions, as are the
+additional methods for summarizing, printing and plotting.
+
+\section{Auxiliary files}
+
+\subsection{Regression tests}
+
+As of release 0.1.0, full regression testing has not been implemented in
+\pkg{RcppDE} (and none exist in \pkg{DEoptim} either as of the released
+version 2.0.7).
+
+However, a directory \verb|tests/| has been added. It contains the file
+\verb|compTest.R| which provides a first means of both \textsl{comparing}
+results between \pkg{RcppDE} and \pkg{DEoptim} and also timing them.
+
+Three standard test functions (Wild, Rastrigin, Genrose) are run for four
+sets of parameter vector sizes---for both \pkg{RcppDE} and
+\pkg{DEoptim}. This ensures that results are identical between both
+implementations.
+
+\subsection{Demo files}
+\label{subsec:demos}
+
+Several demos have been added for \pkg{RcppDE} to the existing demo file
+present in \pkg{DEoptim}. These new files are
+
+\begin{itemize}
+\item \texttt{SmallBenchmark} which runs the three standard test functions
+ in both implementations for three small parameter sizes. As these small
+ optimisation problems are relatively inexpensive, they are repeated a
+ number of times and timings are obtained as trimmed means.
+\item \texttt{LargeBenchmark} which runs the three standard test functions in
+ both implementations for three larger parameter sizes, this time without
+ replication.
+\item \texttt{CompiledBenchmark} which runs the three standard test
+ functions---but this time as compiled \proglang{C++} functions
+ demonstrating a significant performance gain relative to the \proglang{R}
+ version.
+\item \texttt{environment} which runs a single small example showing how to
+ pass an auxiliary parameter to the user-supplied function using an
+ environment.
+\end{itemize}
+
+\subsection{Benchmarking Scripts}
+
+The demo files from the preceding section are also being used for performance
+comparisons (as detailed in the next section).
+
+The files are organised as thin wrapper scripts around the demo files
+described in the preceding section.
+
+\section{Performance}
+
+We will divide the performance comparison into three sections, corresponding to
+the same \textsl{small}, \textsl{large} and \textsl{compiled} split detailed
+above in section~\ref{subsec:demos}.
+
+\subsection{Small problems}
+
+\subsection{Large problems}
+
+\subsection{Compiled objective function}
+
+\subsection{Discussion}
+
+
+\section{Summary}
+
+\bibliography{RcppDE}
+
+
+\section*{Appendix}
+
+%% C++ functions
+
+\begin{sidewaysfigure} % fig 1: beginning of DEoptimC / DEoptim
+ \begin{minipage}{0.40\linewidth}
+ \tiny
+ \begin{CodeChunk}
+ \begin{CodeInput}
+SEXP DEoptimC(SEXP lower, SEXP upper, SEXP fn, SEXP control, SEXP rho)
+{
+ int i, j;
+
+ /* External pointers to return to R */
+ SEXP sexp_bestmem, sexp_bestval, sexp_nfeval, sexp_iter,
+ out, out_names, sexp_pop, sexp_storepop, sexp_bestmemit, sexp_bestvalit;
+
+ if (!isFunction(fn))
+ error("fn is not a function!");
+ if (!isEnvironment(rho))
+ error("rho is not an environment!");
+
+ /*-----Initialization of annealing parameters-------------------------*/
+ /* value to reach */
+ double VTR = NUMERIC_VALUE(getListElement(control, "VTR"));
+ /* chooses DE-strategy */
+ int i_strategy = INTEGER_VALUE(getListElement(control, "strategy"));
+ /* Maximum number of generations */
+ int i_itermax = INTEGER_VALUE(getListElement(control, "itermax"));
+ /* Number of objective function evaluations */
+ long l_nfeval = (long)NUMERIC_VALUE(getListElement(control, "nfeval"));
+ /* Dimension of parameter vector */
+ int i_D = INTEGER_VALUE(getListElement(control, "npar"));
+ /* Number of population members */
+ int i_NP = INTEGER_VALUE(getListElement(control, "NP"));
+ /* When to start storing populations */
+ int i_storepopfrom = INTEGER_VALUE(getListElement(control, "storepopfrom"))-1;
+ /* How often to store populations */
+ int i_storepopfreq = INTEGER_VALUE(getListElement(control, "storepopfreq"));
+ /* User-defined inital population */
+ int i_specinitialpop = INTEGER_VALUE(getListElement(control, "specinitialpop"));
+ double *initialpopv = NUMERIC_POINTER(getListElement(control, "initialpop"));
+ /* User-defined bounds */
+ double *f_lower = NUMERIC_POINTER(lower);
+ double *f_upper = NUMERIC_POINTER(upper);
+ /* stepsize */
+ double f_weight = NUMERIC_VALUE(getListElement(control, "F"));
+ /* crossover probability */
+ double f_cross = NUMERIC_VALUE(getListElement(control, "CR"));
+ /* Best of parent and child */
+ int i_bs_flag = NUMERIC_VALUE(getListElement(control, "bs"));
+ /* Print progress? */
+ int i_trace = NUMERIC_VALUE(getListElement(control, "trace"));
+ /* Re-evaluate best parameter vector? */
+ int i_check_winner = NUMERIC_VALUE(getListElement(control, "checkWinner"));
+ /* Average */
[TRUNCATED]
To get the complete diff run:
svnlook diff /svnroot/rcpp -r 2768