[Rcpp-commits] r3248 - pkg/RcppEigen/inst/doc

Sun Oct 30 18:34:42 CET 2011

Author: edd
Date: 2011-10-30 18:34:41 +0100 (Sun, 30 Oct 2011)
New Revision: 3248

Modified:
pkg/RcppEigen/inst/doc/RcppEigen-Intro.Rnw
Log:
style police: nuke most uses of 'We' or 'we'

Modified: pkg/RcppEigen/inst/doc/RcppEigen-Intro.Rnw
===================================================================
--- pkg/RcppEigen/inst/doc/RcppEigen-Intro.Rnw	2011-10-30 02:15:08 UTC (rev 3247)
+++ pkg/RcppEigen/inst/doc/RcppEigen-Intro.Rnw	2011-10-30 17:34:41 UTC (rev 3248)
@@ -180,8 +180,8 @@
The \pkg{Eigen} classes themselves provide high-performance,
versatile and comprehensive representations of dense and sparse
matrices and vectors, as well as decompositions and other functions
-to be applied to these objects.  In the next section we introduce some
-of these classes and show how to interface to them from \proglang{R}.
+to be applied to these objects.  The next section introduces some
+of these classes and shows how to interface to them from \proglang{R}.

\section{Eigen classes}
\label{sec:eclasses}
@@ -195,12 +195,11 @@
rather than compiled code in other languages where operations often must be
coded in loops.

-As in many \proglang{C++} template libraries using template
-meta-programming
+As in many \proglang{C++} template libraries using template meta-programming
\citep{Abrahams+Gurtovoy:2004:TemplateMetaprogramming}, the templates
themselves can be very complicated.  However, \pkg{Eigen} provides
\code{typedef}s for common classes that correspond to \proglang{R} matrices and
-vectors, as shown in Table~\ref{tab:REigen}. We will use these
+vectors, as shown in Table~\ref{tab:REigen}; this paper uses these
\code{typedef}s throughout this document.
\begin{table}[tb]
\caption{Correspondence between R matrix and vector types and classes in the \code{Eigen} namespace.}
@@ -223,7 +222,7 @@

The \proglang{C++} classes shown in Table~\ref{tab:REigen} are in the
\code{Eigen} namespace, which means that they must be written as
-\code{Eigen::MatrixXd}.  However, if we preface our use of these class
+\code{Eigen::MatrixXd}.  However, if one prefaces the use of these class
names with a declaration like

% \begin{lstlisting}[language=C++]
@@ -244,7 +243,7 @@
\mbox{}
\normalfont
\end{quote}
-we can use these names without the qualifier.
+then one can use these names without the namespace qualifier.

\subsection{Mapped matrices in Eigen}
\label{sec:mapped}
@@ -255,10 +254,10 @@
from an \proglang{R} object involves copying its contents.  An
alternative is to have the contents of the \proglang{R} matrix or
vector mapped to the contents of the object from the Eigen class.  For
-dense matrices we use the Eigen templated class \code{Map}.  For
-sparse matrices we use the Eigen templated class \code{MappedSparseMatrix}.
+dense matrices one can use the Eigen templated class \code{Map}, and for
+sparse matrices one can deploy the Eigen templated class \code{MappedSparseMatrix}.

-We must, of course, be careful not to modify the contents of the
+One must, of course, be careful not to modify the contents of the
\proglang{R} object in the \proglang{C++} code.  A recommended
practice is always to declare mapped objects as \lstinline!const!.

@@ -266,13 +265,13 @@
\label{sec:arrays}

For matrix and vector classes \pkg{Eigen} overloads the \texttt{`*'}
-operator to indicate matrix multiplication.  Occasionally we want
-component-wise operations instead of matrix operations.  The
+operator to indicate matrix multiplication.  Occasionally
+component-wise operations are preferred to matrix operations.  The
\code{Array} templated classes are used in \pkg{Eigen} for
-component-wise operations.  Most often we use the \code{array()} method
+component-wise operations.  Most often the \code{array()} method is used
for Matrix or Vector objects to create the array.  On those occasions
-when we wish to convert an array to a matrix or vector object we use
-the \code{matrix()} method.
+when one wishes to convert an array to a matrix or vector object,
+the \code{matrix()} method is used.

\subsection{Structured matrices in \pkg{Eigen}}
\label{sec:structured}
@@ -281,7 +280,7 @@
as symmetric matrices, triangular matrices and banded matrices.  For
dense matrices, these special structures are described as ``views'',
meaning that the full dense matrix is stored but only part of the
-matrix is used in operations.  For a symmetric matrix we need to
+matrix is used in operations.  For a symmetric matrix one needs to
specify whether the lower triangle or the upper triangle is to be used as
the contents, with the other triangle defined by the implicit symmetry.

@@ -322,7 +321,7 @@
package \citep*{CRAN:inline} for \proglang{R} and its \pkg{RcppEigen}
plugin provide a convenient way of developing and debugging the
-\proglang{C++} code.  For actual production code we generally
+\proglang{C++} code.  For actual production code one would generally
incorporate the \proglang{C++} source code files in a package and
include the line \code{LinkingTo: Rcpp, RcppEigen} in the package's
\code{DESCRIPTION} file.  The
@@ -332,25 +331,25 @@

The \code{cxxfunction} with the \code{"Rcpp"} or \code{"RcppEigen"}
plugins has the \code{as} and \code{wrap} functions already defined as
-\code{Rcpp::as} and \code{Rcpp::wrap}.  In the examples below we will
-omit these declarations.  Do remember that you will need them in
-\proglang{C++} source code for a package.
+\code{Rcpp::as} and \code{Rcpp::wrap}.  In the examples below
+these declarations are omitted.  It is important to remember that they are
+needed in actual \proglang{C++} source code for a package.

The first few examples are simply for illustration as the operations
shown could be more effectively performed directly in \proglang{R}.
-We do compare the results from \pkg{Eigen} to those from the direct
+Even so, the results from \pkg{Eigen} are compared to those from the direct
\proglang{R} results.

\subsection{Transpose of an integer matrix}
\label{sec:transpose}

-We create a simple matrix of integers
+The next \proglang{R} code snippet creates a simple matrix of integers
(A <- matrix(1:6, ncol=2))
str(A)
@
-and, in Listing~\ref{trans}, use the \code{transpose()} method for the
-\code{Eigen::MatrixXi} class to return its transpose. The \proglang{R}
+and, in Listing~\ref{trans}, the \code{transpose()} method for the
+\code{Eigen::MatrixXi} class is used to return the transpose of the supplied matrix. The \proglang{R}
matrix in the \code{SEXP} \code{AA} is mapped to an
\code{Eigen::MatrixXi} object; the matrix \code{At} is then constructed
from its transpose and returned to \proglang{R}.
@@ -390,9 +389,10 @@
\end{quote}
\end{figure}

-We compile and link this code segment (stored as text in a variable named
-\code{transCpp}) into an executable function \code{ftrans} and check that it
-works as intended.
+The next \proglang{R} snippet compiles and links the \proglang{C++} code
+segment (stored as text in a variable named \code{transCpp}) into an
+executable function \code{ftrans} and then checks that it works as intended
+by comparing the output to an explicit transpose of the matrix argument.
<<ftrans>>=
ftrans <- cxxfunction(signature(AA="matrix"), transCpp, plugin="RcppEigen")
(At <- ftrans(A))
@@ -402,8 +402,8 @@
For numeric or integer matrices the \code{adjoint()} method is
equivalent to the \code{transpose()} method.  For complex matrices, the
adjoint is the conjugate of the transpose.  In keeping with the
-conventions in the \pkg{Eigen} documentation we will, in what follows,
-use the \code{adjoint()} method to create the transpose of numeric or
+conventions in the \pkg{Eigen} documentation, in what follows,
+the \code{adjoint()} method is used to create the transpose of numeric or
integer matrices.

\subsection{Products and cross-products}
@@ -446,7 +446,7 @@
As shown in the last example, the \proglang{R} function
\code{crossprod} calculates the product of the transpose of its first
argument with its second argument.  The single argument form,
-\code{crossprod(X)}, evaluates $\bm X^\prime\bm X$.  We could, of
+\code{crossprod(X)}, evaluates $\bm X^\prime\bm X$.  One could, of
course, calculate this product as
<<eval=FALSE>>=
t(X) %*% X
@@ -456,7 +456,7 @@
The function \code{tcrossprod} evaluates \code{crossprod(t(X))}
without actually forming the transpose.

-To express these calculations in Eigen we create a \code{SelfAdjointView},
+To express these calculations in Eigen, a \code{SelfAdjointView} is created,
which is a dense matrix of which only one triangle is used, the other
triangle being inferred from the symmetry.  (``Self-adjoint'' is
equivalent to symmetric for non-complex matrices.)
@@ -514,7 +514,7 @@
to it and convert back to a general matrix form (i.e. the strict
lower triangle is copied into the strict upper triangle).

-For these products we could use either the lower triangle or the upper
+For these products one could use either the lower triangle or the upper
triangle as the result will be symmetrized before it is returned.

\subsection{Cholesky decomposition of the crossprod}
@@ -531,13 +531,13 @@
R$is simply the transpose of$\bm L$from the LLt'' form. The templated \pkg{Eigen} classes for the LLt and LDLt forms are -called \code{LLT} and \code{LDLT}. In general we would preserve the -objects from these classes so that we could use them for solutions of -linear systems. For illustration we simply return the matrix$\bm L$-from the LLt'' form. +called \code{LLT} and \code{LDLT}. In general, one would preserve the +objects from these classes in order to re-use them for solutions of +linear systems. For a simple illustration, the matrix$\bm L$+from the LLt'' form is returned. -Because the Cholesky decomposition involves taking square roots we -switch to numeric matrices +Because the Cholesky decomposition involves taking square roots, the internal +representation is switched to numeric matrices <<storage>>= storage.mode(A) <- "double" @ @@ -583,11 +583,11 @@ |\bm X^\prime\bm X|=|\bm L\bm L^\prime|=|\bm L|\,|\bm L^\prime|=|\bm L|^2 \end{displaymath} -Alternatively, if we use the LDLt'' decomposition,$\bm L\bm D\bm
+Alternatively, if using the ``LDLt'' decomposition, $\bm L\bm D\bm
$\bm D$ is diagonal then $|\bm X^\prime\bm X|$ is the product of the
-diagonal elements of $\bm D$.  Because we know that the diagonals of
-$\bm D$ must be non-negative, we often evaluate the logarithm of the
+diagonal elements of $\bm D$.  Because it is known that the diagonals of
+$\bm D$ must be non-negative, one often evaluates the logarithm of the
determinant as the sum of the logarithms of the diagonal elements of
$\bm D$.  Several options are shown in Listing~\ref{cholDet}.

@@ -623,7 +623,7 @@

Note the use of the \code{array()} method in the calculation of the
log-determinant.  Because the \code{log()} method applies to arrays,
-not to vectors or matrices, we must create an array from \code{Dvec}
+not to vectors or matrices, an array must be created from \code{Dvec}
before applying the \code{log()} method.

\section{Least squares solutions}
@@ -636,15 +636,15 @@
\end{displaymath}
where the model matrix, $\bm X$, is $n\times p$ ($n\ge p$) and $\bm y$
is an $n$-dimensional response vector.  There are several ways based
-on matrix decompositions, to determine such a solution.  We have
-already seen two forms of the Cholesky decomposition: ``LLt'' and
-``LDLt'', that can be used to solve for $\widehat{\bm\beta}$.  Other
+on matrix decompositions, to determine such a solution.  Earlier, two forms
+of the Cholesky decomposition were discussed: ``LLt'' and
+``LDLt'', which can both be used to solve for $\widehat{\bm\beta}$.  Other
decompositions that can be used are the QR decomposition, with or
without column pivoting, the singular value decomposition and the
eigendecomposition of a symmetric matrix.

Determining a least squares solution is relatively straightforward.
-However, in statistical computing we often require additional information,
+However, statistical computing often requires additional information,
such as the standard errors of the coefficient estimates.  Calculating
these involves evaluating the diagonal elements of $\left(\bm
X^\prime\bm X\right)^{-1}$ and the residual sum of squares, $\|\bm
@@ -690,8 +690,8 @@
Listing~\ref{lltLS} shows a calculation of the least squares
coefficient estimates (\code{betahat}) and the standard errors
(\code{se}) through an ``LLt'' Cholesky decomposition of the crossproduct of the model
-matrix, $\bm X$.  We check that the results from this calculation do
-correspond to those from the \code{lm.fit} function in \proglang{R}
+matrix, $\bm X$.  Next, the results from this calculation are compared
+to those from the \code{lm.fit} function in \proglang{R}
(\code{lm.fit} is the workhorse function called by \code{lm} once the
model matrix and response have been evaluated).
<<lltLS>>=
@@ -714,8 +714,8 @@
that \pkg{Eigen} classes do not have a recycling rule as in
\proglang{R}.  That is, the two vector operands must have the same
length.)  The \code{norm()} method evaluates the square root of the
-sum of squares of the elements of a vector.  Although we don't
-explicitly evaluate $\left(\bm X^\prime\bm X\right)^{-1}$ we do
+sum of squares of the elements of a vector.  Although one does not
+explicitly evaluate $\left(\bm X^\prime\bm X\right)^{-1}$ one does
evaluate $\bm L^{-1}$ to obtain the standard errors.  Note also the
use of the \code{colwise()} method in the evaluation of the standard
errors.  It applies a method to the columns of a matrix, returning a
@@ -724,9 +724,9 @@
In the descriptions of other methods for solving least squares
problems, much of the code parallels that shown in
-Listing~\ref{lltLS}.  We will omit the redundant parts and show only
-the evaluation of the coefficients, the rank and the standard errors.
-Actually, we only calculate the standard errors up to the scalar
+Listing~\ref{lltLS}.  The redundant parts are omitted, and only
+the evaluation of the coefficients, the rank and the standard errors is shown.
+Actually, the standard errors are calculated only up to the scalar
multiple of $s$, the residual standard error, in these code fragments.
The calculation of the residuals and $s$ and the scaling of the
coefficient standard errors is the same for all methods.  (See the
@@ -766,7 +766,7 @@
rowwise().norm());
\end{lstlisting}
The calculations in Listing~\ref{QRLS} are quite similar to those in
-Listing~\ref{lltLS}.  In fact, if we had extracted the upper
+Listing~\ref{lltLS}.  In fact, if one had extracted the upper
triangular factor (the \code{matrixU()} method) from the \code{LLT}
object in Listing~\ref{lltLS}, the rest of the code would be nearly
identical.
@@ -775,19 +775,19 @@
\label{sec:rankdeficient}

One important consideration when determining least squares solutions
-is whether $\rank(\bm X)$ is $p$, a situation we describe by saying
+is whether $\rank(\bm X)$ is $p$, a situation described by saying
that $\bm X$ has ``full column rank''.  When $\bm X$ does not have
-full column rank we say it is ``rank deficient''.
+full column rank it is called ``rank deficient''.

Although the theoretical rank of a matrix is well-defined, its
-evaluation in practice is not.  At best we can compute an effective
-rank according to some tolerance.  We refer to decompositions that
-allow us to estimate the rank of the matrix in this way as
+evaluation in practice is not.  At best one can compute an effective
+rank according to some tolerance.  Decompositions that
+allow one to estimate the rank of the matrix in this way are referred to as
``rank-revealing''.

Because the \code{model.matrix} function in \proglang{R} does a
-considerable amount of symbolic analysis behind the scenes, we usually
-end up with full-rank model matrices.  The common cases of
+considerable amount of symbolic analysis behind the scenes, one usually
+ends up with full-rank model matrices.  The common cases of
rank-deficiency, such as incorporating both a constant term and a full
set of indicator columns for the levels of a factor, are eliminated.
Other, more subtle, situations will not be detected at this stage,
@@ -839,11 +839,11 @@
is considered to be (effectively) zero is a multiple of the largest
singular value (i.e. the $(1,1)$ element of $\bm D$).

-In Listing~\ref{Dplus} we define a utility function, \code{Dplus}, to
+In Listing~\ref{Dplus} a utility function, \code{Dplus}, is defined to
return the pseudo-inverse as a diagonal matrix, given the singular
values (the diagonal of $\bm D$) and the apparent rank.  To be able to
use this function with the eigendecomposition where the eigenvalues
-are in increasing order we include a Boolean argument \code{rev}
+are in increasing order, a Boolean argument \code{rev} is included
indicating whether the order is reversed.

\begin{lstlisting}[frame=tb,caption={DplusCpp: Create the
@@ -911,7 +911,7 @@
V$ is the same as that in the SVD.  Also the eigenvalues of $\bm
X^\prime\bm X$ are the squares of the singular values of $\bm X$.

-With these definitions we can adapt much of the code from the SVD
+With these definitions one can adapt much of the code from the SVD
method for the eigendecomposition, as shown in Listing~\ref{SymmEigLS}.
\begin{lstlisting}[frame=tb,caption={SymmEigLSCpp: Least squares using the eigendecomposition},label=SymmEigLS]
@@ -944,14 +944,14 @@

An instance of the class \code{Eigen::ColPivHouseholderQR} has a
\code{rank()} method returning the computational rank of the matrix.
-When $\bm X$ is of full rank we can use essentially the same code as
-in the unpivoted decomposition except that we must reorder the
-standard errors.  When $\bm X$ is rank-deficient we evaluate the
-coefficients and standard errors for the leading $r$ columns of $\bm
+When $\bm X$ is of full rank one can use essentially the same code as
+in the unpivoted decomposition except that one must reorder the
+standard errors.  When $\bm X$ is rank-deficient, the
+coefficients and standard errors are evaluated for the leading $r$ columns of $\bm
X\bm P$only. In the rank-deficient case the straightforward calculation of the -fitted values, as$\bm X\widehat{\bm\beta}$, cannot be used. We +fitted values, as$\bm X\widehat{\bm\beta}$, cannot be used. One could do some complicated rearrangement of the columns of X and the coefficient estimates but it is conceptually (and computationally) easier to employ the relationship @@ -1024,7 +1024,7 @@ models using the methods described above is called \code{fastLm}. It follows an earlier example in the \pkg{Rcpp} package which was carried over to both \pkg{RcppArmadillo} and \pkg{RcppGSL}. The natural question to ask is, Is it indeed fast to use these methods -based on \pkg{Eigen}?''. We have provided benchmarking code for these +based on \pkg{Eigen}?''. To this end, the example provides benchmarking code for these methods, \proglang{R}'s \code{lm.fit} function and the \code{fastLm} implementations in the \pkg{RcppArmadillo} \citep{CRAN:RcppArmadillo} and \pkg{RcppGSL} \citep{CRAN:RcppGSL} packages, if they are @@ -1101,7 +1101,7 @@ A form of delayed evaluation is used in \pkg{Eigen}. That is, many operators and methods do not force the evaluation of the object but instead return an expression object'' that is evaluated when -needed. As an example, even though we write the$\bm X^\prime\bm X$+needed. As an example, even though one writes the$\bm X^\prime\bm X\$
\code{X.adjoint()} part is not evaluated immediately.  The
\code{rankUpdate} method detects that it has been passed a matrix
@@ -1114,9 +1114,9 @@
an ``infelicity'' in the code, if not an outright bug.  In the code
for the transpose of an integer matrix shown in Listing~\ref{trans} we
assigned the transpose as a \code{MatrixXi} before returning it with
-\code{wrap}.  The assignment forces the evaluation.  If we skip this
-shape but incorrect contents.
+\code{wrap}.  The assignment forces the evaluation.  If this
+shape but incorrect contents is obtained.

@@ -1188,10 +1188,10 @@
linear algebra computations as an extension to the \proglang{R} system.
\pkg{RcppEigen} is based on the modern \proglang{C++} library \pkg{Eigen}
which combines extended functionality with excellent performance, and
-utilizes \pkg{Rcpp} to interface \proglang{R} with \proglang{C++}.  We
-provided several illustration covering common matrix operations, including
+utilizes \pkg{Rcpp} to interface \proglang{R} with \proglang{C++}.
+Several illustrations covered common matrix operations as well as
several approaches to solving a least squares problem---including an extended
-discussion of rank-revealing approaches.  A short example provived
+discussion of rank-revealing approaches.  A short example provided
an empirical illustration of the excellent run-time performance of the
\pkg{RcppEigen} package.

`