[Rcpp-commits] r3222 - pkg/RcppEigen/inst/doc
noreply at r-forge.r-project.org
Wed Oct 26 17:18:28 CEST 2011
Author: dmbates
Date: 2011-10-26 17:18:28 +0200 (Wed, 26 Oct 2011)
New Revision: 3222
Modified:
pkg/RcppEigen/inst/doc/RcppEigen-Intro.Rnw
pkg/RcppEigen/inst/doc/RcppEigen-Intro.pdf
Log:
Extend the introductory document.
Modified: pkg/RcppEigen/inst/doc/RcppEigen-Intro.Rnw
===================================================================
--- pkg/RcppEigen/inst/doc/RcppEigen-Intro.Rnw 2011-10-26 15:14:19 UTC (rev 3221)
+++ pkg/RcppEigen/inst/doc/RcppEigen-Intro.Rnw 2011-10-26 15:18:28 UTC (rev 3222)
@@ -2,8 +2,7 @@
%\VignetteIndexEntry{RcppEigen-intro}
\usepackage{vmargin}
\setmargrb{0.75in}{0.75in}{0.75in}{0.75in}
-
-\usepackage{color, alltt, bm}
+\usepackage{color, alltt, bm, amsmath}
\usepackage[authoryear,round,longnamesfirst]{natbib}
\usepackage[colorlinks]{hyperref}
\definecolor{link}{rgb}{0,0,0.3} %% next few lines courtesy of RJournal.sty
@@ -14,11 +13,10 @@
linkcolor=link,%
urlcolor=link
}
-
\newcommand{\proglang}[1]{\textsf{#1}}
\newcommand{\pkg}[1]{{\fontseries{b}\selectfont #1}}
\newcommand{\code}[1]{\texttt{#1}}
-
+\newcommand{\rank}{\operatorname{rank}}
%% defined as a stop-gap measure til interaction with highlight is sorted out
\newcommand{\hlstd}[1]{\textcolor[rgb]{0,0,0}{#1}}
\newcommand{\hlnum}[1]{\textcolor[rgb]{0.69,0.49,0}{#1}}
@@ -121,7 +119,6 @@
require( inline )
require( RcppEigen )
@
-
\begin{document}
\maketitle
@@ -129,9 +126,9 @@
\noindent
The \pkg{RcppEigen} package provides access to the \pkg{Eigen}
\proglang{C++} template library from \proglang{R}. \pkg{Rcpp}
- classes and especially the packages \proglang{C++} templated
- functions \code{as} and \code{wrap} provide the ``glue'' for passing
- objects from \proglang{R} to \proglang{C++} and back.
+ classes and instantiations of the \proglang{C++} templated functions
+ \code{as} and \code{wrap} from \pkg{Rcpp} provide the ``glue'' for
+ passing objects from \proglang{R} to \proglang{C++} and back.
}
\section{Introduction}
@@ -149,8 +146,8 @@
\texttt{Rcpp::as} and \texttt{Rcpp::wrap} for the \proglang{C++}
classes defined in \pkg{Eigen}.
-The \pkg{Eigen} classes themselves provide a high-performance,
-versatile and comprehensive representation of dense and sparse
+The \pkg{Eigen} classes themselves provide high-performance,
+versatile and comprehensive representations of dense and sparse
matrices and vectors, as well as decompositions and other functions
to be applied to these objects. In the next section we introduce some
of these classes and show how to interface to them from R.
@@ -158,13 +155,21 @@
\section{Eigen classes}
\label{sec:eclasses}
-Eigen provides templated classes for matrices, vectors and arrays. As
-in many \proglang{C++} template libraries using template
+Eigen (\url{http://eigen.tuxfamily.org}) is a \proglang{C++} template
+library providing classes for many forms of matrices, vectors, arrays
+and decompositions. These classes are flexible and comprehensive,
+allowing for both high performance and well-structured code
+representing high-level operations. \proglang{C++} code based on Eigen
+is often more like \proglang{R} code, working on the ``whole object'',
+than compiled code in other languages, where operations often must be
+coded in loops.
+
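+As a small illustration of this whole-object style (a sketch of our
+own, not code from the package), the sum of squared differences
+between two vectors, \code{sum((y - x)^2)} in \proglang{R}, is a
+single expression in \pkg{Eigen}:
+<<echo=FALSE>>=
+code <- 'using Eigen::VectorXd;
+
+// whole-object arithmetic: no explicit loop over the elements
+const VectorXd x(VectorXd::LinSpaced(5, 1., 5.));
+const VectorXd y(2. * x);
+return wrap((y - x).squaredNorm());'
+writeLines( code, "code.cpp" )
+@
+<<echo=FALSE,results=tex>>=
+ex_highlight( "code.cpp" )
+@
+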
+As in many \proglang{C++} template libraries using template
meta-programming
\citep{Abrahams+Gurtovoy:2004:TemplateMetaprogramming}, the templates
-themselves are very complicated. However, \pkg{Eigen} provides
-typedef's for the classes that correspond to \proglang{R} matrices and
-vectors, as shown in Table~\ref{tab:REigen}, and we will use these
+themselves can be very complicated. However, \pkg{Eigen} provides
+typedef's for common classes that correspond to \proglang{R} matrices and
+vectors, as shown in Table~\ref{tab:REigen}. We will use these
typedef's throughout this document.
\begin{table}[tb]
\centering
@@ -212,8 +217,8 @@
sparse matrices we use the Eigen templated class \code{MappedSparseMatrix}.
We must, of course, be careful not to modify the contents of the
-\proglang{R} object in the \proglang{C++} code. It is a good idea to
-declare such objects as const.
+\proglang{R} object in the \proglang{C++} code. A recommended
+practice is always to declare mapped objects as \code{const}.
\subsection{Arrays in Eigen}
\label{sec:arrays}
@@ -377,11 +382,12 @@
For triangular matrices the class is \code{TriangularView} and the
method is \code{triangularView}. The triangle can be specified as
-\code{Lower}, \code{UnitLower}, \code{StrictlyLower} and the
-equivalent specifications for \code{Upper}.
+\code{Lower}, \code{UnitLower}, \code{StrictlyLower}, \code{Upper},
+\code{UnitUpper} or \code{StrictlyUpper}.
For self-adjoint views the \code{rankUpdate} method adds a scalar multiple
-(which defaults to 1) of $\bm A\bm A^\prime$ to the current symmetric matrix.
+of $\bm A\bm A^\prime$ to the current symmetric matrix. The scalar
+multiple defaults to 1.
<<echo=FALSE>>=
code <- 'using Eigen::Map;
@@ -419,7 +425,7 @@
lower triangle is copied into the strict upper triangle).
For these products we could use either the lower triangle or the upper
-triangle because the result is going to be symmetrized before being returned.
+triangle as the result will be symmetrized before it is returned.
\subsection{Cholesky decomposition of the crossprod}
\label{sec:chol}
@@ -454,8 +460,7 @@
const Map<MatrixXd> A(as<Map<MatrixXd> >(AA));
const int n(A.cols());
const LLT<MatrixXd> llt(MatrixXd(n, n).setZero().
- selfadjointView<Lower>().
- rankUpdate(A.adjoint()));
+ selfadjointView<Lower>().rankUpdate(A.adjoint()));
return List::create(_["L"] = MatrixXd(llt.matrixL()),
_["R"] = MatrixXd(llt.matrixU()));'
@@ -477,7 +482,7 @@
The ``D-optimal'' criterion for experimental design chooses the design
that maximizes the determinant, $|\bm X^\prime\bm X|$, for the
$n\times p$ model matrix (or Jacobian matrix), $\bm X$. The
-determinant, $|\bm L|$, of the the $p\times p$ lower Cholesky factor
+determinant, $|\bm L|$, of the $p\times p$ lower Cholesky factor
$\bm L$, defined so that $\bm L\bm L^\prime=\bm X^\prime\bm X$, is
the product of its diagonal elements. (This is true for any
triangular matrix.) By the properties of determinants,
@@ -527,7 +532,7 @@
before applying the \code{log()} method.
\section{Least squares solutions}
-\label{sec:structured}
+\label{sec:leastSquares}
A common operation in statistical computing is calculating a least
squares solution, $\widehat{\bm\beta}$, defined as
@@ -539,10 +544,127 @@
on matrix decompositions, to determine such a solution. We have
already seen two forms of the Cholesky decomposition: ``LLt'' and
``LDLt'', that can be used to solve for $\widehat{\bm\beta}$. Other
-decompositions that can be used as the QR decomposition, with or
+decompositions that can be used are the QR decomposition, with or
without column pivoting, the singular value decomposition and the
eigendecomposition of a symmetric matrix.
+Determining a least squares solution is relatively straightforward. However,
+in statistical computing we often require a bit more information, such
+as the standard errors of the coefficient estimates. Calculating
+these involves evaluating the diagonal elements of $\left(\bm
+ X^\prime\bm X\right)^{-1}$ and the residual sum of squares, $\|\bm
+y-\bm X\widehat{\bm\beta}\|^2$.
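+In the usual formulation, the residual standard error $s$ and the
+standard errors of the coefficient estimates are
+\begin{displaymath}
+  s=\sqrt{\frac{\|\bm y-\bm X\widehat{\bm\beta}\|^2}{n-p}}
+  \quad\text{and}\quad
+  \operatorname{se}\left(\widehat{\beta}_j\right)=
+  s\sqrt{\left[\left(\bm X^\prime\bm X\right)^{-1}\right]_{jj}},
+  \quad j=1,\dots,p.
+\end{displaymath}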
+
+\subsection{Least squares using the ``LLt'' Cholesky}
+\label{sec:LLtLeastSquares}
+
+<<echo=FALSE>>=
+code <- 'using Eigen::LLT;
+using Eigen::Lower;
+using Eigen::Map;
+using Eigen::MatrixXd;
+using Eigen::VectorXd;
+
+const Map<MatrixXd> X(as<Map<MatrixXd> >(XX));
+const Map<VectorXd> y(as<Map<VectorXd> >(yy));
+const int n(X.rows()), p(X.cols());
+const LLT<MatrixXd> llt(MatrixXd(p, p).setZero().
+ selfadjointView<Lower>().rankUpdate(X.adjoint()));
+const VectorXd betahat(llt.solve(X.adjoint() * y));
+const VectorXd fitted(X * betahat);
+const VectorXd resid(y - fitted);
+const int df(n - p);
+const double s(resid.norm() / std::sqrt(double(df)));
+const VectorXd se(s * llt.matrixL().solve(MatrixXd::Identity(p, p)).colwise().norm());
+return List::create(_["coefficients"] = betahat,
+ _["fitted.values"] = fitted,
+ _["residuals"] = resid,
+ _["s"] = s,
+ _["df.residual"] = df,
+ _["rank"] = p,
+ _["Std. Error"] = se);'
+writeLines( code, "code.cpp" )
+@
+<<echo=FALSE,results=tex>>=
+ex_highlight( "code.cpp" )
+@
+<<>>=
+lltLS <- cxxfunction(signature(XX = "matrix", yy = "numeric"),
+ paste(readLines( "code.cpp" ), collapse = "\n"), "RcppEigen")
+data(trees, package="datasets")
+str(lltFit <- with(trees, lltLS(cbind(1, log(Girth)), log(Volume))))
+str(lmFit <- with(trees, lm.fit(cbind(1, log(Girth)), log(Volume))))
+for (nm in c("coefficients", "residuals", "fitted.values", "rank"))
+ stopifnot(all.equal(lltFit[[nm]], unname(lmFit[[nm]])))
+stopifnot(all.equal(lltFit[["Std. Error"]],
+ unname(coef(summary(lm(log(Volume) ~ log(Girth), trees)))[,2])))
+@
+
+There are several aspects of the \proglang{C++} code worth mentioning.
+The \code{solve} method for the \code{LLT} object evaluates, in this
+case, $\left(\bm X^\prime\bm X\right)^{-1}\bm X^\prime\bm y$, but
+without actually evaluating the inverse. The calculation of the
+residuals, $\bm y-\widehat{\bm y}$, can be written, as in
+\proglang{R}, as \code{y - fitted}. (But note that \pkg{Eigen}
+classes do not have a ``recycling rule'' as in \proglang{R}; that is,
+the two vector operands must have the same length.) The \code{norm}
+method evaluates the square root of the
+sum of squares of the elements of a vector. Although we don't
+explicitly evaluate $\left(\bm X^\prime\bm X\right)^{-1}$ we do
+evaluate $\bm L^{-1}$ to obtain the standard errors. Note also the
+use of the \code{colwise} method in the evaluation of the standard errors.
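+This works because
+\begin{displaymath}
+  \left(\bm X^\prime\bm X\right)^{-1}=\left(\bm L\bm L^\prime\right)^{-1}
+  =\left(\bm L^{-1}\right)^\prime\bm L^{-1},
+\end{displaymath}
+so the $j$th diagonal element of $\left(\bm X^\prime\bm X\right)^{-1}$
+is the squared norm of the $j$th column of $\bm L^{-1}$, and the
+column-wise norms of $\bm L^{-1}$, scaled by $s$, are exactly the
+standard errors.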
+
+In the descriptions of the other methods for solving least squares
+problems, much of the code parallels that shown here, so we show only
+the evaluation of the coefficients, the rank and the standard errors.
+In these code fragments the standard errors are actually calculated
+only up to the scalar multiple $s$, the residual standard error. The
+calculation of the residuals and of $s$, and the scaling of the
+coefficient standard errors, are the same for all methods. (See the
+file \code{fastLm.cpp} in the \pkg{RcppEigen} source package for
+details.)
+
+\subsection{Least squares using the unpivoted QR decomposition}
+\label{sec:QR}
+
+A QR decomposition has the form
+\begin{displaymath}
+ \bm X=\bm Q\bm R=\bm Q_1\bm R_1
+\end{displaymath}
+where $\bm Q$ is an $n\times n$ orthogonal matrix, which means that
+$\bm Q^\prime\bm Q=\bm Q\bm Q^\prime=\bm I_n$, and $\bm R$ is $n\times
+p$ and zero below the main diagonal. The $n\times p$ matrix $\bm Q_1$
+is the first $p$ columns of $\bm Q$ and the $p\times p$ upper
+triangular matrix $\bm R_1$ is the top $p$ rows of $\bm R$. There are
+three \pkg{Eigen} classes for the QR decomposition:
+\code{HouseholderQR} provides the basic QR decomposition using
+Householder transformations, \code{ColPivHouseholderQR} incorporates
+column pivots and \code{FullPivHouseholderQR} incorporates both row
+and column pivots.
+
+For the unpivoted QR decomposition the code is of the form\\
+<<echo=FALSE>>=
+code <- 'using Eigen::HouseholderQR;
+
+const HouseholderQR<MatrixXd> QR(X);
+const VectorXd betahat(QR.solve(y));
+const VectorXd fitted(X * betahat);
+const int df(n - p);
+const VectorXd se(QR.matrixQR().topRows(p).triangularView<Upper>().
+                  solve(MatrixXd::Identity(p, p)).rowwise().norm());'
+writeLines( code, "code.cpp" )
+@
+<<echo=FALSE,results=tex>>=
+ex_highlight( "code.cpp" )
+@
+
+We see that this code is very close to the code for the ``LLt'' case.
+In fact, if we had extracted the upper triangular factor (the
+\code{matrixU} method) from the \code{LLT} object, the code would be
+nearly identical.
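+
+A sketch of that variation (ours, not code from the package), reusing
+the \code{llt}, \code{X} and \code{y} objects from the ``LLt''
+example:
+<<echo=FALSE>>=
+code <- '// forward-solve with L, then back-solve with U = t(L)
+const VectorXd betahat(llt.matrixU().
+                       solve(llt.matrixL().solve(X.adjoint() * y)));'
+writeLines( code, "code.cpp" )
+@
+<<echo=FALSE,results=tex>>=
+ex_highlight( "code.cpp" )
+@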
+
+\subsection{Handling the rank-deficient case}
+\label{sec:rankdeficient}
+
One important consideration when determining least squares solutions
is whether $\rank(\bm X)$ is $p$, a situation we describe by saying
that $\bm X$ has ``full column rank''. When $\bm X$ does not have
@@ -567,17 +689,375 @@
<<missingcell>>=
dd <- data.frame(f1 = gl(4, 6, labels = LETTERS[1:4]),
f2 = gl(3, 2, labels = letters[1:3]))[-(7:8), ]
-xtabs(~ f2 + f1, dd) # one missing cell
+xtabs(~ f2 + f1, dd) # one missing cell
mm <- model.matrix(~ f1 * f2, dd)
-kappa(mm) # large, indicating rank deficiency
-(c(rank=qr(mm)$rank, p=ncol(mm))) # alternative evaluation
+kappa(mm) # large condition number, indicating rank deficiency
+rcond(mm) # alternative evaluation, the reciprocal condition number
+(c(rank=qr(mm)$rank, p=ncol(mm))) # rank as computed in R's qr function
set.seed(1)
dd$y <- mm %*% seq_len(ncol(mm)) + rnorm(nrow(mm), sd = 0.1)
# lm detects the rank deficiency
-writeLines(capture.output(print(summary(lm(y ~ f1 * f2, dd)),
- signif.stars=FALSE))[9:22])
+fm1 <- lm(y ~ f1 * f2, dd)
+writeLines(capture.output(print(summary(fm1), signif.stars=FALSE))[9:22])
@
+The \code{lm} function for fitting linear models in \proglang{R} uses
+a rank-revealing form of the QR decomposition. When the model matrix
+is determined to be rank deficient, according to the threshold used in
+\proglang{R}'s QR decomposition, the model matrix is reduced to
+$\rank(\bm X)$ columns by pivoting selected columns (those that are
+apparently linearly dependent on columns to their left) to the
+right-hand side of the matrix. A solution for this reduced model matrix is
+determined and the coefficients and standard errors for the redundant
+columns are flagged as missing.
+
+An alternative approach is to evaluate the ``pseudo-inverse'' of $\bm
+X$ from the singular value decomposition (SVD) of $\bm X$ or the
+eigendecomposition of $\bm X^\prime\bm X$. The SVD is of the form
+\begin{displaymath}
+ \bm X=\bm U\bm D\bm V^\prime=\bm U_1\bm D_1\bm V^\prime
+\end{displaymath}
+where $\bm U$ is an orthogonal $n\times n$ matrix and $\bm U_1$ is its
+leftmost $p$ columns, $\bm D$ is $n\times p$ and zero off the main
+diagonal so that $\bm D_1$ is a $p\times p$ diagonal matrix with
+non-increasing, non-negative diagonal elements, and $\bm V$ is a $p\times
+p$ orthogonal matrix. The pseudo-inverse of $\bm D_1$, written $\bm
+D_1^+$, is a $p\times p$ diagonal matrix whose first $r=\rank(\bm X)$
+diagonal elements are the inverses of the corresponding diagonal
+elements of $\bm D_1$ and whose last $p-r$ diagonal elements are zero.
+
+The tolerance for declaring a diagonal element of $\bm D$ to be
+(effectively) zero is a multiple of the largest singular value
+(i.e.\ the $(1,1)$ element of $\bm D$).
+
+We define a utility function, \code{Dplus}, to return the
+pseudo-inverse as a diagonal matrix, given the singular values (the
+diagonal of $\bm D$) and the apparent rank.
+
+<<echo=FALSE>>=
+code <- 'using Eigen::ArrayXd;
+using Eigen::DiagonalMatrix;
+using Eigen::Dynamic;
+using Eigen::VectorXd;
+
+inline DiagonalMatrix<double, Dynamic> Dplus(const ArrayXd& D, int r) {
+    VectorXd Di(VectorXd::Constant(D.size(), 0.));
+    Di.head(r) = D.head(r).inverse(); // invert only the first r singular values
+    return DiagonalMatrix<double, Dynamic>(Di);
+}'
+writeLines( code, "code.cpp" )
+@
+<<echo=FALSE,results=tex>>=
+ex_highlight( "code.cpp" )
+@
+
+\subsection{Least squares with the SVD}
+\label{sec:SVDls}
+
+With these definitions the code for least squares using the singular
+value decomposition can be written
+
+<<echo=FALSE>>=
+code <- 'using Eigen::JacobiSVD;
+
+const JacobiSVD<MatrixXd> UDV(X.jacobiSvd(Eigen::ComputeThinU|Eigen::ComputeThinV));
+const ArrayXd D(UDV.singularValues());
+const int r((D > D[0] * threshold()).count()); // threshold() supplies the rank tolerance (cf. fastLm.cpp)
+const MatrixXd VDp(UDV.matrixV() * Dplus(D, r));
+const VectorXd betahat(VDp * UDV.matrixU().adjoint() * y);
+const int df(n - r);
+const VectorXd se(s * VDp.rowwise().norm());'
+writeLines( code, "code.cpp" )
+@
+<<echo=FALSE,results=tex>>=
+ex_highlight( "code.cpp" )
+@
+
+In the rank-deficient case this code will produce a complete set of
+coefficients and their standard errors. It is up to the user to note
+that the rank is less than $p$, the number of columns in $\bm X$, and
+hence that the estimated coefficients are just one of infinitely many
+coefficient vectors producing the same fitted values. It happens
+that this solution is the minimum-norm solution.
+
+The interpretation of the standard errors from this code is also
+problematic when $\bm X$ is rank-deficient.
+
+\subsection{Least squares with the eigendecomposition}
+\label{sec:eigendecomp}
+
+The eigendecomposition of $\bm X^\prime\bm X$ is defined as
+\begin{displaymath}
+ \bm X^\prime\bm X=\bm V\bm\Lambda\bm V^\prime
+\end{displaymath}
+where $\bm V$, the matrix of eigenvectors, is a $p\times p$ orthogonal
+matrix and $\bm\Lambda$ is a $p\times p$ diagonal matrix with
+non-increasing, non-negative diagonal elements, called the eigenvalues
+of $\bm X^\prime\bm X$. When the eigenvalues are distinct, this $\bm
+V$ is the same as that in the SVD, up to changes of sign in the
+columns. Also, the eigenvalues of $\bm X^\prime\bm X$ are the squares
+of the singular values of $\bm X$.
+
+With these definitions we can adapt much of the code from the SVD
+method for the eigendecomposition.
+
+<<echo=FALSE>>=
+code <- 'using Eigen::SelfAdjointEigenSolver;
+
+const SelfAdjointEigenSolver<MatrixXd>
+    VLV(MatrixXd(p, p).setZero().selfadjointView<Lower>().rankUpdate(X.adjoint()));
+const ArrayXd D(VLV.eigenvalues().array().sqrt());
+const int r((D > D[0] * threshold()).count());
+const MatrixXd VDp(VLV.eigenvectors() * Dplus(D, r));
+const VectorXd betahat(VDp * VDp.adjoint() * X.adjoint() * y);
+const int df(n - r);
+const VectorXd se(s * VDp.rowwise().norm());'
+writeLines( code, "code.cpp" )
+@
+<<echo=FALSE,results=tex>>=
+ex_highlight( "code.cpp" )
+@
+
+\subsection{Least squares with the column-pivoted QR decomposition}
+\label{sec:colPivQR}
+
+The column-pivoted QR decomposition provides results similar to those
+from \proglang{R} in both the full-rank and the rank-deficient cases.
+The decomposition is of the form
+\begin{displaymath}
+ \bm X\bm P=\bm Q\bm R=\bm Q_1\bm R_1
+\end{displaymath}
+where, as before, $\bm Q$ is $n\times n$ and orthogonal and $\bm R$ is
+$n\times p$ and zero below the main diagonal. The $p\times p$ matrix $\bm P$ is a
+permutation matrix. That is, its columns are a permutation of the
+columns of $\bm I_p$. It serves to reorder the columns of $\bm X$ so
+that the diagonal elements of $\bm R$ are non-increasing in magnitude.
+
+An instance of the class \code{Eigen::ColPivHouseholderQR} has a
+\code{rank} method returning the computational rank of the matrix.
+When $\bm X$ is of full rank we can use essentially the same code as
+in the unpivoted decomposition except that we must reorder the
+standard errors. When $\bm X$ is rank-deficient we evaluate the
+coefficients and standard errors for the leading $r$ columns of $\bm
+X\bm P$ only.
+
+In the rank-deficient case the straightforward calculation of the
+fitted values as $\bm X\widehat{\bm\beta}$ cannot be used. We could
+do some complicated rearrangement of the columns of $\bm X$ and the
+coefficient estimates, but it is conceptually (and computationally)
+easier to employ the relationship
+\begin{displaymath}
+ \widehat{\bm y} = \bm Q_1\bm Q_1^\prime\bm y=\bm Q
+  \begin{bmatrix}
+    \bm I_r & \bm 0\\
+    \bm 0   & \bm 0
+  \end{bmatrix}
+ \bm Q^\prime\bm y
+\end{displaymath}
+The vector $\bm Q^\prime\bm y$ is called the ``effects'' vector in \proglang{R}.
+
+<<echo=FALSE>>=
+code <- 'using Eigen::ColPivHouseholderQR;
+typedef ColPivHouseholderQR<MatrixXd>::PermutationType Permutation;
+
+const ColPivHouseholderQR<MatrixXd> PQR(X);
+const Permutation Pmat(PQR.colsPermutation());
+const int r(PQR.rank());
+VectorXd betahat, fitted, se;
+if (r == X.cols()) {  // full-rank case
+    betahat = PQR.solve(y);
+    fitted  = X * betahat;
+    se      = Pmat * PQR.matrixQR().topRows(p).triangularView<Upper>().
+        solve(MatrixXd::Identity(p, p)).rowwise().norm();
+} else {              // rank-deficient case
+    MatrixXd Rinv(PQR.matrixQR().topLeftCorner(r, r).
+                  triangularView<Upper>().
+                  solve(MatrixXd::Identity(r, r)));
+    VectorXd effects(PQR.householderQ().adjoint() * y);
+    betahat         = VectorXd::Constant(X.cols(), ::NA_REAL);
+    betahat.head(r) = Rinv * effects.head(r);
+    betahat         = Pmat * betahat;
+    // create fitted values from the effects vector
+    // (X * betahat cannot be used when betahat contains NAs)
+    effects.tail(X.rows() - r).setZero();
+    fitted     = PQR.householderQ() * effects;
+    se         = VectorXd::Constant(X.cols(), ::NA_REAL);
+    se.head(r) = Rinv.rowwise().norm();
+    se         = Pmat * se;
+}'
+writeLines( code, "code.cpp" )
+@
+<<echo=FALSE,results=tex>>=
+ex_highlight( "code.cpp" )
+@
+
+Just to check that this does indeed provide the desired answer:
+<<rankdeficient>>=
+print(summary(fmPQR <- fastLm(y ~ f1 * f2, dd)), signif.stars=FALSE)
+all.equal(coef(fm1), coef(fmPQR))
+all.equal(unname(fitted(fm1)), fitted(fmPQR))
+all.equal(unname(residuals(fm1)), residuals(fmPQR))
+@
+
+The rank-revealing SVD method produces the same fitted
+values but not the same coefficients.
+<<rankdeficient>>=
+print(summary(fmSVD <- fastLm(y ~ f1 * f2, dd, method=4L)), signif.stars=FALSE)
+all.equal(coef(fm1), coef(fmSVD))
+all.equal(unname(fitted(fm1)), fitted(fmSVD))
+all.equal(unname(residuals(fm1)), residuals(fmSVD))
+@
+
+For this example the method based on the eigendecomposition gets the
+rank wrong, as the check sketched below shows.
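+<<rankdeficientSymmEig,eval=FALSE>>=
+## The method code 5L for the eigendecomposition is our assumption;
+## see ?fastLm in RcppEigen for the actual method codes.
+print(summary(fmSymm <- fastLm(y ~ f1 * f2, dd, method=5L)), signif.stars=FALSE)
+@
+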
+\subsection{Comparative speed}
+
+In the \pkg{RcppEigen} package the \proglang{R} function to fit linear
+models using the methods described above is called \code{fastLm}. The
+natural question to ask is, ``Is it indeed fast to use these methods
+based on \pkg{Eigen}?''. We have provided a comparative benchmarking
+of these methods, plus the default method using \proglang{R}'s
+\code{lm} function and the \code{fastLm} implementations in the
+\pkg{RcppArmadillo} and \pkg{RcppGSL} packages, if they are
+installed. The benchmark is a file called \code{lmBenchmark.R} and it
+uses the \pkg{rbenchmark} package.
+
+It can be run as
+<<benchmark,eval=FALSE>>=
+source(system.file("examples", "lmBenchmark.R", package="RcppEigen"))
+@
+Results will vary according to the speed of the processor, the
+number of cores and the implementation of the BLAS (Basic Linear
+Algebra Subprograms) in use. (\pkg{Eigen} methods do not
+use the BLAS but the other methods do.)
+
+Results obtained on a desktop computer, circa 2010, are shown in
+Table~\ref{tab:lmRes}.
+\begin{table}[tb]
+ \centering
+ \caption{\code{lmBenchmark} results on a desktop computer for the
+ default size, $100,000\times 40$, full-rank model matrix running
+ 20 repetitions for each method. Times (Elapsed, User and Sys) are
+  in seconds. The BLAS in use is a single-threaded version of Atlas
+  (Automatically Tuned Linear Algebra Software).}
+ \label{tab:lmRes}
+ \begin{tabular}{r r r r r}
+ \hline
+ \multicolumn{1}{c}{Method} & \multicolumn{1}{c}{Relative} &
+ \multicolumn{1}{c}{Elapsed} & \multicolumn{1}{c}{User} &
+ \multicolumn{1}{c}{Sys}\\
+ \hline
+ LLt & 1.000000 & 1.227 & 1.228 & 0.000 \\
+ LDLt & 1.037490 & 1.273 & 1.272 & 0.000 \\
+ SymmEig & 2.895681 & 3.553 & 2.972 & 0.572 \\
+ QR & 7.828036 & 9.605 & 8.968 & 0.620 \\
+ PivQR & 7.953545 & 9.759 & 9.120 & 0.624 \\
+ arma & 8.383048 & 10.286 & 10.277 & 0.000 \\
+ lm.fit & 13.782396 & 16.911 & 15.521 & 1.368 \\
+ SVD & 54.829666 & 67.276 & 66.321 & 0.912 \\
+ GSL & 157.531377 & 193.291 & 192.568 & 0.640 \\
+ \hline
+ \end{tabular}
+\end{table}
+
+These results indicate that methods based on forming and decomposing
+$\bm X^\prime\bm X$ (i.e.\ LDLt, LLt and SymmEig) are considerably
+faster than the others. The SymmEig method, using a rank-revealing
+decomposition, would be preferred, although the LDLt method could
+probably be modified to be rank-revealing. Do bear in mind that the
+dimensions of the problem will influence the comparative results.
+Because there are 100,000 rows in $\bm X$, methods that decompose the
+whole $\bm X$ matrix (all the methods except those named above) will
+be at a disadvantage.
+
+The pivoted QR method is about 1.7 times faster than \proglang{R}'s \code{lm.fit} on
+this test and provides nearly the same information as \code{lm.fit}.
+Methods based on the singular value decomposition (SVD and GSL) are
+much slower but, as mentioned above, this is caused in part by $\bm X$
+having many more rows than columns. The GSL method from the GNU
+Scientific Library uses an older algorithm for the SVD and is clearly
+out of contention.
+
+An SVD method using the Lapack SVD subroutine, \code{dgesdd}, may be
+faster than the native \pkg{Eigen} implementation of the SVD, which is
+not a particularly fast method.
+
+\section{Delayed evaluation}
+\label{sec:delayed}
+
+A form of delayed evaluation is used in \pkg{Eigen}. That is, many
+operators and methods do not force the evaluation of the object but
+instead return an ``expression object'' that is evaluated when
+needed. As an example, even though we write the $\bm X^\prime\bm X$
+evaluation using \code{.rankUpdate(X.adjoint())} the
+\code{X.adjoint()} part is not evaluated immediately. The
+\code{.rankUpdate} method detects that it has been passed a matrix
+that is to be used in its transposed form and evaluates the update
+by taking inner products of columns of $\bm X$ instead of rows of $\bm
+X^\prime$.
+
+Occasionally the method for \code{Rcpp::wrap} will not force an
+evaluation when it should. This is at least what Bill Venables calls
+an ``infelicity'' in the code, if not an outright bug. In the first
+example of the transpose of an integer matrix we assigned the
+transpose as a \code{MatrixXi} before returning it with \code{wrap}.
+The assignment forces the evaluation. If we skip this step we get an
+answer with the correct shape but the wrong contents.
+
+<<echo=FALSE>>=
+code <- 'using Eigen::Map;
+using Eigen::MatrixXi;
+const Map<MatrixXi> A(as<Map<MatrixXi> >(AA));
+return wrap(A.transpose());'
+writeLines( code, "code.cpp" )
+@
+<<echo=FALSE,results=tex>>=
+ex_highlight( "code.cpp" )
+@
+<<>>=
+Ai <- matrix(1:6, ncol=2L)
+ftrans2 <- cxxfunction(signature(AA = "matrix"),
+ paste(readLines( "code.cpp" ), collapse = "\n"),
+ plugin = "RcppEigen")
+(At <- ftrans2(Ai))
+all.equal(At, t(Ai))
+@
+
+Another recommended practice is to assign objects before wrapping them
+for return to \proglang{R}.
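+
+A corrected version of the fragment above (a sketch following the
+description of the first transpose example) assigns the transpose to a
+\code{MatrixXi} before wrapping it, which forces the evaluation:
+<<echo=FALSE>>=
+code <- 'using Eigen::Map;
+using Eigen::MatrixXi;
+const Map<MatrixXi> A(as<Map<MatrixXi> >(AA));
+const MatrixXi At(A.transpose());  // assignment forces evaluation
+return wrap(At);'
+writeLines( code, "code.cpp" )
+@
+<<echo=FALSE,results=tex>>=
+ex_highlight( "code.cpp" )
+@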
+
+\section{Sparse matrices}
+\label{sec:sparse}
+
+\pkg{Eigen} provides sparse matrix classes. An \proglang{R} object of
+class \code{dgCMatrix} (from the \pkg{Matrix} package) can be mapped
+as shown below.
+
+<<echo=FALSE>>=
+code <- 'using Eigen::Map;
+using Eigen::MappedSparseMatrix;
+using Eigen::SparseMatrix;
+using Eigen::VectorXd;
+
+const MappedSparseMatrix<double> A(as<MappedSparseMatrix<double> >(AA));
+const Map<VectorXd> y(as<Map<VectorXd> >(yy));
+const SparseMatrix<double> At(A.adjoint());
+return List::create(_["At"] = At,
+ _["Aty"] = At * y);'
+writeLines( code, "code.cpp" )
+@
+<<echo=FALSE,results=tex>>=
+ex_highlight( "code.cpp" )
+@
+<<>>=
+sparse1 <- cxxfunction(signature(AA = "dgCMatrix", yy = "numeric"),
+ paste(readLines( "code.cpp" ), collapse = "\n"),
+ plugin = "RcppEigen")
+data(KNex, package="Matrix")
+rr <- sparse1(KNex$mm, KNex$y)
+stopifnot(all.equal(rr$At, t(KNex$mm)),
+ all.equal(rr$Aty, as.vector(crossprod(KNex$mm, KNex$y))))
+@
+
+A sparse Cholesky decomposition is provided in \pkg{Eigen} as the
+\code{SimplicialCholesky} class. There are also linkages to the
+\pkg{CHOLMOD} code from the \pkg{Matrix} package. At present, both of
+these are regarded as experimental.
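+
+As a minimal sketch (ours, and subject to change because the class is
+experimental), a sparse normal-equations solve with
+\code{SimplicialCholesky}, assuming the \code{A} and \code{y} mappings
+shown above, might be written
+<<echo=FALSE>>=
+code <- 'typedef Eigen::SparseMatrix<double> SpMat;
+
+const SpMat At(A.adjoint());
+const SpMat AtA(At * A);                        // form the normal equations
+const Eigen::SimplicialCholesky<SpMat> Ch(AtA); // sparse Cholesky factorization
+return wrap(VectorXd(Ch.solve(At * y)));'
+writeLines( code, "code.cpp" )
+@
+<<echo=FALSE,results=tex>>=
+ex_highlight( "code.cpp" )
+@
+
+Note the explicit \code{VectorXd} conversion before \code{wrap},
+following the practice recommended in the section on delayed
+evaluation.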
+
\bibliographystyle{plainnat}
\bibliography{Rcpp}
Modified: pkg/RcppEigen/inst/doc/RcppEigen-Intro.pdf
===================================================================
--- pkg/RcppEigen/inst/doc/RcppEigen-Intro.pdf 2011-10-26 15:14:19 UTC (rev 3221)
+++ pkg/RcppEigen/inst/doc/RcppEigen-Intro.pdf 2011-10-26 15:18:28 UTC (rev 3222)
[TRUNCATED]
To get the complete diff run:
svnlook diff /svnroot/rcpp -r 3222
More information about the Rcpp-commits
mailing list