[Rcpp-commits] r3222 - pkg/RcppEigen/inst/doc
noreply at r-forge.r-project.org
Wed Oct 26 17:18:28 CEST 2011
Author: dmbates
Date: 2011-10-26 17:18:28 +0200 (Wed, 26 Oct 2011)
New Revision: 3222
Modified:
pkg/RcppEigen/inst/doc/RcppEigen-Intro.Rnw
pkg/RcppEigen/inst/doc/RcppEigen-Intro.pdf
Log:
Extend the introductory document.
Modified: pkg/RcppEigen/inst/doc/RcppEigen-Intro.Rnw
===================================================================
--- pkg/RcppEigen/inst/doc/RcppEigen-Intro.Rnw 2011-10-26 15:14:19 UTC (rev 3221)
+++ pkg/RcppEigen/inst/doc/RcppEigen-Intro.Rnw 2011-10-26 15:18:28 UTC (rev 3222)
@@ -2,8 +2,7 @@
%\VignetteIndexEntry{RcppEigen-intro}
\usepackage{vmargin}
\setmargrb{0.75in}{0.75in}{0.75in}{0.75in}
-
-\usepackage{color, alltt, bm}
+\usepackage{color, alltt, bm, amsmath}
\usepackage[authoryear,round,longnamesfirst]{natbib}
\usepackage[colorlinks]{hyperref}
\definecolor{link}{rgb}{0,0,0.3} %% next few lines courtesy of RJournal.sty
@@ -14,11 +13,10 @@
linkcolor=link,%
urlcolor=link
}
-
\newcommand{\proglang}[1]{\textsf{#1}}
\newcommand{\pkg}[1]{{\fontseries{b}\selectfont #1}}
\newcommand{\code}[1]{\texttt{#1}}
-
+\newcommand{\rank}{\operatorname{rank}}
%% defined as a stop-gap measure til interaction with highlight is sorted out
\newcommand{\hlstd}[1]{\textcolor[rgb]{0,0,0}{#1}}
\newcommand{\hlnum}[1]{\textcolor[rgb]{0.69,0.49,0}{#1}}
@@ -121,7 +119,6 @@
require( inline )
require( RcppEigen )
@
-
\begin{document}
\maketitle
@@ -129,9 +126,9 @@
\noindent
The \pkg{RcppEigen} package provides access to the \pkg{Eigen}
\proglang{C++} template library from \proglang{R}. \pkg{Rcpp}
- classes and especially the packages \proglang{C++} templated
- functions \code{as} and \code{wrap} provide the ``glue'' for passing
- objects from \proglang{R} to \proglang{C++} and back.
+ classes and instantiations of the \proglang{C++} templated functions
+ \code{as} and \code{wrap} from \pkg{Rcpp} provide the ``glue'' for
+ passing objects from \proglang{R} to \proglang{C++} and back.
}
\section{Introduction}
@@ -149,8 +146,8 @@
\texttt{Rcpp::as} and \texttt{Rcpp::wrap} for the \proglang{C++}
classes defined in \pkg{Eigen}.
-The \pkg{Eigen} classes themselves provide a high-performance,
-versatile and comprehensive representation of dense and sparse
+The \pkg{Eigen} classes themselves provide high-performance,
+versatile and comprehensive representations of dense and sparse
matrices and vectors, as well as decompositions and other functions
to be applied to these objects. In the next section we introduce some
of these classes and show how to interface to them from R.
@@ -158,13 +155,21 @@
\section{Eigen classes}
\label{sec:eclasses}
-Eigen provides templated classes for matrices, vectors and arrays. As
-in many \proglang{C++} template libraries using template
+Eigen (\url{http://eigen.tuxfamily.org}) is a \proglang{C++} template
+library providing classes for many forms of matrices, vectors, arrays
+and decompositions. These classes are flexible and comprehensive,
+allowing for both high performance and well-structured code
+representing high-level operations. \proglang{C++} code based on Eigen
+is often more like \proglang{R} code, working on the ``whole object'',
+than compiled code in other languages, where operations often must be
+coded in loops.
+
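+As a small illustration of this whole-object style (a sketch of our
+own, not code from the package), the sum of squared differences
+between two vectors, \code{sum((y - x)^2)} in \proglang{R}, is a
+single expression in \pkg{Eigen}:
+<<echo=FALSE>>=
+code <- 'using Eigen::VectorXd;
+
+// whole-object arithmetic: no explicit loop over the elements
+const VectorXd x(VectorXd::LinSpaced(5, 1., 5.));
+const VectorXd y(2. * x);
+return wrap((y - x).squaredNorm());'
+writeLines( code, "code.cpp" )
+@
+<<echo=FALSE,results=tex>>=
+ex_highlight( "code.cpp" )
+@
+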
+As in many \proglang{C++} template libraries using template
meta-programming
\citep{Abrahams+Gurtovoy:2004:TemplateMetaprogramming}, the templates
-themselves are very complicated. However, \pkg{Eigen} provides
-typedef's for the classes that correspond to \proglang{R} matrices and
-vectors, as shown in Table~\ref{tab:REigen}, and we will use these
+themselves can be very complicated. However, \pkg{Eigen} provides
+typedef's for common classes that correspond to \proglang{R} matrices and
+vectors, as shown in Table~\ref{tab:REigen}. We will use these
typedef's throughout this document.
\begin{table}[tb]
\centering
@@ -212,8 +217,8 @@
sparse matrices we use the Eigen templated class \code{MappedSparseMatrix}.
We must, of course, be careful not to modify the contents of the
-\proglang{R} object in the \proglang{C++} code. It is a good idea to
-declare such objects as const.
+\proglang{R} object in the \proglang{C++} code. A recommended
+practice is always to declare mapped objects as \code{const}.
\subsection{Arrays in Eigen}
\label{sec:arrays}
@@ -377,11 +382,12 @@
For triangular matrices the class is \code{TriangularView} and the
method is \code{triangularView}. The triangle can be specified as
-\code{Lower}, \code{UnitLower}, \code{StrictlyLower} and the
-equivalent specifications for \code{Upper}.
+\code{Lower}, \code{UnitLower}, \code{StrictlyLower}, \code{Upper},
+\code{UnitUpper} or \code{StrictlyUpper}.
For self-adjoint views the \code{rankUpdate} method adds a scalar multiple
-(which defaults to 1) of $\bm A\bm A^\prime$ to the current symmetric matrix.
+of $\bm A\bm A^\prime$ to the current symmetric matrix. The scalar
+multiple defaults to 1.
<<echo=FALSE>>=
code <- 'using Eigen::Map;
@@ -419,7 +425,7 @@
lower triangle is copied into the strict upper triangle).
For these products we could use either the lower triangle or the upper
-triangle because the result is going to be symmetrized before being returned.
+triangle as the result will be symmetrized before it is returned.
\subsection{Cholesky decomposition of the crossprod}
\label{sec:chol}
@@ -454,8 +460,7 @@
const Map<MatrixXd> A(as<Map<MatrixXd> >(AA));
const int n(A.cols());
const LLT<MatrixXd> llt(MatrixXd(n, n).setZero().
- selfadjointView<Lower>().
- rankUpdate(A.adjoint()));
+ selfadjointView<Lower>().rankUpdate(A.adjoint()));
return List::create(_["L"] = MatrixXd(llt.matrixL()),
_["R"] = MatrixXd(llt.matrixU()));'
@@ -477,7 +482,7 @@
The ``D-optimal'' criterion for experimental design chooses the design
that maximizes the determinant, $|\bm X^\prime\bm X|$, for the
$n\times p$ model matrix (or Jacobian matrix), $\bm X$. The
-determinant, $|\bm L|$, of the the $p\times p$ lower Cholesky factor
+determinant, $|\bm L|$, of the $p\times p$ lower Cholesky factor
$\bm L$, defined so that $\bm L\bm L^\prime=\bm X^\prime\bm X$, is
the product of its diagonal elements. (This is true for any
triangular matrix.) By the properties of determinants,
@@ -527,7 +532,7 @@
before applying the \code{log()} method.
\section{Least squares solutions}
-\label{sec:structured}
+\label{sec:leastSquares}
A common operation in statistical computing is calculating a least
squares solution, $\widehat{\bm\beta}$, defined as
@@ -539,10 +544,127 @@
on matrix decompositions, to determine such a solution. We have
already seen two forms of the Cholesky decomposition: ``LLt'' and
``LDLt'', that can be used to solve for $\widehat{\bm\beta}$. Other
-decompositions that can be used as the QR decomposition, with or
+decompositions that can be used are the QR decomposition, with or
without column pivoting, the singular value decomposition and the
eigendecomposition of a symmetric matrix.
+Determining a least squares solution is relatively straightforward. However,
+in statistical computing we often require a bit more information, such
+as the standard errors of the coefficient estimates. Calculating
+these involves evaluating the diagonal elements of $\left(\bm
+ X^\prime\bm X\right)^{-1}$ and the residual sum of squares, $\|\bm
+y-\bm X\widehat{\bm\beta}\|^2$.
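+In the usual formulation, the residual standard error $s$ and the
+standard errors of the coefficient estimates are
+\begin{displaymath}
+  s=\sqrt{\frac{\|\bm y-\bm X\widehat{\bm\beta}\|^2}{n-p}}
+  \quad\text{and}\quad
+  \operatorname{se}\left(\widehat{\beta}_j\right)=
+  s\sqrt{\left[\left(\bm X^\prime\bm X\right)^{-1}\right]_{jj}},
+  \quad j=1,\dots,p.
+\end{displaymath}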
+
+\subsection{Least squares using the ``LLt'' Cholesky}
+\label{sec:LLtLeastSquares}
+
+<<echo=FALSE>>=
+code <- 'using Eigen::LLT;
+using Eigen::Lower;
+using Eigen::Map;
+using Eigen::MatrixXd;
+using Eigen::VectorXd;
+
+const Map<MatrixXd> X(as<Map<MatrixXd> >(XX));
+const Map<VectorXd> y(as<Map<VectorXd> >(yy));
+const int n(X.rows()), p(X.cols());
+const LLT<MatrixXd> llt(MatrixXd(p, p).setZero().
+ selfadjointView<Lower>().rankUpdate(X.adjoint()));
+const VectorXd betahat(llt.solve(X.adjoint() * y));
+const VectorXd fitted(X * betahat);
+const VectorXd resid(y - fitted);
+const int df(n - p);
+const double s(resid.norm() / std::sqrt(double(df)));
+const VectorXd se(s * llt.matrixL().solve(MatrixXd::Identity(p, p)).colwise().norm());
+return List::create(_["coefficients"] = betahat,
+ _["fitted.values"] = fitted,
+ _["residuals"] = resid,
+ _["s"] = s,
+ _["df.residual"] = df,
+ _["rank"] = p,
+ _["Std. Error"] = se);'
+writeLines( code, "code.cpp" )
+@
+<<echo=FALSE,results=tex>>=
+ex_highlight( "code.cpp" )
+@
+<<>>=
+lltLS <- cxxfunction(signature(XX = "matrix", yy = "numeric"),
+ paste(readLines( "code.cpp" ), collapse = "\n"), "RcppEigen")
+data(trees, package="datasets")
+str(lltFit <- with(trees, lltLS(cbind(1, log(Girth)), log(Volume))))
+str(lmFit <- with(trees, lm.fit(cbind(1, log(Girth)), log(Volume))))
+for (nm in c("coefficients", "residuals", "fitted.values", "rank"))
+ stopifnot(all.equal(lltFit[[nm]], unname(lmFit[[nm]])))
+stopifnot(all.equal(lltFit[["Std. Error"]],
+ unname(coef(summary(lm(log(Volume) ~ log(Girth), trees)))[,2])))
+@
+
+There are several aspects of the \proglang{C++} code worth mentioning.
+The \code{solve} method for the \code{LLT} object evaluates, in this
+case, $\left(\bm X^\prime\bm X\right)^{-1}\bm X^\prime\bm y$, but
+without actually evaluating the inverse. The calculation of the
+residuals, $\bm y-\widehat{\bm y}$, can be written, as in
+\proglang{R}, as \code{y - fitted}. (But note that \pkg{Eigen}
+classes do not have a ``recycling rule'' as in \proglang{R}; that is,
+the two vector operands must have the same length.) The \code{norm}
+method evaluates the square root of the
+sum of squares of the elements of a vector. Although we don't
+explicitly evaluate $\left(\bm X^\prime\bm X\right)^{-1}$ we do
+evaluate $\bm L^{-1}$ to obtain the standard errors. Note also the
+use of the \code{colwise} method in the evaluation of the standard errors.
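+This works because
+\begin{displaymath}
+  \left(\bm X^\prime\bm X\right)^{-1}=\left(\bm L\bm L^\prime\right)^{-1}
+  =\left(\bm L^{-1}\right)^\prime\bm L^{-1},
+\end{displaymath}
+so the $j$th diagonal element of $\left(\bm X^\prime\bm X\right)^{-1}$
+is the squared norm of the $j$th column of $\bm L^{-1}$, and the
+column-wise norms of $\bm L^{-1}$, scaled by $s$, are exactly the
+standard errors.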
+
+In the descriptions of the other methods for solving least squares
+problems, much of the code parallels that shown here, so we show only
+the evaluation of the coefficients, the rank and the standard errors.
+In these code fragments the standard errors are actually calculated
+only up to the scalar multiple $s$, the residual standard error. The
+calculation of the residuals and of $s$, and the scaling of the
+coefficient standard errors, are the same for all methods. (See the
+file \code{fastLm.cpp} in the \pkg{RcppEigen} source package for
+details.)
+
+\subsection{Least squares using the unpivoted QR decomposition}
+\label{sec:QR}
+
+A QR decomposition has the form
+\begin{displaymath}
+ \bm X=\bm Q\bm R=\bm Q_1\bm R_1
+\end{displaymath}
+where $\bm Q$ is an $n\times n$ orthogonal matrix, which means that
+$\bm Q^\prime\bm Q=\bm Q\bm Q^\prime=\bm I_n$, and $\bm R$ is $n\times
+p$ and zero below the main diagonal. The $n\times p$ matrix $\bm Q_1$
+is the first $p$ columns of $\bm Q$ and the $p\times p$ upper
+triangular matrix $\bm R_1$ is the top $p$ rows of $\bm R$. There are
+three \pkg{Eigen} classes for the QR decomposition:
+\code{HouseholderQR} provides the basic QR decomposition using
+Householder transformations, \code{ColPivHouseholderQR} incorporates
+column pivots and \code{FullPivHouseholderQR} incorporates both row
+and column pivots.
+
+For the unpivoted QR decomposition the code is of the form\\
+<<echo=FALSE>>=
+code <- 'using Eigen::HouseholderQR;
+
+const HouseholderQR<MatrixXd> QR(X);
+const VectorXd betahat(QR.solve(y));
+const VectorXd fitted(X * betahat);
+const int df(n - p);
+const VectorXd se(QR.matrixQR().topRows(p).triangularView<Upper>().
+                  solve(MatrixXd::Identity(p, p)).rowwise().norm());'
+writeLines( code, "code.cpp" )
+@
+<<echo=FALSE,results=tex>>=
+ex_highlight( "code.cpp" )
+@
+
+We see that this code is very close to the code for the ``LLt'' case.
+In fact, if we had extracted the upper triangular factor (the
+\code{matrixU} method) from the \code{LLT} object, the code would be
+nearly identical.
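+
+A sketch of that variation (ours, not code from the package), reusing
+the \code{llt}, \code{X} and \code{y} objects from the ``LLt''
+example:
+<<echo=FALSE>>=
+code <- '// forward-solve with L, then back-solve with U = t(L)
+const VectorXd betahat(llt.matrixU().
+                       solve(llt.matrixL().solve(X.adjoint() * y)));'
+writeLines( code, "code.cpp" )
+@
+<<echo=FALSE,results=tex>>=
+ex_highlight( "code.cpp" )
+@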
+
+\subsection{Handling the rank-deficient case}
+\label{sec:rankdeficient}
+
One important consideration when determining least squares solutions
is whether $\rank(\bm X)$ is $p$, a situation we describe by saying
that $\bm X$ has ``full column rank''. When $\bm X$ does not have
@@ -567,17 +689,375 @@
<<missingcell>>=
dd <- data.frame(f1 = gl(4, 6, labels = LETTERS[1:4]),
f2 = gl(3, 2, labels = letters[1:3]))[-(7:8), ]
-xtabs(~ f2 + f1, dd) # one missing cell
+xtabs(~ f2 + f1, dd) # one missing cell
mm <- model.matrix(~ f1 * f2, dd)
-kappa(mm) # large, indicating rank deficiency
-(c(rank=qr(mm)$rank, p=ncol(mm))) # alternative evaluation
+kappa(mm) # large condition number, indicating rank deficiency
+rcond(mm) # alternative evaluation, the reciprocal condition number
+(c(rank=qr(mm)$rank, p=ncol(mm))) # rank as computed in R's qr function
set.seed(1)
dd$y <- mm %*% seq_len(ncol(mm)) + rnorm(nrow(mm), sd = 0.1)
# lm detects the rank deficiency
-writeLines(capture.output(print(summary(lm(y ~ f1 * f2, dd)),
- signif.stars=FALSE))[9:22])
+fm1 <- lm(y ~ f1 * f2, dd)
+writeLines(capture.output(print(summary(fm1), signif.stars=FALSE))[9:22])
@
+The \code{lm} function for fitting linear models in \proglang{R} uses
+a rank-revealing form of the QR decomposition. When the model matrix
+is determined to be rank deficient, according to the threshold used in
+\proglang{R}'s QR decomposition, the model matrix is reduced to
+$\rank(\bm X)$ columns by pivoting selected columns (those that are
+apparently linearly dependent on columns to their left) to the
+right-hand side of the matrix. A solution for this reduced model matrix is
+determined and the coefficients and standard errors for the redundant
+columns are flagged as missing.
+
+An alternative approach is to evaluate the ``pseudo-inverse'' of $\bm
+X$ from the singular value decomposition (SVD) of $\bm X$ or the
+eigendecomposition of $\bm X^\prime\bm X$. The SVD is of the form
+\begin{displaymath}
+ \bm X=\bm U\bm D\bm V^\prime=\bm U_1\bm D_1\bm V^\prime
+\end{displaymath}
+where $\bm U$ is an orthogonal $n\times n$ matrix and $\bm U_1$ is its
+leftmost $p$ columns, $\bm D$ is $n\times p$ and zero off the main
+diagonal so that $\bm D_1$ is a $p\times p$ diagonal matrix with
+non-increasing, non-negative diagonal elements, and $\bm V$ is a $p\times
+p$ orthogonal matrix. The pseudo-inverse of $\bm D_1$, written $\bm
+D_1^+$, is a $p\times p$ diagonal matrix whose first $r=\rank(\bm X)$
+diagonal elements are the inverses of the corresponding diagonal
+elements of $\bm D_1$ and whose last $p-r$ diagonal elements are zero.
+
+The tolerance for declaring a diagonal element of $\bm D$ to be
+(effectively) zero is a multiple of the largest singular value
+(i.e.\ the $(1,1)$ element of $\bm D$).
+
+We define a utility function, \code{Dplus}, to return the
+pseudo-inverse as a diagonal matrix, given the singular values (the
+diagonal of $\bm D$) and the apparent rank.
+
+<<echo=FALSE>>=
+code <- 'using Eigen::ArrayXd;
+using Eigen::DiagonalMatrix;
+using Eigen::Dynamic;
+using Eigen::VectorXd;
+
+inline DiagonalMatrix<double, Dynamic> Dplus(const ArrayXd& D, int r) {
+    VectorXd Di(VectorXd::Constant(D.size(), 0.));
+    Di.head(r) = D.head(r).inverse(); // invert only the first r singular values
+    return DiagonalMatrix<double, Dynamic>(Di);
+}'
+writeLines( code, "code.cpp" )
+@
+<<echo=FALSE,results=tex>>=
+ex_highlight( "code.cpp" )
+@
+
+\subsection{Least squares with the SVD}
+\label{sec:SVDls}
+
+With these definitions the code for least squares using the singular
+value decomposition can be written
+
+<<echo=FALSE>>=
+code <- 'using Eigen::JacobiSVD;
+
+const JacobiSVD<MatrixXd> UDV(X.jacobiSvd(Eigen::ComputeThinU|Eigen::ComputeThinV));
+const ArrayXd D(UDV.singularValues());
+const int r((D > D[0] * threshold()).count()); // threshold() supplies the rank tolerance (cf. fastLm.cpp)
+const MatrixXd VDp(UDV.matrixV() * Dplus(D, r));
+const VectorXd betahat(VDp * UDV.matrixU().adjoint() * y);
+const int df(n - r);
+const VectorXd se(s * VDp.rowwise().norm());'
+writeLines( code, "code.cpp" )
+@
+<<echo=FALSE,results=tex>>=
+ex_highlight( "code.cpp" )
+@
+
+In the rank-deficient case this code will produce a complete set of
+coefficients and their standard errors. It is up to the user to note
+that the rank is less than $p$, the number of columns in $\bm X$, and
+hence that the estimated coefficients are just one of infinitely many
+coefficient vectors producing the same fitted values. It happens
+that this solution is the minimum-norm solution.
+
+The interpretation of the standard errors from this code is also
+problematic when $\bm X$ is rank-deficient.
+
+\subsection{Least squares with the eigendecomposition}
+\label{sec:eigendecomp}
+
+The eigendecomposition of $\bm X^\prime\bm X$ is defined as
+\begin{displaymath}
+ \bm X^\prime\bm X=\bm V\bm\Lambda\bm V^\prime
+\end{displaymath}
+where $\bm V$, the matrix of eigenvectors, is a $p\times p$ orthogonal
+matrix and $\bm\Lambda$ is a $p\times p$ diagonal matrix with
+non-increasing, non-negative diagonal elements, called the eigenvalues
+of $\bm X^\prime\bm X$. When the eigenvalues are distinct, this $\bm
+V$ is the same as that in the SVD, up to changes of sign in the
+columns. Also, the eigenvalues of $\bm X^\prime\bm X$ are the squares
+of the singular values of $\bm X$.
+
+With these definitions we can adapt much of the code from the SVD
+method for the eigendecomposition.
+
+<<echo=FALSE>>=
+code <- 'using Eigen::SelfAdjointEigenSolver;
+
+const SelfAdjointEigenSolver<MatrixXd>
+    VLV(MatrixXd(p, p).setZero().selfadjointView<Lower>().rankUpdate(X.adjoint()));
+const ArrayXd D(VLV.eigenvalues().array().sqrt());
+const int r((D > D[0] * threshold()).count());
+const MatrixXd VDp(VLV.eigenvectors() * Dplus(D, r));
+const VectorXd betahat(VDp * VDp.adjoint() * X.adjoint() * y);
+const int df(n - r);
+const VectorXd se(s * VDp.rowwise().norm());'
+writeLines( code, "code.cpp" )
+@
+<<echo=FALSE,results=tex>>=
+ex_highlight( "code.cpp" )
+@
+
+\subsection{Least squares with the column-pivoted QR decomposition}
+\label{sec:colPivQR}
+
+The column-pivoted QR decomposition provides results similar to those
+from \proglang{R} in both the full-rank and the rank-deficient cases.
+The decomposition is of the form
+\begin{displaymath}
+ \bm X\bm P=\bm Q\bm R=\bm Q_1\bm R_1
+\end{displaymath}
+where, as before, $\bm Q$ is $n\times n$ and orthogonal and $\bm R$ is
+$n\times p$ and zero below the main diagonal. The $p\times p$ matrix $\bm P$ is a
+permutation matrix. That is, its columns are a permutation of the
+columns of $\bm I_p$. It serves to reorder the columns of $\bm X$ so
+that the diagonal elements of $\bm R$ are non-increasing in magnitude.
+
+An instance of the class \code{Eigen::ColPivHouseholderQR} has a
+\code{rank} method returning the computational rank of the matrix.
+When $\bm X$ is of full rank we can use essentially the same code as
+in the unpivoted decomposition except that we must reorder the
+standard errors. When $\bm X$ is rank-deficient we evaluate the
+coefficients and standard errors for the leading $r$ columns of $\bm
+X\bm P$ only.
+
+In the rank-deficient case the straightforward calculation of the
+fitted values as $\bm X\widehat{\bm\beta}$ cannot be used. We could
+do some complicated rearrangement of the columns of $\bm X$ and the
+coefficient estimates, but it is conceptually (and computationally)
+easier to employ the relationship
+\begin{displaymath}
+ \widehat{\bm y} = \bm Q_1\bm Q_1^\prime\bm y=\bm Q
+  \begin{bmatrix}
+    \bm I_r & \bm 0\\
+    \bm 0   & \bm 0
+  \end{bmatrix}
+ \bm Q^\prime\bm y
+\end{displaymath}
+The vector $\bm Q^\prime\bm y$ is called the ``effects'' vector in \proglang{R}.
+
+<<echo=FALSE>>=
+code <- 'using Eigen::ColPivHouseholderQR;
+typedef ColPivHouseholderQR<MatrixXd>::PermutationType Permutation;
+
+const ColPivHouseholderQR<MatrixXd> PQR(X);
+const Permutation Pmat(PQR.colsPermutation());
+const int r(PQR.rank());
+VectorXd betahat, fitted, se;
+if (r == X.cols()) {  // full-rank case
+    betahat = PQR.solve(y);
+    fitted  = X * betahat;
+    se      = Pmat * PQR.matrixQR().topRows(p).triangularView<Upper>().
+        solve(MatrixXd::Identity(p, p)).rowwise().norm();
+} else {              // rank-deficient case
+    MatrixXd Rinv(PQR.matrixQR().topLeftCorner(r, r).
+                  triangularView<Upper>().
+                  solve(MatrixXd::Identity(r, r)));
+    VectorXd effects(PQR.householderQ().adjoint() * y);
+    betahat         = VectorXd::Constant(X.cols(), ::NA_REAL);
+    betahat.head(r) = Rinv * effects.head(r);
+    betahat         = Pmat * betahat;
+    // create fitted values from the effects vector
+    // (X * betahat cannot be used when betahat contains NAs)
+    effects.tail(X.rows() - r).setZero();
+    fitted     = PQR.householderQ() * effects;
+    se         = VectorXd::Constant(X.cols(), ::NA_REAL);
+    se.head(r) = Rinv.rowwise().norm();
+    se         = Pmat * se;
+}'
+writeLines( code, "code.cpp" )
+@
+<<echo=FALSE,results=tex>>=
+ex_highlight( "code.cpp" )
+@
+
+Just to check that this does indeed provide the desired answer:
+<<rankdeficient>>=
+print(summary(fmPQR <- fastLm(y ~ f1 * f2, dd)), signif.stars=FALSE)
+all.equal(coef(fm1), coef(fmPQR))
+all.equal(unname(fitted(fm1)), fitted(fmPQR))
+all.equal(unname(residuals(fm1)), residuals(fmPQR))
+@
+
+The rank-revealing SVD method produces the same fitted
+values but not the same coefficients.
+<<rankdeficient>>=
+print(summary(fmSVD <- fastLm(y ~ f1 * f2, dd, method=4L)), signif.stars=FALSE)
+all.equal(coef(fm1), coef(fmSVD))
+all.equal(unname(fitted(fm1)), fitted(fmSVD))
+all.equal(unname(residuals(fm1)), residuals(fmSVD))
+@
+
+For this example the method based on the eigendecomposition gets the
+rank wrong, as the check sketched below shows.
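+<<rankdeficientSymmEig,eval=FALSE>>=
+## The method code 5L for the eigendecomposition is our assumption;
+## see ?fastLm in RcppEigen for the actual method codes.
+print(summary(fmSymm <- fastLm(y ~ f1 * f2, dd, method=5L)), signif.stars=FALSE)
+@
+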
+\subsection{Comparative speed}
+
+In the \pkg{RcppEigen} package the \proglang{R} function to fit linear
+models using the methods described above is called \code{fastLm}. The
+natural question to ask is, ``Is it indeed fast to use these methods
+based on \pkg{Eigen}?''. We have provided a comparative benchmarking
+of these methods, plus the default method using \proglang{R}'s
+\code{lm} function and the \code{fastLm} implementations in the
+\pkg{RcppArmadillo} and \pkg{RcppGSL} packages, if they are
+installed. The benchmark is a file called \code{lmBenchmark.R} and it
+uses the \pkg{rbenchmark} package.
+
+It can be run as
+<<benchmark,eval=FALSE>>=
+source(system.file("examples", "lmBenchmark.R", package="RcppEigen"))
+@
+Results will vary according to the speed of the processor, the
+number of cores and the implementation of the BLAS (Basic Linear
+Algebra Subprograms) in use. (\pkg{Eigen} methods do not
+use the BLAS but the other methods do.)
+
+Results obtained on a desktop computer, circa 2010, are shown in
+Table~\ref{tab:lmRes}.
+\begin{table}[tb]
+ \centering
+ \caption{\code{lmBenchmark} results on a desktop computer for the
+ default size, $100,000\times 40$, full-rank model matrix running
+ 20 repetitions for each method. Times (Elapsed, User and Sys) are
+  in seconds. The BLAS in use is a single-threaded version of Atlas
+  (Automatically Tuned Linear Algebra Software).}
+ \label{tab:lmRes}
+ \begin{tabular}{r r r r r}
+ \hline
+ \multicolumn{1}{c}{Method} & \multicolumn{1}{c}{Relative} &
+ \multicolumn{1}{c}{Elapsed} & \multicolumn{1}{c}{User} &
+ \multicolumn{1}{c}{Sys}\\
+ \hline
+ LLt & 1.000000 & 1.227 & 1.228 & 0.000 \\
+ LDLt & 1.037490 & 1.273 & 1.272 & 0.000 \\
+ SymmEig & 2.895681 & 3.553 & 2.972 & 0.572 \\
+ QR & 7.828036 & 9.605 & 8.968 & 0.620 \\
+ PivQR & 7.953545 & 9.759 & 9.120 & 0.624 \\
+ arma & 8.383048 & 10.286 & 10.277 & 0.000 \\
+ lm.fit & 13.782396 & 16.911 & 15.521 & 1.368 \\
+ SVD & 54.829666 & 67.276 & 66.321 & 0.912 \\
+ GSL & 157.531377 & 193.291 & 192.568 & 0.640 \\
+ \hline
+ \end{tabular}
+\end{table}
+
+These results indicate that methods based on forming and decomposing
+$\bm X^\prime\bm X$ (i.e.\ LDLt, LLt and SymmEig) are considerably
+faster than the others. The SymmEig method, using a rank-revealing
+decomposition, would be preferred, although the LDLt method could
+probably be modified to be rank-revealing. Do bear in mind that the
+dimensions of the problem will influence the comparative results.
+Because there are 100,000 rows in $\bm X$, methods that decompose the
+whole $\bm X$ matrix (all the methods except those named above) will
+be at a disadvantage.
+
+The pivoted QR method is about 1.7 times faster than \proglang{R}'s \code{lm.fit} on
+this test and provides nearly the same information as \code{lm.fit}.
+Methods based on the singular value decomposition (SVD and GSL) are
+much slower but, as mentioned above, this is caused in part by $\bm X$
+having many more rows than columns. The GSL method from the GNU
+Scientific Library uses an older algorithm for the SVD and is clearly
+out of contention.
+
+An SVD method using the Lapack SVD subroutine, \code{dgesdd}, may be
+faster than the native \pkg{Eigen} implementation of the SVD, which is
+not a particularly fast method.
+
+\section{Delayed evaluation}
+\label{sec:delayed}
+
+A form of delayed evaluation is used in \pkg{Eigen}. That is, many
+operators and methods do not force the evaluation of the object but
+instead return an ``expression object'' that is evaluated when
+needed. As an example, even though we write the $\bm X^\prime\bm X$
+evaluation using \code{.rankUpdate(X.adjoint())} the
+\code{X.adjoint()} part is not evaluated immediately. The
+\code{.rankUpdate} method detects that it has been passed a matrix
+that is to be used in its transposed form and evaluates the update
+by taking inner products of columns of $\bm X$ instead of rows of $\bm
+X^\prime$.
+
+Occasionally the method for \code{Rcpp::wrap} will not force an
+evaluation when it should. This is at least what Bill Venables calls
+an ``infelicity'' in the code, if not an outright bug. In the first
+example of the transpose of an integer matrix we assigned the
+transpose as a \code{MatrixXi} before returning it with \code{wrap}.
+The assignment forces the evaluation. If we skip this step we get an
+answer with the correct shape but the wrong contents.
+
+<<echo=FALSE>>=
+code <- 'using Eigen::Map;
+using Eigen::MatrixXi;
+const Map<MatrixXi> A(as<Map<MatrixXi> >(AA));
+return wrap(A.transpose());'
+writeLines( code, "code.cpp" )
+@
+<<echo=FALSE,results=tex>>=
+ex_highlight( "code.cpp" )
+@
+<<>>=
+Ai <- matrix(1:6, ncol=2L)
+ftrans2 <- cxxfunction(signature(AA = "matrix"),
+ paste(readLines( "code.cpp" ), collapse = "\n"),
+ plugin = "RcppEigen")
+(At <- ftrans2(Ai))
+all.equal(At, t(Ai))
+@
+
+Another recommended practice is to assign objects before wrapping them
+for return to \proglang{R}.
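+
+A corrected version of the fragment above (a sketch following the
+description of the first transpose example) assigns the transpose to a
+\code{MatrixXi} before wrapping it, which forces the evaluation:
+<<echo=FALSE>>=
+code <- 'using Eigen::Map;
+using Eigen::MatrixXi;
+const Map<MatrixXi> A(as<Map<MatrixXi> >(AA));
+const MatrixXi At(A.transpose());  // assignment forces evaluation
+return wrap(At);'
+writeLines( code, "code.cpp" )
+@
+<<echo=FALSE,results=tex>>=
+ex_highlight( "code.cpp" )
+@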
+
+\section{Sparse matrices}
+\label{sec:sparse}
+
+\pkg{Eigen} provides sparse matrix classes. An \proglang{R} object of
+class \code{dgCMatrix} (from the \pkg{Matrix} package) can be mapped
+as shown below.
+
+<<echo=FALSE>>=
+code <- 'using Eigen::Map;
+using Eigen::MappedSparseMatrix;
+using Eigen::SparseMatrix;
+using Eigen::VectorXd;
+
+const MappedSparseMatrix<double> A(as<MappedSparseMatrix<double> >(AA));
+const Map<VectorXd> y(as<Map<VectorXd> >(yy));
+const SparseMatrix<double> At(A.adjoint());
+return List::create(_["At"] = At,
+ _["Aty"] = At * y);'
+writeLines( code, "code.cpp" )
+@
+<<echo=FALSE,results=tex>>=
+ex_highlight( "code.cpp" )
+@
+<<>>=
+sparse1 <- cxxfunction(signature(AA = "dgCMatrix", yy = "numeric"),
+ paste(readLines( "code.cpp" ), collapse = "\n"),
+ plugin = "RcppEigen")
+data(KNex, package="Matrix")
+rr <- sparse1(KNex$mm, KNex$y)
+stopifnot(all.equal(rr$At, t(KNex$mm)),
+ all.equal(rr$Aty, as.vector(crossprod(KNex$mm, KNex$y))))
+@
+
+A sparse Cholesky decomposition is provided in \pkg{Eigen} as the
+\code{SimplicialCholesky} class. There are also linkages to the
+\pkg{CHOLMOD} code from the \pkg{Matrix} package. At present, both of
+these are regarded as experimental.
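+
+As a minimal sketch (ours, and subject to change because the class is
+experimental), a sparse normal-equations solve with
+\code{SimplicialCholesky}, assuming the \code{A} and \code{y} mappings
+shown above, might be written
+<<echo=FALSE>>=
+code <- 'typedef Eigen::SparseMatrix<double> SpMat;
+
+const SpMat At(A.adjoint());
+const SpMat AtA(At * A);                        // form the normal equations
+const Eigen::SimplicialCholesky<SpMat> Ch(AtA); // sparse Cholesky factorization
+return wrap(VectorXd(Ch.solve(At * y)));'
+writeLines( code, "code.cpp" )
+@
+<<echo=FALSE,results=tex>>=
+ex_highlight( "code.cpp" )
+@
+
+Note the explicit \code{VectorXd} conversion before \code{wrap},
+following the practice recommended in the section on delayed
+evaluation.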
+
\bibliographystyle{plainnat}
\bibliography{Rcpp}
Modified: pkg/RcppEigen/inst/doc/RcppEigen-Intro.pdf
===================================================================
--- pkg/RcppEigen/inst/doc/RcppEigen-Intro.pdf 2011-10-26 15:14:19 UTC (rev 3221)
+++ pkg/RcppEigen/inst/doc/RcppEigen-Intro.pdf 2011-10-26 15:18:28 UTC (rev 3222)
[TRUNCATED]
To get the complete diff run:
svnlook diff /svnroot/rcpp -r 3222
More information about the Rcpp-commits
mailing list