[Lme4-commits] r1831 - in pkg/lme4.0: . inst/doc vignettes

noreply at r-forge.r-project.org noreply at r-forge.r-project.org
Fri Aug 9 13:18:57 CEST 2013


Author: mmaechler
Date: 2013-08-09 13:18:57 +0200 (Fri, 09 Aug 2013)
New Revision: 1831

Added:
   pkg/lme4.0/vignettes/
   pkg/lme4.0/vignettes/Implementation.Rnw
   pkg/lme4.0/vignettes/PLSvGLS.Rnw
   pkg/lme4.0/vignettes/Theory.Rnw
   pkg/lme4.0/vignettes/lme4.bib
Removed:
   pkg/lme4.0/inst/doc/Implementation.Rnw
   pkg/lme4.0/inst/doc/PLSvGLS.Rnw
   pkg/lme4.0/inst/doc/Theory.Rnw
   pkg/lme4.0/inst/doc/lme4.bib
Log:
moved stuff to new vignettes/

Deleted: pkg/lme4.0/inst/doc/Implementation.Rnw
===================================================================
--- pkg/lme4.0/inst/doc/Implementation.Rnw	2013-07-29 18:07:41 UTC (rev 1830)
+++ pkg/lme4.0/inst/doc/Implementation.Rnw	2013-08-09 11:18:57 UTC (rev 1831)
@@ -1,1449 +0,0 @@
-\documentclass[12pt]{article}
-\usepackage{Sweave}
-\usepackage{amsmath}
-\usepackage{bm}
-\usepackage[authoryear,round]{natbib}
-\bibliographystyle{plainnat}
-\DefineVerbatimEnvironment{Sinput}{Verbatim}
-{formatcom={\vspace{-2.5ex}},fontshape=sl,
-  fontfamily=courier,fontseries=b, fontsize=\scriptsize}
-\DefineVerbatimEnvironment{Soutput}{Verbatim}
-{formatcom={\vspace{-2.5ex}},fontfamily=courier,fontseries=b,%
-  fontsize=\scriptsize}
-%%\VignetteIndexEntry{Implementation Details}
-%%\VignetteDepends{lme4.0, MEMSS, mlmRev}
-%%
-\newcommand{\trans}{\ensuremath{^\mathsf{T}}}
-\newcommand{\invtrans}{\ensuremath{^\mathsf{-T}}}
-\title{Linear mixed model implementation in lme4.0}
-\author{Douglas Bates\\Department of Statistics\\%
-  University of Wisconsin -- Madison}
-\begin{document}
-\SweaveOpts{engine=R,eps=FALSE,pdf=TRUE,strip.white=true,keep.source=TRUE}
-\SweaveOpts{include=FALSE}
-\setkeys{Gin}{width=\textwidth}
-\newcommand{\code}[1]{\texttt{\small{#1}}}
-\newcommand{\package}[1]{\textsf{\small{#1}}}
-\maketitle
-\begin{abstract}
-  We describe the form of the linear mixed-effects and generalized
-  linear mixed-effects models fit by \code{lmer} and give details of
-  the representation and the computational techniques used to fit such
-  models.  These techniques are illustrated on several examples.
-\end{abstract}
-
-<<preliminaries,echo=FALSE,print=FALSE>>=
-options(width=80, show.signif.stars = FALSE,
-        lattice.theme = function() canonical.theme("pdf", color = FALSE))
-library(lattice)
-library(Matrix)
-library(lme4.0)
-data("Rail", package = "MEMSS")
-data("ScotsSec", package = "mlmRev")
-@
-
-\section{A simple example}
-\label{sec:example}
-
-The \code{Rail} data set from the \package{MEMSS} package is described
-in \citet{pinh:bate:2000} as consisting of three measurements of
-the travel time of a type of sound wave on each of six sample railroad
-rails. We can examine the structure of these data with the \code{str}
-function
-<<strRail>>=
-str(Rail)
-@
-
-Because there are only three observations on each of the rails a
-dotplot (Figure~\ref{fig:Raildotplot}) shows the structure of the data
-well.
-<<Raildotplot,fig=TRUE,include=FALSE,width=8,height=2>>=
-print(dotplot(reorder(Rail,travel)~travel,Rail,xlab="Travel time (ms)",ylab="Rail"))
-@
-\begin{figure}[tb]
-  \centering
-  \includegraphics{Implementation-Raildotplot}
-  \caption{Travel time of sound waves in a sample of six railroad
-    rails.  There were three measurements of the travel time on each
-    rail. The order of the rails is by increasing mean travel time.}
-  \label{fig:Raildotplot}
-\end{figure}
-
-In building a model for these data
-<<Raildata>>=
-Rail
-@
-we wish to characterize a typical travel time, say $\mu$, for the
-population of such railroad rails and the deviations, say
-$b_i,i=1,\dots,6$ of the individual rails from this population mean.
-Because these specific rails are not of interest by themselves as much
-as the variation in the population we model the $b_i$, which are
-called the ``random effects'' for the rails, as having a normal
-(also called ``Gaussian'') distribution of the form $\mathcal{N}(0,\sigma^2_b)$.  The
-$j$th measurement on the $i$th rail is expressed as
-\begin{equation}
-  \label{eq:1}
-  y_{ij}=\mu+b_i+\epsilon_{ij}\quad
-  b_i\sim\mathcal{N}(0,\sigma^2_b),\epsilon_{ij}\sim\mathcal{N}(0,\sigma^2)
-  \quad  i=1,\dots,6\;j=1,\dots,3
-\end{equation}
-
-The parameters of this model are $\mu$, $\sigma^2_b$ and $\sigma^2$.
-Technically the $b_i,i=1,\dots,6$ are not parameters but instead are
-considered to be unobserved random variables for which we form
-``predictions'' instead of ``estimates''.
-
-To express generalizations of models like (\ref{eq:1}) more
-conveniently we switch to a matrix/vector
-representation in which the 18 observations of the travel time form
-the response vector $\bm{y}$, the fixed-effect parameter $\mu$ forms a
-$1$-dimensional column vector $\bm{\beta}$ and the six random effects
-$b_i,i=1,\dots,6$ form the random effects vector $\bm{b}$.  The
-structure of the data and the values of any covariates (none are used
-in this model) are used to create model matrices $\bm{X}$ and
-$\bm{Z}$.
-
-Using these vectors and matrices and the $18$-dimensional vector
-$\bm{\epsilon}$ that represents the per-observation noise terms the
-model becomes
-\begin{equation}
-  \label{eq:lmm}
-  \bm{y} = \bm{X}\bm{\beta} + \bm{Z}\bm{b}+\bm{\epsilon},\quad
-  \bm{\epsilon}\sim\mathcal{N},\left(\bm{0},\sigma^2\bm{I}\right),\;
-  \bm{b}\sim\mathcal{N}\left(\bm{0},\sigma^2\bm{\Sigma}\right)\;\mathrm{and}\;
-  \bm{b}\perp\bm{\epsilon}
-\end{equation}
-
-In the general form we write $p$ for the dimension of $\bm{\beta}$,
-the fixed-effects parameter vector, and $q$ for the dimension of
-$\bm{b}$, the vector of random effects.  Thus the model matrix
-$\bm{X}$ has dimension $n\times p$, the model matrix $\bm{Z}$ has
-dimension $n\times q$ and the relative variance-covariance matrix,
-$\bm{\Sigma}$, for the random-effects has dimension $q\times q$.  The
-symbol $\perp$ indicates independence of random variables and
-$\mathcal{N}$ denotes the multivariate normal (Gaussian) distribution.
-
-We say that matrix $\bm{\Sigma}$ is the relative variance-covariance
-matrix of the random effects in the sense that it is the variance of
-$\bm{b}$ relative to $\sigma^2$, the scalar variance of the
-per-observation noise term $\bm{\epsilon}$.  Although it size, $q$,
-can be very large, $\bm{\Sigma}$ is highly structured.  It is
-symmetric, positive semi-definite and zero except for the diagonal
-elements and certain elements close to the diagonal.
-
-\subsection{Fitting the model and examining the results}
-\label{sec:fitt-model-exam}
-
-The maximum likelihood estimates for parameters in model (\ref{eq:1})
-fit to the \code{Rail} data are obtained as
-<<RailfitML>>=
-Rm1ML <- lmer(travel ~ 1 + (1|Rail), Rail, REML = FALSE, verbose = TRUE)
-@
-In this fit we have set \code{verbose = TRUE}
-indicating that information on the progress of the iterations should
-be printed after every iteration.  Each line gives the iteration
-number, the value of the deviance (negative twice the log-likelihood)
-and the value of the parameter $s$ which is the standard deviation of
-the random effects relative to the standard deviation of the
-residuals.
-
-The printed form of the model
-<<Railfitshow>>=
-Rm1ML
-@
-provides additional information about the parameter estimates and some
-of the measures of the fit such as the log-likelihood (-64.28), the
-deviance for the maximum likelihood criterion (128.6), the deviance
-for the REML criterion (122.2), Akaike's Information Criterion
-(AIC$=132.6$) and Schwartz's Bayesian Information Criterion
-(BIC$=134.3$).
-
-The transpose of the model matrix $\bm{Z}$ is stored as a sparse matrix in the \code{Zt}
-slot of the fitted model.  For this model $\bm{Z}$ is simply the matrix
-of indicators of the levels of the \code{Rail}.
-<<ZXytRail>>=
-Rm1ML at Zt
-as(Rail[["Rail"]], "sparseMatrix")
-@
-The elements represented as `.' in the output are known to be zero and
-are not stored explicitly.
-
-The \code{L} component of this fitted model is a Cholesky
-factorization of a matrix $\bm{A}^*(\bm{\theta})$ where $\bm{\theta}$ is a
-parameter vector determining $\bm{\Sigma}(\bm{\theta})$.  This matrix can be
-factored as $\bm{\Sigma}=\bm{T}\bm{S}\bm{S}\bm{T}\trans$, where
-$\bm{T}$ is a unit, lower triangular matrix (that is, all the elements
-above the diagonal are zero and all the elements on the diagonal are
-unity) and $\bm{S}$ is a diagonal matrix with non-negative elements on
-the diagonal.
-The matrix $\bm{A}^*(\bm{\theta})$ is
-\begin{equation}
-  \label{eq:Astar}
-  \begin{split}
-    \bm{A}^*(\bm{\theta})&=
-    \begin{bmatrix}
-      {\bm{Z}^*}\trans\bm{Z}^*+\bm{I} & {\bm{Z}^*}\trans\bm{X} &
-      -{\bm{Z}^*}\trans\bm{y}\\
-      {\bm{X}}\trans\bm{Z}^* & {\bm{X}}\trans\bm{X} &
-      -{\bm{X}}\trans\bm{y}\\
-      -{\bm{y}}\trans\bm{Z}^*&-{\bm{y}}\trans\bm{X}&{\bm{y}}\trans\bm{y}
-    \end{bmatrix} \\
-    &=
-    \begin{bmatrix}
-      \bm{T}\trans\bm{S} & \bm{0} & \bm{0} \\
-      \bm{0} & \bm{I} & \bm{0} \\
-      \bm{0} & \bm{0} & 1
-    \end{bmatrix}
-    \bm{A}
-    \begin{bmatrix}
-      \bm{S}\bm{T} & \bm{0} & \bm{0} \\
-      \bm{0} & \bm{I} & \bm{0} \\
-      \bm{0} & \bm{0} & 1
-    \end{bmatrix}+
-    \begin{bmatrix}
-      \bm{I} & \bm{0} & \bm{0} \\
-      \bm{0} & \bm{0} & \bm{0} \\
-      \bm{0} & \bm{0} & 0
-    \end{bmatrix} .
-  \end{split}
-\end{equation}
-
-For model (\ref{eq:1}) the matrices $\bm{T}$ and $\bm{S}$ are
-particularly simple, $\bm{T}=\bm{I}_6$, the $6\times 6$ identity matrix
-and $\bm{S}=s_{1,1}\bm{I}_6$ where $s_{1,1}=\sigma_b/\sigma$ is the
-standard deviation of the random effects relative to the standard
-deviation of the per-observation noise term $\bm{\bm{\epsilon}}$.
-
-<<fewdigits,echo=FALSE,results=hide>>=
-op <- options(digits=4)
-@
-The Cholesky decomposition of $\bm{A}^*$ is a lower triangular sparse
-matrix $\bm{L}$
-<<LRm1ML>>=
-as(Rm1ML at L, "sparseMatrix")
-@
-<<unfewdigits,echo=FALSE,results=hide>>=
-options(op)
-@
-
-As explained in later sections the matrix $\bm{L}$ provides all the
-information needed to evaluate the ML deviance or the REML deviance as
-a function of $\bm{\theta}$.  The components of the deviance are given
-in the \code{deviance} slot of the fitted model
-<<devRm1ML>>=
-Rm1ML at deviance
-@
-The element labelled \code{ldL2} is the logarithm of the square of
-the determinant of the upper left $6\times 6$ section of $\bm{L}$.
-This corresponds to
-$\log\left|{\bm{Z}^*}\trans\bm{Z}^*+\bm{I}_q\right|$ where
-$\bm{Z}^*=\bm{Z}\bm{T}\bm{S}$.  We can verify that the value
-$27.38292$ can indeed be calculated in this way.
-<<ldZRm1ML>>=
-L <- as(Rm1ML at L, "sparseMatrix")
-2 * sum(log(diag(L)))
-@
-
-The \code{lr2} element of the \code{deviance} slot is the logarithm of
-the penalized residual sum of squares.  It can be calculated as the
-logarithm of the square of the last diagonal element in $\bm{L}$.
-<<lr2Rm1ML>>=
-(RX <- Rm1ML at RX)
-@
-
-For completeness we mention that the \code{ldRX2} element of the
-\code{deviance} slot is the logarithm of the product of the squares of
-the diagonal elements of $\bm{L}$ corresponding to columns of $\bm{X}$.
-<<ldXRm1ML>>=
-2 * sum(log(diag(Rm1ML at RX)))
-@
-This element is used in the calculation of the REML criterion.
-
-Another slot in the fitted model object is \code{dims}, which
-contains information on the dimensions of the model and some of the
-characteristics of the fit.
-<<dimsRm1ML>>=
-(dd <- Rm1ML at dims)
-@
-We can reconstruct the ML estimate of the residual variance as the
-penalized residual sum of squares divided by the number of
-observations.
-%% FIXME: This is not calculated correctly in the current version of lmer
-<<s2Rm1ML>>=
-Rm1ML at deviance["pwrss"]/dd["n"]
-@
-
-The \emph{profiled deviance} function
-\begin{equation}
-  \label{eq:2}
-  \begin{aligned}
-  \tilde{\mathcal{D}}(\bm{\theta}) &= \log\left|{\bm{Z}^*}\trans\bm{Z}^*+\bm{I}_q\right| +
-  n \log \left(1+\frac{2\pi r^2}{n}\right) \\
-  &= n\left[1+\log\left(\frac{2\pi}{n}\right)\right] +
-    \log\left|{\bm{Z}^*}\trans\bm{Z}^*+\bm{I}_q\right| + n\log r^2
-  \end{aligned}
-\end{equation}
-is a function of $\bm{\theta}$ only.  In this case
-$\bm{\theta}=\sigma_1$, the relative standard deviation of the random
-effects, is one-dimensional. The maximum likelihood estimate (mle) of
-$\bm{\theta}$ minimizes the profiled deviance. The mle's of all the
-other parameters in the model can be derived from the estimate of this
-parameters.
-
-<<devcomp,echo=FALSE,fig=TRUE,width=8, height=6>>=
-mm <- Rm1ML
-sg <- seq(0, 20, len = 101)
-dev <- mm at deviance
-nc <- length(dev)
-nms <- names(dev)
-vals <- matrix(0, nrow = length(sg), ncol = nc, dimnames = list(NULL, nms))
-for (i in seq(along = sg)) {
-    .Call(lme4.0:::mer_ST_setPars, mm, sg[i])
-    .Call(lme4.0:::mer_update_L, mm)
-    res <- try(.Call(lme4.0:::mer_update_RX, mm), silent = TRUE)
-    if (inherits(res, "try-error")) {
-        vals[i,] <- NA
-    } else {
-        .Call(lme4.0:::mer_update_ranef, mm)
-        vals[i,] <- mm at deviance
-    }
-}
-print(xyplot(ML + ldL2 + sigmaML + pwrss + disc + usqr ~ sg, as.data.frame(vals),
-             type = c("g", "l"), outer = TRUE, layout = c(1,6),
-             xlab = expression(sigma[b]/sigma), ylab = NULL,
-             scales = list(x = list(axs = 'i'),
-             y = list(relation = "free", rot = 0)),
-             strip = FALSE, strip.left = TRUE))
-@
-
-The term $n\left[1+\log\left(2\pi/n\right)\right]$ in (\ref{eq:2})
-does not depend on $\bm{\theta}$.  The other two terms,
-$\log\left|{\bm{Z}^*}\trans\bm{Z}^*+\bm{I}_q\right|$ and $n\log r^2$,
-measure the complexity of the model and the fidelity of the fitted
-values to the observed data, respectively.  We plot the value of each
-of the varying terms versus $\sigma_1$ in Figure~\ref{fig:devcomp}.
-\begin{figure}[tb]
-  \centering
-  \includegraphics{Implementation-devcomp}
-  \caption{The profiled deviance, and those components of the profiled
-    deviance that vary with $\bm{\theta}$, as a function of
-    $\bm{\theta}$ in model \code{Rm1ML} for the \code{Rail} data. In
-    this model the parameter $\bm{\theta}$ is the scalar $\sigma_1$,
-    the standard deviation of the random effects relative to the
-    standard deviation of the per-observation noise.}
-  \label{fig:devcomp}
-\end{figure}
-
-The component $\log\left|\bm{S}\bm{Z}\trans\bm{Z}\bm{S}+\bm{I}\right|$
-has the value $0$ at $\sigma_1=0$ and increases as $\sigma_1$
-increases.  It is unbounded as $\sigma_1\rightarrow\infty$.  The
-component $n\log\left(r^2\right)$ has a finite value at $\sigma_1=0$
-from which it decreases as $\sigma_1$ increases.  The value at
-$\sigma_1=0$ corresponds to the residual sum of squares for the
-regression of $\bm{y}$ on the columns of $\bm{X}$.
-<<leftlr2>>=
-18 * log(deviance(lm(travel ~ 1, Rail)))
-@
-As $\sigma_1\rightarrow\infty$, $n\log\left(r^2\right)$ approaches the
-value corresponding to the residual sum of squares for the regression
-of $\bm{y}$ on the columns of $\bm{X}$ and $\bm{Z}$.  For this model
-that is
-<<rightlr2>>=
-18 * log(deviance(lm(travel ~ Rail, Rail)))
-@
-
-<<devcomp2,echo=FALSE,fig=TRUE,height=4>>=
-mm <- Rm1ML
-sg <- seq(3.75, 8.25, len = 101)
-vals <- matrix(0, nrow = length(sg), ncol = length(dev),
-               dimnames = list(NULL, names(dev)))
-for (i in seq_along(sg)) {
-    .Call(lme4.0:::mer_ST_setPars, mm, sg[i])
-    .Call(lme4.0:::mer_update_L, mm)
-    res <- try(.Call(lme4.0:::mer_update_RX, mm), silent = TRUE)
-    if (inherits(res, "try-error")) {
-        vals[i,] <- NA
-    } else {
-        .Call(lme4.0:::mer_update_ranef, mm)
-        vals[i,] <- mm at deviance
-    }
-}
-print(xyplot(ML + ldL2 + I(18 * log(pwrss)) ~ sg, as.data.frame(vals),
-             type = c("g", "l"), outer = TRUE, layout = c(1,3),
-             xlab = expression(sigma[b]/sigma), ylab = NULL,
-             scales = list(x = list(axs = 'i'),
-             y = list(relation = "free", rot = 0)),
-             strip = FALSE, strip.left = TRUE))
-## base <- 18 * log(deviance(lm(travel ~ Rail, Rail)))
-## print(xyplot(devC + ldL2 ~ sg,
-##              data.frame(ldL2 = vals[,"ldL2"], devC =
-##                         vals[, "ldL2"] + mm at dims['n'] * log(vals[, "pwrss"]) - base,
-##                         sg = sg), type = c("g", "l"),
-##              scales = list(x = list(axs = 'i')), aspect = 1.8,
-##              xlab = expression(sigma[1]), ylab = "Shifted deviance",
-##              auto.key = list(text = c("deviance", "log|SZ'ZS + I|"),
-##              space = "right", points = FALSE, lines = TRUE)))
-@
-\begin{figure}[tb]
-  \centering
-  \includegraphics[width=0.7\textwidth]{Implementation-devcomp2}
-  \caption{The part of the deviance that varies with $\sigma_1$ as a
-    function of $\sigma_1$ near the optimum.  The component
-    $\log\left|\bm{S}\bm{Z}\trans\bm{Z}\bm{S}+\bm{I}\right|$ is shown
-    at the bottom of the frame.  This is the component of the deviance
-    that increases with $\sigma_1$.  Added to this component is
-    $n\log\left[r^2(\sigma_1)\right] - n\log\left[r^2(\infty)\right]$,
-    the comonent of the deviance that decreases as $\sigma_1$
-    increases.  Their sum is minimized at $\widehat{\sigma}_1=5.626$.}
-  \label{fig:devcomp2}
-\end{figure}
-
-\section{Multiple random effects per level}
-\label{sec:multiple-re}
-
-<<sleepxyplot,fig=TRUE,echo=FALSE,height=7,width=7>>=
-print(xyplot(Reaction ~ Days | Subject, sleepstudy, aspect = "xy",
-             layout = c(6,3), type = c("g", "p", "r"),
-             index.cond = function(x,y) coef(lm(y ~ x))[1],
-             xlab = "Days of sleep deprivation",
-             ylab = "Average reaction time (ms)"))
-@
-
-The \code{sleepstudy} data are an example of longitudinal data.  That
-is, they are repeated measurements taken on the same experimental units
-over time.  A plot of reaction time versus days of sleep deprivation
-by subject (Figure~\ref{fig:sleepxyplot})
-\begin{figure}[tbp]
-  \centerline{\includegraphics[width=\textwidth]{Implementation-sleepxyplot}}
-  \caption{A lattice plot of the average reaction time versus number
-    of days of sleep deprivation by subject for the \code{sleepstudy}
-    data.  Each subject's data are shown in a separate panel, along
-    with a simple linear regression line fit to the data in that
-    panel.  The panels are ordered, from left to right along rows
-    starting at the bottom row, by increasing intercept of these
-    per-subject linear regression lines.  The subject number is given
-    in the strip above the panel.}
-  \label{fig:sleepxyplot}
-\end{figure}
-shows there is considerable variation between subjects in both the
-intercept and the slope of the linear trend.
-
-The model
-<<sm1>>=
-print(sm1 <- lmer(Reaction ~ Days + (Days|Subject), sleepstudy))
-@
-provides for fixed effects for the intercept (the typical reaction
-time in the population after zero days of sleep deprivation) and the
-slope with respect to \code{Days} (the typical change in reaction time
-per day of sleep deprivation).  In addition there are random effects
-per subject for both the intercept and the slope parameters.
-
-<<sm1struct,fig=TRUE,echo=FALSE,height=3,width=6>>=
-print(image(tcrossprod(sm1 at A), colorkey = FALSE, sub = NULL),
-      split = c(1,1,2,1), more = TRUE)
-print(image(sm1 at L, colorkey = FALSE, sub = NULL),
-      split = c(2,1,2,1))
-@
-
-With two random effects per subject the matrix $\bm{\Sigma}$ for this
-model is $36\times 36$ with $18$ $2\times 2$ diagonal blocks.  The
-matrix $\bm{A}$ is $39\times 39$ as is $\bm{L}$, the Cholesky factor
-of $\bm{A}^*$.  The structure of $\bm{A}$ and of $\bm{L}$ are shown in
-Figure~\ref{fig:ALstruct}.
-\begin{figure}[tbp]
-  \centerline{\includegraphics[width=\textwidth]{Implementation-sm1struct}}
-  \caption{Structure of the sparse matrices $\bm{A}$ (left panel) and
-    $\bm{L}$ (right panel) for the model \code{sm1}.  The non-zero
-    elements as depicted as gray squares with larger magnitudes shown as
-    darker gray.}
-    \label{fig:ALstruct}
-\end{figure}
-For this model (as for all models with a
-single random effects expression) the structure of $\bm{L}$ is
-identical to that of the lower triangle of $\bm{A}$.
-
-Like the \code{Rail} data, the \code{sleepstudy} data are balanced in
-that each subject's reaction time is measured the same number of times
-and at the same times.  One consequence of the balance is that the
-diagonal blocks in the first $36$ rows of $\bm{A}$ are identical as
-are those in the first $36$ rows of $\bm{L}$.
-<<A1212>>=
-as(sm1 at L, "sparseMatrix")[1:2,1:2]
-sm1 at RX
-@
-
-The determinant quantities in
-<<sm1deviance>>=
-sm1 at deviance
-@
-are derived from the diagonal elements of $\bm{L}$.  \code{ldZ} is the
-logarithm of square of the product of the first $36$ elements of the
-diagonal, \code{ldX} is the logarithm of the square of the product of
-the $37$th and $38$th elements and \code{lr2} is the logarithm of the
-square of the $39$th element.
-<<sm1reconstruct>>=
-sm1 at RX
-str(dL <- diag(as(sm1 at L, "sparseMatrix")))
-c(ldL2 = sum(log(dL^2)), ldRX2 = sum(log(diag(sm1 at RX)^2)), log(sm1 at deviance["pwrss"]))
-@
-
-The $36\times 36$ matrices $\bm{S}$, $\bm{T}$ and
-$\bm{\Sigma}=\bm{T}\bm{S}\bm{S}\bm{T}\trans$ are block-diagonal
-consisting of $18$ identical $2\times 2$ diagonal blocks.  The
-template for the diagonal blocks of $\bm{S}$ and $\bm{T}$ is stored as
-a single matrix
-<<sm1ST>>=
-show(st <- sm1 at ST[[1]])
-@ %$
-where the diagonal elements are those of $\bm{S}$ and the strict lower
-triangle is that of $\bm{T}$.
-
-The \code{VarCorr} generic function extracts the estimates of the
-variance-covariance matrices of the random effects. Because model
-\code{sm1} has a single random-effects expression there is only one
-estimated variance-covariance matrix
-<<sm1VarCorr>>=
-show(vc <- VarCorr(sm1))
-@
-The \code{"sc"} attribute of this matrix is the estimate of $\sigma$,
-the standard deviation of the per-observation noise term.
-
-We can reconstruct this variance-covariance estimate as
-<<sm1VarCorrRec>>=
-T <- st
-diag(T) <- 1
-S <- diag(diag(st))
-T
-S
-T %*% S %*% S %*% t(T) * attr(vc, "sc")^2
-@
-
-\section{A model with two nested grouping factors}
-\label{sec:nested}
-
-
-<<Oatsxy,fig=TRUE,echo=FALSE,height=4,width=6>>=
-data(Oats, package = 'MEMSS')
-print(xyplot(yield ~ nitro | Block, Oats, groups = Variety, type = c("g", "b"),
-             auto.key = list(lines = TRUE, space = 'top', columns = 3),
-             xlab = "Nitrogen concentration (cwt/acre)",
-             ylab = "Yield (bushels/acre)",
-             aspect = 'xy'))
-@
-The \code{Oats} data from the \code{nlme} package came from an
-experiment in which fields in $6$ different locations (the
-\code{Block} factor) were each divided into three plots and each of
-these $18$ plots was further subdivided into four subplots.  Three
-varieties of oats were assigned randomly to the three plots in each
-block and four concentrations of fertilizer (measured as nitrogen
-concentration) were randomly assigned to the subplots in each plot.
-The yield on each subplot is the response shown in
-Figure~\ref{fig:Oatsxy}.
-\begin{figure}[tbp]
-  \centerline{\includegraphics[width=\textwidth]{Implementation-Oatsxy}}
-  \caption{Yield of oats versus applied concentration of nitrogen
-    fertilizer for three different varieties of oats in 6 different
-    locations.}
-  \label{fig:Oatsxy}
-\end{figure}
-
-The fitted model \code{Om1}
-<<Om1>>=
-print(Om1 <- lmer(yield ~ nitro + Variety + (1|Block/Variety), Oats), corr = FALSE)
-@
-provides fixed effects for the nitrogen concentration and for the
-varieties (coded as differences relative to the reference variety
-``Golden Rain'') and random effects for each block and for each plot
-within the block.  In this case a plot can be indexed by the
-combination of variety and block, denoted
-\code{Variety:Block} in the output.  Notice that there are 18 levels of this
-grouping factor corresponding to the 18 unique combinations of variety
-and block.
-
-A given plot occurs in one and only one block.  We say that the plot
-grouping factor is \emph{nested within} the block grouping factor.
-The structure of the matrices $\bm{A}$ and $\bm{L}$ for this model
-(Figure~\ref{fig:Om1struct})
-<<Om1struct,fig=TRUE,echo=FALSE,height=3,width=6>>=
-print(image(tcrossprod(Om1 at A), colorkey = FALSE, sub = NULL),
-       split = c(1,1,2,1), more = TRUE)
-print(image(Om1 at L, colorkey = FALSE, sub = NULL), split = c(2,1,2,1))
-@
-\begin{figure}[tbp]
-  \centerline{\includegraphics[width=\textwidth]{Implementation-Om1struct}}
-  \caption{Structure of the sparse matrices $\bm{A}$ (left panel) and
-    $\bm{L}$ (right panel) for the model \code{Om1}.}
-    \label{fig:Om1struct}
-\end{figure}
-In the matrix $\bm{A}$ the first $18$ rows and columns correspond to
-the $18$ random effects ($1$ random effect for each of the $18$ levels
-of this grouping factor).  The next $6$ rows and columns correspond to
-the $6$ random effects for block ($6$ levels and $1$ random effect per
-level). The off-diagonal elements in rows $19$ to $24$ and columns $1$
-to $18$ indicate which plots and blocks are observed simultaneously.
-Because the plot grouping factor is nested within the block grouping
-factor there will be exactly one nonzero in the rows $19$ to $24$ for
-each of the columns $1$ to $18$.
-
-For this model the fixed-effects specification includes indicator
-vectors with systematic zeros.  These appear as systematic zeros in
-rows $27$ and $28$ of $\bm{A}$ and $\bm{L}$.  The statistical analysis
-of model \code{Om1} indicates that the systematic effect of the
-\code{Variety} factor is not significant and we could omit it from the
-model, leaving us with
-<<Om1a>>=
-print(Om1a <- lmer(yield ~ nitro + (1|Block/Variety), Oats), corr = FALSE)
-@
-with matrices $\bm{A}$ and $\bm{L}$ whose patterns are shown in
-Figure~\ref{fig:Om1astruct}.
-<<Om1astruct,fig=TRUE,echo=FALSE,height=3,width=6>>=
-print(image(tcrossprod(Om1a at A), colorkey = FALSE, sub = NULL),
-      split = c(1,1,2,1), more = TRUE)
-print(image(Om1a at L, colorkey = FALSE, sub = NULL), split = c(2,1,2,1))
-@
-\begin{figure}[tbp]
-  \centerline{\includegraphics[width=\textwidth]{Implementation-Om1astruct}}
-  \caption{Structure of the sparse matrices $\bm{A}$ (left panel) and
-    $\bm{L}$ (right panel) for the model \code{Om1a}.}
-    \label{fig:Om1astruct}
-\end{figure}
-
-In Figures~\ref{fig:Om1struct} and \ref{fig:Om1astruct} the pattern in
-$\bm{L}$ is different from that of the lower triangle of $\bm{A}$ but
-only because a permutation has been applied to the rows and columns of
-$\bm{A}^*$ before computing the Cholesky decomposition.  The effect of
-this permutation is to isolate connected blocks of rows and columns
-close to the diagonal.
-
-The isolation of connected blocks close to the diagonal is perhaps
-more obvious when the model multiple random-effects expressions based on
-the same grouping factor.  This construction is used to model
-independent random effects for each level of the grouping factor.
-
-For example, the random effect for the intercept and the random effect
-for the slope in the sleep-study data could reasonably be modeled as
-independent, as in the model
-<<sm2>>=
-print(sm2 <- lmer(Reaction ~ Days + (1|Subject) + (0+Days|Subject), sleepstudy), corr = FALSE)
-@
-
-The structures of the matrices $\bm{A}$ and $\bm{L}$ for this model are
-shown in Figure~\ref{fig:sm2struct}.
-<<sm2struct,fig=TRUE,echo=FALSE,height=3,width=6>>=
-print(image(tcrossprod(sm2 at A), colorkey = FALSE, sub = NULL),
-      split = c(1,1,2,1), more = TRUE)
-print(image(sm2 at L, colorkey = FALSE, sub = NULL), split = c(2,1,2,1))
-@
-\begin{figure}[tbp]
-  \centerline{\includegraphics[width=\textwidth]{Implementation-sm2struct}}
-  \caption{Structure of the sparse matrices $\bm{A}$ (left panel) and
-    $\bm{L}$ (right panel) for the model \code{sm2}.}
-    \label{fig:sm2struct}
-\end{figure}
-
-The first $18$ elements of $\bm{b}$ are the random effects for the
-intercept for each of the $18$ subjects followed by the random effects
-for the slopes for each of the $18$ subjects.  The (0-based)
-permutation vector applied to the rows and columns of $\bm{A}^*$
-before taking the decomposition is
-<<sm2perm>>=
-str(sm2 at L@perm)
-@
-This means that, in the 1-based indexing system used in R, the
-permutation will pair up the first and $19$th rows and columns, the $2$nd and
-$20$th rows and columns, etc. resulting in the pattern for $\bm{L}$
-shown in Figure~\ref{fig:sm2struct}
-
-Figure~\ref{fig:Oatsxy} indicates that the slope of yield versus
-nitrogen concentration may depend on the block but not the plot within
-the block.  We can fit such a model as
-<<Om2>>=
-print(Om2 <- lmer(yield ~ nitro + (1|Variety:Block) + (nitro|Block), Oats), corr = FALSE)
-@
-The structures of the matrices $\bm{A}$ and $\bm{L}$ for this
-model are shown in Figure~\ref{fig:Om2struct}.
-<<Om2struct,fig=TRUE,echo=FALSE,height=3,width=6>>=
-print(image(tcrossprod(Om2 at A), colorkey = FALSE, sub = NULL),
-      split = c(1,1,2,1), more = TRUE)
-print(image(Om2 at L, colorkey = FALSE, sub = NULL),
-      split = c(2,1,2,1))
-@
-\begin{figure}[tbp]
-  \centerline{\includegraphics[width=\textwidth]{Implementation-Om2struct}}
-  \caption{Structure of the sparse matrices $\bm{A}$ (left panel) and
-    $\bm{L}$ (right panel) for the model \code{Om2}.}
-    \label{fig:Om2struct}
-\end{figure}
-
-We see that the only difference in the structure of the $\bm{A}$
-matrices from models \code{Om1a} and \code{Om2} is that rows and
-columns $19$ to $24$ from model \code{Om1a} have been replicated.
-Thus the $1\times 1$ blocks on the diagonal of $\bm{A}$ in positions
-$19$ to $24$ for model \code{Om1a} become $2\times 2$ blocks in
-model \code{Om2}.
-
-This replication of rows associated with levels of the \code{Block}
-factor carries over to the matrix $\bm{L}$.
-
-The property of being nested or not is often attributed to random
-effects.  In fact, nesting is a property of the grouping factors with
-whose levels the random effects are associated.  In both models
-\code{Om1a} and \code{Om2} the \code{Variety:Block} factor is nested
-within \code{Block}.  If the grouping factors in the random effects
-terms in a model form a nested sequence then the matrix $\bm{A}^*$ and
-its Cholesky decomposition $\bm{L}$ will have the property that the
-number of nonzeros in $\bm{L}$ is the same as the number of nonzeros
-in the lower triangle of $\bm{A}^*$.  That is, there will be no
-``fill-in'' generating new nonzero positions when forming the Cholesky
-decomposition.
-
-To check this we can examine the number of nonzero elements in
-$\bm{A}$ and $\bm{L}$ for these models.  Because the matrix $\bm{A}$
-is stored as a symmetric matrix with only the non-redundant elements
-being stored explicitly, the number of stored nonzeros in these two
-matrices are identical.
-<<Om2ALsize>>=
-length(tcrossprod(Om2 at A)@x)
-length(Om2 at L@x)
-@
-
-\section{Non-nested grouping factors}
-\label{sec:non-nested}
-
-When grouping factors are not nested they are said to be ``crossed''.
-Sometimes we will distinguish between \code{partially crossed}
-grouping factors and \code{completely crossed} grouping factors.  When
-two grouping factors are completely crossed, every level of the first
-factor occurs at least once with every level of the second factor -
-creating matrices $\bm{A}$ and $\bm{L}$ with dense off-diagonal blocks.
-
-In observational studies it is more common to encounter partially
-crossed grouping factors.  For example, the \code{ScotsSec} data in
-the \code{mlmRev} package provides the attainment scores of 3435
-students in Scottish secondary schools as well as some demographic
-information on the students and an indicator of which secondary school
-and which primary school the student attended.
-<<ScotsSec>>=
-str(ScotsSec)
-@
-
-If we use both \code{primary} and \code{second} as grouping factors
-for random effects in a model the only possibility for these factors
-to form a nested sequence is to have \code{primary} nested within
-\code{second} (because there are 148 levels of \code{primary} and 19 levels
-of \code{second}).  We could check if these are nested by doing a
-cross-tabulation of these factors but it is easier to fit an initial model
-<<Sm1>>=
-print(Sm1 <- lmer(attain ~ verbal * sex + (1|primary) + (1|second), ScotsSec), corr = FALSE)
-@
-and examine the \code{"nest"} element of the \code{dims} slot.
-<<Sm1dims>>=
-Sm1 at dims
-@
-We see that these grouping factors are not nested.  That is, some of
-the elementary schools sent students to more than one secondary school.
-
-Now that we know the answer we can confirm it by checking the first
-few rows of the cross-tabulation
-<<Scotscrosstab>>=
-head(xtabs(~ primary + second, ScotsSec))
-@
-We see that primary schools 1, 4 and 6 each occurred with multiple secondary schools.
-
-For non-nested grouping factors like these, the structure of $\bm{A}$
-and $\bm{L}$, shown in Figure~\ref{fig:Sm1struct}
-<<Sm1struct,fig=TRUE,echo=FALSE,height=3,width=6>>=
-print(image(tcrossprod(Sm1 at A), colorkey = FALSE, sub = NULL),
-      split = c(1,1,2,1), more = TRUE)
-print(image(Sm1 at L, colorkey = FALSE, sub = NULL), split = c(2,1,2,1))
-@
-\begin{figure}[tbp]
-  \centerline{\includegraphics[width=\textwidth]{Implementation-Sm1struct}}
-  \caption{Structure of the sparse matrices $\bm{A}$ (left panel) and
-    $\bm{L}$ (right panel) for the model \code{Sm1}.}
-    \label{fig:Sm1struct}
-\end{figure}
-is more complex than for nested grouping factors.  The matrix $\bm{A}$
-has a $148\times 148$ diagonal block in the upper left, corresponding
-the the $148$ levels of the \code{primary} factor, followed on the
-diagonal by a $19\times 19$ diagonal block corresponding to the $19$
-levels of the \code{second} factor.  However, the off-diagonal block
-in rows $149$ to $167$ and columns $1$ to $148$ does not have a simple
[TRUNCATED]

To get the complete diff run:
    svnlook diff /svnroot/lme4 -r 1831


More information about the Lme4-commits mailing list