[Lme4-commits] r1897 - www/lMMwR
noreply at r-forge.r-project.org
noreply at r-forge.r-project.org
Fri Feb 9 21:33:52 CET 2018
Author: mmaechler
Date: 2018-02-09 21:33:50 +0100 (Fri, 09 Feb 2018)
New Revision: 1897
Added:
www/lMMwR/TODO.org
Removed:
www/lMMwR/lrgprt.pdf
Modified:
www/lMMwR/ChCovariates.Rnw
www/lMMwR/ChGLMMBinomial.Rnw
www/lMMwR/ChIntro.Rnw
www/lMMwR/ChLongitudinal.Rnw
www/lMMwR/ChMultiple.Rnw
www/lMMwR/ChProfPairs.Rnw
www/lMMwR/ChTheory.Rnw
www/lMMwR/DMBmacros.tex
www/lMMwR/Makefile
www/lMMwR/acronym.tex
www/lMMwR/lMMwR.bib
www/lMMwR/lMMwR.tex
Log:
updated R code (from 2012 lme4.0) to current lme4 (apart from the "modular part" in ChTheory)
add *.bib -- remove lrgprt.pdf from svn for now
tweaks in Makefile, *macros.tex
+ TODO
Modified: www/lMMwR/ChCovariates.Rnw
===================================================================
--- www/lMMwR/ChCovariates.Rnw 2016-07-05 11:00:26 UTC (rev 1896)
+++ www/lMMwR/ChCovariates.Rnw 2018-02-09 20:33:50 UTC (rev 1897)
@@ -1,10 +1,11 @@
-\SweaveOpts{engine=R,eps=FALSE,pdf=TRUE,width=8,strip.white=all}
+\SweaveOpts{engine=R, eps=FALSE, pdf=TRUE, width=8, strip.white=all}
\SweaveOpts{keep.source=TRUE}
-\SweaveOpts{prefix=TRUE,prefix.string=figs/Covariates,include=TRUE}
+\SweaveOpts{prefix=TRUE, prefix.string=figs/Covariates, include=TRUE}
\setkeys{Gin}{width=\textwidth}
+%% MM: Have we made sure the reader sees this in one place ?
<<preliminaries,echo=FALSE,print=FALSE,results=hide>>=
-options(show.signif.stars = FALSE,
+options(show.signif.stars = FALSE,# <- Doug
lattice.theme = function() canonical.theme("pdf", color = FALSE),
str = strOptions(strict.width = "cut"))
library(splines)
@@ -12,7 +13,7 @@
library(Matrix)
library(Rcpp)
library(minqa)
-library(lme4a)
+library(lme4)
data(Machines, package = "MEMSS")
data(ergoStool, package = "MEMSS")
.f4 <- "%.4f"
@@ -68,7 +69,7 @@
})
save(ratbrain, file = "ratbrain.rda")
}
-@
+@
%% Practical issues: Number of levels of a factor
%% Model building - the tendency to over-model
@@ -98,7 +99,7 @@
\label{sec:categoricalcov}
As we have seen in the earlier chapters, each random-effects term
-ina linear mixed model is defined with respect to a \emph{grouping
+in a linear mixed model is defined with respect to a \emph{grouping
factor}, which is a categorical covariate whose levels are regarded
as a sample from a population of possible levels. In some cases we
may have other categorical covariates with a fixed set of levels (such
@@ -120,7 +121,7 @@
<<strMachines>>=
str(Machines)
xtabs(~ Machine + Worker, Machines)
-@
+@
The cross-tablulation shows that each of the six operators used each
of the three machines on three occasions producing replicate
@@ -137,12 +138,12 @@
\centering
<<Machinesplot,fig=TRUE,echo=FALSE,height=3.5>>=
print(dotplot(reorder(Worker, score) ~ score, Machines,
- groups = Machine,
- xlab = "Quality and productivity score",
- ylab = "Worker", type = c("p","a"),
- auto.key = list(columns = 3, lines = TRUE),
+ groups = Machine,
+ xlab = "Quality and productivity score",
+ ylab = "Worker", type = c("p","a"),
+ auto.key = list(columns = 3, lines = TRUE),
jitter.y = TRUE))
-@
+@
\caption[Quality score by operator and machine]{A quality and
productivity score for each of six operators (the \code{Worker}
factor) on each of three machine types.}
@@ -173,7 +174,7 @@
fm10 <- lmer(score ~ Machine + (1|Worker) + (1|Machine:Worker),
Machines, REML=FALSE)
anova(fm08, fm09, fm10)
-@
+@
Notice that in the \code{anova} summary the order of the models has
been rearranged according to the complexity of the models, as measured
@@ -196,7 +197,7 @@
line, which is the comparison of the more complex model \code{fm09} to
the simpler model \code{fm10}, the change in the deviance is
\Sexpr{sprintf(.f4,deviance(fm10)-deviance(fm09))} at a cost of 4
-additional parameters with a p-value of
+additional parameters with a p-value of
\Sexpr{sprintf(.f1,100*pchisq(deviance(fm10)-deviance(fm09),4,lower=FALSE))}\%.
In formal hypothesis tests we establish a boundary, often chosen to
be 5\%, below which we regard the p-value as providing ``significant''
@@ -219,11 +220,11 @@
(Fig.~\ref{fig:fm10ranef})
\begin{figure}[tbp]
\centering
-<<fm10ranef,fig=TRUE,echo=FALSE,height=5.5>>=
-qrr10 <- dotplot(ranef(fm10, postVar = TRUE), strip = FALSE)
+<<fm10ranef,fig=TRUE,echo=FALSE,height=5.5>>=
+qrr10 <- dotplot(ranef(fm10, condVar = TRUE), strip = FALSE)
print(qrr10[[1]], pos = c(0,0,1,0.75), more = TRUE)
print(qrr10[[2]], pos = c(0,0.65,1,1))
-@
+@
\caption[Random effects prediction intervals for model \code{fm10}]{95\%
prediction intervals on the random effects for model
\code{fm10} fit to the \code{Machines} data. The intervals in the
@@ -274,7 +275,7 @@
can either extract the fixed-effects coefficient estimates by themselves
<<fm10fixef>>=
fixef(fm10)
-@
+@
or extract the coefficients and their standard errors and t ratios
<<coefsumfm10>>=
coef(summary(fm10))
@@ -299,31 +300,38 @@
statistics summarizing the fit are the same --- for example,
<<AICfm10a10>>=
AIC(fm10a,fm10)
-@
+@
Even the fitted values are the same
<<fittedfm10afm10,eval=FALSE>>=
all.equal(fitted(fm10a), fitted(fm10))
-@
+@
<<mufm10afm10,echo=FALSE>>=
-all.equal(fm10 at resp@mu, fm10a at resp@mu)
+all.equal(getME(fm10, "mu"), getME(fm10a, "mu"))
@
-There are several things to note about model \code{fm10a} compared to model \code{fm10}. First, we can see that the
-To see some of the approaches to removing this redundancy, including the defau
-If we suppress the intercept we can estimate one coefficient for each level of the factor
+There are several things to note about model \code{fm10a} compared to model
+\code{fm10}. First, we can see that the
+%%% ????
+To see some of the approaches to
+removing this redundancy, including the defau
+%%% ???
+If we suppress the intercept
+we can estimate one coefficient for each level of the factor
<<fm10a>>=
coef(summary(update(fm10, . ~ . + 0)))
-@
+@
unless we impose additional \code{estimability conditions} on the
-coefficients.
+coefficients.
+%%% ???
+only one of the prediction intervals for the interaction that does not
+%%% ???
-
-only one of the prediction intervals for the interaction that does not
-
-In both models \code{fm09} and \code{fm10} we incorporate the \code{Machine} factor is in
-fit models having different configurations of
+In both models \code{fm09} and \code{fm10} we incorporate the
+\code{Machine} factor
+%%% ???
+is in fit models having different configurations of
simple, scalar random effects but always with the same fixed-effects
specification of an intercept, or overall mean response, only.
@@ -354,7 +362,7 @@
groups = Type, type = c("p","a"),
xlab = "Effort to arise (Borg scale)",
auto.key = list(columns=4,lines=TRUE)))
-@
+@
\caption[Effort to arise by subject and stool type]{Subjective
evaluation of the effort required to arise (on the Borg scale) by 9
subjects, each of whom tried each of four types of stool.}
@@ -391,7 +399,7 @@
\label{sec:SubjTypeRE}
A model with random effects for both \code{Subject} and \code{Type} is
-fit in the same way that we fit such in Chap.~\ref{chap:Multiple},
+fit in the same way that we fit such in Chap.~\ref{chap:longitudinal},
<<fm06>>=
(fm06 <- lmer(effort ~ 1 + (1|Subject) + (1|Type), ergoStool, REML=FALSE))
@
@@ -436,10 +444,10 @@
\centering
<<pr06zeta,fig=TRUE,echo=FALSE,height=3.5>>=
print(xyplot(pr06, aspect=1.3, layout=c(4,1)))
-@
+@
\caption[Profile zeta plot for the parameters in model \code{fm06}]
{Profile zeta plot for the parameters in model \code{fm06} fit to the
- \code{ergoStool} data}
+ \code{ergoStool} data.}
\label{fig:pr06zeta}
\end{figure}
and the corresponding profile pairs plot (Fig.~\ref{fig:pr06pairs}).
@@ -447,10 +455,10 @@
\centering
<<pr06pairs,fig=TRUE,echo=FALSE,height=8>>=
print(splom(pr06))
-@
+@
\caption[Profile pairs plot for the parameters in model \code{fm06}]
{Profile pairs plot for the parameters in model \code{fm06} fit to the
- \code{ergoStool} data}
+ \code{ergoStool} data.}
\label{fig:pr06pairs}
\end{figure}
@@ -465,18 +473,18 @@
@
is very wide. The upper end point of this 95\% confidence interval,
\Sexpr{sprintf(f3,confint(pr06)[2,2])}, is more than twice as large as
-the estimate, $\widehat{\sigma_2}=$
+the estimate, $\widehat{\sigma_2}=$
\Sexpr{sprintf(f3, attr(VarCorr(fm06)[["Type"]], "stddev"))}.
-A plot
+A plot
\begin{figure}[tbp]
\centering
<<fm06dot,fig=TRUE,echo=FALSE,height=2>>=
-print(dotplot(ranef(fm06, postVar = TRUE), strip=FALSE)[["Type"]])
-@
+print(dotplot(ranef(fm06, condVar = TRUE), strip=FALSE)[["Type"]])
+@
\caption[Prediction intervals on the random effects for stool
type]{95\% prediction intervals on the random effects for \code{Type}
- from model \code{fm06} fit to the \code{ergoStool} data}
+ from model \code{fm06} fit to the \code{ergoStool} data.}
\label{fig:fm06dot}
\end{figure}
of the prediction intervals on the random effects for \code{Type}
@@ -510,7 +518,7 @@
fixed-effects specification.
<<fm07>>=
(fm07 <- lmer(effort ~ 1 + Type + (1|Subject), ergoStool, REML = 0))
-@
+@
It appears that the last three levels of the \code{Type} factor are
now modeled as fixed-effects parameters, in addition to the
@@ -563,7 +571,7 @@
global setting with
<<optionscontrasts>>=
getOption("contrasts")
-@
+@
Because these were the contrasts in effect when model \code{fm07} was
fit, the particular contrasts used for the \code{Type}
@@ -572,7 +580,7 @@
contr.treatment(4)
@
In this display the rows correspond to the levels of the \code{Type}
-factor and the columns correspond to the parameters labeled
+factor and the columns correspond to the parameters labeled
\code{TypeT2}, \code{TypeT3} and \code{TypeT4}.
The values of \code{Type} in the data frame, whose first few rows are
@@ -620,7 +628,7 @@
$8.5556+0.6667=9.2222$, respectively, as can be verified from
<<fixeffm07>>=
head(as.vector(model.matrix(fm07) %*% fixef(fm07)))
-@
+@
The fact that the parameter labeled \code{TypeT2} is the difference
between the fixed-effects prediction for levels \code{T2} and
@@ -647,20 +655,20 @@
\centering
<<pr07zeta,fig=TRUE,echo=FALSE,height=3.5>>=
print(xyplot(pr07, aspect=1.3, layout=c(6,1)))
-@
+@
\caption[Profile zeta plot for the parameters in model \code{fm06}]
{Profile zeta plot for the parameters in model \code{fm06} fit to the
- \code{ergoStool} data}
+ \code{ergoStool} data.}
\label{fig:pr07zeta}
\end{figure}
(Fig.~\ref{fig:pr07zeta}) and check, say, the confidence intervals on
these parameters.
<<confintfm07>>=
confint(pr07, c("TypeT2", "TypeT3", "TypeT4"))
-@
+@
According to these intervals, and from what we see from
Fig.~\ref{fig:pr07zeta}, types \code{T2} and \code{T3} are
-significantly different from type {T1} (the intervals do not contain
+significantly different from type \code{T1} (the intervals do not contain
zero) but type \code{T4} is not (the confidence interval on this
contrast contains zero).
@@ -674,7 +682,7 @@
essentially arbitrary. For completeness we should evaluate all six
possible contrasts of pairs of levels.
-We can do this by refitting the model with a difference reference
+We can do this by refitting the model with a different reference
level for the \code{Type} factor and profiling the modified model
fit. The \code{relevel} function allows us to change the reference
level of a factor.
@@ -685,20 +693,20 @@
pr07b <- profile(lmer(effort ~ 1 + Type + (1|Subject),
within(ergoStool, Type <- relevel(Type, "T3")),
REML=FALSE))
-@
+@
The other constrasts of interest are
<<confintpr07ab>>=
confint(pr07a, c("TypeT3", "TypeT4"))
confint(pr07b, "TypeT4")
@
-from which would conclude that type \code{T2} requires significantly
+from which we would conclude that type \code{T2} requires significantly
greater effort than any of the other types at the 5\% level (because
none of the 95\% confidence intervals on contrasts with \code{T2}
contain zero) and that types \code{T3} and \code{T4} are significantly
different at the 5\% level.
However, we must take into account that we are performing multiple,
-simulataneous comparisons of levels.
+simultaneous comparisons of levels.
\subsubsection{Multiple comparisons}
\label{sec:multcomp}
@@ -742,7 +750,7 @@
have a greater coverage probability in such a way that the
``worst-case'' probability is the desired level. With six comparisons
to get a family-wise coverage probability of 95\% the individual
-intervals are chosen to have coverage of
+intervals are chosen to have coverage of
<<covrge>>=
(covrge <- 1 - 0.05/6)
@
@@ -752,11 +760,11 @@
rbind(confint(pr07, c("TypeT2","TypeT3","TypeT4"), covrge),
confint(pr07a, c("TypeT3","TypeT4"), covrge),
confint(pr07b, "TypeT4", covrge))
-@
+@
We again reach the conclusion that the only pair of stool types for
which zero is within the confidence interval on the difference in
-effects is the (\code{T1},\code{T4}) pair but, for these intervals,
+effects is the (\code{T1},\code{T4}) pair but, for these intervals,
the family-wise coverage of all six intervals is at least 95\%.
There are other, perhaps more effective, techniques for adjusting
@@ -791,7 +799,7 @@
the fitted model.
<<fm08>>=
summary(fm08 <- aov(effort ~ Subject + Type, ergoStool))
-@
+@
As seen above, the \code{summary} method for objects of class
\code{"aov"} provides an analysis of variance table. The order in
@@ -874,7 +882,7 @@
number tends to be small.
<<kappafm08>>=
kappa(fm08, exact = TRUE)
-@
+@
However, when $p$ is large the condition number of $\vec X$ tends to
become very large and evaluation of fixed-effects parameter
estimates and their standard errors is an ill-conditioned problem.
@@ -885,9 +893,11 @@
$\vec L_\theta$, defined in (\ref{eq:sparseCholesky1}) which is less
than the condition number of $\Lambda_\theta\trans\vec Z\trans\vec
Z\Lambda_\theta$, even when $\Lambda_\theta$ is singular.
+%% base :: .kappa_tri() is doc.ed as "an internal function called by
+%% 'kappa.qr' and 'kappa.default'. "
<<kappafm06>>=
-kappa.tri(as(as(fm06 at re@L, "sparseMatrix"), "matrix"), exact = TRUE)
-@
+.kappa_tri(as(as(getME(fm06, "L"), "sparseMatrix"), "matrix"), exact = TRUE)
+@
The evaluation of $\vec L_\theta$ is an example of
\emph{regularization} methods for solving ill-posed problems or to
prevent overfitting.
@@ -908,7 +918,7 @@
several schools.
<<classroom>>=
str(classroom)
-@
+@
The response modeled in Chap.~4 of \citet{west07:_linear_mixed_model}
is \code{mathgain}, the difference between a student's mathematics
@@ -950,7 +960,7 @@
ylab = "Classroom within school", type = c("p", "a"),
xlab = "Gain in Mathematics score from kindergarten to first grade",
jitter.y = TRUE))
-@
+@
\caption{Comparative dotplots of gain in the mathematics scores in classrooms within schools.}
\label{fig:classroomdot}
\end{figure}
@@ -960,13 +970,13 @@
it as
<<fm09>>=
(fm09 <- lmer(mathgain ~ (1|classid) + (1|schoolid), classroom))
-@
+@
The results from this model fit using the REML criterion can be
compared to Table 4.6 (page 156) of
\citet{west07:_linear_mixed_model}.
-It seems that the \code{housepov} value is a property of the school.
+It seems that the \code{housepov} value is a property of the school.
We can check this by considering the number of unique combinations of
\code{housepov} and \code{schoolid} and comparing that to the number
of levels of \code{schoolid}. For safety we check the number of
@@ -975,7 +985,7 @@
<<schoolidhousepov>>=
with(classroom, length(levels(factor(schoolid))))
nrow(unique(subset(classroom, select = c(schoolid, housepov))))
-@
+@
In some formulations of multilevel models or hierarchical linear
models it is important to associate covariates with different levels
@@ -984,25 +994,25 @@
<<fm7>>=
(fm7 <- lmer(mathgain ~ 1 + I(mathkind-450) + sex + minority + ses
+ housepov + (1|classid) + (1|schoolid), classroom))
-@
+@
A profile plot of the parameters in model \code{fm7}
\begin{figure}[tbp]
\centering
<<fm4prplot,fig=TRUE,echo=FALSE,height=5>>=
print(xyplot(pr7, absVal=0, aspect=1.3, layout=c(5,2)))
-@
+@
\caption{Profile plot of the parameters in model \code{fm4}.}
\label{fig:fm4prplot}
\end{figure}
-is shown in Fig.~\ref{fig:fm4prplot}
+is shown in Fig.~\ref{fig:fm4prplot}.
\section{Rat Brain example}
\label{sec:RatBrain}
<<ratbraindat>>=
ftable(xtabs(activate ~ animal + treatment + region, ratbrain))
-@
+@
Description of the Rat Brain data should go here.
\begin{figure}[tbp]
@@ -1012,8 +1022,8 @@
ratbrain, type = c("p","a"), layout = c(1,2),
strip=FALSE, strip.left = TRUE,
auto.key = list(space = "right", lines = TRUE)))
-@
+@
\caption[Activation of brain regions in rats]{Activation of brain
- regions in rats}
+ regions in rats.}
\label{fig:Ratbraindot}
\end{figure}
Modified: www/lMMwR/ChGLMMBinomial.Rnw
===================================================================
--- www/lMMwR/ChGLMMBinomial.Rnw 2016-07-05 11:00:26 UTC (rev 1896)
+++ www/lMMwR/ChGLMMBinomial.Rnw 2018-02-09 20:33:50 UTC (rev 1897)
@@ -102,7 +102,7 @@
auto.key = list(space = "top", points = FALSE,
lines = TRUE, columns = 4),
ylab = "Proportion", xlab = "Centered age"))
-@
+@
\caption[Contraception use versus centered age]{Contraception use
versus centered age for women in the Bangladesh Fertility Survey
1989. Panels are determined by whether the woman is in an urban
@@ -152,7 +152,7 @@
<<fm10,eval=FALSE>>=
fm10 <- glmer(use ~ 1+age+I(age^2)+urban+livch+(1|district),
Contraception, binomial)
-@
+@
\index{fitted models!fm10}
When displaying a fitted model like this that has several
@@ -191,7 +191,7 @@
<<Contrach>>=
Contraception <- within(Contraception,
ch <- factor(livch != 0, labels = c("N", "Y")))
-@
+@
we fit a reduced model, \code{fm11}, with summary
<<fm11,echo=FALSE>>=
print(fm11 <- glmer(use ~ age + I(age^2) + urban + ch + (1|district),
@@ -201,7 +201,7 @@
Comparing this model to the previous model
<<anovafm10fm11>>=
anova(fm11,fm10)
-@
+@
indicates that the reduced model is adequate.
A plot of the smoothed observed proportions versus centered age
@@ -214,7 +214,7 @@
auto.key = list(space = "top", points = FALSE,
lines = TRUE, columns = 2),
ylab = "Proportion", xlab = "Centered age"))
-@
+@
\caption[Contraception use versus centered age (children/no
children)]{Contraception use versus centered age for women in the
Bangladesh Fertility Survey 1989. Panels are determined by whether
@@ -236,7 +236,7 @@
Comparing this fitted model to the previous one
<<anovafm11fm12>>=
anova(fm11, fm12)
-@
+@
confirms the usefulness of this term.
Continuing with the model-building we turn our attention to the random
@@ -248,8 +248,9 @@
<<fm13,echo=FALSE>>=
(fm13 <- glmer(use ~ age*ch + I(age^2) + urban + (1|urban:district),
Contraception, binomial))
+<<anovafm12fm13>>
anova(fm12,fm13)
-@
+@
\index{fitted models!fm13}
Notice that although there are 60 distinct districts there are only
@@ -259,7 +260,7 @@
shown in
<<Contraxtab>>=
xtabs(~ urban + district, Contraception)
-@
+@
\section{Link functions and interpreting coefficients}
\label{sec:GLMMlink}
@@ -296,7 +297,7 @@
$1-p$. It is easy to establish that the expected value is also $p$.
For consistency across distribution families we write this expected
response as $\mu$ instead of $p$. We should, however, keep
-in mind that, for this distribution, $\mu$ corresponds to a probability
+in mind that, for this distribution, $\mu$ corresponds to a probability
and hence must satisfy $0\le\mu\le 1$.
In general we don't want to have restrictions on the values of the
@@ -321,15 +322,15 @@
\label{eq:logitinv}
\mu = \frac{1}{1+\exp(-\eta)}
\end{equation}
-shown in
+shown in
\begin{figure}[tbp]
\centering
<<logitinv,fig=TRUE,echo=FALSE,height=3>>=
eta <- seq(-7, 7, len=701)
-print(xyplot(plogis(eta) ~ eta, type = c("g","l"),
+print(xyplot(plogis(eta) ~ eta, type = c("g","l"),
xlab = expression(eta),
ylab = expression(mu == frac(1,1+exp(-eta)))))
-@
+@
\caption[Inverse of the logit link function]{Inverse of the logit link
function. The linear predictor value, $\eta$, which is on an
unrestricted scale, is mapped to $\mu$ on the probability scale, $[0,1]$.}
@@ -355,11 +356,11 @@
distributions) or the probability density function (for continuous
distributions). The probability function for the Bernoulli
distribution is $\mu$ for $y=1$ and $1-\mu$ for $y=0$. If we write this
-in a somewhat peculiar way as $\mu^y+(1-\mu)^{1-y}$ for $y\in\{0,1\}$
+in a somewhat peculiar way as $\mu^y\cdot(1-\mu)^{1-y}$ for $y\in\{0,1\}$
then the logarithm of the probability function becomes
\begin{equation}
\label{eq:BernoulliProb}
- \log\left(\mu^y+(1-\mu)^{1-y}\right) = \log(1-\mu) +
+ \log\left(\mu^y\cdot(1-\mu)^{1-y}\right) = \log(1-\mu) +
y\,\log\left(\frac{\mu}{1-\mu}\right) .
\end{equation}
Notice that the logit link function is the multiple of $y$ in the last term.
@@ -375,7 +376,7 @@
probability function is
\begin{equation}
\label{eq:PoissonProb}
- -\log(y!)-\mu+y\log(\mu) .
+ -\log(y!)-\mu+y\log(\mu)
\end{equation}
and the canonical link function is $\log(\mu)$.
@@ -398,25 +399,25 @@
\Sexpr{sprintf(.f5, fixef(fm13)[[1]])} corresponding to a fitted probability of
<<probfm13>>=
plogis(fixef(fm13)[[1]])
-@
+@
or \Sexpr{sprintf(.f1, 100*plogis(fixef(fm13)[[1]]))}\%.
-Similarly the predicted log-odds of a childless woman with a centered
+Similarly, the predicted log-odds of a childless woman with a centered
age of 0 in an urban setting of a typical district using artificial
contraception is
<<logoddsUrban>>=
sum(fixef(fm13)[c("(Intercept)","urbanY")])
-@
+@
corresponding to a probability of
<<probUrban>>=
plogis(sum(fixef(fm13)[c("(Intercept)","urbanY")]))
-@
+@
The predicted log-odds and predicted probability for a woman with
children and at the same age and location are
<<logoddsUrbanKids>>=
logodds <- sum(fixef(fm13)[c("(Intercept)","chY","urbanY")])
c("log-odds"=logodds, "probability"=plogis(logodds))
-@
+@
We should also be aware that the random effects are defined on the
linear predictor scale and not on the probability scale. A normal
@@ -424,23 +425,23 @@
model \code{fm13} (Fig.~\ref{fig:fm13predqq})
\begin{figure}[tbp]
<<fm13predqq,fig=TRUE,echo=FALSE,results=hide,height=3.5>>=
-print(qqmath(ranef(fm13, postVar=TRUE), strip = FALSE))
-@
+print(qqmath(ranef(fm13, condVar =TRUE), strip = FALSE))
+@
\caption{95\% prediction intervals on the random effects in
\code{fm13} versus quantiles of the standard normal distribution.}
\label{fig:fm13predqq}
\end{figure}
-shows that the smallest random effects are approximately -1 and the
-largest are approximately 1. The numerical values and the identifier
+shows that the smallest random effects are approximately $-1$ and the
+largest are approximately $1$. The numerical values and the identifier
of the combination of \code{urban} and \code{district} for these
extreme values can be obtained as
<<fm13reLow>>=
head(sort(ranef(fm13, drop=TRUE)[[1]]), 3)
-@
+@
and
<<fm13reLow>>=
tail(sort(ranef(fm13, drop=TRUE)[[1]]), 3)
-@
+@
The exponential of the random effect is the relative odds of a woman
in a particular urban/district combination using artificial birth
@@ -450,7 +451,7 @@
\code{urban:district} interaction) using artifical contraception is
<<N1oddsratio>>=
exp(ranef(fm13, drop=TRUE)[[1]]["N:1"])
-@
+@
or about 40\% of that of her rural counterpart in a typical district.
Notice that there is considerable variability in the lengths of the
Modified: www/lMMwR/ChIntro.Rnw
===================================================================
--- www/lMMwR/ChIntro.Rnw 2016-07-05 11:00:26 UTC (rev 1896)
+++ www/lMMwR/ChIntro.Rnw 2018-02-09 20:33:50 UTC (rev 1897)
@@ -13,7 +13,7 @@
library(MatrixModels)
library(Rcpp)
library(minqa)
-library(lme4a)
+library(lme4)
.f2 <- "%.2f"
.f5 <- "%.5f"
.f6 <- "%.6f"
@@ -26,7 +26,7 @@
pr01 <- profile(fm01ML, delta = 0.2)
save(pr01, file = "pr01.rda")
}
-@
+@
\chapter{A Simple, Linear, Mixed-effects Model}
\label{chap:ExamLMM}
@@ -204,11 +204,11 @@
another value if we so choose)
<<headDyestuff>>=
head(Dyestuff)
-@
+@
or we could ask for a \code{summary} of the data
<<summaryDyestuff>>=
summary(Dyestuff)
-@
+@
\begin{figure}[tbp]
\centering
@@ -218,7 +218,7 @@
ylab = "Batch", jitter.y = TRUE, pch = 21,
xlab = "Yield of dyestuff (grams of standard color)",
type = c("p", "a")))
-@
+@
\caption[Yield of dyestuff from 6 batches of an intermediate]{Yield of
dyestuff (Napthalene Black 12B) for 5 preparations from each of 6
batches of an intermediate product (H-acid). The line joins the
@@ -239,7 +239,7 @@
variability in yield, even for preparations from the same batch, but
there is also noticeable batch-to-batch variability. For example,
four of the five preparations from batch F provided lower yields than
-did any of the preparations from batches C and E.
+did any of the preparations from batches C and E.
This plot, and essentially all the other plots in this book,
were created using Deepayan Sarkar's \package{lattice} package for
@@ -293,7 +293,7 @@
ylab = "Batch", jitter.y = TRUE, pch = 21,
xlab = "Simulated response (dimensionless)",
type = c("p", "a")))
-@
+@
\caption[Simulated data similar in structure to the \code{Dyestuff}
data]{Simulated data presented in
\citet{box73:_bayes_infer_statis_analy} with a structure similar to
@@ -334,7 +334,7 @@
<<fm01>>=
fm01 <- lmer(Yield ~ 1 + (1|Batch), Dyestuff)
print(fm01)
-@
+@
In the first line we call the \code{lmer} function to fit a model with formula
<<fm01formula,echo=FALSE>>=
Yield ~ 1 + (1|Batch)
@@ -376,7 +376,7 @@
are obtained by specifying the optional argument \code{REML=FALSE}.
<<fm01ML>>=
(fm01ML <- lmer(Yield ~ 1 + (1|Batch), Dyestuff, REML=FALSE))
-@
+@
(Notice that this code fragment also illustrates a way to condense
the assignment and the display of the fitted model into a single
step. The redundant set of parentheses surrounding the assignment
@@ -481,7 +481,7 @@
estimate $\widehat{\sigma}_1=0$ in both the REML
<<fm02>>=
(fm02 <- lmer(Yield ~ 1 + (1|Batch), Dyestuff2))
-@
+@
and the ML fits.
<<fm02ML>>=
(fm02ML <- update(fm02, REML=FALSE))
@@ -514,7 +514,7 @@
summary(fm02a <- lm(Yield ~ 1, Dyestuff2))
@
because the random effects are inert, in the sense that they have a
-variance of zero, and hence can be removed.
+variance of zero, and hence can be removed.
Notice that the estimate of $\sigma$ from the linear model (called the
\code{Residual standard error} in the summary) corresponds to the
@@ -549,9 +549,9 @@
\section{The Linear Mixed-effects Probability Model}
\label{sec:Probability}
-In explaining some of parameter estimates related to the random
+In explaining some of the parameter estimates related to the random
effects we have used terms such as ``unconditional distribution'' from
-the theory of probability. Before proceeding further we
+the theory of probability. Before proceeding further we
clarify the linear mixed-effects probability model and define several
terms and concepts that will be used throughout the book. Readers who
are more interested in practical results than in the statistical
@@ -575,7 +575,7 @@
\label{eq:LMMdist}
\begin{aligned}
(\bc Y|\bc B=\vec b)&\sim\mathcal{N}(\vec X\vec\beta+\vec Z\vec
- b,\sigma^2\vec I)\\
+ b,\sigma^2\vec I)\\ % I_n is used in {eq:condYgiven} in ./ChTheory.Rnw
\bc{B}&\sim\mathcal{N}(\vec0,\Sigma_\theta) .
\end{aligned}
\end{equation}
@@ -622,7 +622,7 @@
\vec L_\theta\vec L_\theta\trans=
\Lambda_\theta\trans\vec Z\trans\vec Z\Lambda_\theta+\vec I_q .
\end{equation}
-where $\vec I_q$ is the $q\times q$ \emph{identity matrix}.
+where $\vec I_q$ is the $q\times q$ \emph{identity matrix}.
The \emph{deviance} (negative twice the log-likelihood) of the
parameters, given the data, $\vec y$, is
@@ -658,7 +658,7 @@
Minimizing this expression with respect to $\sigma^2$ produces the
conditional estimate
\begin{equation}
- \widehat{\sigma^2}_\theta=\frac{r^2_\theta}{n}
+ \widehat{\sigma^2}_\theta=\frac{r^2_\theta}{n}
\end{equation}
which provides the \emph{profiled deviance},
\begin{equation}
@@ -698,12 +698,12 @@
$\tilde{d}(\vec\theta|\vec y)$.
<<fm01MLverb>>=
fm01ML <- lmer(Yield ~ 1|Batch, Dyestuff, REML=FALSE, verbose=1)
-@
+@
The algorithm converges after 17 function evaluations to a profiled
deviance of \Sexpr{sprintf(.f5, deviance(fm01ML))} at
-$\theta=$\Sexpr{sprintf(.f6, fm01ML at re@theta)}. In this model the
+$\theta=$\Sexpr{sprintf(.f6, getME(fm01ML, "theta"))}. In this model the
scalar parameter $\theta$ is the ratio $\sigma_1/\sigma$.
-
+
The actual values of many of the matrices and vectors defined above
are available as \emph{slots} of the fitted model object. In general
the object returned by a model-fitting function from the
@@ -720,16 +720,16 @@
For example, $\Lambda_{\widehat{\theta}}$ is stored as
<<fm01MLLambda>>=
-fm01ML at re@Lambda
-@
+getME(fm01ML, "Lambda")
+@
Often we will show the structure of sparse matrices as an image
(Fig.~\ref{fig:fm01Lambdaimage}).
\begin{figure}[tbp]
\sidecaption[t]
\setkeys{Gin}{width=1.5in}
<<fm01Lambdaimage,fig=TRUE,echo=FALSE,width=2.5,height=2.5>>=
-print(image(fm01 at re@Lambda, sub=NULL, xlab=NULL, ylab=NULL))
-@
+print(image(getME(fm01, "Lambda"), sub=NULL, xlab=NULL, ylab=NULL))
+@
\caption[Image of the $\Lambda$ for model \texttt{fm01ML}]{Image of the
relative covariance factor, $\Lambda_{\widehat{\theta}}$ for model
\code{fm01ML}. The non-zero elements are shown as darkened squares.
@@ -741,8 +741,8 @@
\begin{figure}[tbp]
\centering
<<fm01Ztimage,fig=TRUE,echo=FALSE,height=2.5>>=
-print(image(fm01 at re@Zt, sub=NULL))
-@
+print(image(getME(fm01, "Zt"), sub=NULL))
+@
\caption[Image of the random-effects model matrix, $\vec Z\trans$, for
\texttt{fm01}]{Image of the transpose of the random-effects model
matrix, $\vec Z$, for model \code{fm01}. The non-zero elements,
@@ -751,13 +751,13 @@
\label{fig:fm01Ztimage}
\end{figure}
-In this simple model $\Lambda=\theta\vec I_6$ is a multiple of the
+In this simple model $\Lambda_\theta=\theta\vec I_6$ is a multiple of the
identity matrix and the $30\times 6$ model matrix $\vec Z$, whose
transpose is shown in Fig.~\ref{fig:fm01Ztimage}, consists of the
indicator columns for \code{Batch}. Because the data are balanced
with respect to \code{Batch}, the Cholesky factor, $\vec L$ is also a
-multiple of the identity (use \code{image(fm01ML at re@L)} to check if
-you wish). The vector $\vec u$ is available in \code{fm01ML at re@u}. The
+multiple of the identity (use \code{image(getME(fm01ML, "L"))} to check if
+you wish). The vector $\vec u$ is available in \code{getME(fm01ML, "u")}. The
vector $\vec\beta$ and the model matrix $\vec X$ are in
\code{fm01ML at fe}.
@@ -778,7 +778,7 @@
parameters for which we obtained estimates. These parameters are
$\sigma_1$, the standard deviation of the random effects, $\sigma$,
the standard deviation of the residual or ``per-observation'' noise
-term and $\beta_0$, the fixed-effects parameter that is labeled as
+term and $\beta_0$, the fixed-effects parameter that is labeled as
\code{(Intercept)}.
The \code{profile} function systematically varies the parameters in a
@@ -804,12 +804,12 @@
starting with an ML fit.
<<pr01,eval=FALSE>>=
pr01 <- profile(fm01ML)
-@
+@
\begin{figure}[tb]
\centering
<<fm01prof,fig=TRUE,echo=FALSE,height=3.5>>=
print(xyplot(pr01, aspect = 1.3))
-@
+@
\caption[Profile zeta plots of the parameters in model \code{fm01ML}]
{Signed square root, $\zeta$, of the likelihood ratio test statistic
for each of the parameters in model \code{fm01ML}. The vertical
@@ -834,7 +834,7 @@
extractor.
<<confintfm0195>>=
confint(pr01)
-@
+@
By default the 95\% confidence interval is returned. The optional
argument, \code{level}, is used to obtain other confidence levels.
<<confintfm0199>>=
@@ -847,7 +847,7 @@
\centering
<<fm01absprof,fig=TRUE,echo=FALSE,height=2.5>>=
print(xyplot(pr01, absVal = TRUE, aspect = 0.7, strip=FALSE, strip.left = TRUE))
-@
+@
\caption[Absolute value profile zeta plots of the parameters in model
\code{fm01ML}] {Profiled deviance, on the scale $|\zeta|$, the square
root of the change in the deviance, for each of the parameters in
@@ -909,7 +909,7 @@
exp(pr$x),
exp(pr$x * 2)), pr$y)
}))
-@
+@
\caption[Profile zeta plots comparing $\log(\sigma)$, $\sigma$ and
$\sigma^2$ in model \code{fm01ML}]{Signed square root, $\zeta$, of the
likelihood ratio test statistic as a function of $\log(\sigma)$, of
@@ -984,7 +984,7 @@
pr$x,
pr$x^2), pr$y)
}))
-@
+@
\caption[Profile zeta plots comparing $\log(\sigma_1)$, $\sigma_1$ and
$\sigma_1^2$ in model \code{fm01ML}]{Signed square root, $\zeta$, of the
likelihood ratio test statistic as a function of $\log(\sigma_1)$,
@@ -1025,12 +1025,12 @@
This is not quite the same as evaluating the distribution of the
estimator of the parameter, which for mixed-effects models can be very
difficult, but it gives us a good indication of what the distribution
-of the estimator would be. Fig.~\ref{fig:fm01dens},
[TRUNCATED]
To get the complete diff run:
svnlook diff /svnroot/lme4 -r 1897
More information about the Lme4-commits
mailing list