[Depmix-commits] r308 - papers/jss
noreply at r-forge.r-project.org
Wed Aug 5 16:03:05 CEST 2009
Author: maarten
Date: 2009-08-05 16:03:04 +0200 (Wed, 05 Aug 2009)
New Revision: 308
Modified:
papers/jss/article.tex
Log:
- changes to EM description
Modified: papers/jss/article.tex
===================================================================
--- papers/jss/article.tex 2009-08-05 13:35:22 UTC (rev 307)
+++ papers/jss/article.tex 2009-08-05 14:03:04 UTC (rev 308)
@@ -309,31 +309,45 @@
for the prior model, transition model, and response model respectively. The
joint log likelihood can be written as
\begin{equation}
-\log \Prob(O_{1:T}, S_{1:T}|\greekv{\theta}) = \log \Prob(S_1|\greekv{\theta}_1)
-+ \sum_{t=2}^{T} \log \Prob(S_t|S_{t-1},\greekv{\theta}_2)
-+ \sum_{t=1}^{T} \log \Prob(O_t|S_t,\greekv{\theta}_3)
+\log \Prob(\vc{O}_{1:T}, \vc{S}_{1:T}|\vc{z}_{1:T},\greekv{\theta}) = \log
+\Prob(S_1|\vc{z}_{1},\greekv{\theta}_1)
++ \sum_{t=2}^{T} \log \Prob(S_t|S_{t-1},\vc{z}_{t-1},\greekv{\theta}_2)
++ \sum_{t=1}^{T} \log \Prob(O_t|S_t,\vc{z}_t,\greekv{\theta}_3)
\end{equation}
-This likelihood depends on the unobserved states $S_t$. In the Expectation step,
-we replace these with their expected values given a set of (initial) parameters
-$\greekv{\theta}' = (\greekv{\theta}'_1, \greekv{\theta}'_2,\greekv{\theta}'_3)$
-and observations $O_{1:T}$. The expected log likelihood
+This likelihood depends on the unobserved states $\vc{S}_{1:T}$. In the
+Expectation step, we replace these with their expected values given a set of
+(initial) parameters $\greekv{\theta}' = (\greekv{\theta}'_1,
+\greekv{\theta}'_2,\greekv{\theta}'_3)$ and observations $O_{1:T}$. The expected
+log likelihood
\begin{equation}
Q(\greekv{\theta},\greekv{\theta}') = E_{\greekv{\theta}'}
-(\log \Prob(O_{1:T},S_{1:T}|O_{1:T},\greekv{\theta}))
+(\log \Prob(\vc{O}_{1:T},\vc{S}_{1:T}|\vc{O}_{1:T},\vc{z}_{1:T},\greekv{\theta}))
\end{equation}
can be written as
%\begin{equation}
\begin{multline}
\label{eq:Q}
Q(\greekv{\theta},\greekv{\theta}') =
-\sum_{j=1}^n \gamma_1(j) \log \Prob(S_1=j|\greekv{\theta}_1) \\
+\sum_{j=1}^n \gamma_1(j) \log \Prob(S_1=j|\vc{z}_1,\greekv{\theta}_1) \\
+ \sum_{t=2}^T \sum_{j=1}^n \sum_{k=1}^n \xi_t^i(j,k) \log \Prob(S_t = k|S_{t-1}
-= j,\greekv{\theta}_2) \\
+= j,\vc{z}_{t-1},\greekv{\theta}_2) \\
+ \sum_{t=1}^T \sum_{j=1}^n \sum_{k=1}^m \gamma_t(j)
-\ln \Prob(O^k_t|S_t=j,\greekv{\theta}_3),
+\ln \Prob(O^k_t|S_t=j,\vc{z}_t,\greekv{\theta}_3),
\end{multline}
%\end{equation}
-where the expected values $\xi_t(j,k) = P(S_t = k, S_{t-1} = j|O_{1:T},\greekv{\theta}')$ and $\gamma_t(j) = P(S_t = j|O_{1:T},\greekv{\theta}')$ can be computed effectively by the Forward-Backward algorithm \citep[see e.g.,][]{Rabiner1989}. The Maximisation step consists of the maximisation of (\ref{eq:Q}) for $\greekv{\theta}$. As the r.h.s. of (\ref{eq:Q}) consists of three separate parts, we can maximise separately for $\greekv{\theta}_1$, $\greekv{\theta}_2$ and $\greekv{\theta}_3$. In common models, maximisation for $\greekv{\theta}_1$ and $\greekv{\theta}_2$ is performed by the \code{nnet.default} routine in \pkg{MASS}, and maximisation for $\greekv{\theta}_3$ by the \code{glm} routine.
+where the expected values $\xi_t(j,k) = P(S_t = k, S_{t-1} = j|\vc{O}_{1:T},
+\vc{z}_{1:T},\greekv{\theta}')$ and $\gamma_t(j) = P(S_t = j|\vc{O}_{1:T},
+\vc{z}_{1:T},\greekv{\theta}')$ can be computed effectively by the
+Forward-Backward algorithm \citep[see e.g.,][]{Rabiner1989}. The Maximisation
+step consists of the maximisation of (\ref{eq:Q}) for $\greekv{\theta}$. As the
+right hand side of (\ref{eq:Q}) consists of three separate parts, we can
+maximise separately for $\greekv{\theta}_1$, $\greekv{\theta}_2$ and
+$\greekv{\theta}_3$. In common models, maximisation for $\greekv{\theta}_1$ and
+$\greekv{\theta}_2$ is performed by the \code{nnet.default} routine in the
+\pkg{nnet} package \citep{Venables2002}, and maximisation for
+$\greekv{\theta}_3$ by the standard \code{glm} routine. Note that for the latter
+maximisation, the expected values $\gamma_t(j)$ are used as prior weights of the
+observations $O^k_t$.
@@ -353,7 +367,7 @@
EM can lead to wrong parameter estimates when applying constraints.
Hence, in \pkg{depmixS4}, EM is used by default in unconstrained
models, but otherwise, direct optimization is done using \pkg{Rdonlp2}
-\cite{Tamura2009,Spellucci2002}, because it handles general linear
+\citep{Tamura2009,Spellucci2002}, because it handles general linear
(in)equality constraints, and optionally also non-linear constraints.
%Need some more on EM and how/why it is justified to do separate weighted
@@ -379,7 +393,7 @@
\subsection{Example data: speed}
-Throughout this manual a data set called \code{speed} is used. It
+Throughout this article a data set called \code{speed} is used. It
consists of three time series with three variables: response time,
accuracy, and a covariate Pacc which defines the relative pay-off for
speeded and accurate responding. The participant in this experiment
@@ -406,8 +420,8 @@
The \code{depmix} function returns an object of class \code{depmix}
which contains the model specification (and not a fitted model!).
Note also that start values for the transition parameters are provided
-in this call using the \code{trstart} argument. The package does not
-provide automatic starting values.
+in this call using the \code{trstart} argument. At this time, the package does
+not provide automatic starting values.
The so-defined model needs to be \code{fit}ted with the following
line of code:
@@ -434,7 +448,7 @@
BIC: 211.275
\end{CodeOutput}
\end{CodeChunk}
-These statistics may be extracted using \code{logLik},
+These statistics can also be extracted using \code{logLik},
\code{AIC} and \code{BIC}, respectively.
The \code{summary} method of \code{fit}ted models provides the parameter
@@ -494,10 +508,13 @@
logistic model. In particular, each row of the transition matrix is
parameterized by a baseline category logistic multinomial,
meaning that the parameter for the
-base category is fixed at zero (see \citet[see][p.\ 267
-ff.]{Agresti2002} for multinomial logistic models and various
-parameterizations). See also \citet{Chung2007} for similar models, latent transition models using logistic regression on the transition parameters. They fit such models on repeated measurement
-data ($T=2$) using Bayesian methods. The default baseline category is the first state.
+base category is fixed at zero \citep[see][p.\ 267
+ff., for multinomial logistic models and various
+parameterizations]{Agresti2002}. See also \citet{Chung2007} for similar models,
+latent transition models using logistic regression on the transition parameters.
+They fit such models on repeated measurement
+data ($T=2$) using Bayesian methods. The default baseline category is the
+first state.
Hence, for example, for a 3-state model, the initial state probability
model would have three parameters of which the first is fixed at zero
and the other two are freely estimated.
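
Illustrative note on the E-step changed in this revision: the quantities
$\gamma_t(j)$ and $\xi_t(j,k)$ in (\ref{eq:Q}) come from the Forward-Backward
algorithm. The following is a minimal sketch of a scaled forward-backward pass,
not the depmixS4 implementation. It assumes no covariates $\vc{z}_t$ (so the
prior and transition probabilities are fixed numbers) and that the response
densities have already been evaluated. The function name `forward_backward` and
the arguments `init`, `trans` and `dens` are hypothetical names chosen for this
example.

```python
import math


def forward_backward(init, trans, dens):
    """Scaled forward-backward pass for a discrete-state hidden Markov model.

    Hypothetical inputs (not part of depmixS4):
      init[j]     = P(S_1 = j)
      trans[j][k] = P(S_t = k | S_{t-1} = j)
      dens[t][j]  = P(O_t | S_t = j), already evaluated
    Returns (gamma, xi, loglik), using 0-based time indices.
    """
    T, n = len(dens), len(init)
    alpha = [[0.0] * n for _ in range(T)]
    beta = [[1.0] * n for _ in range(T)]
    scale = [0.0] * T

    # Forward pass; normalise at each t to avoid numerical underflow.
    for t in range(T):
        for k in range(n):
            if t == 0:
                prior = init[k]
            else:
                prior = sum(alpha[t - 1][j] * trans[j][k] for j in range(n))
            alpha[t][k] = prior * dens[t][k]
        scale[t] = sum(alpha[t])
        alpha[t] = [a / scale[t] for a in alpha[t]]

    # Backward pass, reusing the forward scaling constants.
    for t in range(T - 2, -1, -1):
        for j in range(n):
            beta[t][j] = sum(
                trans[j][k] * dens[t + 1][k] * beta[t + 1][k] for k in range(n)
            ) / scale[t + 1]

    # gamma[t][j] = P(S_t = j | O_{1:T})
    gamma = [[alpha[t][j] * beta[t][j] for j in range(n)] for t in range(T)]

    # xi[t][j][k] = P(S_{t-1} = j, S_t = k | O_{1:T}); xi[0] is an unused slot.
    xi = [None]
    for t in range(1, T):
        xi.append([
            [
                alpha[t - 1][j] * trans[j][k] * dens[t][k] * beta[t][k] / scale[t]
                for k in range(n)
            ]
            for j in range(n)
        ])

    # The log likelihood is the sum of the log scaling constants.
    loglik = sum(math.log(c) for c in scale)
    return gamma, xi, loglik
```

In an M-step along the lines described in the article text, `gamma[t][j]` would
serve as the weight of observation `t` when fitting the response model of state
`j`, and `xi[t][j][k]` as the expected transition counts for the transition
model.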