[Depmix-commits] r308 - papers/jss

noreply at r-forge.r-project.org noreply at r-forge.r-project.org
Wed Aug 5 16:03:05 CEST 2009


Author: maarten
Date: 2009-08-05 16:03:04 +0200 (Wed, 05 Aug 2009)
New Revision: 308

Modified:
   papers/jss/article.tex
Log:
- changes to EM description

Modified: papers/jss/article.tex
===================================================================
--- papers/jss/article.tex	2009-08-05 13:35:22 UTC (rev 307)
+++ papers/jss/article.tex	2009-08-05 14:03:04 UTC (rev 308)
@@ -309,31 +309,45 @@
 for the prior model, transition model, and response model respectively. The 
 joint log likelihood can be written as
 \begin{equation}
-\log \Prob(O_{1:T}, S_{1:T}|\greekv{\theta}) = \log \Prob(S_1|\greekv{\theta}_1) 
-+ \sum_{t=2}^{T} \log \Prob(S_t|S_{t-1},\greekv{\theta}_2) 
-+ \sum_{t=1}^{T} \log \Prob(O_t|S_t,\greekv{\theta}_3)
+\log \Prob(\vc{O}_{1:T}, \vc{S}_{1:T}|\vc{z}_{1:T},\greekv{\theta}) = \log 
+\Prob(S_1|\vc{z}_{1},\greekv{\theta}_1) 
++ \sum_{t=2}^{T} \log \Prob(S_t|S_{t-1},\vc{z}_{t-1},\greekv{\theta}_2) 
++ \sum_{t=1}^{T} \log \Prob(O_t|S_t,\vc{z}_t,\greekv{\theta}_3)
 \end{equation}
-This likelihood depends on the unobserved states $S_t$. In the Expectation step,
-we replace these with their expected values given a set of (initial) parameters 
-$\greekv{\theta}' = (\greekv{\theta}'_1, \greekv{\theta}'_2,\greekv{\theta}'_3)$
-and observations $O_{1:T}$. The expected log likelihood 
+This likelihood depends on the unobserved states $\vc{S}_{1:T}$. In the 
+Expectation step, we replace these with their expected values given a set of 
+(initial) parameters $\greekv{\theta}' = (\greekv{\theta}'_1, 
+\greekv{\theta}'_2,\greekv{\theta}'_3)$ and observations $\vc{O}_{1:T}$. The expected 
+log likelihood 
 \begin{equation}
 Q(\greekv{\theta},\greekv{\theta}') = E_{\greekv{\theta}'} 
-(\log \Prob(O_{1:T},S_{1:T}|O_{1:T},\greekv{\theta}))
+(\log \Prob(\vc{O}_{1:T},\vc{S}_{1:T}|\vc{O}_{1:T},\vc{z}_{1:T},\greekv{\theta}))
 \end{equation}
 can be written as
 %\begin{equation}
 \begin{multline}
 \label{eq:Q}
 Q(\greekv{\theta},\greekv{\theta}') = 
-\sum_{j=1}^n \gamma_1(j) \log \Prob(S_1=j|\greekv{\theta}_1) \\ 
+\sum_{j=1}^n \gamma_1(j) \log \Prob(S_1=j|\vc{z}_1,\greekv{\theta}_1) \\ 
 + \sum_{t=2}^T \sum_{j=1}^n \sum_{k=1}^n \xi_t(j,k) \log \Prob(S_t = k|S_{t-1} 
-= j,\greekv{\theta}_2)  \\
+= j,\vc{z}_{t-1},\greekv{\theta}_2)  \\
  + \sum_{t=1}^T \sum_{j=1}^n \sum_{k=1}^m \gamma_t(j) 
-\ln \Prob(O^k_t|S_t=j,\greekv{\theta}_3),
+\ln \Prob(O^k_t|S_t=j,\vc{z}_t,\greekv{\theta}_3),
 \end{multline}
 %\end{equation}
-where the expected values $\xi_t(j,k) =  P(S_t = k, S_{t-1} = j|O_{1:T},\greekv{\theta}')$ and $\gamma_t(j) = P(S_t = j|O_{1:T},\greekv{\theta}')$ can be computed effectively by the Forward-Backward algorithm \citep[see e.g.,][]{Rabiner1989}. The Maximisation step consists of the maximisation of (\ref{eq:Q}) for $\greekv{\theta}$. As the r.h.s. of (\ref{eq:Q}) consists of three separate parts, we can maximise separately for $\greekv{\theta}_1$, $\greekv{\theta}_2$ and $\greekv{\theta}_3$. In common models, maximisation for $\greekv{\theta}_1$ and $\greekv{\theta}_2$ is performed by the \code{nnet.default} routine in \pkg{MASS}, and maximisation for $\greekv{\theta}_3$ by the \code{glm} routine. 
+where the expected values $\xi_t(j,k) =  P(S_t = k, S_{t-1} = j|\vc{O}_{1:T},
+\vc{z}_{1:T},\greekv{\theta}')$ and $\gamma_t(j) = P(S_t = j|\vc{O}_{1:T},
+\vc{z}_{1:T},\greekv{\theta}')$ can be computed efficiently by the 
+Forward-Backward algorithm \citep[see e.g.,][]{Rabiner1989}. The Maximisation 
+step consists of the maximisation of (\ref{eq:Q}) for $\greekv{\theta}$. As the 
+right hand side of (\ref{eq:Q}) consists of three separate parts, we can 
+maximise separately for $\greekv{\theta}_1$, $\greekv{\theta}_2$ and 
+$\greekv{\theta}_3$. In common models, maximisation for $\greekv{\theta}_1$ and 
+$\greekv{\theta}_2$ is performed by the \code{nnet.default} routine in the 
+\pkg{nnet} package \citep{Venables2002}, and maximisation for 
+$\greekv{\theta}_3$ by the standard \code{glm} routine. Note that for the latter 
+maximisation, the expected values $\gamma_t(j)$ are used as prior weights of the 
+observations $O^k_t$.
 
 
 
@@ -353,7 +367,7 @@
 EM can lead to wrong parameter estimates when applying constraints.
 Hence, in \pkg{depmixS4}, EM is used by default in unconstrained
 models, but otherwise, direct optimization is done using \pkg{Rdonlp2}
-\cite{Tamura2009,Spellucci2002}, because it handles general linear
+\citep{Tamura2009,Spellucci2002}, because it handles general linear
 (in)equality constraints, and optionally also non-linear constraints.
 
 %Need some more on EM and how/why it is justified to do separate weighted
@@ -379,7 +393,7 @@
 
 \subsection{Example data: speed}
 
-Throughout this manual a data set called \code{speed} is used.  It
+Throughout this article a data set called \code{speed} is used.  It
 consists of three time series with three variables: response time,
 accuracy, and a covariate Pacc which defines the relative pay-off for
 speeded and accurate responding.  The participant in this experiment
@@ -406,8 +420,8 @@
 The \code{depmix} function returns an object of class \code{depmix}
 which contains the model specification (and not a fitted model!).
 Note also that start values for the transition parameters are provided
-in this call using the \code{trstart} argument. The package does not 
-provide automatic starting values. 
+in this call using the \code{trstart} argument. At this time, the package does 
+not provide automatic starting values. 
 
 The so-defined model needs to be \code{fit}ted with the following
 line of code:
@@ -434,7 +448,7 @@
 BIC:  211.275 
 \end{CodeOutput}
 \end{CodeChunk}
-These statistics may be extracted using \code{logLik},
+These statistics can also be extracted using \code{logLik},
 \code{AIC} and \code{BIC}, respectively.
 
 The \code{summary} method of \code{fit}ted models provides the parameter
@@ -494,10 +508,13 @@
 logistic model.   In particular, each row of the transition matrix is
 parameterized by a baseline category logistic multinomial, 
 meaning that the parameter for the
-base category is fixed at zero (see \citet[see][p.\ 267
-ff.]{Agresti2002} for multinomial logistic models and various
-parameterizations). See also \citet{Chung2007} for similar models, latent transition models using logistic regression on the transition parameters. They fit such models on repeated measurement
-data ($T=2$) using Bayesian methods.  The default baseline category is the first state.
+base category is fixed at zero \citep[see][p.\ 267
+ff., for multinomial logistic models and various
+parameterizations]{Agresti2002}. See also \citet{Chung2007} for similar 
+latent transition models that use logistic regression on the transition parameters. 
+They fit such models on repeated measurement
+data ($T=2$) using Bayesian methods.  The default baseline category is the 
+first state.
 Hence, for example, for a 3-state model, the initial state probability
 model would have three parameters of which the first is fixed at zero
 and the other two are freely estimated.
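
Note on the EM hunk above: the expected values gamma_t(j) and xi_t(j,k) that enter Q can be illustrated with a small scaled forward-backward pass. The following is only a sketch under stated assumptions, not the depmixS4 implementation: it is written in Python rather than R, the function name forward_backward and its arguments init, trans and dens are hypothetical, and the transition matrix is taken as time-homogeneous (no covariates z_t), which is simpler than the model in the article.

```python
import math


def forward_backward(init, trans, dens):
    """Scaled forward-backward pass for a discrete-state hidden Markov model.

    Hypothetical inputs (not part of depmixS4):
      init[j]     = P(S_1 = j)
      trans[j][k] = P(S_t = k | S_{t-1} = j), assumed constant over time
      dens[t][j]  = P(O_t | S_t = j), the response density at each time point
    Returns (gamma, xi, loglik), where xi[t-1][j][k] pairs time t-1 with
    time t (0-based), so xi has one entry fewer than gamma.
    """
    T, n = len(dens), len(init)

    # Forward pass: alpha[t][k] = P(S_t = k | O_1..O_t) after rescaling.
    alpha = [[0.0] * n for _ in range(T)]
    scale = [0.0] * T
    for t in range(T):
        for k in range(n):
            if t == 0:
                prior = init[k]
            else:
                prior = sum(alpha[t - 1][j] * trans[j][k] for j in range(n))
            alpha[t][k] = prior * dens[t][k]
        scale[t] = sum(alpha[t])  # P(O_t | O_1..O_{t-1})
        alpha[t] = [a / scale[t] for a in alpha[t]]

    # Backward pass, rescaled with the same constants as the forward pass.
    beta = [[1.0] * n for _ in range(T)]
    for t in range(T - 2, -1, -1):
        for j in range(n):
            beta[t][j] = sum(
                trans[j][k] * dens[t + 1][k] * beta[t + 1][k] for k in range(n)
            ) / scale[t + 1]

    # Smoothed state probabilities gamma_t(j) = P(S_t = j | O_1..O_T).
    gamma = [[alpha[t][j] * beta[t][j] for j in range(n)] for t in range(T)]

    # Smoothed transition probabilities P(S_t = k, S_{t-1} = j | O_1..O_T).
    xi = [
        [
            [
                alpha[t - 1][j] * trans[j][k] * dens[t][k] * beta[t][k] / scale[t]
                for k in range(n)
            ]
            for j in range(n)
        ]
        for t in range(1, T)
    ]

    # The log likelihood is the sum of the log scaling constants.
    loglik = sum(math.log(c) for c in scale)
    return gamma, xi, loglik
```

In an M-step along the lines described in the hunk, each row gamma[t] would serve as the prior weights for the state-specific response models, and xi would play the same role for the transition model.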


