[Depmix-commits] r285 - in papers: jss jss/graphs vignette

noreply at r-forge.r-project.org
Wed Jul 8 16:57:03 CEST 2009


Author: ingmarvisser
Date: 2009-07-08 16:57:02 +0200 (Wed, 08 Jul 2009)
New Revision: 285

Added:
   papers/jss/README
   papers/jss/article.tex
   papers/jss/bookreview.tex
   papers/jss/codesnippet.tex
   papers/jss/depmix-intro.Rnw
   papers/jss/depmixS4-jss.tex
   papers/jss/graphs/speed1.pdf
   papers/jss/jss.bst
   papers/jss/jss.cls
   papers/jss/jss.dbj
   papers/jss/jss.drv
   papers/jss/jss.dtx
   papers/jss/jss.ins
   papers/jss/jss.pdf
   papers/jss/jsslogo.jpg
   papers/jss/robKalman.Rnw
   papers/jss/softwarereview.tex
   papers/jss/toaddlater.tex
   papers/vignette/depmix-intro.aux
   papers/vignette/depmix-intro.log
   papers/vignette/depmix-intro.pdf
   papers/vignette/depmix-intro.toc
Removed:
   papers/jss/depmix-manual.Rnw
   papers/jss/dmm4.tex
   papers/jss/graphs/stimuli.pdf
   papers/jss/hmmlta.tex
   papers/jss/information.tex
   papers/vignette/toaddlater.tex
Modified:
   papers/vignette/depmix-intro.tex
Log:
Moved around files in the papers directory

Added: papers/jss/README
===================================================================
--- papers/jss/README	                        (rev 0)
+++ papers/jss/README	2009-07-08 14:57:02 UTC (rev 285)
@@ -0,0 +1,47 @@
+***************************************************
+** jss: A Document Class for Publications in the **
+**        Journal of Statistical Software        **
+***************************************************
+
+This zip-archive contains the pdfLaTeX infrastructure
+for publications in the Journal of Statistical Software.
+The files
+  - jss.cls (LaTeX2e class)
+  - jss.bst (BibTeX style)
+  - jsslogo.jpg (JPG logo)
+need to be included in your search path (local working
+directory, texmf or localtexmf tree).
+
+A manual on how to use jss.cls is provided in
+  - jss.pdf.
+
+Furthermore, there are templates for articles, code
+snippets, book reviews and software reviews available
+in 
+  - article.tex
+  - codesnippet.tex
+  - bookreview.tex
+  - softwarereview.tex
+
+The final version of JSS papers should be prepared using
+JSS styles; the submission of the final version needs to
+include the full sources (.tex, .bib, and all graphics).
+A quick check for the most important aspects of the JSS
+style is given below; authors should make sure that all
+of them are addressed in the final version:  
+  - The manuscript can be compiled by pdfLaTeX.
+  - \proglang, \pkg and \code have been used for highlighting
+    throughout the paper (including references).    
+  - References are provided in a .bib BibTeX database
+    and included in the text by \cite, \citep, \citet,
+    etc.
+  - Titles and headers are formatted as described in
+    the JSS manual:
+      - \title in title style,
+      - \section etc. in sentence style,
+      - all titles in the BibTeX file in title style.
+  - Figures, tables and equations are marked with a \label
+    and referred to by \ref, e.g., "Figure~\ref{...}".
+  - Software packages are \cite{}d properly.
+For more details, see the manual jss.pdf, in particular
+the style checklist in Section 2.1.


Property changes on: papers/jss/README
___________________________________________________________________
Name: svn:eol-style
   + native

Added: papers/jss/article.tex
===================================================================
--- papers/jss/article.tex	                        (rev 0)
+++ papers/jss/article.tex	2009-07-08 14:57:02 UTC (rev 285)
@@ -0,0 +1,63 @@
+\documentclass[article]{jss}
+
+%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
+%% declarations for jss.cls %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
+%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
+
+%% almost as usual
+\author{Achim Zeileis\\Wirtschaftsuniversit\"at Wien \And 
+        Second Author\\Plus Affiliation}
+\title{A Capitalized Title: Something About a Package \pkg{foo}}
+
+%% for pretty printing and a nice hypersummary also set:
+\Plainauthor{Achim Zeileis, Second Author} %% comma-separated
+\Plaintitle{A Capitalized Title: Something About a Package foo} %% without formatting
+\Shorttitle{A Capitalized Title} %% a short title (if necessary)
+
+%% an abstract and keywords
+\Abstract{
+  The abstract of the article.
+}
+\Keywords{keywords, comma-separated, not capitalized, \proglang{Java}}
+\Plainkeywords{keywords, comma-separated, not capitalized, Java} %% without formatting
+%% at least one keyword must be supplied
+
+%% publication information
+%% NOTE: Typically, this can be left commented and will be filled out by the technical editor
+%% \Volume{13}
+%% \Issue{9}
+%% \Month{September}
+%% \Year{2004}
+%% \Submitdate{2004-09-29}
+%% \Acceptdate{2004-09-29}
+
+%% The address of (at least) one author should be given
+%% in the following format:
+\Address{
+  Achim Zeileis\\
+  Department f\"ur Statistik \& Mathematik\\
+  Wirtschaftsuniversit\"at Wien\\
+  1090 Wien, Austria\\
+  E-mail: \email{Achim.Zeileis at wu-wien.ac.at}\\
+  URL: \url{http://statmath.wu-wien.ac.at/~zeileis/}
+}
+%% It is also possible to add a telephone and fax number
+%% before the e-mail in the following format:
+%% Telephone: +43/1/31336-5053
+%% Fax: +43/1/31336-734
+
+%% for those who use Sweave please include the following line (with % symbols):
+%% need no \usepackage{Sweave.sty}
+
+%% end of declarations %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
+
+
+\begin{document}
+
+%% include your article here, just as usual
+%% Note that you should use the \pkg{}, \proglang{} and \code{} commands.
+
+\section[About Java]{About \proglang{Java}}
+%% Note: If there is markup in \(sub)section, then it has to be escaped as above.
+
+\end{document}


Property changes on: papers/jss/article.tex
___________________________________________________________________
Name: svn:eol-style
   + native

Added: papers/jss/bookreview.tex
===================================================================
--- papers/jss/bookreview.tex	                        (rev 0)
+++ papers/jss/bookreview.tex	2009-07-08 14:57:02 UTC (rev 285)
@@ -0,0 +1,51 @@
+\documentclass[bookreview]{jss}
+
+%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
+%% declarations for jss.cls %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
+%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
+
+%% reviewer
+\Reviewer{Frederic Udina\\Pompeu Fabra University}
+\Plainreviewer{Frederic Udina}
+
+%% about the book
+\Booktitle{Visualizing Categorical Data}
+\Bookauthor{Michael Friendly}
+\Publisher{SAS Institute Inc.}
+\Pubaddress{Carey, NC}
+\Pubyear{2000}
+\ISBN{1-58025-660-0}
+\Pages{456}
+\Price{USD 69.95 (P)}
+\URL{http://www.math.yorku.ca/SCS/vcd/}
+%% if different from \Booktitle also set
+%% \Plaintitle{Visualizing Categorical Data}
+%% \Shorttitle{Visualizing Categorical Data}
+
+%% publication information
+%% NOTE: Typically, this can be left commented and will be filled out by the technical editor
+%% \Volume{11}
+%% \Issue{7}
+%% \Month{July}
+%% \Year{2004}
+%% \Submitdate{2004-07-26}
+
+%% address of (at least one) author
+\Address{
+  Frederic Udina\\
+  Pompeu Fabra University\\
+  Department of Economics and Business\\
+  Barcelona, Spain 08005\\
+  E-mail: \email{udina at upf.es}\\
+  URL: \url{http://libiya.upf.es/}
+}
+
+%% end of declarations %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
+
+
+\begin{document}
+
+%% include the review as usual
+%% Note that you should use the \pkg{}, \proglang{} and \code{} commands.
+
+\end{document}


Property changes on: papers/jss/bookreview.tex
___________________________________________________________________
Name: svn:eol-style
   + native

Added: papers/jss/codesnippet.tex
===================================================================
--- papers/jss/codesnippet.tex	                        (rev 0)
+++ papers/jss/codesnippet.tex	2009-07-08 14:57:02 UTC (rev 285)
@@ -0,0 +1,64 @@
+\documentclass[codesnippet]{jss}
+
+%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
+%% declarations for jss.cls %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
+%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
+
+%% almost as usual
+\author{Achim Zeileis\\Wirtschaftsuniversit\"at Wien \And 
+        Second Author\\Plus Affiliation}
+\title{A Capitalized Title: Possibly Containing More Details}
+
+%% for pretty printing and a nice hypersummary also set:
+\Plainauthor{Achim Zeileis, Second Author} %% comma-separated
+\Plaintitle{A Capitalized Title: Possibly Containing More Details} %% without formatting
+\Shorttitle{A Capitalized Title} %% a short title (if necessary)
+
+%% an abstract and keywords
+\Abstract{
+  Here should be the abstract.
+}
+\Keywords{keywords, comma-separated, not capitalized, \proglang{Java}}
+\Plainkeywords{keywords, comma-separated, not capitalized, Java} %% without formatting
+%% at least one keyword must be supplied
+
+%% publication information
+%% NOTE: Typically, this can be left commented and will be filled out by the technical editor
+%% \Volume{13}
+%% \Issue{9}
+%% \Month{September}
+%% \Year{2004}
+%% \Submitdate{2004-09-29}
+%% \Acceptdate{2004-09-29}
+
+%% The address of (at least) one author should be given
+%% in the following format:
+\Address{
+  Achim Zeileis\\
+  Department f\"ur Statistik \& Mathematik\\
+  Wirtschaftsuniversit\"at Wien\\
+  1090 Wien, Austria\\
+  E-mail: \email{Achim.Zeileis at wu-wien.ac.at}\\
+  URL: \url{http://statmath.wu-wien.ac.at/~zeileis/}
+}
+%% It is also possible to add a telephone and fax number
+%% before the e-mail in the following format:
+%% Telephone: +43/1/31336-5053
+%% Fax: +43/1/31336-734
+
+%% for those who use Sweave please include the following line (with % symbols):
+%% need no \usepackage{Sweave.sty}
+
+%% end of declarations %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
+
+
+\begin{document}
+
+%% include your article here, just as usual
+%% Note that you should use the \pkg{}, \proglang{} and \code{} commands.
+
+\section[About Java]{About \proglang{Java}}
+%% Note: If there is markup in \(sub)section, then it has to be escaped as above.
+
+
+\end{document}


Property changes on: papers/jss/codesnippet.tex
___________________________________________________________________
Name: svn:eol-style
   + native

Added: papers/jss/depmix-intro.Rnw
===================================================================
--- papers/jss/depmix-intro.Rnw	                        (rev 0)
+++ papers/jss/depmix-intro.Rnw	2009-07-08 14:57:02 UTC (rev 285)
@@ -0,0 +1,900 @@
+
+% -*- mode: noweb; noweb-default-code-mode: R-mode; -*-
+%&LaTeX
+
+\documentclass[]{report}
+
+\usepackage[]{amsmath, amsfonts, amstext, amsthm}
+
+%\batchmode
+
+\usepackage{Sweave}
+
+\usepackage{Rd}
+
+\usepackage[round]{natbib}
+
+\title{depmix: An R-package for fitting mixture models on mixed
+multivariate data with Markov dependencies}
+
+\author{Version: 0.9.1 \\ Ingmar Visser\thanks{Correspondence concerning this manual
+should be adressed to: Ingmar Visser, Department of Psychology,
+University of Amsterdam, Roetersstraat 15, 1018 WB, Amsterdam, The
+Netherlands}\\
+Developmental Processes Research Group\\
+Department of Psychology, University of Amsterdam\\
+i.visser at uva.nl}
+
+\date{\today}
+
+%%\VignetteIndexEntry{depmix: An R-package for fitting mixture models on mixed multivariate data with Markov dependencies}
+%%\VignetteDepends{depmix}
+%%\VignetteKeywords{R, latent/hidden Markov model, latent class model, finite mixture model, direct optimization, gradients}
+%%\VignettePackage{depmix}
+
+\SweaveOpts{prefix.string=graphs/depmix}
+
+\newcommand{\vc}{\mathbf}
+\newcommand{\mat}{\mathbf}
+
+% \newcommand{\pkg}{\texttt}
+% \newcommand{\code}{\texttt}
+
+\begin{document}
+
+\bibliographystyle{plainnat}
+
+\maketitle
+
+\begin{abstract}
+	\pkg{depmix} implements a general class of mixture models with
+	Markovian dependencies between them in the R programming language
+	\citep{R2006}.  This includes standard Markov models,
+	latent/hidden Markov models, and latent class and finite mixture
+	distribution models.  The models can be fitted on mixed
+	multivariate data from a number of distributions including the
+	binomial, multinomial, gaussian, lognormal and Weibull
+	distributions.  Parameters can be estimated subject to general
+	linear constraints, and with optional inclusion of regression on
+	(time-dependent) covariates.  Parameter estimation is done
+	through a direct optimization approach with gradients using the
+	\code{nlm} optimization routine.  Optional support for using the NPSOL
+	optimization routines is provided as well.  A number of illustrative
+	examples are included.
+\end{abstract}
+
+\tableofcontents
+
+\chapter{Introduction}
+
+Markov and latent Markov models are frequently used in the social
+sciences, in different areas and applications.  In psychology, they
+are used for modelling learning processes; see \citet{Wickens1982} for an
+overview, and \citet{Schmittmann2005} for a recent application.  In
+economics, latent Markov models are commonly used as regime switching
+models, see e.g.\ \citet{Kim1994} and \citet{Ghysels1994}.  Further
+applications include speech recognition \citep{Rabiner1989}, EEG analysis
+\citep{Rainer2000}, and genetics \citep{Krogh1998}.  In those latter areas
+of application, latent Markov models are usually referred to as hidden
+Markov models.
+
+The \pkg{depmix} package was motivated by the fact that Markov models
+are used commonly in the social sciences, but no comprehensive package
+was available for fitting such models.  Common programs for Markovian
+models include Panmark \citep{Pol1996}, and for latent class models
+Latent Gold 
+\citep{Vermunt2003}.  Those programs lack a number of
+important features, besides not being freely available.  In
+particular, \pkg{depmix}: 1) handles multiple case, or multiple group,
+analysis; 2) handles arbitrarily long time series; 3) estimates models
+with general linear constraints between parameters; 4) analyzes mixed
+distributions, i.e., combinations of categorical and continuous
+observed variables; 5) fits mixtures of latent Markov models to deal
+with population heterogeneity; 6) can fit models with covariates.
+Although \pkg{depmix} is specifically meant for dealing with
+longitudinal or time series data, for say $T>100$, it can also handle
+the limit case with $T=1$.  In those cases, there are no time
+dependencies between observed data, and the model reduces to a finite
+mixture model, or a latent class model.  In the next chapter, an outline
+is provided of the model and the likelihood equations.  In the
+chapters after that a number of examples are presented.
+
+
+\section*{Acknowledgements} 
+
+Development of this package was supported by European Commission grant
+51652 (NEST) and by a VENI grant from the Dutch Organization for Scientific
+Research (NWO) to Ingmar Visser.
+
+I am indebted to many people for providing help in writing this package.
+First and foremost Maartje Raijmakers and Verena Schmittmann tested
+countless earlier versions, spotted bugs and suggested many features.
+Moreover, Maartje Raijmakers provided the discrimination data set.  Han van
+der Maas provided the speed-accuracy data and thereby necessitated
+implementing models with time-dependent covariates.  Conor Dolan and Raoul
+Grasman both provided valuable advice on statistics in general and
+optimization in particular.
+
+
+\chapter{Dependent mixture models}
+
+The data considered here have the general form $O_{1}^{1}, \ldots,
+O_{1}^{m}$, $O_{2}^{1}, \ldots, O_{2}^{m}$, \ldots, $O_{T}^{1}, \ldots,
+O_{T}^{m}$ for an $m$-variate time series of length $T$.  As an
+example, consider a time series of responses generated by a single
+subject in a reaction time experiment.  The data consist of three
+variables: reaction time, accuracy, and a covariate, a pay-off
+factor that determines the reward for speed and accuracy.  These
+variables are measured at 168, 134 and 137 occasions respectively.
+Below, a summary is provided for these data, as well as a plot of the
+first time series, which is selected by \code{nind=1}.
+
+<<echo=FALSE, results=hide>>=
+.libPaths(new="/Users/ivisser/Library/R/library/")
+library(depmix)
+load("models.Rda")
+@
+
+<<>>=
+data(speed)
+summary(speed)
+@
+\begin{figure}[htbp]
+  \begin{center}
+<<fig=TRUE>>=
+plot(speed,nind=1) 
+@
+	\caption{Reaction times, accuracy and pay-off values for
+	the first series of responses in dataset \texttt{speed}.}
+  \end{center}
+\end{figure}
+
+The latent Markov model is commonly associated with data of this type,
+albeit usually only multinomial variables are considered.  However,
+common estimation procedures, such as those implemented in
+\citet{Pol1996} are not suitable for long time series due to
+underflow problems.  In contrast, the hidden Markov model is typically
+only used for `long' univariate time series.  In the next section, the
+likelihood and estimation procedure for the hidden Markov model is
+described, given data of the above form.
+
+The dependent mixture model is defined by the following elements:
+\begin{enumerate}
+	
+	\item a set $\vc{S}$ of latent classes or states $S_{i},\, i=1,
+	\ldots , n$,
+	
+	\item a matrix $\mat{A}$ of transition probabilities $a_{ij}$ for
+	the transition from state $S_{i}$ to state $S_{j}$,
+	
+	\item a set $\vc{B}$ of observation functions $b_{j}(\cdot)$ that
+	provide the conditional probabilities associated with latent state
+	$S_{j}$,
+	
+	\item a vector $\pmb{\pi}$ of latent state initial probabilities
+	$\pi_{i}$.
+\end{enumerate}
+When transitions are added to the latent class model, it is more 
+appropriate to refer to the classes as states. The word class is 
+rather more associated with a stable trait-like attribute whereas a 
+state can change over time. 
+
+
+\section{Likelihood}
+
+The loglikelihood of hidden Markov models is usually computed by the
+so-called forward-backward algorithm \citep{Baum1966,Rabiner1989}, or rather by
+the forward part of this algorithm.  \cite{Lystig2002} changed the forward
+algorithm in such a way as to allow computing the gradients of the
+loglikelihood at the same time.  They start by rewriting the
+likelihood as follows (for ease of exposition the dependence on the
+model parameters is dropped here):
+\begin{equation}
+	L_{T} = Pr(\vc{O}_{1}, \ldots, \vc{O}_{T}) = \prod_{t=1}^{T} 
+Pr(\vc{O}_{t}|\vc{O}_{1}, 
+	\ldots, \vc{O}_{t-1}), 
+	\label{condLike}
+\end{equation}
+where $Pr(\vc{O}_{1}|\vc{O}_{0}):=Pr(\vc{O}_{1})$. Note that for a 
+simple, i.e.\ observed, Markov chain these probabilities reduce to 
+$Pr(\vc{O}_{t}|\vc{O}_{1},\ldots, 
+\vc{O}_{t-1})=Pr(\vc{O}_{t}|\vc{O}_{t-1})$.
+The log-likelihood can now be expressed as:
+\begin{equation}
+	l_{T} = \sum_{t=1}^{T} \log[Pr(\vc{O}_{t}|\vc{O}_{1}, \ldots, 
+\vc{O}_{t-1})].
+	\label{eq:condLogl}
+\end{equation}
+
+To compute the log-likelihood, \cite{Lystig2002} define the following 
+(forward) recursion:
+\begin{align}
+	\phi_{1}(j) &:= Pr(\vc{O}_{1}, S_{1}=j) = \pi_{j} b_{j}(\vc{O}_{1}) 
+	\label{eq:fwd1} \\
+\begin{split}
+	\phi_{t}(j) &:= Pr(\vc{O}_{t}, S_{t}=j|\vc{O}_{1}, \ldots, 
+\vc{O}_{t-1}) \\
+	&= \sum_{i=1}^{N} [\phi_{t-1}(i)a_{ij}b_{j}(\vc{O}_{t})] \times 
+(\Phi_{t-1})^{-1},
+	\label{eq:fwdt} 
+\end{split} 
+\end{align}
+where $\Phi_{t}=\sum_{i=1}^{N} \phi_{t}(i)$. Combining 
+$\Phi_{t}=Pr(\vc{O}_{t}|\vc{O}_{1}, \ldots, \vc{O}_{t-1})$, and 
+equation~(\ref{eq:condLogl}) gives the following expression for the 
+log-likelihood:
+\begin{equation}
+	l_{T} = \sum_{t=1}^{T} \log \Phi_{t}.
+	\label{eq:logl}
+\end{equation}
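The scaled forward recursion above is compact enough to sketch directly. The following is an illustrative Python/NumPy sketch of equations~(\ref{eq:fwd1})--(\ref{eq:fwdt}) and (\ref{eq:logl}); it is not the \pkg{depmix} implementation (which is written in C), and the function and variable names are chosen here for exposition only.

```python
import numpy as np

def forward_loglik(pi, A, B):
    """Scaled forward recursion: pi[j] are the initial probabilities,
    A[i, j] = a_ij the transition probabilities, and B[t, j] = b_j(O_t)
    the observation densities.  Returns l_T = sum_t log(Phi_t)."""
    T = B.shape[0]
    phi = pi * B[0]                 # phi_1(j) = pi_j b_j(O_1)
    logl = np.log(phi.sum())        # log Phi_1
    phi /= phi.sum()                # rescale: avoids underflow for long T
    for t in range(1, T):
        phi = (phi @ A) * B[t]      # phi_t(j), already divided by Phi_{t-1}
        Phi = phi.sum()
        logl += np.log(Phi)
        phi /= Phi
    return logl
```

Because each $\phi_{t}$ is renormalized by $\Phi_{t}$, the recursion stays numerically stable for long series while $\sum_{t} \log \Phi_{t}$ still recovers the exact log-likelihood.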
+
+The above forward recursion can readily be generalized to mixture 
+models, in which it is assumed that the data are realizations of a 
+number of different latent Markov models (LMMs) and the goal is to assign posterior 
+probabilities to sequences of observations. This situation occurs, 
+for example, in learning data where different learning strategies may 
+lead to different answer patterns. From an observed sequence of 
+responses, it may not be immediately clear from which learning 
+process they stem. Hence, it is interesting to consider a mixture of 
+latent Markov models which incorporate restrictions that are 
+consistent with each of the learning strategies. 
+
+To compute the likelihood of a mixture of $K$ models, define the 
+forward recursion variables as follows (these variables now have an 
+extra index $k$ indicating that observation and transition 
+probabilities are from latent model $k$):
+\begin{align}
+\begin{split}
+\phi_{1}(j_{k}) &=  Pr(\vc{O}_{1}, 
+S_{1}=j_{k})=p_{k}\pi_{j_{k}}b_{j_{k}}(\vc{O}_{1}).
+\end{split}\label{eq:fwd1mix} \\
+\begin{split}
+\phi_{t}(j_{k})   &=   Pr(\vc{O}_{t}, S_{t}=j_{k}|\vc{O}_{1}, \ldots, 
+\vc{O}_{t-1}) \\
+			&= \left[ \sum_{k=1}^{K} \sum_{i=1}^{n_{k}} \phi_{t-1}(i_{k}) 
+			a_{ij_{k}}b_{j_{k}}(\vc{O}_{t}) \right] \times (\Phi_{t-1})^{-1},
+\end{split}\label{eq:fwdtmix} 
+\end{align}
+where $\Phi_{t} = \sum_{k=1}^{K}\sum_{j=1}^{n_{k}} \phi_{t}(j_{k})$.
+Note that the double sum over $k$ and $n_{k}$ is simply an enumeration
+of all the states of the model.  Now, because $a_{ij_{k}}=0$ whenever
+$S_{i}$ is not part of component $k$, the sum over $k$ can be dropped
+and hence equation~\ref{eq:fwdtmix} reduces to:
+\begin{equation}
+	\phi_{t}(j_{k}) = \left[ \sum_{i=1}^{n_{k}} \phi_{t-1}(i_{k}) 
+			a_{ij_{k}}b_{j_{k}}(\vc{O}_{t}) \right] \times (\Phi_{t-1})^{-1}
+\end{equation}
+The loglikelihood is computed by applying equation~\ref{eq:logl} to
+these terms.  For multiple cases, the log-likelihood is simply the sum
+over the individual log-likelihoods. 
+
+\paragraph{Computational considerations} From
+equations~(\ref{eq:fwd1}--\ref{eq:fwdt}), it can be seen that
+computing the forward variables, and hence the log-likelihood, takes
+$O(Tn^{2})$ computations, for an $n$-state model and a time series of
+length $T$.  Consider a mixture of two components, one with two states
+and the other with three states.  Using
+equations~(\ref{eq:fwd1}--\ref{eq:fwdt}) to compute the log-likelihood
+of this model one needs $O(Tn^{2})=O(T\times 25)$ computations whereas
+with the mixture equations~(\ref{eq:fwd1mix}--\ref{eq:fwdtmix}),
+$\sum_{n_{i}} O(n_{i}^{2}T)$ computations are needed, in this case
+$O(T \times 13)$.  So, in this simple example the
+computational cost is almost halved. 
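The saving can be verified by counting the multiplications in a single forward step; a minimal sketch, using the component sizes 2 and 3 from the example above:

```python
components = [2, 3]                # states per mixture component

# Joint formulation: one forward step over all n = sum(components) states
joint_cost = sum(components) ** 2  # O(n^2) = 5^2 = 25 per time step

# Mixture formulation: a_{ij_k} = 0 across components, so each
# block of the transition matrix is handled separately
blockwise_cost = sum(n_k ** 2 for n_k in components)  # 2^2 + 3^2 = 13
```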
+
+
+\section{Gradients}
+
+\newcommand{\fpp}{\frac{\partial} {\partial \lambda_{1}}}
+
+See equations 10--12  in \cite{Lystig2002} for the score recursion 
+functions of the hidden Markov model for a univariate time series. 
+Here the corresponding score recursion for the multivariate mixture 
+case are provided. The $t=1$ components of this score recursion are 
+defined as (for an arbitrary parameter $\lambda_{1}$):
+\begin{align}
+\psi_{1}(j_{k};\lambda_{1}) &:=  \fpp Pr(\vc{O}_{1}, S_{1}=j_{k}) \\
+\begin{split} 
+	&= \left[  \fpp p_{k} \right] \pi_{j_{k}}b_{j_{k}}(\vc{O}_{1}) + 
+	p_{k}\left[ \fpp \pi_{j_{k}} \right] b_{j_{k}}(\vc{O}_{1}) \\
+	& \qquad  + p_{k}\pi_{j_{k}} \left[ \fpp 
+b_{j_{k}}(\vc{O}_{1})\right],
+\end{split} \label{eq:psi1}
+\end{align}
+and for $t>1$ the definition is:
+\begin{align}
+\psi_{t}(j_{k};\lambda_{1})  & =  \frac{\fpp Pr(\vc{O}_{1}, \ldots, 
+\vc{O}_{t}, S_{t}=j_{k})}
+			{Pr(\vc{O}_{1}, \ldots, \vc{O}_{t-1})}  \\
+\begin{split} 
+	& =  
+			 \sum_{i=1}^{n_{k}} \Bigg\{ \psi_{t-1}(i;\lambda_{1})a_{ij_{k}} 
+			 b_{j_{k}}(\vc{O}_{t}) \\ 
+			 &\qquad +\phi_{t-1}(i) \left[ \fpp a_{ij_{k}} \right] b_{j_{k}} 
+(\vc{O}_{t}) \\
+			&\qquad +\phi_{t-1}(i)a_{ij_{k}}  \left[ \fpp b_{j_{k}} 
+(\vc{O}_{t}) \right] \Bigg\} 
+			\times (\Phi_{t-1})^{-1}.
+\end{split} \label{eq:psit}
+\end{align}
+
+Using the above equations, \cite{Lystig2002} derive the following equation 
+for the partial derivative of the likelihood:
+\begin{equation}
+	\fpp l_{T}= 	
+		\frac{\mathbf{\Psi}_{T}(\lambda_{1})}{\mathbf{\Phi}_{T}},
+\end{equation}
+where $\Psi_{T}(\lambda_{1})=\sum_{k=1}^{K} \sum_{j=1}^{n_{k}} 
+\psi_{T}(j_{k};\lambda_{1})$. 
+Starting from the expression for the logarithm of the likelihood, this 
+is easily seen to be correct: 
+\begin{eqnarray*}
+	\fpp \log Pr(\vc{O}_{1}, \ldots, \vc{O}_{T}) &=& Pr(\vc{O}_{1}, 
+\ldots, \vc{O}_{T})^{-1} 
+	\fpp Pr(\vc{O}_{1}, \ldots, \vc{O}_{T}) \\
+	&=&  \frac{Pr(\vc{O}_{1}, \ldots, \vc{O}_{T-1})}{Pr(\vc{O}_{1}, 
+\ldots, \vc{O}_{T})}  \Psi_{T} (\lambda_{1}) \\
+	&=&  \frac{\mathbf{\Psi}_{T}(\lambda_{1})}{\mathbf{\Phi}_{T}}.
+\end{eqnarray*}
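The score recursion runs alongside the forward recursion at essentially no extra structural cost. The sketch below (illustrative Python, not \pkg{depmix} source) computes the derivative of $l_{T}$ with respect to a single initial probability $\pi_{m}$, treated as a free parameter; in that case only the first term of equation~(\ref{eq:psit}) survives, since $a_{ij}$ and $b_{j}$ do not depend on $\pi_{m}$.

```python
import numpy as np

def loglik_and_score(pi, A, B, m):
    """Forward (phi) and score (psi) recursions run side by side.
    Returns (l_T, d l_T / d pi_m), treating pi_m as a free parameter.
    pi: initial probabilities, A: transitions, B[t, j] = b_j(O_t)."""
    T, n = B.shape
    phi = pi * B[0]                 # phi_1(j) = pi_j b_j(O_1)
    psi = np.zeros(n)
    psi[m] = B[0, m]                # psi_1(j) = d(pi_j b_j(O_1)) / d pi_m
    scale = phi.sum()               # Phi_1
    logl = np.log(scale)
    phi /= scale
    psi /= scale                    # psi is rescaled exactly like phi
    for t in range(1, T):
        # only the psi_{t-1}(i) a_ij b_j(O_t) term survives: dA = dB = 0
        psi = (psi @ A) * B[t]
        phi = (phi @ A) * B[t]
        scale = phi.sum()           # Phi_t
        logl += np.log(scale)
        phi /= scale
        psi /= scale
    return logl, psi.sum()          # d l_T / d pi_m = Psi_T / Phi_T
```

A finite-difference check of the returned gradient against the log-likelihood itself confirms the identity $\fpp l_{T} = \Psi_{T}(\lambda_{1})/\Phi_{T}$ derived above.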
+
+Further, to actually compute the gradients, the partial derivatives of
+the parameters and observation distribution functions are necessary,
+i.e., $\fpp p_{k}$, $\fpp \pi_{i}$, $\fpp a_{ij}$, and $\fpp
+\vc{b}_{i}(\vc{O}_{t})$.  Only the latter case requires some
+attention.  We need the following derivatives $\fpp
+\vc{b}_{j}(\vc{O}_{t})=\fpp \vc{b}_{j}(O_{t}^{1}, \ldots, O_{t}^{m})$, for
+arbitrary parameters $\lambda_{1}$. To stress that $\vc{b}_{j}$ is a
+vector of functions, we here used boldface. First note that because of local
+independence we can write:
+\begin{equation*}
+	b_{j}(O_{t}^{1}, \ldots, O_{t}^{m}) = \prod_{l=1}^{m} 
+b_{j}(O_{t}^{l}).
+\end{equation*}
+Applying the chain rule for products we get:
+\begin{equation}
+	\fpp [b_{j}(O_{t}^{1}, \ldots, O_{t}^{m})] =
+	\sum_{l=1}^{m} \left[ \prod_{i=1, \ldots, \hat{l}, \ldots, m} 
+b_{j}(O_{t}^{i}) \right] \times
+	\fpp  [b_{j}(O_{t}^{l})],
+	\label{partialProd}
+\end{equation}
+where $\hat{l}$ means that that term is left out of the product. 
+These latter terms, $\frac{\partial} {\partial \lambda_{1} }  
+[b_{j}(O_{t}^{k})]$, are easy to compute given either multinomial or 
+gaussian observation densities $b_{j}(\cdot)$.
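Equation~(\ref{partialProd}) is the ordinary product rule written out for $m$ factors. A small sketch in plain Python, with arbitrary example functions standing in for the $b_{j}(O_{t}^{l})$:

```python
def product_derivative(fs, dfs, lam):
    """d/d lam of prod_l f_l(lam): for each l, multiply the derivative
    f_l'(lam) by the product of all the other factors (the hat-l term
    is left out of the product)."""
    total = 0.0
    for l, df in enumerate(dfs):
        prod_others = 1.0
        for i, f in enumerate(fs):
            if i != l:
                prod_others *= f(lam)
        total += prod_others * df(lam)
    return total
```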
+
+
+\section{Parameter estimation}
+
+Parameters are estimated in \pkg{depmix} using a direct optimization
+approach instead of the EM algorithm which is frequently used for this type
+of model.  The EM algorithm however has some drawbacks.  First, it can be
+slow to converge.  Second, applying constraints to parameters can be
+problematic.  The EM algorithm can sometimes lead to incorrect estimates
+when constraints are applied to parameters in the M-step of the algorithm.
+The package was designed to be used with the \code{npsol}-library, the
+main reason being that it handles general linear (in-)equality
+constraints very well. Unfortunately, \code{npsol} is not freeware and
+hence is not distributed with \pkg{depmix}. Two other options are
+available for optimization using  \code{nlm} and \code{optim}
+respectively. Linear equality constraints are fitted through
+reparametrization. Inequality constraints are fitted through adding a 
+penalty to the likelihood depending on the amount by which a
+constraint is not satisfied. The argument \code{vfactor} to the
+fitting function can be used to control this behavior. See details of 
+this in the chapter on fitting models. 
+
+\chapter{Using depmix}
+
+Three steps are involved in using \pkg{depmix} which are illustrated
+below with examples:
+\begin{enumerate}
+	\item  data specification with function \code{markovdata}
+	
+	\item  model specification with function \code{dmm}
+	
+	\item  model fitting with function \code{fitdmm}
+\end{enumerate}
+
+To be able to fit models, data need to be in a specific format created
+for this package.  Basically, data should be in the form of a matrix
+with each row corresponding to measures taken at a single measurement
+occasion for a single subject.  Beyond the data, the function
+\code{markovdata} only requires one further argument, the itemtypes,
+each being one of categorical, continuous or covariate.  A markovdata object is a matrix
+with a number of attributes.  
+
+\section{Creating data sets}
+
+As an example we make a dataset with two variables, measured in two
+series of 50 occasions each.
+
+<<>>=
+x=rnorm(100,10,2)
+y=ifelse(runif(100)<0.5,0,1)
+z=matrix(c(x,y),100,2)
+md=markovdata(z,itemtypes=c("cont","cat"),ntimes=c(50,50))
+md[1:10,]
+@
+
+In the example below, we split the dataset \code{speed} into three
+separate datasets, which we later use as an example to do multi-group
+analysis.
+
+<<>>=
+data(speed)
+r1=markovdata(dat=speed[1:168,],itemt=itemtypes(speed))
+r2=markovdata(dat=speed[169:302,],itemt=itemtypes(speed))
+r3=markovdata(dat=speed[303:439,],itemt=itemtypes(speed))
+summary(r2)
+@
+
+
+Here is the full specification of the \code{markovdata} function. 
+
+\input{markovdata}
+
+
+\section{Data set speed}
+
+Throughout this manual we will use a data set called speed, and hence
+we provide some background information on how these data were
+gathered.
+
+\input{speed}
+
+Interesting hypotheses to test are: is the switching regime symmetric?
+Is there evidence for two states or does one state suffice?  Is the
+guessing state actually a guessing state, i.e., is the probability
+correct at chance level of 0.5?
+
+
+\section{Defining models}
+
+A dependent mixture model is defined by the number of states, and by
+the item distribution functions, and can be created with the
+\code{dmm}-function as follows:
+
+<<>>= 
+mod <-dmm(nstates=2,itemtypes=c("gaus",2))
+summary(mod)
+@
+
+Here \code{itemtypes} is a vector, with length equal to the number of
+items measured at each occasion, specifying the desired distributions:
+in this case the first item follows a normal distribution, and the
+second item follows a Bernoulli distribution.  Allowable
+distributions are listed in Table~\ref{tbl:dist}, along with their
+internal code, and the parameter names.  The R-internal code is used
+for estimating these parameters.  Specifics of these distributions and
+their estimation can be found in their respective help files.
+Itemtypes can be specified by their name or by their internal code,
+except in the case of multinomial items, which have to be specified by
+a number.
+
+\begin{table}[tbp]
+	\centering
+	\begin{tabular}{lcc}
+		\hline
+		distribution & code & parameters \\
+		\hline
+		\hline
+		multinomial & $2,3,4, \ldots $ & $p_1, p_2, p_3, \ldots $  \\
+		\hline
+		gaussian, normal & 1  & $ \mu, \sigma $ \\
+		\hline
+		lognormal & -21 & $l\mu$ , $l\sigma$  \\
+		\hline
+		weibull & -22 & shape (a), scale (b) \\
+		\hline
+		gamma & -23 & shape (a), scale (s)  \\
+		\hline
+		3lognormal & -31 &  $l\mu$ , $l\sigma$, shift \\
+		\hline
+		3weibull & -32 & shape (a), scale (b), shift \\
+		\hline
+		3gamma & -33 & shape (a), scale (s), shift \\
+		\hline
+	\end{tabular}
+	\caption{Allowable distribution names, internal codes, and 
+	parameter names.}
+	\label{tbl:dist}
+\end{table}
+
+The function \code{dmm} returns an object of class \code{dmm} which
+has its own summary function providing the parameter values of the
+model. See the help files for further details. Except in simple 
+cases, starting values can be a problem in latent Markov models, and
+so in general it is best to provide them if you have a fairly good idea
+of what to expect. Starting values are provided through the \code{stval}
+argument: 
+
+<<>>=
+st <- c(1,0.9,0.1,0.2,0.8,2,1,0.7,0.3,5,2,0.2,0.8,0.5,0.5)
+mod <- dmm(nsta=2,itemt=c(1,2), stval=st)
+summary(mod)
+@
+
+\input{dmm}
+
+\subsection{Generating data}
+
+The \code{dmm}-class has a \code{generate} method that can be used to 
+generate data according to a specified model. 
+
+<<>>=
+gen<-generate(c(100,50),mod)
+summary(gen)
+@
+\begin{figure}[htbp]
+  \begin{center}
+<<fig=TRUE>>=
+plot(gen)
+@
+	\caption{Two time series generated by a 2-state model with one 
+	gaussian item and one binary item.}
+  \end{center}
+\end{figure}
+
+\input{generate}
+
+\section{Fitting models\label{fitmod}}
+
+Fitting models is done using the function \code{fitdmm}. The standard
+call only requires a dataset and a model as in:
+
+<<>>= 
+data(speed)
+mod <- dmm(nstates=2,itemtypes=c(1,2))
+fitex <- fitdmm(speed,mod)
+@
+
+Calling \code{fitdmm} produces some online output about the progress of
+the optimization which can be controlled with the \code{printlevel}
+argument.  Its default value of 1 just gives the first and the last
+iteration of the optimization; 0 gives no output, and setting it to 5
+or higher values will produce output at each iteration.  These values
+correspond to print levels 0, 1, and 2 of \code{nlm}.  When using
+\code{optim}, the \code{printlevel} argument is used to set the
+\code{REPORT} argument of \code{optim} (see its help page for
+details).  Printlevel 0 gives report 0, printlevel 1 gives report 10,
+printlevels 2--4 give report 5 and printlevel>4 gives report 1,
+producing output at every iteration.  Printlevels starting from 15 and
+higher produce increasingly annoying output from the C-routines that
+compute the loglikelihood.
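The mapping from \code{printlevel} to \code{optim}'s \code{REPORT} value described above can be summarized in a few lines; this is illustrative Python pseudocode of the stated rule, not \pkg{depmix} source:

```python
def optim_report(printlevel):
    """REPORT value passed to optim for a given printlevel, per the
    rule above: 0 -> 0, 1 -> 10, 2--4 -> 5, >4 -> 1 (every iteration)."""
    if printlevel == 0:
        return 0
    if printlevel == 1:
        return 10
    if printlevel <= 4:
        return 5
    return 1
```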
+
+\code{fitdmm} returns an object of class \code{fit} which has a
+summary method showing the estimated parameter values, along with
+standard errors, and t-ratios whenever those are available.  Along
+with the log-likelihood, the AIC and BIC values are provided.  Apart
+from the printed values (see summary below), a \code{fit}-object has a
+number of other fields.  Most importantly, it contains a copy of the
+fitted model in \code{mod} and it has a field \code{post} containing
+posterior state estimates.  That is, for each group $g$,
+\code{post\$states[[g]]} is a matrix of dimension (the number of
+states of the model + 2) by \code{sum(ntimes(dat))}.  The first
+column contains the a posteriori component model, the second column
+has the state number within the component, and the other columns are
+used for the a posteriori probabilities of each of the states.
+
+<<>>=
+summary(fitex)
+@
+
+
+\input{fitdmm}
+
+\chapter{Extending and constraining models}
+
+\section{Fixing and constraining parameters}
+
+Continuing the example from above, it can be seen that in one of the
+states, the probability of a correct answer is about .5, as is the
+probability of an incorrect answer, i.e., these are parameters Item2,p1 and
+Item2,p2. This latent state is supposed to be a guessing state, and hence 
+it makes sense to constrain these parameters to their theoretical values of 
+.5. Similarly, the initial state probability for the slow state is one, and
+zero for the other state, and hence it makes sense to fix these parameters.
+The third constraint that we consider here is an equality constraint
+between the transition parameters. Using this constraint, we can test the
+hypothesis that the switching between states is a symmetric process.
+Hence, we constrain the transition parameters $a_{11}$ and $a_{22}$ to be equal.
+
+Constraining and fixing parameters is done in a similar fashion as the
+\code{pa} command that is used in LISREL \citep{Joreskog1999}. The
[TRUNCATED]

To get the complete diff run:
    svnlook diff /svnroot/depmix -r 285

