[Depmix-commits] r285 - in papers: jss jss/graphs vignette
noreply at r-forge.r-project.org
Wed Jul 8 16:57:03 CEST 2009
Author: ingmarvisser
Date: 2009-07-08 16:57:02 +0200 (Wed, 08 Jul 2009)
New Revision: 285
Added:
papers/jss/README
papers/jss/article.tex
papers/jss/bookreview.tex
papers/jss/codesnippet.tex
papers/jss/depmix-intro.Rnw
papers/jss/depmixS4-jss.tex
papers/jss/graphs/speed1.pdf
papers/jss/jss.bst
papers/jss/jss.cls
papers/jss/jss.dbj
papers/jss/jss.drv
papers/jss/jss.dtx
papers/jss/jss.ins
papers/jss/jss.pdf
papers/jss/jsslogo.jpg
papers/jss/robKalman.Rnw
papers/jss/softwarereview.tex
papers/jss/toaddlater.tex
papers/vignette/depmix-intro.aux
papers/vignette/depmix-intro.log
papers/vignette/depmix-intro.pdf
papers/vignette/depmix-intro.toc
Removed:
papers/jss/depmix-manual.Rnw
papers/jss/dmm4.tex
papers/jss/graphs/stimuli.pdf
papers/jss/hmmlta.tex
papers/jss/information.tex
papers/vignette/toaddlater.tex
Modified:
papers/vignette/depmix-intro.tex
Log:
Moved around files in the papers directory
Added: papers/jss/README
===================================================================
--- papers/jss/README (rev 0)
+++ papers/jss/README 2009-07-08 14:57:02 UTC (rev 285)
@@ -0,0 +1,47 @@
+***************************************************
+** jss: A Document Class for Publications in the **
+** Journal of Statistical Software **
+***************************************************
+
+This zip-archive contains the pdfLaTeX infrastructure
+for publications in the Journal of Statistical Software.
+The files
+ - jss.cls (LaTeX2e class)
+ - jss.bst (BibTeX style)
+ - jsslogo.jpg (JPG logo)
+need to be included in your search path (local working
+directory, texmf or localtexmf tree).
+
+A manual describing how to use jss.cls is provided in
+ - jss.pdf.
+
+Furthermore, there are templates for articles, code
+snippets, book reviews and software reviews available
+in
+ - article.tex
+ - codesnippet.tex
+ - bookreview.tex
+ - softwarereview.tex
+
+The final version of JSS papers should be prepared using
+JSS styles; the submission of the final version needs to
+include the full sources (.tex, .bib, and all graphics).
+A quick check for the most important aspects of the JSS
+style is given below; authors should make sure that all
+of them are addressed in the final version:
+ - The manuscript can be compiled by pdfLaTeX.
+ - \proglang, \pkg and \code have been used for highlighting
+ throughout the paper (including references).
+ - References are provided in a .bib BibTeX database
+ and included in the text by \cite, \citep, \citet,
+ etc.
+ - Titles and headers are formatted as described in
+ the JSS manual:
+ - \title in title style,
+ - \section etc. in sentence style,
+ - all titles in the BibTeX file in title style.
+ - Figures, tables and equations are marked with a \label
+ and referred to by \ref, e.g., "Figure~\ref{...}".
+ - Software packages are \cite{}d properly.
+For more details, see the manual jss.pdf, in particular
+the style checklist in Section 2.1.
Property changes on: papers/jss/README
___________________________________________________________________
Name: svn:eol-style
+ native
Added: papers/jss/article.tex
===================================================================
--- papers/jss/article.tex (rev 0)
+++ papers/jss/article.tex 2009-07-08 14:57:02 UTC (rev 285)
@@ -0,0 +1,63 @@
+\documentclass[article]{jss}
+
+%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
+%% declarations for jss.cls %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
+%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
+
+%% almost as usual
+\author{Achim Zeileis\\Wirtschaftsuniversit\"at Wien \And
+ Second Author\\Plus Affiliation}
+\title{A Capitalized Title: Something About a Package \pkg{foo}}
+
+%% for pretty printing and a nice hypersummary also set:
+\Plainauthor{Achim Zeileis, Second Author} %% comma-separated
+\Plaintitle{A Capitalized Title: Something About a Package foo} %% without formatting
+\Shorttitle{A Capitalized Title} %% a short title (if necessary)
+
+%% an abstract and keywords
+\Abstract{
+ The abstract of the article.
+}
+\Keywords{keywords, comma-separated, not capitalized, \proglang{Java}}
+\Plainkeywords{keywords, comma-separated, not capitalized, Java} %% without formatting
+%% at least one keyword must be supplied
+
+%% publication information
+%% NOTE: Typically, this can be left commented and will be filled out by the technical editor
+%% \Volume{13}
+%% \Issue{9}
+%% \Month{September}
+%% \Year{2004}
+%% \Submitdate{2004-09-29}
+%% \Acceptdate{2004-09-29}
+
+%% The address of (at least) one author should be given
+%% in the following format:
+\Address{
+ Achim Zeileis\\
+ Department f\"ur Statistik \& Mathematik\\
+ Wirtschaftsuniversit\"at Wien\\
+ 1090 Wien, Austria\\
+ E-mail: \email{Achim.Zeileis at wu-wien.ac.at}\\
+ URL: \url{http://statmath.wu-wien.ac.at/~zeileis/}
+}
+%% It is also possible to add a telephone and fax number
+%% before the e-mail in the following format:
+%% Telephone: +43/1/31336-5053
+%% Fax: +43/1/31336-734
+
+%% for those who use Sweave please include the following line (with % symbols):
+%% need no \usepackage{Sweave.sty}
+
+%% end of declarations %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
+
+
+\begin{document}
+
+%% include your article here, just as usual
+%% Note that you should use the \pkg{}, \proglang{} and \code{} commands.
+
+\section[About Java]{About \proglang{Java}}
+%% Note: If there is markup in \(sub)section, then it has to be escaped as above.
+
+\end{document}
Property changes on: papers/jss/article.tex
___________________________________________________________________
Name: svn:eol-style
+ native
Added: papers/jss/bookreview.tex
===================================================================
--- papers/jss/bookreview.tex (rev 0)
+++ papers/jss/bookreview.tex 2009-07-08 14:57:02 UTC (rev 285)
@@ -0,0 +1,51 @@
+\documentclass[bookreview]{jss}
+
+%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
+%% declarations for jss.cls %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
+%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
+
+%% reviewer
+\Reviewer{Frederic Udina\\Pompeu Fabra University}
+\Plainreviewer{Frederic Udina}
+
+%% about the book
+\Booktitle{Visualizing Categorical Data}
+\Bookauthor{Michael Friendly}
+\Publisher{SAS Institute Inc.}
+\Pubaddress{Carey, NC}
+\Pubyear{2000}
+\ISBN{1-58025-660-0}
+\Pages{456}
+\Price{USD 69.95 (P)}
+\URL{http://www.math.yorku.ca/SCS/vcd/}
+%% if different from \Booktitle also set
+%% \Plaintitle{Visualizing Categorical Data}
+%% \Shorttitle{Visualizing Categorical Data}
+
+%% publication information
+%% NOTE: Typically, this can be left commented and will be filled out by the technical editor
+%% \Volume{11}
+%% \Issue{7}
+%% \Month{July}
+%% \Year{2004}
+%% \Submitdate{2004-07-26}
+
+%% address of (at least one) author
+\Address{
+ Frederic Udina\\
+ Pompeu Fabra University\\
+ Department of Economics and Business\\
+ Barcelona, Spain 08005\\
+ E-mail: \email{udina at upf.es}\\
+ URL: \url{http://libiya.upf.es/}
+}
+
+%% end of declarations %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
+
+
+\begin{document}
+
+%% include the review as usual
+%% Note that you should use the \pkg{}, \proglang{} and \code{} commands.
+
+\end{document}
Property changes on: papers/jss/bookreview.tex
___________________________________________________________________
Name: svn:eol-style
+ native
Added: papers/jss/codesnippet.tex
===================================================================
--- papers/jss/codesnippet.tex (rev 0)
+++ papers/jss/codesnippet.tex 2009-07-08 14:57:02 UTC (rev 285)
@@ -0,0 +1,64 @@
+\documentclass[codesnippet]{jss}
+
+%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
+%% declarations for jss.cls %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
+%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
+
+%% almost as usual
+\author{Achim Zeileis\\Wirtschaftsuniversit\"at Wien \And
+ Second Author\\Plus Affiliation}
+\title{A Capitalized Title: Possibly Containing More Details}
+
+%% for pretty printing and a nice hypersummary also set:
+\Plainauthor{Achim Zeileis, Second Author} %% comma-separated
+\Plaintitle{A Capitalized Title: Possibly Containing More Details} %% without formatting
+\Shorttitle{A Capitalized Title} %% a short title (if necessary)
+
+%% an abstract and keywords
+\Abstract{
+ Here should be the abstract.
+}
+\Keywords{keywords, comma-separated, not capitalized, \proglang{Java}}
+\Plainkeywords{keywords, comma-separated, not capitalized, Java} %% without formatting
+%% at least one keyword must be supplied
+
+%% publication information
+%% NOTE: Typically, this can be left commented and will be filled out by the technical editor
+%% \Volume{13}
+%% \Issue{9}
+%% \Month{September}
+%% \Year{2004}
+%% \Submitdate{2004-09-29}
+%% \Acceptdate{2004-09-29}
+
+%% The address of (at least) one author should be given
+%% in the following format:
+\Address{
+ Achim Zeileis\\
+ Department f\"ur Statistik \& Mathematik\\
+ Wirtschaftsuniversit\"at Wien\\
+ 1090 Wien, Austria\\
+ E-mail: \email{Achim.Zeileis at wu-wien.ac.at}\\
+ URL: \url{http://statmath.wu-wien.ac.at/~zeileis/}
+}
+%% It is also possible to add a telephone and fax number
+%% before the e-mail in the following format:
+%% Telephone: +43/1/31336-5053
+%% Fax: +43/1/31336-734
+
+%% for those who use Sweave please include the following line (with % symbols):
+%% need no \usepackage{Sweave.sty}
+
+%% end of declarations %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
+
+
+\begin{document}
+
+%% include your article here, just as usual
+%% Note that you should use the \pkg{}, \proglang{} and \code{} commands.
+
+\section[About Java]{About \proglang{Java}}
+%% Note: If there is markup in \(sub)section, then it has to be escaped as above.
+
+
+\end{document}
Property changes on: papers/jss/codesnippet.tex
___________________________________________________________________
Name: svn:eol-style
+ native
Added: papers/jss/depmix-intro.Rnw
===================================================================
--- papers/jss/depmix-intro.Rnw (rev 0)
+++ papers/jss/depmix-intro.Rnw 2009-07-08 14:57:02 UTC (rev 285)
@@ -0,0 +1,900 @@
+
+% -*- mode: noweb; noweb-default-code-mode: R-mode; -*-
+%&LaTeX
+
+\documentclass[]{report}
+
+\usepackage[]{amsmath, amsfonts, amstext, amsthm}
+
+%\batchmode
+
+\usepackage{Sweave}
+
+\usepackage{Rd}
+
+\usepackage[round]{natbib}
+
+\title{depmix: An R-package for fitting mixture models on mixed
+multivariate data with Markov dependencies}
+
+\author{Version: 0.9.1 \\ Ingmar Visser\thanks{Correspondence concerning this manual
+should be addressed to: Ingmar Visser, Department of Psychology,
+University of Amsterdam, Roetersstraat 15, 1018 WB, Amsterdam, The
+Netherlands}\\
+Developmental Processes Research Group\\
+Department of Psychology, University of Amsterdam\\
+i.visser at uva.nl}
+
+\date{\today}
+
+%%\VignetteIndexEntry{depmix: An R-package for fitting mixture models on mixed multivariate data with Markov dependencies}
+%%\VignetteDepends{depmix}
+%%\VignetteKeywords{R, latent/hidden Markov model, latent class model, finite mixture model, direct optimization, gradients}
+%%\VignettePackage{depmix}
+
+\SweaveOpts{prefix.string=graphs/depmix}
+
+\newcommand{\vc}{\mathbf}
+\newcommand{\mat}{\mathbf}
+
+% \newcommand{\pkg}{\texttt}
+% \newcommand{\code}{\texttt}
+
+\begin{document}
+
+\bibliographystyle{plainnat}
+
+\maketitle
+
+\begin{abstract}
+ \pkg{depmix} implements a general class of mixture models with
+ Markovian dependencies between them in the R programming language
+ \citep{R2006}. This includes standard Markov models,
+ latent/hidden Markov models, and latent class and finite mixture
+ distribution models. The models can be fitted to mixed
+ multivariate data from a number of distributions including the
+ binomial, multinomial, gaussian, lognormal and Weibull
+ distributions. Parameters can be estimated subject to general
+ linear constraints, and with optional inclusion of regression on
+ (time-dependent) covariates. Parameter estimation is done
+ through a direct optimization approach with gradients using the
+ \code{nlm} optimization routine. Optional support for using the NPSOL
+ optimization routines is provided as well. A number of illustrative
+ examples are included.
+\end{abstract}
+
+\tableofcontents
+
+\chapter{Introduction}
+
+Markov and latent Markov models are frequently used in the social
+sciences, in different areas and applications. In psychology, they
+are used for modelling learning processes; see \citet{Wickens1982} for an
+overview, and \citet{Schmittmann2005} for a recent application. In
+economics, latent Markov models are commonly used as regime switching
+models, see e.g.\ \citet{Kim1994} and \citet{Ghysels1994}. Further
+applications include speech recognition \citep{Rabiner1989}, EEG analysis
+\citep{Rainer2000}, and genetics \citep{Krogh1998}. In those latter areas
+of application, latent Markov models are usually referred to as hidden
+Markov models.
+
+The \pkg{depmix} package was motivated by the fact that Markov models
+are used commonly in the social sciences, but no comprehensive package
+was available for fitting such models. Common programs for Markovian
+models include Panmark \citep{Pol1996}, and for latent class models
+Latent Gold
+\citep{Vermunt2003}. Those programs lack a number of
+important features, besides not being freely available. In
+particular, \pkg{depmix}: 1) handles multiple case, or multiple group,
+analysis; 2) handles arbitrarily long time series; 3) estimates models
+with general linear constraints between parameters; 4) analyzes mixed
+distributions, i.e., combinations of categorical and continuous
+observed variables; 5) fits mixtures of latent Markov models to deal
+with population heterogeneity; 6) can fit models with covariates.
+Although \pkg{depmix} is specifically meant for dealing with
+longitudinal or time series data, for say $T>100$, it can also handle
+the limit case with $T=1$. In those cases, there are no time
+dependencies between observed data, and the model reduces to a finite
+mixture model, or a latent class model. In the next chapter, an outline
+is provided of the model and the likelihood equations. In the
+chapters after that a number of examples are presented.
+
+
+\section*{Acknowledgements}
+
+Development of this package was supported by European Commission grant
+51652 (NEST) and by a VENI grant from the Dutch Organization for Scientific
+Research (NWO) to Ingmar Visser.
+
+I am indebted to many people for providing help in writing this package.
+First and foremost Maartje Raijmakers and Verena Schmittmann tested
+countless earlier versions, spotted bugs and suggested many features.
+Moreover, Maartje Raijmakers provided the discrimination data set. Han van
+der Maas provided the speed-accuracy data and thereby necessitated
+implementing models with time-dependent covariates. Conor Dolan and Raoul
+Grasman both provided valuable advice on statistics in general and
+optimization in particular.
+
+
+\chapter{Dependent mixture models}
+
+The data considered here have the general form $O_{1}^{1}, \ldots,
+O_{1}^{m}$, $O_{2}^{1}, \ldots, O_{2}^{m}$, \ldots, $O_{T}^{1}, \ldots,
+O_{T}^{m}$ for an $m$-variate time series of length $T$. As an
+example, consider a time series of responses generated by a single
+subject in a reaction time experiment. The data consist of three
+variables: reaction time, accuracy, and a covariate, a pay-off factor
+that determines the reward for speed versus accuracy. The variables
+are measured in three series of 168, 134 and 137 occasions
+respectively. Below, a summary is provided for these data, as well as
+a plot of the first time series, which is selected by \code{nind=1}.
+
+<<echo=FALSE, results=hide>>=
+.libPaths(new="/Users/ivisser/Library/R/library/")
+library(depmix)
+load("models.Rda")
+@
+
+<<>>=
+data(speed)
+summary(speed)
+@
+\begin{figure}[htbp]
+ \begin{center}
+<<fig=TRUE>>=
+plot(speed,nind=1)
+@
+ \caption{Reaction times, accuracy and pay-off values for
+ the first series of responses in dataset \texttt{speed}.}
+ \end{center}
+\end{figure}
+
+The latent Markov model is commonly associated with data of this type,
+although usually only multinomial variables are considered. However,
+common estimation procedures, such as those implemented in
+\citet{Pol1996} are not suitable for long time series due to
+underflow problems. In contrast, the hidden Markov model is typically
+only used for `long' univariate time series. In the next section, the
+likelihood and estimation procedure for the hidden Markov model is
+described, given data of the above form.
+
+The dependent mixture model is defined by the following elements:
+\begin{enumerate}
+
+ \item a set $\vc{S}$ of latent classes or states $S_{i},\, i=1,
+ \ldots , n$,
+
+ \item a matrix $\mat{A}$ of transition probabilities $a_{ij}$ for
+ the transition from state $S_{i}$ to state $S_{j}$,
+
+ \item a set $\vc{B}$ of observation functions $b_{j}(\cdot)$ that
+ provide the conditional probabilities associated with latent state
+ $S_{j}$,
+
+ \item a vector $\pmb{\pi}$ of latent state initial probabilities
+ $\pi_{i}$.
+\end{enumerate}
+When transitions are added to the latent class model, it is more
+appropriate to refer to the classes as states. The word class is
+associated rather with a stable, trait-like attribute, whereas a
+state can change over time.
+
+
+\section{Likelihood}
+
+The loglikelihood of hidden Markov models is usually computed by the
+so-called forward-backward algorithm \citep{Baum1966,Rabiner1989}, or rather by
+the forward part of this algorithm. \cite{Lystig2002} changed the forward
+algorithm in such a way as to allow computing the gradients of the
+loglikelihood at the same time. They start by rewriting the
+likelihood as follows (for ease of exposition the dependence on the
+model parameters is dropped here):
+\begin{equation}
+ L_{T} = Pr(\vc{O}_{1}, \ldots, \vc{O}_{T}) = \prod_{t=1}^{T}
+Pr(\vc{O}_{t}|\vc{O}_{1},
+ \ldots, \vc{O}_{t-1}),
+ \label{condLike}
+\end{equation}
+where $Pr(\vc{O}_{1}|\vc{O}_{0}):=Pr(\vc{O}_{1})$. Note that for a
+simple, i.e.\ observed, Markov chain these probabilities reduce to
+$Pr(\vc{O}_{t}|\vc{O}_{1},\ldots,
+\vc{O}_{t-1})=Pr(\vc{O}_{t}|\vc{O}_{t-1})$.
+The log-likelihood can now be expressed as:
+\begin{equation}
+ l_{T} = \sum_{t=1}^{T} \log[Pr(\vc{O}_{t}|\vc{O}_{1}, \ldots,
+\vc{O}_{t-1})].
+ \label{eq:condLogl}
+\end{equation}
+
+To compute the log-likelihood, \cite{Lystig2002} define the following
+(forward) recursion:
+\begin{align}
+ \phi_{1}(j) &:= Pr(\vc{O}_{1}, S_{1}=j) = \pi_{j} b_{j}(\vc{O}_{1})
+ \label{eq:fwd1} \\
+\begin{split}
+ \phi_{t}(j) &:= Pr(\vc{O}_{t}, S_{t}=j|\vc{O}_{1}, \ldots,
+\vc{O}_{t-1}) \\
+ &= \sum_{i=1}^{N} [\phi_{t-1}(i)a_{ij}b_{j}(\vc{O}_{t})] \times
+(\Phi_{t-1})^{-1},
+ \label{eq:fwdt}
+\end{split}
+\end{align}
+where $\Phi_{t}=\sum_{i=1}^{N} \phi_{t}(i)$. Combining
+$\Phi_{t}=Pr(\vc{O}_{t}|\vc{O}_{1}, \ldots, \vc{O}_{t-1})$ and
+equation~(\ref{eq:condLogl}) gives the following expression for the
+log-likelihood:
+\begin{equation}
+ l_{T} = \sum_{t=1}^{T} \log \Phi_{t}.
+ \label{eq:logl}
+\end{equation}
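+
+As an illustration (a minimal sketch in plain R, not the routines used
+internally by \pkg{depmix}), the scaled forward recursion of
+equations~(\ref{eq:fwd1}--\ref{eq:fwdt}) and the log-likelihood of
+equation~(\ref{eq:logl}) can be computed as follows for a model with a
+single categorical item:
+
+<<eval=FALSE>>=
+## pi: initial probabilities; A: transition matrix with elements a_ij;
+## B: observation probabilities b_j(o) with states in rows and response
+## categories in columns; obs: integer-coded response sequence
+fwd.logl <- function(pi,A,B,obs) {
+	phi <- pi*B[,obs[1]]              # phi_1(j) = pi_j b_j(O_1)
+	logl <- 0
+	for(t in 2:length(obs)) {
+		Phi <- sum(phi)               # Phi_{t-1}, the scaling factor
+		logl <- logl + log(Phi)
+		phi <- as.vector(crossprod(A,phi/Phi))*B[,obs[t]]
+	}
+	logl + log(sum(phi))              # add the final term log(Phi_T)
+}
+A <- matrix(c(0.9,0.1,0.2,0.8),2,2,byrow=TRUE)
+B <- matrix(c(0.5,0.5,0.1,0.9),2,2,byrow=TRUE)
+fwd.logl(pi=c(1,0),A=A,B=B,obs=c(1,2,2,1))
+@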
+
+The above forward recursion can readily be generalized to mixture
+models, in which it is assumed that the data are realizations of a
+number of different latent Markov models (LMMs) and the goal is to assign posterior
+probabilities to sequences of observations. This situation occurs,
+for example, in learning data where different learning strategies may
+lead to different answer patterns. From an observed sequence of
+responses, it may not be immediately clear from which learning
+process they stem. Hence, it is interesting to consider a mixture of
+latent Markov models which incorporate restrictions that are
+consistent with each of the learning strategies.
+
+To compute the likelihood of a mixture of $K$ models, define the
+forward recursion variables as follows (these variables now have an
+extra index $k$ indicating that observation and transition
+probabilities are from latent model $k$):
+\begin{align}
+\begin{split}
+\phi_{1}(j_{k}) &= Pr(\vc{O}_{1},
+S_{1}=j_{k})=p_{k}\pi_{j_{k}}b_{j_{k}}(\vc{O}_{1}).
+\end{split}\label{eq:fwd1mix} \\
+\begin{split}
+\phi_{t}(j_{k}) &= Pr(\vc{O}_{t}, S_{t}=j_{k}|\vc{O}_{1}, \ldots,
+\vc{O}_{t-1}) \\
+ &= \left[ \sum_{k=1}^{K} \sum_{i=1}^{n_{k}} \phi_{t-1}(i_{k})
+ a_{ij_{k}}b_{j_{k}}(\vc{O}_{t}) \right] \times (\Phi_{t-1})^{-1},
+\end{split}\label{eq:fwdtmix}
+\end{align}
+where $\Phi_{t} = \sum_{k=1}^{K}\sum_{j=1}^{n_{k}} \phi_{t}(j_{k})$.
+Note that the double sum over $k$ and $n_{k}$ is simply an enumeration
+of all the states of the model. Now, because $a_{ij_{k}}=0$ whenever
+$S_{i}$ is not part of component $k$, the sum over $k$ can be dropped
+and hence equation~\ref{eq:fwdtmix} reduces to:
+\begin{equation}
+ \phi_{t}(j_{k}) = \left[ \sum_{i=1}^{n_{k}} \phi_{t-1}(i_{k})
+ a_{ij_{k}}b_{j_{k}}(\vc{O}_{t}) \right] \times (\Phi_{t-1})^{-1}
+\end{equation}
+The loglikelihood is computed by applying equation~\ref{eq:logl} to
+these terms. For multiple cases, the log-likelihood is simply the sum
+over the individual log-likelihoods.
+
+\paragraph{Computational considerations} From
+equations~(\ref{eq:fwd1}--\ref{eq:fwdt}), it can be seen that
+computing the forward variables, and hence the log-likelihood, takes
+$O(Tn^{2})$ computations, for an $n$-state model and a time series of
+length $T$. Consider a mixture of two components, one with two states
+and the other with three states. Using
+equations~(\ref{eq:fwd1}--\ref{eq:fwdt}) to compute the log-likelihood
+of this model one needs $O(Tn^{2})=O(T\times 25)$ computations whereas
+with the mixture equations~(\ref{eq:fwd1mix}--\ref{eq:fwdtmix}),
+$\sum_{i} O(n_{i}^{2}T)$ computations are needed, in this case
+$O(T \times 13)$. So, it can be seen that in this easy example the
+computational cost is almost halved.
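+
+To illustrate (a schematic construction only, not the internal
+representation used by \pkg{depmix}), the transition matrix of such a
+two-component mixture is block-diagonal, and the zero blocks are what
+reduce the computational cost:
+
+<<eval=FALSE>>=
+## a 2-state and a 3-state component combined into one 5-state model
+A1 <- matrix(c(0.9,0.1,
+               0.2,0.8),2,2,byrow=TRUE)
+A2 <- matrix(c(0.8,0.1,0.1,
+               0.1,0.8,0.1,
+               0.1,0.1,0.8),3,3,byrow=TRUE)
+A <- rbind(cbind(A1,matrix(0,2,3)),
+           cbind(matrix(0,3,2),A2))
+@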
+
+
+\section{Gradients}
+
+\newcommand{\fpp}{\frac{\partial} {\partial \lambda_{1}}}
+
+See equations 10--12 in \cite{Lystig2002} for the score recursion
+functions of the hidden Markov model for a univariate time series.
+Here the corresponding score recursion for the multivariate mixture
+case is provided. The $t=1$ components of this score recursion are
+defined as (for an arbitrary parameter $\lambda_{1}$):
+\begin{align}
+\psi_{1}(j_{k};\lambda_{1}) &:= \fpp Pr(\vc{O}_{1}, S_{1}=j_{k}) \\
+\begin{split}
+ &= \left[ \fpp p_{k} \right] \pi_{j_{k}}b_{j_{k}}(\vc{O}_{1}) +
+ p_{k}\left[ \fpp \pi_{j_{k}} \right] b_{j_{k}}(\vc{O}_{1}) \\
+ & \qquad + p_{k}\pi_{j_{k}} \left[ \fpp
+b_{j_{k}}(\vc{O}_{1})\right],
+\end{split} \label{eq:psi1}
+\end{align}
+and for $t>1$ the definition is:
+\begin{align}
+\psi_{t}(j_{k};\lambda_{1}) & = \frac{\fpp Pr(\vc{O}_{1}, \ldots,
+\vc{O}_{t}, S_{t}=j_{k})}
+ {Pr(\vc{O}_{1}, \ldots, \vc{O}_{t-1})} \\
+\begin{split}
+ & =
+ \sum_{i=1}^{n_{k}} \Bigg\{ \psi_{t-1}(i;\lambda_{1})a_{ij_{k}}
+ b_{j_{k}}(\vc{O}_{t}) \\
+ &\qquad +\phi_{t-1}(i) \left[ \fpp a_{ij_{k}} \right] b_{j_{k}}
+(\vc{O}_{t}) \\
+ &\qquad +\phi_{t-1}(i)a_{ij_{k}} \left[ \fpp b_{j_{k}}
+(\vc{O}_{t}) \right] \Bigg\}
+ \times (\Phi_{t-1})^{-1}.
+\end{split} \label{eq:psit}
+\end{align}
+
+Using the above equations, \cite{Lystig2002} derive the following
+equation for the partial derivative of the log-likelihood:
+\begin{equation}
+ \fpp l_{T}=
+ \frac{\mathbf{\Psi}_{T}(\lambda_{1})}{\mathbf{\Phi}_{T}},
+\end{equation}
+where $\Psi_{t}=\sum_{k=1}^{K} \sum_{j=1}^{n_{k}}
+\psi_{t}(j_{k};\lambda_{1})$.
+Starting from the expression for the logarithm of the likelihood, this
+is easily seen to be correct:
+\begin{eqnarray*}
+ \fpp \log Pr(\vc{O}_{1}, \ldots, \vc{O}_{T}) &=& Pr(\vc{O}_{1},
+\ldots, \vc{O}_{T})^{-1}
+ \fpp Pr(\vc{O}_{1}, \ldots, \vc{O}_{T}) \\
+ &=& \frac{Pr(\vc{O}_{1}, \ldots, \vc{O}_{T-1})}{Pr(\vc{O}_{1},
+\ldots, \vc{O}_{T})} \Psi_{T} (\lambda_{1}) \\
+ &=& \frac{\mathbf{\Psi}_{T}(\lambda_{1})}{\mathbf{\Phi}_{T}}.
+\end{eqnarray*}
+
+Further, to actually compute the gradients, the partial derivatives of
+the parameters and observation distribution functions are necessary,
+i.e., $\fpp p_{k}$, $\fpp \pi_{i}$, $\fpp a_{ij}$, and $\fpp
+\vc{b}_{i}(\vc{O}_{t})$. Only the latter case requires some
+attention. We need the following derivatives $\fpp
+\vc{b}_{j}(\vc{O}_{t})=\fpp \vc{b}_{j}(O_{t}^{1}, \ldots, O_{t}^{m})$, for
+arbitrary parameters $\lambda_{1}$. To stress that $\vc{b}_{j}$ is a
+vector of functions, we use boldface here. First note that because of local
+independence we can write:
+\begin{equation*}
+  b_{j}(O_{t}^{1}, \ldots, O_{t}^{m}) = b_{j}(O_{t}^{1}) \times
+  b_{j}(O_{t}^{2}) \times \cdots \times b_{j}(O_{t}^{m}).
+\end{equation*}
+Applying the chain rule for products we get:
+\begin{equation}
+ \fpp [b_{j}(O_{t}^{1}, \ldots, O_{t}^{m})] =
+ \sum_{l=1}^{m} \left[ \prod_{i=1, \ldots, \hat{l}, \ldots, m}
+b_{j}(O_{t}^{i}) \right] \times
+ \fpp [b_{j}(O_{t}^{l})],
+ \label{partialProd}
+\end{equation}
+where $\hat{l}$ means that that term is left out of the product.
+These latter terms, $\frac{\partial} {\partial \lambda_{1} }
+[b_{j}(O_{t}^{l})]$, are easy to compute given either multinomial or
+gaussian observation densities $b_{j}(\cdot)$.
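+
+For example, for a univariate gaussian observation density with
+parameters $\mu_{j}$ and $\sigma_{j}$ (a standard result, stated here
+for illustration):
+\begin{equation*}
+  b_{j}(O_{t}) = \frac{1}{\sigma_{j}\sqrt{2\pi}}
+  \exp\left(-\frac{(O_{t}-\mu_{j})^{2}}{2\sigma_{j}^{2}}\right),
+  \qquad
+  \frac{\partial}{\partial \mu_{j}}\, b_{j}(O_{t}) =
+  b_{j}(O_{t})\,\frac{O_{t}-\mu_{j}}{\sigma_{j}^{2}}.
+\end{equation*}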
+
+
+\section{Parameter estimation}
+
+Parameters are estimated in \pkg{depmix} using a direct optimization
+approach instead of the EM algorithm that is frequently used for this type
+of model. The EM algorithm, however, has some drawbacks. First, it can be
+slow to converge. Second, applying constraints to parameters can be
+problematic. The EM algorithm can sometimes lead to incorrect estimates
+when constraints are applied to parameters in the M-step of the algorithm.
+The package was designed to be used with the \code{npsol}-library, the
+main reason being that it handles general linear (in-)equality
+constraints very well. Unfortunately, \code{npsol} is not freeware and
+hence is not distributed with \pkg{depmix}. Two other options are
+available for optimization using \code{nlm} and \code{optim}
+respectively. Linear equality constraints are fitted through
+reparametrization. Inequality constraints are fitted by adding a
+penalty to the likelihood depending on the amount by which a
+constraint is not satisfied. The argument \code{vfactor} to the
+fitting function can be used to control this behavior. Details are
+given in the chapter on fitting models.
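+
+Schematically (a sketch of the general idea only; the exact penalty
+used internally by \pkg{depmix} may differ), the penalized objective
+for inequality constraints $g(\theta) \geq 0$ looks as follows:
+
+<<eval=FALSE>>=
+## negloglik: function returning the negative loglikelihood;
+## g: function returning the constraint values (required to be >= 0);
+## vfactor plays the role of the fitdmm argument of the same name
+penalized <- function(pars,negloglik,g,vfactor=10) {
+	viol <- pmin(g(pars),0)  # amount by which constraints are violated
+	negloglik(pars) + vfactor*sum(viol^2)
+}
+@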
+
+\chapter{Using depmix}
+
+Three steps are involved in using \pkg{depmix} which are illustrated
+below with examples:
+\begin{enumerate}
+ \item data specification with function \code{markovdata}
+
+ \item model specification with function \code{dmm}
+
+ \item model fitting with function \code{fitdmm}
+\end{enumerate}
+
+To be able to fit models, data need to be in a specific format created
+for this package. Basically, data should be in the form of a matrix
+with each row corresponding to measures taken at a single measurement
+occasion for a single subject. Apart from the data, the function
+\code{markovdata} requires only one further argument specifying the
+itemtypes, each being one of categorical, continuous, or covariate. A
+markovdata object is a matrix with a number of attributes.
+
+\section{Creating data sets}
+
+As an example we make a dataset with two variables measured in two
+series of 50 occasions each.
+
+<<>>=
+x=rnorm(100,10,2)
+y=ifelse(runif(100)<0.5,0,1)
+z=matrix(c(x,y),100,2)
+md=markovdata(z,itemtypes=c("cont","cat"),ntimes=c(50,50))
+md[1:10,]
+@
+
+In the example below, we split the dataset \code{speed} into three
+separate datasets, which we later use as an example to do multi-group
+analysis.
+
+<<>>=
+data(speed)
+r1=markovdata(dat=speed[1:168,],itemt=itemtypes(speed))
+r2=markovdata(dat=speed[169:302,],itemt=itemtypes(speed))
+r3=markovdata(dat=speed[303:439,],itemt=itemtypes(speed))
+summary(r2)
+@
+
+
+Here is the full specification of the \code{markovdata} function.
+
+\input{markovdata}
+
+
+\section{Data set speed}
+
+Throughout this manual we will use a data set called \code{speed}, and hence
+we provide some background information on how these data were
+gathered.
+
+\input{speed}
+
+Interesting hypotheses to test are: is the switching regime symmetric?
+Is there evidence for two states or does one state suffice? Is the
+guessing state actually a guessing state, i.e., is the probability of
+a correct response at the chance level of 0.5?
+
+
+\section{Defining models}
+
+A dependent mixture model is defined by the number of states, and by
+the item distribution functions, and can be created with the
+\code{dmm}-function as follows:
+
+<<>>=
+mod <- dmm(nstates=2,itemtypes=c("gaus",2))
+summary(mod)
+@
+
+Here \code{itemtypes} is a vector whose length equals the number of
+items measured at each occasion, specifying the desired distributions;
+in this case the first item is to follow a normal distribution, and the
+second item a Bernoulli distribution. Allowable
+distributions are listed in Table~\ref{tbl:dist}, along with their
+internal code, and the parameter names. The R-internal code is used
+for estimating these parameters. Specifics of these distributions and
+their estimation can be found in their respective help files.
+Itemtypes can be specified by their name or by their internal code,
+except in the case of multinomial items, which have to be specified by
+a number.
+
+\begin{table}[tbp]
+ \centering
+ \begin{tabular}{lcc}
+ \hline
+ distribution & code & parameters \\
+ \hline
+ \hline
+ multinomial & $2,3,4, \ldots $ & $p_1, p_2, p_3, \ldots $ \\
+ \hline
+ gaussian, normal & 1 & $ \mu, \sigma $ \\
+ \hline
+ lognormal & -21 & $l\mu$ , $l\sigma$ \\
+ \hline
+ weibull & -22 & shape (a), scale (b) \\
+ \hline
+ gamma & -23 & shape (a), scale (s) \\
+ \hline
+ 3lognormal & -31 & $l\mu$ , $l\sigma$, shift \\
+ \hline
+ 3weibull & -32 & shape (a), scale (b), shift \\
+ \hline
+ 3gamma & -33 & shape (a), scale (s), shift \\
+ \hline
+ \end{tabular}
+ \caption{Allowable distribution names, internal codes, and
+ parameter names.}
+ \label{tbl:dist}
+\end{table}
+
+The function \code{dmm} returns an object of class \code{dmm} which
+has its own summary function providing the parameter values of the
+model. See the help files for further details. Except in simple
+cases, starting values can be a problem in latent Markov models, and
+so in general it's best to provide them if you have a fairly good idea
+of what to expect. Providing starting values is done through the \code{stval}
+argument:
+
+<<>>=
+st <- c(1,0.9,0.1,0.2,0.8,2,1,0.7,0.3,5,2,0.2,0.8,0.5,0.5)
+mod <- dmm(nsta=2,itemt=c(1,2), stval=st)
+summary(mod)
+@
+
+\input{dmm}
+
+\subsection{Generating data}
+
+The \code{dmm}-class has a \code{generate} method that can be used to
+generate data according to a specified model.
+
+<<>>=
+gen<-generate(c(100,50),mod)
+summary(gen)
+@
+\begin{figure}[htbp]
+ \begin{center}
+<<fig=TRUE>>=
+plot(gen)
+@
+ \caption{Two time series generated by a 2-state model with one
+ gaussian item and one binary item.}
+ \end{center}
+\end{figure}
+
+\input{generate}
+
+\section{Fitting models\label{fitmod}}
+
+Fitting models is done using the function \code{fitdmm}. The standard
+call only requires a dataset and a model as in:
+
+<<>>=
+data(speed)
+mod <- dmm(nstates=2,itemtypes=c(1,2))
+fitex <- fitdmm(speed,mod)
+@
+
+Calling \code{fitdmm} produces some output about the progress of
+the optimization which can be controlled with the \code{printlevel}
+argument. Its default value of 1 just gives the first and the last
+iteration of the optimization; 0 gives no output, and setting it to 5
+or higher values will produce output at each iteration. These values
+correspond to the printlevels 0, 1, and 2 of \code{nlm}. When using
+\code{optim}, the \code{printlevel} argument is used to set the
+\code{REPORT} argument of \code{optim} (see its help page for
+details). Printlevel 0 gives report 0, printlevel 1 gives report 10,
+printlevels 2--4 give report 5 and printlevel>4 gives report 1,
+producing output at every iteration. Printlevels of 15 and higher
+produce increasingly verbose output from the C routines that
+compute the loglikelihood.
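+
+For example (an illustrative call using the \code{printlevel} argument
+described above), all optimization output can be suppressed with:
+
+<<eval=FALSE>>=
+fitquiet <- fitdmm(speed,mod,printlevel=0)
+@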
+
+The function \code{fitdmm} returns an object of class \code{fit} which has a
+summary method showing the estimated parameter values, along with
+standard errors, and t-ratios whenever those are available. Along
+with the log-likelihood, the AIC and BIC values are provided. Apart
+from the printed values (see summary below), a \code{fit}-object has a
+number of other fields. Most importantly, it contains a copy of the
+fitted model in \code{mod} and it has a field \code{post} containing
+posterior state estimates. That is, for each group $g$,
+\code{post\$states[[g]]} is a matrix with dimensions the number of
+states of the model + 2, and \code{sum(ntimes(dat))}. The first
+column contains the a posteriori component model, the second column
+has the state number within the component, and the other columns are
+used for the a posteriori probabilities of each of the states.
+
+<<>>=
+summary(fitex)
+@
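+
+As an illustration (assuming the \code{post\$states} layout described
+above), the posterior states for the first group can be inspected as
+follows:
+
+<<eval=FALSE>>=
+post1 <- fitex$post$states[[1]]
+## second column: the state number within the component
+table(post1[,2])
+@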
+
+
+\input{fitdmm}
+
+\chapter{Extending and constraining models}
+
+\section{Fixing and constraining parameters}
+
+Continuing the example from above, it can be seen that in one of the
+states, the probability of a correct answer is about .5, as is the
+probability of an incorrect answer, i.e., these are parameters Item2,p1 and
+Item2,p2. This latent state, is supposed to be a guessing state, and hence
+it makes sense to constrain these parameters to their theoretical values of
+.5. Similarly, the initial state probability for the slow state is one, and
+zero for the other state, and hence it makes sense to fix these parameters.
+The third constraint that we consider here is an equality constraint
+between the transition parameters. Using this constraint, we can test the
+hypothesis whether the switching between states is a symmetric process or
+not. Hence, we constrain the transition parameters $a_{11}$ and $a_{22}$.
+
+Constraining and fixing parameters is done in a fashion similar to the
+\code{pa} command that is used in LISREL \citep{Joreskog1999}. The
[TRUNCATED]
To get the complete diff run:
svnlook diff /svnroot/depmix -r 285
More information about the depmix-commits
mailing list