[Depmix-commits] r592 - in pkg/depmixS4: . inst/doc vignettes

noreply at r-forge.r-project.org noreply at r-forge.r-project.org
Tue Sep 17 09:29:15 CEST 2013


Author: ingmarvisser
Date: 2013-09-17 09:29:14 +0200 (Tue, 17 Sep 2013)
New Revision: 592

Added:
   pkg/depmixS4/vignettes/
   pkg/depmixS4/vignettes/baldist.pdf
   pkg/depmixS4/vignettes/depmixS4.Rnw
   pkg/depmixS4/vignettes/depmixS4.bib
   pkg/depmixS4/vignettes/depmixS4.pdf
Removed:
   pkg/depmixS4/inst/doc/baldist.pdf
   pkg/depmixS4/inst/doc/depmixS4.Rnw
   pkg/depmixS4/inst/doc/depmixS4.bib
   pkg/depmixS4/inst/doc/depmixS4.pdf
Log:
Moved vignette to vignettes directory.

Deleted: pkg/depmixS4/inst/doc/baldist.pdf
===================================================================
(Binary files differ)

Deleted: pkg/depmixS4/inst/doc/depmixS4.Rnw
===================================================================
--- pkg/depmixS4/inst/doc/depmixS4.Rnw	2013-09-17 07:28:08 UTC (rev 591)
+++ pkg/depmixS4/inst/doc/depmixS4.Rnw	2013-09-17 07:29:14 UTC (rev 592)
@@ -1,1121 +0,0 @@
-\documentclass[article,nojss]{jss}
-\usepackage{amsmath}
-\graphicspath{{Figures/}}
-
-\newcommand{\vc}{\mathbf}
-\newcommand{\mat}{\mathbf}
-\newcommand{\greekv}[1]{\mbox{\boldmath$\mathrm{#1}$}}
-\newcommand{\greekm}[1]{\mbox{\boldmath$#1$}}
-
-%% \usepackage{Sweave}
-\SweaveOpts{engine = R, echo = TRUE, keep.source = TRUE, eps = FALSE}
-
-%\VignetteIndexEntry{depmixS4: An R Package for Hidden Markov Models}
-%\VignetteDepends{depmixS4,gamlss,gamlss.dist}
-%\VignetteKeywords{hidden Markov model, dependent mixture model, mixture model, constraints}
-%\VignettePackage{depmixS4}
-
-\author{Ingmar Visser\\University of Amsterdam \And 
-        Maarten Speekenbrink\\University College London}
-\Plainauthor{Ingmar Visser, Maarten Speekenbrink}
-
-\title{\pkg{depmixS4}: An \proglang{R} Package for Hidden Markov Models}
-\Plaintitle{depmixS4: An R Package for Hidden Markov Models}
-
-\Abstract{	
-
-	This introduction to the \proglang{R} package \pkg{depmixS4} is a
-	(slightly) modified version of \cite{Visser+Speekenbrink:2010},
-	published in the \emph{Journal of Statistical Software}.  Please
-	refer to that article when using \pkg{depmixS4}.  The current
-	version is 1.3-0; the version history and changes can be found in
-	the NEWS file of the package. Below, the major versions are listed 
-	along with the most noteworthy changes. 
-
-	\pkg{depmixS4} implements a general framework for defining and
-	estimating dependent mixture models in the \proglang{R} programming
-	language.  This includes standard Markov models, latent/hidden
-	Markov models, and latent class and finite mixture distribution
-	models.  The models can be fitted on mixed multivariate data with
-	distributions from the \code{glm} family, the (logistic)
-	multinomial, or the multivariate normal distribution.  Other
-	distributions can be added easily, and an example is provided with
-	the {\em exgaus} distribution.  Parameters are estimated by the
-	expectation-maximization (EM) algorithm or, when (linear)
-	constraints are imposed on the parameters, by direct numerical
-	optimization with the \pkg{Rsolnp} or \pkg{Rdonlp2} routines.  
-	
-}
-
-\Keywords{hidden Markov model, dependent mixture model, mixture model, constraints}
-
-\Volume{36}
-\Issue{7}
-\Month{August}
-\Year{2010}
-\Submitdate{2009-08-19}
-\Acceptdate{2010-06-21}
-
-\Address{
-  Ingmar Visser\\
-  Department of Psychology\\
-  University of Amsterdam\\
-  Roetersstraat 15\\
-  1018 WB, Amsterdam, The Netherlands\\
-  E-mail: \email{i.visser at uva.nl} \\
-  URL: \url{http://www.ingmar.org/}
-}
-
-\begin{document}
-
-<<echo=FALSE,results=hide>>=	
-options(prompt = "R> ", continue = "+  ", width = 70, useFancyQuotes = FALSE, digits = 4)
-library("depmixS4")
-@
-
-\section*{Version history}
-
-\begin{description}
-		\item[1.3-0] The EM algorithm has gained an extra argument
-		\code{classification}, passed from the \code{fit} function using
-		argument \code{emcontrol}, to allow a choice between maximising
-		the regular (\code{classification="soft"}, the default) or
-		classification (\code{classification="hard"}) likelihood.  See
-		section~???  for details.
-		
-		Parameters are now given proper names, following the \code{glm()}
-		scheme (e.g. \code{(Intercept)}, \code{x1}, et cetera); with this,
-		the \code{show} and \code{summary} methods have changed
-		considerably and now produce more compact and more readable
-		output.  The \code{summary} method (for both fitted and unfitted
-		models) now has an argument \code{compact} (\code{TRUE}
-		by default) that controls the presentation of the repsonse model
-		parameters.  The prior and transition models are now presented
-		more compactly when there are no covariates.
-		
-		\item[1.2-0] Added support for missing data, see 
-		section~\ref{sec:missingdata}.
-		
-		\item[1.1-0] Speed improvements due to writing the main loop in C code.
-		
-		\item[1.0-0] First release with this vignette, a modified version 
-		of the paper in the Journal of Statistical Software. 
-		
-		\item[0.1-0] First version on CRAN. 
-\end{description}
-
-
-\section{Introduction}
-
-Markov and latent Markov models are frequently used in the social
-sciences, in different areas and applications.  In psychology, they
-are used for modelling learning processes; see \citet{Wickens1982},
-for an overview, and e.g., \citet{Schmittmann2006}, for a recent
-application.  In economics, latent Markov models are so-called regime
-switching models (see e.g., \citealp{Kim1994} and
-\citealp{Ghysels1994}).  Further applications include speech
-recognition \citep{Rabiner1989}, EEG analysis \citep{Rainer2000}, and
-genetics \citep{Krogh1998}.  In these latter areas of application,
-latent Markov models are usually referred to as hidden Markov models.
-See for example \citet{Fruhwirth2006} for an overview of hidden Markov
-models with extensions.  Further examples of applications can be found
-in e.g., \citet[][Chapter~1]{Cappe2005}.  A more gentle introduction
-into hidden Markov models with applications is the book by
-\citet{Zucchini2009}.
-
-The \pkg{depmixS4} package was motivated by the fact that while Markov
-models are used commonly in the social sciences, no comprehensive
-package was available for fitting such models.  Existing software for
-estimating Markovian models include \pkg{Panmark} \citep{Pol1996}, and for
-latent class models \pkg{Latent Gold} \citep{Vermunt2003}.  These programs
-lack a number of important features, besides not being freely
-available.  There are currently some packages in \proglang{R} that
-handle hidden Markov models but they lack a number of features that we
-needed in our research.  In particular, \pkg{depmixS4} was designed to
-meet the following goals:
-
-\begin{enumerate}	
-	\item to be able to estimate parameters subject to general
-	linear (in)equality constraints;
-	
-	\item to be able to fit transition models with covariates, i.e.,
-	to have time-dependent transition matrices;
-	
-	\item to be able to include covariates in the prior or initial
-	state probabilities;
-	
-	\item to be easily extensible, in particular, to allow users to
-	easily add new uni- or multivariate response distributions and
-	new transition models, e.g., continuous time observation models.
-	
-\end{enumerate}
-
-Although \pkg{depmixS4} was designed to deal with longitudinal or time
-series data, for say $T>100$, it can also handle the limit case when
-$T=1$.  In this case, there are no time dependencies between observed
-data and the model reduces to a finite mixture or latent class model.
-While there are specialized packages to deal with mixture data, as far
-as we know these do not allow the inclusion of covariates on the prior
-probabilities of class membership.  The possibility to estimate the
-effects of covariates on prior and transition probabilities is a
-distinguishing feature of \pkg{depmixS4}.  In
-Section~\ref{sec:outline}, we provide an outline of the model and
-likelihood equations.
-
-The \pkg{depmixS4} package is implemented using the \proglang{R}
-system for statistical computing \citep{R2010} and is available from
-the Comprehensive \proglang{R} Archive Network at
-\url{http://CRAN.R-project.org/package=depmixS4}.
-
-
-\section{The dependent mixture model} \label{sec:outline}
-
-The data considered here have the general form $\vc{O}_{1:T}=
-(O_{1}^{1}, \ldots, O_{1}^{m}$, $O_{2}^{1}, \ldots, O_{2}^{m}$,
-\ldots, $O_{T}^{1}, \ldots, O_{T}^{m})$ for an $m$-variate time series
-of length $T$.  In the following, we use $\vc{O}_{t}$ as shorthand for
-$O_{t}^{1}, \ldots, O_{t}^{m}$.  As an example, consider a time series
-of responses generated by a single participant in a psychological
-response time experiment.  The data consists of three variables:
-response time, response accuracy, and a covariate which is a pay-off
-variable reflecting the relative reward for speeded and/or accurate
-responding.  These variables are measured on 168, 134 and 137
-occasions respectively (the first series of 168 trials is plotted in
-Figure~\ref{fig:speed}).  These data are more fully described in
-\citet{Dutilh2011}, and in the next section a number of example models
-for these data is described.
-
-\setkeys{Gin}{width=0.8\textwidth}
-\begin{figure}[t!]
-\begin{center}
-<<fig=TRUE,echo=FALSE,height=5,width=7,eps=FALSE>>=	
-data("speed")
-plot(as.ts(speed[1:168,]), main = "Speed-accuracy trade-off")
-@
-	\caption{Response times (rt), accuracy (corr) and pay-off 
-	values (Pacc) for the first series of responses in dataset 
-	\code{speed}.}
-	\label{fig:speed}
-\end{center}
-\end{figure}
-
-The latent Markov model is usually associated with data of this type,
-in particular for multinomially distributed responses.  However,
-commonly employed estimation procedures \citep[e.g.,][]{Pol1996}, are
-not suitable for long time series due to underflow problems.  In
-contrast, the hidden Markov model is typically only used for `long'
-univariate time series \citep[][Chapter~1]{Cappe2005}.  We use the
-term ``dependent mixture model" because one of the authors (Ingmar
-Visser) thought it was time for a new name to relate these
-models\footnote{Only later he found out that \citet{Leroux1992}
-already coined the term dependent mixture models in an application
-with hidden Markov mixtures of Poisson count data.}.
-
-The fundamental assumption of a dependent mixture model is that at any
-time point, the observations are distributed as a mixture with $n$
-components (or states), and that time-dependencies between the
-observations are due to time-dependencies between the mixture
-components (i.e., transition probabilities between the components).
-These latter dependencies are assumed to follow a first-order Markov
-process.  In the models we are considering here, the mixture
-distributions, the initial mixture probabilities and transition
-probabilities can all depend on covariates $\vc{z}_t$.
-
-In a dependent mixture model, the joint likelihood of observations
-$\vc{O}_{1:T}$ and latent states $\vc{S}_{1:T} = (S_1,\ldots,S_T)$,
-given model parameters $\greekv{\theta}$ and covariates $\vc{z}_{1:T}
-= (\vc{z}_1,\ldots,\vc{z}_T)$, can be written as:
-\begin{equation}
-	\Prob(\vc{O}_{1:T},\vc{S}_{1:T}|\greekv{\theta},\vc{z}_{1:T}) =  
-	\pi_{i}(\vc{z}_1) \vc{b}_{S_1}(\vc{O}_1|\vc{z}_{1})
-	\prod_{t=1}^{T-1} a_{ij}(\vc{z}_{t}) \vc{b}_{S_t}(\vc{O}_{t+1}|\vc{z}_{t+1}),
-\end{equation}
-where we have the following elements:
-\begin{enumerate}
-	
-	\item $S_{t}$ is an element of $\mathcal{S}=\{1\ldots n\}$, a set
-	of $n$ latent classes or states.
-	
-	\item $\pi_{i}(\vc{z}_1) = \Prob(S_1 = i|\vc{z}_1)$, giving the
-	probability of class/state $i$ at time $t=1$ with covariate
-	$\vc{z}_1$.
-	
-	\item $a_{ij}(\vc{z}_t) = \Prob(S_{t+1}=j|S_{t}=i,\vc{z}_t)$,
-	provides the probability of a transition from state $i$ to state
-	$j$ with covariate $\vc{z}_t$.
-	
-	\item $\vc{b}_{S_t}$ is a vector of observation densities
-	$b_{j}^k(\vc{z}_t) = \Prob(O_{t}^k|S_t = j, \vc{z}_t)$ that
-	provide the conditional densities of observations $O_{t}^k$
-	associated with latent class/state $j$ and covariate $\vc{z}_t$,
-	$j=1, \ldots, n$, $k=1, \ldots, m$.
-	
-\end{enumerate}
-
-For the example data above, $b_j^k$ could be a Gaussian distribution
-function for the response time variable, and a Bernoulli distribution
-for the accuracy variable.  In the models considered here,
-both the transition probability functions $a_{ij}$ and the initial
-state probability functions $\greekv{\pi}$ may depend on covariates as
-well as the response distributions $b_{j}^{k}$.
-
-
-\subsection{Likelihood}
-
-To obtain maximum likelihood estimates of the model parameters, we
-need the marginal likelihood of the observations.  For hidden Markov
-models, this marginal (log-)likelihood is usually computed by the
-so-called forward-backward algorithm \citep{Baum1966,Rabiner1989}, or
-rather by the forward part of this algorithm.  \cite{Lystig2002}
-changed the forward algorithm in such a way as to allow computing the
-gradients of the log-likelihood at the same time.  They start by
-rewriting the likelihood as follows (for ease of exposition the
-dependence on the model parameters and covariates is dropped here):
-\begin{equation}
-	L_{T} = \Prob(\vc{O}_{1:T}) = \prod_{t=1}^{T} 
-	\Prob(\vc{O}_{t}|\vc{O}_{1:(t-1)}), 
-	\label{condLike}
-\end{equation}
-where $\Prob(\vc{O}_{1}|\vc{O}_{0}):=\Prob(\vc{O}_{1})$. Note that for a 
-simple, i.e., observed, Markov chain these probabilities reduce to 
-$\Prob(\vc{O}_{t}|\vc{O}_{1},\ldots, 
-\vc{O}_{t-1})=\Prob(\vc{O}_{t}|\vc{O}_{t-1})$.
-The log-likelihood can now be expressed as:
-\begin{equation}
-	l_{T} = \sum_{t=1}^{T} \log[\Prob(\vc{O}_{t}|\vc{O}_{1:(t-1)}].
-	\label{eq:condLogl}
-\end{equation}
-
-To compute the log-likelihood, \cite{Lystig2002} define the following 
-(forward) recursion:
-\begin{align}
-	\phi_{1}(j) &:= \Prob(\vc{O}_{1}, S_{1}=j) = \pi_{j} \vc{b}_{j}(\vc{O}_{1}) 
-	\label{eq:fwd1} \\
-	\phi_{t}(j) &:= \Prob(\vc{O}_{t}, S_{t}=j|\vc{O}_{1:(t-1)}) %\\
-	= \sum_{i=1}^{n} [\phi_{t-1}(i)a_{ij} \vc{b}_{j}(\vc{O}_{t})] \times 
-(\Phi_{t-1})^{-1},
-	\label{eq:fwdt} 
-\end{align}
-where $\Phi_{t}=\sum_{i=1}^{n} \phi_{t}(i)$. Combining 
-$\Phi_{t}=\Prob(\vc{O}_{t}|\vc{O}_{1:(t-1)})$, and 
-equation~(\ref{eq:condLogl}) gives the following expression for the 
-log-likelihood:
-\begin{equation}
-	l_{T} = \sum_{t=1}^{T} \log \Phi_{t}.
-	\label{eq:logl}
-\end{equation}
-
-
-\subsection{Parameter estimation}
-
-Parameters are estimated in \pkg{depmixS4} using the expectation-maximization (EM) algorithm or
-through the use of a general Newton-Raphson optimizer.  In the EM
-algorithm, parameters are estimated by iteratively maximizing the
-expected joint log-likelihood of the parameters given the observations and
-states.  Let $\greekv{\theta} = (\greekv{\theta}_1,
-\greekv{\theta}_2,\greekv{\theta}_3)$ be the general parameter vector
-consisting of three subvectors with parameters for the prior model,
-transition model, and response models respectively.  The joint
-log-likelihood can be written as:
-\begin{multline}
-\log \Prob(\vc{O}_{1:T}, \vc{S}_{1:T}|\vc{z}_{1:T},\greekv{\theta}) = \log 
-\Prob(S_1|\vc{z}_{1},\greekv{\theta}_1) 
-+ \sum_{t=2}^{T} \log \Prob(S_t|S_{t-1},\vc{z}_{t-1},\greekv{\theta}_2) \\
-+ \sum_{t=1}^{T} \log \Prob(\vc{O}_t|S_t,\vc{z}_t,\greekv{\theta}_3)
-\end{multline}
-This likelihood depends on the unobserved states $\vc{S}_{1:T}$. In the 
-Expectation step, we replace these with their expected values given a set of 
-(initial) parameters $\greekv{\theta}' = (\greekv{\theta}'_1, 
-\greekv{\theta}'_2,\greekv{\theta}'_3)$ and observations $\vc{O}_{1:T}$. 
-The expected log-likelihood:
-\begin{equation}
-Q(\greekv{\theta},\greekv{\theta}') = E_{\greekv{\theta}'} 
-(\log \Prob(\vc{O}_{1:T},\vc{S}_{1:T}|\vc{O}_{1:T},\vc{z}_{1:T},\greekv{\theta})),
-\end{equation}
-can be written as:
-\begin{multline}
-\label{eq:Q}
-Q(\greekv{\theta},\greekv{\theta}') = 
-\sum_{j=1}^n \gamma_1(j) \log \Prob(S_1=j|\vc{z}_1,\greekv{\theta}_1) \\ 
-+ \sum_{t=2}^T \sum_{j=1}^n \sum_{k=1}^n \xi_t(j,k) \log \Prob(S_t = k|S_{t-1} 
-= j,\vc{z}_{t-1},\greekv{\theta}_2)  \\
- + \sum_{t=1}^T \sum_{j=1}^n \sum_{k=1}^m \gamma_t(j) 
-\log \Prob(O^k_t|S_t=j,\vc{z}_t,\greekv{\theta}_3),
-\end{multline}
-where the expected values $\xi_t(j,k) = P(S_t = k, S_{t-1} =
-j|\vc{O}_{1:T}, \vc{z}_{1:T},\greekv{\theta}')$ and $\gamma_t(j) =
-P(S_t = j|\vc{O}_{1:T}, \vc{z}_{1:T},\greekv{\theta}')$ can be
-computed effectively by the forward-backward algorithm \citep[see
-e.g.,][]{Rabiner1989}.  The Maximization step consists of the
-maximization of (\ref{eq:Q}) for $\greekv{\theta}$.  As the right hand
-side of (\ref{eq:Q}) consists of three separate parts, we can maximize
-separately for $\greekv{\theta}_1$, $\greekv{\theta}_2$ and
-$\greekv{\theta}_3$.  In common models, maximization for
-$\greekv{\theta}_1$ and $\greekv{\theta}_2$ is performed by the
-\code{nnet.default} routine in the \pkg{nnet} package
-\citep{Venables2002}, and maximization for $\greekv{\theta}_3$ by the
-standard \code{glm} routine.  Note that for the latter maximization,
-the expected values $\gamma_t(j)$ are used as prior weights of the
-observations $O^k_t$.
-
-The EM algorithm however has some drawbacks.  First, it can be slow to
-converge towards the end of optimization.  Second, applying
-constraints to parameters can be problematic; in particular, EM can
-lead to wrong parameter estimates when applying constraints.  Hence,
-in \pkg{depmixS4}, EM is used by default in unconstrained models, but
-otherwise, direct optimization is used.  Two options are available for
-direct optimization using package \pkg{Rsolnp}
-\citep{Ghalanos2010,Ye1987}, or \pkg{Rdonlp2}
-\citep{Tamura2009,Spellucci2002}.  Both packages can handle general
-linear (in)equality constraints (and optionally also non-linear
-constraints).
-
-\subsection[Missing data]{Missing data}\label{sec:missingdata}
-
-Missing data can be dealt with straightforwardly in computing the
-likelihood using the forward recursion in
-Equations~(\ref{eq:fwd1}--\ref{eq:fwdt}).  Assume we have observed
-$\vc{O}_{1:(t-1)}$ but that observation $\vc{O}_{t}$ is missing.  The
-key idea that, in this case, the filtering distribution, the
-probabilities $\phi_{t}$, should be identical to the state prediction
-distribution, as there is no additional information to estimate
-the current state.  Thus, the forward variables $\phi_{t}$ are now 
-computed as:
-%
-\begin{align}
-	\phi_t(i) &= \Prob(S_{t} = i|\vc{O}_{1:(t-1)}) 
-	\\ &=  \sum_{j=1}^n \phi_{t-1}(j) \Prob(S_t = i|S_{t-1}=j).
-\end{align}
-%
-For later observations, we can then use this latter equation again,
-realizing that the filtering distribution is technically e.g.
-$\Prob(S_{t+1}|\vc{O}_{1:(t-1),t+1})$.  Computationally, the easiest
-way to implement this is to simply set $\vc{b}(\vc{O}_t|S_t) = 1$ if
-$\vc{O}_t$ is missing.
-
-
-\section[Using depmixS4]{Using \pkg{depmixS4}}
-
-Two steps are involved in using \pkg{depmixS4} which are illustrated
-below with examples:
-\begin{enumerate}
-	
-	\item model specification with function \code{depmix} (or with
-	\code{mix} for latent class and finite mixture models, see example
-	below on adding covariates to prior probabilities);
-	
-	\item  model fitting with function \code{fit}.
-\end{enumerate}
-We have separated the stages of model specification and model fitting
-because fitting large models can be fairly time-consuming and it is
-hence useful to be able to check the model specification before
-actually fitting the model.
-
-\subsection[Example data: speed]{Example data: \code{speed}}
-
-Throughout this article a data set called \code{speed} is used.  As
-already indicated in the introduction, it consists of three time
-series with three variables: response time \code{rt}, accuracy
-\code{corr}, and a covariate, \code{Pacc}, which defines the relative
-pay-off for speeded versus accurate responding.  Before describing
-some of the models that are fitted to these data, we provide a brief
-sketch of the reasons for gathering these data in the first place.
-
-Response times are a very common dependent variable in psychological
-experiments and hence form the basis for inference about many
-psychological processes.  A potential threat to such inference based
-on response times is formed by the speed-accuracy trade-off: different
-participants in an experiment may respond differently to typical
-instructions to `respond as fast and accurate as possible'.  A popular
-model which takes the speed-accuracy trade-off into account is the
-diffusion model \citep{Ratcliff1978}, which has proven to provide
-accurate descriptions of response times in many different settings.
-
-One potential problem with the diffusion model is that it predicts a
-continuous trade-off between speed and accuracy of responding, i.e.,
-when participants are pressed to respond faster and faster, the
-diffusion model predicts that this would lead to a gradual decrease in
-accuracy.  The \code{speed} data set that we analyze below was
-gathered to test this hypothesis versus the alternative hypothesis
-stating that there is a sudden transition from slow and accurate
-responding to fast responding at chance level.  At each trial of the
-experiment, the participant is shown the current setting of the
-relative reward for speed versus accuracy.  The bottom panel of
-Figure~\ref{fig:speed} shows the values of this variable.  The
-experiment was designed to investigate what would happen when this
-reward variable changes from reward for accuracy only to reward for
-speed only.  The \code{speed} data that we analyse here are from
-participant A in Experiment 1 in \citet{Dutilh2011}, who provide a
-complete description of the experiment and the relevant theoretical
-background.
-
-The central question regarding this data is whether it is indeed best
-described by two modes of responding rather than a single mode of
-responding with a continuous trade-off between speed and accuracy.
-The hallmark of a discontinuity between slow versus speeded responding
-is that switching between the two modes is asymmetric \citep[see
-e.g.][for a theoretical underpinning of this claim]{Maas1992}.  The
-\code{fit} help page of \pkg{depmixS4} provides a number of examples
-in which the asymmetry of the switching process is tested; those
-examples and other candidate models are discussed at length in
-\citet{Visser2009b}.
-
-
-\subsection{A simple model}
-
-A dependent mixture model is defined by the number of states and the
-initial state, state transition, and response distribution functions.
-A dependent mixture model can be created with the
-\code{depmix} function as follows:
-<<>>=
-library("depmixS4")
-data("speed")
-set.seed(1)
-mod <- depmix(response = rt ~ 1, data = speed, nstates = 2,
-  trstart = runif(4))
-@
-
-The first line of code loads the \pkg{depmixS4} package and
-\code{data(speed)} loads the \code{speed} data set.  The line
-\code{set.seed(1)} is necessary to get starting values that will
-result in the right model, see more on starting values below.
-
-The call to \code{depmix} specifies the model with a number of
-arguments.  The \code{response} argument is used to specify the
-response variable, possibly with covariates, in the familiar format
-using formulae such as in \code{lm} or \code{glm} models.  The second
-argument, \code{data = speed}, provides the \code{data.frame} in which
-the variables from \code{response} are interpreted.  Third, the number
-of states of the model is given by \code{nstates = 2}.
-
-
-\paragraph{Starting values.} Note also that start values for the
-transition parameters are provided in this call by the \code{trstart}
-argument.  Starting values for parameters can be provided using three
-arguments: \code{instart} for the parameters relating to the prior
-probabilities, \code{trstart}, relating the transition models, and
-\code{respstart} for the parameters of the response models.  Note that
-the starting values for the initial and transition models as well as
-multinomial logit response models are interpreted as {\em
-probabilities}, and internally converted to multinomial logit
-parameters (if a logit link function is used).  Start values can also
-be generated randomly within the EM algorithm by generating random
-uniform values for the values of $\gamma_{t}$ in (\ref{eq:Q}) at
-iteration 0.  Automatic generation of starting values is not available
-for constrained models (see below). In the call to \code{fit} below, 
-the argument \code{emc=em.control(rand=FALSE)} controls the EM 
-algorithm and specifies that random start values should not be 
-generated\footnote{As of version 1.0-1, the default is have random 
-parameter initialization when using the EM algorithm.}. 
-
-\paragraph{Fitting the model and printing results.} The \code{depmix}
-function returns an object of class `\code{depmix}' which contains the
-model specification, and not a fitted model.  Hence, the model needs
-to be fitted by calling \code{fit}:
-<<>>=
-fm <- fit(mod, emc=em.control(rand=FALSE))
-@
-
-The \code{fit} function returns an object of class
-`\code{depmix.fitted}' which extends the `\code{depmix}' class, adding
-convergence information (and information about constraints if these
-were applied, see below).  The class has \code{print} and
-\code{summary} methods to see the results.  The \code{print} method
-provides information on convergence, the log-likelihood and
-the AIC and BIC values:
-<<>>=
-fm 
-@
-
-These statistics can also be extracted using \code{logLik}, \code{AIC}
-and \code{BIC}, respectively.  By comparison, a 1-state model for
-these data, i.e., assuming there is no mixture, has a log-likelihood of
-$-305.33$, and 614.66, and 622.83 for the AIC and BIC respectively.
-Hence, the 2-state model fits the data much better than the 1-state
-model.  Note that the 1-state model can be specified using
-\code{mod <- depmix(rt ~ 1, data = speed, nstates = 1)}, although this model is
-trivial as it will simply return the mean and standard deviation of
-the \code{rt} variable.
-
-The \code{summary} method of \code{fit}ted models provides the
-parameter estimates, first for the prior probabilities model, second
-for the transition models, and third for the response models.
-<<>>=
-summary(fm)
-@
-
-Since no further arguments were specified, the initial state, state
-transition and response distributions were set to their defaults
-(multinomial distributions for the first two, and a Gaussian
-distribution for the response).  The resulting model indicates two
-well-separated states, one with slow and the second with fast
-responses.  The transition probabilities indicate rather stable
-states, i.e., the probability of remaining in either of the states is
-around 0.9.  The initial state probability estimates indicate that
-state 1 is the starting state for the process, with a negligible
-probability of starting in state 2.
-
-\subsection{Covariates on transition parameters}
-
-By default, the transition probabilities and the initial state
-probabilities are parameterized using a multinomial model with an
-identity link function.  Using a multinomial logistic model allows one
-to include covariates on the initial state and transition
-probabilities.  In this case, each row of the transition matrix is
-parameterized by a baseline category logistic multinomial, meaning
-that the parameter for the base category is fixed at zero
-\citep[see][p.~267~ff., for multinomial logistic models and various
-parameterizations]{Agresti2002}.  The default baseline category is the
-first state.  Hence, for example, for a 3-state model, the initial
-state probability model would have three parameters of which the first
-is fixed at zero and the other two are freely estimated.
-\citet{Chung2007} discuss a related latent transition model for
-repeated measurement data ($T=2$) using logistic regression on the
-transition parameters; they rely on Bayesian methods of estimation.
-Covariates on the transition probabilities can be specified using a
-one-sided formula as in the following example:
-<<>>=
-set.seed(1)
-mod <- depmix(rt ~ 1, data = speed, nstates = 2, family = gaussian(),
-  transition = ~ scale(Pacc), instart = runif(2)) 
-fm <- fit(mod, verbose = FALSE, emc=em.control(rand=FALSE)) 
-@
-
-Note the use of \code{verbose = FALSE} to suppress printing of
-information on the iterations of the fitting process.  Applying
-\code{summary} to the \code{fit}ted object gives (only transition
-models printed here by using argument \code{which}): 
-<<>>= 
-summary(fm, which = "transition") 
-@
-
-The summary provides all parameters of the model, also the (redundant)
-zeroes for the base-line category in the multinomial model.  The
-summary also prints the transition probabilities at the zero value of
-the covariate.  Note that scaling of the covariate is useful in this
-regard as it makes interpretation of these intercept probabilities
-easier.
-
-\subsection{Multivariate data}
-
-Multivariate data can be modelled by providing a list of formulae as
-well as a list of family objects for the distributions of the various
-responses.  In above examples we have only used the response times
-which were modelled as a Gaussian distribution.  The accuracy
-variable in the \code{speed} data can be modelled with a multinomial
-by specifying the following:
-<<>>=
-set.seed(1)
-mod <- depmix(list(rt ~ 1,corr ~ 1), data = speed, nstates = 2, 
-  family = list(gaussian(), multinomial("identity")),
-  transition = ~ scale(Pacc), instart = runif(2))
-fm <- fit(mod, verbose = FALSE, emc=em.control(rand=FALSE))
-@
-
-This provides the following fitted model parameters (only the 
-response parameters are given here): 
-<<>>=
-summary(fm, which = "response")
-@
-
-As can be seen, state 1 has fast response times and accuracy is
-approximately at chance level (.4747), whereas state 2 corresponds with
-slower responding at higher accuracy levels (.9021).
-
-Note that by specifying multivariate observations in terms of a list,
-the variables are considered conditionally independent (given the
-states).  Conditionally \emph{dependent} variables must be handled as
-a single element in the list.  Effectively, this means specifying a
-multivariate response model.  Currently, \pkg{depmixS4} has one 
-multivariate response model which is for multivariate normal
-variables.
-
-
-\subsection{Fixing and constraining parameters}
-
-Using package \pkg{Rsolnp} \citep{Ghalanos2010} or \pkg{Rdonlp2}
-\citep{Tamura2009}, parameters may be fitted subject to general linear
-(in-)equality constraints.  Constraining and fixing parameters is done
-using the \code{conpat} argument to the \code{fit} function, which
-specifies for each parameter in the model whether it's fixed (0) or
-free (1 or higher).  Equality constraints can be imposed by giving two
-parameters the same number in the \code{conpat} vector.  When only
-fixed values are required, the \code{fixed} argument can be used
-instead of \code{conpat}, with zeroes for fixed parameters and other
-values (e.g., ones) for non-fixed parameters.  Fitting the models
-subject to these constraints is handled by the optimization routine
-\code{solnp} or, optionally, by \code{donlp2}.  To be able to
-construct the \code{conpat} and/or \code{fixed} vectors one needs the
-correct ordering of parameters which is briefly discussed next before
-proceeding with an example.
-
-\paragraph{Parameter numbering.} When using the \code{conpat} and
-\code{fixed} arguments, complete parameter vectors should be supplied,
-i.e., these vectors should have length equal to the number of parameters of
-the model, which can be obtained by calling \code{npar(object)}.  Note
-that this is not the same as the degrees of freedom used e.g., in the
-\code{logLik} function because \code{npar} also counts the baseline
-category zeroes from the multinomial logistic models.  Parameters are
-numbered in the following order:
-\begin{enumerate}
-	\item  the prior model parameters;
-	\item  the parameters for the transition models;
-	\item  the response model parameters per state (and subsequently
-	per response in the case of multivariate time series).
-\end{enumerate}
-
-To see the ordering of parameters use the following:
-<<term=FALSE>>=
-setpars(mod, value = 1:npar(mod))
-@
-
-To see which parameters are fixed (by default only baseline parameters
-in the multinomial logistic models for the transition models and the
-initial state probabilities model):
-<<term=FALSE>>=
-setpars(mod, getpars(mod, which = "fixed"))
-@
-When fitting constraints it is useful to have good starting values 
-for the parameters and hence we first fit the following model without
-constraints:
-<<>>=
-trst <- c(0.9, 0.1, 0, 0, 0.1, 0.9, 0, 0)
-mod <- depmix(list(rt ~ 1,corr ~ 1), data = speed, transition = ~ Pacc,
-  nstates = 2, family = list(gaussian(), multinomial("identity")),
-  trstart = trst, instart = c(0.99, 0.01))
-fm1 <- fit(mod,verbose = FALSE, emc=em.control(rand=FALSE))
-@
-
-After this, we use the fitted values from this model to constrain the
-regression coefficients on the transition matrix (parameters number~6
-and~10):
-<<term=FALSE, echo=TRUE,results=hide>>=
-pars <- c(unlist(getpars(fm1)))
-pars[6] <- pars[10] <- 11
-pars[1] <- 0
-pars[2] <- 1
-pars[13] <- pars[14] <- 0.5
-fm1 <- setpars(mod, pars)
-conpat <- c(0, 0, rep(c(0, 1), 4), 1, 1, 0, 0, 1, 1, 1, 1)
-conpat[6] <- conpat[10] <- 2
-fm2 <- fit(fm1, equal = conpat)
-@
-
-Using \code{summary} on the fitted model shows that the regression
-coefficients are now estimated at the same value of 12.66.  Setting
-elements 13 and 14 of \code{conpat} to zero resulted in fixing those
-parameters at their starting values of 0.5.  This means that state 1
-can now be interpreted as a 'guessing' state which is associated with
-comparatively fast responses.  Similarly for elements 1 and 2,
[TRUNCATED]

To get the complete diff run:
    svnlook diff /svnroot/depmix -r 592


More information about the depmix-commits mailing list