[Depmix-commits] r516 - in pkg/depmixS4: . inst/doc man

Tue Jun 12 15:14:18 CEST 2012

Author: ingmarvisser
Date: 2012-06-12 15:14:18 +0200 (Tue, 12 Jun 2012)
New Revision: 516

Modified:
   pkg/depmixS4/DESCRIPTION
   pkg/depmixS4/inst/doc/depmixS4.Rnw
   pkg/depmixS4/inst/doc/depmixS4.bib
   pkg/depmixS4/inst/doc/depmixS4.pdf
   pkg/depmixS4/man/em.control.Rd
Log:
Added subsection on missing data to the vignette as well as a table with possible response models.

Modified: pkg/depmixS4/DESCRIPTION
===================================================================

--- pkg/depmixS4/DESCRIPTION	2012-05-10 20:08:59 UTC (rev 515)
+++ pkg/depmixS4/DESCRIPTION	2012-06-12 13:14:18 UTC (rev 516)
@@ -1,10 +1,10 @@
 Package: depmixS4
-Version: 1.1-1
-Date: 2012-02-29
+Version: 1.2-0
+Date: 2012-06-12
 Title: Dependent Mixture Models - Hidden Markov Models of GLMs and Other Distributions in S4
 Author: Ingmar Visser <i.visser at uva.nl>, Maarten Speekenbrink <m.speekenbrink at ucl.ac.uk>
 Maintainer: Ingmar Visser <i.visser at uva.nl>
-Depends: R (>= 2.14.2), stats, nnet, methods, MASS, Rsolnp, stats4
+Depends: R (>= 2.15.0), stats, nnet, methods, MASS, Rsolnp, stats4
 Suggests: gamlss, gamlss.dist, TTR
 Description: Fit latent (hidden) Markov models on mixed categorical and continuous (timeseries)
    data, otherwise known as dependent mixture models

Modified: pkg/depmixS4/inst/doc/depmixS4.Rnw
===================================================================
--- pkg/depmixS4/inst/doc/depmixS4.Rnw	2012-05-10 20:08:59 UTC (rev 515)
+++ pkg/depmixS4/inst/doc/depmixS4.Rnw	2012-06-12 13:14:18 UTC (rev 516)
@@ -18,28 +18,33 @@
 \author{Ingmar Visser\\University of Amsterdam \And 
         Maarten Speekenbrink\\University College London}
 \Plainauthor{Ingmar Visser, Maarten Speekenbrink}
-        
+
 \title{\pkg{depmixS4}: An \proglang{R} Package for Hidden Markov Models}
 \Plaintitle{depmixS4: An R Package for Hidden Markov Models}
 
 \Abstract{	
-  This introduction to the \proglang{R} package \pkg{depmixS4} is a (slightly)
-  modified version of \cite{Visser+Speekenbrink:2010}, published in the
-  \emph{Journal of Statistical Software}. Please refer to that 
-  article when using \pkg{depmixS4}. The current version is 1.0-1. 
 
-  \pkg{depmixS4} implements a general framework for defining and
-  estimating dependent mixture models in the \proglang{R}
-  programming language.  This includes standard Markov
-  models, latent/hidden Markov models, and latent class and finite
-  mixture distribution models.  The models can be fitted on mixed
-  multivariate data with distributions from the \code{glm} family,
-  the (logistic) multinomial, or the multivariate normal
-  distribution.  Other distributions can be added easily, and an
-  example is provided with the {\em exgaus} distribution.
-  Parameters are estimated by the expectation-maximization (EM) algorithm or, when (linear)
-  constraints are imposed on the parameters, by direct numerical
-  optimization with the \pkg{Rsolnp} or \pkg{Rdonlp2} routines.   
+	This introduction to the \proglang{R} package \pkg{depmixS4} is a
+	(slightly) modified version of \cite{Visser+Speekenbrink:2010},
+	published in the \emph{Journal of Statistical Software}.  Please
+	refer to that article when using \pkg{depmixS4}.  The current
+	version is 1.1-1; the version history and changes can be found in
+	the NEWS file of the package. Below, the major versions are listed 
+	along with the most noteworthy changes. 
+
+	\pkg{depmixS4} implements a general framework for defining and
+	estimating dependent mixture models in the \proglang{R} programming
+	language.  This includes standard Markov models, latent/hidden
+	Markov models, and latent class and finite mixture distribution
+	models.  The models can be fitted on mixed multivariate data with
+	distributions from the \code{glm} family, the (logistic)
+	multinomial, or the multivariate normal distribution.  Other
+	distributions can be added easily, and an example is provided with
+	the {\em exgaus} distribution.  Parameters are estimated by the
+	expectation-maximization (EM) algorithm or, when (linear)
+	constraints are imposed on the parameters, by direct numerical
+	optimization with the \pkg{Rsolnp} or \pkg{Rdonlp2} routines.  
+	
 }
 
 \Keywords{hidden Markov model, dependent mixture model, mixture model, constraints}
@@ -68,6 +73,16 @@
 library("depmixS4")
 @
 
+\section*{Version history}
+
+\begin{description}
+		\item[1.1-0] Speed improvements due to writing the main loop in C code.
+		\item[1.0-0] First release with this vignette, a modified version 
+		of the paper in the Journal of Statistical Software. 
+		\item[0.1-0] First version on CRAN. 
+\end{description}
+
+
 \section{Introduction}
 
 Markov and latent Markov models are frequently used in the social
@@ -143,7 +158,7 @@
 responding.  These variables are measured on 168, 134 and 137
 occasions respectively (the first series of 168 trials is plotted in
 Figure~\ref{fig:speed}).  These data are more fully described in
-\citet{Dutilh2010}, and in the next section a number of example models
+\citet{Dutilh2011}, and in the next section a number of example models
 for these data is described.
 
 \setkeys{Gin}{width=0.8\textwidth}
@@ -331,7 +346,30 @@
 linear (in)equality constraints (and optionally also non-linear
 constraints).
 
+\subsection[Missing data]{Missing data}\label{missingdata}
 
+Missing data can be dealt with straightforwardly in computing the
+likelihood using the forward recursion in
+Equations~(\ref{eq:fwd1}--\ref{eq:fwdt}).  Assume we have observed
+$\vc{O}_{1:(t-1)}$ but that observation $\vc{O}_{t}$ is missing.  The
+key idea is that, in this case, the filtering distribution, the
+probabilities $\phi_{t}$, should be identical to the state prediction
+distribution, as there is no additional information to estimate
+the current state.  Thus, the forward variables $\phi_{t}$ are now 
+computed as:
+%
+\begin{align}
+	\phi_t(i) &= \Prob(S_{t} = i|\vc{O}_{1:(t-1)}) 
+	\\ &=  \sum_{j=1}^N \phi_{t-1}(j) \Prob(S_t = i|S_{t-1}=j).
+\end{align}
+%
+For later observations, we can then use this latter equation again,
+realizing that the filtering distribution is then, for example,
+$\Prob(S_{t+1}|\vc{O}_{1:(t-1),t+1})$.  Computationally, the easiest
+way to implement this is to simply set $\vc{b}(\vc{O}_t|S_t) = 1$ if
+$\vc{O}_t$ is missing.
+
+
 \section[Using depmixS4]{Using \pkg{depmixS4}}
 
 Two steps are involved in using \pkg{depmixS4} which are illustrated
@@ -353,11 +391,11 @@
 
 Throughout this article a data set called \code{speed} is used.  As
 already indicated in the introduction, it consists of three time
-series with three variables: response time \code{rt}, accuracy \code{corr}, and a covariate,
-\code{Pacc}, which defines the relative pay-off for speeded versus accurate
-responding.  Before describing some of the models that are fitted to
-these data, we provide a brief sketch of the reasons for gathering
-these data in the first place.
+series with three variables: response time \code{rt}, accuracy
+\code{corr}, and a covariate, \code{Pacc}, which defines the relative
+pay-off for speeded versus accurate responding.  Before describing
+some of the models that are fitted to these data, we provide a brief
+sketch of the reasons for gathering these data in the first place.
 
 Response times are a very common dependent variable in psychological
 experiments and hence form the basis for inference about many
@@ -383,7 +421,7 @@
 experiment was designed to investigate what would happen when this
 reward variable changes from reward for accuracy only to reward for
 speed only.  The \code{speed} data that we analyse here are from
-participant A in Experiment 1 in \citet{Dutilh2010}, who provide a
+participant A in Experiment 1 in \citet{Dutilh2011}, who provide a
 complete description of the experiment and the relevant theoretical
 background.
 
@@ -794,8 +832,33 @@
 separate fit functions for each part of the model, the prior
 probability model, the transition models, and the response models.  As
 a consequence, adding user-specified response models is
-straightforward.
+straightforward.  The currently implemented distributions are listed
+in Table~\ref{tbl:responses}.
 
+\begin{table}
+		\begin{center}
+		\begin{tabular}{lll}
+				\hline
+				package & family & link \\
+				\hline
+				\hline
+				\pkg{stats} & binomial & logit, probit, cauchit, log, cloglog \\
+				\pkg{stats} & gaussian & identity, log, inverse \\
+				\pkg{stats} & Gamma & inverse, identity, log \\
+				\pkg{stats} & poisson & log, identity, sqrt \\
+				\pkg{depmixS4} & multinomial & logit, identity (no covariates allowed) \\
+				\pkg{depmixS4} & multivariate normal & identity (only 
+				available through \code{makeDepmix}) \\
+				\pkg{depmixS4} & ex-gauss & identity (only 
+				available through \code{makeDepmix} as example) \\
+				\hline
+		\end{tabular}
+		\end{center}
+		\caption{Response distributions available in \pkg{depmixS4}.}
+		\label{tbl:responses}
+\end{table}
+
+
 User-defined distributions should extend the `\code{response}' class and
 have the following slots:
 \begin{enumerate}
@@ -1021,7 +1084,7 @@
 (NEST).  Maarten Speekenbrink was supported by ESRC grant
 RES-062-23-1511 and the ESRC Centre for Economic Learning and Social
 Evolution (ELSE).  Han van der Maas provided the speed-accuracy data
-\citep{Dutilh2010} and thereby necessitated implementing models with
+\citep{Dutilh2011} and thereby necessitated implementing models with
 time-dependent covariates.  Brenda Jansen provided the balance scale
 data set \citep{Jansen2002} which was the perfect opportunity to test
 the covariates on the prior model parameters.  The examples in the

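Note on the missing-data subsection added to depmixS4.Rnw above: the idea of
setting the response density to 1 for a missing observation can be sketched in
base R as follows.  This is only a minimal illustration under stated
assumptions (univariate Gaussian responses, a fixed transition matrix); it is
not the package's own implementation.  The function name forward_loglik and
its arguments init, trans, mu and sigma are hypothetical and do not exist in
depmixS4.

```r
# Scaled forward recursion for a Gaussian hidden Markov model (base R only).
# Hypothetical helper for illustration; not part of depmixS4.
#   y     : numeric vector of observations, NA where missing
#   init  : initial state probabilities
#   trans : transition matrix, trans[j, i] = P(S_t = i | S_{t-1} = j)
#   mu, sigma : state-specific means and standard deviations
forward_loglik <- function(y, init, trans, mu, sigma) {
  n_states <- length(init)
  # Response densities b[t, i] = p(y_t | S_t = i); NA where y_t is missing.
  b <- sapply(seq_len(n_states), function(i) dnorm(y, mu[i], sigma[i]))
  b <- matrix(b, nrow = length(y), ncol = n_states)
  # A missing observation carries no information: set its density to 1.
  b[is.na(y), ] <- 1

  phi <- init  # state prediction distribution at t = 1
  loglik <- 0
  filtered <- matrix(NA_real_, nrow = length(y), ncol = n_states)
  for (t in seq_along(y)) {
    # Prediction step: sum_j phi_{t-1}(j) P(S_t = i | S_{t-1} = j)
    if (t > 1) phi <- as.vector(phi %*% trans)
    joint <- phi * b[t, ]
    scale <- sum(joint)  # equals 1 when y_t is missing
    loglik <- loglik + log(scale)
    phi <- joint / scale  # filtering distribution at time t
    filtered[t, ] <- phi
  }
  list(loglik = loglik, filtered = filtered)
}

# Usage: the second observation is missing.
init  <- c(0.5, 0.5)
trans <- matrix(c(0.9, 0.1,
                  0.2, 0.8), nrow = 2, byrow = TRUE)
y   <- c(0.1, NA, 2.3, 1.9)
fit <- forward_loglik(y, init, trans, mu = c(0, 2), sigma = c(1, 1))
```

In this sketch the second row of fit$filtered equals the first row multiplied
by trans, i.e. the state prediction distribution, and the missing time point
adds log(1) = 0 to the log-likelihood.
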
Modified: pkg/depmixS4/inst/doc/depmixS4.bib
===================================================================
--- pkg/depmixS4/inst/doc/depmixS4.bib	2012-05-10 20:08:59 UTC (rev 515)
+++ pkg/depmixS4/inst/doc/depmixS4.bib	2012-06-12 13:14:18 UTC (rev 516)
@@ -1,3 +1,24 @@
+%% This BibTeX bibliography file was created using BibDesk.
+%% http://bibdesk.sourceforge.net/
+
+
+%% Created for Ingmar Visser at 2012-06-12 11:37:41 +0200 
+
+
+%% Saved with string encoding Unicode (UTF-8) 
+
+
+
+@article{Dutilh2011,
+	Author = {Gilles Dutilh and Eric-Jan Wagenmakers and Ingmar Visser and Han L. J. van der Maas},
+	Date-Added = {2012-06-12 09:36:06 +0000},
+	Date-Modified = {2012-06-12 09:36:06 +0000},
+	Journal = {Cognitive Science},
+	Pages = {211-250},
+	Title = {A Phase Transition Model for the Speed-Accuracy Trade-Off in Response Time Experiments},
+	Volume = {35},
+	Year = {2011}}
+
 @book{Zucchini2009,
 	Address = {Boca Raton},
 	Author = {Walter Zucchini and Iain MacDonald},
@@ -12,17 +33,17 @@
 	Title = {\pkg{Rsolnp}: General Non-Linear Optimization Using Augmented Lagrange Multiplier Method},
 	Url = {http://CRAN.R-project.org/package=Rsolnp},
 	Year = {2010},
-}
+	Bdsk-Url-1 = {http://CRAN.R-project.org/package=Rsolnp}}
 
 @manual{R2010,
-  title		= {\proglang{R}: {A} Language and Environment for Statistical Computing},
-  author	= {{\proglang{R} Development Core Team}},
-  organization	= {\proglang{R} Foundation for Statistical Computing},
-  address	= {Vienna, Austria},
-  year		= {2010},
-  note          = {{ISBN} 3-900051-07-0},
-  url		= {http://www.R-project.org/}
-}
+	Address = {Vienna, Austria},
+	Author = {{\proglang{R} Development Core Team}},
+	Note = {{ISBN} 3-900051-07-0},
+	Organization = {\proglang{R} Foundation for Statistical Computing},
+	Title = {\proglang{R}: {A} Language and Environment for Statistical Computing},
+	Url = {http://www.R-project.org/},
+	Year = {2010},
+	Bdsk-Url-1 = {http://www.R-project.org/}}
 
 @book{Salzberg1998,
 	Address = {Amsterdam},
@@ -50,7 +71,8 @@
 	Publisher = {Statistical Innovations Inc.},
 	Title = {\pkg{Latent Gold}~3.0},
 	Url = {http://www.statisticalinnovations.com/},
-	Year = 2003}
+	Year = 2003,
+	Bdsk-Url-1 = {http://www.statisticalinnovations.com/}}
 
 @book{Venables2002,
 	Address = {New York},
@@ -58,8 +80,7 @@
 	Edition = {4th},
 	Publisher = {Springer-Verlag},
 	Title = {Modern Applied Statistics with \proglang{S}},
-	Year = {2002}
-}
+	Year = {2002}}
 
 @article{Maas1992,
 	Author = {Han L. J. {Van der Maas} and Peter C. M. Molenaar},
@@ -90,7 +111,7 @@
 	Title = {\pkg{Rdonlp2}: An \proglang{R} Extension Library to Use Peter Spelluci's \pkg{donlp2} from \proglang{R}},
 	Url = {http://arumat.net/Rdonlp2/},
 	Year = {2009},
-}
+	Bdsk-Url-1 = {http://arumat.net/Rdonlp2/}}
 
 @manual{Stasinopoulos2009a,
 	Author = {D. Mikis Stasinopoulos and Bob A. Rigby and Calliope Akantziliotou},
@@ -98,7 +119,7 @@
 	Title = {\pkg{gamlss}: Generalized Additive Models for Location Scale and Shape},
 	Url = {http://CRAN.R-project.org/package=gamlss},
 	Year = {2009},
-}
+	Bdsk-Url-1 = {http://CRAN.R-project.org/package=gamlss}}
 
 @article{Rigby2005,
 	Author = {R. A. Rigby and D. M. Stasinopoulos},
@@ -113,12 +134,12 @@
 	Author = {D. Mikis Stasinopoulos and Bob A. Rigby},
 	Journal = {Journal of Statistical Software},
 	Number = 7,
+	Pages = {1--46},
 	Title = {Generalized Additive Models for Location Scale and Shape (GAMLSS) in \proglang{R}},
 	Url = {http://www.jstatsoft.org/v23/i07},
 	Volume = 23,
-	Pages = {1--46},
 	Year = 2007,
-}
+	Bdsk-Url-1 = {http://www.jstatsoft.org/v23/i07}}
 
 @manual{Stasinopoulos2009b,
 	Author = {D. Mikis Stasinopoulos and Bob A. Rigby and Calliope Akantziliotou and Gillian Heller and Raydonal Ospina and Nicoletta Motpan},
@@ -126,15 +147,15 @@
 	Title = {\pkg{gamlss.dist}: Distributions to Be Used for GAMLSS Modelling},
 	Url = {http://CRAN.R-project.org/package=gamlss.dist},
 	Year = {2010},
-}
+	Bdsk-Url-1 = {http://CRAN.R-project.org/package=gamlss.dist}}
 
 @manual{Spellucci2002,
 	Author = {Peter Spellucci},
+	Organization = {TU Darmstadt},
 	Title = {\pkg{donlp2} Users Guide},
-	Organization = {TU Darmstadt},
 	Url = {http://www.mathematik.tu-darmstadt.de/fbereiche/numerik/staff/spellucci/DONLP2/},
 	Year = {2002},
-}
+	Bdsk-Url-1 = {http://www.mathematik.tu-darmstadt.de/fbereiche/numerik/staff/spellucci/DONLP2/}}
 
 @book{Siegler1981,
 	Author = {Robert S. Siegler},
@@ -163,11 +184,11 @@
 @article{Lystig2002,
 	Author = {Theodore C. Lystig and James P. Hughes},
 	Journal = {Journal of Computational and Graphical Statistics},
+	Number = {3},
+	Pages = {678--689},
 	Title = {Exact Computation of the Observed Information Matrix for Hidden Markov Models},
-	Year = {2002},
 	Volume = {11},
-	Number = {3},
-	Pages = {678--689}}
+	Year = {2002}}
 
 @article{Leroux1992,
 	Author = {B. G. Leroux and M. L. Puterman},
@@ -185,21 +206,20 @@
 	Title = {FlexMix: A General Framework for Finite Mixture Models and Latent Class Regression in \proglang{R}},
 	Url = {http://www.jstatsoft.org/v11/i08/},
 	Volume = {11},
-	Year = {2004}
-}
+	Year = {2004},
+	Bdsk-Url-1 = {http://www.jstatsoft.org/v11/i08/}}
 
+@article{GruenLeisch2008,
+	Author = {Bettina Gr\"un and Friedrich Leisch},
+	Journal = {Journal of Statistical Software},
+	Number = {4},
+	Pages = {1--35},
+	Title = {{FlexMix} Version~2: Finite Mixtures with Concomitant Variables and Varying and Constant Parameters},
+	Url = {http://www.jstatsoft.org/v28/i04/},
+	Volume = {28},
+	Year = {2008},
+	Bdsk-Url-1 = {http://www.jstatsoft.org/v28/i04/}}
 
-@Article{GruenLeisch2008,
-  title = {{FlexMix} Version~2: Finite Mixtures with Concomitant Variables and Varying and Constant Parameters},
-  author = {Bettina Gr\"un and Friedrich Leisch},
-  journal = {Journal of Statistical Software},
-  year = {2008},
-  volume = {28},
-  number = {4},
-  pages = {1--35},
-  url = {http://www.jstatsoft.org/v28/i04/},
-}
-
 @incollection{Krogh1998,
 	Address = {Amsterdam},
 	Author = {Anders Krogh},
@@ -238,9 +258,9 @@
 	Year = 1994}
 
 @book{Fruhwirth2006,
+	Address = {New York},
 	Author = {Sylvia Fr\"uhwirth-Schnatter},
 	Publisher = {Springer-Verlag},
-        Address = {New York},
 	Title = {Finite Mixture and Markov Switching Models},
 	Year = {2006}}
 
@@ -251,8 +271,7 @@
 	Pages = {309-321},
 	Title = {Likelihood Ratio Testing for Hidden Markov Models under Non-Standard Conditions},
 	Volume = {25},
-	Year = {2007},
-}
+	Year = {2007}}
 
 @article{Chung2007,
 	Author = {Hwan Chung and Theodore Walls and Yousung Park},
@@ -261,8 +280,7 @@
 	Pages = {413-435},
 	Title = {A Latent Transition Model With Logistic Regression},
 	Volume = {72},
-	Year = 2007,
-}
+	Year = 2007}
 
 @book{Cappe2005,
 	Address = {New York},
@@ -303,8 +321,7 @@
 	Pages = {2079--2091},
 	Title = {Multiple Learning Modes in the Development of Rule-based Category-learning Task Performance},
 	Volume = 44,
-	Year = 2006,
-}
+	Year = 2006}
 
 @incollection{Visser2009b,
 	Address = {New York},
@@ -315,22 +332,15 @@
 	Pages = {269-289},
 	Publisher = {Springer-Verlag},
 	Title = {Hidden Markov Models for Individual Time Series},
-	Year = {2009},
-}
+	Year = {2009}}
 
-@Unpublished{Dutilh2010,
-	Author = {Gilles Dutilh and Eric-Jan Wagenmakers and Ingmar Visser and Han L. J. van der Maas},
-	Note = {Submitted for publication},
-	Title = {A Phase Transition Model for the Speed-Accuracy Trade-Off in Response Time Experiments},
-	Year = {2010}}
-
-@Article{Visser+Speekenbrink:2010,
-  author        = {Ingmar Visser and Maarten Speekenbrink},
-  title         = {\pkg{depmixS4}: An \proglang{R} Package for Hidden Markov Models},
-  journal       = {Journal of Statistical Software},
-  year          = {2010},
-  volume        = {36},
-  number        = {7},
-  pages         = {1--21},
-  url           = {http://www.jstatsoft.org/v36/i07/}
-}
+@article{Visser+Speekenbrink:2010,
+	Author = {Ingmar Visser and Maarten Speekenbrink},
+	Journal = {Journal of Statistical Software},
+	Number = {7},
+	Pages = {1--21},
+	Title = {\pkg{depmixS4}: An \proglang{R} Package for Hidden Markov Models},
+	Url = {http://www.jstatsoft.org/v36/i07/},
+	Volume = {36},
+	Year = {2010},
+	Bdsk-Url-1 = {http://www.jstatsoft.org/v36/i07/}}

Modified: pkg/depmixS4/inst/doc/depmixS4.pdf
===================================================================
(Binary files differ)

Modified: pkg/depmixS4/man/em.control.Rd
===================================================================
--- pkg/depmixS4/man/em.control.Rd	2012-05-10 20:08:59 UTC (rev 515)
+++ pkg/depmixS4/man/em.control.Rd	2012-06-12 13:14:18 UTC (rev 516)
@@ -8,7 +8,7 @@
 
 \usage{
 	
-	em.control(maxit = 100, tol = 1e-08, crit = "relative", random.start = TRUE)
+	em.control(maxit = 500, tol = 1e-08, crit = "relative", random.start = TRUE)
 	
 }