[Depmix-commits] r516 - in pkg/depmixS4: . inst/doc man
noreply at r-forge.r-project.org
Tue Jun 12 15:14:18 CEST 2012
Author: ingmarvisser
Date: 2012-06-12 15:14:18 +0200 (Tue, 12 Jun 2012)
New Revision: 516
Modified:
pkg/depmixS4/DESCRIPTION
pkg/depmixS4/inst/doc/depmixS4.Rnw
pkg/depmixS4/inst/doc/depmixS4.bib
pkg/depmixS4/inst/doc/depmixS4.pdf
pkg/depmixS4/man/em.control.Rd
Log:
Added subsection on missing data to the vignette as well as a table with possible response models.
Modified: pkg/depmixS4/DESCRIPTION
===================================================================
--- pkg/depmixS4/DESCRIPTION 2012-05-10 20:08:59 UTC (rev 515)
+++ pkg/depmixS4/DESCRIPTION 2012-06-12 13:14:18 UTC (rev 516)
@@ -1,10 +1,10 @@
Package: depmixS4
-Version: 1.1-1
-Date: 2012-02-29
+Version: 1.2-0
+Date: 2012-06-12
Title: Dependent Mixture Models - Hidden Markov Models of GLMs and Other Distributions in S4
Author: Ingmar Visser <i.visser at uva.nl>, Maarten Speekenbrink <m.speekenbrink at ucl.ac.uk>
Maintainer: Ingmar Visser <i.visser at uva.nl>
-Depends: R (>= 2.14.2), stats, nnet, methods, MASS, Rsolnp, stats4
+Depends: R (>= 2.15.0), stats, nnet, methods, MASS, Rsolnp, stats4
Suggests: gamlss, gamlss.dist, TTR
Description: Fit latent (hidden) Markov models on mixed categorical and continuous (timeseries)
data, otherwise known as dependent mixture models
Modified: pkg/depmixS4/inst/doc/depmixS4.Rnw
===================================================================
--- pkg/depmixS4/inst/doc/depmixS4.Rnw 2012-05-10 20:08:59 UTC (rev 515)
+++ pkg/depmixS4/inst/doc/depmixS4.Rnw 2012-06-12 13:14:18 UTC (rev 516)
@@ -18,28 +18,33 @@
\author{Ingmar Visser\\University of Amsterdam \And
Maarten Speekenbrink\\University College London}
\Plainauthor{Ingmar Visser, Maarten Speekenbrink}
-
+
\title{\pkg{depmixS4}: An \proglang{R} Package for Hidden Markov Models}
\Plaintitle{depmixS4: An R Package for Hidden Markov Models}
\Abstract{
- This introduction to the \proglang{R} package \pkg{depmixS4} is a (slightly)
- modified version of \cite{Visser+Speekenbrink:2010}, published in the
- \emph{Journal of Statistical Software}. Please refer to that
- article when using \pkg{depmixS4}. The current version is 1.0-1.
- \pkg{depmixS4} implements a general framework for defining and
- estimating dependent mixture models in the \proglang{R}
- programming language. This includes standard Markov
- models, latent/hidden Markov models, and latent class and finite
- mixture distribution models. The models can be fitted on mixed
- multivariate data with distributions from the \code{glm} family,
- the (logistic) multinomial, or the multivariate normal
- distribution. Other distributions can be added easily, and an
- example is provided with the {\em exgaus} distribution.
- Parameters are estimated by the expectation-maximization (EM) algorithm or, when (linear)
- constraints are imposed on the parameters, by direct numerical
- optimization with the \pkg{Rsolnp} or \pkg{Rdonlp2} routines.
+ This introduction to the \proglang{R} package \pkg{depmixS4} is a
+ (slightly) modified version of \cite{Visser+Speekenbrink:2010},
+ published in the \emph{Journal of Statistical Software}. Please
+ refer to that article when using \pkg{depmixS4}. The current
+ version is 1.1-1; the version history and changes can be found in
+ the NEWS file of the package. Below, the major versions are listed
+ along with the most noteworthy changes.
+
+ \pkg{depmixS4} implements a general framework for defining and
+ estimating dependent mixture models in the \proglang{R} programming
+ language. This includes standard Markov models, latent/hidden
+ Markov models, and latent class and finite mixture distribution
+ models. The models can be fitted on mixed multivariate data with
+ distributions from the \code{glm} family, the (logistic)
+ multinomial, or the multivariate normal distribution. Other
+ distributions can be added easily, and an example is provided with
+ the {\em exgaus} distribution. Parameters are estimated by the
+ expectation-maximization (EM) algorithm or, when (linear)
+ constraints are imposed on the parameters, by direct numerical
+ optimization with the \pkg{Rsolnp} or \pkg{Rdonlp2} routines.
+
}
\Keywords{hidden Markov model, dependent mixture model, mixture model, constraints}
@@ -68,6 +73,16 @@
library("depmixS4")
@
+\section*{Version history}
+
+\begin{description}
+ \item[1.1-0] Speed improvements due to writing the main loop in C code.
+ \item[1.0-0] First release with this vignette, a modified version
+ of the paper in the Journal of Statistical Software.
+ \item[0.1-0] First version on CRAN.
+\end{description}
+
+
\section{Introduction}
Markov and latent Markov models are frequently used in the social
@@ -143,7 +158,7 @@
responding. These variables are measured on 168, 134 and 137
occasions respectively (the first series of 168 trials is plotted in
Figure~\ref{fig:speed}). These data are more fully described in
-\citet{Dutilh2010}, and in the next section a number of example models
+\citet{Dutilh2011}, and in the next section a number of example models
for these data is described.
\setkeys{Gin}{width=0.8\textwidth}
@@ -331,7 +346,30 @@
linear (in)equality constraints (and optionally also non-linear
constraints).
+\subsection[Missing data]{Missing data}\label{missingdata}
+Missing data can be dealt with straightforwardly in computing the
+likelihood using the forward recursion in
+Equations~(\ref{eq:fwd1}--\ref{eq:fwdt}). Assume we have observed
+$\vc{O}_{1:(t-1)}$ but that observation $\vc{O}_{t}$ is missing. The
+key idea is that, in this case, the filtering distribution, the
+probabilities $\phi_{t}$, should be identical to the state prediction
+distribution, as there is no additional information to estimate
+the current state. Thus, the forward variables $\phi_{t}$ are now
+computed as:
+%
+\begin{align}
+ \phi_t(i) &= \Prob(S_{t} = i|\vc{O}_{1:(t-1)})
+ \\ &= \sum_{j=1}^N \phi_{t-1}(j) \Prob(S_t = i|S_{t-1}=j).
+\end{align}
+%
+For later observations, we can then use this latter equation again,
+realizing that the filtering distribution is technically e.g.
+$\Prob(S_{t+1}|\vc{O}_{1:(t-1),t+1})$. Computationally, the easiest
+way to implement this is to simply set $\vc{b}(\vc{O}_t|S_t) = 1$ if
+$\vc{O}_t$ is missing.
+
+
\section[Using depmixS4]{Using \pkg{depmixS4}}
Two steps are involved in using \pkg{depmixS4} which are illustrated
@@ -353,11 +391,11 @@
Throughout this article a data set called \code{speed} is used. As
already indicated in the introduction, it consists of three time
-series with three variables: response time \code{rt}, accuracy \code{corr}, and a covariate,
-\code{Pacc}, which defines the relative pay-off for speeded versus accurate
-responding. Before describing some of the models that are fitted to
-these data, we provide a brief sketch of the reasons for gathering
-these data in the first place.
+series with three variables: response time \code{rt}, accuracy
+\code{corr}, and a covariate, \code{Pacc}, which defines the relative
+pay-off for speeded versus accurate responding. Before describing
+some of the models that are fitted to these data, we provide a brief
+sketch of the reasons for gathering these data in the first place.
Response times are a very common dependent variable in psychological
experiments and hence form the basis for inference about many
@@ -383,7 +421,7 @@
experiment was designed to investigate what would happen when this
reward variable changes from reward for accuracy only to reward for
speed only. The \code{speed} data that we analyse here are from
-participant A in Experiment 1 in \citet{Dutilh2010}, who provide a
+participant A in Experiment 1 in \citet{Dutilh2011}, who provide a
complete description of the experiment and the relevant theoretical
background.
@@ -794,8 +832,33 @@
separate fit functions for each part of the model, the prior
probability model, the transition models, and the response models. As
a consequence, adding user-specified response models is
-straightforward.
+straightforward. The currently implemented distributions are listed
+in Table~\ref{tbl:responses}.
+\begin{table}
+ \begin{center}
+ \begin{tabular}{lll}
+ \hline
+ package & family & link \\
+ \hline
+ \hline
+ \pkg{stats} & binomial & logit, probit, cauchit, log, cloglog \\
+ \pkg{stats} & gaussian & identity, log, inverse \\
+ \pkg{stats} & Gamma & inverse, identity, log \\
+ \pkg{stats} & poisson & log, identity, sqrt \\
+ \pkg{depmixS4} & multinomial & logit, identity (no covariates allowed) \\
+ \pkg{depmixS4} & multivariate normal & identity (only
+ available through makeDepmix) \\
+ \pkg{depmixS4} & ex-gauss & identity (only
+ available through makeDepmix as example) \\
+ \hline
+ \end{tabular}
+ \end{center}
+ \caption{Response distributions available in \pkg{depmixS4}.}
+ \label{tbl:responses}
+\end{table}
+
+
User-defined distributions should extend the `\code{response}' class and
have the following slots:
\begin{enumerate}
@@ -1021,7 +1084,7 @@
(NEST). Maarten Speekenbrink was supported by ESRC grant
RES-062-23-1511 and the ESRC Centre for Economic Learning and Social
Evolution (ELSE). Han van der Maas provided the speed-accuracy data
-\citep{Dutilh2010} and thereby necessitated implementing models with
+\citep{Dutilh2011} and thereby necessitated implementing models with
time-dependent covariates. Brenda Jansen provided the balance scale
data set \citep{Jansen2002} which was the perfect opportunity to test
the covariates on the prior model parameters. The examples in the
Modified: pkg/depmixS4/inst/doc/depmixS4.bib
===================================================================
--- pkg/depmixS4/inst/doc/depmixS4.bib 2012-05-10 20:08:59 UTC (rev 515)
+++ pkg/depmixS4/inst/doc/depmixS4.bib 2012-06-12 13:14:18 UTC (rev 516)
@@ -1,3 +1,24 @@
+%% This BibTeX bibliography file was created using BibDesk.
+%% http://bibdesk.sourceforge.net/
+
+
+%% Created for Ingmar Visser at 2012-06-12 11:37:41 +0200
+
+
+%% Saved with string encoding Unicode (UTF-8)
+
+
+
+@article{Dutilh2011,
+ Author = {Gilles Dutilh and Eric--Jan Wagenmakers and Ingmar Visser and Han L. J. van der Maas},
+ Date-Added = {2012-06-12 09:36:06 +0000},
+ Date-Modified = {2012-06-12 09:36:06 +0000},
+ Journal = {Cognitive Science},
+ Pages = {211-250},
+ Title = {A Phase Transition Model for the Speed--Accuracy Trade--Off in Response Time Experiments},
+ Volume = {35},
+ Year = {2011}}
+
@book{Zucchini2009,
Address = {Boca Raton},
Author = {Walter Zucchini and Iain MacDonald},
@@ -12,17 +33,17 @@
Title = {\pkg{Rsolnp}: General Non-Linear Optimization Using Augmented Lagrange Multiplier Method},
Url = {http://CRAN.R-project.org/package=Rsolnp},
Year = {2010},
-}
+ Bdsk-Url-1 = {http://CRAN.R-project.org/package=Rsolnp}}
@manual{R2010,
- title = {\proglang{R}: {A} Language and Environment for Statistical Computing},
- author = {{\proglang{R} Development Core Team}},
- organization = {\proglang{R} Foundation for Statistical Computing},
- address = {Vienna, Austria},
- year = {2010},
- note = {{ISBN} 3-900051-07-0},
- url = {http://www.R-project.org/}
-}
+ Address = {Vienna, Austria},
+ Author = {{\proglang{R} Development Core Team}},
+ Note = {{ISBN} 3-900051-07-0},
+ Organization = {\proglang{R} Foundation for Statistical Computing},
+ Title = {\proglang{R}: {A} Language and Environment for Statistical Computing},
+ Url = {http://www.R-project.org/},
+ Year = {2010},
+ Bdsk-Url-1 = {http://www.R-project.org/}}
@book{Salzberg1998,
Address = {Amsterdam},
@@ -50,7 +71,8 @@
Publisher = {Statistical Innovations Inc.},
Title = {\pkg{Latent Gold}~3.0},
Url = {http://www.statisticalinnovations.com/},
- Year = 2003}
+ Year = 2003,
+ Bdsk-Url-1 = {http://www.statisticalinnovations.com/}}
@book{Venables2002,
Address = {New York},
@@ -58,8 +80,7 @@
Edition = {4th},
Publisher = {Springer-Verlag},
Title = {Modern Applied Statistics with \proglang{S}},
- Year = {2002}
-}
+ Year = {2002}}
@article{Maas1992,
Author = {Han L. J. {Van der Maas} and Peter C. M. Molenaar},
@@ -90,7 +111,7 @@
Title = {\pkg{Rdonlp2}: An \proglang{R} Extension Library to Use Peter Spelluci's \pkg{donlp2} from \proglang{R}},
Url = {http://arumat.net/Rdonlp2/},
Year = {2009},
-}
+ Bdsk-Url-1 = {http://arumat.net/Rdonlp2/}}
@manual{Stasinopoulos2009a,
Author = {D. Mikis Stasinopoulos and Bob A. Rigby and Calliope Akantziliotou},
@@ -98,7 +119,7 @@
Title = {\pkg{gamlss}: Generalized Additive Models for Location Scale and Shape},
Url = {http://CRAN.R-project.org/package=gamlss},
Year = {2009},
-}
+ Bdsk-Url-1 = {http://CRAN.R-project.org/package=gamlss}}
@article{Rigby2005,
Author = {R. A. Rigby and D. M. Stasinopoulos},
@@ -113,12 +134,12 @@
Author = {D. Mikis Stasinopoulos and Bob A. Rigby},
Journal = {Journal of Statistical Software},
Number = 7,
+ Pages = {1--46},
Title = {Generalized Additive Models for Location Scale and Shape (GAMLSS) in \proglang{R}},
Url = {http://www.jstatsoft.org/v23/i07},
Volume = 23,
- Pages = {1--46},
Year = 2007,
-}
+ Bdsk-Url-1 = {http://www.jstatsoft.org/v23/i07}}
@manual{Stasinopoulos2009b,
Author = {D. Mikis Stasinopoulos and Bob A. Rigby and Calliope Akantziliotou and Gillian Heller and Raydonal Ospina and Nicoletta Motpan},
@@ -126,15 +147,15 @@
Title = {\pkg{gamlss.dist}: Distributions to Be Used for GAMLSS Modelling},
Url = {http://CRAN.R-project.org/package=gamlss.dist},
Year = {2010},
-}
+ Bdsk-Url-1 = {http://CRAN.R-project.org/package=gamlss.dist}}
@manual{Spellucci2002,
Author = {Peter Spellucci},
+ Organization = {TU Darmstadt},
Title = {\pkg{donlp2} Users Guide},
- Organization = {TU Darmstadt},
Url = {http://www.mathematik.tu-darmstadt.de/fbereiche/numerik/staff/spellucci/DONLP2/},
Year = {2002},
-}
+ Bdsk-Url-1 = {http://www.mathematik.tu-darmstadt.de/fbereiche/numerik/staff/spellucci/DONLP2/}}
@book{Siegler1981,
Author = {Robert S. Siegler},
@@ -163,11 +184,11 @@
@article{Lystig2002,
Author = {Theodore C. Lystig and James P. Hughes},
Journal = {Journal of Computational and Graphical Statistics},
+ Number = {3},
+ Pages = {678--689},
Title = {Exact Computation of the Observed Information Matrix for Hidden Markov Models},
- Year = {2002},
Volume = {11},
- Number = {3},
- Pages = {678--689}}
+ Year = {2002}}
@article{Leroux1992,
Author = {B. G. Leroux and M. L. Puterman},
@@ -185,21 +206,20 @@
Title = {FlexMix: A General Framework for Finite Mixture Models and Latent Class Regression in \proglang{R}},
Url = {http://www.jstatsoft.org/v11/i08/},
Volume = {11},
- Year = {2004}
-}
+ Year = {2004},
+ Bdsk-Url-1 = {http://www.jstatsoft.org/v11/i08/}}
+@article{GruenLeisch2008,
+ Author = {Bettina Gr\"un and Friedrich Leisch},
+ Journal = {Journal of Statistical Software},
+ Number = {4},
+ Pages = {1--35},
+ Title = {{FlexMix} Version~2: Finite Mixtures with Concomitant Variables and Varying and Constant Parameters},
+ Url = {http://www.jstatsoft.org/v28/i04/},
+ Volume = {28},
+ Year = {2008},
+ Bdsk-Url-1 = {http://www.jstatsoft.org/v28/i04/}}
-@Article{GruenLeisch2008,
- title = {{FlexMix} Version~2: Finite Mixtures with Concomitant Variables and Varying and Constant Parameters},
- author = {Bettina Gr\"un and Friedrich Leisch},
- journal = {Journal of Statistical Software},
- year = {2008},
- volume = {28},
- number = {4},
- pages = {1--35},
- url = {http://www.jstatsoft.org/v28/i04/},
-}
-
@incollection{Krogh1998,
Address = {Amsterdam},
Author = {Anders Krogh},
@@ -238,9 +258,9 @@
Year = 1994}
@book{Fruhwirth2006,
+ Address = {New York},
Author = {Sylvia Fr\"uhwirth-Schnatter},
Publisher = {Springer-Verlag},
- Address = {New York},
Title = {Finite Mixture and Markov Switching Models},
Year = {2006}}
@@ -251,8 +271,7 @@
Pages = {309-321},
Title = {Likelihood Ratio Testing for Hidden Markov Models under Non-Standard Conditions},
Volume = {25},
- Year = {2007},
-}
+ Year = {2007}}
@article{Chung2007,
Author = {Hwan Chung and Theodore Walls and Yousung Park},
@@ -261,8 +280,7 @@
Pages = {413-435},
Title = {A Latent Transition Model With Logistic Regression},
Volume = {72},
- Year = 2007,
-}
+ Year = 2007}
@book{Cappe2005,
Address = {New York},
@@ -303,8 +321,7 @@
Pages = {2079--2091},
Title = {Multiple Learning Modes in the Development of Rule-based Category-learning Task Performance},
Volume = 44,
- Year = 2006,
-}
+ Year = 2006}
@incollection{Visser2009b,
Address = {New York},
@@ -315,22 +332,15 @@
Pages = {269-289},
Publisher = {Springer-Verlag},
Title = {Hidden Markov Models for Individual Time Series},
- Year = {2009},
-}
+ Year = {2009}}
-@Unpublished{Dutilh2010,
- Author = {Gilles Dutilh and Eric-Jan Wagenmakers and Ingmar Visser and Han L. J. van der Maas},
- Note = {Submitted for publication},
- Title = {A Phase Transition Model for the Speed-Accuracy Trade-Off in Response Time Experiments},
- Year = {2010}}
-
-@Article{Visser+Speekenbrink:2010,
- author = {Ingmar Visser and Maarten Speekenbrink},
- title = {\pkg{depmixS4}: An \proglang{R} Package for Hidden Markov Models},
- journal = {Journal of Statistical Software},
- year = {2010},
- volume = {36},
- number = {7},
- pages = {1--21},
- url = {http://www.jstatsoft.org/v36/i07/}
-}
+@article{Visser+Speekenbrink:2010,
+ Author = {Ingmar Visser and Maarten Speekenbrink},
+ Journal = {Journal of Statistical Software},
+ Number = {7},
+ Pages = {1--21},
+ Title = {\pkg{depmixS4}: An \proglang{R} Package for Hidden Markov Models},
+ Url = {http://www.jstatsoft.org/v36/i07/},
+ Volume = {36},
+ Year = {2010},
+ Bdsk-Url-1 = {http://www.jstatsoft.org/v36/i07/}}
Modified: pkg/depmixS4/inst/doc/depmixS4.pdf
===================================================================
(Binary files differ)
Modified: pkg/depmixS4/man/em.control.Rd
===================================================================
--- pkg/depmixS4/man/em.control.Rd 2012-05-10 20:08:59 UTC (rev 515)
+++ pkg/depmixS4/man/em.control.Rd 2012-06-12 13:14:18 UTC (rev 516)
@@ -8,7 +8,7 @@
\usage{
- em.control(maxit = 100, tol = 1e-08, crit = "relative", random.start = TRUE)
+ em.control(maxit = 500, tol = 1e-08, crit = "relative", random.start = TRUE)
}
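
Note on the missing-data subsection added to the vignette in this revision: the rule "set b(O_t|S_t) = 1 if O_t is missing" can be illustrated with a small base-R sketch of a scaled forward recursion. This is only an illustration under stated assumptions, not the code used inside depmixS4: the function name forward_filter, its arguments (init, trans, dens), and the two-state Gaussian example are all hypothetical, and rows of the transition matrix are assumed to index the previous state.

```r
# Scaled forward (filtering) recursion for a discrete-state hidden Markov model.
# All names here are hypothetical; this is not depmixS4's internal code.
#   init:  length-N vector of initial state probabilities
#   trans: N x N matrix with trans[j, i] = P(S_t = i | S_{t-1} = j)
#   dens:  T x N matrix with dens[t, i] = b(O_t | S_t = i); NA where O_t is missing
forward_filter <- function(init, trans, dens) {
  # A missing observation carries no information: use b(O_t | S_t) = 1.
  dens[is.na(dens)] <- 1
  nT <- nrow(dens)
  phi <- matrix(NA_real_, nrow = nT, ncol = length(init))
  logLike <- 0
  pred <- init
  for (t in seq_len(nT)) {
    # State prediction: sum_j phi_{t-1}(j) P(S_t = i | S_{t-1} = j)
    if (t > 1) pred <- as.vector(phi[t - 1, ] %*% trans)
    unnorm <- pred * dens[t, ]
    sc <- sum(unnorm)          # scaling factor; equals 1 when O_t is missing
    phi[t, ] <- unnorm / sc    # filtering distribution phi_t
    logLike <- logLike + log(sc)
  }
  list(phi = phi, logLike = logLike)
}

# Hypothetical two-state example with the second observation missing.
init  <- c(0.5, 0.5)
trans <- matrix(c(0.9, 0.1,
                  0.2, 0.8), nrow = 2, byrow = TRUE)
y     <- c(0.1, NA, 2.3)
dens  <- cbind(dnorm(y, mean = 0, sd = 1), dnorm(y, mean = 2, sd = 1))
res   <- forward_filter(init, trans, dens)
```

In this sketch the filtering distribution at the missing time point equals the state prediction from the previous step, and that time point adds nothing to the log-likelihood because its scaling factor is 1.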