[Depmix-commits] r372 - papers/jss
noreply at r-forge.r-project.org
Thu Feb 25 22:55:10 CET 2010
Author: ingmarvisser
Date: 2010-02-25 22:55:09 +0100 (Thu, 25 Feb 2010)
New Revision: 372
Modified:
papers/jss/dpx4Rev.R
papers/jss/dpx4Rev.Rnw
papers/jss/dpx4Rev.tex
Log:
Minor updates in jss paper
Modified: papers/jss/dpx4Rev.R
===================================================================
--- papers/jss/dpx4Rev.R 2010-02-25 21:54:31 UTC (rev 371)
+++ papers/jss/dpx4Rev.R 2010-02-25 21:55:09 UTC (rev 372)
@@ -12,7 +12,7 @@
library(depmixS4)
data(speed)
set.seed(1)
-mod <- depmix(rt~1, data=speed, nstates=2, trstart=runif(4))
+mod <- depmix(response=rt~1, data=speed, nstates=2, trstart=runif(4))
###################################################
@@ -24,7 +24,7 @@
###################################################
### chunk number 4:
###################################################
-fm
+fm
###################################################
@@ -45,7 +45,7 @@
###################################################
### chunk number 7:
###################################################
-summary(fm)
+summary(fm, which="transition")
###################################################
@@ -61,7 +61,7 @@
###################################################
### chunk number 9:
###################################################
-summary(fm)
+summary(fm,which="response")
###################################################
@@ -80,7 +80,7 @@
###################################################
### chunk number 11:
###################################################
-summary(fm)
+summary(fm,which="prior")
###################################################
@@ -151,12 +151,16 @@
)
+
###################################################
### chunk number 18:
###################################################
setMethod("dens","exgaus",
function(object,log=FALSE) {
- dexGAUS(object@y, mu = predict(object), sigma = exp(object@parameters$sigma), nu = exp(object@parameters$nu), log = log)
+ dexGAUS(object@y, mu = predict(object),
+ sigma = exp(object@parameters$sigma),
+ nu = exp(object@parameters$nu),
+ log = log)
}
)
@@ -191,7 +195,7 @@
"fixed" = {
object@fixed <- as.logical(values)
}
- )
+ )
names(object@parameters) <- nms
return(object)
}
Modified: papers/jss/dpx4Rev.Rnw
===================================================================
--- papers/jss/dpx4Rev.Rnw 2010-02-25 21:54:31 UTC (rev 371)
+++ papers/jss/dpx4Rev.Rnw 2010-02-25 21:55:09 UTC (rev 372)
@@ -516,16 +516,16 @@
complete description of the experiment and the relevant theoretical
background.
-The central hypothesis about this data is whether indeed it is best
-described by two modes of responding rather than a single mode of
-responding with a continuous trade-off between speed and accuracy.
-The hallmark of having a discontinuity between slow versus speeded
+The central question about these data is whether they are best
+described by two modes of responding rather than by a single mode of
+responding with a continuous trade-off between speed and accuracy.
+The hallmark of a discontinuity between slow versus speeded
responding is that the switching between the two modes is asymmetric
-\citep[see e.g.][for a theoretical underpinning of this
-claim]{Maas1992}. The \code{fit} help page of \pkg{depmixS4} provides
-a number of examples in which the assymetricity of the switching
-process is tested; those examples are discussed at length in
-\citet{Visser2009b}.
+\citep[see e.g.][for a theoretical underpinning of this
+claim]{Maas1992}. The \code{fit} help page of \pkg{depmixS4} provides
+a number of examples in which the asymmetry of the switching
+process is tested; those examples and other candidate models are
+discussed at length in \citet{Visser2009b}.
\subsection{A simple model}
@@ -554,37 +554,37 @@
the variables from \code{response} are interpreted. Third, the number
of states of the model is given by \code{nstates=2}.
-\subsubsection{Starting values}
-Note also that start values for the transition parameters are provided
-in this call using the \code{trstart} argument. At this time, the
-package does not provide automatic starting values. Starting values
-for parameters can be provided using three arguments: \code{instart}
-for the parameters relating to the prior probabilities,
-\coce{trstart}, relating the transition models, and \code{respstart}
-for the parameters of the response models. Note that the starting
-values for the initial and transition models as well as of the
-multinomial logit response models are interpreted as {\em probabilities},
-and internally converted to multinomial logit parameters.
+\paragraph{Starting values} Note also that start values for the
+transition parameters are provided in this call using the
+\code{trstart} argument. At this time, the package does not provide
+automatic starting values. Starting values for parameters can be
+provided using three arguments: \code{instart} for the parameters
+relating to the prior probabilities, \code{trstart} for the parameters
+of the transition models, and \code{respstart} for the parameters of the
+response models. Note that the starting values for the initial and
+transition models as well as of the multinomial logit response models
+are interpreted as {\em probabilities}, and internally converted to
+multinomial logit parameters.
-\subsubsection{Fitting the model and printing results}
-
-The \code{depmix} function returns an object of class \code{depmix}
-which contains the model specification, and not a fitted model. Hence,
-the model needs to be fitted by calling \code{fit}:
-
+\paragraph{Fitting the model and printing results} The \code{depmix}
+function returns an object of class \code{depmix} which contains the
+model specification, and not a fitted model. Hence, the model needs
+to be fitted by calling \code{fit}:
<<>>=
fm <- fit(mod)
@
+
The \code{fit}-function returns an object of class
\code{depmix.fitted} which extends the \code{depmix} class, adding
convergence information (and information about constraints if these
-were applied, see below). The \code{print} method provides summary
-information on convergence, the log-likelihood and the AIC and BIC
-values:
+were applied, see below). The class has \code{print} and
+\code{summary} methods to see the results. The \code{print} method
+provides summary information on convergence, the log-likelihood and
+the AIC and BIC values:
<<>>=
-fm
+fm
@
These statistics can also be extracted using \code{logLik}, \code{AIC}
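
The starting-value arguments described above (instart, trstart and respstart) can all be supplied in a single depmix call. A minimal sketch, with illustrative values that are assumptions rather than values from the paper (trstart is assumed to be read row-wise, and respstart is assumed to be ordered as mean followed by standard deviation within each state):

library(depmixS4)
data(speed)
set.seed(1)
mod <- depmix(response = rt ~ 1, data = speed, nstates = 2,
              instart = c(0.5, 0.5),              # initial state probabilities
              trstart = c(0.9, 0.1, 0.1, 0.9),    # transition probabilities, assumed row-wise
              respstart = c(5.5, 0.2, 6.4, 0.25)) # per state: mean, then sd (assumed ordering)
fm <- fit(mod)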
@@ -603,8 +603,6 @@
summary(fm)
@
-
-
Since no further arguments were specified, the initial state, state
transition and response distributions were set to their defaults
(multinomial distributions for the first two, and a Gaussian
@@ -640,7 +638,7 @@
Applying \code{summary} to the \code{fit}ted object gives (only
transition models printed here):
<<>>=
-summary(fm)
+summary(fm, which="transition")
@
The summary provides all parameters of the model, also the (redundant)
@@ -669,7 +667,7 @@
This provides the following fitted model parameters (only the
response parameters are given here):
<<>>=
-summary(fm)
+summary(fm,which="response")
@
As can be seen, state 1 has fast response times and accuracy is
@@ -710,7 +708,6 @@
\end{center}
\end{figure}
-
Children in the lower grades of primary school are known to ignore the
distance dimension, and base their answer only on the number of
weights on each side. Hence, they would typically provide the wrong
@@ -751,9 +748,14 @@
The summary of the \code{fit}ted model gives the following (only the
prior model is shown here):
<<>>=
-summary(fm)
+summary(fm,which="prior")
@
+%
+% add more explanation about the parameters, possibly add a plot of
+% the class membership as function of age
+%
+
Hence at young ages, children have a high probability of incorrect
responding in class~1, whereas the prior probability for class~2
increases with age.
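
Regarding the TODO note above about plotting class membership as a function of age, a rough sketch could look as follows; the intercept and slope are placeholders (the fitted prior-model coefficients are not shown in this hunk), and with two classes the baseline-category multinomial logit reduces to a logistic curve:

b0  <- -3; b1 <- 0.5                  # placeholder intercept and age slope for the class-2 prior
age <- seq(2, 10, length.out = 100)   # illustrative age range
p2  <- plogis(b0 + b1 * age)          # prior probability of class 2 as a function of age
plot(age, p2, type = "l", xlab = "age", ylab = "prior probability of class 2")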
@@ -761,20 +763,20 @@
\subsection{Fixing and constraining parameters}
-Using package \pkg{Rdonlp2} by \citet{Tamura2009}, parameters may be
-fitted subject to general linear (in-)equality constraints.
-Constraining and fixing parameters is done using the \code{conpat}
-argument to the \code{fit}-function, which specifies for each
-parameter in the model whether it's fixed (0) or free (1 or higher).
-Equality constraints can be imposed by having two parameters have the
-same number in the \code{conpat} vector. When only fixed values are
-required, the \code{fixed} argument can be used instead of
-\code{conpat}, with zeroes for fixed parameters and other values (ones
-e.g.) for non-fixed parameters. Fitting the models subject to these
-constraints is handled by the optimization routine \code{donlp2}. To
-be able to construct the \code{conpat} and/or \code{fixed} vectors one
-needs the correct ordering of parameters which is briefly discussed
-next before proceeding with an example.
+Using package \pkg{Rdonlp2} by \citet{Tamura2009} or \pkg{Rsolnp} by
+\citet{ZZZZ???}, parameters may be fitted subject to general linear
+(in-)equality constraints. Constraining and fixing parameters is done
+using the \code{conpat} argument to the \code{fit}-function, which
+specifies for each parameter in the model whether it's fixed (0) or
+free (1 or higher). Equality constraints can be imposed by having two
+parameters have the same number in the \code{conpat} vector. When
+only fixed values are required, the \code{fixed} argument can be used
+instead of \code{conpat}, with zeroes for fixed parameters and other
+values (ones e.g.) for non-fixed parameters. Fitting the models
+subject to these constraints is handled by the optimization routine
+\code{donlp2}. To be able to construct the \code{conpat} and/or
+\code{fixed} vectors one needs the correct ordering of parameters
+which is briefly discussed next before proceeding with an example.
\paragraph{Parameter numbering} When using the \code{conpat} and
\code{fixed} arguments, complete parameter vectors should be supplied,
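
As an illustration of the conpat and fixed arguments described above, the sketch below builds a constraint vector and passes it to fit. The indices are hypothetical; in practice they follow the parameter ordering discussed in the `Parameter numbering' paragraph, which can be inspected with getpars on a fitted model:

pars   <- getpars(fm)             # parameter vector of the fitted model, in package order
conpat <- rep(1, length(pars))    # 1 = freely estimated
conpat[1] <- 0                    # fix the first parameter at its value (hypothetical index)
conpat[4] <- conpat[8] <- 2       # equal nonzero labels impose an equality constraint (hypothetical)
fmc <- fit(fm, conpat = conpat)   # refit subject to these constraints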
@@ -916,12 +918,18 @@
mod
}
)
+
@
+
+
<<echo=FALSE>>=
setMethod("dens","exgaus",
function(object,log=FALSE) {
- dexGAUS(object@y, mu = predict(object), sigma = exp(object@parameters$sigma), nu = exp(object@parameters$nu), log = log)
+ dexGAUS(object@y, mu = predict(object),
+ sigma = exp(object@parameters$sigma),
+ nu = exp(object@parameters$nu),
+ log = log)
}
)
@@ -956,7 +964,7 @@
"fixed" = {
object@fixed <- as.logical(values)
}
- )
+ )
names(object@parameters) <- nms
return(object)
}
Modified: papers/jss/dpx4Rev.tex
===================================================================
--- papers/jss/dpx4Rev.tex 2010-02-25 21:54:31 UTC (rev 371)
+++ papers/jss/dpx4Rev.tex 2010-02-25 21:55:09 UTC (rev 372)
@@ -167,29 +167,18 @@
(O_{1}^{1}, \ldots, O_{1}^{m}$, $O_{2}^{1}, \ldots, O_{2}^{m}$,
\ldots, $O_{T}^{1}, \ldots, O_{T}^{m})$ for an $m$-variate time series
of length $T$. In the following, we use $\vc{O}_{t}$ as shorthand for
-$O_{t}^{1}, \ldots, O_{t}^{m}$. As an example, consider a time series of responses
-generated by a single participant in a psychological response time
-experiment. The data consists of three variables, response time,
-response accuracy, and a covariate which is a pay-off variable
-reflecting the relative reward for speeded and/or accurate responding.
-These variables are measured on 168, 134 and 137 occasions
-respectively (the first part of this series is plotted in
+$O_{t}^{1}, \ldots, O_{t}^{m}$. As an example, consider a time series
+of responses generated by a single participant in a psychological
+response time experiment. The data consist of three variables,
+response time, response accuracy, and a covariate which is a pay-off
+variable reflecting the relative reward for speeded and/or accurate
+responding. These variables are measured on 168, 134 and 137
+occasions respectively (the first part of this series is plotted in
Figure~\ref{fig:speed}). These data are more fully described in
\citet{Dutilh2009}.
\begin{figure}[htbp]
\begin{center}
-\begin{Schunk}
-\begin{Soutput}
-##
-## Markov Chain Monte Carlo Package (MCMCpack)
-## Copyright (C) 2003-2010 Andrew D. Martin, Kevin M. Quinn, and Jong Hee Park
-##
-## Support provided by the U.S. National Science Foundation
-## (Grants SES-0350646 and SES-0350613)
-##
-\end{Soutput}
-\end{Schunk}
\includegraphics{dpx4Rev-001}
\caption{Response times (rt), accuracy (corr) and pay-off values (Pacc) for
the first series of responses in dataset \code{speed}.}
@@ -350,14 +339,12 @@
To compute the log-likelihood, \cite{Lystig2002} define the following
(forward) recursion:
\begin{align}
- \phi_{1}(j) &:= \Prob(\vc{O}_{1}, S_{1}=j) = \pi_{j} b_{j}(\vc{O}_{1})
+ \phi_{1}(j) &:= \Prob(\vc{O}_{1}, S_{1}=j) = \pi_{j} \vc{b}_{j}(\vc{O}_{1})
\label{eq:fwd1} \\
-%\begin{split}
\phi_{t}(j) &:= \Prob(\vc{O}_{t}, S_{t}=j|\vc{O}_{1:(t-1)}) %\\
- = \sum_{i=1}^{N} [\phi_{t-1}(i)a_{ij}b_{j}(\vc{O}_{t})] \times
+ = \sum_{i=1}^{N} [\phi_{t-1}(i)a_{ij} \vc{b}_{j}(\vc{O}_{t})] \times
(\Phi_{t-1})^{-1},
\label{eq:fwdt}
-%\end{split}
\end{align}
where $\Phi_{t}=\sum_{i=1}^{N} \phi_{t}(i)$. Combining
$\Phi_{t}=\Prob(\vc{O}_{t}|\vc{O}_{1:(t-1)})$, and
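
The scaled forward recursion above translates almost directly into code. The following is a generic stand-alone sketch (not the package's internal routine), assuming a fixed transition matrix A, initial probabilities prior, and a matrix B with B[t, j] = b_j(O_t); the log-likelihood is the sum of the logged scaling factors Phi_t:

forward_loglik <- function(prior, A, B) {
  nt  <- nrow(B)                        # number of time points
  phi <- prior * B[1, ]                 # phi_1(j) = pi_j * b_j(O_1)
  Phi <- numeric(nt)
  Phi[1] <- sum(phi)
  phi <- phi / Phi[1]                   # rescale at every step to avoid underflow
  for (t in 2:nt) {
    phi    <- as.vector(phi %*% A) * B[t, ]  # sum_i phi_{t-1}(i) a_ij b_j(O_t)
    Phi[t] <- sum(phi)
    phi    <- phi / Phi[t]
  }
  sum(log(Phi))                         # log P(O_{1:T}) = sum_t log Phi_t
}

## toy check with made-up densities for a 2-state model
set.seed(1)
B <- matrix(runif(20), nrow = 10, ncol = 2)
forward_loglik(prior = c(0.5, 0.5),
               A = matrix(c(0.9, 0.1, 0.1, 0.9), 2, 2, byrow = TRUE), B = B)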
@@ -374,7 +361,7 @@
Parameters are estimated in \pkg{depmixS4} using the EM algorithm or
through the use of a general Newton-Raphson optimizer. In the EM
algorithm, parameters are estimated by iteratively maximising the
-expected joint likelihood of the parameters given the observations and
+expected joint log-likelihood of the parameters given the observations and
states. Let $\greekv{\theta} = (\greekv{\theta}_1,
\greekv{\theta}_2,\greekv{\theta}_3)$ be the general parameter vector
consisting of three subvectors with parameters for the prior model,
@@ -389,14 +376,13 @@
This likelihood depends on the unobserved states $\vc{S}_{1:T}$. In the
Expectation step, we replace these with their expected values given a set of
(initial) parameters $\greekv{\theta}' = (\greekv{\theta}'_1,
-\greekv{\theta}'_2,\greekv{\theta}'_3)$ and observations $\vc{O}_{1:T}$. The expected
-log likelihood
+\greekv{\theta}'_2,\greekv{\theta}'_3)$ and observations $\vc{O}_{1:T}$.
+The expected log-likelihood:
\begin{equation}
Q(\greekv{\theta},\greekv{\theta}') = E_{\greekv{\theta}'}
(\log \Prob(\vc{O}_{1:T},\vc{S}_{1:T}|\vc{O}_{1:T},\vc{z}_{1:T},\greekv{\theta}))
\end{equation}
can be written as
-%\begin{equation}
\begin{multline}
\label{eq:Q}
Q(\greekv{\theta},\greekv{\theta}') =
@@ -404,21 +390,22 @@
+ \sum_{t=2}^T \sum_{j=1}^n \sum_{k=1}^n \xi_t(j,k) \log \Prob(S_t = k|S_{t-1}
= j,\vc{z}_{t-1},\greekv{\theta}_2) \\
+ \sum_{t=1}^T \sum_{j=1}^n \sum_{k=1}^m \gamma_t(j)
-\ln \Prob(O^k_t|S_t=j,\vc{z}_t,\greekv{\theta}_3),
+\log \Prob(O^k_t|S_t=j,\vc{z}_t,\greekv{\theta}_3),
\end{multline}
-%\end{equation}
-where the expected values $\xi_t(j,k) = P(S_t = k, S_{t-1} = j|\vc{O}_{1:T},
-\vc{z}_{1:T},\greekv{\theta}')$ and $\gamma_t(j) = P(S_t = j|\vc{O}_{1:T},
-\vc{z}_{1:T},\greekv{\theta}')$ can be computed effectively by the
-forward-backward algorithm \citep[see e.g.,][]{Rabiner1989}. The Maximisation
-step consists of the maximisation of (\ref{eq:Q}) for $\greekv{\theta}$. As the
-right hand side of (\ref{eq:Q}) consists of three separate parts, we can
-maximise separately for $\greekv{\theta}_1$, $\greekv{\theta}_2$ and
-$\greekv{\theta}_3$. In common models, maximisation for $\greekv{\theta}_1$ and
-$\greekv{\theta}_2$ is performed by the \code{nnet.default} routine in the
-\pkg{nnet} package \citep{Venables2002}, and maximisation for
-$\greekv{\theta}_3$ by the standard \code{glm} routine. Note that for the latter
-maximisation, the expected values $\gamma_t(j)$ are used as prior weights of the
+where the expected values $\xi_t(j,k) = P(S_t = k, S_{t-1} =
+j|\vc{O}_{1:T}, \vc{z}_{1:T},\greekv{\theta}')$ and $\gamma_t(j) =
+P(S_t = j|\vc{O}_{1:T}, \vc{z}_{1:T},\greekv{\theta}')$ can be
+computed effectively by the forward-backward algorithm \citep[see
+e.g.,][]{Rabiner1989}. The Maximisation step consists of the
+maximisation of (\ref{eq:Q}) for $\greekv{\theta}$. As the right hand
+side of (\ref{eq:Q}) consists of three separate parts, we can maximise
+separately for $\greekv{\theta}_1$, $\greekv{\theta}_2$ and
+$\greekv{\theta}_3$. In common models, maximisation for
+$\greekv{\theta}_1$ and $\greekv{\theta}_2$ is performed by the
+\code{nnet.default} routine in the \pkg{nnet} package
+\citep{Venables2002}, and maximisation for $\greekv{\theta}_3$ by the
+standard \code{glm} routine. Note that for the latter maximisation,
+the expected values $\gamma_t(j)$ are used as prior weights of the
observations $O^k_t$.
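
To illustrate the last point, that the gamma_t(j) enter the response M-step as prior weights of the observations, here is a toy weighted glm fit of the kind described, with simulated data standing in for one state's response model (this is not the package's internal code):

set.seed(1)
rt      <- c(rnorm(100, 5.5, 0.2), rnorm(100, 6.4, 0.25))  # simulated response times
gamma_j <- c(rep(0.9, 100), rep(0.1, 100))                  # simulated state-occupation probabilities
m_step  <- glm(rt ~ 1, family = gaussian(), weights = gamma_j)
coef(m_step)   # weighted estimate of this state's mean response time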
@@ -428,19 +415,43 @@
%\label{eq:MLA}
%\hat{a}_{jk} = \frac{1}{N(T-1)} \sum_{i=1}^N \sum_{t=2}^T \frac{\xi^i_t(j,k)}{\gamma_{t-1}^i(j)}.
%\end{equation}
-%Fixing certain elements of $\mat{A}$ to 0, as in the DT version, does not affect the estimation of the other elements. When elements $a_{jk}$ are assumed identical, we simply extend the summation in (\ref{eq:MLA}) to include all those elements, changing the denominator $N(T-1)$ accordingly.
-%To estimate $\greekv{\lambda}$, we note that the term containing this parameter has the form of a weighted likelihood, where $\gamma_t^i(j)$ can be interpreted as the number of replications of $r^i_t$. Hence, we can rely on standard ML estimates of the logistic regression coefficients $\lambda_i$, using the values $\gamma_t^i(j)$ as ``case weights'' \cite<e.g.,>{Agresti02,McCullagh83}\footnote{To be more specific, we replicate each observation $r^i_t$ a total of 6 times, once for each of the states besides the random state (which offers no information regarding $\lambda_i$). For the $j$-th replication (corresponding to the $j$-th state), we used $v_j(\vec{x}_t)$ as a predictor variable and $\gamma^i_t(j)$ as a case weight. All these replications were used to obtain the maximum likelihood estimate of $\lambda$ from a single GLM, using the ``glm.fit'' function in R \cite{R}.}. The resulting parameter estimates are then used in a new Expectation step, followed by another Maximisation step, until convergence. Under mild regularity conditions, it can be shown that the EM algorithm converges to a (local) maximum of the likelihood \cite{EM}.
+%Fixing certain elements of $\mat{A}$ to 0, as in the DT version, does
+%not affect the estimation of the other elements. When elements
+%$a_{jk}$ are assumed identical, we simply extend the summation in
+%(\ref{eq:MLA}) to include all those elements, changing the
+%denominator $N(T-1)$ accordingly.
+%To estimate $\greekv{\lambda}$, we note that the term containing this
+%parameter has the form of a weighted likelihood, where
+%$\gamma_t^i(j)$ can be interpreted as the number of replications of
+%$r^i_t$. Hence, we can rely on standard ML estimates of the logistic
+%regression coefficients $\lambda_i$, using the values $\gamma_t^i(j)$
+%as ``case weights'' \cite<e.g.,>{Agresti02,McCullagh83}\footnote{To
+%be more specific, we replicate each observation $r^i_t$ a total of 6
+%times, once for each of the states besides the random state (which
+%offers no information regarding $\lambda_i$). For the $j$-th
+%replication (corresponding to the $j$-th state), we used
+%$v_j(\vec{x}_t)$ as a predictor variable and $\gamma^i_t(j)$ as a
+%case weight. All these replications were used to obtain the maximum
+%likelihood estimate of $\lambda$ from a single GLM, using the
+%``glm.fit'' function in R \cite{R}.}. The resulting parameter
+%estimates are then used in a new Expectation step, followed by
+%another Maximisation step, until convergence. Under mild regularity
+%conditions, it can be shown that the EM algorithm converges to a
+%(local) maximum of the likelihood \cite{EM}.
+
The EM algorithm however has some drawbacks. First, it can be slow to
converge towards the end of optimization. Second, applying
constraints to parameters can be problematic; in particular, EM can
lead to wrong parameter estimates when applying constraints. Hence,
in \pkg{depmixS4}, EM is used by default in unconstrained models, but
-otherwise, direct optimization is done using \pkg{Rdonlp2}
-\citep{Tamura2009,Spellucci2002}, because it handles general linear
-(in)equality constraints, and optionally also non-linear constraints.
+otherwise, direct optimization is used. Two options are available for
+direct optimization: package \pkg{Rdonlp2}
+\citep{Tamura2009,Spellucci2002} or package \pkg{Rsolnp}. Both
+packages can handle general linear (in)equality constraints, and
+optionally also non-linear constraints.
%Need some more on EM and how/why it is justified to do separate weighted
%fits of the response models and transition and prior models.
@@ -454,7 +465,7 @@
\begin{enumerate}
\item model specification with function \code{depmix} (or with \code{mix}
for latent class and finite mixture models, see example below on adding
- covariates to prior probabilities)
+ covariates to prior probabilities in section~\ref{sec:prior})
\item model fitting with function \code{fit}
\end{enumerate}
@@ -467,30 +478,102 @@
Throughout this article a data set called \code{speed} is used. As
already indicated in the Introduction, it consists of three time
-series with three variables: response time, accuracy, and a covariate
-Pacc which defines the relative pay-off for speeded versus accurate
-responding. The participant in this experiment switches between fast
-responding at chance level and relatively slower responding at a high
-level of accuracy. Interesting hypotheses to test are: is the
-switching regime symmetric? Is there evidence for two states or does
-one state suffice? Is the guessing state actually a guessing state,
-i.e., is the probability of a correct response at chance level (0.5)?
+series with three variables: response time, accuracy, and a covariate,
+Pacc, which defines the relative pay-off for speeded versus accurate
+responding. Before describing some of the models that are fitted
+to these data, we provide a brief sketch of the reasons for gathering
+these data in the first place.
+Response times are a very common dependent variable in psychological
+experiments and hence form the basis for inference about many
+psychological processes. A potential threat to such inference based
+on response times is formed by the speed-accuracy trade-off: different
+participants in an experiment may respond differently to typical
+instructions to `respond as fast and accurately as possible'. A popular
+model which takes the speed-accuracy trade-off into account is the
+diffusion model \citep{Ratcliff1978}, which has proven to provide
+accurate descriptions of response times in many different settings.
+
+One potential problem with the diffusion model is that it predicts a
+continuous trade-off between speed and accuracy of responding, i.e.,
+when participants are pressed to respond faster and faster, the
+diffusion model predicts that this would lead to a gradual decrease in
+accuracy. The \code{speed} data set that we analyze below was
+gathered to test this hypothesis versus the alternative hypothesis
+stating that there is a sudden transition from slow and accurate
+responding to fast responding at chance level. At each trial of the
+experiment, the participant is shown the current setting of the
+relative reward for speed versus accuracy. The bottom panel of
+Figure~\ref{fig:speed} shows the values of this variable. The main
+question in this experiment was what would happen when this reward
+variable changed from rewarding accuracy only to rewarding speed
+only. The \code{speed} data that we analyse here are from
+participant A in Experiment 1 in \citet{Dutilh2009}, who provide a
+complete description of the experiment and the relevant theoretical
+background.
+
+The central question about these data is whether they are best
+described by two modes of responding rather than by a single mode of
+responding with a continuous trade-off between speed and accuracy.
+The hallmark of a discontinuity between slow versus speeded
+responding is that the switching between the two modes is asymmetric
+\citep[see e.g.][for a theoretical underpinning of this
+claim]{Maas1992}. The \code{fit} help page of \pkg{depmixS4} provides
+a number of examples in which the asymmetry of the switching
+process is tested; those examples and other candidate models are
+discussed at length in \citet{Visser2009b}.
+
+
\subsection{A simple model}
A dependent mixture model is defined by the number of states and the
initial state, state transition, and response distribution functions.
A dependent mixture model can be created with the
\code{depmix}-function as follows:
+\begin{Schunk}
+\begin{Sinput}
+> library(depmixS4)
+> data(speed)
+> set.seed(1)
+> mod <- depmix(response = rt ~ 1, data = speed, nstates = 2, trstart = runif(4))
+\end{Sinput}
+\end{Schunk}
+The first line of code loads the \pkg{depmixS4} package and
+\code{data(speed)} loads the \code{speed} data set. The line
+\code{set.seed(1)} is necessary to get starting values that will
+result in the right model; see more on starting values below.
-The \code{depmix} function returns an object of class \code{depmix}
-which contains the model specification (and not a fitted model!).
-Note also that start values for the transition parameters are provided
-in this call using the \code{trstart} argument. At this time, the
-package does not provide automatic starting values.
+The call to \code{depmix} specifies the model with a number of
+arguments. The \code{response} argument is used to specify the
+response variable, possibly with covariates, in the familiar format
+using formulae such as in \code{lm} or \code{glm} models. The second
+argument, \code{data=speed}, provides the \code{data.frame} in which
+the variables from \code{response} are interpreted. Third, the number
+of states of the model is given by \code{nstates=2}.
+
+\paragraph{Starting values} Note also that start values for the
+transition parameters are provided in this call using the
+\code{trstart} argument. At this time, the package does not provide
+automatic starting values. Starting values for parameters can be
+provided using three arguments: \code{instart} for the parameters
+relating to the prior probabilities, \code{trstart} for the parameters
+of the transition models, and \code{respstart} for the parameters of the
+response models. Note that the starting values for the initial and
+transition models as well as of the multinomial logit response models
+are interpreted as {\em probabilities}, and internally converted to
+multinomial logit parameters.
+
+
+\paragraph{Fitting the model and printing results} The \code{depmix}
+function returns an object of class \code{depmix} which contains the
+model specification, and not a fitted model. Hence, the model needs
+to be fitted by calling \code{fit}:
\begin{Schunk}
+\begin{Sinput}
+> fm <- fit(mod)
+\end{Sinput}
\begin{Soutput}
iteration 0 logLik: -305.3302
iteration 5 logLik: -305.3251
@@ -506,22 +589,26 @@
iteration 50 logLik: -84.34175
\end{Soutput}
\end{Schunk}
+
The \code{fit}-function returns an object of class
\code{depmix.fitted} which extends the \code{depmix} class, adding
convergence information (and information about constraints if these
-were applied, see below). The \code{print} method provides summary
-information on convergence, the log-likelihood and the AIC and BIC
-values:
+were applied, see below). The class has \code{print} and
+\code{summary} methods to see the results. The \code{print} method
+provides summary information on convergence, the log-likelihood and
+the AIC and BIC values:
\begin{Schunk}
+\begin{Sinput}
+> fm
+\end{Sinput}
\begin{Soutput}
-Convergence info: Log likelihood converged to within tol.
+Convergence info: Log likelihood converged to within tol. (relative change crit.)
'log Lik.' -84.34175 (df=7)
AIC: 182.6835
BIC: 211.275
\end{Soutput}
\end{Schunk}
-
These statistics can also be extracted using \code{logLik}, \code{AIC}
and \code{BIC}, respectively. By comparison, a 1-state model for
these data, i.e. assuming there is no mixture, has a log-likelihood of
@@ -533,8 +620,11 @@
The \code{summary} method of \code{fit}ted models provides the
parameter estimates, first for the prior probabilities model, second
-for the transition model, and third for the response models.
+for the transition models, and third for the response models.
\begin{Schunk}
+\begin{Sinput}
+> summary(fm)
+\end{Sinput}
\begin{Soutput}
Initial state probabilties model
Model of type multinomial, formula: ~1
@@ -578,14 +668,13 @@
\end{Soutput}
\end{Schunk}
+Since no further arguments were specified, the initial state, state
+transition and response distributions were set to their defaults
+(multinomial distributions for the first two, and a Gaussian
+distribution for the response).
-Note that, since no further arguments were specified, the initial
-state, state transition and response distributions were set to their
-defaults (multinomial distributions for the first two, and Gaussian
-distributions for the response distributions).
+\subsection{Covariates on transition parameters\label{sec:trans}}
-\subsection{Covariates on transition parameters}
-
By default, the transition probabilities and the initial state
probabilities are parameterized using the multinomial logistic model.
More precisely, each row of the transition matrix is parameterized by
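
The multinomial logistic parameterization mentioned just above is the baseline-category form: the first category's coefficient is fixed at zero, as in the coefficient matrices shown elsewhere in this diff. A one-line check using the intercepts (0, 2.390396) of one transition row from the output removed further down reproduces the reported probabilities at zero values of the covariates:

beta <- c(0, 2.390396)        # intercepts of one transition-matrix row, baseline category fixed at 0
exp(beta) / sum(exp(beta))    # 0.0839 and 0.9161, matching the printed summary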
@@ -605,6 +694,12 @@
probabilities can be specified using a one-sided formula as in the
following example:
\begin{Schunk}
+\begin{Sinput}
+> set.seed(1)
+> mod <- depmix(rt ~ 1, data = speed, nstates = 2, family = gaussian(),
++ transition = ~scale(Pacc), instart = runif(2))
+> fm <- fit(mod)
+\end{Sinput}
\begin{Soutput}
iteration 0 logLik: -305.3304
iteration 5 logLik: -305.2360
@@ -617,18 +712,13 @@
\end{Soutput}
\end{Schunk}
-Applying \code{summary} to the \code{fit}ted object gives (only transition models
-printed here):
+Applying \code{summary} to the \code{fit}ted object gives (only
+transition models printed here):
\begin{Schunk}
+\begin{Sinput}
+> summary(fm, which = "transition")
+\end{Sinput}
\begin{Soutput}
-Initial state probabilties model
-Model of type multinomial, formula: ~1
-Coefficients:
- [,1] [,2]
-[1,] 0 10.71779
-Probalities at zero values of the covariates.
-2.214681e-05 0.9999779
-
Transition model for state (component) 1
Model of type multinomial, formula: ~scale(Pacc)
Coefficients:
@@ -646,24 +736,6 @@
[2,] 0 3.570856
Probalities at zero values of the covariates.
0.07788458 0.9221154
-
-
-Response model(s) for state 1
-
-Response model for response 1
-Model of type gaussian, formula: rt ~ 1
-Coefficients:
-[1] 5.512179
-sd 0.1921098
-
-
-Response model(s) for state 2
-
-Response model for response 1
-Model of type gaussian, formula: rt ~ 1
-Coefficients:
-[1] 6.3885
-sd 0.2402693
\end{Soutput}
\end{Schunk}
@@ -683,6 +755,13 @@
variable in the \code{speed} data can be modelled with a multinomial
by specifying the following:
\begin{Schunk}
+\begin{Sinput}
+> set.seed(1)
+> mod <- depmix(list(rt ~ 1, corr ~ 1), data = speed, nstates = 2,
++ family = list(gaussian(), multinomial()), transition = ~scale(Pacc),
++ instart = runif(2))
+> fm <- fit(mod)
+\end{Sinput}
\begin{Soutput}
iteration 0 logLik: -554.6365
iteration 5 logLik: -553.9195
@@ -691,56 +770,32 @@
iteration 20 logLik: -248.9757
iteration 25 logLik: -248.9724
iteration 30 logLik: -248.9723
-iteration 33 logLik: -248.9723
+iteration 31 logLik: -248.9723
\end{Soutput}
\end{Schunk}
This provides the following fitted model parameters (only the
response parameters are given here):
\begin{Schunk}
+\begin{Sinput}
+> summary(fm, which = "response")
+\end{Sinput}
\begin{Soutput}
-Initial state probabilties model
-Model of type multinomial, formula: ~1
-Coefficients:
- [,1] [,2]
-[1,] 0 10.24230
-Probalities at zero values of the covariates.
-3.562961e-05 0.9999644
-
-Transition model for state (component) 1
-Model of type multinomial, formula: ~scale(Pacc)
-Coefficients:
- [,1] [,2]
-[1,] 0 -0.8924638
-[2,] 0 2.1293542
-Probalities at zero values of the covariates.
-0.7093984 0.2906016
-
-Transition model for state (component) 2
-Model of type multinomial, formula: ~scale(Pacc)
-Coefficients:
- [,1] [,2]
-[1,] 0 2.390396
-[2,] 0 3.690785
-Probalities at zero values of the covariates.
-0.08390802 0.916092
-
-
Response model(s) for state 1
Response model for response 1
Model of type gaussian, formula: rt ~ 1
Coefficients:
-[1] 5.52169
-sd 0.2028857
+[1] 5.521695
+sd 0.2028917
Response model for response 2
Model of type multinomial, formula: corr ~ 1
Coefficients:
[,1] [,2]
-[1,] 0 0.1030554
+[1,] 0 0.1030452
Probalities at zero values of the covariates.
-0.4742589 0.5257411
+0.4742615 0.5257385
Response model(s) for state 2
@@ -748,16 +803,16 @@
Response model for response 1
Model of type gaussian, formula: rt ~ 1
Coefficients:
-[1] 6.39369
-sd 0.2373650
+[1] 6.393693
+sd 0.2373635
Response model for response 2
Model of type multinomial, formula: corr ~ 1
Coefficients:
[,1] [,2]
-[1,] 0 2.245514
+[1,] 0 2.245562
[TRUNCATED]
To get the complete diff run:
svnlook diff /svnroot/depmix -r 372