[Pomp-commits] r239 - in pkg/inst: . doc
noreply at r-forge.r-project.org
noreply at r-forge.r-project.org
Mon May 17 15:51:00 CEST 2010
Author: kingaa
Date: 2010-05-17 15:51:00 +0200 (Mon, 17 May 2010)
New Revision: 239
Added:
pkg/inst/doc/ou2-trajmatch.rda
Modified:
pkg/inst/ChangeLog
pkg/inst/doc/advanced_topics_in_pomp.pdf
pkg/inst/doc/intro_to_pomp.Rnw
pkg/inst/doc/intro_to_pomp.pdf
pkg/inst/doc/pomp.bib
Log:
- add section on trajectory matching to the intro vignette
Modified: pkg/inst/ChangeLog
===================================================================
--- pkg/inst/ChangeLog 2010-05-17 12:54:41 UTC (rev 238)
+++ pkg/inst/ChangeLog 2010-05-17 13:51:00 UTC (rev 239)
@@ -1,5 +1,19 @@
+2010-05-16 kingaa
+
+ * [r237] inst/doc/intro_to_pomp.Rnw, inst/doc/intro_to_pomp.pdf: -
+ some minor additions to the intro vignette
+
+2010-05-11 kingaa
+
+ * [r236] man/eulermultinom.Rd, man/plugins.Rd, man/pomp.Rd: -
+ update documentation
+ * [r235] DESCRIPTION: - added Helen Wearing as an author for her
+ work with the stochastic simulation algorithms
+
2010-05-10 kingaa
+ * [r234] inst/ChangeLog, inst/doc/advanced_topics_in_pomp.pdf,
+ inst/doc/intro_to_pomp.pdf: - updated vignettes
* [r233] man/euler.Rd: - the old plugins are deprecated: move their
man pages to 'internal'
* [r232] inst/doc/ou2-first-mif.rda: - binary file with saved
Modified: pkg/inst/doc/advanced_topics_in_pomp.pdf
===================================================================
(Binary files differ)
Modified: pkg/inst/doc/intro_to_pomp.Rnw
===================================================================
--- pkg/inst/doc/intro_to_pomp.Rnw 2010-05-17 12:54:41 UTC (rev 238)
+++ pkg/inst/doc/intro_to_pomp.Rnw 2010-05-17 13:51:00 UTC (rev 239)
@@ -24,6 +24,7 @@
\newcommand\code[1]{\texttt{#1}}
\newcommand{\R}{\textsf{R}}
\newcommand{\pomp}{\texttt{pomp}}
+\newcommand{\expect}[1]{\mathbb{E}\left[#1\right]}
\title[Introduction to pomp]{Introduction to pomp:\\ inference for partially-observed Markov processes}
@@ -41,7 +42,7 @@
48109-1048 USA
}
-\email{kingaa at umich.edu}
+\email{kingaa at umich dot edu}
\urladdr{http://pomp.r-forge.r-project.org}
@@ -361,6 +362,7 @@
To compute the likelihood of the data, we can use the function \code{pfilter}.
This runs a plain vanilla particle filter (AKA sequential Monte Carlo) algorithm and results in an unbiased estimate of the likelihood.
+See \citet{Arulampalam2002} for an excellent tutorial on particle filtering.
To do this, we must decide how many concurrrent realizations (\emph{particles}) to use: the larger the number of particles, the smaller the Monte Carlo error but the greater the computational effort.
Let's run \code{pfilter} with 1000 particles and evaluate the likelihood at the true parameters:
<<ou2-pfilter-truth,eval=F>>=
@@ -485,6 +487,7 @@
%% In the ``advanced_topics_in_pomp'' vignette, we show how one can get access to more of the underlying structure of a \pomp\ object.
+\clearpage
\section{Estimating parameters using iterated filtering: \code{mif}}
Iterated filtering is a technique for maximizing the likelihood obtained by filtering.
@@ -615,6 +618,8 @@
\section{Nonlinear forecasting: \code{nlf}}
+To be added.
+
<<first-nlf,echo=F,eval=F>>=
estnames <- c('alpha.2','alpha.3','tau')
out <- nlf(
@@ -644,11 +649,28 @@
)
@
+\clearpage
\section{Trajectory matching: \code{traj.match}}
-To be added.
-
-<<ou2-skeleton-def,echo=F,eval=F>>=
+The idea behind trajectory matching is a simple one.
+One attempts to fit a deterministic dynamical trajectory to the data.
+This is tantamount to assuming that all the stochasticity in the system is in the measurement process.
+In \pomp, the trajectory is computed using the \code{trajectory} function, which in turn uses the \code{skeleton} slot of the \pomp\ object.
+The \code{skeleton} slot should be filled with the deterministic skeleton of the process model.
+In the discrete-time case, this is the map
+\begin{equation*}
+ x\,\mapsto\,\expect{X_{t+1}\;\vert\;X_{t}=x,\theta}.
+\end{equation*}
+In the continuous-time case, this is the vectorfield
+\begin{equation*}
+ x\,\mapsto\,\lim_{{\Delta}{t}\,\to\,0}\,\frac{\expect{X_{t+{\Delta}{t}}\;\vert\;X_{t}=x,\theta}}{{\Delta}{t}}.
+\end{equation*}
+Our discrete-time bivariate autoregressive process has the deterministic skeleton
+\begin{equation}\label{eq:ou-skel}
+ X_{t} = \alpha\,X_{t-1}
+\end{equation}
+which can be implemented in the \R\ function
+<<ou2-skeleton-def,echo=T>>=
ou2.skel <- function (x, t, params, ...) {
xnew <- c(
params["alpha.1"]*x["x1"]+params["alpha.3"]*x["x2"],
@@ -659,13 +681,12 @@
}
@
-<<new-ou2-pomp-construction,eval=F,echo=F>>=
+We can incorporate the deterministic skeleton into a new \pomp\ object in the same way as before:
+<<new-ou2-pomp-construction,echo=T>>=
+dat <- subset(as(ou2,"data.frame"),time<=60)
+theta <- coef(ou2)
new.ou2 <- pomp(
- data=data.frame(
- time=1:100,
- y1=NA,
- y2=NA
- ),
+ data=dat[c("time","y1","y2")],
times="time",
rprocess=euler.sim(ou2.proc.sim,delta.t=1),
rmeasure=ou2.meas.sim,
@@ -674,28 +695,25 @@
t0=0
)
coef(new.ou2) <- theta
-coef(new.ou2,c("sigma.1","sigma.2","sigma.3","tau")) <- c(0,0,0,0.2)
+coef(new.ou2,c("sigma.1","sigma.2","sigma.3","tau")) <- c(0,0,0,0.5)
new.ou2 <- simulate(new.ou2,seed=88737400L)
@
-<<echo=F,eval=F>>=
-new.ou2 <- ou2
-coef(new.ou2) <- theta
-coef(new.ou2,c("sigma.1","sigma.2","sigma.3","tau")) <- c(0,0,0,0.2)
-new.ou2 <- simulate(new.ou2,seed=88737400L)
-@
+Note that we have turned off the process noise in \code{new.ou2} (next to last line) so that trajectory matching is actually formally appropriate for this model.
-<<ou2-trajmatch-calc,eval=F,echo=F>>=
+The \pomp\ function \code{traj.match} calls the optimizer \code{optim} to minimize the discrepancy between the trajectory and the data.
+The discrepancy is measured using the \code{dmeasure} function from the \pomp\ object.
+Fig.~\ref{fig:trajmatch-plot} shows the results of this fit.
+<<ou2-trajmatch-calc,eval=F>>=
tm <- traj.match(
new.ou2,
start=coef(new.ou2),
- est=c("alpha.1","alpha.2","alpha.3","alpha.4","tau","x1.0","x2.0"),
+ est=c("alpha.2","alpha.3","tau","x1.0","x2.0"),
method="Nelder-Mead",
maxit=1000,
reltol=1e-8
)
@
-
-<<ou2-trajmatch-eval,echo=F,eval=F>>=
+<<ou2-trajmatch-eval,echo=F,eval=T>>=
binary.file <- "ou2-trajmatch.rda"
if (file.exists(binary.file)) {
load(binary.file)
@@ -705,20 +723,25 @@
}
@
-%%\begin{figure}
-<<trajmatch-plot,fig=T,echo=F,eval=F>>=
+\begin{figure}
+<<trajmatch-plot,fig=T,echo=F,eval=T>>=
op <- par(mfrow=c(2,1),mar=c(3,3,0,0),mgp=c(2,1,0),bty='l')
plot(time(new.ou2),data.array(new.ou2,"y1"),xlab="",xaxt='n',ylab=expression(y[1]))
x <- trajectory(new.ou2,params=tm$params)
-lines(time(new.ou2),x["x1",1,-1])
+lines(time(new.ou2),x["x1",1,-1],lwd=2)
plot(time(new.ou2),data.array(new.ou2,"y2"),xlab="time",ylab=expression(y[2]))
-lines(time(new.ou2),x["x2",1,-1])
+lines(time(new.ou2),x["x2",1,-1],lwd=2)
par(op)
@
-%% \caption{Trajectory matching.}
-%% \label{fig:trajmatch-plot}
-%% \end{figure}
+\caption{
+ Illustration of trajectory matching.
+ The points show data simulated from \code{new.ou2}, which has no process noise but only measurement error.
+ The solid line shows the trajectory of the best-fitting model, obtained using \code{traj.match}.
+}
+\label{fig:trajmatch-plot}
+\end{figure}
+\clearpage
\section{A more complex example: a seasonal epidemic model}
The SIR model is a mainstay of theoretical epidemiology.
@@ -868,6 +891,7 @@
\label{fig:sir-plot}
\end{figure}
+\clearpage
\bibliographystyle{ecology}
\bibliography{pomp}
Modified: pkg/inst/doc/intro_to_pomp.pdf
===================================================================
(Binary files differ)
Added: pkg/inst/doc/ou2-trajmatch.rda
===================================================================
(Binary files differ)
Property changes on: pkg/inst/doc/ou2-trajmatch.rda
___________________________________________________________________
Added: svn:mime-type
+ application/octet-stream
Modified: pkg/inst/doc/pomp.bib
===================================================================
--- pkg/inst/doc/pomp.bib 2010-05-17 12:54:41 UTC (rev 238)
+++ pkg/inst/doc/pomp.bib 2010-05-17 13:51:00 UTC (rev 239)
@@ -1,6 +1,20 @@
% This file was created with JabRef 2.5.
% Encoding: ISO8859_1
+ at ARTICLE{Arulampalam2002,
+ author = {Arulampalam, M. S. and Maskell, S. and Gordon, N. and Clapp, T.},
+ title = {A Tutorial on Particle Filters for Online Nonlinear, Non-{G}aussian
+ {B}ayesian Tracking},
+ journal = {IEEE Transactions on Signal Processing},
+ year = {2002},
+ volume = {50},
+ pages = {174 -- 188},
+ doi = {10.1109/78.978374},
+ file = {Arulampalam2002.pdf:Desktop/Arulampalam2002.pdf:PDF},
+ owner = {kingaa},
+ timestamp = {2007.07.20}
+}
+
@ARTICLE{Breto2009,
author = {Carles Bret\'{o} and Daihai He and Edward L. Ionides and Aaron A.
King},
@@ -33,6 +47,43 @@
timestamp = {2009.09.15}
}
+ at ARTICLE{Ellner1998,
+ author = {S. P. Ellner and B. A. Bailey and G. V. Bobashev and A. R. Gallant
+ and B. T. Grenfell and D. W. Nychka},
+ title = {Noise and nonlinearity in measles epidemics: Combining mechanistic
+ and statistical approaches to population modeling},
+ journal = {American Naturalist},
+ year = {1998},
+ volume = {151},
+ pages = {425--440},
+ abstract = {We present and evaluate an approach to analyzing population dynamics
+ data using semimechanistic models. These models incorporate reliable
+ information on population structure and underlying dynamic mechanisms
+ but use nonparametric surface-fitting methods to avoid unsupported
+ assumptions about the precise form of rate equations. Using historical
+ data on measles epidemics as a case study, we show how this approach
+ can lead to better forecasts, better characterizations of the dynamics,
+ and a better understanding of the factors causing complex population
+ dynamics relative to either mechanistic models or purely descriptive
+ statistical time-series models. The semimechanistic models are found
+ to have better forecasting accuracy than either of the model types
+ used in previous analyses when tested on data not used to fit the
+ models. The dynamics are characterized as being both nonlinear and
+ noisy, and the global dynamics are clustered very tightly near the
+ border of stability (dominant Lyapunov exponent lambda approximate
+ to 0). However, locally in state space the dynamics oscillate between
+ strong short-term stability and strong short-term chaos (i.e., between
+ negative and positive local Lyapunov exponents). There is statistically
+ significant evidence for short-term chaos in all data sets examined.
+ Thus the nonlinearity in these systems is characterized by the variance
+ over state space in local measures of chaos versus stability rather
+ than a single summary measure of the overall dynamics as either chaotic
+ or nonchaotic.},
+ file = {Ellner1998.pdf:Ellner1998.pdf:PDF},
+ owner = {kingaa},
+ timestamp = {2009.09.22}
+}
+
@ARTICLE{Gillespie1977a,
author = {D. T. Gillespie},
title = {Exact Stochastic Simulation of Coupled Chemical Reactions},
@@ -78,6 +129,174 @@
timestamp = {2009.06.26}
}
+ at ARTICLE{Ionides2006,
+ author = {E. L. Ionides and C. Bret{\'o} and Aaron A. King},
+ title = {Inference for nonlinear dynamical systems},
+ journal = {Proceedings of the National Academy of Sciences of the United States
+ of America},
+ year = {2006},
+ volume = {103},
+ pages = {18438--18443},
+ number = {49},
+ abstract = {Nonlinear stochastic dynamical systems are widely used to model systems
+ across the sciences and engineering. Such models are natural to formulate
+ and can be analyzed mathematically and numerically. However, difficulties
+ associated with inference from time-series data about unknown parameters
+ in these models have been a constraint on their application. We present
+ a new method that makes maximum likelihood estimation feasible for
+ partially-observed nonlinear stochastic dynamical systems (also known
+ as state-space models) where this was not previously the case. The
+ method is based on a sequence of filtering operations which are shown
+ to converge to a maximum likelihood parameter estimate. We make use
+ of recent advances in nonlinear filtering in the implementation of
+ the algorithm. We apply the method to the study of cholera in Bangladesh.
+ We construct confidence intervals, perform residual analysis, and
+ apply other diagnostics. Our analysis, based upon a model capturing
+ the intrinsic nonlinear dynamics of the system, reveals some effects
+ overlooked by previous studies.},
+ doi = {10.1073/pnas.0603181103},
+ file = {Ionides2006.pdf:Ionides2006.pdf:PDF},
+ owner = {kingaa},
+ timestamp = {2006.10.06}
+}
+
+ at ARTICLE{Kendall1999,
+ author = {B. E. Kendall and C. J. Briggs and W. W. Murdoch and P. Turchin and
+ S. P. Ellner and E. McCauley and R. M. Nisbet and S. N. Wood},
+ title = {Why do populations cycle? A synthesis of statistical and mechanistic
+ modeling approaches},
+ journal = {Ecology},
+ year = {1999},
+ volume = {80},
+ pages = {1789--1805},
+ abstract = {Population cycles have long fascinated ecologists. Even in the most-studied
+ populations, however, scientists continue to dispute the relative
+ importance of various potential causes of the cycles, Over the past
+ three decades, theoretical ecologists have cataloged a large number
+ of mechanisms that are capable of generating cycles in population
+ models. At the same time, statisticians have developed new techniques
+ both for characterizing time series and for fitting population models
+ to time-series data. Both disciplines are now sufficiently advanced
+ that great gains in understanding can be made by synthesizing these
+ complementary, and heretofore mostly independent, quantitative approaches.
+ In this paper we demonstrate how to apply this synthesis to the problem
+ of population cycles, using both long-term population time series
+ and the often-rich observational and experimental data on the ecology
+ of the species in question. We quantify hypotheses by writing mathematical
+ models that embody the interactions and forces that might cause cycles.
+ Some hypotheses can be rejected out of hand, as being unable to generate
+ even qualitatively appropriate dynamics, We finish quantifying the
+ remaining hypotheses by estimating parameters, both from independent
+ experiments and from fitting the models to the time-series data using
+ modern statistical techniques, Finally, we compare simulated time
+ series generated by the models to the observed time series, using
+ a variety of statistical descriptors, which we refer to collectively
+ as "probes." The model most similar to the data, as measured by these
+ probes, is considered to be the most likely candidate to represent
+ the mechanism underlying the population cycles. We illustrate this
+ approach by analyzing one of Nicholson's blowfly populations, in
+ which we know the "true" governing mechanism. Our analysis, which
+ uses only a subset of the information available about the population,
+ uncovers the correct answer, suggesting that this synthetic approach
+ might be successfully applied to field populations as well.},
+ file = {Kendall1999.pdf:Kendall1999.pdf:PDF},
+ owner = {kingaa},
+ timestamp = {2009.09.22}
+}
+
+ at ARTICLE{Kendall2005,
+ author = {Kendall, B. E. and Ellner, S. P. and McCauley, E. and Wood, S. N.
+ and Briggs, C. J. and Murdoch, W. M. and Turchin, P.},
+ title = {Population cycles in the pine looper moth: Dynamical tests of mechanistic
+ hypotheses},
+ journal = {Ecological Monographs},
+ year = {2005},
+ volume = {75},
+ pages = {259--276},
+ number = {2},
+ abstract = {The forest insect pest Bupalus piniarius (pine looper moth) is a classic
+ example of a natural population cycle. As is typical for Populations
+ that exhibit regular oscillations in density, there are several biological
+ mechanisms that are hypothesized to be responsible for the cycles;
+ but despite several decades of detailed study there has been no definite
+ conclusion as to which mechanism is most important. We evaluated
+ three hypotheses for which there was direct experimental evidence:
+ (1) food quality (nutritional value of pine needles affected by defoliation);
+ (2) parasitoids (trophic interactions with specialist parasitoids),
+ and (3) maternal effects (maternal body size affects the performance
+ of offspring). We reviewed the empirical evidence for each of these
+ hypotheses and expressed each hypothesis in the form of a mechanistic
+ dynamic model. We used a nonlinear forecasting approach to fit each
+ model to three long-term Population time series in Britain that exhibit
+ some degree of regular cycling, and we used parametric bootstrap
+ to evaluate the significance of differences between models in their
+ goodness of fit to the data. The results differed among the three
+ forests: at Culbin, the parasitoid and maternal effects models fit
+ equally well; at Roseisle, the food quality and maternal effects
+ models fit equally well; and at Tentsmuir, the parasitoid model fit
+ best. However, the best-fit parasitism models required that the parasitism
+ rate vary between nearly 0 and nearly 1 during a cycle, greatly exceeding
+ the range of parasitism rates that have been observed in the field.
+ In contrast, the required variation in the observable maternal quality
+ variable (pupal mass) was within the range of empirical observations.
+ Under mild constraints on the parasitism rate (though allowing a
+ much wider range than has been measured in B. piniarius at any location),
+ the fit of the parasitism model fell off dramatically. The maternal
+ effects model then had uniformly strong support, outperforming the
+ constrained parasitism model at all three sites and the food quality
+ model at two; it performed slightly better than the food quality
+ model at the remaining site. This represents the first system in
+ which the maternal effects hypothesis for population cycles has been
+ supported by both strong biological and dynamical evidence.},
+ file = {Kendall2005.pdf:Kendall2005.pdf:PDF},
+ owner = {kingaa},
+ timestamp = {2009.09.22}
+}
+
+ at ARTICLE{King2008,
+ author = {King, Aaron A. and Ionides, Edward L. and Pascual, Mercedes and Bouma,
+ Menno J.},
+ title = {Inapparent infections and cholera dynamics},
+ journal = {Nature},
+ year = {2008},
+ volume = {454},
+ pages = {877--880},
+ number = {7206},
+ month = aug,
+ abstract = {In many infectious diseases, an unknown fraction of infections produce
+ symptoms mild enough to go unrecorded, a fact that can seriously
+ compromise the interpretation of epidemiological records. This is
+ true for cholera, a pandemic bacterial disease, where estimates of
+ the ratio of asymptomatic to symptomatic infections have ranged from
+ 3 to 100. In the absence of direct evidence, understanding of fundamental
+ aspects of cholera transmission, immunology and control has been
+ based on assumptions about this ratio and about the immunological
+ consequences of inapparent infections. Here we show that a model
+ incorporating high asymptomatic ratio and rapidly waning immunity,
+ with infection both from human and environmental sources, explains
+ 50 yr of mortality data from 26 districts of Bengal, the pathogen's
+ endemic home. We find that the asymptomatic ratio in cholera is far
+ higher than had been previously supposed and that the immunity derived
+ from mild infections wanes much more rapidly than earlier analyses
+ have indicated. We find, too, that the environmental reservoir (free-living
+ pathogen) is directly responsible for relatively few infections but
+ that it may be critical to the disease's endemicity. Our results
+ demonstrate that inapparent infections can hold the key to interpreting
+ the patterns of disease outbreaks. New statistical methods, which
+ allow rigorous maximum likelihood inference based on dynamical models
+ incorporating multiple sources and outcomes of infection, seasonality,
+ process noise, hidden variables and measurement error, make it possible
+ to test more precise hypotheses and obtain unexpected results. Our
+ experience suggests that the confrontation of time-series data with
+ mechanistic models is likely to revise our understanding of the ecology
+ of many infectious diseases.},
+ doi = {10.1038/nature07084},
+ file = {King2008.pdf:King2008.pdf:PDF;King2008_Supplement.pdf:King2008_Supplement.pdf:PDF},
+ owner = {kingaa},
+ publisher = {Macmillan Publishers Limited. All rights reserved},
+ timestamp = {2008.08.13}
+}
+
@ARTICLE{Wearing2009,
author = {Helen J Wearing and Pejman Rohani},
title = {Estimating the duration of pertussis immunity using epidemiological
More information about the pomp-commits
mailing list