[Pomp-commits] r239 - in pkg/inst: . doc

noreply at r-forge.r-project.org noreply at r-forge.r-project.org
Mon May 17 15:51:00 CEST 2010


Author: kingaa
Date: 2010-05-17 15:51:00 +0200 (Mon, 17 May 2010)
New Revision: 239

Added:
   pkg/inst/doc/ou2-trajmatch.rda
Modified:
   pkg/inst/ChangeLog
   pkg/inst/doc/advanced_topics_in_pomp.pdf
   pkg/inst/doc/intro_to_pomp.Rnw
   pkg/inst/doc/intro_to_pomp.pdf
   pkg/inst/doc/pomp.bib
Log:
- add section on trajectory matching to the intro vignette


Modified: pkg/inst/ChangeLog
===================================================================
--- pkg/inst/ChangeLog	2010-05-17 12:54:41 UTC (rev 238)
+++ pkg/inst/ChangeLog	2010-05-17 13:51:00 UTC (rev 239)
@@ -1,5 +1,19 @@
+2010-05-16  kingaa
+
+	* [r237] inst/doc/intro_to_pomp.Rnw, inst/doc/intro_to_pomp.pdf: -
+	  some minor additions to the intro vignette
+
+2010-05-11  kingaa
+
+	* [r236] man/eulermultinom.Rd, man/plugins.Rd, man/pomp.Rd: -
+	  update documentation
+	* [r235] DESCRIPTION: - added Helen Wearing as an author for her
+	  work with the stochastic simulation algorithms
+
 2010-05-10  kingaa
 
+	* [r234] inst/ChangeLog, inst/doc/advanced_topics_in_pomp.pdf,
+	  inst/doc/intro_to_pomp.pdf: - updated vignettes
 	* [r233] man/euler.Rd: - the old plugins are deprecated: move their
 	  man pages to 'internal'
 	* [r232] inst/doc/ou2-first-mif.rda: - binary file with saved

Modified: pkg/inst/doc/advanced_topics_in_pomp.pdf
===================================================================
(Binary files differ)

Modified: pkg/inst/doc/intro_to_pomp.Rnw
===================================================================
--- pkg/inst/doc/intro_to_pomp.Rnw	2010-05-17 12:54:41 UTC (rev 238)
+++ pkg/inst/doc/intro_to_pomp.Rnw	2010-05-17 13:51:00 UTC (rev 239)
@@ -24,6 +24,7 @@
 \newcommand\code[1]{\texttt{#1}}
 \newcommand{\R}{\textsf{R}}
 \newcommand{\pomp}{\texttt{pomp}}
+\newcommand{\expect}[1]{\mathbb{E}\left[#1\right]}
 
 \title[Introduction to pomp]{Introduction to pomp:\\ inference for partially-observed Markov processes}
 
@@ -41,7 +42,7 @@
   48109-1048 USA
 }
 
-\email{kingaa at umich.edu} 
+\email{kingaa at umich dot edu} 
 
 \urladdr{http://pomp.r-forge.r-project.org}
 
@@ -361,6 +362,7 @@
 
 To compute the likelihood of the data, we can use the function \code{pfilter}.
 This runs a plain vanilla particle filter (AKA sequential Monte Carlo) algorithm and results in an unbiased estimate of the likelihood.
+See \citet{Arulampalam2002} for an excellent tutorial on particle filtering.
 To do this, we must decide how many concurrrent realizations (\emph{particles}) to use: the larger the number of particles, the smaller the Monte Carlo error but the greater the computational effort.
 Let's run \code{pfilter} with 1000 particles and evaluate the likelihood at the true parameters:
 <<ou2-pfilter-truth,eval=F>>=
@@ -485,6 +487,7 @@
 
 %% In the ``advanced_topics_in_pomp'' vignette, we show how one can get access to more of the underlying structure of a \pomp\ object.
 
+\clearpage
 \section{Estimating parameters using iterated filtering: \code{mif}}
 
 Iterated filtering is a technique for maximizing the likelihood obtained by filtering.
@@ -615,6 +618,8 @@
 
 \section{Nonlinear forecasting: \code{nlf}}
 
+To be added.
+
 <<first-nlf,echo=F,eval=F>>=
 estnames <- c('alpha.2','alpha.3','tau')
 out <- nlf(
@@ -644,11 +649,28 @@
       )
 @ 
 
+\clearpage
 \section{Trajectory matching: \code{traj.match}}
 
-To be added.
-
-<<ou2-skeleton-def,echo=F,eval=F>>=
+The idea behind trajectory matching is a simple one.
+One attempts to fit a deterministic dynamical trajectory to the data.
+This is tantamount to assuming that all the stochasticity in the system is in the measurement process.
+In \pomp, the trajectory is computed using the \code{trajectory} function, which in turn uses the \code{skeleton} slot of the \pomp\ object.
+The \code{skeleton} slot should be filled with the deterministic skeleton of the process model.
+In the discrete-time case, this is the map
+\begin{equation*}
+  x\,\mapsto\,\expect{X_{t+1}\;\vert\;X_{t}=x,\theta}.
+\end{equation*}
+In the continuous-time case, this is the vectorfield
+\begin{equation*}
+  x\,\mapsto\,\lim_{{\Delta}{t}\,\to\,0}\,\frac{\expect{X_{t+{\Delta}{t}}\;\vert\;X_{t}=x,\theta}}{{\Delta}{t}}.
+\end{equation*}
+Our discrete-time bivariate autoregressive process has the deterministic skeleton
+\begin{equation}\label{eq:ou-skel}
+  X_{t} = \alpha\,X_{t-1}
+\end{equation}
+which can be implemented in the \R\ function
+<<ou2-skeleton-def,echo=T>>=
 ou2.skel <- function (x, t, params, ...) {
   xnew <- c(
             params["alpha.1"]*x["x1"]+params["alpha.3"]*x["x2"],
@@ -659,13 +681,12 @@
 }
 @ 
 
-<<new-ou2-pomp-construction,eval=F,echo=F>>=
+We can incorporate the deterministic skeleton into a new \pomp\ object in the same way as before:
+<<new-ou2-pomp-construction,echo=T>>=
+dat <- subset(as(ou2,"data.frame"),time<=60)
+theta <- coef(ou2)
 new.ou2 <- pomp(
-                data=data.frame(
-                  time=1:100,
-                  y1=NA,
-                  y2=NA
-                  ),
+                data=dat[c("time","y1","y2")],
                 times="time",
                 rprocess=euler.sim(ou2.proc.sim,delta.t=1),
                 rmeasure=ou2.meas.sim,
@@ -674,28 +695,25 @@
                 t0=0
                 )
 coef(new.ou2) <- theta
-coef(new.ou2,c("sigma.1","sigma.2","sigma.3","tau")) <- c(0,0,0,0.2)
+coef(new.ou2,c("sigma.1","sigma.2","sigma.3","tau")) <- c(0,0,0,0.5)
 new.ou2 <- simulate(new.ou2,seed=88737400L)
 @ 
-<<echo=F,eval=F>>=
-new.ou2 <- ou2
-coef(new.ou2) <- theta
-coef(new.ou2,c("sigma.1","sigma.2","sigma.3","tau")) <- c(0,0,0,0.2)
-new.ou2 <- simulate(new.ou2,seed=88737400L)
-@ 
+Note that we have turned off the process noise in \code{new.ou2} (next to last line) so that trajectory matching is actually formally appropriate for this model.
 
-<<ou2-trajmatch-calc,eval=F,echo=F>>=
+The \pomp\ function \code{traj.match} calls the optimizer \code{optim} to minimize the discrepancy between the trajectory and the data.
+The discrepancy is measured using the \code{dmeasure} function from the \pomp\ object.
+Fig.~\ref{fig:trajmatch-plot} shows the results of this fit.
+<<ou2-trajmatch-calc,eval=F>>=
 tm <- traj.match(
                  new.ou2,
                  start=coef(new.ou2),
-                 est=c("alpha.1","alpha.2","alpha.3","alpha.4","tau","x1.0","x2.0"),
+                 est=c("alpha.2","alpha.3","tau","x1.0","x2.0"),
                  method="Nelder-Mead",
                  maxit=1000,
                  reltol=1e-8
                  )
 @ 
-
-<<ou2-trajmatch-eval,echo=F,eval=F>>=
+<<ou2-trajmatch-eval,echo=F,eval=T>>=
 binary.file <- "ou2-trajmatch.rda"
 if (file.exists(binary.file)) {
   load(binary.file)
@@ -705,20 +723,25 @@
 }
 @
 
-%%\begin{figure}
-<<trajmatch-plot,fig=T,echo=F,eval=F>>=
+\begin{figure}
+<<trajmatch-plot,fig=T,echo=F,eval=T>>=
 op <- par(mfrow=c(2,1),mar=c(3,3,0,0),mgp=c(2,1,0),bty='l')
 plot(time(new.ou2),data.array(new.ou2,"y1"),xlab="",xaxt='n',ylab=expression(y[1]))
 x <- trajectory(new.ou2,params=tm$params)
-lines(time(new.ou2),x["x1",1,-1])
+lines(time(new.ou2),x["x1",1,-1],lwd=2)
 plot(time(new.ou2),data.array(new.ou2,"y2"),xlab="time",ylab=expression(y[2]))
-lines(time(new.ou2),x["x2",1,-1])
+lines(time(new.ou2),x["x2",1,-1],lwd=2)
 par(op)
 @ 
-%% \caption{Trajectory matching.}
-%% \label{fig:trajmatch-plot}
-%% \end{figure}
+\caption{
+  Illustration of trajectory matching.
+  The points show data simulated from \code{new.ou2}, which has no process noise but only measurement error.
+  The solid line shows the trajectory of the best-fitting model, obtained using \code{traj.match}.
+}
+\label{fig:trajmatch-plot}
+\end{figure}
 
+\clearpage
 \section{A more complex example: a seasonal epidemic model}
 
 The SIR model is a mainstay of theoretical epidemiology.
@@ -868,6 +891,7 @@
   \label{fig:sir-plot}
 \end{figure}
 
+\clearpage
 \bibliographystyle{ecology}
 \bibliography{pomp}
 

Modified: pkg/inst/doc/intro_to_pomp.pdf
===================================================================
(Binary files differ)

Added: pkg/inst/doc/ou2-trajmatch.rda
===================================================================
(Binary files differ)


Property changes on: pkg/inst/doc/ou2-trajmatch.rda
___________________________________________________________________
Added: svn:mime-type
   + application/octet-stream

Modified: pkg/inst/doc/pomp.bib
===================================================================
--- pkg/inst/doc/pomp.bib	2010-05-17 12:54:41 UTC (rev 238)
+++ pkg/inst/doc/pomp.bib	2010-05-17 13:51:00 UTC (rev 239)
@@ -1,6 +1,20 @@
 % This file was created with JabRef 2.5.
 % Encoding: ISO8859_1
 
+ at ARTICLE{Arulampalam2002,
+  author = {Arulampalam, M. S. and Maskell, S. and Gordon, N. and Clapp, T.},
+  title = {A Tutorial on Particle Filters for Online Nonlinear, Non-{G}aussian
+	{B}ayesian Tracking},
+  journal = {IEEE Transactions on Signal Processing},
+  year = {2002},
+  volume = {50},
+  pages = {174 -- 188},
+  doi = {10.1109/78.978374},
+  file = {Arulampalam2002.pdf:Desktop/Arulampalam2002.pdf:PDF},
+  owner = {kingaa},
+  timestamp = {2007.07.20}
+}
+
 @ARTICLE{Breto2009,
   author = {Carles Bret\'{o} and Daihai He and Edward L. Ionides and Aaron A.
 	King},
@@ -33,6 +47,43 @@
   timestamp = {2009.09.15}
 }
 
+ at ARTICLE{Ellner1998,
+  author = {S. P. Ellner and B. A. Bailey and G. V. Bobashev and A. R. Gallant
+	and B. T. Grenfell and D. W. Nychka},
+  title = {Noise and nonlinearity in measles epidemics: Combining mechanistic
+	and statistical approaches to population modeling},
+  journal = {American Naturalist},
+  year = {1998},
+  volume = {151},
+  pages = {425--440},
+  abstract = {We present and evaluate an approach to analyzing population dynamics
+	data using semimechanistic models. These models incorporate reliable
+	information on population structure and underlying dynamic mechanisms
+	but use nonparametric surface-fitting methods to avoid unsupported
+	assumptions about the precise form of rate equations. Using historical
+	data on measles epidemics as a case study, we show how this approach
+	can lead to better forecasts, better characterizations of the dynamics,
+	and a better understanding of the factors causing complex population
+	dynamics relative to either mechanistic models or purely descriptive
+	statistical time-series models. The semimechanistic models are found
+	to have better forecasting accuracy than either of the model types
+	used in previous analyses when tested on data not used to fit the
+	models. The dynamics are characterized as being both nonlinear and
+	noisy, and the global dynamics are clustered very tightly near the
+	border of stability (dominant Lyapunov exponent lambda approximate
+	to 0). However, locally in state space the dynamics oscillate between
+	strong short-term stability and strong short-term chaos (i.e., between
+	negative and positive local Lyapunov exponents). There is statistically
+	significant evidence for short-term chaos in all data sets examined.
+	Thus the nonlinearity in these systems is characterized by the variance
+	over state space in local measures of chaos versus stability rather
+	than a single summary measure of the overall dynamics as either chaotic
+	or nonchaotic.},
+  file = {Ellner1998.pdf:Ellner1998.pdf:PDF},
+  owner = {kingaa},
+  timestamp = {2009.09.22}
+}
+
 @ARTICLE{Gillespie1977a,
   author = {D. T. Gillespie},
   title = {Exact Stochastic Simulation of Coupled Chemical Reactions},
@@ -78,6 +129,174 @@
   timestamp = {2009.06.26}
 }
 
+ at ARTICLE{Ionides2006,
+  author = {E. L. Ionides and C. Bret{\'o} and Aaron A. King},
+  title = {Inference for nonlinear dynamical systems},
+  journal = {Proceedings of the National Academy of Sciences of the United States
+	of America},
+  year = {2006},
+  volume = {103},
+  pages = {18438--18443},
+  number = {49},
+  abstract = {Nonlinear stochastic dynamical systems are widely used to model systems
+	across the sciences and engineering. Such models are natural to formulate
+	and can be analyzed mathematically and numerically. However, difficulties
+	associated with inference from time-series data about unknown parameters
+	in these models have been a constraint on their application. We present
+	a new method that makes maximum likelihood estimation feasible for
+	partially-observed nonlinear stochastic dynamical systems (also known
+	as state-space models) where this was not previously the case. The
+	method is based on a sequence of filtering operations which are shown
+	to converge to a maximum likelihood parameter estimate. We make use
+	of recent advances in nonlinear filtering in the implementation of
+	the algorithm. We apply the method to the study of cholera in Bangladesh.
+	We construct confidence intervals, perform residual analysis, and
+	apply other diagnostics. Our analysis, based upon a model capturing
+	the intrinsic nonlinear dynamics of the system, reveals some effects
+	overlooked by previous studies.},
+  doi = {10.1073/pnas.0603181103},
+  file = {Ionides2006.pdf:Ionides2006.pdf:PDF},
+  owner = {kingaa},
+  timestamp = {2006.10.06}
+}
+
+ at ARTICLE{Kendall1999,
+  author = {B. E. Kendall and C. J. Briggs and W. W. Murdoch and P. Turchin and
+	S. P. Ellner and E. McCauley and R. M. Nisbet and S. N. Wood},
+  title = {Why do populations cycle? A synthesis of statistical and mechanistic
+	modeling approaches},
+  journal = {Ecology},
+  year = {1999},
+  volume = {80},
+  pages = {1789--1805},
+  abstract = {Population cycles have long fascinated ecologists. Even in the most-studied
+	populations, however, scientists continue to dispute the relative
+	importance of various potential causes of the cycles, Over the past
+	three decades, theoretical ecologists have cataloged a large number
+	of mechanisms that are capable of generating cycles in population
+	models. At the same time, statisticians have developed new techniques
+	both for characterizing time series and for fitting population models
+	to time-series data. Both disciplines are now sufficiently advanced
+	that great gains in understanding can be made by synthesizing these
+	complementary, and heretofore mostly independent, quantitative approaches.
+	In this paper we demonstrate how to apply this synthesis to the problem
+	of population cycles, using both long-term population time series
+	and the often-rich observational and experimental data on the ecology
+	of the species in question. We quantify hypotheses by writing mathematical
+	models that embody the interactions and forces that might cause cycles.
+	Some hypotheses can be rejected out of hand, as being unable to generate
+	even qualitatively appropriate dynamics, We finish quantifying the
+	remaining hypotheses by estimating parameters, both from independent
+	experiments and from fitting the models to the time-series data using
+	modern statistical techniques, Finally, we compare simulated time
+	series generated by the models to the observed time series, using
+	a variety of statistical descriptors, which we refer to collectively
+	as "probes." The model most similar to the data, as measured by these
+	probes, is considered to be the most likely candidate to represent
+	the mechanism underlying the population cycles. We illustrate this
+	approach by analyzing one of Nicholson's blowfly populations, in
+	which we know the "true" governing mechanism. Our analysis, which
+	uses only a subset of the information available about the population,
+	uncovers the correct answer, suggesting that this synthetic approach
+	might be successfully applied to field populations as well.},
+  file = {Kendall1999.pdf:Kendall1999.pdf:PDF},
+  owner = {kingaa},
+  timestamp = {2009.09.22}
+}
+
+ at ARTICLE{Kendall2005,
+  author = {Kendall, B. E. and Ellner, S. P. and McCauley, E. and Wood, S. N.
+	and Briggs, C. J. and Murdoch, W. M. and Turchin, P.},
+  title = {Population cycles in the pine looper moth: Dynamical tests of mechanistic
+	hypotheses},
+  journal = {Ecological Monographs},
+  year = {2005},
+  volume = {75},
+  pages = {259--276},
+  number = {2},
+  abstract = {The forest insect pest Bupalus piniarius (pine looper moth) is a classic
+	example of a natural population cycle. As is typical for Populations
+	that exhibit regular oscillations in density, there are several biological
+	mechanisms that are hypothesized to be responsible for the cycles;
+	but despite several decades of detailed study there has been no definite
+	conclusion as to which mechanism is most important. We evaluated
+	three hypotheses for which there was direct experimental evidence:
+	(1) food quality (nutritional value of pine needles affected by defoliation);
+	(2) parasitoids (trophic interactions with specialist parasitoids),
+	and (3) maternal effects (maternal body size affects the performance
+	of offspring). We reviewed the empirical evidence for each of these
+	hypotheses and expressed each hypothesis in the form of a mechanistic
+	dynamic model. We used a nonlinear forecasting approach to fit each
+	model to three long-term Population time series in Britain that exhibit
+	some degree of regular cycling, and we used parametric bootstrap
+	to evaluate the significance of differences between models in their
+	goodness of fit to the data. The results differed among the three
+	forests: at Culbin, the parasitoid and maternal effects models fit
+	equally well; at Roseisle, the food quality and maternal effects
+	models fit equally well; and at Tentsmuir, the parasitoid model fit
+	best. However, the best-fit parasitism models required that the parasitism
+	rate vary between nearly 0 and nearly 1 during a cycle, greatly exceeding
+	the range of parasitism rates that have been observed in the field.
+	In contrast, the required variation in the observable maternal quality
+	variable (pupal mass) was within the range of empirical observations.
+	Under mild constraints on the parasitism rate (though allowing a
+	much wider range than has been measured in B. piniarius at any location),
+	the fit of the parasitism model fell off dramatically. The maternal
+	effects model then had uniformly strong support, outperforming the
+	constrained parasitism model at all three sites and the food quality
+	model at two; it performed slightly better than the food quality
+	model at the remaining site. This represents the first system in
+	which the maternal effects hypothesis for population cycles has been
+	supported by both strong biological and dynamical evidence.},
+  file = {Kendall2005.pdf:Kendall2005.pdf:PDF},
+  owner = {kingaa},
+  timestamp = {2009.09.22}
+}
+
+ at ARTICLE{King2008,
+  author = {King, Aaron A. and Ionides, Edward L. and Pascual, Mercedes and Bouma,
+	Menno J.},
+  title = {Inapparent infections and cholera dynamics},
+  journal = {Nature},
+  year = {2008},
+  volume = {454},
+  pages = {877--880},
+  number = {7206},
+  month = aug,
+  abstract = {In many infectious diseases, an unknown fraction of infections produce
+	symptoms mild enough to go unrecorded, a fact that can seriously
+	compromise the interpretation of epidemiological records. This is
+	true for cholera, a pandemic bacterial disease, where estimates of
+	the ratio of asymptomatic to symptomatic infections have ranged from
+	3 to 100. In the absence of direct evidence, understanding of fundamental
+	aspects of cholera transmission, immunology and control has been
+	based on assumptions about this ratio and about the immunological
+	consequences of inapparent infections. Here we show that a model
+	incorporating high asymptomatic ratio and rapidly waning immunity,
+	with infection both from human and environmental sources, explains
+	50 yr of mortality data from 26 districts of Bengal, the pathogen's
+	endemic home. We find that the asymptomatic ratio in cholera is far
+	higher than had been previously supposed and that the immunity derived
+	from mild infections wanes much more rapidly than earlier analyses
+	have indicated. We find, too, that the environmental reservoir (free-living
+	pathogen) is directly responsible for relatively few infections but
+	that it may be critical to the disease's endemicity. Our results
+	demonstrate that inapparent infections can hold the key to interpreting
+	the patterns of disease outbreaks. New statistical methods, which
+	allow rigorous maximum likelihood inference based on dynamical models
+	incorporating multiple sources and outcomes of infection, seasonality,
+	process noise, hidden variables and measurement error, make it possible
+	to test more precise hypotheses and obtain unexpected results. Our
+	experience suggests that the confrontation of time-series data with
+	mechanistic models is likely to revise our understanding of the ecology
+	of many infectious diseases.},
+  doi = {10.1038/nature07084},
+  file = {King2008.pdf:King2008.pdf:PDF;King2008_Supplement.pdf:King2008_Supplement.pdf:PDF},
+  owner = {kingaa},
+  publisher = {Macmillan Publishers Limited. All rights reserved},
+  timestamp = {2008.08.13}
+}
+
 @ARTICLE{Wearing2009,
   author = {Helen J Wearing and Pejman Rohani},
   title = {Estimating the duration of pertussis immunity using epidemiological



More information about the pomp-commits mailing list