[Dream-commits] r35 - in pkg: . R demo inst man

Mon Apr 16 06:04:07 CEST 2012

Author: josephguillaume
Date: 2012-04-16 06:04:07 +0200 (Mon, 16 Apr 2012)
New Revision: 35

Added:
   pkg/NEWS
Modified:
   pkg/DESCRIPTION
   pkg/R/dream.R
   pkg/demo/parallelisation_chain_id.R
   pkg/inst/CITATION
   pkg/man/dream.Rd
   pkg/man/dreamCalibrate.Rd
   pkg/man/snow.chains.Rd
Log:
  o  turn the calculation of Cb and Wb off if a likelihood function is provided.
  o  cleaned up demo parallelisation_chain_id, example of extracting parameter values and likelihood function
  o  expanded 'see also' section in dream man page
  o  clarify that Gelman-Rubin statistic is convergence of chain in dream man page
  o  mention MT-DREAM ZS in dream man page
  o  fixed error in CITATION

Modified: pkg/DESCRIPTION
===================================================================

--- pkg/DESCRIPTION	2012-01-10 06:44:31 UTC (rev 34)
+++ pkg/DESCRIPTION	2012-04-16 04:04:07 UTC (rev 35)
@@ -1,15 +1,15 @@
 Package: dream
-Version: 0.4
-Date: 2011-01-10
+Version: 0.4-1
+Date: 2012-04-16
 Title: DiffeRential Evolution Adaptive Metropolis
-Author: Joseph Guillaume and Felix Andrews, based on MATLAB code by Jasper Vrugt
+Author: Joseph Guillaume <joseph.guillaume at anu.edu.au>, Felix Andrews <felix at nfrac.org>
 Maintainer: Joseph Guillaume <joseph.guillaume at anu.edu.au>
 Description: Efficient global MCMC even in high-dimensional spaces.
   Based on the original MATLAB code written by Jasper Vrugt.
   Methods for calibration and prediction using the DREAM algorithm
 Depends: coda
-Suggests: snow, doSNOW, R.matlab, lattice, reshape, mgcv
-Enhances: multicore, foreach, doMPI, doMC
+Suggests: snow, R.matlab, lattice, reshape, mgcv
+Enhances: multicore, foreach, doMPI, doMC, doSNOW
 Imports: stats, utils
 License: file LICENSE
 URL: http://dream.r-forge.r-project.org/

Added: pkg/NEWS
===================================================================
--- pkg/NEWS	                        (rev 0)
+++ pkg/NEWS	2012-04-16 04:04:07 UTC (rev 35)
@@ -0,0 +1,16 @@
+Changes in Version 0.4-1
+
+  o  turn the calculation of Cb and Wb off if a likelihood function is provided.
+  o  cleaned up demo parallelisation_chain_id, added example of extracting parameter values and likelihood function
+  o  expanded 'see also' section in dream man page
+  o  clarify that Gelman-Rubin statistic is convergence of chain in dream man page
+  o  mention MT-DREAM ZS in dream man page
+
+Changes in Version 0.4
+  o  Allowed outlierTest='None' (and burnin.length=Inf) (don't run removeOutliers if burnin.length=0)
+  o  Padding of numbers in default parameter names to guarantee alphabetic order
+  o  Help file for simulate
+  o  Added an example adapted from Vrugt's code
+  o  Added undocumented functions to compare R output to the original matlab output
+  o  Added mat6 files containing matlab output for example1 and example2
+  o  Added ks.test and qq plot comparisons to matlab results for both example1 and example2. The R version appears to behave identically to the matlab version
\ No newline at end of file


Property changes on: pkg/NEWS
___________________________________________________________________
Added: svn:executable
   + *

Modified: pkg/R/dream.R
===================================================================
--- pkg/R/dream.R	2012-01-10 06:44:31 UTC (rev 34)
+++ pkg/R/dream.R	2012-04-16 04:04:07 UTC (rev 35)
@@ -231,10 +231,12 @@
   if (!is.na(control$thin.t) && floor(max.counter/control$thin.t)<2) stop(sprintf("Thin parameter should be much smaller than number of iterations (currently %d,%d)",control$thin.t,max.counter))
 
   ## Calculate the parameters in the exponential power density function of Box and Tiao (1973)
-  cbwb <- CalcCbWb(control$gamma)
-  if (is.na(control$Cb))  control$Cb <- cbwb$Cb
-  if (is.na(control$Wb))  control$Wb <- cbwb$Wb
-
+  if (!func.type %in% c("posterior.density","logposterior.density")){
+    cbwb <- CalcCbWb(control$gamma)
+    if (is.na(control$Cb))  control$Cb <- cbwb$Cb
+    if (is.na(control$Wb))  control$Wb <- cbwb$Wb
+  }
+  
   ## Generate the Table with JumpRates (dependent on number of dimensions and number of pairs)
   Table.JumpRate<-matrix(NA,NDIM,control$DEpairs)
   for (zz in 1:control$DEpairs) Table.JumpRate[,zz] <- 2.38/sqrt(2*zz*1:NDIM)

Modified: pkg/demo/parallelisation_chain_id.R
===================================================================
--- pkg/demo/parallelisation_chain_id.R	2012-01-10 06:44:31 UTC (rev 34)
+++ pkg/demo/parallelisation_chain_id.R	2012-04-16 04:04:07 UTC (rev 35)
@@ -1,68 +1,37 @@
-## Example of parallelisation
-## Uses FME data (see other example)
+## Example of "snow.chains" parallelisation option
+## Intended to allow running external model instances using separate batch files
+## Uses FME data, see demo("FME.nonlinear.model")
 
 obs.all <- data.frame(x=c(   28,  55,   83,  110,  138,  225,  375),   # mg COD/l
                   y=c(0.053,0.06,0.112,0.105,0.099,0.122,0.125))   # 1/hour
 
-
+## Define control settings
 control <- list(
+                ## Number of chains to run in parallel
                 nseq=4
                 )
+## Define parameter ranges
 pars <- list(p1=c(0,1),p2=c(0,100))
 
-
-##########################
-## Demonstration of parallelisation packages
-##  Comparison of run times for known fixed termination problem
-
-lik <- function(Qobs,Qs,...){
-  res<-Qobs-Qs
-  sd.res<-sd(res)
-  var.res<-var(res)
-  logp<-sum(sapply(res,function(res_k)
-                   log(1/sqrt(2*pi)*sd.res)*exp(-res_k^2/(2*var.res))))
-}
-
-Model.split <- function(id,pars){
-
-  ##Make sure that you have as many batch scripts as you have parameter chains
-  ## placeholder for this example
-
-  ##Select out this parameter set
-  this.par.set=pars[id,]
-
-  ##Do something to set the parameters in SWAT for this instance
-  ## Because they're running on other nodes, they don't have access to data or functions in R
-  x <- c(   28,  55,   83,  110,  138,  225,  375)
-  Model.y <- function(p,x) {
-    p[1]*x/(x+p[2])
-  }
-
-  ##Run the model
-  ans <- Model.y(this.par.set,x)
-
-  ##Do something to collect the answer, in appropriate form for lik.fun in dreamCalibrate
-  return(ans)
-}
-
+## Define function to return log posterior density for a parameter set
+##  doing something different for each parallel chain
 Model.fit <- function(id,pars){
-  ##Make sure that you have as many batch scripts as you have parameter chains
-  ## placeholder for this example
-
   ##Select out this parameter set
   this.par.set=pars[id,]
 
-  ##Do something to set the parameters in SWAT for this instance
-  ## Because they're running on other nodes, they don't have access to data or functions in R
+  ##Do something to set the parameters for this instance
+  ## Because they're running on other nodes, they don't have access to the data
+  ##  or functions in this session of R
   x <- c(   28,  55,   83,  110,  138,  225,  375)
   Model.y <- function(p,x) {
     p[1]*x/(x+p[2])
   }
 
   ##Run the model
+  ## If an external model, it could be e.g. a different batch script for each parameter chain
   ans <- Model.y(this.par.set,x)
 
-  ##Do something to collect the answer as logp
+  ##Do something to collect the answer as log posterior density
   Qs <- ans
   Qobs <- c(0.053,0.06,0.112,0.105,0.099,0.122,0.125)
 
@@ -74,32 +43,64 @@
   return(logp)
 }
 
-## Example here is for SNOW with a socket cluster, which doesn't require any other software or setup
+################################################################################
+## Example here is for SNOW with a socket cluster, which doesn't require any
+##   other software or setup
 ## Parallelisation incurs an overhead and is not worthwhile for fast functions
-library(doSNOW)
+library(snow)
 cl <- makeCluster(2, type = "SOCK")
-registerDoSNOW(cl)
 
 ## Using dream directly, with Model.fit, which returns a likelihood
 set.seed(456)
 control$parallel <- "snow.chains"
-dd <- dream(FUN = Model.fit,
+result.dream <- dream(FUN = Model.fit,
             pars = pars,
             func.type = "logposterior.density",
             control=control
             )
-summary(dd)
+stopCluster(cl)
 
-## TODO: dreamCalibrate cannot be used because of the way it calls FUN
-## ## Using dreamCalibrate, which allows the likelihood function to be changed separately from the model function
-## set.seed(456)
-## dd <- dreamCalibrate(
-##                      FUN=Model.split,
-##                      pars = pars,
-##                      obs=obs.all$y,
-##                      lik.fun=lik,
-##                      control = control
-##                      )
-## summary(dd)
+################################################################################
+## Exploring results
 
-stopCluster(cl)
+## Summary of settings
+print(result.dream)
+
+## Summary of results of fit
+summary(result.dream)
+
+## Selection of key plots of fit
+plot(result.dream)
+
+## Return parameters having highest likelihood, see ?coef.dream
+coef(result.dream)
+
+################################################################################
+## Extract parameters and likelihood
+
+## Extract MCMC object, removing burn.in and thinning, see ?window.dream
+mcmc <- window(result.dream)
+## Convert likelihood to MCMC object
+logp <- as.mcmc(window(result.dream$hist.logp, start = start(mcmc)))
+
+## Calculate effective size of MCMC object, and thin
+effsz <- effectiveSize(mcmc)
+thin <- 2 * ceiling(max(nrow(mcmc[[1]])/effsz))
+mcmc <- window(mcmc, thin = thin)
+logp <- window(logp, thin = thin)
+## Convert to more usable formats
+psets <- as.matrix(mcmc)
+objseq <- as.vector(logp)
+
+summary(psets)
+summary(objseq)
+
+## Plot log likelihood as a function of parameter value
+##   (also known as 'dotty' plot)
+##  with vertical lines showing maximum likelihood parameter value
+par(mfrow=c(1,2))
+plot(psets[,1],objseq,ylim=c(-1,0))
+abline(v=coef(result.dream)[1],col="red")
+plot(psets[,2],objseq,ylim=c(-1,0))
+abline(v=coef(result.dream)[2],col="red")
+

Modified: pkg/inst/CITATION
===================================================================
--- pkg/inst/CITATION	2012-01-10 06:44:31 UTC (rev 34)
+++ pkg/inst/CITATION	2012-04-16 04:04:07 UTC (rev 35)
@@ -5,7 +5,7 @@
                        "simulation by differential evolution with", 
                        "self-adaptive randomized subspace sampling."),
          author = personList(as.person("J A Vrugt"),
-	                     person("C", "J F", last = "ter Braak"),
+	                     as.person("C J F ter Braak"),
                              as.person("C G H Diks"),
                              as.person("B A Robinson"),
                              as.person("J M Hyman"),

Modified: pkg/man/dream.Rd
===================================================================
--- pkg/man/dream.Rd	2012-01-10 06:44:31 UTC (rev 34)
+++ pkg/man/dream.Rd	2012-04-16 04:04:07 UTC (rev 35)
@@ -11,14 +11,19 @@
   Efficient global MCMC even in high-dimensional spaces.
   From J.A. Vrugt, C.J.F. ter Braak et al.
 
-Note that the dream_zs and dream_d algorithms may be superior in your circumstances. These are not implemented in this package. Please read the following references for details.
+  This function is a low-level interface, best suited for experts. If you want to use dream to calibrate a function, use \code{\link{dreamCalibrate}} instead.
+  
+  Note that the dream_zs and dream_d algorithms may be superior in your circumstances. These are not implemented in this package. Please read the following references for details.
 
 Vrugt, J. A. and Ter Braak, C. J. F. (2011) DREAM(D): an adaptive Markov Chain Monte Carlo simulation algorithm to solve discrete, noncontinuous, and combinatorial posterior parameter estimation problems, \emph{Hydrol. Earth Syst. Sci.}, 15, 3701-3713, doi:10.5194/hess-15-3701-2011, from \url{http://dx.doi.org/10.5194/hess-15-3701-2011}
 
-ter Braak, C. and J. Vrugt (2008). "Differential Evolution Markov Chain
-with snooker updater and fewer chains." Statistics and Computing 18(4): 435-446 DOI: 10.1007/s11222-008-9104-9, from url{http://dx.doi.org/10.1007/s11222-008-9104-9}
+ter Braak, C. and J. Vrugt (2008). Differential Evolution Markov Chain
+with snooker updater and fewer chains. Statistics and Computing 18(4): 435-446 DOI: 10.1007/s11222-008-9104-9, from url{http://dx.doi.org/10.1007/s11222-008-9104-9}
 
-  This function is a low-level interface, best suited for experts. If you want to use dream to calibrate a function, use \code{\link{dreamCalibrate}} instead.
+Laloy,E., and J.A. Vrugt.  2012. High-dimensional posterior exploration
+of hydrologic models using multiple-try DREAM(ZS) and high-performance
+computing.  Water Resources Research, 48, W0156, from url{http://dx.doi.org/10.1029/2011WR010608}
+
 }
 \arguments{
   \item{FUN}{
@@ -146,12 +151,6 @@
   If dream frequently warns that it is reentering the burn-period,
   \code{control$burnin.length} may need to be increased.
 
-  There are S3 methods for print, summary,
-  \code{\link[=plot.dream]{plot}},
-  \code{\link[=coef.dream]{coef}},
-  \code{\link[=simulate.dream]{simulate}},
-  \code{\link[=window.dream]{window}}
-
 }
 \value{
   A list with elements:
@@ -161,7 +160,10 @@
   \item{hist.logp}{History of log(\var{p}) values. matrix \code{nseq x n.elem}.}
   \item{AR}{Acceptance rate for each draw. matrix \code{n.elem x 2}.}
   \item{outlier}{Iterations at which chains were replaced. vector of variable length}
-  \item{R.stat}{gelman.diag statistic for each variable at each step. matrix \code{n.elem/steps x 1+ndim}.}
+  \item{R.stat}{\code{\link{gelman.diag}} statistic for each variable at
+    each step, matrix \code{n.elem/steps x 1+ndim}. This is a measure of convergence of the parallel MCMC
+    chains to the posterior distribution. (N.B. in nonlinear
+    optimisation, this is not a measure of convergence to an optimum parameter set) }
   \item{CR}{Probability of crossover at each step. matrix \code{n.elem/steps x
     1+length(pCR)}.}
   \item{in.burnin}{Boolean showing whether dream was in the burn-in
@@ -173,23 +175,31 @@
 
 \seealso{
   \code{\link{dreamCalibrate}} to calibrate a function using dream.
+  \code{\link{snow.chains}} for parallelised calibration of external
+  models (expert users only).
 
-  Examples in demo folder:
+  S3 generic methods for print, summary,
+  \code{\link[=plot.dream]{plot}},
+  \code{\link[=coef.dream]{coef}},
+  \code{\link[=simulate.dream]{simulate}},
+  \code{\link[=window.dream]{window}}
+  
+  Demos:
   \itemize{
-    \item{example1: }{Fitting a banana shaped distribution - the first example
-  in Vrugt matlab code}
-    }
+    \item{\code{demo(example1)} }{Fitting a banana shaped distribution - as in the original matlab code}
+    \item{\code{demo(example2)}}{Fitting an n-dimensional Gaussian distribution as
+  in the original matlab code}
   }
-
+}
 \references{
 
   Vrugt, J. A., ter Braak, C. J. F., Diks, C. G. H., Robinson, B. A.,
   Hyman, J. M., Higdon, D., 2009. Accelerating Markov chain Monte Carlo
   simulation by differential evolution with self-adaptive randomized
   subspace sampling. \emph{International Journal of Nonlinear Sciences
-  and Numerical Simulation} 10 (3), 273-290.
+  and Numerical Simulation} 10 (3), 273-290. \url{http://dx.doi.org/10.1515/IJNSNS.2009.10.3.273}
 
-  Vrugt, J. A., Braak, C. J. F., Gupta, H. V., Robinson, B. A.,
+  Vrugt, J. A., ter Braak, C. J. F., Gupta, H. V., Robinson, B. A.,
   2009. Equifinality of formal (DREAM) and informal (GLUE) Bayesian
   approaches in hydrologic modeling?
   \emph{Stochastic Environmental Research and Risk Assessment} 23 (7), 1011--1026.

Modified: pkg/man/dreamCalibrate.Rd
===================================================================
--- pkg/man/dreamCalibrate.Rd	2012-01-10 06:44:31 UTC (rev 34)
+++ pkg/man/dreamCalibrate.Rd	2012-04-16 04:04:07 UTC (rev 35)
@@ -61,9 +61,11 @@
   See \code{\link{dream}} for details on the calibration method,
   visualisation of its results and diagnostics.
 
-  Example in demo folder:
+  See \code{\link{snow.chains}} for parallelised calibration of external
+  models (expert users only).
+
   \itemize{
-    \item{FME non linear model: }{Calibrating the non-linear model shown
+    \item{\code{demo(FME.nonlinear.model)}}{Calibrating the non-linear model shown
       in the FME vignette}
   }
 }

Modified: pkg/man/snow.chains.Rd
===================================================================
--- pkg/man/snow.chains.Rd	2012-01-10 06:44:31 UTC (rev 34)
+++ pkg/man/snow.chains.Rd	2012-04-16 04:04:07 UTC (rev 35)
@@ -3,8 +3,10 @@
 \alias{snow.chains}
 \title{DREAM with external batch files in parallel folders}
 \description{
-  N.B. This functionality is only provided on request for expert users
-  and will not be supported by the author.
+  N.B. This functionality is only provided on request for expert
+  users of external model software. It is assumed that users are capable
+  of setting up batch files, executing them from R, and reading in the
+  results. The authors can only provide limited support.
 
   With the setting \code{control$parallel="snow.chains"}, dream
   interprets \code{FUN} to be of the form \code{logp=f(chain.id, list of
@@ -24,4 +26,7 @@
    }
  }
  \seealso{\code{\link{dream}} for documentation of other arguments, and
-  \code{\link{dreamCalibrate}} for the recommended way to calibrate a function using dream.}
\ No newline at end of file
+  \code{\link{dreamCalibrate}} for the recommended way to calibrate a
+  function using dream.
+  demo("parallelisation_chain_id")
+}