[Vegan-commits] r2574 - pkg/vegan/inst/doc
noreply at r-forge.r-project.org
Thu Jul 18 18:02:14 CEST 2013
Author: jarioksa
Date: 2013-07-18 18:02:14 +0200 (Thu, 18 Jul 2013)
New Revision: 2574
Modified:
pkg/vegan/inst/doc/decision-vegan.Rnw
pkg/vegan/inst/doc/vegan.sty
Log:
fix vegan.sty (order matters), edit decision for parallel processing
Modified: pkg/vegan/inst/doc/decision-vegan.Rnw
===================================================================
--- pkg/vegan/inst/doc/decision-vegan.Rnw 2013-07-18 13:00:06 UTC (rev 2573)
+++ pkg/vegan/inst/doc/decision-vegan.Rnw 2013-07-18 16:02:14 UTC (rev 2574)
@@ -63,13 +63,13 @@
For parallel processing, the \code{parallel} argument can be either
\begin{enumerate}
-\item Integer in which case the given number of parallel processes
+\item An integer in which case the given number of parallel processes
will be launched (value $1$ launches non-parallel processing). In
unix-like systems (\emph{e.g.}, MacOS, Linux) these will be forked
- \code{multicore} processes, but socket clusters will be set up,
- initialized and closed in Windows.
+ \code{multicore} processes. In Windows socket clusters will be set up,
+ initialized and closed.
\item A previously created socket cluster. This saves time as the
- cluster is not set up and closed repeatedly. If the argument is a
+ cluster is not set up and closed in the function. If the argument is a
socket cluster, it will also be used in unix-like systems. Setting
up a socket cluster is discussed in \S\,\ref{sec:parallel:socket}.
\end{enumerate}
@@ -105,27 +105,26 @@
parallelized code and give the pre-defined cluster as the value of
the \code{parallel} argument in \pkg{vegan}. If you want to use
socket clusters in unix-like systems (MacOS, Linux), this can be only
-done with pre-defined clusters as these systems default to fork
-clusters.
+done with pre-defined clusters.
If socket cluster is not set up in Windows, \pkg{vegan} will create and
close the cluster within the function body. This involves following commands:
-<<eval=false>>=
+\begin{Schunk}
+\begin{Soutput}
clus <- makeCluster(4)
## perform parallel processing
stopCluster(clus)
-@
+\end{Soutput}
+\end{Schunk}
The first command sets up the cluster, in this case with four
cores, and the second command stops the cluster.
Most parallelized \pkg{vegan} functions work similarly in socket and
fork clusters, but in \code{oecosimu} the parallel processing is used
-to evaluate user-defined functions. If you use pre-defined socket
-cluster, and you use functions in packages (like in \pkg{vegan}), you
-must make those packages known to the socket cluster. For example, if
-you want to run in parallel the \code{meandist} function of the
-\code{oecosimu} example with a pre-defined socket cluster, you must
-use:
+to evaluate user-defined functions, and their arguments and data must
+be made known to the socket cluster. For example, if you want to run
+in parallel the \code{meandist} function of the \code{oecosimu}
+example with a pre-defined socket cluster, you must use:
<<eval=false>>=
## start up and define meandist()
library(vegan)
@@ -151,9 +150,9 @@
\subsubsection{Random number generation}
\pkg{Vegan} does not use parallel processing in random number
-generation. You can set the seed for the standard random number
-generation, and setting the seed for the parallelized generator
-(L'Ecuyer) has no effect in \pkg{vegan}.
+generation, and you can set the seed for the standard random number
+generator. Setting the seed for the parallelized generator (L'Ecuyer)
+has no effect in \pkg{vegan}.
\subsubsection{Does it pay off?}
@@ -164,22 +163,19 @@
\code{library(vegan)} with \code{clusterEvalQ} can take two seconds or
longer, and only pays off if the non-parallel analysis takes ten
seconds or longer. Using pre-defined clusters will reduce the
-overhead, but not completely. Fork clusters (in unix-likes operating
-systems) have a smaller overhead and can be faster, but they also have
-an overhead.
+overhead. Fork clusters (in unix-likes operating systems) have a
+smaller overhead and can be faster, but they also have an overhead.
Each parallel process needs memory, and for a large number of
processes you need much memory. If the memory is exhausted, the
-parallel processes can stall and can take much longer than
+parallel processes can stall and take much longer than
non-parallel processes (minutes instead of seconds).
If the analysis is fast, and function runs in, say, less than five
seconds, parallel processing is rarely useful. Parallel processing is
useful only in slow analyses: large number of replications or
-simulations, slow evaluation of each simulation. It also seems that
-increasing the number of processes gives diminishing returns, in
-particular in socket clusters. The danger of memory exhaustion must
-also be remembered.
+simulations, slow evaluation of each simulation. The danger of memory
+exhaustion must always be remembered.
The benefits and potential problems of parallel processing depend on
your particular system: it is best to rely on your own experience.
@@ -188,7 +184,7 @@
The implementation of the parallel processing should accord with the
description of the user interface above (\S\,\ref{sec:parallel:ui}).
-Function \code{oecosimu} can be used as the reference implementation,
+Function \code{oecosimu} can be used as a reference implementation,
and similar interpretation and order of interpretation of arguments
should be followed. All future implementations should be consistent
and all must be changed if the call heuristic changes.
@@ -197,45 +193,49 @@
positive integer or a socket cluster. Integer $1$ means that no
parallel processing is performed. The ``normal'' default is
\code{NULL} which in the ``normal'' case is interpreted as $1$. Here
-``normal'' means that \R{} is run with default settings. Function
-\code{oecosimu} interprets the \code{parallel} arguments in the
-following way:
+``normal'' means that \R{} is run with default settings without
+setting \code{mc.cores} or environmental variable \code{MC_CORES}.
+
+Function \code{oecosimu} interprets the \code{parallel} arguments in
+the following way:
\begin{enumerate}
\item \code{NULL}: The function is called with argument \code{parallel
= getOption("mc.cores")}. The option \code{mc.cores} is normally
unset and then the default is \code{parallel = NULL}. This is
- interpreted as \code{parallel = 1} in \R-2.14.x. In \R-2.15.x (not
- yet released) the function inspects if a default socket cluster is
- defined. \R-2.15.0 has an unexported environment
- \code{parallel:::.reg} with variable \code{default} that is either
- \code{NULL} for unset default or a socket cluster. Querying this
- environment is an error in \R-2.14.x so that we also need to test
- the \R{} version. In the following \code{oecosimu} code we first
- see if the default cluster is set when \code{parallel = NULL}, and
- if it is unset, the \code{parallel} will still be \code{NULL} and
- will be changed to \code{1}. After this, the value of
- \code{parallel} will be either an integer or a socket cluster, and
- information on the type is saved in variable \code{hasClus}:
-<<eval=false>>=
+ interpreted as \code{parallel = 1} in \R-2.14.x. In \R-2.15.x the
+ function inspects if a default socket cluster is defined. \R-2.15.0
+ has a non-exported environment \code{parallel:::.reg} with variable
+ \code{default} that is either \code{NULL} (default) or a socket
+ cluster. Querying this environment is an error in \R-2.14.x so that
+ we also need to test the \R{} version. In the following
+ \code{oecosimu} code we first see if the default cluster is set when
+ \code{parallel = NULL}, and if it is unset, the \code{parallel} will
+ still be \code{NULL} and will be changed to \code{1}. After this,
+ the value of \code{parallel} will be either an integer or a socket
+ cluster, and information on the type is saved in variable
+ \code{hasClus}:
+\begin{Schunk}
+\begin{Soutput}
if(is.null(parallel) &&
- getRversion()>="2.15.0")
+ getRversion() >= "2.15.0")
parallel <- get("default",
envir = parallel:::.reg)
if (is.null(parallel) ||
- getRversion()<"2.14.0")
+ getRversion() < "2.14.0")
parallel <- 1
hasClus <- inherits(parallel, "cluster")
-@
+\end{Soutput}
+\end{Schunk}
\item Integer: An integer value is taken as the number of created
parallel processes. In unix-like systems this is the number of
forked multicore processes, and in Windows this is the number of
- workers in socket clusters. In Windows, the socket clustes is
+ workers in socket clusters. In Windows, the socket cluster is
created, and if needed \code{library(vegan)} is evaluated in the
cluster (this is not necessary if the function only uses internal
functions), and the cluster is stopped after parallel processing.
\item Socket cluster: If a socket cluster is given, it will be used in
- all operating systems. It is not created, \code{library(vegan)} is
- not evaluated and the cluster is not stopped.
+ all operating systems, and the cluster is not stopped
+ within the function.
\end{enumerate}
This gives the following precedence order for parallel processing
@@ -244,12 +244,12 @@
\item Explicitly given argument value of \code{parallel} will always
be used.
\item If \code{mc.cores} is set, it will be used. In Windows this
- will mean creating and stopping socket clusters even when a
- default cluster is set if \code{mc.cores} is not
- \code{NULL}. Please note that the \code{mc.cores} is only set from
- the environmental variable \code{MC_CORES} when you load the
- \pkg{parallel} package, and it is always unset before first
- \code{require(parallel)}. \footnote{The behaviour of
+ means creating and stopping socket clusters even when a default
+ cluster is set if \code{mc.cores} is not \code{NULL}. Please note
+ that the \code{mc.cores} is only set from the environmental
+ variable \code{MC_CORES} when you load the \pkg{parallel} package,
+ and it is always unset before first
+ \code{require(parallel)}.\footnote{The behaviour of
\code{mc.cores} option is untested in Windows.}
\item The default socket cluster will be used if
set.\footnote{Default cluster is available since \R{} 2.15.0.}
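
Editor's note: the interpretation of the parallel argument described in the hunks above can be sketched in R roughly as follows. This is a simplified illustration, not the code in oecosimu: the helper name run_parallel is hypothetical, the lookup of the default socket cluster in parallel:::.reg is left out, and it is assumed that NULL falls back to 1. Only functions from the parallel package are used.

```r
library(parallel)

## Hypothetical helper (not part of vegan): apply FUN to each element of X,
## interpreting `parallel` as an integer or a pre-defined socket cluster.
run_parallel <- function(X, FUN, parallel = getOption("mc.cores")) {
    ## unset option and no explicit value: no parallel processing
    if (is.null(parallel))
        parallel <- 1
    hasClus <- inherits(parallel, "cluster")
    if (hasClus) {
        ## pre-defined socket cluster: used in all operating systems
        ## and not stopped within the function
        return(parLapply(parallel, X, FUN))
    }
    if (parallel > 1) {
        if (.Platform$OS.type == "unix") {
            ## unix-like systems: forked multicore processes
            return(mclapply(X, FUN, mc.cores = parallel))
        }
        ## Windows: socket cluster is set up and closed in the function
        clus <- makeCluster(parallel)
        on.exit(stopCluster(clus))
        return(parLapply(clus, X, FUN))
    }
    ## integer 1: non-parallel processing
    lapply(X, FUN)
}

## non-parallel call
res <- run_parallel(1:4, function(i) i^2, parallel = 1)
```

Passing a cluster created with makeCluster() as the parallel argument would reuse that cluster across calls, and the caller remains responsible for stopCluster().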
Modified: pkg/vegan/inst/doc/vegan.sty
===================================================================
--- pkg/vegan/inst/doc/vegan.sty 2013-07-18 13:00:06 UTC (rev 2573)
+++ pkg/vegan/inst/doc/vegan.sty 2013-07-18 16:02:14 UTC (rev 2574)
@@ -18,6 +18,15 @@
\renewcommand{\floatpagefraction}{0.8}
\usepackage{Sweave}
\usepackage{hyperref}
+%% layout depends on the number of columns
+\if@twocolumn
+ \renewenvironment{Schunk}{\par\footnotesize}{} % smaller examples
+ \setkeys{Gin}{width=\linewidth} % column wide figs
+\else
+ \renewenvironment{Schunk}{\par\small}{} % small examples
+ \setkeys{Gin}{width=0.55\linewidth} % narrow figs for sidecaps
+ \renewenvironment{figure}[1][tp]{\begin{SCfigure}[][#1]}{\end{SCfigure}} %sidecaps
+\fi
%% macros
%% \code should handle _ , ~ and $
\makeatletter
@@ -32,12 +41,4 @@
\newcommand{\VAR}{\mathsf{VAR}}
\newcommand{\COV}{\mathsf{COV}}
\newcommand{\Prob}{\mathsf{P}}
-%% layout depends on the number of columns
-\if@twocolumn
- \renewenvironment{Schunk}{\par\footnotesize}{} % smaller examples
- \setkeys{Gin}{width=\linewidth} % column wide figs
-\else
- \renewenvironment{Schunk}{\par\small}{} % small examples
- \setkeys{Gin}{width=0.55\linewidth} % narrow figs for sidecaps
- \renewenvironment{figure}[1][tp]{\begin{SCfigure}[][#1]}{\end{SCfigure}} %sidecaps
-\fi
+