[Vegan-commits] r1940 - in pkg/vegan: R inst man
noreply at r-forge.r-project.org
Sun Oct 9 19:53:23 CEST 2011
Author: jarioksa
Date: 2011-10-09 19:53:22 +0200 (Sun, 09 Oct 2011)
New Revision: 1940
Modified:
pkg/vegan/R/permutest.cca.R
pkg/vegan/inst/ChangeLog
pkg/vegan/man/anova.cca.Rd
Log:
parallelized code for all OS, and 'permutations' argument can be a permute::shuffleSet() object
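The interface change named in the log can be sketched in base R as follows. This is a minimal illustration, not the vegan code: `make_perms()` is a hypothetical stand-in for permute::shuffleSet() that ignores strata, and `perm_stat()` is a toy statistic. The point it shows is that all random numbers are drawn once, up front, so the function that consumes the permutation matrix needs no RNG and can be split across workers.

```r
## Hypothetical stand-in for permute::shuffleSet(): returns a matrix with
## one permutation of 1..n per row (no strata, no restricted designs).
make_perms <- function(n, nperm) {
    t(replicate(nperm, sample.int(n)))
}

## Toy statistic evaluated for every row of a permutation matrix.
## Like the revised getF(), it accepts a single permutation given as a
## plain vector by turning it into a one-row matrix first.
perm_stat <- function(indx, x, y) {
    if (!is.matrix(indx))
        dim(indx) <- c(1, length(indx))
    vapply(seq_len(nrow(indx)),
           function(i) cor(x, y[indx[i, ]]),
           numeric(1))
}

set.seed(42)
N <- 20
x <- rnorm(N)
y <- x + rnorm(N)

perms <- make_perms(N, 99)        # all randomness happens here
stats <- perm_stat(perms, x, y)   # deterministic given 'perms'
obs <- cor(x, y)

## Permutation P value with the observed statistic counted as one draw
p_value <- (sum(abs(stats) >= abs(obs)) + 1) / (length(stats) + 1)
```

Because `perm_stat()` is deterministic for a given matrix, running it on any subset of rows, in any order, gives the same values for those rows.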
Modified: pkg/vegan/R/permutest.cca.R
===================================================================
--- pkg/vegan/R/permutest.cca.R 2011-10-09 17:37:16 UTC (rev 1939)
+++ pkg/vegan/R/permutest.cca.R 2011-10-09 17:53:22 UTC (rev 1940)
@@ -7,17 +7,22 @@
`permutest.cca` <-
function (x, permutations = 99,
model = c("reduced", "direct", "full"), first = FALSE,
- strata = NULL, parallel = 1, ...)
+ strata = NULL, parallel = 1, kind = c("snow", "multicore"),...)
{
+ kind <- match.arg(kind)
+ parallel <- as.integer(parallel)
model <- match.arg(model)
isCCA <- !inherits(x, "rda")
isPartial <- !is.null(x$pCCA)
## Function to get the F statistics in one loop
- getF <- function (R, ...)
+ getF <- function (indx, ...)
{
+ if (!is.matrix(indx))
+ dim(indx) <- c(1, length(indx))
+ R <- nrow(indx)
mat <- matrix(0, nrow = R, ncol = 3)
for (i in seq_len(R)) {
- take <- permuted.index(N, strata)
+ take <- indx[i,]
Y <- E[take, ]
if (isCCA)
wtake <- w[take]
@@ -101,18 +106,29 @@
runif(1)
seed <- get(".Random.seed", envir = .GlobalEnv, inherits = FALSE)
## permutations
- if (parallel > 1 && getRversion() >= "2.14" && require(parallel)
- && .Platform$OS.type == "unix") {
- R <- ceiling(permutations/parallel)
- mc.reset.stream()
- tmp <- do.call(rbind, mclapply(seq_len(parallel), getF, R = R,
- mc.cores = parallel))
+ if (length(permutations) == 1) {
+ permutations <- shuffleSet(N, permutations)
+ }
+ nperm <- nrow(permutations)
+ if (parallel > 1 && getRversion() >= "2.14" && require(parallel)) {
+ if (kind == "snow") {
+ cl <- makeCluster(parallel)
+ clusterEvalQ(cl, library(vegan))
+ tmp <- parRapply(cl, permutations, function(i) getF(i))
+ tmp <- t(matrix(tmp, nrow=3))
+ stopCluster(cl)
+ } else {
+ tmp <- do.call(rbind,
+ mclapply(1:nperm,
+ function(i) getF(permutations[i,]),
+ mc.cores = parallel))
+ }
} else {
- tmp <- getF(R = permutations)
+ tmp <- getF(permutations)
}
- num <- tmp[1:permutations,1]
- den <- tmp[1:permutations,2]
- F.perm <- tmp[1:permutations,3]
+ num <- tmp[,1]
+ den <- tmp[,2]
+ F.perm <- tmp[,3]
## Round to avoid arbitrary ordering of statistics due to
## numerical inaccuracy
F.0 <- round(F.0, 12)
Modified: pkg/vegan/inst/ChangeLog
===================================================================
--- pkg/vegan/inst/ChangeLog 2011-10-09 17:37:16 UTC (rev 1939)
+++ pkg/vegan/inst/ChangeLog 2011-10-09 17:53:22 UTC (rev 1940)
@@ -31,29 +31,31 @@
not find function in all packages, but 'vegan' is made known, and
'stats' and 'base' seem to be known.
- * permutest.cca: First attempt of setting 'parallel' processing in
- permutest.cca. Currently the parallelization only works in R
- 2.14.0 (alpha) and later with the 'parallel' package, and in
- unix-like operating systems (Linux and MacOS X were
- tested). Function permutest.cca gets a new argument 'parallel'
- (defaults 1) that gives the number of desired parallel
- processes. The argument is silently ignored if the system is not
- capable of parallel processing (missing 'parallel' package,
- Windows). The argument may be bassed to permutest.cca() from
- anova.cca(), but currently setting the random number generator
- seed will fail, and the results probably will be wrong. This
- feature is only for testing. The functionality cannot be included
- cleanly: it depends on the package 'parallel', but suggesting
- 'parallel' fails R CMD check in the current R release (2.13.2)
- which does not yet have 'parallel'. So we get warnings:
- 'library' or 'require' call not declared from: parallel, and
- permutest.cca: no visible global function definition for
- ‘mclapply’.
- Perhaps we delay adding this feature, and cancel this submission
- later. However, with these warnings, the function passes tests in
- R 2.13.2. (It fails in R 2.14.0 alpha since it suggests 'rgl', and
- that package fails in R 2.14.0.)
+ * permutest.cca: implemented 'parallel' processing in
+ permutest.cca. The parallelization only works in R 2.14.0 (alpha)
+ and later with the 'parallel' package. Function permutest.cca gets
+ new arguments 'parallel' (defaults to 1), which gives the number of
+ parallel processes, and 'kind', which selects the parallelization
+ style: either "snow" (large overhead, but works in all
+ OS's) or "multicore" (faster, but only works in unix-like systems
+ like Linux and MacOS X). The arguments are silently ignored if the
+ system is not capable of parallel processing. The functionality
+ cannot be included cleanly: it depends on the package 'parallel',
+ but suggesting 'parallel' fails R CMD check in the current R
+ release (2.13.2) which does not yet have 'parallel'. So we get
+ warnings: 'library' or 'require' "call not declared from:
+ parallel", and "permutest.cca: no visible global function
+ definition for ‘mclapply’". However, with these warnings,
+ the function passes tests in R 2.13.2.
+ * permutest.cca: the user interface changed so that argument
+ 'permutations' can be either the number of permutations (like
+ previously), or a matrix of permutations like that produced by
+ permute::shuffleSet(). This was done to move RNG outside
+ parallelized code. This will also allow much simpler
+ anova.cca* code. Currently, the 'strata' argument will not work,
+ but this will be fixed "real soon now".
+
Version 2.1-2 (opened October 4, 2011)
* permutest.cca could not be update()d, because "permutest.cca"
Modified: pkg/vegan/man/anova.cca.Rd
===================================================================
--- pkg/vegan/man/anova.cca.Rd 2011-10-09 17:37:16 UTC (rev 1939)
+++ pkg/vegan/man/anova.cca.Rd 2011-10-09 17:53:22 UTC (rev 1940)
@@ -25,7 +25,8 @@
\method{permutest}{cca}(x, permutations = 99,
model = c("reduced", "direct", "full"),
- first = FALSE, strata, parallel = 1, ...)
+ first = FALSE, strata, parallel = 1, kind = c("snow", "multicore"),
+ ...)
}
\arguments{
@@ -54,11 +55,15 @@
permutation. If supplied, observations are permuted only within the
specified strata.}
- \item{parallel}{Number of parallel processes. The parallel
- processing is only possible in \R version 2.14.x and later, and
- currently only works in unix-like operating systems, such as Linux
- and MacOS X. The argument is silently ignored if the system is not
- capable of parallel processing. }
+ \item{parallel, kind}{Number of parallel processes. The parallel
+ processing is only possible in \R version 2.14.x and later. The
+ arguments are silently ignored if the system is not capable of
+ parallel processing. There are two kinds of parallelization:
+ \code{kind = "snow"} selects a socket cluster which is available
+ in all operating systems, and \code{kind = "multicore"} selects a
+ fork cluster that is available only in unix-like systems (Linux,
+ MacOS X), but is usually faster. These arguments are experimental
+ and may change or disappear in any version. }
}
\details{
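The "snow" branch in the diff above reshapes its result with `t(matrix(tmp, nrow=3))` because `parRapply()` returns one flat vector rather than a matrix. The sketch below illustrates that step in isolation. It assumes a machine that can start two local socket workers, and `row_stats()` is a hypothetical function that returns three values per permutation in place of the (num, den, F) triple from getF().

```r
library(parallel)

## Hypothetical row function: three values per permutation, standing in
## for the numerator, denominator and F statistic computed by getF().
row_stats <- function(take) c(sum(take[1:2]), length(take), take[1])

perms <- t(replicate(10, sample.int(6)))   # 10 permutations of 1..6

cl <- makeCluster(2)                       # socket ("snow"-style) cluster
flat <- parRapply(cl, perms, row_stats)    # flat vector of length 3 * 10
stopCluster(cl)

## Values arrive permutation by permutation, so fill a 3-row matrix
## column-wise and transpose to get one row per permutation.
par_res <- t(matrix(flat, nrow = 3))

## Sequential reference computed without a cluster
seq_res <- t(apply(perms, 1, row_stats))
```

Both `par_res` and `seq_res` have one row per permutation and three columns, which is the layout the later `tmp[,1]`, `tmp[,2]` and `tmp[,3]` indexing relies on.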