[Vegan-commits] r2531 - in pkg/ordiconsensus: . R man

Tue Jun 18 21:10:24 CEST 2013

Author: gblanchet
Date: 2013-06-18 21:10:23 +0200 (Tue, 18 Jun 2013)
New Revision: 2531

Added:
   pkg/ordiconsensus/R/simulSADcomm.R
   pkg/ordiconsensus/man/simulSADcomm.Rd
Modified:
   pkg/ordiconsensus/DESCRIPTION
   pkg/ordiconsensus/NAMESPACE
   pkg/ordiconsensus/man/ordiconsensus-package.Rd
Log:
Added function simulSADcomm to package ordiconsensus

Modified: pkg/ordiconsensus/DESCRIPTION
===================================================================

--- pkg/ordiconsensus/DESCRIPTION	2013-06-18 13:56:29 UTC (rev 2530)
+++ pkg/ordiconsensus/DESCRIPTION	2013-06-18 19:10:23 UTC (rev 2531)
@@ -1,7 +1,7 @@
 Package: ordiconsensus
 Type: Package
 Title: Consensus of canonical ordinations through the canonical redundancy analysis
-Version: 0.3-2
+Version: 0.4
 Date: 2012-11-09
 Author: F. Guillaume Blanchet
 Maintainer: F. Guillaume Blanchet <guillaume.blanchet at helsinki.fi>

Modified: pkg/ordiconsensus/NAMESPACE
===================================================================
--- pkg/ordiconsensus/NAMESPACE	2013-06-18 13:56:29 UTC (rev 2530)
+++ pkg/ordiconsensus/NAMESPACE	2013-06-18 19:10:23 UTC (rev 2531)
@@ -1,6 +1,6 @@
 ### Export
 
-export(coeffCompare,consensusRDA,RV,SADbin)
+export(coeffCompare,consensusRDA,RV,SADbin,simulSADcomm)
 
 ### Import
 

Added: pkg/ordiconsensus/R/simulSADcomm.R
===================================================================
--- pkg/ordiconsensus/R/simulSADcomm.R	                        (rev 0)
+++ pkg/ordiconsensus/R/simulSADcomm.R	2013-06-18 19:10:23 UTC (rev 2531)
@@ -0,0 +1,162 @@
+simulSADcomm <-
+function(sp.abund,expl.var,expl.rand.sel=TRUE,nexpl.comb=2,binary=FALSE,fix.expl=NULL,nsite=50,weight=NULL,range.weight=c(0,2),sd.expl=FALSE,norm=c(0,1)){
+### Description:
+### 
+### Function that simulates data tables which have the same species
+### abundance distribution.
+###
+### Arguments:
+###
+### sp.abund : A vector defining the number of species in a bin.
+###            See "Details" for more information.
+### nsite : Numeric. Number of sites (rows) in the resulting matrix. See "Details".
+### expl.var : Matrix. Explanatory variables related to the species.
+### expl.rand.sel : Logical. Whether explanatory variables should be randomly selected to construct species (TRUE) or a fixed combination should be given (FALSE). Default is TRUE.
+### nexpl.comb : Numeric. The number of explanatory variables that will be combined together to construct the environmental variables. Default is 2.
+### binary : Logical. Whether the site-by-species matrix is an abundance (FALSE) or a presence/absence (TRUE) matrix. Default is FALSE.
+### fix.expl : Matrix. Defines which combination of explanatory variables should be used to construct species. This argument is only active when expl.rand.sel=FALSE. See Details for more information.
+### weight : Vector. Regression coefficients used to weight each species. If NULL, weights are randomly selected by sampling a uniform distribution with a range defined by range.weight. Default is NULL.
+### range.weight : Vector of length 2 giving the minimum and the maximum of a uniform distribution. This will be used to weight the explanatory variable used to construct each species. Default is 0 and 2.
+### sd.expl : Logical. Whether the standard deviation of the Normal error is a multiplier of the standard deviation of the deterministic portion of the newly created explanatory variable (TRUE) or the pure standard deviation (FALSE). Default is FALSE.
+### norm : Vector of length 2 giving the mean and a multiplier of the standard deviation of the deterministic portion of a newly created explanatory variable. Default is mean 0 and 1 times the standard deviation of the new deterministic explanatory variable.
+###
+### Details :
+### 
+### The argument "sp.abund" defines the species-abundance distribution structure of the data following the binnings proposed by Gray et al. (2006). For example, if the vector is (40,20,30), it means that there will be 40 species with 1 individual, 20 with 2 or 3 individuals, and 30 with 4 to 7 individuals.
+###
+### The individuals are assigned to the sites according to the set of explanatory variables given in expl.var. It is possible for a site to end up with 0 individuals. Such sites will be included in the analysis and dealt with a posteriori.
+###
+### The explanatory variables are randomly sampled (without replacement) when combining (adding) explanatory variables together. The number of explanatory variables must be a multiple of nexpl.comb.
+###
+### Error is added to a species by multiplying the explanatory variable used to construct the species by a weight and by adding a normally distributed error term to that same explanatory variable. An error term with a standard deviation equal to the standard deviation of the explanatory variable allows the explanatory variable to explain roughly 50% of the species it constructed.
+###
+### fix.expl is a matrix that has as many rows as there are species and as many columns as nexpl.comb (number of explanatory variables to combine). The numbers in fix.expl are integers that refer to the columns of expl.var. When fix.expl is used, nexpl.comb becomes meaningless.
+###
+### If a presence-absence matrix is constructed (binary=TRUE), sp.abund should be constructed in such a way that no bin includes species with an abundance larger than the number of sites. If this is not the case, an error message is sent. Within this constraint, if the maximum of the last bin (the one with the largest abundance) is larger than the number of sites, it will be automatically changed to the number of sites-1.
+###
+### Value :
+###
+### site.sp : The site (rows) by species (columns) matrix generated.
+### sel.expl : A vector presenting the order in which the explanatory variables were used to construct each species. The order follows the order of the species.
+###
+### Reference : 
+### Gray, J. S., A. Bjorgesaeter, and K. I. Ugland. 2006. On plotting species abundance distributions. Journal of Animal Ecology 75:752-756.
+### 
+### 
+### F. Guillaume Blanchet - September 2010, July 2011
+################################################################################
+	if(!is.vector(sp.abund)){
+		stop("'sp.abund' is not a vector")
+	}
+	
+	is.wholenumber <- function(x, tol = .Machine$double.eps^0.5)  abs(x - round(x)) < tol
+
+	nexpl.var<-ncol(expl.var)
+	
+	if(expl.rand.sel){
+		nexpl.var.new<-ncol(expl.var)/nexpl.comb
+		if(!is.wholenumber(nexpl.var.new)){
+			stop("The number of columns of 'expl.var' is not a multiple of 'nexpl.comb'")
+		}
+	}
+	
+	if(nrow(expl.var)!=nsite){
+		stop("'expl.var' should have the same number of rows as 'nsite'")
+	}
+	
+	#CC# Find the minimum and the maximum number of individuals for each bin defined in sp.abund
+	nbins<-length(sp.abund)
+	min.bin<-2^(0:(nbins-1))
+	max.bin<-2^(1:nbins)-1
+	
+	if(binary){
+		if(max(min.bin) > nsite){
+			stop("'sp.abund' has species with abundance too large")
+		}
+		if(max(max.bin)>nsite) max.bin[which.max(max.bin)]<-nsite-1
+		
+	}
+	
+	#CC# Construct site by species result matrix
+	nsp<-sum(sp.abund)
+	site.sp<-matrix(0,nsite,nsp)
+	
+	#CC# Construct matrix presenting the environmental variable selection
+	if(expl.rand.sel){
+		expl.var.new<-matrix(NA,ncol=nexpl.var.new,nrow=nsite)
+		sel.expl<-sample(1:ncol(expl.var))
+		
+		first<-seq(1,nexpl.var,by=nexpl.comb)
+		last<-seq(nexpl.comb,nexpl.var,by=nexpl.comb)
+		
+		for(i in 1:nexpl.var.new){
+			expl.var.new[,i]<-rowSums(expl.var[,sel.expl[first[i]:last[i]]])
+		}
+		
+		sel.expl.new<-sample(1:nexpl.var.new,nsp,replace=TRUE)
+		sd.expl.var.new<-apply(expl.var.new,2,sd)
+	}else{
+		expl.var.new<-matrix(NA,ncol=nsp,nrow=nsite)
+		
+		for(i in 1:nsp){
+			expl.var.new[,i]<-rowSums(expl.var[,fix.expl[i,]])
+			sel.expl.new<-1:nsp
+			if(sd.expl){
+				sd.expl.var.new<-apply(expl.var.new,2,sd)
+			}else{
+				sd.expl.var.new<-rep(1,ncol(expl.var.new))
+			}
+		}
+	}
+	
+	#CC# Fill up site by species matrix
+	sp<-1
+	for(i in 1:nbins){
+		if(sp.abund[i]>0){
+			for(j in 1:sp.abund[i]){
+				for(k in sample(min.bin[i]:max.bin[i],1)){
+					#CC# Add a weight and an error term to the selected environmental variable
+					error<-rnorm(nsite,mean=norm[1],sd=sd.expl.var.new[sel.expl.new[sp]]*norm[2])
+					if(is.null(weight)){
+						weight.rnd<-runif(1,range.weight[1],range.weight[2])
+						smpl.prob<-abs(expl.var.new[,sel.expl.new[sp]]*weight.rnd+error)
+						#CC# Consider the sign of the regression coefficient
+						if(weight.rnd>0){
+							smpl.prob<-smpl.prob/sum(smpl.prob)
+						}else{
+							smpl.prob<-(1/smpl.prob)/sum(1/smpl.prob)
+						}
+					}else{
+						smpl.prob<-abs(expl.var.new[,sel.expl.new[sp]]*weight[sp]+error)
+						#CC# Consider the sign of the regression coefficient
+						if(weight[sp]>0){
+							smpl.prob<-smpl.prob/sum(smpl.prob)
+						}else{
+							smpl.prob<-(1/smpl.prob)/sum(1/smpl.prob)
+						}
+					}
+					#CC# Build presence/absence data
+					if(binary){
+						smpl.site<-sample(nsite,k,replace=FALSE,prob=smpl.prob)
+						site.sp[smpl.site,sp]<-site.sp[smpl.site,sp]+1
+					}else{
+						smpl.site<-sample(nsite,k,replace=TRUE,prob=smpl.prob)
+						for(l in smpl.site){
+							site.sp[l,sp]<-site.sp[l,sp]+1
+						}
+					}
+				}
+				sp<-sp+1
+			}
+		}
+	}
+	
+	if(expl.rand.sel){
+		res<-list(site.sp,sel.expl)
+		names(res)<-c("site.sp","sel.expl")
+	}else{
+		res<-list(site.sp,fix.expl)
+		names(res)<-c("site.sp","sel.expl")
+	}
+	return(res)
+}
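
Note on the binning used above: the lines computing min.bin and max.bin encode
the doubling bins described for sp.abund (1, 2-3, 4-7, ...). The following is a
minimal sketch of that mapping only, not part of the commit; sad_bins() is a
hypothetical helper name introduced here for illustration.

```r
# Illustrative sketch only; sad_bins() is a hypothetical helper and is
# not part of the ordiconsensus package.
sad_bins <- function(sp.abund) {
  nbins <- length(sp.abund)
  data.frame(
    bin       = seq_len(nbins),
    n.species = sp.abund,            # number of species in each bin
    min.ind   = 2^(0:(nbins - 1)),   # smallest abundance in the bin: 1, 2, 4, ...
    max.ind   = 2^(1:nbins) - 1      # largest abundance in the bin: 1, 3, 7, ...
  )
}

# The example vector from the Details section
bins <- sad_bins(c(40, 20, 30))
print(bins)
```

Each species in a bin then receives a total abundance drawn between min.ind and
max.ind for that bin.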

Modified: pkg/ordiconsensus/man/ordiconsensus-package.Rd
===================================================================
--- pkg/ordiconsensus/man/ordiconsensus-package.Rd	2013-06-18 13:56:29 UTC (rev 2530)
+++ pkg/ordiconsensus/man/ordiconsensus-package.Rd	2013-06-18 19:10:23 UTC (rev 2531)
@@ -12,7 +12,7 @@
 \tabular{ll}{
 Package: \tab ordiconsensus\cr
 Type: \tab Package\cr
-Version: \tab 0.3-2\cr
+Version: \tab 0.4\cr
 Date: \tab 2012-11-12\cr
 License: \tab Unlimited\cr
 }

Added: pkg/ordiconsensus/man/simulSADcomm.Rd
===================================================================
--- pkg/ordiconsensus/man/simulSADcomm.Rd	                        (rev 0)
+++ pkg/ordiconsensus/man/simulSADcomm.Rd	2013-06-18 19:10:23 UTC (rev 2531)
@@ -0,0 +1,79 @@
+\name{simulSADcomm}
+\alias{simulSADcomm}
+\title{
+Simulate community matrix with constant SAD
+}
+\description{
+This function simulates community matrices with the same species abundance distribution following patterns defined by a set of explanatory variables. This function was used to simulate community matrices in Blanchet et al. (in press).}
+\usage{
+simulSADcomm(sp.abund, expl.var, expl.rand.sel = TRUE, nexpl.comb = 2, binary = FALSE, fix.expl = NULL, nsite = 50, weight = NULL, range.weight = c(0, 2), sd.expl = FALSE, norm = c(0, 1))
+}
+\arguments{
+  \item{sp.abund}{
+A vector defining the number of species in a bin. See Details for more information.
+}
+  \item{expl.var}{
+A matrix of explanatory variables to use to construct the species.
+}
+  \item{expl.rand.sel}{
+Logical. Whether explanatory variables should be randomly selected to construct species (TRUE) or a fixed combination should be given (FALSE). Default is TRUE.
+}
+  \item{nexpl.comb}{
+Numeric. The number of explanatory variables that will be combined together to construct the environmental variables. Default is 2.
+}
+  \item{binary}{
+Logical. Whether the site-by-species matrix is an abundance (FALSE) or a presence/absence (TRUE). Default is FALSE.
+}
+  \item{fix.expl}{
+A matrix that defines which combination of explanatory variables should be used to construct species. This argument is only active when expl.rand.sel=FALSE. See Details for more information.
+}
+  \item{nsite}{
+Numeric. Number of sites (rows) in the resulting community matrix. See Details.
+}
+  \item{weight}{
+A vector of regression coefficients used to weight each species. If NULL, weights are randomly selected by sampling a uniform distribution with a range defined by \code{range.weight}. Default is NULL.
+}
+  \item{range.weight}{
+A vector of length 2 giving the minimum and the maximum of a uniform distribution from which \code{weight} will be sampled. This will be used to weight the explanatory variable used to construct each species. Default is 0 and 2.
+}
+  \item{sd.expl}{
+Logical. This argument is only active when \code{expl.rand.sel} is FALSE (that is, when a fixed combination of explanatory variables is used to construct a community matrix). Whether the standard deviation of the Normal error added when constructing a species is a multiplier of the standard deviation of the deterministic portion of the newly created explanatory variable (TRUE) or the pure standard deviation (FALSE). Default is FALSE.
+}
+  \item{norm}{
+Vector of length 2 giving the mean and a multiplier of the standard deviation of the deterministic portion of a newly created explanatory variable. Default is mean = 0 and multiplier of the standard deviation of the new deterministic explanatory variable = 1.
+}
+}
+\details{
+The argument \code{sp.abund} defines the species-abundance distribution structure of the data following the binnings proposed by Gray et al. (2006). For example, if the vector is (40,20,30), it means that there will be 40 species with 1 individual, 20 with 2 or 3 individuals, and 30 with 4 to 7 individuals.
+
+The individuals are assigned to the sites according to the set of explanatory variables given in \code{expl.var}. It is possible for a site to end up with 0 individuals. Such sites will be included in the community matrix and should be dealt with \emph{a posteriori}.
+
+When \code{expl.rand.sel} is TRUE, the explanatory variables are randomly sampled (without replacement) when combining (adding) explanatory variables together. The number of explanatory variables must be a multiple of \code{nexpl.comb}.
+
+Error is added to a species by multiplying the explanatory variable used to construct the species by a weight and by adding a normally distributed error term to the same explanatory variable. An error term with a standard deviation equal to the standard deviation of the explanatory variable allows the explanatory variable to explain roughly 50\% of the species it constructed.
+
+\code{fix.expl} is a matrix that has as many rows as there are species and as many columns as \code{nexpl.comb} (number of explanatory variables to combine). The numbers in \code{fix.expl} are integers that refer to the columns of \code{expl.var}. When \code{fix.expl} is used, \code{nexpl.comb} becomes meaningless.
+
+If a presence-absence matrix is constructed (\code{binary}=TRUE), \code{sp.abund} should be constructed in such a way that no bin includes species with an abundance larger than the number of sites. If this is not the case, an error message will be sent. Within this constraint, if the maximum of the last bin (the one with the largest abundance) is larger than the number of sites, it will be automatically changed to the number of sites-1.
+
+This function was designed to do much more than the simulations generated in the work of Blanchet et al. (in press). It is meant to be used for future simulation studies.
+}
+\value{
+\item{site.sp}{The site (rows) by species (columns) community matrix generated.}
+\item{sel.expl}{A vector presenting the order in which the explanatory variables were used to construct each species. The order follows the order of the species.}
+}
+\references{
+Gray, J.S., A. Bjorgesaeter, and K.I. Ugland. 2006. On plotting species abundance distributions. \emph{Journal of Animal Ecology} \strong{75}:752--756.
+
+Blanchet, F.G., P. Legendre, J.A.C. Bergeron, F. He. In press. Consensus RDA across dissimilarity coefficients for canonical ordination of community composition data, \emph{Ecological Monographs}.
+}
+\author{
+F. Guillaume Blanchet
+}
+
+\examples{
+SAD<-c(1,2,4,6,4,2,1,0,0,0)
+expl<-matrix(rnorm(400),ncol=8)
+simulSADcomm(SAD,expl)
+}
+\keyword{ datagen }
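
Note on the allocation step documented above: for one species, the individuals
are spread over the sites with probabilities derived from a weighted
explanatory variable plus Normal error. The following is a minimal base-R
sketch of that single step under assumed values (one made-up explanatory
variable, a positive weight, and k = 6 individuals); it is not part of the
commit and does not call simulSADcomm.

```r
# Illustrative sketch only; variable names loosely mirror the committed code,
# and all values are made up for the example.
set.seed(1)
nsite <- 50
x   <- rnorm(nsite)                          # one (combined) explanatory variable
w   <- runif(1, 0, 2)                        # weight, as with range.weight = c(0, 2)
err <- rnorm(nsite, mean = 0, sd = sd(x))    # error with sd equal to the sd of x

# Sampling probabilities for a positive weight: absolute values, scaled to sum to 1
prob <- abs(x * w + err)
prob <- prob / sum(prob)

# Allocate k individuals to sites (with replacement, i.e. abundance data)
k <- 6
sites <- sample(nsite, k, replace = TRUE, prob = prob)
abund <- tabulate(sites, nbins = nsite)      # abundance of the species at each site
```

For presence/absence data the committed code samples the sites without
replacement instead, so that each selected site receives a single occurrence.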