[Genabel-commits] r897 - in pkg/GenABEL: . R man
noreply at r-forge.r-project.org
Fri Apr 27 17:43:49 CEST 2012
Author: yurii
Date: 2012-04-27 17:43:49 +0200 (Fri, 27 Apr 2012)
New Revision: 897
Added:
pkg/GenABEL/R/grammar.old.R
pkg/GenABEL/man/grammar.old.Rd
Removed:
pkg/GenABEL/man/grammar.Rd
Modified:
pkg/GenABEL/CHANGES.LOG
pkg/GenABEL/R/grammar.R
pkg/GenABEL/R/polygenic.R
pkg/GenABEL/generate_documentation.R
pkg/GenABEL/man/GenABEL-package.Rd
pkg/GenABEL/man/add.phdata.Rd
pkg/GenABEL/man/arrange_probabel_phe.Rd
pkg/GenABEL/man/blurGenotype.Rd
pkg/GenABEL/man/checkPackageVersionOnCRAN.Rd
pkg/GenABEL/man/del.phdata.Rd
pkg/GenABEL/man/estlambda.Rd
pkg/GenABEL/man/export.plink.Rd
pkg/GenABEL/man/extract.annotation.impute.Rd
pkg/GenABEL/man/extract.annotation.mach.Rd
pkg/GenABEL/man/findRelatives.Rd
pkg/GenABEL/man/generateOffspring.Rd
pkg/GenABEL/man/getLogLikelihoodGivenRelation.Rd
pkg/GenABEL/man/ibs.Rd
pkg/GenABEL/man/impute2databel.Rd
pkg/GenABEL/man/impute2mach.Rd
pkg/GenABEL/man/mach2databel.Rd
pkg/GenABEL/man/makeTransitionMatrix.Rd
pkg/GenABEL/man/mmscore.Rd
pkg/GenABEL/man/polygenic.Rd
pkg/GenABEL/man/polygenic_hglm.Rd
pkg/GenABEL/man/qtscore.Rd
pkg/GenABEL/man/recodeChromosome.Rd
pkg/GenABEL/man/reconstructNPs.Rd
pkg/GenABEL/man/sortmap.internal.Rd
Log:
Replaced 'grammar' function with new one, allowing 'raw', 'gc' and 'gamma' varieties of the method
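The three varieties are selected through a single 'method' argument. A minimal base-R sketch of that dispatch pattern is given below; the function name pick_method is a hypothetical stand-in and is not part of GenABEL.

```r
# Hypothetical stand-in illustrating the argument pattern of the new grammar();
# it only reports which variety would be selected.
pick_method <- function(method = c("gamma", "gc", "raw")) {
  # match.arg() returns the first choice when 'method' is not supplied
  # and stops with an error for values outside the allowed set
  method <- match.arg(method)
  method
}

pick_method()      # no argument given: the first choice, "gamma", is used
pick_method("gc")  # explicit choice
```

With this pattern the default does not need to be spelled out separately: the first element of the choices vector acts as the default.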
Modified: pkg/GenABEL/CHANGES.LOG
===================================================================
--- pkg/GenABEL/CHANGES.LOG 2012-04-25 10:00:55 UTC (rev 896)
+++ pkg/GenABEL/CHANGES.LOG 2012-04-27 15:43:49 UTC (rev 897)
@@ -1,5 +1,8 @@
-*** v. 1.7-1 (2012.01.11)
+*** v. 1.7-1 (2012.04.27)
+Replaced 'grammar' function with new one, allowing 'raw', 'gc'
+and 'gamma' varieties of the method.
+
Added option weight = "eVar" to 'ibs'. This uses empirical variance of
genotypes when estimating kinship matrix. This is useful when working with
such data as Arabidopsis.
Modified: pkg/GenABEL/R/grammar.R
===================================================================
--- pkg/GenABEL/R/grammar.R 2012-04-25 10:00:55 UTC (rev 896)
+++ pkg/GenABEL/R/grammar.R 2012-04-27 15:43:49 UTC (rev 897)
@@ -1,142 +1,110 @@
+#' GRAMMAR test for association in samples with genetic structure
+#'
+#' Fast approximate test for association between a trait and genetic polymorphism,
+#' in samples with genetic sub-structure (e.g. relatives). The function implements
+#' several varieties of GRAMMAR ('gamma','gc', and 'raw').
+#'
+#' With 'raw' argument,
+#' the original GRAMMAR (Aulchenko et al., 2007) is implemented. This method
+#' is conservative and generates biased estimates of regression coefficients.
+#'
+#' With 'gc' argument, the GRAMMAR-GC (Amin et al., 2007) is implemented.
+#' This method corrects the conservativeness of the test, but the Genomic Control (GC)
+#' lambda is by definition "1" and cannot serve as an indicator of goodness of
+#' the model; also, the estimates of regression coefficients are biased (the
+#' same as in 'raw' GRAMMAR).
+#'
+#' GRAMMAR-Gamma ('gamma' argument) solves these problems, producing
+#' the correct distribution of the test statistic, an interpretable value of GC Lambda,
+#' and unbiased estimates of the regression coefficients. Altogether, the
+#' default 'gamma' method is recommended for use.
+#'
+#' @param polyObject object returned by \code{\link{polygenic}} function
+#' @param data object of \code{\link{gwaa.data-class}}
+#' @param method to be used, one of 'gamma','gc', or 'raw'
+#' @param propPs proportion of non-corrected P-values used to estimate the inflation factor Lambda,
+#' passed directly to \code{\link{estlambda}}
+#' @param ... arguments passed to the function used for computations
+#' (\code{\link{qtscore}})
+#'
+#' @return Object of scan.gwaa-class
+#'
+#' @seealso \code{\link{polygenic}}, \code{\link{mmscore}}, \code{\link{qtscore}}
+#'
+#' @references
+#'
+#' GRAMMAR-Raw: Aulchenko YS, de Koning DJ, Haley C.
+#' Genomewide rapid association using mixed model and regression: a fast and
+#' simple method for genomewide pedigree-based quantitative trait loci
+#' association analysis. Genetics. 2007 Sep;177(1):577-85.
+#'
+#' GRAMMAR-GC: Amin N, van Duijn CM, Aulchenko YS.
+#' A genomic background based method for association analysis in related individuals.
+#' PLoS One. 2007 Dec 5;2(12):e1274.
+#'
+#' GRAMMAR-Gamma: Svisheva et al., submitted
+#'
+#' @examples
+#' # ge03d2 is a rather bad data set for demonstration,
+#' # because this is a population-based study
+#' data(ge03d2.clean)
+#' #take half for speed
+#' ge03d2.clean <- ge03d2.clean[1:300,]
+#' # estimate genomic kinship
+#' gkin <- ibs(ge03d2.clean,w="freq")
+#' # perform polygenic analysis
+#' h2ht <- polygenic(height ~ sex + age,kin=gkin,ge03d2.clean)
+#' h2ht$est
+#' # compute mmscore stats
+#' mm <- mmscore(h2ht,data=ge03d2.clean)
+#' # compute grammar-gc
+#' grGc <- grammar(h2ht,data=ge03d2.clean,method="gc")
+#' #compute grammar-gamma
+#' grGamma <- grammar(h2ht,data=ge03d2.clean,method="gamma")
+#' # compare lambdas
+#' lambda(mm)
+#' estlambda(mm[,"chi2.1df"])
+#' lambda(grGamma)
+#' estlambda(grGamma[,"chi2.1df"])
+#' lambda(grGc)
+#' estlambda(grGc[,"chi2.1df"])
+#' #compare top results
+#' summary(mm)
+#' summary(grGamma)
+#' summary(grGc)
+#'
+#' @author Gulnara Svisheva, Yurii Aulchenko
+#'
+#' @keywords htest
+#'
"grammar" <-
-function(h2object,data,snpsubset,idsubset,strata,times=1,quiet=FALSE,
- bcast=10,clambda=FALSE,propPs=1.0) {
- warning("Depricated. Using qtscore on environmental residuals (qtscore(h2object$pgres,...))\nwith clam = FLASE\n")
- out <- qtscore(h2object$pgres,data=data,snpsubset=snpsubset,
- idsubset=idsubset,strata=strata,times=times,quiet=quiet,bcast=bcast,
- clambda=clambda,propPs=propPs)
- return(out)
- if (is(data,"gwaa.data"))
- {
- checkphengen(data)
- data <- data@gtdata
- }
- if (class(h2object) != "polygenic")
- stop("wrong class of h2object (should be polygenic)")
- if (!is(data,"snp.data")) {
- stop("wrong data class: should be gwaa.data or snp.data")
- }
- if (!missing(snpsubset)) data <- data[,snpsubset]
- if (!missing(idsubset)) data <- data[idsubset,]
- if (missing(strata)) {nstra=1; strata <- rep(0,data@nids)}
-
- if (length(strata)!=data@nids) stop("Strata variable and the data do not match in length")
- if (any(is.na(strata))) stop("Strata variable contains NAs")
-
- tmeas <- h2object$measuredIDs
- resid <- h2object$residualY
- if (any(tmeas == FALSE)) {
- if (!quiet) warning(paste(sum(!tmeas),"people (out of",length(tmeas),") excluded because they have trait or covariate missing\n"),immediate. = TRUE)
- if (length(tmeas) != data@nids) stop("Dimension of the outcome and SNP data object are different")
- data <- data[tmeas,]
- strata <- strata[tmeas]
- resid <- resid[tmeas]
- }
- if (any(strata!=0)) {
- olev <- levels(as.factor(strata))
- nstra <- length(olev)
- tstr <- strata
- for (i in 0:(nstra-1)) tstr <- replace(tstr,(strata==olev[i+1]),i)
- strata <- tstr
- rm(tstr)
- }
- nstra <- length(levels(as.factor(strata)))
-
- lenn <- data@nsnps;
- tvar <- h2object$h2an$estimate[length(h2object$h2an$estimate)]
- h2object$InvSigma <- h2object$InvSigma*(1.-h2object$esth2)*tvar #sqrt(tvar)
- out <- list()
- for (j in c(1:(times+1*(times>1)))) {
- if (j>1) resid <- sample(resid,replace=FALSE)
- chi2 <- .C("grammar",as.raw(data@gtps),as.double(resid),as.double(h2object$InvSigma),as.integer(data@nids),as.integer(data@nsnps), as.integer(nstra), as.integer(strata), chi2 = double(7*data@nsnps), PACKAGE="GenABEL")$chi2
- if (any(data@chromosome=="X")) {
- ogX <- data[,data@chromosome=="X"]
- sxstra <- strata; sxstra[ogX@male==1] <- strata[ogX@male==1]+nstra
- chi2X <- .C("grammar",as.raw(ogX@gtps),as.double(resid),as.double(h2object$InvSigma),as.integer(ogX@nids),as.integer(ogX@nsnps), as.integer(nstra*2), as.integer(sxstra), chi2 = double(7*ogX@nsnps), PACKAGE="GenABEL")$chi2
- revec <- (data@chromosome=="X")
- revec <- rep(revec,6)
- chi2 <- replace(chi2,revec,chi2X)
- rm(ogX,chi2X,revec);gc(verbose=FALSE)
+ function(polyObject,data,method=c("gamma","gc","raw"), propPs=1.0, ... )
+{
+ method <- match.arg(method)
+ if (method == "gamma") {
+ out <- qtscore(polyObject$pgresidualY,data=data,clambda=TRUE, ... )
+ # correct test and beta values
+ out@results[,"chi2.1df"] <- out[,"chi2.1df"]/polyObject$grammarGamma$Test
+ out@results[,"effB"] <- out[,"effB"]/polyObject$grammarGamma$Beta
+ # recompute p-values
+ out@results[,"P1df"] <- pchisq(out[,"chi2.1df"],df=1,lower.tail=FALSE)
+ # recompute Lambda
+ out@lambda <- estlambda(out[,"chi2.1df"],plot=FALSE,prop=propPs)
+ if (out@lambda$estimate <= 1) {
+ warning("estimate of Lambda < 1, constraining to 1")
+ out@lambda$estimate <- 1.0
+ out@lambda$se <- NA
}
- if (j == 1) {
- chi2.1df <- chi2[1:lenn];
- chi2.2df <- chi2[(lenn+1):(2*lenn)];
- out$chi2.1df <- chi2.1df
- out$chi2.2df <- chi2.2df
- actdf <- chi2[(2*lenn+1):(3*lenn)];
- lambda <- list()
- if (is.logical(clambda)) {
- if (lenn<10) {
- warning("no. observations < 10; Lambda set to 1")
- lambda$estimate <- 1.0
- lambda$se <- NA
- } else {
- if (lenn<100) warning("Number of observations < 100, Lambda estimate is unreliable")
- lambda <- estlambda(chi2.1df,plot=FALSE,prop=propPs)
- if (lambda$estimate<1.0 && clambda==TRUE) {
- warning("Lambda estimated < 1, set to 1")
- lambda$estimate <- 1.0
- lambda$se <- NA
- }
- }
- } else {
- if (is.numeric(clambda)) {
- lambda$estimate <- clambda
- lambda$se <- NA
- } else if (is.list(clambda)) {
- if (any(is.na(match(c("estimate","se"),names(clambda)))))
- stop("when clambda is list, should contain estimate and se")
- lambda <- clambda
- lambda$se <- NA
- } else {
- stop("clambda should be logical, numeric, or list")
- }
- }
- chi2.c1df <- chi2.1df/lambda$estimate
- effB <- chi2[(3*lenn+1):(lenn*4)]
-# effAB <- chi2[(4*lenn+1):(lenn*5)]
-# effBB <- chi2[(5*lenn+1):(lenn*6)]
- if (times>1) {
- pr.1df <- rep(0,lenn)
-# pr.2df <- rep(0,lenn)
- pr.c1df <- rep(0,lenn)
- }
- } else {
- th1 <- max(chi2[1:lenn])
- pr.1df <- pr.1df + 1*(chi2.1df < th1)
-# pr.2df <- pr.2df + 1*(chi2.2df < max(chi2[(lenn+1):(2*lenn)]))
- pr.c1df <- pr.c1df + 1*(chi2.c1df < th1)
- if (!quiet && ((j-1)/bcast == round((j-1)/bcast))) {
- cat("\b\b\b\b\b\b",round((100*(j-1)/times),digits=2),"%",sep="")
- flush.console()
- }
- }
- }
- if (times > bcast) cat("\n")
-
- if (times>1) {
- out$P1df <- pr.1df/times
- out$P1df <- replace(out$P1df,(out$P1df==0),1/(1+times))
-# out$P2df <- pr.2df/times
-# out$P2df <- replace(out$P2df,(out$P2df==0),1/(1+times))
- out$Pc1df <- pr.c1df/times
-# out$Pc1df <- replace(out$Pc1df,(out$Pc1df==0),1/(1+times))
+ } else if (method == "gc") {
+ out <- qtscore(polyObject$pgresidualY,data=data,clambda=FALSE, propPs=propPs, ... )
+ } else if (method == "raw") {
+ out <- qtscore(polyObject$pgresidualY,data=data,clambda=TRUE, propPs=propPs, ... )
} else {
- out$P1df <- pchisq(chi2.1df,1,lower=F)
-# out$P2df <- pchisq(chi2.2df,actdf,lower=F)
- out$Pc1df <- pchisq(chi2.c1df,1,lower=F)
+ stop("method should be one of 'gamma','gc', or 'raw'")
}
- out$lambda <- lambda
- out$effB <- effB
-# out$effAB <- effAB
-# out$effBB <- effBB
- out$snpnames <- data@snpnames
- out$map <- data@map
- out$chromosome <- data@chromosome
- out$idnames <- data@idnames
- out$formula <- match.call()
- out$family <- paste("score test for association with trait type") #,trait.type)
- out$N <- chi2[(6*lenn+1):(lenn*7)]
- class(out) <- "scan.gwaa"
- out
-}
+ # set uncorrected stats to NA to avoid confusion
+ out@results[,"effAB"] <- out@results[,"effBB"] <- out@results[,"chi2.2df"] <-
+ out@results[,"P2df"] <- NA
+ # return results
+ return(out);
+}
\ No newline at end of file
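The 'gamma' branch in the hunk above rescales the score statistics and effect estimates by two precomputed factors, then recomputes P-values and Lambda. The base-R sketch below mirrors that arithmetic on simulated numbers. The helper name gamma_correct, the simulated inputs, the factor values 0.8 and 0.9, and the median-based Lambda estimate are all assumptions made for illustration; GenABEL's own estlambda may use a different estimator.

```r
# Hypothetical helper mirroring the arithmetic of the 'gamma' branch.
# chi2, beta:           uncorrected 1-df test statistics and effect estimates
# gammaTest, gammaBeta: correction factors, stand-ins for
#                       polyObject$grammarGamma$Test and polyObject$grammarGamma$Beta
gamma_correct <- function(chi2, beta, gammaTest, gammaBeta) {
  chi2c <- chi2 / gammaTest   # corrected test statistic
  betac <- beta / gammaBeta   # corrected effect estimate
  # P-values recomputed from the corrected statistic
  p <- pchisq(chi2c, df = 1, lower.tail = FALSE)
  # median-based inflation factor (an assumption, not necessarily what
  # estlambda() does), constrained to be at least 1 as in the hunk above
  lambda <- max(1, median(chi2c) / qchisq(0.5, df = 1))
  list(chi2 = chi2c, beta = betac, P1df = p, lambda = lambda)
}

set.seed(1)
chi2 <- 0.8 * rchisq(1000, df = 1)  # simulated, deliberately deflated statistics
beta <- rnorm(1000, sd = 0.1)       # simulated effect estimates
res  <- gamma_correct(chi2, beta, gammaTest = 0.8, gammaBeta = 0.9)
str(res)
```

Note that the effect estimates are divided by the Beta factor and the test statistics by the Test factor; the two corrections are independent of each other.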
Added: pkg/GenABEL/R/grammar.old.R
===================================================================
--- pkg/GenABEL/R/grammar.old.R (rev 0)
+++ pkg/GenABEL/R/grammar.old.R 2012-04-27 15:43:49 UTC (rev 897)
@@ -0,0 +1,143 @@
+"grammar.old" <-
+ function(h2object,data,snpsubset,idsubset,strata,times=1,quiet=FALSE,
+ bcast=10,clambda=FALSE,propPs=1.0)
+{
+ warning("Deprecated. Using qtscore on environmental residuals (qtscore(h2object$pgres,...))\nwith clambda = FALSE\n")
+ out <- qtscore(h2object$pgres,data=data,snpsubset=snpsubset,
+ idsubset=idsubset,strata=strata,times=times,quiet=quiet,bcast=bcast,
+ clambda=clambda,propPs=propPs)
+ return(out)
+ if (is(data,"gwaa.data"))
+ {
+ checkphengen(data)
+ data <- data@gtdata
+ }
+ if (class(h2object) != "polygenic")
+ stop("wrong class of h2object (should be polygenic)")
+ if (!is(data,"snp.data")) {
+ stop("wrong data class: should be gwaa.data or snp.data")
+ }
+ if (!missing(snpsubset)) data <- data[,snpsubset]
+ if (!missing(idsubset)) data <- data[idsubset,]
+ if (missing(strata)) {nstra=1; strata <- rep(0,data@nids)}
+
+ if (length(strata)!=data@nids) stop("Strata variable and the data do not match in length")
+ if (any(is.na(strata))) stop("Strata variable contains NAs")
+
+ tmeas <- h2object$measuredIDs
+ resid <- h2object$residualY
+ if (any(tmeas == FALSE)) {
+ if (!quiet) warning(paste(sum(!tmeas),"people (out of",length(tmeas),") excluded because they have trait or covariate missing\n"),immediate. = TRUE)
+ if (length(tmeas) != data@nids) stop("Dimension of the outcome and SNP data object are different")
+ data <- data[tmeas,]
+ strata <- strata[tmeas]
+ resid <- resid[tmeas]
+ }
+ if (any(strata!=0)) {
+ olev <- levels(as.factor(strata))
+ nstra <- length(olev)
+ tstr <- strata
+ for (i in 0:(nstra-1)) tstr <- replace(tstr,(strata==olev[i+1]),i)
+ strata <- tstr
+ rm(tstr)
+ }
+ nstra <- length(levels(as.factor(strata)))
+
+ lenn <- data@nsnps;
+ tvar <- h2object$h2an$estimate[length(h2object$h2an$estimate)]
+ h2object$InvSigma <- h2object$InvSigma*(1.-h2object$esth2)*tvar #sqrt(tvar)
+ out <- list()
+ for (j in c(1:(times+1*(times>1)))) {
+ if (j>1) resid <- sample(resid,replace=FALSE)
+ chi2 <- .C("grammar",as.raw(data@gtps),as.double(resid),as.double(h2object$InvSigma),as.integer(data@nids),as.integer(data@nsnps), as.integer(nstra), as.integer(strata), chi2 = double(7*data@nsnps), PACKAGE="GenABEL")$chi2
+ if (any(data@chromosome=="X")) {
+ ogX <- data[,data@chromosome=="X"]
+ sxstra <- strata; sxstra[ogX@male==1] <- strata[ogX@male==1]+nstra
+ chi2X <- .C("grammar",as.raw(ogX@gtps),as.double(resid),as.double(h2object$InvSigma),as.integer(ogX@nids),as.integer(ogX@nsnps), as.integer(nstra*2), as.integer(sxstra), chi2 = double(7*ogX@nsnps), PACKAGE="GenABEL")$chi2
+ revec <- (data@chromosome=="X")
+ revec <- rep(revec,6)
+ chi2 <- replace(chi2,revec,chi2X)
+ rm(ogX,chi2X,revec);gc(verbose=FALSE)
+ }
+ if (j == 1) {
+ chi2.1df <- chi2[1:lenn];
+ chi2.2df <- chi2[(lenn+1):(2*lenn)];
+ out$chi2.1df <- chi2.1df
+ out$chi2.2df <- chi2.2df
+ actdf <- chi2[(2*lenn+1):(3*lenn)];
+ lambda <- list()
+ if (is.logical(clambda)) {
+ if (lenn<10) {
+ warning("no. observations < 10; Lambda set to 1")
+ lambda$estimate <- 1.0
+ lambda$se <- NA
+ } else {
+ if (lenn<100) warning("Number of observations < 100, Lambda estimate is unreliable")
+ lambda <- estlambda(chi2.1df,plot=FALSE,prop=propPs)
+ if (lambda$estimate<1.0 && clambda==TRUE) {
+ warning("Lambda estimated < 1, set to 1")
+ lambda$estimate <- 1.0
+ lambda$se <- NA
+ }
+ }
+ } else {
+ if (is.numeric(clambda)) {
+ lambda$estimate <- clambda
+ lambda$se <- NA
+ } else if (is.list(clambda)) {
+ if (any(is.na(match(c("estimate","se"),names(clambda)))))
+ stop("when clambda is list, should contain estimate and se")
+ lambda <- clambda
+ lambda$se <- NA
+ } else {
+ stop("clambda should be logical, numeric, or list")
+ }
+ }
+ chi2.c1df <- chi2.1df/lambda$estimate
+ effB <- chi2[(3*lenn+1):(lenn*4)]
+# effAB <- chi2[(4*lenn+1):(lenn*5)]
+# effBB <- chi2[(5*lenn+1):(lenn*6)]
+ if (times>1) {
+ pr.1df <- rep(0,lenn)
+# pr.2df <- rep(0,lenn)
+ pr.c1df <- rep(0,lenn)
+ }
+ } else {
+ th1 <- max(chi2[1:lenn])
+ pr.1df <- pr.1df + 1*(chi2.1df < th1)
+# pr.2df <- pr.2df + 1*(chi2.2df < max(chi2[(lenn+1):(2*lenn)]))
+ pr.c1df <- pr.c1df + 1*(chi2.c1df < th1)
+ if (!quiet && ((j-1)/bcast == round((j-1)/bcast))) {
+ cat("\b\b\b\b\b\b",round((100*(j-1)/times),digits=2),"%",sep="")
+ flush.console()
+ }
+ }
+ }
+ if (times > bcast) cat("\n")
+
+ if (times>1) {
+ out$P1df <- pr.1df/times
+ out$P1df <- replace(out$P1df,(out$P1df==0),1/(1+times))
+# out$P2df <- pr.2df/times
+# out$P2df <- replace(out$P2df,(out$P2df==0),1/(1+times))
+ out$Pc1df <- pr.c1df/times
+# out$Pc1df <- replace(out$Pc1df,(out$Pc1df==0),1/(1+times))
+ } else {
+ out$P1df <- pchisq(chi2.1df,1,lower=F)
+# out$P2df <- pchisq(chi2.2df,actdf,lower=F)
+ out$Pc1df <- pchisq(chi2.c1df,1,lower=F)
+ }
+ out$lambda <- lambda
+ out$effB <- effB
+# out$effAB <- effAB
+# out$effBB <- effBB
+ out$snpnames <- data@snpnames
+ out$map <- data@map
+ out$chromosome <- data@chromosome
+ out$idnames <- data@idnames
+ out$formula <- match.call()
+ out$family <- paste("score test for association with trait type") #,trait.type)
+ out$N <- chi2[(6*lenn+1):(lenn*7)]
+ class(out) <- "scan.gwaa"
+ out
+}
Property changes on: pkg/GenABEL/R/grammar.old.R
___________________________________________________________________
Added: svn:mime-type
+ text/plain
Modified: pkg/GenABEL/R/polygenic.R
===================================================================
--- pkg/GenABEL/R/polygenic.R 2012-04-25 10:00:55 UTC (rev 896)
+++ pkg/GenABEL/R/polygenic.R 2012-04-27 15:43:49 UTC (rev 897)
@@ -100,8 +100,8 @@
#' \item{pgresidualY}{Environmental residuals from analysis, based on covariate effects
#' and predicted breeding value.
#' }
-#' \item{grresidualY}{GRAMMAR+ trait transformation}
-#' \item{grammarGamma}{list with GRAMMAR+ correction factors}
+#' \item{grresidualY}{GRAMMAR-transform trait transformation}
+#' \item{grammarGamma}{list with GRAMMAR-gamma correction factors}
#' \item{InvSigma}{Inverse of the variance-covariance matrix, computed at the
#' MLEs -- these are used in \code{\link{mmscore}} and \code{\link{grammar}}
#' functions.}
Modified: pkg/GenABEL/generate_documentation.R
===================================================================
--- pkg/GenABEL/generate_documentation.R 2012-04-25 10:00:55 UTC (rev 896)
+++ pkg/GenABEL/generate_documentation.R 2012-04-27 15:43:49 UTC (rev 897)
@@ -13,6 +13,7 @@
"GenABEL-package.R",
"generateOffspring.R",
"getLogLikelihoodGivenRelation.R",
+ "grammar.R",
"ibs.R",
"impute2databel.R",
"impute2mach.R",
@@ -31,7 +32,7 @@
#"summary.scan.gwaa.R"
)
-library(roxygen)
+library(roxygen2)
setwd("R")
unlink("GenABEL",recursive=TRUE)
package.skeleton("GenABEL",code_files=roxy_files)
Modified: pkg/GenABEL/man/GenABEL-package.Rd
===================================================================
--- pkg/GenABEL/man/GenABEL-package.Rd 2012-04-25 10:00:55 UTC (rev 896)
+++ pkg/GenABEL/man/GenABEL-package.Rd 2012-04-27 15:43:49 UTC (rev 897)
@@ -1,203 +1,235 @@
\name{GenABEL-package}
-\title{GenABEL: an R package for Genome Wide Association Analysis...}
-\description{GenABEL: an R package for Genome Wide Association Analysis}
-\details{Genome-wide association (GWA) analysis is a tool of choice
-for identification of genes for complex traits. Effective
-storage, handling and analysis of GWA data represent a
-challenge to modern computational genetics. GWA studies
-generate large amount of data: hundreds of thousands of
-single nucleotide polymorphisms (SNPs) are genotyped in
-hundreds or thousands of patients and controls. Data on
-each SNP undergoes several types of analysis:
-characterization of frequency distribution, testing of
-Hardy-Weinberg equilibrium, analysis of association between
-single SNPs and haplotypes and different traits, and so on.
-Because SNP genotypes in dense marker sets are correlated,
-significance testing in GWA analysis is preferably performed
-using computationally intensive permutation test procedures,
-further increasing the computational burden.
+\alias{GenABEL}
+\alias{GenABEL-package}
+\title{GenABEL: an R package for Genome Wide Association Analysis}
+\usage{
+ GenABEL-package()
+}
+\description{
+ Genome-wide association (GWA) analysis is a tool of
+ choice for identification of genes for complex traits.
+ Effective storage, handling and analysis of GWA data
+ represent a challenge to modern computational genetics.
+ GWA studies generate large amount of data: hundreds of
+ thousands of single nucleotide polymorphisms (SNPs) are
+ genotyped in hundreds or thousands of patients and
+ controls. Data on each SNP undergoes several types of
+ analysis: characterization of frequency distribution,
+ testing of Hardy-Weinberg equilibrium, analysis of
+ association between single SNPs and haplotypes and
+ different traits, and so on. Because SNP genotypes in
+ dense marker sets are correlated, significance testing in
+ GWA analysis is preferably performed using
+ computationally intensive permutation test procedures,
+ further increasing the computational burden.
+}
+\details{
+ To make GWA analysis possible on standard desktop
+ computers we developed GenABEL library which addresses
+ the following objectives:
-To make GWA analysis possible on standard desktop computers
-we developed GenABEL library which addresses the following
-objectives:
+ (1) Minimization of the amount of random-access memory
+ (RAM) used and the time required for data transactions.
+ For this, we developed an effective data storage and
+ manipulation model.
-(1) Minimization of the amount of rapid access memory (RAM) used
-and the time required for data transactions. For this, we developed
-an effective data storage and manipulation model.
+ (2) Maximization of the throughput of GWA analysis. For
+ this, we designed optimal fast procedures for specific
+ genetic tests.
-(2) Maximization of the throughput of GWA analysis. For this,
-we designed optimal fast procedures for specific genetic tests.
+ Embedding GenABEL into R environment allows for easy data
+ characterization, exploration and presentation of the
+ results and gives access to a wide range of standard and
+ special statistical analysis functions available in base
+ R and specific R packages, such as "haplo.stats",
+ "genetics", etc.
-Embedding GenABEL into R environment allows for easy data
-characterization, exploration and presentation of the results
-and gives access to a wide range of standard and special
-statistical analysis functions available in base R and specific
-R packages, such as "haplo.stats", "genetics", etc.
+ To see (more or less complete) functionality of GenABEL,
+ try running
-To see (more or less complete) functionality of GenABEL, try running
+ demo(ge03d2).
-demo(ge03d2).
+ Another demo of interest can be run with demo(srdta).
+ Depending on your user privileges in Windows, it may well
+ not run. In this case, try demo(srdtawin).
-Other demo of interest could be run with demo(srdta).
-Depending on your user priveleges in Windows, it may well not run.
-In this case, try demo(srdtawin).
+ The most important functions and classes are:
-The most important functions and classes are:
+ For converting data from other formats, see
-For converting data from other formats, see
+ \code{\link{convert.snp.illumina}}
+ (Illumina/Affymetrix-like format). This is our preferred
+ converting function, very extensively tested. Other
+ conversion functions include:
+ \code{\link{convert.snp.text}} (conversion from
+ human-readable GenABEL format),
+ \code{\link{convert.snp.ped}} (Linkage, Merlin, Mach, and
+ similar files), \code{\link{convert.snp.mach}}
+ (Mach-format), \code{\link{convert.snp.tped}} (from PLINK
+ TPED format), \code{\link{convert.snp.affymetrix}}
+ (BRML-style files).
-\code{\link{convert.snp.illumina}} (Illumina/Affymetrix-like format). This is
-our preferred converting function, very extensively tested. Other conversion
-functions include:
-\code{\link{convert.snp.text}} (conversion from human-readable GenABEL format),
-\code{\link{convert.snp.ped}} (Linkage, Merlin, Mach, and similar files),
-\code{\link{convert.snp.mach}} (Mach-format),
-\code{\link{convert.snp.tped}} (from PLINK TPED format),
-\code{\link{convert.snp.affymetrix}} (BRML-style files).
+ For converting of GenABEL's data to other formats, see
+ \code{\link{export.merlin}} (MERLIN and MACH formats),
+ \code{\link{export.impute}} (IMPUTE, SNPTEST and CHIAMO
+ formats), \code{\link{export.plink}} (PLINK format, also
+ exports phenotypic data).
-For converting of GenABEL's data to other formats, see
-\code{\link{export.merlin}} (MERLIN and MACH formats),
-\code{\link{export.impute}} (IMPUTE, SNPTEST and CHIAMO formats),
-\code{\link{export.plink}} (PLINK format, also exports phenotypic data).
+ To load the data, see \code{\link{load.gwaa.data}}.
-To load the data, see \code{\link{load.gwaa.data}}.
+ For conversion to DatABEL format (used by ProbABEL and
+ some other GenABEL suite packages), see
+ \code{\link{impute2databel}}, \code{\link{impute2mach}},
+ \code{\link{mach2databel}}.
-For conversion to DatABEL format (used by ProbABEL and some other
-GenABEL suite packages), see
-\code{\link{impute2databel}},
-\code{\link{impute2mach}},
-\code{\link{mach2databel}}.
+ For data management and manipulations see
+ \code{\link{merge.gwaa.data}},
+ \code{\link{merge.snp.data}},
+ \code{\link{gwaa.data-class}},
+ \code{\link{snp.data-class}}, \code{\link{snp.names}},
+ \code{\link{snp.subset}}.
-For data managment and manipulations see
-\code{\link{merge.gwaa.data}},
-\code{\link{merge.snp.data}},
-\code{\link{gwaa.data-class}},
-\code{\link{snp.data-class}},
-\code{\link{snp.names}},
-\code{\link{snp.subset}}.
+ For merging extra data to the phenotypic part of
+ \code{\link{gwaa.data-class}} object, see
+ \code{\link{add.phdata}}.
-For merging extra data to the phenotypic part of \code{\link{gwaa.data-class}} object,
-see \code{\link{add.phdata}}.
+ For traits manipulations see \code{\link{ztransform}}
+ (transformation to standard Normal),
+ \code{\link{rntransform}} (rank-transformation to
+ normality), \code{\link{npsubtreated}} (non-parametric
+ routine to "impute" trait's values in those medicated).
-For traits manipulations see
-\code{\link{ztransform}} (transformation to standard Normal),
-\code{\link{rntransform}} (rank-transformation to normality),
-\code{\link{npsubtreated}} (non-parametric routine to "impute" trait's values in these medicated).
+ For quality control, see \code{\link{check.trait}},
+ \code{\link{check.marker}}, \code{\link{HWE.show}},
+ \code{\link{summary.snp.data}},
+ \code{\link{perid.summary}}, \code{\link{ibs}},
+ \code{\link{hom}}.
+ For fast analysis function, see
+ \code{\link{scan.gwaa-class}}, \code{\link{ccfast}},
+ \code{\link{qtscore}}, \code{\link{mmscore}},
+ \code{\link{egscore}}, \code{\link{ibs}},
+ \code{\link{r2fast}} (estimate linkage disequilibrium
+ using R2), \code{\link{dprfast}} (estimate linkage
+ disequilibrium using D'), \code{\link{rhofast}} (estimate
+ linkage disequilibrium using 'rho')
-For quality control, see
-\code{\link{check.trait}},
-\code{\link{check.marker}},
-\code{\link{HWE.show}},
-\code{\link{summary.snp.data}},
-\code{\link{perid.summary}},
-\code{\link{ibs}},
-\code{\link{hom}}.
+ For specific tools facilitating analysis of the data with
+ stratification (population stratification or (possibly
+ unknown) pedigree structure), see \code{\link{qtscore}}
+ (implements basic Genomic Control), \code{\link{ibs}}
+ (computations of IBS / genomic IBD),
+ \code{\link{egscore}} (stratification adjustment
+ following Price et al.), \code{\link{polygenic}}
+ (heritability analysis), \code{\link{polygenic_hglm}}
+ (another function for heritability analysis),
+ \code{\link{mmscore}} (score test of Chen and Abecasis),
+ \code{\link{grammar}} (grammar test of Aulchenko et al.).
-For fast analysis function, see
-\code{\link{scan.gwaa-class}},
-\code{\link{ccfast}},
-\code{\link{qtscore}},
-\code{\link{mmscore}},
-\code{\link{egscore}},
-\code{\link{ibs}},
-\code{\link{r2fast}} (estimate linkage disequilibrium using R2),
-\code{\link{dprfast}} (estimate linkage disequilibrium using D'),
-\code{\link{rhofast}} (estimate linkage disequilibrium using 'rho')
+ For functions facilitating construction of tables for
+ your manuscript, see \code{\link{descriptives.marker}},
+ \code{\link{descriptives.trait}},
+ \code{\link{descriptives.scan}}.
-For specific tools facilitating analysis of the data with stratification
-(population stratification or (possibly unknown) pedigree structure), see
-\code{\link{qtscore}} (implements basic Genomic Control),
-\code{\link{ibs}} (computations of IBS / genomic IBD),
-\code{\link{egscore}} (stratification adjustment following Price et al.),
-\code{\link{polygenic}} (heritability analysis),
-\code{\link{polygenic_hglm}} (another function for heritability analysis),
-\code{\link{mmscore}} (score test of Chen and Abecasis),
-\code{\link{grammar}} (grammar test of Aulchenko et al.).
+ For functions reconstructing relationships from genomic
+ data, see \code{\link{findRelatives}},
+ \code{\link{reconstructNPs}}.
-For functions facilitating construction of tables for your manuscript, see
-\code{\link{descriptives.marker}},
-\code{\link{descriptives.trait}},
-\code{\link{descriptives.scan}}.
+ For meta-analysis and related, see help on
+ \code{\link{formetascore}}.
-For functions recunstructing relationships from genomic data,
-see
-\code{\link{findRelatives}}, \code{\link{reconstructNPs}}.
+ For link to WEB databases, see \code{\link{show.ncbi}}.
-For meta-analysis and related, see help on
-\code{\link{formetascore}}.
+ For interfaces to other packages and standard R
+ functions, also for 2D scans, see \code{\link{scan.glm}},
+ \code{\link{scan.glm.2D}}, \code{\link{scan.haplo}},
+ \code{\link{scan.haplo.2D}},
+ \code{\link{scan.gwaa-class}},
+ \code{\link{scan.gwaa.2D-class}}.
-For link to WEB databases, see
-\code{\link{show.ncbi}}.
+ For graphical facilities, see
+ \code{\link{plot.scan.gwaa}},
+ \code{\link{plot.check.marker}}.
+}
+\examples{
+\dontrun{
+demo(ge03d2)
+demo(srdta)
+demo(srdtawin)
+}
+}
+\author{
+ Yurii Aulchenko et al. (see help pages for specific
+ functions)
+}
+\references{
+ If you use GenABEL package in your analysis, please cite
+ the following work:
-For interfaces to other packages and standard R functions,
-also for 2D scans, see
-\code{\link{scan.glm}},
-\code{\link{scan.glm.2D}},
-\code{\link{scan.haplo}},
-\code{\link{scan.haplo.2D}},
-\code{\link{scan.gwaa-class}},
-\code{\link{scan.gwaa.2D-class}}.
+ Aulchenko Y.S., Ripke S., Isaacs A., van Duijn C.M.
+ GenABEL: an R package for genome-wide association
+ analysis. Bioinformatics. 2007 23(10):1294-6.
-For graphical facilities, see
-\code{\link{plot.scan.gwaa}},
-\code{\link{plot.check.marker}}.}
-\alias{GenABEL-package}
-\alias{GenABEL}
-\author{Yurii Aulchenko et al.
-(see help pages for specific functions)}
-\references{If you use GenABEL package in your analysis, please cite the following work:
+ If you used \code{\link{polygenic}}, please cite
-Aulchenko Y.S., Ripke S., Isaacs A., van Duijn C.M. GenABEL: an R package
-for genome-wide association analysis. Bioinformatics. 2007 23(10):1294-6.
+ Thompson EA, Shaw RG (1990) Pedigree analysis for
+ quantitative traits: variance components without matrix
+ inversion. Biometrics 46, 399-413.
-If you used \code{\link{polygenic}}, please cite
+ If you used environmental residuals from
+ \code{\link{polygenic}} for \code{\link{qtscore}}, used
+ GRAMMAR and/or GRAMMAS analysis, please cite
-Thompson EA, Shaw RG (1990) Pedigree analysis for quantitative
-traits: variance components without matrix inversion. Biometrics
-46, 399-413.
+ Aulchenko YS, de Koning DJ, Haley C. Genomewide rapid
+ association using mixed model and regression: a fast and
+ simple method for genome-wide pedigree-based quantitative
+ trait loci association analysis. Genetics. 2007
+ 177(1):577-85.
-If you used environmental residuals from \code{\link{polygenic}} for
-\code{\link{qtscore}}, used GRAMMAR and/or GRAMMAS analysis, please cite
+ Amin N, van Duijn CM, Aulchenko YS. A genomic background
+ based method for association analysis in related
+ individuals. PLoS ONE. 2007 Dec 5;2(12):e1274.
-Aulchenko YS, de Koning DJ, Haley C. Genomewide rapid association using mixed model
-and regression: a fast and simple method for genome-wide pedigree-based quantitative
-trait loci association analysis. Genetics. 2007 177(1):577-85.
+ If you used \code{\link{mmscore}}, please cite
-Amin N, van Duijn CM, Aulchenko YS. A genomic background based method for
-association analysis in related individuals. PLoS ONE. 2007 Dec 5;2(12):e1274.
+ Chen WM, Abecasis GR. Family-based association tests for
+ genome-wide association scans. Am J Hum Genet. 2007
+ Nov;81(5):913-26.
-If you used \code{\link{mmscore}}, please cite
+ For exact HWE (used in \code{\link{summary.snp.data}}),
+ please cite:
-Chen WM, Abecasis GR. Family-based association tests for genome-wide association
-scans. Am J Hum Genet. 2007 Nov;81(5):913-26.
+ Wigginton G.E., Cutler D.J., Abecasis G.R. A note on
+ exact tests of Hardy-Weinberg equilibrium. Am J Hum
+ Genet. 2005 76: 887-893.
-For exact HWE (used in \code{\link{summary.snp.data}}), please cite:
+ For haplo.stats (\code{\link{scan.haplo}},
+ \code{\link{scan.haplo.2D}}), please cite:
-Wigginton G.E., Cutler D.J., Abecasis G.R. A note on exact tests of
-Hardy-Weinberg equilibrium. Am J Hum Genet. 2005 76: 887-893.
+ Schaid DJ, Rowland CM, Tines DE, Jacobson RM, Poland GA.
+ Score tests for association between traits and haplotypes
+ when linkage phase is ambiguous. Am J Hum Genet. 2002
+ 70:425-434.
-For haplo.stats (\code{\link{scan.haplo}}, \code{\link{scan.haplo.2D}}), please cite:
+ For fast LD computations (function \code{\link{dprfast}},
+ \code{\link{r2fast}}), please cite:
-Schaid DJ, Rowland CM, Tines DE, Jacobson RM, Poland GA. Score tests for
-association between traits and haplotypes when linkage phase is ambiguous.
-Am J Hum Genet. 2002 70:425-434.
+ Hao K, Di X, Cawley S. LdCompare: rapid computation of
+ single- and multiple-marker r2 and genetic coverage.
+ Bioinformatics. 2006 23:252-254.
[TRUNCATED]
To get the complete diff run:
svnlook diff /svnroot/genabel -r 897