[Genabel-commits] r789 - in pkg/PredictABEL: . R man

noreply at r-forge.r-project.org
Fri Sep 30 12:36:11 CEST 2011


Author: lckarssen
Date: 2011-09-30 12:36:10 +0200 (Fri, 30 Sep 2011)
New Revision: 789

Added:
   pkg/PredictABEL/R/simulatedDataset.R
   pkg/PredictABEL/man/simulatedDataset.Rd
Removed:
   pkg/PredictABEL/R/simulation_codes.R
Modified:
   pkg/PredictABEL/DESCRIPTION
   pkg/PredictABEL/R/PredictABEL.R
Log:
Fixed a couple of minor build problems in PredictABEL.

Modified: pkg/PredictABEL/DESCRIPTION
===================================================================
--- pkg/PredictABEL/DESCRIPTION	2011-09-30 10:02:34 UTC (rev 788)
+++ pkg/PredictABEL/DESCRIPTION	2011-09-30 10:36:10 UTC (rev 789)
@@ -1,27 +1,23 @@
 Package: PredictABEL
 Title: Assessment of risk prediction models
-Version: 1.1.1
-Date: 2011-02-09
+Version: 1.2
+Date: 2011-09-30
 Author: Suman Kundu, Yurii S. Aulchenko, A. Cecile J.W. Janssens
-Maintainer: Suman Kundu <s.kundu at erasmusmc.nl>, 
-A. Cecile J.W. Janssens <a.janssens at erasmusmc.nl>
-Depends: R (>= 2.9.0), Hmisc, ROCR, epitools, PBSmodelling
+Maintainer: Suman Kundu <s.kundu at erasmusmc.nl>, A. Cecile J.W. Janssens <a.janssens at erasmusmc.nl>
+Depends: R (>= 2.12.0), Hmisc, ROCR, epitools, PBSmodelling
 Suggests: GenABEL
-Description: PredictABEL includes functions to assess the performance of 
-risk models. The package contains functions for the various measures that are 
-used in empirical studies, including univariate and multivariate odds ratios
+Description: PredictABEL includes functions to assess the performance of
+ risk models. The package contains functions for the various measures that are
+ used in empirical studies, including univariate and multivariate odds ratios
  (OR) of the predictors, the c-statistic (or area under the receiver operating
-  characteristic (ROC) curve (AUC)), Hosmer-Lemeshow goodness of fit test,
-   reclassification table, net reclassification improvement (NRI) and 
-   integrated discrimination improvement (IDI). Also included are functions 
-   to create plots, such as risk distributions, ROC curves, calibration plot, 
-   discrimination box plot and predictiveness curves. In addition to functions 
-   to assess the performance of risk models, the package includes functions to 
-   obtain weighted and unweighted risk scores as well as predicted risks using 
-   logistic regression analysis. These logistic regression functions are 
-   specifically written for models that include genetic variables, but they 
-   can also be applied to models that are based on non-genetic risk factors only. 
+ characteristic (ROC) curve (AUC)), Hosmer-Lemeshow goodness of fit test,
+ reclassification table, net reclassification improvement (NRI) and
+ integrated discrimination improvement (IDI). Also included are functions
+ to create plots, such as risk distributions, ROC curves, calibration plot,
+ discrimination box plot and predictiveness curves. In addition to functions
+ to assess the performance of risk models, the package includes functions to
+ obtain weighted and unweighted risk scores as well as predicted risks using
+ logistic regression analysis. These logistic regression functions are
+ specifically written for models that include genetic variables, but they
+ can also be applied to models that are based on non-genetic risk factors only.
 License: GPL (>= 2)
-
-
-

Modified: pkg/PredictABEL/R/PredictABEL.R
===================================================================
--- pkg/PredictABEL/R/PredictABEL.R	2011-09-30 10:02:34 UTC (rev 788)
+++ pkg/PredictABEL/R/PredictABEL.R	2011-09-30 10:36:10 UTC (rev 789)
@@ -1,115 +1,115 @@
-#'  An R package for the analysis of (genetic) risk prediction studies.  
+#'  An R package for the analysis of (genetic) risk prediction studies.
 #'
-#' Fueled by the substantial gene discoveries from genome-wide association 
-#' studies, there is increasing interest in investigating the predictive 
-#' ability of genetic risk models. To assess the performance of genetic risk 
-#' models, PredictABEL includes functions for the various measures and plots 
-#' that have been used in empirical studies, including univariate and 
-#' multivariate odds ratios (ORs) of the predictors, the c-statistic (or AUC), 
-#' Hosmer-Lemeshow goodness of fit test, reclassification table, net 
-#' reclassification improvement (NRI) and integrated discrimination 
-#' improvement (IDI). The plots included are the ROC plot, calibration plot, 
-#' discrimination box plot, predictiveness curve, and several risk distributions. 
+#' Fueled by the substantial gene discoveries from genome-wide association
+#' studies, there is increasing interest in investigating the predictive
+#' ability of genetic risk models. To assess the performance of genetic risk
+#' models, PredictABEL includes functions for the various measures and plots
+#' that have been used in empirical studies, including univariate and
+#' multivariate odds ratios (ORs) of the predictors, the c-statistic (or AUC),
+#' Hosmer-Lemeshow goodness of fit test, reclassification table, net
+#' reclassification improvement (NRI) and integrated discrimination
+#' improvement (IDI). The plots included are the ROC plot, calibration plot,
+#' discrimination box plot, predictiveness curve, and several risk distributions.
 #'
 #'
-#' These functions can be applied to predicted risks that are obtained using 
-#' logistic regression analysis, to weighted or unweighted risk scores, for 
-#' which the functions are included in this package. The functions can also be 
-#' used to assess risks or risk scores that are constructed using other methods, e.g., Cox Proportional 
-#' Hazards regression analysis, which are not included in the current version. 
-#' Risks obtained from other methods can be imported into R for assessment 
+#' These functions can be applied to predicted risks that are obtained using
+#' logistic regression analysis, to weighted or unweighted risk scores, for
+#' which the functions are included in this package. The functions can also be
+#' used to assess risks or risk scores that are constructed using other methods, e.g., Cox Proportional
+#' Hazards regression analysis, which are not included in the current version.
+#' Risks obtained from other methods can be imported into R for assessment
 #' of the predictive performance.
 #'
 #'
-#' The functions to construct the risk models using logistic regression analyses 
-#' are specifically written for models that include genetic variables, 
-#' eventually in addition to non-genetic factors, but they can also be applied 
-#' to construct models that are based on non-genetic risk factors only. \cr 
+#' The functions to construct the risk models using logistic regression analyses
+#' are specifically written for models that include genetic variables,
+#' eventually in addition to non-genetic factors, but they can also be applied
+#' to construct models that are based on non-genetic risk factors only. \cr
 #'
 #'
-#' Before using the functions \code{\link{fitLogRegModel}} for constructing 
-#' a risk model or \code{\link{riskScore}} for computing risk 
-#' scores, the following checks on the dataset are advisable to be done: 
-#' 
-#' (1) Missing values: The logistic regression analyses and computation of 
-#' the risk score are done only for subjects that have no missing data. In case 
-#' of missing values, individuals with missing data can be removed from the 
-#' dataset or imputation strategies can be used to fill in missing data. 
-#' Subjects with missing data can be removed with the R function \code{na.omit} 
-#' (available in \code{stats} package). 
-#' Example: \code{DataFileNew <- na.omit(DataFile)} 
+#' Before using the functions \code{\link{fitLogRegModel}} for constructing
+#' a risk model or \code{\link{riskScore}} for computing risk
+#' scores, the following checks on the dataset are advisable to be done:
+#'
+#' (1) Missing values: The logistic regression analyses and computation of
+#' the risk score are done only for subjects that have no missing data. In case
+#' of missing values, individuals with missing data can be removed from the
+#' dataset or imputation strategies can be used to fill in missing data.
+#' Subjects with missing data can be removed with the R function \code{na.omit}
+#' (available in \code{stats} package).
+#' Example: \code{DataFileNew <- na.omit(DataFile)}
 #' will make a new dataset (\code{DataFileNew}) with no missing values;
 #'
-#' 
-#' (2) Multicollinearity: When there is strong correlation between the 
-#' predictor variables, regression coefficients may be estimated imprecisely 
-#' and risks scores may be biased because the assumption of independent effects 
-#' is violated. In genetic risk prediction studies, problems with 
-#' multicollinearity should be expected when single nucleotide polymorphisms 
-#' (SNPs) located in the same gene are                                                    
-#' in strong linkage disequilibrium (LD). For SNPs in LD it is common to select 
-#' the variant with the lowest p-value in the model;  
-#' 
-#' 
-#' (3) Outliers: When the data contain significant outliers, either clinical 
-#' variables with extreme values of the outcomes or extreme values resulting 
-#' from errors in the data entry, these may impact the construction of the risk models and 
-#' computation of the risks scores. Data should be carefully checked and outliers 
+#'
+#' (2) Multicollinearity: When there is strong correlation between the
+#' predictor variables, regression coefficients may be estimated imprecisely
+#' and risks scores may be biased because the assumption of independent effects
+#' is violated. In genetic risk prediction studies, problems with
+#' multicollinearity should be expected when single nucleotide polymorphisms
+#' (SNPs) located in the same gene are
+#' in strong linkage disequilibrium (LD). For SNPs in LD it is common to select
+#' the variant with the lowest p-value in the model;
+#'
+#'
+#' (3) Outliers: When the data contain significant outliers, either clinical
+#' variables with extreme values of the outcomes or extreme values resulting
+#' from errors in the data entry, these may impact the construction of the risk models and
+#' computation of the risks scores. Data should be carefully checked and outliers
 #' need to be removed or replaced, if justified;
 #'
-#' (4) Recoding of data: In the computation of unweighted risk scores, it is assumed 
-#' that the genetic variants are coded \code{0,1,2}  
-#' representing the number of alleles carried. When variants 
-#' are coded \code{0,1} representing a dominant or recessive effect of the alleles, 
+#' (4) Recoding of data: In the computation of unweighted risk scores, it is assumed
+#' that the genetic variants are coded \code{0,1,2}
+#' representing the number of alleles carried. When variants
+#' are coded \code{0,1} representing a dominant or recessive effect of the alleles,
 #' the variables need to be recoded before unweighted risk scores can be computed. \cr
 #'
 #'
-#' To import data into R several alternative strategies can be used. Use the 
-#' \code{Hmisc} package for importing SPSS and SAS data into R. 
-#' Use "\code{ExampleData <- read.table("DataName.txt", header=T, sep="\t")}" for text 
-#' files where variable names are included as column headers and data are 
-#' separated by tabs.  
-#' Use "\code{ExampleData <- read.table("Name.csv", sep=",", header=T)}" 
-#' for comma-separated files with variable names as column headers. 
-#' Use \code{"setwd(dir)"} to set the working directory to "dir". The datafile 
-#' needs to be present in the working directory. \cr 
+#' To import data into R several alternative strategies can be used. Use the
+#' \code{Hmisc} package for importing SPSS and SAS data into R.
+#' Use "\code{ExampleData <- read.table("DataName.txt", header=T, sep="\t")}" for text
+#' files where variable names are included as column headers and data are
+#' separated by tabs.
+#' Use "\code{ExampleData <- read.table("Name.csv", sep=",", header=T)}"
+#' for comma-separated files with variable names as column headers.
+#' Use \code{"setwd(dir)"} to set the working directory to "dir". The datafile
+#' needs to be present in the working directory. \cr
 #'
 #'
 #' To export datafiles from R tables to a tab-delimited textfile with the first row as
-#' the name of the variables, 
-#' use "\code{write.table(R_Table, file="Name.txt", row.names=FALSE, sep="\t")}"  and 
+#' the name of the variables,
+#' use "\code{write.table(R_Table, file="Name.txt", row.names=FALSE, sep="\t")}"  and
 #' when a comma-separated textfile is requested and variable names are provided in the first row,
-#' use "\code{write.table(R_Table, file="Name.csv", row.names=FALSE, sep=",")}".  
+#' use "\code{write.table(R_Table, file="Name.csv", row.names=FALSE, sep=",")}".
 #' When the directory is not specified, the file will be
-#' saved in the working directory. For exporting R data into SPSS, SAS and 
+#' saved in the working directory. For exporting R data into SPSS, SAS and
 #' Stata data, use functions in the the \code{foreign} package. \cr
 #'
 #' Several functions in this package depend on other R packages:
-#' 
+#'
 #' (1) \code{Hmisc}, is used to compute NRI and IDI;
-#' 
+#'
 #' (2) \code{ROCR}, is used to produce ROC plots;
-#' 
+#'
 #' (3) \code{epitools}, is used to compute  univariate odds ratios;
-##' 
+##'
 #' (4) \code{PBSmodelling}, is used to produce predictiveness curve.
 #'
-#' @note The current version of the package includes the basic measures 
-#' and plots that are used in the assessment of (genetic) risk prediction models. 
-#' Planned extensions of the package include functions to construct risk 
-#' models using Cox Proportional Hazards analysis for prospective data and 
-#' functions to construct simulated data for the evaluation of 
+#' @note The current version of the package includes the basic measures
+#' and plots that are used in the assessment of (genetic) risk prediction models.
+#' Planned extensions of the package include functions to construct risk
+#' models using Cox Proportional Hazards analysis for prospective data and
+#' functions to construct simulated data for the evaluation of
 #' genetic risk models (see Janssens et al, Genet Med 2006).
-#' 
 #'
-##' @author  Suman Kundu  
+#'
+##' @author  Suman Kundu
 ##'
 ##' Yurii S. Aulchenko
 ##'
-##' A. Cecile J.W. Janssens  
+##' A. Cecile J.W. Janssens
 ##'
 ##' @keywords package
-#' 
+#'
 #' @references S Kundu, YS Aulchenko, CM van Duijn, ACJW Janssens. PredictABEL:
 #' an R package for the assessment of risk prediction models.
 #' Eur J Epidemiol. 2011;26:261-4. \cr
@@ -123,7 +123,7 @@
 #' P Kraft, S Melillo, CJ O'Donnell, MJ Pencina, D Ransohoff, SD Schully,
 #' D Seminara, DM Winn, CF Wright, CM van Duijn, J Little, MJ Khoury.
 #' Strengthening the reporting of genetic risk prediction studies
-#' (GRIPS)-Elaboration and explanation. Eur J Epidemiol. 2011;26:313-37. \cr   
+#' (GRIPS)-Elaboration and explanation. Eur J Epidemiol. 2011;26:313-37. \cr
 #'
 #' Aulchenko YS, Ripke S, Isaacs A, van Duijn CM. GenABEL: an R package for genome-wide
 #' association analysis. Bioinformatics 2007;23(10):1294-6.
@@ -133,41 +133,41 @@
 #'
 #' The function fits a standard GLM function for the logistic regression model.
 #' This function can be used to construct a logistic regression model based on genetic and non-genetic
-#' predictors. The function also allows to enter the genetic predictors 
-#' as a single risk score. For that purpose, the function requires that 
+#' predictors. The function also allows to enter the genetic predictors
+#' as a single risk score. For that purpose, the function requires that
 #' the dataset additionally includes the risk score.
 #' A new dataset can be constructed using
-#'  "\code{NewExampleData <- cbind(ExampleData,riskScore)}". 
+#'  "\code{NewExampleData <- cbind(ExampleData,riskScore)}".
 #' The genetic risk scores can be obtained
 #' using the function \code{\link{riskScore}} in this package or be
-#' imported from other methods. 
+#' imported from other methods.
 #'
-#' @param data Data frame or matrix that includes the outcome and 
-#' predictor variables.  
+#' @param data Data frame or matrix that includes the outcome and
+#' predictor variables.
 #' @param cOutcome Column number of the outcome variable. \code{cOutcome=2}
 #'  means that the second column of the dataset is the outcome variable.
-#' To fit the logistic regression model, the outcome variable needs to be 
+#' To fit the logistic regression model, the outcome variable needs to be
 #' (re)coded as \code{1} for the presence and \code{0} for the absence of the
-#' outcome of interest. 
-#' @param cNonGenPreds Column numbers of the non-genetic predictors that are  
-#' included in the model. An example to denote column numbers is  
-#' \code{c(3,6:8,10)}. Choose \code{c(0)} when no non-genetic predictors 
+#' outcome of interest.
+#' @param cNonGenPreds Column numbers of the non-genetic predictors that are
+#' included in the model. An example to denote column numbers is
+#' \code{c(3,6:8,10)}. Choose \code{c(0)} when no non-genetic predictors
 #' are considered.
-#' @param cNonGenPredsCat Column numbers of the non-genetic predictors that  
-#' are entered as categorical variables in the model. When non-genetic 
-#' predictors are not specified as being categorical they are treated as     
-#' continuous variables in the model. If no non-genetic predictors are 
-#' categorical, denote \code{c(0)}. 
+#' @param cNonGenPredsCat Column numbers of the non-genetic predictors that
+#' are entered as categorical variables in the model. When non-genetic
+#' predictors are not specified as being categorical they are treated as
+#' continuous variables in the model. If no non-genetic predictors are
+#' categorical, denote \code{c(0)}.
 #' @param cGenPreds Column numbers of the genetic predictors or genetic risk score.
 #' Denote \code{c(0)}
-#' when the prediction model does not consider 
+#' when the prediction model does not consider
 #' genetic predictors or genetic risk score.
-#' @param cGenPredsCat Column numbers of the genetic predictors that are 
-#' entered as categorical variables in the model. When SNPs are considered as 
-#' categorical, the model  
-#' will estimate effects per genotype. Otherwise, SNPs are considered as  
-#' continuous variables for which  the model will estimate an allelic effect.  
-#' Choose c(0) when no genetic predictors are considered as categorical 
+#' @param cGenPredsCat Column numbers of the genetic predictors that are
+#' entered as categorical variables in the model. When SNPs are considered as
+#' categorical, the model
+#' will estimate effects per genotype. Otherwise, SNPs are considered as
+#' continuous variables for which  the model will estimate an allelic effect.
+#' Choose c(0) when no genetic predictors are considered as categorical
 #' or when genetic predictors are entered as a risk score into the model.
 #'
 #' @return No value returned.
@@ -177,24 +177,24 @@
 #'
 #' @seealso  \code{\link{predRisk}}, \code{\link{ORmultivariate}}, \code{\link{riskScore}}
 #' @examples
-#' # specify dataset with outcome and predictor variables 
+#' # specify dataset with outcome and predictor variables
 #' data(ExampleData)
 #' # specify column number of outcome variable
-#'  cOutcome <- 2 
+#'  cOutcome <- 2
 #'  # specify column numbers of non-genetic predictors
 #'  cNonGenPred <- c(3:10)
-#'  # specify column numbers of non-genetic predictors that are categorical      
+#'  # specify column numbers of non-genetic predictors that are categorical
 #'  cNonGenPredCat <- c(6:8)
 #'  # specify column numbers of genetic predictors
 #'  cGenPred <- c(11,13:16)
 #'  # specify column numbers of genetic predictors that are categorical
-#'  cGenPredCat <- c(0) 
+#'  cGenPredCat <- c(0)
 #'
 #'  # fit logistic regression model
-#'  riskmodel <- fitLogRegModel(data=ExampleData, cOutcome=cOutcome, 
-#'  cNonGenPreds=cNonGenPred, cNonGenPredsCat=cNonGenPredCat, 
+#'  riskmodel <- fitLogRegModel(data=ExampleData, cOutcome=cOutcome,
+#'  cNonGenPreds=cNonGenPred, cNonGenPredsCat=cNonGenPredCat,
 #'  cGenPreds=cGenPred, cGenPredsCat=cGenPredCat)
-#' 
+#'
 #'  # show summary details for the fitted risk model
 #'  summary(riskmodel)
 #'
@@ -239,19 +239,19 @@
 return(model)
 
 }
-#' Function to compute predicted risks for all individuals in the dataset. 
-#' 
-#' The function computes predicted risks from a specified logistic regression model.   
-#' The function \code{\link{fitLogRegModel}} can be used to construct such a model.  
+#' Function to compute predicted risks for all individuals in the dataset.
 #'
-#' @param riskModel Name of logistic regression model that can be fitted using  
+#' The function computes predicted risks from a specified logistic regression model.
+#' The function \code{\link{fitLogRegModel}} can be used to construct such a model.
+#'
+#' @param riskModel Name of logistic regression model that can be fitted using
 #' the function \code{\link{fitLogRegModel}}.
-#' @param data Data frame or matrix that includes the outcome, ID number and 
-#' predictor variables. 
-#' @param cID Column number of ID variable. The ID number and predicted risks 
-#' will be saved under \code{filename}. When \code{cID} is not specified, the output is not saved. 
-#' @param filename Name of the output file in which the ID number and 
-#' estimated predicted risks will be saved. The file is saved in the working  
+#' @param data Data frame or matrix that includes the outcome, ID number and
+#' predictor variables.
+#' @param cID Column number of ID variable. The ID number and predicted risks
+#' will be saved under \code{filename}. When \code{cID} is not specified, the output is not saved.
+#' @param filename Name of the output file in which the ID number and
+#' estimated predicted risks will be saved. The file is saved in the working
 #' directory as a txt file. When no \code{filename} is specified, the output is not saved.
 #'
 #'
@@ -262,7 +262,7 @@
 #' @keywords htest
 #'
 #'
-#' @seealso \code{\link{fitLogRegModel}}, \code{\link{plotCalibration}}, 
+#' @seealso \code{\link{fitLogRegModel}}, \code{\link{plotCalibration}},
 #' \code{\link{plotROC}}, \code{\link{plotPriorPosteriorRisk}}
 #'
 #' @examples
@@ -286,34 +286,34 @@
 #'  cNonGenPreds=cNonGenPred, cNonGenPredsCat=cNonGenPredCat,
 #'  cGenPreds=cGenPred, cGenPredsCat=cGenPredCat)
 #'
-#'  # obtain predicted risks 
-#'  predRisk <- predRisk(riskModel=riskmodel, filename="name.txt") 
+#'  # obtain predicted risks
+#'  predRisk <- predRisk(riskModel=riskmodel, filename="name.txt")
 #'
-"predRisk" <- 
+"predRisk" <-
 function(riskModel, data, cID, filename)
 {
  if (any(class(riskModel) == "glm"))
   {
    predrisk <- predict(riskModel, type="response")
-  } 
-else 
+  }
+else
   {
-stop("The argument 'riskModel' should be a (GLM)model")      
+stop("The argument 'riskModel' should be a (GLM)model")
   }
-     
+
   if (!missing(data)&& !missing(cID)&& !missing(filename))
   {tab <- cbind(ID=data[, cID],PredRisk=predrisk)
   write.table(tab, file=filename, row.names = FALSE,sep = "\t")
   }
    return(predrisk)
-  }   
-#' Function to compute genetic risk scores. The function computes unweighted 
-#' or weighted genetic risk scores. The relative effects (or weights) of 
-#' genetic variants can either come from beta coefficients of a risk model 
+  }
+#' Function to compute genetic risk scores. The function computes unweighted
+#' or weighted genetic risk scores. The relative effects (or weights) of
+#' genetic variants can either come from beta coefficients of a risk model
 #' or from a vector of beta coefficients imported into R, e.g., when beta cofficients are obtained from meta-analysis.
-#' 
 #'
-#' The function calculates unweighted 
+#'
+#' The function calculates unweighted
 #' or weighted genetic risk scores. The unweighted genetic risk score is a simple
 #' risk allele count assuming that all alleles have the same effect. For this
 #' calculation, it is required that the genetic variables are coded as the number of risk
@@ -322,28 +322,28 @@
 #' is reversed. The weighted risk score is a sum of the number of risk alleles
 #' multiplied by their beta coefficients.
 #'
-#' The beta coefficients can come from two different sources, either beta coefficients of a risk model 
+#' The beta coefficients can come from two different sources, either beta coefficients of a risk model
 #' or a vector of beta coefficients imported into R, e.g., when beta cofficients are obtained from meta-analysis.
-#' This vector of beta coefficients 
+#' This vector of beta coefficients
 #' should be a named vector containing the same names as mentioned in genetic variants.
 #' A logistic regression model can be constructed using \code{\link{fitLogRegModel}}
 #' from this package.
-#'  
-#' @note When a vector of beta coefficients is imported, it should be checked 
-#' whether the DNA strands and the coding of the risk alleles are the same 
+#'
+#' @note When a vector of beta coefficients is imported, it should be checked
+#' whether the DNA strands and the coding of the risk alleles are the same
 #' as in the study data. The functions are available in the package \code{GenABEL}
-#' to accurately compute risk scores when the DNA strands are different or the risk 
-#' alleles are coded differently in the study data and the data used in meta-analysis.  
+#' to accurately compute risk scores when the DNA strands are different or the risk
+#' alleles are coded differently in the study data and the data used in meta-analysis.
 #'
-#' @param weights The vector that includes the weights given to the genetic 
+#' @param weights The vector that includes the weights given to the genetic
 #' variants. See details for more informations.
-#' @param data Data frame or matrix that includes the outcome  
-#' and predictors variables. 
-#' @param cGenPreds Column numbers of the genetic variables on the basis of   
+#' @param data Data frame or matrix that includes the outcome
+#' and predictors variables.
+#' @param cGenPreds Column numbers of the genetic variables on the basis of
 #'  which the risk score is computed.
 #' @param Type Specification of the type of risk scores that will be computed.
-#' Type can be weighted (\code{Type="weighted"}) or 
-#' unweighted (\code{Type="unweighted"}). 
+#' Type can be weighted (\code{Type="weighted"}) or
+#' unweighted (\code{Type="unweighted"}).
 #'
 #' @return The function returns a vector of risk scores.
 #'
@@ -354,7 +354,7 @@
 #' @seealso \code{\link{plotRiskDistribution}}, \code{\link{plotRiskscorePredrisk}}
 #' @examples
 #' # specify dataset with outcome and predictor variables
-#'  data(ExampleData) 
+#'  data(ExampleData)
 #'  # specify column numbers of genetic predictors
 #'  cGenPred <- c(11:16)
 #'
@@ -363,16 +363,16 @@
 #'  # called 'ExampleModels', which is described on page 4-5
 #'  riskmodel <- ExampleModels()$riskModel2
 #'
-#'  # compute unweighted risk scores 
-#'  riskScore <- riskScore(weights=riskmodel, data=ExampleData, 
-#' cGenPreds=cGenPred, Type="unweighted")  
+#'  # compute unweighted risk scores
+#'  riskScore <- riskScore(weights=riskmodel, data=ExampleData,
+#' cGenPreds=cGenPred, Type="unweighted")
 #'
-"riskScore" <- 
+"riskScore" <-
 function(weights, data, cGenPreds, Type )
 {
 riskModel <- weights
 x <- data[, cGenPreds]
-if (any(class(riskModel) == "glm")) 
+if (any(class(riskModel) == "glm"))
   {
     if(! setequal(intersect(names(riskModel$coef),colnames(x)),colnames(x)))
     {
@@ -381,9 +381,9 @@
      else
     {
      y <- riskModel$coef[intersect(names(riskModel$coef),colnames(x))]
-    } 
+    }
   }
-  
+
 else if(is.vector(riskModel))
  {
    if (length(names(riskModel))!= length(riskModel))
@@ -393,14 +393,14 @@
    if( setequal(intersect(names(riskModel),colnames(x)),colnames(x)))
    {
     y <- riskModel[intersect(colnames(x),names(riskModel))]
-   }                              
+   }
    else
    {
  stop("Beta coefficient vector does not contain all the genetic variants")
    }
-  
+
  }
- 
+
 else
    {
 stop("'weights' argument should either be a model or a named beta vector" )
@@ -408,7 +408,7 @@
 
 if(Type=="weighted")
 {
-wrs<- y %*% t(x) 
+wrs<- y %*% t(x)
 return(as.vector(wrs))
 }
 else if (Type=="unweighted")
@@ -422,47 +422,47 @@
 }
 #' Function to compute univariate ORs for genetic predictors. The function computes the univariate ORs with 95\% CIs for genetic predictors.
 #'
-#' The function computes the univariate ORs with 95\% CIs for the specified 
-#' genetic variants both per allele and per genotype. The ORs are saved with the data from which they are  
-#' calculated. Genotype frequencies are provided for      
-#' persons with and without the outcome  
-#' of interest. The genotype or allele that is coded as \code{'0'} is considered 
-#' as the reference to computes the ORs. 
-#' 
-#' @param data Data frame or matrix that includes the outcome and 
-#' predictors variables. 
-#' @param cOutcome Column number of the outcome variable. \code{cOutcome=2}  
+#' The function computes the univariate ORs with 95\% CIs for the specified
+#' genetic variants both per allele and per genotype. The ORs are saved with the data from which they are
+#' calculated. Genotype frequencies are provided for
+#' persons with and without the outcome
+#' of interest. The genotype or allele that is coded as \code{'0'} is considered
+#' as the reference to computes the ORs.
+#'
+#' @param data Data frame or matrix that includes the outcome and
+#' predictors variables.
+#' @param cOutcome Column number of the outcome variable. \code{cOutcome=2}
 #' means that the second column of the dataset is the outcome variable.
-#' @param cGenPreds Column numbers of genetic variables for which the ORs 
+#' @param cGenPreds Column numbers of genetic variables for which the ORs
 #' are calculated.
-#' @param filenameGeno Name of the output file in which the univariate ORs  
-#' and frequencies per genotype will be saved. The file is saved in the working directory as 
+#' @param filenameGeno Name of the output file in which the univariate ORs
+#' and frequencies per genotype will be saved. The file is saved in the working directory as
 #' a txt file. When no \code{filenameGeno} is specified, the output is not saved.
-#' @param filenameAllele Name of the output file in which the univariate ORs and 
-#' frequencies per allele will be saved. The file is saved in the working 
+#' @param filenameAllele Name of the output file in which the univariate ORs and
+#' frequencies per allele will be saved. The file is saved in the working
 #' directory as a txt file. When no \code{filenameAllele} is specified, the output is not saved.
-#' 
-#' @return The function returns two different tables. One table contains genotype frequencies 
-#' and univariate ORs with 95\% CIs and the other contains allele frequencies and 
+#'
+#' @return The function returns two different tables. One table contains genotype frequencies
+#' and univariate ORs with 95\% CIs and the other contains allele frequencies and
 #' univariate ORs with 95\% CIs.
 #'
 #'
 #' @keywords manip
-#' 
 #'
-#' @seealso \code{\link{ORmultivariate}}  
+#'
+#' @seealso \code{\link{ORmultivariate}}
 #' @examples
-#' # specify dataset with outcome and predictor variables 
+#' # specify dataset with outcome and predictor variables
 #' data(ExampleData)
 #' # specify column number of the outcome variable
-#'  cOutcome <- 2 
+#'  cOutcome <- 2
 #'  # specify column numbers of genetic predictors
 #'  cGenPreds <- c(11:13,16)
 #'
 #'  # compute univariate ORs
-#'  ORunivariate(data=ExampleData, cOutcome=cOutcome, cGenPreds=cGenPreds, 
+#'  ORunivariate(data=ExampleData, cOutcome=cOutcome, cGenPreds=cGenPreds,
 #' filenameGeno="GenoOR.txt", filenameAllele="AlleleOR.txt")
-#' 
+#'
 "ORunivariate" <- function(data, cOutcome,cGenPreds,filenameGeno, filenameAllele )
   {
   p<- data[,cGenPreds ] # p : A table of Genotype data for all SNP's
@@ -529,7 +529,7 @@
        "CI-Low","CI-high")
 names(dimnames(m1)) <- c("", " Genotype frequencies for Cases & Controls, and OR(95% CI)")
      if (!missing(filenameGeno))
-		 write.table(m1,file=filenameGeno, row.names = FALSE,sep = "\t")
+                 write.table(m1,file=filenameGeno, row.names = FALSE,sep = "\t")
 
        n1 <- as.table(n1)
        dimnames(n1)[[1]] <- c(1:dim(p)[2])
@@ -542,32 +542,32 @@
        p<- list(Genotype =m1,  Allelic=n1)
         return(p)
   }
-#' Function to obtain multivariate odds ratios from a logistic regression model. 
-#' The function estimates multivariate (adjusted) odds ratios (ORs) with    
-#' 95\% confidence intervals (CIs) for all the genetic and non-genetic variables 
-#' in the risk model. 
+#' Function to obtain multivariate odds ratios from a logistic regression model.
+#' The function estimates multivariate (adjusted) odds ratios (ORs) with
+#' 95\% confidence intervals (CIs) for all the genetic and non-genetic variables
+#' in the risk model.
 #'
-#' The function requires that first a logistic regression  
-#' model is fitted either by using \code{GLM} function or the function 
-#' \code{\link{fitLogRegModel}}. In addition to the multivariate ORs, 
-#' the function returns summary statistics of model performance, namely the Brier 
-#' score and the Nagelkerke's \eqn{R^2} value.  
-#' The Brier score quantifies the accuracy of risk predictions by comparing 
-#' predicted risks with observed outcomes at individual level (where outcome 
-#' values are either 0 or 1). The Nagelkerke's \eqn{R^2} value indicates the percentage of variation   
+#' The function requires that first a logistic regression
[TRUNCATED]

To get the complete diff run:
    svnlook diff /svnroot/genabel -r 789
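
For readers following the riskScore hunks in the diff above, here is a minimal sketch of the weighted score computation (`y %*% t(x)`) on toy data. It uses base R only and does not call PredictABEL. The data frame `geno` and the vector `betas` are hypothetical illustrations, and the `rowSums` call for the unweighted score is an assumption based on the documentation's description of a "simple risk allele count", since those lines are not visible in the truncated diff.

```r
# Hypothetical toy genotypes: rows are individuals, columns are SNPs coded 0/1/2.
geno <- data.frame(
  SNP1 = c(0, 1, 2, 1),
  SNP2 = c(2, 0, 1, 1),
  SNP3 = c(1, 1, 0, 2)
)

# Hypothetical named beta vector, e.g. imported from a meta-analysis.
# The SNPs are deliberately listed in a different order than the columns.
betas <- c(SNP3 = 0.30, SNP1 = 0.10, SNP2 = -0.20)

# Same kind of check as in the diff: every genotype column needs a weight.
stopifnot(setequal(intersect(names(betas), colnames(geno)), colnames(geno)))

# Reorder the weights to follow the genotype columns, as riskScore does.
y <- betas[intersect(colnames(geno), names(betas))]

# Weighted score: for each individual, sum of allele count times beta.
weighted <- as.vector(y %*% t(as.matrix(geno)))

# Unweighted score: plain allele count per individual (assumed form,
# not taken from the package source).
unweighted <- rowSums(geno)

print(weighted)
print(unweighted)
```

Matching the weights to the columns by name, rather than by position, is what lets an imported beta vector arrive in any order; it is also why the function insists on a named vector.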

