[Genabel-commits] r1447 - in branches/ProbABEL-pvals/ProbABEL: . doc src
noreply at r-forge.r-project.org
noreply at r-forge.r-project.org
Tue Dec 3 12:57:37 CET 2013
Author: lckarssen
Date: 2013-12-03 12:57:37 +0100 (Tue, 03 Dec 2013)
New Revision: 1447
Added:
branches/ProbABEL-pvals/ProbABEL/src/main_functions_dump.cpp
branches/ProbABEL-pvals/ProbABEL/src/main_functions_dump.h
Modified:
branches/ProbABEL-pvals/ProbABEL/configure.ac
branches/ProbABEL-pvals/ProbABEL/doc/ChangeLog
branches/ProbABEL-pvals/ProbABEL/doc/INSTALL
branches/ProbABEL-pvals/ProbABEL/doc/ProbABEL_manual.tex
branches/ProbABEL-pvals/ProbABEL/doc/packaging.txt
branches/ProbABEL-pvals/ProbABEL/doc/pacoxph.1
branches/ProbABEL-pvals/ProbABEL/doc/palinear.1
branches/ProbABEL-pvals/ProbABEL/doc/palogist.1
branches/ProbABEL-pvals/ProbABEL/src/Makefile.am
branches/ProbABEL-pvals/ProbABEL/src/command_line_settings.cpp
branches/ProbABEL-pvals/ProbABEL/src/coxph_data.cpp
branches/ProbABEL-pvals/ProbABEL/src/coxph_data.h
branches/ProbABEL-pvals/ProbABEL/src/eigen_mematrix.cpp
branches/ProbABEL-pvals/ProbABEL/src/extract-snp.cpp
branches/ProbABEL-pvals/ProbABEL/src/main.cpp
branches/ProbABEL-pvals/ProbABEL/src/phedata.cpp
branches/ProbABEL-pvals/ProbABEL/src/reg1.cpp
branches/ProbABEL-pvals/ProbABEL/src/regdata.cpp
branches/ProbABEL-pvals/ProbABEL/src/regdata.h
branches/ProbABEL-pvals/ProbABEL/src/usage.cpp
Log:
Merged ProbABEL trunk into the p-values branch (r.1398 -- 1446)
Modified: branches/ProbABEL-pvals/ProbABEL/configure.ac
===================================================================
--- branches/ProbABEL-pvals/ProbABEL/configure.ac 2013-12-03 10:02:30 UTC (rev 1446)
+++ branches/ProbABEL-pvals/ProbABEL/configure.ac 2013-12-03 11:57:37 UTC (rev 1447)
@@ -2,7 +2,7 @@
# Process this file with autoconf to produce a configure script.
AC_PREREQ([2.67])
-AC_INIT(ProbABEL, 0.4.1, genabel-devel at r-forge.wu-wien.ac.at)
+AC_INIT(ProbABEL, 0.4.2, genabel-devel at r-forge.wu-wien.ac.at)
AM_INIT_AUTOMAKE([silent-rules])
AM_SILENT_RULES([yes])
AC_CONFIG_SRCDIR([src/data.h])
@@ -31,6 +31,11 @@
# with its own defaults
AC_PROG_CXX
+
+#Tell compiler to build not R version of filevector
+CXXFLAGS+=" -D_NOT_R_FILEVECTOR"
+
+
# Since most of our code is in C++, set that language as the default
# for the subsequent checks
AC_LANG_PUSH([C++])
@@ -154,6 +159,19 @@
AM_CONDITIONAL([BUILD_pacoxph], test "x$pacoxph" = "xyes")
+AC_ARG_ENABLE([extract-snp],
+ [AS_HELP_STRING([--enable-extract-snp], [enable building the
+ extract-snp program. NOTE: this program is not finished yet!])],
+ [extractsnp=yes],
+ [extractsnp=no])
+
+if test "x$extractsnp" = "xyes"; then
+ AC_MSG_NOTICE([building of extract-snp is enabled])
+fi
+
+AM_CONDITIONAL([BUILD_extractsnp], test "x$extractsnp" = "xyes")
+
+
# Files to be generated by autotools
AC_CONFIG_FILES([
Makefile
Modified: branches/ProbABEL-pvals/ProbABEL/doc/ChangeLog
===================================================================
--- branches/ProbABEL-pvals/ProbABEL/doc/ChangeLog 2013-12-03 10:02:30 UTC (rev 1446)
+++ branches/ProbABEL-pvals/ProbABEL/doc/ChangeLog 2013-12-03 11:57:37 UTC (rev 1447)
@@ -2,6 +2,9 @@
* Fix bug #4919: Too small reading buffers for long alleles in mach info
and legend files. Thanks to Daniel Taliun for reporting the bug and
providing the patch. Thanks to Xia Shen for testing.
+* Fix bug #4776: Specifying --sep="\t" as an option to pa* doesn't insert
+ a TAB as separator. Now ProbABEL inserts proper tabs when specifying
+ "\t". Thanks to Maksim Struchalin for fixing this bug.
* Improved convergence checks in the Cox PH regression module. The checks
now give similar errors as R does.
* A minor change in the screen output of ProbABEL. Some of the status
@@ -9,6 +12,11 @@
slightly different place in the code to help debugging problems with the
input data.
* Fix a bug in the example scripts: an incorrect shell variable was used.
+* The extract-snp binary is no longer built by default (as it isn't
+ finished yet). Building can be enabled at compile time by giving the
+ --enable-extract-snp option to ./configure.
+* For developers: a start has been made on documenting the internal
+ functions using Doxygen.
***** v.0.4.1 (2013.08.29)
* Fix bug #4854: When using mmscore, there is one (nan) column missing in
Modified: branches/ProbABEL-pvals/ProbABEL/doc/INSTALL
===================================================================
--- branches/ProbABEL-pvals/ProbABEL/doc/INSTALL 2013-12-03 10:02:30 UTC (rev 1446)
+++ branches/ProbABEL-pvals/ProbABEL/doc/INSTALL 2013-12-03 11:57:37 UTC (rev 1447)
@@ -47,12 +47,10 @@
this file to probabel_config.cfg.
To see options, run
-
./configure --help
The most notable option is
./configure --prefix=/some/subdirectory
-
to install ProbABEL in that subdirectory. Instead of using
/usr/local/ as install root directory, it installs in /some/subdirectory.
Modified: branches/ProbABEL-pvals/ProbABEL/doc/ProbABEL_manual.tex
===================================================================
--- branches/ProbABEL-pvals/ProbABEL/doc/ProbABEL_manual.tex 2013-12-03 10:02:30 UTC (rev 1446)
+++ branches/ProbABEL-pvals/ProbABEL/doc/ProbABEL_manual.tex 2013-12-03 11:57:37 UTC (rev 1447)
@@ -1,6 +1,6 @@
\documentclass[12pt,a4paper]{article}
-\title{Manual for ProbABEL v0.4.1}
+\title{Manual for ProbABEL v0.4.2}
\author{\emph{Current Programmers:} Lennart Karssen$^{1}$, Maarten
Kooyman$^1$, \\
Yurii Aulchenko$^{2}$ \\
@@ -294,7 +294,7 @@
short explanation to the command line options:
\begin{verbatim}
user at server:~$ palogist --help
-probabel v. 0.4.1
+probabel v. 0.4.2
(C) Yurii Aulchenko, Lennart C. Karssen, Maksim Struchalin, EMCR
Using EIGEN version 3.1.2 for matrix operations
@@ -854,7 +854,7 @@
\end{quote}
A proper reference may look like
\begin{quote}
-For the analysis of imputed data, we used the \PA{} v.0.4.1
+For the analysis of imputed data, we used the \PA{} v.0.4.2
from the \texttt{GenABEL} suite of programs (Aulchenko \emph{et al.}, 2010).
\end{quote}
Modified: branches/ProbABEL-pvals/ProbABEL/doc/packaging.txt
===================================================================
--- branches/ProbABEL-pvals/ProbABEL/doc/packaging.txt 2013-12-03 10:02:30 UTC (rev 1446)
+++ branches/ProbABEL-pvals/ProbABEL/doc/packaging.txt 2013-12-03 11:57:37 UTC (rev 1447)
@@ -26,6 +26,7 @@
DISTCHECK_CONFIGURE_FLAGS="--disable-palinear" make distcheck
DISTCHECK_CONFIGURE_FLAGS="--disable-palogist" make distcheck
DISTCHECK_CONFIGURE_FLAGS="--disable-pacoxph" make distcheck
+ DISTCHECK_CONFIGURE_FLAGS="--enable-extract-snp" make distcheck
** Building the package for the first time
Now that we verified that the source code builds without problems,
let's create the package.
Modified: branches/ProbABEL-pvals/ProbABEL/doc/pacoxph.1
===================================================================
--- branches/ProbABEL-pvals/ProbABEL/doc/pacoxph.1 2013-12-03 10:02:30 UTC (rev 1446)
+++ branches/ProbABEL-pvals/ProbABEL/doc/pacoxph.1 2013-12-03 11:57:37 UTC (rev 1447)
@@ -1,4 +1,4 @@
-.TH pacoxph 1 "29 August 2013"
+.TH pacoxph 1 "25 November 2013"
.SH NAME
pacoxph \- Perform Genome-Wide Association Analysis using a linear model
.SH SYNOPSIS
Modified: branches/ProbABEL-pvals/ProbABEL/doc/palinear.1
===================================================================
--- branches/ProbABEL-pvals/ProbABEL/doc/palinear.1 2013-12-03 10:02:30 UTC (rev 1446)
+++ branches/ProbABEL-pvals/ProbABEL/doc/palinear.1 2013-12-03 11:57:37 UTC (rev 1447)
@@ -1,4 +1,4 @@
-.TH palinear 1 "29 August 2013"
+.TH palinear 1 "25 November 2013"
.SH NAME
palinear \- Perform Genome-Wide Association Analysis using a linear model
.SH SYNOPSIS
Modified: branches/ProbABEL-pvals/ProbABEL/doc/palogist.1
===================================================================
--- branches/ProbABEL-pvals/ProbABEL/doc/palogist.1 2013-12-03 10:02:30 UTC (rev 1446)
+++ branches/ProbABEL-pvals/ProbABEL/doc/palogist.1 2013-12-03 11:57:37 UTC (rev 1447)
@@ -1,4 +1,4 @@
-.TH palogist 1 "29 August 2013"
+.TH palogist 1 "25 November 2013"
.SH NAME
palogist \- Perform Genome-Wide Association Analysis using a linear model
.SH SYNOPSIS
Modified: branches/ProbABEL-pvals/ProbABEL/src/Makefile.am
===================================================================
--- branches/ProbABEL-pvals/ProbABEL/src/Makefile.am 2013-12-03 10:02:30 UTC (rev 1446)
+++ branches/ProbABEL-pvals/ProbABEL/src/Makefile.am 2013-12-03 11:57:37 UTC (rev 1447)
@@ -7,7 +7,7 @@
command_line_settings.h command_line_settings.cpp reg1.h usage.h \
usage.cpp main.cpp utilities.h utilities.cpp phedata.h phedata.cpp \
cholesky.h cholesky.cpp regdata.h regdata.cpp \
- maskedmatrix.cpp maskedmatrix.h reg1.cpp
+ maskedmatrix.cpp maskedmatrix.h reg1.cpp main_functions_dump.h main_functions_dump.cpp
EIGENFILES = eigen_mematrix.h eigen_mematrix.cpp
@@ -30,7 +30,7 @@
include/R_ext/Print.h include/R_ext/Random.h include/R_ext/Utils.h \
include/R_ext/RS.h
-bin_PROGRAMS = extract-snp
+bin_PROGRAMS =
if BUILD_palinear
bin_PROGRAMS += palinear
endif
@@ -46,6 +46,11 @@
bin_PROGRAMS += pacoxph
endif
+if BUILD_extractsnp
+bin_PROGRAMS += extract-snp
+endif
+
+
palinear_SOURCES = $(REGFILES) $(FVSRC) $(FVHEADERS)
palinear_CXXFLAGS = -DLINEAR $(AM_CXXFLAGS)
palinear_CPPFLAGS = $(AM_CPPFLAGS)
Modified: branches/ProbABEL-pvals/ProbABEL/src/command_line_settings.cpp
===================================================================
--- branches/ProbABEL-pvals/ProbABEL/src/command_line_settings.cpp 2013-12-03 10:02:30 UTC (rev 1446)
+++ branches/ProbABEL-pvals/ProbABEL/src/command_line_settings.cpp 2013-12-03 11:57:37 UTC (rev 1447)
@@ -130,11 +130,12 @@
return phefilename;
}
+
void cmdvars::set_variables(int argc, char * argv[])
{
int next_option;
const char * const short_options = "p:i:d:m:n:c:o:s:t:g:a:erlhb:vu";
- //b - interaction parameter
+ // b - interaction parameter
// ADD --fv FLAG (FILEVECTOR), IN WHICH CASE USE ALTERNATIVE
// CONSTRUCTOR FOR GENDATA
const struct option long_options[] =
@@ -205,7 +206,14 @@
ngpreds = atoi(optarg);
break;
case 'a':
- sep = optarg;
+ if (std::string(optarg) == std::string("\\t"))
+ {
+ sep = '\t';
+ }
+ else
+ {
+ sep = optarg;
+ }
break;
case 'e':
nohead = 1;
@@ -235,15 +243,17 @@
break;
default:
abort();
- } // end of switch
- } while (next_option != -1);
-} // end of function
+ } // end of switch
+ } while (next_option != -1); // end of while
+} // end of function
+
bool cmdvars::isIsInteractionExcluded() const
{
return is_interaction_excluded;
}
+
void cmdvars::printinfo()
{
print_version();
@@ -267,7 +277,8 @@
}
cerr << endl;
- cout << "One or more required command line options appear to be missing."
+ cout << "One or more required command line options "
+ << "appear to be missing."
<< endl
<< "Run " << program_name
<< " --help for more information on the available options\n";
@@ -347,7 +358,7 @@
if (interaction_excluded != 0)
{
- interaction = interaction_excluded; //ups
+ interaction = interaction_excluded; // ups
is_interaction_excluded = true;
}
if (outfilename.compare("") == 0)
@@ -369,6 +380,7 @@
<< endl;
exit(1);
}
+
if (robust)
{
cerr << "ERROR: robust standard errors not implemented "
Modified: branches/ProbABEL-pvals/ProbABEL/src/coxph_data.cpp
===================================================================
--- branches/ProbABEL-pvals/ProbABEL/src/coxph_data.cpp 2013-12-03 10:02:30 UTC (rev 1446)
+++ branches/ProbABEL-pvals/ProbABEL/src/coxph_data.cpp 2013-12-03 11:57:37 UTC (rev 1447)
@@ -55,6 +55,8 @@
strata = obj.strata;
X = obj.X;
order = obj.order;
+ gcount = 0;
+ freq = 0;
masked_data = new unsigned short int[nids];
for (int i = 0; i < nids; i++)
@@ -65,16 +67,25 @@
coxph_data::coxph_data(phedata &phed, gendata &gend, const int snpnum)
{
- nids = gend.nids;
+ freq = 0;
+ gcount = 0;
+ nids = gend.nids;
masked_data = new unsigned short int[nids];
+
for (int i = 0; i < nids; i++)
+ {
masked_data[i] = 0;
+ }
ngpreds = gend.ngpreds;
if (snpnum >= 0)
+ {
ncov = phed.ncov + ngpreds;
+ }
else
+ {
ncov = phed.ncov;
+ }
if (phed.noutcomes != 2)
{
@@ -94,7 +105,7 @@
{
// X.put(1.,i,0);
stime[i] = (phed.Y).get(i, 0);
- sstat[i] = int((phed.Y).get(i, 1));
+ sstat[i] = static_cast<int>((phed.Y).get(i, 1));
if (sstat[i] != 1 && sstat[i] != 0)
{
std::cerr << "coxph_data: status not 0/1 "
@@ -185,48 +196,69 @@
delete[] passed_sorted;
}
-void coxph_data::update_snp(gendata &gend, const int snpnum)
-{
- /**
- * This is the main part of the fix of bug #1846
- * (C) of the fix:
- * UMC St Radboud Nijmegen,
- * Dept of Epidemiology & Biostatistics,
- * led by Prof. B. Kiemeney
- *
- * Note this sorts by "order"!!!
- * Here we deal with transposed X, hence last two arguments are swapped
- * compared to the other 'update_snp'
- * Also, the starting column-1 is not necessary for cox X therefore
- * 'ncov-j' changes to 'ncov-j-1'
- **/
+void coxph_data::update_snp(gendata &gend, const int snpnum) {
+ /**
+ * This is the main part of the fix of bug #1846
+ * (C) of the fix:
+ * UMC St Radboud Nijmegen,
+ * Dept of Epidemiology & Biostatistics,
+ * led by Prof. B. Kiemeney
+ *
+ * Note this sorts by "order"!!!
+ * Here we deal with transposed X, hence last two arguments are swapped
+ * compared to the other 'update_snp'
+ * Also, the starting column-1 is not necessary for cox X therefore
+ * 'ncov-j' changes to 'ncov-j-1'
+ **/
- for (int j = 0; j < ngpreds; j++)
- {
+ // reset counter for frequency since it is a new snp
+ gcount = 0;
+ freq = 0.0;
+
+ for (int j = 0; j < ngpreds; j++) {
double *snpdata = new double[nids];
- for (int i = 0; i < nids; i++)
- {
+ for (int i = 0; i < nids; i++) {
masked_data[i] = 0;
}
gend.get_var(snpnum * ngpreds + j, snpdata);
- for (int i = 0; i < nids; i++)
- {
+ for (int i = 0; i < nids; i++) {
X.put(snpdata[i], (ncov - j - 1), order[i]);
- if (std::isnan(snpdata[i]))
+ if (std::isnan(snpdata[i])) {
masked_data[order[i]] = 1;
- }
+ // snp not masked
+ } else {
+ // check for first predictor
+ if (j == 0) {
+ gcount++;
+ if (ngpreds == 1) {
+ freq += snpdata[i] * 0.5;
+ } else if (ngpreds == 2) {
+ freq += snpdata[i];
+ }
+ } else if (j == 1) {
+ // add second genotype in two predicor data form
+ freq += snpdata[i] * 0.5;
+ }
+ } // end std::isnan(snpdata[i]) snp
+ } // end i for loop
+
delete[] snpdata;
- }
+ } // end ngpreds loop
+
+ freq /= static_cast<double>(gcount); // Allele frequency
}
+/**
+ * update_snp() adds SNP information to the design matrix. This
+ * function allows you to strip that information from X again. This
+ * is used for example when calculating the null model.
+ *
+ */
void coxph_data::remove_snp_from_X()
{
- // update_snp() adds SNP information to the design matrix. This
- // function allows you to strip that information from X again.
- // This is used for example when calculating the null model.
if (ngpreds == 1)
{
@@ -257,9 +289,10 @@
// delete order;
}
+
coxph_data coxph_data::get_unmasked_data()
{
- coxph_data to; // = coxph_data(*this);
+ coxph_data to; // = coxph_data(*this);
// filter missing data
int nmeasured = 0;
@@ -270,6 +303,7 @@
nmeasured++;
}
}
+
to.nids = nmeasured;
to.ncov = ncov;
to.ngpreds = ngpreds;
@@ -311,6 +345,7 @@
return (to);
}
+
coxph_reg::coxph_reg(coxph_data &cdatain)
{
coxph_data cdata = cdatain.get_unmasked_data();
@@ -322,6 +357,7 @@
niter = 0;
}
+
void coxph_reg::estimate(coxph_data &cdatain, const int verbose,
int maxiter, double eps,
double tol_chol, const int model,
@@ -412,7 +448,7 @@
MatrixXd imateigen = imat.data;
VectorXd infs = ueigen.transpose() * imateigen;
VectorXd betaeigen = beta.data;
- if( infs.norm() > eps ||
+ if ( infs.norm() > eps ||
infs.norm() > sqrt(eps) * betaeigen.norm() )
{
cerr << "Warning for " << snpinfo.name[cursnp]
@@ -437,9 +473,7 @@
sebeta[i] = NAN;
beta[i] = NAN;
loglik = NAN;
- }
- else
- {
+ } else {
sebeta[i] = sqrt(imat.get(i, i));
loglik = loglik_int[1];
}
Modified: branches/ProbABEL-pvals/ProbABEL/src/coxph_data.h
===================================================================
--- branches/ProbABEL-pvals/ProbABEL/src/coxph_data.h 2013-12-03 10:02:30 UTC (rev 1446)
+++ branches/ProbABEL-pvals/ProbABEL/src/coxph_data.h 2013-12-03 11:57:37 UTC (rev 1447)
@@ -27,10 +27,12 @@
coxph_data()
{
- nids = 0;
- ncov = 0;
- ngpreds = 0;
+ nids = 0;
+ ncov = 0;
+ ngpreds = 0;
masked_data = NULL;
+ gcount = 0;
+ freq = 0;
}
coxph_data(const coxph_data &obj);
@@ -42,6 +44,8 @@
int nids;
int ncov;
int ngpreds;
+ unsigned int gcount;
+ double freq;
mematrix<double> weights;
mematrix<double> stime;
mematrix<int> sstat;
Modified: branches/ProbABEL-pvals/ProbABEL/src/eigen_mematrix.cpp
===================================================================
--- branches/ProbABEL-pvals/ProbABEL/src/eigen_mematrix.cpp 2013-12-03 10:02:30 UTC (rev 1446)
+++ branches/ProbABEL-pvals/ProbABEL/src/eigen_mematrix.cpp 2013-12-03 11:57:37 UTC (rev 1447)
@@ -66,19 +66,19 @@
exit(1);
}
int column = i % ncol;
- int row = (int) floor((double) i / ncol);
+ int row = static_cast<int>( floor(static_cast<double>(i / ncol)) );
return data(row, column);
}
-//template<class DT>
-//mematrix<DT> mematrix<DT>::operator+(DT toadd)
-//{
-// mematrix<DT> temp(nrow, ncol);
-// for (int i = 0; i < nelements; i++)
-// temp.data[i] = data[i] + toadd;
-// return temp;
-//}
+// template<class DT>
+// mematrix<DT> mematrix<DT>::operator+(DT toadd)
+// {
+// mematrix<DT> temp(nrow, ncol);
+// for (int i = 0; i < nelements; i++)
+// temp.data[i] = data[i] + toadd;
+// return temp;
+// }
template<class DT>
mematrix<DT> mematrix<DT>::operator+(const mematrix<DT> &M)
@@ -318,14 +318,14 @@
return temp;
}
-//template<class DT>
-//mematrix<double> todouble(mematrix<DT> &M)
-//{
-// mematrix<double> temp(M.nrow, M.ncol);
-// for (int i = 0; i < temp.nelements; i++)
-// temp.data[i] = double(M.data[i]);
-// return temp;
-//}
+// template<class DT>
+// mematrix<double> todouble(mematrix<DT> &M)
+// {
+// mematrix<double> temp(M.nrow, M.ncol);
+// for (int i = 0; i < temp.nelements; i++)
+// temp.data[i] = double(M.data[i]);
+// return temp;
+// }
template<class DT>
mematrix<DT> invert(const mematrix<DT> &M)
@@ -345,14 +345,15 @@
template<class DT>
mematrix<DT> productMatrDiag(const mematrix<DT> &M, const mematrix<DT> &D)
{
- //multiply all rows of M by value of first row of D
+ // multiply all rows of M by value of first row of D
if (M.ncol != D.nrow)
{
std::cerr << "productMatrDiag: wrong dimensions";
exit(1);
}
mematrix<DT> temp = M;
- //make a array of the first row of D in the same way orientation as M.data.row(i).array()
+ // make an array of the first row of D in the same way orientation as
+ // M.data.row(i).array()
Array<DT, Dynamic, Dynamic> row = D.data.block(0, 0, M.ncol, 1).transpose();
for (int i = 0; i < temp.nrow; i++)
Modified: branches/ProbABEL-pvals/ProbABEL/src/extract-snp.cpp
===================================================================
--- branches/ProbABEL-pvals/ProbABEL/src/extract-snp.cpp 2013-12-03 10:02:30 UTC (rev 1446)
+++ branches/ProbABEL-pvals/ProbABEL/src/extract-snp.cpp 2013-12-03 11:57:37 UTC (rev 1447)
@@ -80,7 +80,8 @@
string snpname = "";
do
{
- next_option = getopt_long(argc, argv, short_options, long_options, NULL);
+ next_option = getopt_long(argc, argv,
+ short_options, long_options, NULL);
switch (next_option)
{
case 'h':
@@ -101,8 +102,7 @@
case '?': break;
case -1 : break;
}
- }
- while (next_option != -1);
+ } while (next_option != -1);
if (snpname.compare("") == 0)
{
Modified: branches/ProbABEL-pvals/ProbABEL/src/main.cpp
===================================================================
--- branches/ProbABEL-pvals/ProbABEL/src/main.cpp 2013-12-03 10:02:30 UTC (rev 1446)
+++ branches/ProbABEL-pvals/ProbABEL/src/main.cpp 2013-12-03 11:57:37 UTC (rev 1447)
@@ -54,6 +54,7 @@
#include "reg1.h"
#include "command_line_settings.h"
#include "coxph_data.h"
+#include "main_functions_dump.h"
/// Maximum number of iterations we allow the regression to run
#define MAXITER 20
@@ -64,411 +65,8 @@
-/**
- * Send a progress update (a percentage) to stdout so that the user
- * has a rough indication of the percentage of SNPs that has already
- * been completed.
- *
- * @param csnp Number of the SNP that is currently being analysed.
- * @param nsnps Total number of SNPs
- */
-void update_progress_to_cmd_line(int csnp, int nsnps)
-{
- std::cout << setprecision(2) << fixed;
- if (csnp % 1000 == 0)
- {
- if (csnp == 0)
- {
- std::cout << "Analysis: "
- << setw(5)
- << 100. * static_cast<double>(csnp)
- / static_cast<double>(nsnps)
- << "%...";
- }
- else
- {
- std::cout << "\b\b\b\b\b\b\b\b\b"
- << setw(5)
- << 100. * static_cast<double>(csnp)
- / static_cast<double>(nsnps)
- << "%...";
- }
- std::cout.flush();
- }
- std::cout << setprecision(6);
-}
-
/**
- * Open an output file for each model when using probability data
- * (npgreds == 2). This function creates the _2df.out.txt etc. files.
- *
- * @param outfile Vector of output streams
- * @param outfilename_str Basename of the outputfiles.
- */
-void open_files_for_output(std::vector<std::ofstream*>& outfile,
- std::string& outfilename_str)
-{
- //create a list of filenames
- const int amount_of_files = 5;
- std::string filenames[amount_of_files] = {
- outfilename_str + "_2df.out.txt",
- outfilename_str + "_add.out.txt",
- outfilename_str + "_domin.out.txt",
- outfilename_str + "_recess.out.txt",
- outfilename_str + "_over_domin.out.txt" };
-
- for (int i = 0; i < amount_of_files; i++)
- {
- outfile.push_back(new std::ofstream());
- outfile[i]->open((filenames[i]).c_str());
- if (!outfile[i]->is_open())
- {
- std::cerr << "Cannot open file for writing: "
- << filenames[i]
- << "\n";
- exit(1);
- }
- }
-}
-
-int create_phenotype(phedata& phd, cmdvars& input_var)
-{
- phd.set_is_interaction_excluded(input_var.isIsInteractionExcluded());
- phd.setphedata(input_var.getPhefilename(),
- input_var.getNoutcomes(),
- input_var.getNpeople(),
- input_var.getInteraction(),
- input_var.isIscox());
-
- int interaction_cox = input_var.getInteraction();
-#if COXPH
- interaction_cox--;
-#endif
-
- if (input_var.getInteraction() < 0 ||
- input_var.getInteraction() > phd.ncov ||
- interaction_cox > phd.ncov)
- {
- std::cerr << "error: Interaction parameter is out of range "
- << "(interaction="
- << input_var.getInteraction()
- << ") \n";
- exit(1);
- }
-
- return interaction_cox;
-}
-
-
-/**
- * Load the inverse variance-covariance matrix into an InvSigma object.
- *
- * @param input_var Object containing the values of the various
- * command line options.
- * @param phd Object with phenotype data
- * @param invvarmatrix The object of type masked_matrix in which the
- * inverse variance-covariance matrix is returned.
- */
-void loadInvSigma(cmdvars& input_var, phedata& phd, masked_matrix& invvarmatrix)
-{
- std::cout << "You are running mmscore...\n";
- InvSigma inv(input_var.getInverseFilename(), &phd);
- // invvarmatrix = inv.get_matrix();
- //double par = 1.; //var(phd.Y)*phd.nids/(phd.nids-phd.ncov-1);
- invvarmatrix.set_matrix(inv.get_matrix()); // = invvarmatrix * par;
- std::cout << " loaded InvSigma...\n" << std::flush;
-}
-
-
-/**
- * Create the first part of the output file header.
- *
- * \param outfile Vector of output file streams. Contains the streams
- * of the output file(s). One file when using dosage data (ngpreds==1)
- * and one for each genetic model in case probabilities are used
- * (ngpreds==2).
- * \param input_var Object containing the values of the various
- * command line options.
- * \param phd Object with phenotype data
- */
-void create_start_of_header(std::vector<std::ofstream*>& outfile,
- cmdvars& input_var, phedata& phd)
-{
- for (unsigned int i = 0; i < outfile.size(); i++)
- {
- (*outfile[i]) << "name"
- << input_var.getSep()
- << "A1"
- << input_var.getSep()
- << "A2"
- << input_var.getSep()
- << "Freq1"
- << input_var.getSep()
- << "MAF"
- << input_var.getSep()
- << "Quality"
- << input_var.getSep()
- << "Rsq"
- << input_var.getSep()
- << "n"
- << input_var.getSep()
- << "Mean_predictor_allele";
- if (input_var.getChrom() != "-1")
- (*outfile[i]) << input_var.getSep() << "chrom";
- if (input_var.getMapfilename() != NULL)
- (*outfile[i]) << input_var.getSep() << "position";
- }
-
- if (input_var.getAllcov()) //All covariates in output
- {
- for (unsigned int file = 0; file < outfile.size(); file++)
- for (int i = 0; i < phd.n_model_terms - 1; i++)
- *outfile[file] << input_var.getSep()
- << "beta_"
- << phd.model_terms[i]
- << input_var.getSep()
- << "sebeta_"
- << phd.model_terms[i];
- }
-}
-
-
-/**
- * Create the header of the output file(s).
- *
- * \param outfile vector of output file streams. Contains the streams
- * of the output file(s). One file when using dosage data (ngpreds==1)
- * and one for each genetic model in case probabilities are used
- * (ngpreds==2).
- * \param input_var object containing the values of the various
- * command line options.
- * \param phd object with phenotype data
- * \param interaction_cox are we using the Cox model with interaction?
- */
-void create_header(std::vector<std::ofstream*>& outfile, cmdvars& input_var,
- phedata& phd, int& interaction_cox)
-{
- create_start_of_header(outfile, input_var, phd);
-
- if (input_var.getNgpreds() == 1) // dose data: only additive model
- {
- *outfile[0] << input_var.getSep()
- << "beta_SNP_add"
- << input_var.getSep()
- << "sebeta_SNP_add";
-
- if (input_var.getInteraction() != 0)
- {
- *outfile[0] << input_var.getSep()
- << "beta_SNP_"
- << phd.model_terms[interaction_cox]
- << input_var.getSep()
- << "sebeta_SNP_"
- << phd.model_terms[interaction_cox];
- }
-
- if (input_var.getInverseFilename() == NULL)
- {
- //Han Chen
-#if !COXPH
- if (input_var.getInteraction() != 0 && !input_var.getAllcov())
- {
- *outfile[0] << input_var.getSep()
- << "cov_SNP_int_SNP_"
- << phd.model_terms[interaction_cox];
- }
-#endif
- }
- *outfile[0] << input_var.getSep() << "chi2_SNP";
- *outfile[0] << "\n";
- } // ngpreds == 1
- else if (input_var.getNgpreds() == 2) // prob data: all models
- {
- *outfile[0] << input_var.getSep()
- << "beta_SNP_A1A2"
- << input_var.getSep()
- << "sebeta_SNP_A1A2"
- << input_var.getSep()
- << "beta_SNP_A1A1"
- << input_var.getSep()
- << "sebeta_SNP_A1A1";
- *outfile[1] << input_var.getSep()
- << "beta_SNP_addA1"
- << input_var.getSep()
- << "sebeta_SNP_addA1";
- *outfile[2] << input_var.getSep()
- << "beta_SNP_domA1"
- << input_var.getSep()
- << "sebeta_SNP_domA1";
- *outfile[3] << input_var.getSep()
- << "beta_SNP_recA1"
- << input_var.getSep()
- << "sebeta_SNP_recA1";
- *outfile[4] << input_var.getSep()
- << "beta_SNP_odomA1"
- << input_var.getSep()
- << "sebeta_SNP_odomA1";
- if (input_var.getInteraction() != 0)
- {
- //Han Chen
- *outfile[0] << input_var.getSep()
- << "beta_SNP_A1A2_"
- << phd.model_terms[interaction_cox]
- << input_var.getSep()
- << "sebeta_SNP_A1A2_"
- << phd.model_terms[interaction_cox]
- << input_var.getSep()
- << "beta_SNP_A1A1_"
- << phd.model_terms[interaction_cox]
- << input_var.getSep()
- << "sebeta_SNP_A1A1_"
- << phd.model_terms[interaction_cox];
-#if !COXPH
- if (input_var.getInverseFilename() == NULL && !input_var.getAllcov())
- {
- *outfile[0] << input_var.getSep()
- << "cov_SNP_A1A2_int_SNP_"
- << phd.model_terms[interaction_cox]
- << input_var.getSep()
- << "cov_SNP_A1A1_int_SNP_"
- << phd.model_terms[interaction_cox];
- }
-#endif
- //Oct 26, 2009
- for (unsigned int file = 1; file < outfile.size(); file++)
- {
- *outfile[file] << input_var.getSep()
- << "beta_SNP_"
- << phd.model_terms[interaction_cox]
- << input_var.getSep()
- << "sebeta_SNP_"
- << phd.model_terms[interaction_cox];
- //Han Chen
-#if !COXPH
- if (input_var.getInverseFilename() == NULL
- && !input_var.getAllcov())
- {
- *outfile[file] << input_var.getSep()
- << "cov_SNP_int_SNP_"
- << phd.model_terms[interaction_cox];
- }
-#endif
- //Oct 26, 2009
- }
- }
- *outfile[0] << input_var.getSep() << "chi2_SNP_2df\n"; // "loglik\n";
- *outfile[1] << input_var.getSep() << "chi2_SNP_A1\n"; // "loglik\n";
- *outfile[2] << input_var.getSep() << "chi2_SNP_domA1\n";// "loglik\n";
- *outfile[3] << input_var.getSep() << "chi2_SNP_recA1\n";// "loglik\n";
- *outfile[4] << input_var.getSep() << "chi2_SNP_odomA1\n"; // "loglik\n";
- } // End: ngpreds == 2
- else
- {
- cerr << "Error: create_header(): ngpreds != 1 or 2.\n";
- }
-}
-
-
-/**
- * Write the information from the mlinfo file to the output file(s).
- *
- * \param outfile Vector of output file(s)
- * \param file index number identifying the file in the vector of files
- * \param mli mlinfo object
- * \param csnp number of the SNP that is currently being analysed
- * \param input_var object containing the information of the options
- * specified on the command line
- * \param gcount The number of non-NaN genotypes
- * \param freq The allele frequency based on the non-NaN genotypes
- */
-void write_mlinfo(const std::vector<std::ofstream*>& outfile, unsigned int file,
- const mlinfo& mli, int csnp, const cmdvars& input_var,
- int gcount, double freq)
-{
- *outfile[file] << mli.name[csnp]
- << input_var.getSep()
- << mli.A1[csnp]
- << input_var.getSep()
- << mli.A2[csnp]
- << input_var.getSep()
- << mli.Freq1[csnp]
- << input_var.getSep()
- << mli.MAF[csnp]
- << input_var.getSep()
- << mli.Quality[csnp]
- << input_var.getSep()
- << mli.Rsq[csnp]
- << input_var.getSep()
- << gcount
- << input_var.getSep()
- << freq;
- if (input_var.getChrom() != "-1")
- {
- *outfile[file] << input_var.getSep() << input_var.getChrom();
- }
- if (input_var.getMapfilename() != NULL)
- {
- *outfile[file] << input_var.getSep() << mli.map[csnp];
- }
-}
-
-
-/**
[TRUNCATED]
To get the complete diff run:
svnlook diff /svnroot/genabel -r 1447
More information about the Genabel-commits
mailing list