[GenABEL-dev] [Genabel-commits] r1837 - in pkg/OmicABELnoMM: . examples src tests

Yurii Aulchenko yurii.aulchenko at gmail.com
Tue Oct 21 17:33:25 CEST 2014


I second the MPI excitement

Is that the first MPI supporting package from GenA family? Worth the news item?!

Yurii

----------------------
Yurii Aulchenko 
(sent from mobile device)

> On Oct 21, 2014, at 16:01, "L.C. Karssen" <lennart at karssen.org> wrote:
> 
> Hi Alvaro,
> 
>> On 21-10-14 15:30, noreply at r-forge.r-project.org wrote:
>> Author: afrank
>> Date: 2014-10-21 15:30:17 +0200 (Tue, 21 Oct 2014)
>> New Revision: 1837
>> 
>> Modified:
>>   pkg/OmicABELnoMM/ChangeLog
>>   pkg/OmicABELnoMM/configure.ac
>>   pkg/OmicABELnoMM/examples/CreateData.R
>>   pkg/OmicABELnoMM/src/AIOwrapper.cpp
>>   pkg/OmicABELnoMM/src/AIOwrapper.h
>>   pkg/OmicABELnoMM/src/Algorithm.cpp
>>   pkg/OmicABELnoMM/src/Definitions.h
>>   pkg/OmicABELnoMM/src/main.cpp
>>   pkg/OmicABELnoMM/tests/test.cpp
>> Log:
>> Added check for rank of covariate matrix to avoid miss-calculations
>> of results when covariates are linearly dependent. An error is given and
>> the program is exited while letting the user know that covariates need
>> to change. Added MPI functionality for clusters. Screen output happens
>> on process 0 and multiple output files are produced, one per process.
> 
> That is cool news! Glad to have MPI functionality in OAnoMM.
> 
> Please also have a look at the comments below.
> 
> 
> Thanks,
> 
> Lennart.
> 
>> 
>> Modified: pkg/OmicABELnoMM/ChangeLog
>> ===================================================================
>> --- pkg/OmicABELnoMM/ChangeLog 2014-10-17 14:57:34 UTC (rev 1836)
>> +++ pkg/OmicABELnoMM/ChangeLog 2014-10-21 13:30:17 UTC (rev 1837)
>> @@ -3,7 +3,6 @@
>> ---------
>> Required Features:
>> -Add support for multiple files
>> --Add MPI suport 
>> 
>> Optional Features:
>> -Add exclusion lists for single sets of elements of phenotypes
>> @@ -13,6 +12,17 @@
>> -------------
>> -------------
>> 
>> +21-10-2014
>> +--------------
>> +svn commit -m "Added check for rank of covariate matrix to avoid miss-calculations of results when covariates are linearly dependent. An error is given and the program is exited while letting the user know that covariates need to change. Added MPI functionality for clusters. Screen output happens on process 0 and multiple output files are produced, one per process."
>> +Added check for rank of covariate matrix to avoid miss-calculations of 
>> +results when covariates are linearly dependent. 
>> +An error is given and the program is exited while
>> + letting the user know that covariates need to change. 
>> +Added MPI functionality for clusters. 
>> +Screen output happens on process 0 
>> +and multiple output files are produced, one per process. 
>> +
>> 8-10-2014
>> --------------
>> 
>> 
>> Modified: pkg/OmicABELnoMM/configure.ac
>> ===================================================================
>> --- pkg/OmicABELnoMM/configure.ac    2014-10-17 14:57:34 UTC (rev 1836)
>> +++ pkg/OmicABELnoMM/configure.ac    2014-10-21 13:30:17 UTC (rev 1837)
>> @@ -32,15 +32,17 @@
>> # with its own defaults
>> 
>> # Checks for programs.
>> -AC_PROG_CC
>> -AC_PROG_CXX
>> +#AC_PROG_CC
>> +#AC_PROG_CXX
>> +AC_PROG_CXX([mpic++ mpicxx])
>> 
>> +
>> # Check for openMP. If found the OPENMP_CXXFLAGS is set automatically
>> AC_OPENMP
>> AC_SUBST(AM_CXXFLAGS, "$OPENMP_CFLAGS")
>> 
>> #AM_CXXFLAGS="-static  -g -ggdb -I../libs/include -I./libs/include $AM_CXXFLAGS"
>> -AM_CXXFLAGS="-static -std=c++11 -O3 -I../libs/include -I./libs/include $AM_CXXFLAGS"
>> +AM_CXXFLAGS=" -std=c++11 -O3 -I../libs/include -I./libs/include $AM_CXXFLAGS"
>> # Checks for libraries.
>> # pthread library
>> AC_SEARCH_LIBS([pthread_mutex_init], [pthread], [],
>> 
>> Modified: pkg/OmicABELnoMM/examples/CreateData.R
>> ===================================================================
>> --- pkg/OmicABELnoMM/examples/CreateData.R    2014-10-17 14:57:34 UTC (rev 1836)
>> +++ pkg/OmicABELnoMM/examples/CreateData.R    2014-10-21 13:30:17 UTC (rev 1837)
>> @@ -2,11 +2,11 @@
>> 
>> library(DatABEL)
>> 
>> -n = 4000 # number of individuals
>> +n = 2000 # number of individuals
>> l = 3    # number of covariates+1 for intercept
>> int = 3
>> r = 2
>> -m = r*1000 # number of snps
>> +m = r*100000 # number of snps
>> t = 10000  # number of traits
>> var=0.05
>> 
>> 
>> Modified: pkg/OmicABELnoMM/src/AIOwrapper.cpp
>> ===================================================================
>> --- pkg/OmicABELnoMM/src/AIOwrapper.cpp    2014-10-17 14:57:34 UTC (rev 1836)
>> +++ pkg/OmicABELnoMM/src/AIOwrapper.cpp    2014-10-21 13:30:17 UTC (rev 1837)
>> @@ -45,7 +45,10 @@
>>     Fhandler->dosage_skip = 0;
>>     Fhandler->fileR = 0;
>>     Fhandler->add_dosages = false;
>> +    Fhandler->initial_file_pos = 0;
>> +    Fhandler->mpi_id = params.mpi_id;
>> 
>> +    int ar_names_toskip = 0;
>> 
>> 
>>     if(!Fhandler->fakefiles)
>> @@ -82,9 +85,32 @@
>>         Fhandler->fileN = params.n;
>>         realN = params.n;
>> 
>> -
>>         params.m = min((int)(ARfvi->fvi_header.numVariables/params.r),params.limit_m/params.r);
>> 
>> +        if(params.mpi_num_threads > 1 && params.m >= params.mpi_num_threads)
>> +        {
>> +            int m_mpi = params.m/params.mpi_num_threads;
>> +            ar_names_toskip = m_mpi;
>> +            Fhandler->initial_file_pos = (m_mpi*params.mpi_id) * params.r * params.n;
>> +            //cout << Fhandler->initial_file_pos << endl;
>> +            //corner case for non whole divisions of data among mpi procs
>> +            if((params.mpi_num_threads-1) == params.mpi_id)
>> +            {
>> +                //portion+rest
>> +                params.m = m_mpi + (params.m-params.mpi_num_threads* m_mpi);
>> +            }
>> +            else
>> +            {
>> +                params.m = m_mpi;
>> +            }
>> +
>> +            Fhandler->fnameOutFiles += "_mpi";
>> +            Fhandler->fnameOutFiles += std::to_string(params.mpi_id+1);
>> +            Fhandler->fnameOutFiles += "_";
>> +        }
>> +
>> +
>> +
>>         #ifdef DEBUG
>>         cout << "calcM:" << params.m << endl;
>>         #endif
>> @@ -120,16 +146,19 @@
>>             ARidName = string(&(ARfvi->fvi_data[yname_idx]));
>>             if(YidNames.compare(ALidName))
>>             {
> 
> 
> The if statements below are a pain to my eyes :-). Due to the formatting
> (missing indentation and/or missing curly braces) it gives the
> impression that both the cout and the i = statement are carried out if
> the condition is true (or maybe even that nothing happens in the
> if-statement if you take the lack of indentation seriously).
> 
> At the very least I would indent the cout statement, but given that it
> is a long line anyway (which are best avoided), I would add {}s and
> break the cout statement over multiple lines.
> 
>> +                if(params.mpi_id == 0)
>>                 cout << "Warning, ID names between -p file and -c file do not coincide! The files must have the same meaningful order" << endl;
>>                 i = params.n;//exit
>>             }
>>             if(YidNames.compare(ARidName))
>>             {
>> +                if(params.mpi_id == 0)
>>                 cout << "Warning, ID names between -p file and -g file do not coincide! The files must have the same meaningful order" << endl;
>>                 i = params.n;//exit
>>             }
>>             if(ARidName.compare(ALidName))
>>             {
>> +                if(params.mpi_id == 0)
>>                 cout << "Warning, ID names between -g file and -c file do not coincide! The files must have the same meaningful order" << endl;
>>                 i = params.n;//exit
>>             }
>> @@ -203,9 +232,7 @@
>> 
>>     if(!Fhandler->fakefiles)
>>     {
>> -
>>         removeALmissings(Fhandler->excl_List,params,Almissings);
>> -
>>     }
>> 
>> 
>> @@ -319,6 +346,7 @@
>>             string INTidName = string(&(Ifvi->fvi_data[name_idx]));
>>             if(ARidName.compare(INTidName))
>>             {
> 
> See note about if statements above.
> 
> 
>> +                if(params.mpi_id == 0)
>>                 cout << "Warning, ID names between -g file and interactions file do not coincide! The files must have the same meaningful order!" << endl;
>>                 i = params.n;//exit
>>             }
>> @@ -433,7 +461,7 @@
>>                 }
>>             }
>>         }
>> -        if(interaction_missings)
> 
> This if statement is indented as it should be, although I would prefer
> breaking the long line.
> 
>> +        if(interaction_missings && params.mpi_id == 0)
>>             cout << "Excluding " << interaction_missings << " from Interaction missings" << endl;
>>         #ifdef DEBUG
>>         cout << params.mb << " " << params.r << " " << params.m << endl;
>> @@ -462,6 +490,7 @@
>>         //!******nameGATHER****************
>> 
>>         int Aname_idx=Fhandler->fileN*ARfvi->fvi_header.namelength;//skip the names of the rows
>> +        Aname_idx += ARfvi->fvi_header.namelength*Fhandler->fileR*params.mpi_id*ar_names_toskip; // skipnot mine according to mpi
>>         if(Fhandler->use_dosages && Fhandler->add_dosages)
>>         {
>>             for(int i = 0; i < Fhandler->fileM; i++)
>> @@ -563,8 +592,11 @@
>> 
>>         if(params.l > 255 || params.p > 255 || params.n > 65535)//can remove if fixed for output files
>>         {
>> +            if(params.mpi_id == 0)
>> +            {
>>             cout << "Warning, output binary format does not yet support current problem sizes for the provided p, l, r and n." << endl;
>>             cout << "Omitting outputfile." << endl;
>> +            }
>>         }
>> 
>>         if(params.storeBin)
>> @@ -770,9 +802,15 @@
>>     ofstream fp_allResults;
>> 
>>     int y_file_pos = 0;
>> -    int ar_file_pos = 0;
>> 
>> +    int initial_file_pos = Fhandler->initial_file_pos;
>> 
>> +    int ar_file_pos = initial_file_pos;
>> +
>> +
>> +
>> +
>> +
>>     int seq_count;
>>     int max_secuential_write_count= 10;
>> 
>> @@ -1028,12 +1066,12 @@
>> 
>>             if(Fhandler->fakefiles)
>>             {
>> -                fp_Ar.seekg ( 0 ,  ios::beg  );
>> -                fp_Ar.read ((char*)tobeFilled->buff,sizeof(type_precision)*size_buff);
>> +//                fp_Ar.seekg ( 0 ,  ios::beg  );
>> +//                fp_Ar.read ((char*)tobeFilled->buff,sizeof(type_precision)*size_buff);
>> +//
>> +//                re_random_vec(tobeFilled->buff , Fhandler->n * tmp_ar_blockSize*Fhandler->r );
>> +//                re_random_vec_nan(tobeFilled->buff , Fhandler->n * tmp_ar_blockSize*Fhandler->r );
>> 
>> -                re_random_vec(tobeFilled->buff , Fhandler->n * tmp_ar_blockSize*Fhandler->r );
>> -                re_random_vec_nan(tobeFilled->buff , Fhandler->n * tmp_ar_blockSize*Fhandler->r );
>> -
>>             }
>>             else
>>             {
>> @@ -1220,8 +1258,8 @@
>>             if(Fhandler->Ar_to_readSize <= 0)
>>             {
>>                 Fhandler->Ar_to_readSize = Fhandler->Ar_Amount;
>> -                ar_file_pos = 0;
>> -                fp_Ar.seekg(0, ios::beg);
>> +                ar_file_pos = initial_file_pos;
>> +                fp_Ar.seekg(ar_file_pos, ios::beg);
>> 
>>                 if(fp_Ar.fail())
>>                 {
>> @@ -1864,7 +1902,7 @@
>>     }
>> 
>> 
>> -    if(Almissings > 0)
>> +    if(Almissings > 0 && (params.mpi_id == 0))
>>         cout << "Excluding " << Almissings << " from covariate missings" << endl;
>> 
>>     delete tempAL;
>> @@ -2067,7 +2105,8 @@
>> 
>>     int first,second;
>> 
>> -     cout << "Excluding Ids: \n";
>> +    if(Fhandler->mpi_id == 0)
>> +        cout << "Excluding Ids: \n";
>> 
>>       while (std::getline(fp_exL, line) && second < n )
>>       {
>> @@ -2076,6 +2115,7 @@
>>             iss >> first;
>> 
>>             iss >> second;
>> +            if(Fhandler->mpi_id == 0)
>>             cout << first << "-" << second << ", ";
>>             if(first > second)
>>             {
>> @@ -2108,6 +2148,7 @@
>>        cout << "Total Ids: " << n << "\nExcluded Ids: " << excl_count << endl;
>>        exit( 1 );
>>    }
>> +    if(Fhandler->mpi_id == 0)
>>     cout << "Excluded: "  << excl_count << " Using: " << n-excl_count<< endl;
>> 
>> 
>> 
>> Modified: pkg/OmicABELnoMM/src/AIOwrapper.h
>> ===================================================================
>> --- pkg/OmicABELnoMM/src/AIOwrapper.h    2014-10-17 14:57:34 UTC (rev 1836)
>> +++ pkg/OmicABELnoMM/src/AIOwrapper.h    2014-10-21 13:30:17 UTC (rev 1837)
>> @@ -37,8 +37,10 @@
>>     string fnameOutFiles;
>>     string fname_dosages;
>> 
>> +    int mpi_id;
>> 
>> 
>> +
>>     list< pair<int,int> >* excl_List;
>> 
>> 
>> @@ -74,6 +76,9 @@
>>     queue<type_buffElement*> ar_empty_buffers;
>>     queue<type_buffElement*> ar_full_buffers;
>> 
>> +    //MPI related vars
>> +    int initial_file_pos;
>> +
>>     int index;
>>     int fileN;
>>     int fileR;
>> @@ -198,6 +203,7 @@
>>         int realN;
>> 
>> 
>> +
>>     protected:
>> 
>>     private:
>> 
>> Modified: pkg/OmicABELnoMM/src/Algorithm.cpp
>> ===================================================================
>> --- pkg/OmicABELnoMM/src/Algorithm.cpp    2014-10-17 14:57:34 UTC (rev 1836)
>> +++ pkg/OmicABELnoMM/src/Algorithm.cpp    2014-10-21 13:30:17 UTC (rev 1837)
>> @@ -401,6 +401,8 @@
>>     params.threads = 1;
>>     params.r = 1;
>>     params.model = -1;
>> +    params.mpi_id = 0;
>> +    params.mpi_num_threads = 1;
>> 
>>     params.use_interactions = false;
>>     params.keep_depVar = false;
>> @@ -491,11 +493,12 @@
>> 
>> 
>>     lda = n;
>> -    if(!params.ForceCheck )
>> +    if(!params.ForceCheck && params.mpi_id == 0)
>>     {
>>         cout << endl;
>> -    }
>> -    for (int j = 0; j < y_iters && !params.ForceCheck; j++)
>> +    }
>> +
>> +    for (int j = 0; j < y_iters && !params.ForceCheck && params.mpi_id == 0; j++)
>>     {
>>         if (!params.ForceCheck &&  ((y_iters < 10 && y_iters > 2 )|| (y_iters >= 10 && (j%(y_iters/10)) == 0 )) )
>>         {
>> @@ -514,7 +517,7 @@
>>         }
>>     }
>> 
>> -    if(!params.ForceCheck)
>> +    if(!params.ForceCheck && params.mpi_id == 0)
>>         cout << endl;
>> 
>>     //add memalign
>> @@ -611,22 +614,52 @@
>>     out.acc_stl += ticks2sec(end_tick,start_tick2);
>> 
>> 
>> +    //linear dep of columns check
> 
> Have you ever considered using C++ vectors instead of arrays? They save
> you the trouble of new/delete, for example. I have no idea if there is a
> performance hit, however. But I can imagine that combined with iterators
> they may be well-optimised.
> Or do you simply use them because LAPACK needs arrays anyway?
> 
>> +    float* singular_vals = new float[l];
>> +    float* u = new float[n*n];
>> +    float* vt = new float[l*l];
>> +    float* superb = new float[l-1];
>> +//    for (int j = 0; j < n; j++)
>> +//    {
>> +//        AL[j] = AL[j+n];
>> +//    }
>> 
>> 
>> +    info = LAPACKE_sgesvd(STORAGE_TYPE, 'N','N',n,l,AL,n,singular_vals,u,n,vt,l,superb);info++;
>> 
>> +    for (int j = 0; j < l; j++)
>> +    {
>> +        if(singular_vals[j] < 0.000001)
>> +        {
>> +            if(params.mpi_id == 0)
>> +            {
>> +                cout << endl;
>> +                cout << "Error, the covariate matrix contains linearly dependent columns. Please review and change it accordingly." << singular_vals[j];
>> +                cout << "Make sure there are no duplciates or closesly realted covariates (complement).";
>> +            }
>> +            exit(1);
>> +        }
>> +    }
>> 
>> +    delete []singular_vals;
>> +    delete []u;
>> +    delete []vt;
>> +    delete []superb;
>> +
>> 
>> 
>>     float* Y;
>> 
>> -    // printf("\n\n%%Computations\n%%");
>> +    // printf("\n\n%%Computations\n%%");
>> +
>> +    copy_vec(backupAL, AL, n*l);
>> 
>> 
>>     get_ticks(start_tick);
>> 
>>     for (int j = 0; j < y_iters; j++)
>>     {
>> -        if (!params.ForceCheck &&  ((y_iters < 10 && y_iters > 2 )|| (y_iters >= 10 && (j%(y_iters/10)) == 0 )) )
>> +        if (!params.ForceCheck && params.mpi_id == 0 && ((y_iters < 10 && y_iters > 2 )|| (y_iters >= 10 && (j%(y_iters/10)) == 0 )) )
>>         {
>>             cout << AIOfile.io_overhead << flush;
>>             AIOfile.io_overhead = "*";
>> @@ -685,7 +718,7 @@
>>         for (int i = 0; i < a_iters; i++)
>>         {
>> 
>> -            if (!params.ForceCheck && y_iters <= 2 &&
>> +            if (!params.ForceCheck && y_iters <= 2 && params.mpi_id == 0 &&
>>                 (  (a_iters >= 10 && (i%(a_iters/(10))) == 0) || (a_iters < (10)) ))
>>             {
>>                 cout << AIOfile.io_overhead << flush;
>> 
>> Modified: pkg/OmicABELnoMM/src/Definitions.h
>> ===================================================================
>> --- pkg/OmicABELnoMM/src/Definitions.h    2014-10-17 14:57:34 UTC (rev 1836)
>> +++ pkg/OmicABELnoMM/src/Definitions.h    2014-10-21 13:30:17 UTC (rev 1837)
>> @@ -27,7 +27,8 @@
>> #include <omp.h>
>> #include <pthread.h>
>> #include <utility>
>> -#include <iomanip>
>> +#include <iomanip>
>> +#include <mpi.h>
>> 
>> #ifdef WINDOWS
>>     #include <windows.h>
>> @@ -170,7 +171,9 @@
>>     int mb;
>>     int id;
>> 
>> -    int threads;
>> +    int threads;
>> +    int mpi_id;
>> +    int mpi_num_threads;
>> 
>>     bool use_fake_files;
>> 
>> 
>> Modified: pkg/OmicABELnoMM/src/main.cpp
>> ===================================================================
>> --- pkg/OmicABELnoMM/src/main.cpp    2014-10-17 14:57:34 UTC (rev 1836)
>> +++ pkg/OmicABELnoMM/src/main.cpp    2014-10-21 13:30:17 UTC (rev 1837)
>> @@ -18,39 +18,41 @@
>>                         -d <0.0~1.0> -r <-10.0~1.0> -b -s <0.0~1.0>  -e <-10.0~1.0> -i -f";
>> 
>> string helpcmd_expl =
> 
> 
> Note that you can use the information generated by autotools for version
> info. If you include the config.h file that is generated you can do it
> like we did in ProbABEL, for example in src/usage.cpp:
> 
> void print_version(void) {
>    cout << PACKAGE
>         << " v. " << PACKAGE_VERSION
>         << "\n(C) Yurii Aulchenko, Lennart C. Karssen, Maarten Kooyman, "
>         << "Maksim Struchalin, The GenABEL team, EMC Rotterdam\n\n";
>    cout << "Using EIGEN version " << EIGEN_WORLD_VERSION
>         << "." << EIGEN_MAJOR_VERSION << "." << EIGEN_MINOR_VERSION
>         << " for matrix operations\n";
> }
> 
> The PACKAGE and PACKAGE_VERSION variables comes from config.h, and are
> generated by autotools. OmicABELnoMM's src/config.h contains them as well.
> 
> 
>> -"omicabelnomm Version 0.95b \n\t\
>> +"omicabelnomm Version 0.96b \n\t\
>> Required: \n\t\
>> --p --phe    \t <path/filename> to the inputs containing phenotypes \n\t\
>> --g --geno   \t <path/filename> to the inputs containing genotypes \n\t\
>> --c --cov    \t <path/filename> to the inputs containing covariates \n\t\
>> --o --out    \t <path/filename> to store the output to (used for all .txt and .ibin & .dbin) \n\n\
>> +-p --phe    \t <path/filename> to the inputs containing phenotypes. \n\t\
>> +-g --geno   \t <path/filename> to the inputs containing genotypes. \n\t\
>> +-c --cov    \t <path/filename> to the inputs containing covariates. \n\t\
>> +-o --out    \t <path/filename> to store the output to (used for all .txt and .ibin & .dbin). \n\n\
>> Optional: \n\t\
>> --n --ngpred \t <#SNPcols> Number of columns in the geno file that represent a single SNP \n\t\
>> --t --thr    \t <#CPUs> Number of computing threads to use to speed computations \n\t\
>> --x --excl   \t <path/filename> file containing list of individuals to exclude from input files, (see example file) \n\t\
>> +-n --ngpred \t <#SNPcols> Number of columns in the geno file that represent a single SNP. \n\t\
>> +-t --thr    \t <#CPUs> Number of computing threads to use to speed computations. Recommended is 4-8 per node (see MPI). \n\t\
>> +-x --excl   \t <path/filename> file containing list of individuals to exclude from input files, (see example file). \n\t\
>> -d --pdisp  \t <0.0~1.0> Value to use as maximum threshold for significance.\n\t\
>> -\t\t Results with P-values UNDER this threshold will be displayed in the putput .txt file \n\t\
>> +\t\t Results with P-values UNDER this threshold will be displayed in the putput .txt file. \n\t\
>> -r --rdisp  \t <-10.0~1.0> Value to use as minimum threshold for R2. \n\t\
>> -\t\t Results with R2-values ABOVE this threshold will be displayed in the putput .txt file \n\t\
>> --b --stobin \t Flag that forces to ALSO store results in a smaller binary format (*.ibin & *.dbin) \n\t\
>> --s --psto   \t <0.0~1.0>  Results with P-values UNDER this threshold will be displayed in the putput binary files \n\t\
>> --e --rsto   \t <-10.0~1.0> Results with R2-values ABOVE this threshold will be stored in the putput binary files \n\t\
>> --i --fdcov  \t Flag that forces to include covariates as part of the results that are stored in .txt and binary files \n\t\
>> +\t\t Results with R2-values ABOVE this threshold will be displayed in the putput .txt file. \n\t\
>> +-b --stobin \t Flag that forces to ALSO store results in a smaller binary format (*.ibin & *.dbin). \n\t\
>> +-s --psto   \t <0.0~1.0>  Results with P-values UNDER this threshold will be displayed in the putput binary files. \n\t\
>> +-e --rsto   \t <-10.0~1.0> Results with R2-values ABOVE this threshold will be stored in the putput binary files. \n\t\
>> +-i --fdcov  \t Flag that forces to include covariates (when its genotype is significant) as part of the results stored \n\t\
>> -f --fdgen  \t Flag that forces to consider all included results (causes the analisis to ignores ALL threshold values). \n\t\
>> --j --additive  \t Flag that runs the analisis with an Additive Model with (2*AA,1*AB,0*BB) effects \n\t\
>> --k --dominant  \t Flag that runs the analisis with an Dominant Model with (1*AA,1*AB,0*BB) effects \n\t\
>> --l --recessive \t Flag that runs the analisis with an Recessive Model with (1*AA,0*AB,0*BB) effects \n\t\
>> +-j --additive  \t Flag that runs the analisis with an Additive Model with (2*AA,1*AB,0*BB) effects. \n\t\
>> +-k --dominant  \t Flag that runs the analisis with an Dominant Model with (1*AA,1*AB,0*BB) effects. \n\t\
>> +-l --recessive \t Flag that runs the analisis with an Recessive Model with (1*AA,0*AB,0*BB) effects. \n\t\
>> -z --mylinear \t <path/filename> to read Factors 'f_i' for a Custom Linear Model with f1*X1,f2*X2,f3*X3...fn*X_ngpred as effects,\n\t\
>>               \t each column of each independent variable will be multiplied with the specified factors. \n\t\
>> -              \t Formula: y~alpha*cov + beta_1*f1*X1 + beta_2*f2*X2 +...+ beta_n*fn*Xn, (see example files!) \n\t\
>> +              \t Formula: y~alpha*cov + beta_1*f1*X1 + beta_2*f2*X2 +...+ beta_n*fn*Xn, (see example files!). \n\t\
>> -y --myaddit  \t <path/filename> to read Factors 'f_i' for a Custom Additive Model with (f1*X1,f2*X2,f3*X3...fn*X_ngpred) as effects,\n\t\
>>               \t each column of each independent variable will be multiplied with the specified factors and then added together. \n\t\
>> -              \t Formula: y~alpha*cov + beta*(f1*X1 + f2*X2 +...+ fn*Xn), (see example files!) \n\t\
>> --v --simpleinter <path/filename> to read the interactions from; for single analysis using multile interactions \n\t\
>> --w --multinter \t <path/filename> to read the interactions from; for multiple analysis using single interaction per analysis \n\t\
>> +              \t Formula: y~alpha*cov + beta*(f1*X1 + f2*X2 +...+ fn*Xn), (see example files!). \n\t\
>> +-v --simpleinter <path/filename> to read the interactions from; for single analysis using multile interactions. \n\t\
>> +-w --multinter \t <path/filename> to read the interactions from; for multiple analysis using single interaction per analysis. \n\t\
>> -u --keepinter \t Flag that sets if the interaction analysis chose is to too keep the dependent variable X.\n\t\
>> -            \t If set, Formula: y~alpha*cov + beta_1*INT*X + beta_2*X, (see example files!) \n\t\
>> -            \t Default not set, Formula: y~alpha*cov + beta_1*INT*X, (see example files!) \n\t\
>> +            \t If set, Formula: y~alpha*cov + beta_1*INT*X + beta_2*X, (see example files!). \n\t\
>> +            \t Default not set, Formula: y~alpha*cov + beta_1*INT*X, (see example files!). \n\t\n\t\
>> +            \t Support for MPI is available. Simply use mpirun -np <#nodes> omicabelnomm <params> on an Open-MPI enabled computer/cluster.\n\t\
>> +            \t Recommended is to use MPI when dealing with problems with over 2000 genotypes, at a rate of 1 node per 2000 genotypes.\n\t\
>> ";
>> 
>> 
>> @@ -128,7 +130,8 @@
>>             break;
>> 
>>         if (!optarg && (c != 'f' && c != 'b' && c != 'i' && c != 'h'))
>> -        {
>> +        {
> 
> 
> Below are a couple of confusing if statement again.
> 
> 
> 
>> +            if(params.mpi_id == 0)
>>             cout << "\nerror with argument parameter " << (char)c << endl;
>>             exit(1);
>>         }
>> @@ -152,7 +155,7 @@
>>             pos = string(optarg).find(".");
>>             if (pos != string::npos)
>>                 params.fnameY = string(optarg).substr(0, pos);
>> -
>> +            if(params.mpi_id == 0)
>>             cout << "-p Reading phenotypes from file " << optarg << endl;
>>             break;
>> 
>> @@ -163,13 +166,14 @@
>>            pos = string(optarg).find(".");
>>             if (pos != string::npos)
>>                 params.fnameAR = string(optarg).substr(0, pos);
>> -
>> +            if(params.mpi_id == 0)
>>             cout << "-g Reading with genotype data from file " << optarg << endl;
>>             break;
>> 
>>          case 'n':
>>             params.r = atoi(optarg);
>> -            params.r = max(params.r, 1);
>> +            params.r = max(params.r, 1);
>> +            if(params.mpi_id == 0)
>>             cout << "-n Using columns per snp as " << params.r << endl;
>>             break;
>> 
>> @@ -180,7 +184,7 @@
>>             pos = string(optarg).find(".");
>>             if (pos != string::npos)
>>                 params.fnameAL = string(optarg).substr(0, pos);
>> -
>> +            if(params.mpi_id == 0)
>>             cout << "-c Reading covariates from file " << optarg << endl;
>>             break;
>> 
>> @@ -191,98 +195,99 @@
>>             pos = string(optarg).find(".");
>>             if (pos != string::npos)
>>                 params.fnameOutFiles = string(optarg).substr(0, pos);
>> -
>> +            if(params.mpi_id == 0)
>>             cout << "-o Writing output files to " << optarg << endl;
>>             break;
>> 
>>         case 't':
>>             params.threads = atoi(optarg);
>> -            params.threads = min(max(params.threads, 1), 32);
>> +            params.threads = min(max(params.threads, 1), 32);
>> +            if(params.mpi_id == 0)
>>             cout << "-t Using " << params.threads << " CPU threads for parallelism" <<  endl;
>>             break;
>> 
>>          case 'x':
>>             params.fname_excludelist = string(optarg);
>> -
>> +            if(params.mpi_id == 0)
>>             cout << "-x Excluding ids on " << params.fname_excludelist << endl;
>>             break;
>> 
>> 
>>          case 'd':
>>             params.minPdisp = fabs(atof(optarg));
>> -
>> +            if(params.mpi_id == 0)
>>             cout << "-d Significance to display in .txt will be P-val's below " << params.minPdisp << endl;
>>             break;
>> 
>>         case 's':
>>             params.minPstore = fabs(atof(optarg));
>> -
>> +            if(params.mpi_id == 0)
>>             cout << "-s Significance to store in .bin will be P-val's below " << params.minPstore << endl;
>>             break;
>> 
>>         case 'r':
>>             params.minR2disp = (atof(optarg));
>> -
>> +            if(params.mpi_id == 0)
>>             cout << "-r Minimum R2 to display in .txt will be above " << params.minR2disp << endl;
>>             break;
>> 
>>         case 'e':
>>             params.minR2store =  (atof(optarg));
>> -
>> +            if(params.mpi_id == 0)
>>             cout << "-e Minimum R2 to store in .bin will be above " << params.minR2store << endl;
>>             break;
>> 
>>         case 'i':
>>             params.disp_cov = true;
>> -
>> -            cout << "-i Covariate results will be included in results" << endl;
>> +            if(params.mpi_id == 0)
>> +            cout << "-i Covariate results (significant) will be included in results whenever their respective snps are also significant" << endl;
>>             break;
>> 
>>         case 'f':
>>             params.storePInd = true;
>> -
>> +            if(params.mpi_id == 0)
>>             cout << "-f Forcing all included results to be considered independently of max P-val or min R2. (SLOW!)"<< endl;
>>             break;
>> 
>>         case 'j':
>>             params.model = 0;
>>             params.dosages = true;
>> -
>> +            if(params.mpi_id == 0)
>>             cout << "-j Using Additive Model with (2*AA,1*AB,0*BB) effects"<< endl;
>>             break;
>> 
>>         case 'k':
>>             params.model = 1;
>>             params.dosages = true;
>> -
>> +            if(params.mpi_id == 0)
>>             cout << "-k Using Dominant Model with (1*AA,1*AB,0*BB) effects"<< endl;
>>             break;
>> 
>>         case 'l':
>>             params.model = 2;
>>             params.dosages = true;
>> -
>> +            if(params.mpi_id == 0)
>>             cout << "-j Using Recessive Model with (1*AA,0*AB,0*BB) effects"<< endl;
>>             break;
>> 
>>         case 'z':
>>             params.model = 3;
>>             params.dosages = true;
>> -
>> +            if(params.mpi_id == 0)
>>             cout << "-z Using Custom Linear Model with parameters read from the file "<< params.fname_dosages << endl;
>>             break;
>> 
>>         case 'y':
>>             params.model = 4;
>>             params.dosages = true;
>> -
>> +            if(params.mpi_id == 0)
>>             cout << "-z Using Custom Additive Model with parameters read from the file "<< params.fname_dosages << endl;
>>             break;
>> 
>>         case 'b':
>>             params.storeBin = true;
>> 
>> -
>> +            if(params.mpi_id == 0)
>>             cout << "-b Results will be stored in binary format too"<< endl;
>>             break;
>> 
>> @@ -291,7 +296,7 @@
>>             params.use_interactions = true;
>>             params.keep_depVar = true;
>>             params.use_multiple_interaction_sets = false;
>> -
>> +            if(params.mpi_id == 0)
>>             cout << "-v File containing single interactions " << params.fname_interactions << endl;
>>             break;
>> 
>> @@ -301,13 +306,14 @@
>>             params.use_interactions = true;
>>             params.keep_depVar = true;
>>             params.use_multiple_interaction_sets = true;
>> -
>> +            if(params.mpi_id == 0)
>>             cout << "-w File containing multiple interactions " << params.fname_interactions << endl;
>>             break;
>> 
>> 
>>         case 'u':
>> -            params.keep_depVar = true;
>> +            params.keep_depVar = true;
>> +            if(params.mpi_id == 0)
>>             cout << "-u Keeping independent variable for interaction analysis " <<  endl;
>>             break;
>> 
>> @@ -356,16 +362,31 @@
>> 
>> int main(int argc, char *argv[] )
>> {
>> -
>> +
>>     struct Settings params;
>>     Algorithm alg;
>> 
>> -
>>     //!default params
>>     alg.applyDefaultParams(params);
>> 
>> +    //!MPI
>> +    int size, rank;
>> +    MPI_Init(&argc, &argv);
>> +    MPI_Comm_size(MPI_COMM_WORLD, &size);
>> +    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
>> +    //printf("SIZE = %d RANK = %d\n",size,rank);
>> +    params.mpi_id = rank;
>> +    params.mpi_num_threads = size;
>> +
>> +
>> 
>> +
>> +
>> +
>> +
>> +
>> 
>> +
>>     parse_params(argc, argv, params);
>> 
>> 
>> @@ -375,26 +396,27 @@
>> 
>>     omp_set_num_threads(params.threads);
>>     blas_set_num_threads(params.threads);
>> -    cout << "Omicabelnomm Version 0.95b \n\t\";
> 
> And one for the config.h variables.
> 
> 
> 
>> +    if(params.mpi_id == 0)
>> +        cout << "Omicabelnomm Version 0.96b \n";
>> 
>> 
>>     //params.use_fake_files = true;
>> -    if (params.use_fake_files)
>> -    {
>> -        int multiplier = 1000;
>> +//    if (params.use_fake_files)
>> +//    {
>> +//        int multiplier = 1000;
>> +//
>> +//        params.n = 4 * multiplier;
>> +//        params.l = 3;
>> +//        params.r = 1;
>> +//
>> +//        params.t  = 1 * multiplier;
>> +//        params.tb = 1 * multiplier;
>> +//
>> +//        params.m  = 1 * multiplier;
>> +//        params.mb = 1 * multiplier;
>> +//    }
>> 
>> -        params.n = 4 * multiplier;
>> -        params.l = 3;
>> -        params.r = 1;
>> 
>> -        params.t  = 1 * multiplier;
>> -        params.tb = 1 * multiplier;
>> -
>> -        params.m  = 1 * multiplier;
>> -        params.mb = 1 * multiplier;
>> -    }
>> -
>> -
>>     if(!help_request)
>>     {
>>         struct Outputs out = {0};
>> @@ -409,7 +431,10 @@
>>         cout << endl << "Number of Significant Results: " << out.total_sig_results <<  endl;
>>     }
>> 
>> -    cout << endl;
>> +    cout << endl;
>> +
>> +    //!MPI
>> +    MPI_Finalize();
>> 
>>     return 0;
>> }
>> 
>> Modified: pkg/OmicABELnoMM/tests/test.cpp
>> ===================================================================
>> --- pkg/OmicABELnoMM/tests/test.cpp    2014-10-17 14:57:34 UTC (rev 1836)
>> +++ pkg/OmicABELnoMM/tests/test.cpp    2014-10-21 13:30:17 UTC (rev 1837)
>> @@ -81,6 +81,12 @@
>> {
>>     struct Settings params;
>> 
>> +    //!MPI
>> +    int size, rank;
>> +    MPI_Init(&argc, &argv);
>> +    MPI_Comm_size(MPI_COMM_WORLD, &size);
>> +    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
>> +    //printf("SIZE = %d RANK = %d\n",size,rank);
>> 
>> 
>>     omp_set_nested(false);
>> @@ -92,7 +98,7 @@
>> 
>> 
>> 
>> -    int max_threads = 2;
>> +    int max_threads = 1;
>> 
>>     params.threads = max_threads;
>> 
>> @@ -132,7 +138,10 @@
>>     params.minR2disp = 0.000001;
>>     params.disp_cov = true;
>> 
>> +    params.mpi_id = rank;
>> +    params.mpi_num_threads = size;
>> 
>> +
>>     params.r = 2;
>>     params.fnameOutFiles="examples/results/normal";
>>     params.fnameAL="examples/XL";
>> @@ -144,8 +153,11 @@
>>     alg.solve(params, out, P_NEQ_B_OPT_MD);
>>     print_output(out, gemm_gflopsPsec);
>> 
>> -    exit(0);
>> +    //!MPI
>> +    MPI_Finalize();
>> 
>> +    //exit(0);
>> +
>>      cout << "\nInteraction Tests\n" << flush;
>>     //!-------------------------------------
>>     //!-------------------------------------
>> @@ -282,7 +294,7 @@
>>     }
>> 
>> 
>> -
>> +    exit(0);
>>     //!-------------------------------------
>>     alg.applyDefaultParams(params);
>>     memset(&out, 0, sizeof(out));
>> 
>> _______________________________________________
>> Genabel-commits mailing list
>> Genabel-commits at lists.r-forge.r-project.org
>> https://lists.r-forge.r-project.org/cgi-bin/mailman/listinfo/genabel-commits
> 
> 
> 
> Thanks a lot for your effort,
> 
> Lennart.
> 
> 
> -- 
> *-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*
> L.C. Karssen
> Utrecht
> The Netherlands
> 
> lennart at karssen.org
> http://blog.karssen.org
> GPG key ID: A88F554A
> -*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-
> 
> _______________________________________________
> genabel-devel mailing list
> genabel-devel at lists.r-forge.r-project.org
> https://lists.r-forge.r-project.org/cgi-bin/mailman/listinfo/genabel-devel


More information about the genabel-devel mailing list