From sharapovsodbo at gmail.com  Mon Jul  1 09:51:04 2013
From: sharapovsodbo at gmail.com (Sodbo Sharapov)
Date: Mon, 1 Jul 2013 14:51:04 +0700
Subject: [GenABEL-dev] bug_in_OmicABEL_reshuffle_fixed

Dear all!

I fixed a bug in OmicABEL_reshuffle. The bug only affected big data sets: for large output data the value of tile_coordinate exceeds max(int).

For example, for data with 1080 ids and 122756 SNPs:

max(tile_coordinate) = 1080 (ids) * 122756 (SNPs) * 8 (sizeof(double)) * 5 (columns: beta_1, se_1, beta_SNP, se_SNP, etc.) = 5 303 059 200
max(int) = 2 147 483 647
max(unsigned int) = 4 294 967 295

Both limits are lower than max(tile_coordinate). That is why the tile_coordinates for about half of the data were incorrect and meaningless.

The solution is to change the type of the tile_coordinate variables: I chose int64_t instead of int.
max(int64_t) = 9,223,372,036,854,775,807. I think this is enough! =)

Now "reshuffle" works correctly with big data. Compilation for Linux and Windows was successful.

-- 
With best regards

Sodbo Zh. Sharapov
Phone: +79831347688
Email: sharapovsodbo at gmail.com
       sharapov at bionet.nsc.ru
Skype: sharapovsodbo

From lennart at karssen.org  Mon Jul  1 11:06:08 2013
From: lennart at karssen.org (L.C. Karssen)
Date: Mon, 01 Jul 2013 11:06:08 +0200
Subject: [GenABEL-dev] bug_in_OmicABEL_reshuffle_fixed
Message-ID: <51D14680.4000908@karssen.org>

Thanks Sodbo, good work!

I've got a similar feature request/bug report for ProbABEL. Do you know what the effect of going from unsigned int to int64 will be on memory usage? In the case of ProbABEL it is mostly about the counters for SNPs and samples/IDs, so my guess is that it wouldn't be much of an increase (only a few extra bits for those counters); all the allelic dosages/probabilities are stored as doubles, so that won't change.
Of course going from unsigned int to int64 will mean people can load more data at the same time, but in my opinion it is their responsibility to have enough free memory (if they don't, ProbABEL will fail with an allocation error).

Thanks,

Lennart.

-- 
-----------------------------------------------------------------
L.C. Karssen
Utrecht
The Netherlands

lennart at karssen.org
http://blog.karssen.org

Please do not send me Word or PowerPoint files!
See http://www.gnu.org/philosophy/no-word-attachments.nl.html
------------------------------------------------------------------
From yurii.aulchenko at gmail.com  Mon Jul  1 13:59:32 2013
From: yurii.aulchenko at gmail.com (Yurii Aulchenko)
Date: Mon, 1 Jul 2013 13:59:32 +0200
Subject: [GenABEL-dev] [Genabel-commits] r1264 - in pkg/OmicABEL: . doc src src/float2double
In-Reply-To: <20130701085630.EE01B18070E@r-forge.r-project.org>
References: <20130701085630.EE01B18070E@r-forge.r-project.org>
Message-ID: <5083713419523740697@unknownmsgid>

Diego,

thanks for reacting so quickly and arranging the float2double converter for filevector files!

Two questions/suggestions:

1) I wonder if float2double is a good name - could it be that the name is already taken? Should we be more specific that this is related to filevector?

2) You check whether inFile is != float and abort execution if so. Should the program also report what format the data is in? E.g. "The inFile contains filevector-INT, but I can only convert filevector-FLOAT to filevector-DOUBLE"?

These are suggestions for discussion - I do not have a strong opinion here.

YA

----------------------
Yurii Aulchenko
(sent from mobile device)

On 1 Jul 2013, at 10:56, "noreply at r-forge.r-project.org" wrote:

> Author: dfabregat
> Date: 2013-07-01 10:56:30 +0200 (Mon, 01 Jul 2013)
> New Revision: 1264
>
> Added:
>    pkg/OmicABEL/src/float2double/
>    pkg/OmicABEL/src/float2double/float2double.c
> Modified:
>    pkg/OmicABEL/Makefile
>    pkg/OmicABEL/doc/HOWTO
> Log:
>    Adding the program float2double to translate DatABEL
>    "float" data into DatABEL "double" data.
>
>
> Modified: pkg/OmicABEL/Makefile
> ===================================================================
> --- pkg/OmicABEL/Makefile 2013-07-01 08:50:00 UTC (rev 1263)
> +++ pkg/OmicABEL/Makefile 2013-07-01 08:56:30 UTC (rev 1264)
> @@ -2,8 +2,10 @@
>
>  SRCDIR = ./src
>  RESH_SRCDIR = ./src/reshuffle
> +F2D_SRCDIR = ./src/float2double
>  CLAKGWAS = ./bin/CLAK-GWAS
>  RESHUFFLE = ./bin/reshuffle
> +F2D = ./bin/float2double
>
>  #QUICK and DIRTY
>  CXX=g++
> @@ -15,11 +17,13 @@
>  SRCS = $(SRCDIR)/CLAK_GWAS.c $(SRCDIR)/fgls_chol.c $(SRCDIR)/fgls_eigen.c $(SRCDIR)/wrappers.c $(SRCDIR)/timing.c $(SRCDIR)/statistics.c $(SRCDIR)/REML.c $(SRCDIR)/optimization.c $(SRCDIR)/ooc_BLAS.c $(SRCDIR)/double_buffering.c $(SRCDIR)/utils.c $(SRCDIR)/GWAS.c $(SRCDIR)/databel.c
>  OBJS = $(SRCS:.c=.o)
>  RESH_SRCS=$(RESH_SRCDIR)/main.cpp $(RESH_SRCDIR)/iout_file.cpp $(RESH_SRCDIR)/Parameters.cpp $(RESH_SRCDIR)/reshuffle.cpp $(RESH_SRCDIR)/test.cpp
> -RESH_OBJS = $(RESH_SRCS:.cpp=.o)
> +RESH_OBJS=$(RESH_SRCS:.cpp=.o)
> +F2D_SRCS=$(F2D_SRCDIR)/float2double.c
> +F2D_OBJS=$(F2D_SRCS:.c=.o) $(SRCDIR)/databel.o $(SRCDIR)/wrappers.o
>
>  .PHONY: all clean
>
> -all: ./bin/ $(CLAKGWAS) $(RESHUFFLE)
> +all: ./bin/ $(CLAKGWAS) $(RESHUFFLE) $(F2D)
>
>  ./bin:
>  	mkdir bin
> @@ -31,15 +35,19 @@
>  	cd $(RESH_SRCDIR)
>  	$(CXX) $^ -o $@
>
> +$(F2D): $(F2D_OBJS)
> +	cd $(F2D_SRCDIR)
> +	$(CC) $^ -o $@
> +
>  # Dirty, improve
>  platform=Linux
>  bindistDir=OmicABEL-$(platform)-bin
> -bindist: ./bin/ $(CLAKGWAS) $(RESHUFFLE)
> +bindist: ./bin/ $(CLAKGWAS) $(RESHUFFLE) $(F2D)
>  	rm -rf $(bindistDir)
>  	mkdir $(bindistDir)
>  	mkdir $(bindistDir)/bin/
>  	mkdir $(bindistDir)/doc/
> -	cp -a $(CLAKGWAS) $(RESHUFFLE) $(bindistDir)/bin/
> +	cp -a $(CLAKGWAS) $(RESHUFFLE) $(F2D) $(bindistDir)/bin/
>  	cp -a COPYING LICENSE README DISCLAIMER.$(platform) $(bindistDir)
>  	cp -a doc/README-reshuffle doc/INSTALL doc/HOWTO $(bindistDir)/doc
>  	tar -czvf $(bindistDir).tgz $(bindistDir)
> @@ -52,6 +60,8 @@
>  	$(RM) $(SRCDIR)/*opari_GPU*
>  	$(RM) $(RESH_OBJS)
>  	$(RM) $(RESHUFFLE)
> +	$(RM) $(F2D_OBJS)
> +	$(RM) $(F2D)
>
>
>  src/CLAK_GWAS.o: src/CLAK_GWAS.c src/wrappers.h src/utils.h src/GWAS.h \
>
> Modified: pkg/OmicABEL/doc/HOWTO
> ===================================================================
> --- pkg/OmicABEL/doc/HOWTO 2013-07-01 08:50:00 UTC (rev 1263)
> +++ pkg/OmicABEL/doc/HOWTO 2013-07-01 08:56:30 UTC (rev 1264)
> @@ -5,6 +5,9 @@
>
>  * CLAK-GWAS: the program to run GWAS analyses (through CLAK-Chol or CLAK-Eig)
>  * reshuffle: the program to extract the output of CLAK-GWAS into text format
> +* float2double: the program to translate databel files (*.fvi, *.fvd)
> +                in single precision "float" format into double precision
> +                "double" format.
>
>  The output produced by CLAK-GWAS is kept in a compact binary format
>  for performance reasons. The user can then use "reshuffle" to
> @@ -21,6 +24,10 @@
>
>  http://www.genabel.org/packages/OmicABEL
>
> +If you already prepared your data in DatABEL format, but you used
> +single precision (float) data. You can make use of float2double
> +to transform it into double precision (double) data.
> +
>  If you need help, please contact us, or use the GenABEL project forum
>
>  http://forum.genabel.org
> @@ -40,7 +47,7 @@
>
>  The example in the tutorial also provides a basic example on using OmicABEL
>  to run your GWAS analyses. Here we detail the options of CLAK-GWAS.
> -The complete list of options for CLAK-GWAS is avaliable through the command
> +The complete list of options for CLAK-GWAS is available through the command
>
>  ./CLAK-GWAS -h
>
> @@ -84,3 +91,5 @@
>
>
>  For a detailed description of "reshuffle", please refer to doc/README-reshuffle
> +
> +
>
> Added: pkg/OmicABEL/src/float2double/float2double.c
> ===================================================================
> --- pkg/OmicABEL/src/float2double/float2double.c (rev 0)
> +++ pkg/OmicABEL/src/float2double/float2double.c 2013-07-01 08:56:30 UTC (rev 1264)
> @@ -0,0 +1,142 @@
> +/*
> + * Copyright (c) 2010-2013, Diego Fabregat-Traver and Paolo Bientinesi.
> + * All rights reserved.
> + *
> + * This file is part of OmicABEL.
> + *
> + * OmicABEL is free software: you can redistribute it and/or modify
> + * it under the terms of the GNU General Public License as published by
> + * the Free Software Foundation, either version 3 of the License, or
> + * (at your option) any later version.
> + *
> + * OmicABEL is distributed in the hope that it will be useful,
> + * but WITHOUT ANY WARRANTY; without even the implied warranty of
> + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
> + * GNU General Public License for more details.
> + *
> + * You should have received a copy of the GNU General Public License
> + * along with OmicABEL.  If not, see <http://www.gnu.org/licenses/>.
> + *
> + *
> + * Coded by:
> + *   Diego Fabregat-Traver (fabregat at aices.rwth-aachen.de)
> + */
> +
> +#include <stdio.h>
> +#include <stdlib.h>
> +#include <string.h>
> +
> +#include "../wrappers.h"
> +#include "../databel.h"
> +
> +#define MB (1L<<20)
> +#define STR_BUFFER_SIZE 256
> +
> +int main( int argc, char *argv[] )
> +{
> +    char fin_path_fvi[STR_BUFFER_SIZE],
> +         fin_path_fvd[STR_BUFFER_SIZE],
> +         fout_path_fvi[STR_BUFFER_SIZE],
> +         fout_path_fvd[STR_BUFFER_SIZE];
> +    FILE *fin, *fout;
> +    struct databel_fvi *databel_in, *databel_out;
> +
> +    float *datain;
> +    double *dataout;
> +    size_t buff_size = 256*MB;
> +
> +    long long int nelems;
> +    int nelems_in_buff, nelems_to_write;
> +    int header_data_size;
> +
> +    int i, j, out;
> +
> +    if ( argc != 3 )
> +    {
> +        fprintf( stderr, "Usage: %s floatFileIn doubleFileOut\n", argv[0] );
> +        exit( EXIT_FAILURE );
> +    }
> +
> +    snprintf( fin_path_fvi, STR_BUFFER_SIZE, "%s.fvi", argv[1] );
> +    snprintf( fin_path_fvd, STR_BUFFER_SIZE, "%s.fvd", argv[1] );
> +    snprintf( fout_path_fvi, STR_BUFFER_SIZE, "%s.fvi", argv[2] );
> +    snprintf( fout_path_fvd, STR_BUFFER_SIZE, "%s.fvd", argv[2] );
> +
> +    // FVI files
> +    databel_in = load_databel_fvi( fin_path_fvi );
> +    if ( databel_in->fvi_header.type != FLOAT_TYPE )
> +    {
> +        fprintf( stderr, "Input databel file(s) %s should include \"float\" data\n", argv[1]);
> +        exit( EXIT_FAILURE );
> +    }
> +    databel_out = (databel_fvi *) fgls_malloc( sizeof(databel_fvi) );
> +    // Header
> +    databel_out->fvi_header.type = DOUBLE_TYPE;
> +    databel_out->fvi_header.nelements = databel_in->fvi_header.nelements;
> +    databel_out->fvi_header.numObservations = databel_in->fvi_header.numObservations;
> +    databel_out->fvi_header.numVariables = databel_in->fvi_header.numVariables;
> +    databel_out->fvi_header.bytesPerRecord = sizeof( double );
> +    databel_out->fvi_header.bitsPerRecord = databel_out->fvi_header.bytesPerRecord * 8;
> +    databel_out->fvi_header.namelength = databel_in->fvi_header.namelength;
> +    for ( i = 0; i < RESERVEDSPACE; i++ )
> +        databel_out->fvi_header.reserved[i] = '\0';
> +    // Labels
> +    header_data_size = (databel_out->fvi_header.numVariables + databel_out->fvi_header.numObservations ) *
> +                        databel_out->fvi_header.namelength * sizeof(char);
> +    databel_out->fvi_data = (char *) fgls_malloc ( header_data_size );
> +    memcpy( databel_out->fvi_data, databel_in->fvi_data, header_data_size );
> +
> +    // Write
> +    fout = fgls_fopen( fout_path_fvi, "wb" );
> +    out = fwrite( &databel_out->fvi_header, sizeof(databel_fvi_header), 1, fout);
> +    if ( out != 1 )
> +    {
> +        fprintf(stderr, "Error writing fvi header\n" );
> +        exit( EXIT_FAILURE );
> +    }
> +    out = fwrite( databel_out->fvi_data,
> +                  databel_out->fvi_header.namelength * sizeof(char),
> +                  databel_out->fvi_header.numVariables + databel_out->fvi_header.numObservations,
> +                  fout);
> +    if ( out != (databel_out->fvi_header.numVariables + databel_out->fvi_header.numObservations) )
> +    {
> +        fprintf(stderr, "Error writing fvi data\n" );
> +        exit( EXIT_FAILURE );
> +    }
> +    fclose( fout );
> +
> +    // FVD
> +    fin = fgls_fopen( fin_path_fvd, "rb" );
> +    fout = fgls_fopen( fout_path_fvd, "wb" );
> +    // buff_size determines the size of the buffer for the "double" array.
> +    // For the same amount of elements, float needs half the memory space
> +    datain = (float *) fgls_malloc( buff_size / 2 );
> +    dataout = (double *) fgls_malloc( buff_size );
> +
> +    nelems = databel_out->fvi_header.numVariables * databel_out->fvi_header.numObservations; // total elems in file
> +    nelems_in_buff = buff_size / sizeof(double);
> +    for ( i = 0; i < nelems; i += nelems_in_buff )
> +    {
> +        nelems_to_write = ((nelems - i) >= nelems_in_buff) ? nelems_in_buff : nelems - i;
> +        if ( fread( datain, sizeof(float), nelems_to_write, fin ) != nelems_to_write )
> +        {
> +            fprintf( stderr, "Error reading data from %s\n", fin_path_fvd );
> +            exit( EXIT_FAILURE );
> +        }
> +        for ( j = 0; j < nelems_to_write; j++ )
> +            dataout[j] = (double)datain[j];
> +        if ( fwrite( dataout, sizeof(double), nelems_to_write, fout ) != nelems_to_write )
> +        {
> +            fprintf( stderr, "Error writing data to %s\n", fout_path_fvd );
> +            exit( EXIT_FAILURE );
> +        }
> +    }
> +    fclose( fin );
> +    fclose( fout );
> +    free( datain );
> +    free( dataout );
> +    free_databel_fvi( &databel_in );
> +    free_databel_fvi( &databel_out );
> +
> +    return 0;
> +}
>
> _______________________________________________
> Genabel-commits mailing list
> Genabel-commits at lists.r-forge.r-project.org
> https://lists.r-forge.r-project.org/cgi-bin/mailman/listinfo/genabel-commits

From yurii.aulchenko at gmail.com  Mon Jul  1 14:40:01 2013
From: yurii.aulchenko at gmail.com (Yurii Aulchenko)
Date: Mon, 1 Jul 2013 14:40:01 +0200
Subject: [GenABEL-dev] bug_in_OmicABEL_reshuffle_fixed

Thanks, Sodbo - it does pass my test now! :)

This is actually very good - I was so depressed not seeing any association, then happy to discover a bug, and now even more happy to see quite a few significant hits!

YA

-- 
-----------------------------------------------------
Yurii S. Aulchenko

[ LinkedIn ] [ Twitter ] [ Blog ]

From yurii.aulchenko at gmail.com  Mon Jul  1 14:50:44 2013
From: yurii.aulchenko at gmail.com (Yurii Aulchenko)
Date: Mon, 1 Jul 2013 14:50:44 +0200
Subject: [GenABEL-dev] update of OmicABEL binaries on genabel.org

Dear Diego,

can you please compile the (updated) OmicABEL for Linux and push the bin-dist to genabel.org?

Before that - should we also change the version number so people do not get confused?

YA

From sharapovsodbo at gmail.com  Mon Jul  1 16:52:17 2013
From: sharapovsodbo at gmail.com (Sodbo Sharapov)
Date: Mon, 1 Jul 2013 21:52:17 +0700
Subject: [GenABEL-dev] bug_in_OmicABEL_reshuffle_fixed

Thank you, Lennart and Yurii =)

> I've got a similar feature request/bug report for ProbABEL, do you know
> what the effect of going from unsigned int to int64 will be on memory
> usage?

int64_t uses 8 bytes instead of the 4 bytes used by int.
In the case of "reshuffle", there are now only three int64_t variables.
As you can see, memory size is not a problem.
However, during "reshuffling" the tile_coordinates are computed many times (about once per 5-10 doubles of data).
So reshuffle's runtime for the data set [1080 traits; 122756 SNPs; 5 columns] is now about 21 sec (measured for the --chi=25 operation).
Before the correction the runtime was about 16 sec, i.e. faster than now.

PS: I found some other bugs in reshuffle (with --heritabilities) and also ways to optimize the work with big data. I'll fix them as soon as possible.

From sharapovsodbo at gmail.com  Mon Jul  1 17:39:18 2013
From: sharapovsodbo at gmail.com (Sodbo Sharapov)
Date: Mon, 1 Jul 2013 22:39:18 +0700
Subject: [GenABEL-dev] OmicABEL_float2double_compilation_failed

Hello!

I have a problem compiling float2double on Linux:

lima@mga:~/Sodbo/Packages/OmicABEL/src/float2double$ gcc float2double.c -Wall -o float2double
float2double.c: In function 'main':
float2double.c:67: error: expected expression before ')' token
float2double.c:74: error: expected expression before ';' token

and the same on Windows:

gcc float2double.c
float2double.c: In function 'main':
float2double.c:67:49 error: expected expression before ')' token
float2double.c:74:44: error: expected expression before ';' token

As far as I can judge, the problem is in these expressions:

1) if ( databel_in->fvi_header.type != FLOAT_TYPE ){}
2) databel_out->fvi_header.type = DOUBLE_TYPE;
From yurii.aulchenko at gmail.com  Mon Jul  1 21:13:25 2013
From: yurii.aulchenko at gmail.com (Yurii Aulchenko)
Date: Mon, 1 Jul 2013 21:13:25 +0200
Subject: [GenABEL-dev] OmicABEL_float2double_compilation_failed

Sodbo - please check the Makefile - it looks like float2double makes use of other source files as well!

YA
From yurii.aulchenko at gmail.com  Tue Jul  2 09:27:34 2013
From: yurii.aulchenko at gmail.com (Yurii Aulchenko)
Date: Tue, 2 Jul 2013 09:27:34 +0200
Subject: [GenABEL-dev] bug_in_OmicABEL_reshuffle_fixed

On Mon, Jul 1, 2013 at 4:52 PM, Sodbo Sharapov wrote:

> PS: I found some another bugs in reshuffle(with --heritabilities) and
> ,also, ways to optimized work with big data. As soon as possible, I'll do
> it.

Yep, I noticed that the outputs of --heritabilities are a bit strange (some small negatives for parameters which should be positive) :) Keep us posted!

YA

From fabregat at aices.rwth-aachen.de  Tue Jul  2 10:56:01 2013
From: fabregat at aices.rwth-aachen.de (Diego Fabregat Traver)
Date: Tue, 02 Jul 2013 10:56:01 +0200
Subject: [GenABEL-dev] OmicABEL_float2double_compilation_failed

Hi Sodbo,

thanks for the report. I didn't commit databel.h, which defines and assigns a value to the datatype identifiers.

It should work now. Please let me know.

Best,
Diego

From fabregat at aices.rwth-aachen.de  Tue Jul  2 10:59:28 2013
From: fabregat at aices.rwth-aachen.de (Diego Fabregat Traver)
Date: Tue, 02 Jul 2013 10:59:28 +0200
Subject: [GenABEL-dev] OmicABEL_float2double_compilation_failed

On 01/07/13, Yurii Aulchenko wrote:

> Sodbo - please check the Makefile - it looks like float2double make use of other source files as well!

This is also true. With that compile line you will get linking errors. On Linux, typing make in OmicABEL's root directory should work fine.

From sharapovsodbo at gmail.com  Tue Jul  2 11:03:58 2013
From: sharapovsodbo at gmail.com (Sodbo Sharapov)
Date: Tue, 2 Jul 2013 16:03:58 +0700
Subject: [GenABEL-dev] OmicABEL_float2double_compilation_failed

Great! float2double now compiles successfully! Thank you!

From yurii.aulchenko at gmail.com  Tue Jul  2 11:22:48 2013
From: yurii.aulchenko at gmail.com (Yurii Aulchenko)
Date: Tue, 2 Jul 2013 11:22:48 +0200
Subject: [GenABEL-dev] [Genabel-commits] r1267 - pkg/OmicABEL/src
In-Reply-To: <20130702085259.3ECDE184468@r-forge.r-project.org>
References: <20130702085259.3ECDE184468@r-forge.r-project.org>
Message-ID: <6167880795671206958@unknownmsgid>

Diego,

I understand this file is part of filevector. In that case, might it be better to have a symlink instead of a hard copy? This is what we do for, say, DatA, MixA and GenA.

Y

----------------------
Yurii Aulchenko
(sent from mobile device)

On 2 Jul 2013, at 10:53, "noreply at r-forge.r-project.org" wrote:

> Author: dfabregat
> Date: 2013-07-02 10:52:58 +0200 (Tue, 02 Jul 2013)
> New Revision: 1267
>
> Modified:
>    pkg/OmicABEL/src/databel.h
> Log:
>    Defining DatABEL datatypes and their associated value
>    for *.fvi headers.
>
>
> Modified: pkg/OmicABEL/src/databel.h
> ===================================================================
> --- pkg/OmicABEL/src/databel.h 2013-07-01 12:55:37 UTC (rev 1266)
> +++ pkg/OmicABEL/src/databel.h 2013-07-02 08:52:58 UTC (rev 1267)
> @@ -25,14 +25,14 @@
>  #ifndef DATABEL_H
>  #define DATABEL_H
>
> -#define UNSIGNED_SHORT_INT_TYPE
> -#define SHORT_INT_TYPE
> -#define UNSIGNED_INT_TYPE
> -#define INT_TYPE
> -#define FLOAT_TYPE
> -#define DOUBLE_TYPE
> -#define SIGNED_CHAR_TYPE
> -#define UNSIGNED_CHAR_TYPE
> +enum datatype{ UNSIGNED_SHORT_INT_TYPE = 1,
> +               SHORT_INT_TYPE,
> +               UNSIGNED_INT_TYPE,
> +               INT_TYPE,
> +               FLOAT_TYPE,
> +               DOUBLE_TYPE,
> +               SIGNED_CHAR_TYPE,
> +               UNSIGNED_CHAR_TYPE };
>
>  #define NAMELENGTH 32
>  #define RESERVEDSPACE 5

From fabregat at aices.rwth-aachen.de  Tue Jul  2 11:45:17 2013
From: fabregat at aices.rwth-aachen.de (Diego Fabregat Traver)
Date: Tue, 02 Jul 2013 11:45:17 +0200
Subject: [GenABEL-dev] [Genabel-commits] r1267 - pkg/OmicABEL/src

On 02/07/13, Yurii Aulchenko wrote:

> Diego,
>
> I understand this file is the part of filevector. In that may it be
> better to have a symlink instead of hard copy? - this is what we do
> for say DatA, MixA and GenA.

I am not sure what you mean by "part of". If you mean a copy of a file from filevector, it is not. If you mean related, yes it is.

databel.{c,h} in OmicABEL is just a small module with a couple of utilities:

https://r-forge.r-project.org/scm/viewvc.php/pkg/OmicABEL/src/databel.h?view=markup&root=genabel
https://r-forge.r-project.org/scm/viewvc.php/pkg/OmicABEL/src/databel.c?view=markup&root=genabel

Diego

From yurii.aulchenko at gmail.com Tue Jul 2 12:38:08 2013
From: yurii.aulchenko at gmail.com (Yurii Aulchenko)
Date: Tue, 2 Jul 2013 12:38:08 +0200
Subject: [GenABEL-dev] [Genabel-commits] r1267 - pkg/OmicABEL/src
In-Reply-To: 
References: 
Message-ID: 

Ah, OK, I thought it was a copy, sorry for the confusion.

In principle we should think of tighter integration OmicA-filevector/DatA,
but this is not something for 5 minutes :)

YA

-- 
-----------------------------------------------------
Yurii S. Aulchenko

[ LinkedIn ] [ Twitter ] [ Blog ]

From fabregat at aices.rwth-aachen.de Tue Jul 2 13:11:51 2013
From: fabregat at aices.rwth-aachen.de (Diego Fabregat Traver)
Date: Tue, 02 Jul 2013 13:11:51 +0200
Subject: [GenABEL-dev] layout of GenABEL main page
Message-ID: 

On 28/06/13, Yurii Aulchenko wrote:
> How do you like this one?

I like it a lot.

What do you think about reducing the font size for the subtitle
and right-justifying it? Would it still be readable? I liked that
detail from the previous attempts with the "Project" subtitle.

In any case, this is just a minor detail. It looks great as it is.

Thanks to Grant Borodin!

> YA
>
> On Thu, Jun 27, 2013 at 1:16 PM, Yurii Aulchenko wrote:
> > Dear Nicola, Diego, Lennart,
> >
> > Thanks for your feedback! I will ask Grant Borodin, who kindly designed
> > these logos, if he could change C according to your comment (capital
> > "ABEL" and "statistical genomics" as in F).
> >
> > Yurii
> >
> > On Wed, Jun 26, 2013 at 4:16 PM, Diego Fabregat Traver wrote:
> > > Congrats to whoever designed these logos, they look very nice :)
> > >
> > > With respect to my preferences, I fully agree with Lennart: "C with
> > > capital ABEL and statistical genomics below it" would be my choice.
> > >
> > > Best,
> > > Diego
> > >
> > > On 20/06/13, "L.C. Karssen" wrote:
> > > > Wow! Those look really nice!
> > > >
> > > > I like options C and F the most.
> > > > Actually a combination would be even
> > > > better IMHO: use C with capital ABEL and statistical genomics below it.
> > > >
> > > > Looking forward to hearing the opinion of others,
> > > >
> > > > Lennart.
> > > >
> > > > On 20-06-13 09:34, Yurii Aulchenko wrote:
> > > > > Please find attached a few more logo variants
> > > > >
> > > > > Yurii

From kooyman at gmail.com Tue Jul 2 14:10:53 2013
From: kooyman at gmail.com (Maarten Kooyman)
Date: Tue, 02 Jul 2013 14:10:53 +0200
Subject: [GenABEL-dev] layout of GenABEL main page
In-Reply-To: 
References: 
Message-ID: <51D2C34D.2000907@gmail.com>

Dear all,

It looks really nice! Credit to whoever made it. However, I have more the
impression that it looks like a polypeptide chain or a rosary. The
seventies font is a matter of taste, but it reminds me of Comic Sans
(including an upside-down e as an a). I wonder if it is readable if you
print it on a poster: I think this is an important use case for a
scientific logo.

Kind regards,

Maarten

From yurii.aulchenko at gmail.com Tue Jul 2 14:38:39 2013
From: yurii.aulchenko at gmail.com (Yurii Aulchenko)
Date: Tue, 2 Jul 2013 14:38:39 +0200
Subject: [GenABEL-dev] layout of GenABEL main page
In-Reply-To: <51D2C34D.2000907@gmail.com>
References: <51D2C34D.2000907@gmail.com>
Message-ID: 

Dear All,

I agree with Maarten's critique, and I am actually still not sure whether I
like Maarten's or Grant's idea better. An interesting thing (not sure all
realize it) is that Grant's variant is his vision of Maarten's prototype :)

However, Grant's variant has an important advantage: it is ready to serve
as a logo. And I actually want to use a logo in my slides for UseR!-2013.

So I suggest we take Grant's logo as a working variant. No doubt the
logo is going to evolve with time, like anything we do in the project
(code, documentation); the logo is no different, I think. The element which
is going to stay and keep it recognizable is the way of spelling
GenABEL :) Like the gnu's horns in the GNU logo.

What we can do next is to place an open call on the site/forum for other
users to contribute, but this is going to take time, and meanwhile I suggest
we stick with Grant's variant.

Yurii

-- 
-----------------------------------------------------
Yurii S. Aulchenko

[ LinkedIn ] [ Twitter ] [ Blog ]

From nicola.pirastu at burlo.trieste.it Tue Jul 2 16:27:48 2013
From: nicola.pirastu at burlo.trieste.it (Nicola Pirastu)
Date: Tue, 2 Jul 2013 16:27:48 +0200
Subject: [GenABEL-dev] layout of GenABEL main page
In-Reply-To: 
References: <51D2C34D.2000907@gmail.com>
Message-ID: <0177E59A-0CA1-4465-8186-A8EC79A20BB4@burlo.trieste.it>

Just to add my two cents to the discussion, I think that the problem is not
with the DNA helix but with the font. I've played around a bit with it, and
if you use, for example, Helvetica or something less Comic-Sans-like, it does
look better. Also, for some reason I'm still disturbed by the green, but that
is a very personal opinion.

Nicola

Dr. Nicola Pirastu PhD
Research Fellow
Medical Sciences, Chirurgical and Health Department
University of Trieste
Medical Genetics
IRCCS Burlo Garofolo
Via dell'Istria 65/1
34137 Italy
tel. +390403785539

CONFIDENTIALITY NOTICE
Confidential information may be contained in this message or in its
attachments. If you are not the addressee indicated in this message, or
responsible for message delivering to that person, or if you have received
this message in error, you may not transcribe, copy or deliver this message
to anyone. In that case, you should delete this message and its attachments.
Thank you.

From lennart at karssen.org Tue Jul 2 18:05:57 2013
From: lennart at karssen.org (L.C. Karssen)
Date: Tue, 02 Jul 2013 18:05:57 +0200
Subject: [GenABEL-dev] [Genabel-commits] r1267 - pkg/OmicABEL/src
In-Reply-To: 
References: 
Message-ID: <51D2FA65.7050102@karssen.org>

Dear all,

On 02-07-13 12:38, Yurii Aulchenko wrote:
> Ah, OK, I thought it was a copy, sorry for the confusion.
>
> In principle we should think of tighter integration
> OmicA-filevector/DatA, but this is not something for 5 minutes :)

Definitely not 5 minutes :-). My idea has always been to turn the
contents of the filevector directory in SVN into a (shared) library.
This would allow us to package it separately without the need to
copy/symlink the directory for each of the other packages. It would more
easily allow versioning of the filevector files.
If this all works, we can compile the other ABELs with a -lfilevector
option. For the user, however, it would mean that they need to install
two packages (e.g. for ProbABEL they would need to install the
filevector library and ProbABEL itself) unless we distribute the various
"filevector.so" files with the other packages.

Of course, any ideas/opinions on this are welcome!

Lennart.

-- 
-----------------------------------------------------------------
L.C. Karssen
Utrecht
The Netherlands

lennart at karssen.org
http://blog.karssen.org

Please do not send me Word or PowerPoint files!
See http://www.gnu.org/philosophy/no-word-attachments.nl.html
------------------------------------------------------------------

From yurii.aulchenko at gmail.com Tue Jul 2 21:27:09 2013
From: yurii.aulchenko at gmail.com (Yurii Aulchenko)
Date: Tue, 2 Jul 2013 21:27:09 +0200
Subject: [GenABEL-dev] [Genabel-commits] r1267 - pkg/OmicABEL/src
In-Reply-To: <51D2FA65.7050102@karssen.org>
References: <51D2FA65.7050102@karssen.org>
Message-ID: 

On Tue, Jul 2, 2013 at 6:05 PM, L.C. Karssen wrote:

> Definitely not 5 minutes :-). My idea has always been to turn the
> contents of the filevector directory in SVN into a (shared) library.
> This would allow us to package it separately without the need to
> copy/symlink the directory for each of the other packages. It would more
> easily allow versioning of the filevector files.
> If this all works, we can compile the other ABELs with a -lfilevector
> option. For the user, however, it would mean that they need to install
> two packages (e.g. for ProbABEL they would need to install the
> filevector library and ProbABEL itself) unless we distribute the various
> "filevector.so" files with the other packages.
>
> Of course, any ideas/opinions on this are welcome!

I like the 'library' idea very much; in a way I think this should be as
cool as what Lennart did to ProbABEL (autotools) :) Not sure that this will
be very easy to do, especially across different platforms, though. Also not
sure how that will work with CRAN submissions. I think we will gain useful
experience while solving the MixABEL case (which does use GSL, and is not
on CRAN).

YA

-- 
-----------------------------------------------------
Yurii S. Aulchenko

[ LinkedIn ] [ Twitter ] [ Blog ]

From yurii.aulchenko at gmail.com Wed Jul 3 22:44:58 2013
From: yurii.aulchenko at gmail.com (Yurii Aulchenko)
Date: Wed, 3 Jul 2013 22:44:58 +0200
Subject: [GenABEL-dev] joining the GenABEL project - what is the procedure?
In-Reply-To: <75533175-C431-46A8-8B57-AC12B67C968E@burlo.trieste.it>
References: <51916427.8020307@karssen.org> <75533175-C431-46A8-8B57-AC12B67C968E@burlo.trieste.it>
Message-ID: 

Dear All,

I just discovered that we already had this discussion (from a slightly
different angle) a couple of years ago, see
https://lists.r-forge.r-project.org/pipermail/genabel-devel/2011-October/000367.html

In the light of that discussion I have added the questions

* Legal issues
** Is there a clear (standard) license?
** Is the license GNU GPL-compatible?

to the reviewer's list.

YA

On Thu, Jun 27, 2013 at 11:10 AM, Nicola Pirastu <nicola.pirastu at burlo.trieste.it> wrote:

> Sorry for the late reply.
>
> Great! I'll look at the comments and modify the package accordingly.
>
> I'll let you know when I'm done.
>
> Thanks a lot.
>
> Best
>
> Nicola
>
> On 20 Jun 2013, at 23:33, Yurii Aulchenko <yurii.aulchenko at gmail.com> wrote:
>
> OK, the draft review form (and a review of RegionABEL) is completed at
> http://piratepad.net/9ExdfmuJHV
>
> Definitely not all comments are equal: some are minor/suggestive,
> others more important.
>
> Nicola, maybe you can reply directly in that form to my concerns.
>
> I would really like someone else to do the next round of review. I
> feel I am really biased. Any volunteers?
>
> YA
>
> On Thu, Jun 20, 2013 at 1:09 AM, Yurii Aulchenko <yurii.aulchenko at gmail.com> wrote:
>
>> FYI, I started drafting more detailed reviewers' instructions
>> (http://piratepad.net/9ExdfmuJHV) and am going to apply this template to
>> Nicola's package. A few questions will pop up on the way, I am sure.
>>
>> YA
>>
>> On Tue, May 28, 2013 at 8:52 AM, Nicola Pirastu <nicola.pirastu at burlo.trieste.it> wrote:
>>
>>> Hi,
>>>
>>> I think this is a very good plan. As for time, I think a couple of
>>> months is fine; I still need to do some work to demonstrate that everything
>>> works fine (simulations, etc.). Actually, if someone would like to
>>> lend a hand on that side, he/she would be more than welcome :).
>>>
>>> I'll send you the code separately with a tutorial attached so we can
>>> get started.
>>>
>>> Best.
>>>
>>> Nicola
>>>
>>> On 28 May 2013, at 04:39, Yurii Aulchenko <yurii.aulchenko at gmail.com> wrote:
>>>
>>> I think it may indeed be a good idea to start with a 'case' and
>>> develop/tune the recommendations on the way. Nicola's new package would
>>> provide a good starting point (then we can actually think of a re-review of
>>> some of the packages which are in the GenABEL suite already).
>>>
>>> What about the following plan:
>>>
>>> 1) We (Nicola, Yurii, ...) draft reviewer's instructions (starting
>>> with points made during this discussion). I made a piratepad,
>>> http://piratepad.net/9ExdfmuJHV (at the moment simply a copy of Nicola's
>>> latest email); later we will circulate the draft on the list.
>>>
>>> 2) Take RegionABEL as an example (I am volunteering to be the 'test'
>>> reviewer), and explore this case to check the review procedure. Nicola,
>>> maybe you can send me the code already.
>>>
>>> 3) Ask an external person to act as a reviewer; this is for testing
>>> our reviewers' instructions.
>>>
>>> The whole process (esp. if we want to go for (3)) may take a couple of
>>> months. Nicola, how much of a hurry are you in with the publication?
>>>
>>> Yurii
>>>
>>> On Wed, May 22, 2013 at 2:55 PM, Nicola Pirastu <nicola.pirastu at burlo.trieste.it> wrote:
>>>
>>>> Dear all,
>>>>
>>>> I think that the best way we can discuss this is to start with a
>>>> real case. I would propose to start from the package
>>>> I've just written to run gene/region-wide analysis, which I've called
>>>> RegionABEL.
>>>>
>>>> It basically gives a gene-wide value with real or imputed data, with or
>>>> without kinship included. It is not for analyzing rare variants, so it is
>>>> not like SKAT. If you want to think of it in terms of existing software, it
>>>> is like VEGAS or plink-ave. The main advance is that, since it does not use
>>>> simulation/permutations to get p-values, it is much faster (4 hours on 1000G
>>>> data vs 12-16 for VEGAS on HapMap 2.5). The other great advantage is that it
>>>> does not require prior knowledge of LD, as other methods do.
>>>> I have a beta version of the package and I've written a tutorial to
>>>> explain how to use it.
>>>>
>>>> So how do you think we should proceed now? Should we ask some
>>>> volunteers to review it?
>>>>
>>>> Best.
>>>>
>>>> Nicola
>>>>
>>>> On 14 May 2013, at 00:07, L.C. Karssen <lennart at karssen.org> wrote:
>>>>
>>>> > Dear all,
>>>> >
>>>> > It's been a while but this mail was still on my todo list. I agree with
>>>> > Yurii that we should start establishing procedures for projects wanting
>>>> > to join the GenABEL project umbrella. Software lifecycle management is
>>>> > too often overlooked when developing a package and we don't want to
>>>> > 'degrade' the GenABEL project brand name by including packages that are
>>>> > not maintained anymore after the initial paper is published. Or, another
>>>> > argument I've come across: we make it open source so everyone can
>>>> > contribute to it (and therefore it will 'somehow' be maintained without
>>>> > us putting more effort into it). That's not how it works. The software
>>>> > ecosystem in which a package lives is dynamic and a package should adapt
>>>> > to that.
>>>> >
>>>> > As Yurii wrote, we discussed this at the EMGM conference and agreed that
>>>> > code review should be part of it. This neatly ties into the discussion
>>>> > we had on this list some time ago about coding standards. This does not
>>>> > mean we force everybody to use four spaces instead of eight when
>>>> > indenting code, but more serious stuff: variables named "a" or "df"
>>>> > are not helpful when someone wants to contribute or take over
>>>> > maintenance of the package.
>>>> >
>>>> > I've just committed the draft document of the coding standards to the
>>>> > www folder of the SVN repo (rev. 1215). It's a (plain text) Org-mode
>>>> > file; the HTML file is created from this Org file (using org-mode allows
>>>> > us to easily export the text in various formats). Those of you who want
>>>> > to convert without ever opening emacs can run the command
>>>> > emacs --batch --eval '(and (find-file "codingstyle.org")
>>>> > (org-export-as-html nil))'
>>>> > from the command line.
>>>> >
>>>> > Looking forward to your comments, both on this e-mail and the coding
>>>> > standards.
>>>> >
>>>> >
>>>> > Lennart.
>>>> >
>>>> > On 02-05-13 15:15, Yurii Aulchenko wrote:
>>>> >> Dear All,
>>>> >>
>>>> >> I have recently received several requests from people who would like to
>>>> >> join the GenABEL project with their software. Given this is a
>>>> >> community-based project, neither I nor anyone else is in a position to
>>>> >> say 'yes' or 'no'; we need to develop some procedure for how a piece of
>>>> >> software joins the project.
>>>> >>
>>>> >> We have discussed this with Nicola and Lennart during EMGM-2013, and we
>>>> >> think that we do need a technical review as a part of the procedure
>>>> >> (addressing the issues of license, clarity of the code, integration with
>>>> >> other packages, etc.).
We also need to think how we do maintenance: >>>> the >>>> >> suggestion would be to request that the author joins the forum and >>>> the >>>> >> list. If we see that a package is not actively maintained (e.g. we >>>> can >>>> >> not reach the maintainer), we should tag such a package as >>>> 'orphaned'. >>>> >> >>>> >> In many respects, we can base our procedure on the procedures >>>> developed >>>> >> by Bioconductor. In our procedures we need to achieve two conflicting >>>> >> goals: a) we do not want to repel potential contributors by a long >>>> list >>>> >> of technical requirements but at the same time b) in the sake of >>>> >> maintainability we need the code to comply to some requirements. >>>> >> Probably we should have 'minimal' and 'complete' requirements with >>>> >> packages clearly tagged on the web pages. >>>> >> >>>> >> Let us know what you think. I will initiate a PiratPad document after >>>> >> having initial response from you. >>>> >> >>>> >> best regards, >>>> >> YA >>>> >> >>>> >> >>>> >> _______________________________________________ >>>> >> genabel-devel mailing list >>>> >> genabel-devel at lists.r-forge.r-project.org >>>> >> >>>> https://lists.r-forge.r-project.org/cgi-bin/mailman/listinfo/genabel-devel >>>> >> >>>> > >>>> > -- >>>> > ----------------------------------------------------------------- >>>> > L.C. Karssen >>>> > Utrecht >>>> > The Netherlands >>>> > >>>> > lennart at karssen.org >>>> > http://blog.karssen.org >>>> > >>>> > Stuur mij aub geen Word of Powerpoint bestanden! 
>>>> > Zie http://www.gnu.org/philosophy/no-word-attachments.nl.html >>>> > ------------------------------------------------------------------ >>>> > >>>> > _______________________________________________ >>>> > genabel-devel mailing list >>>> > genabel-devel at lists.r-forge.r-project.org >>>> > >>>> https://lists.r-forge.r-project.org/cgi-bin/mailman/listinfo/genabel-devel >>>> >>>> AVVISO DI RISERVATEZZA Informazioni riservate possono essere >>>> contenute nel messaggio o nei suoi allegati. Se non siete i destinatari >>>> indicati nel messaggio, o responsabili per la sua consegna alla persona, o >>>> se avete ricevuto il messaggio per errore, siete pregati di non >>>> trascriverlo, copiarlo o inviarlo a nessuno. In tal caso vi invitiamo a >>>> cancellare il messaggio ed i suoi allegati. Grazie. CONFIDENTIALITY NOTICE >>>> Confidential information may be contained in this message or in its >>>> attachments. If you are not the addressee indicated in this message, or >>>> responsible for message delivering to that person, or if you have received >>>> this message in error, you may not transcribe, copy or deliver this message >>>> to anyone. In that case, you should delete this message and its >>>> attachments. Thank you. >>>> _______________________________________________ >>>> genabel-devel mailing list >>>> genabel-devel at lists.r-forge.r-project.org >>>> >>>> https://lists.r-forge.r-project.org/cgi-bin/mailman/listinfo/genabel-devel >>>> >>> >>> >>> >>> -- >>> ----------------------------------------------------- >>> Yurii S. Aulchenko >>> >>> [ LinkedIn ] [ Twitter] [ >>> Blog ] >>> >>> >>> AVVISO DI RISERVATEZZA Informazioni riservate possono essere contenute >>> nel messaggio o nei suoi allegati. Se non siete i destinatari indicati nel >>> messaggio, o responsabili per la sua consegna alla persona, o se avete >>> ricevuto il messaggio per errore, siete pregati di non trascriverlo, >>> copiarlo o inviarlo a nessuno. 
In tal caso vi invitiamo a cancellare il >>> messaggio ed i suoi allegati. Grazie. CONFIDENTIALITY NOTICE Confidential >>> information may be contained in this message or in its attachments. If you >>> are not the addressee indicated in this message, or responsible for message >>> delivering to that person, or if you have received this message in error, >>> you may not transcribe, copy or deliver this message to anyone. In that >>> case, you should delete this message and its attachments. Thank you. >>> >> >> >> >> -- >> ----------------------------------------------------- >> Yurii S. Aulchenko >> >> [ LinkedIn ] [ Twitter] [ >> Blog ] >> > > > > -- > ----------------------------------------------------- > Yurii S. Aulchenko > > [ LinkedIn ] [ Twitter] [ > Blog ] > _______________________________________________ > genabel-devel mailing list > genabel-devel at lists.r-forge.r-project.org > https://lists.r-forge.r-project.org/cgi-bin/mailman/listinfo/genabel-devel > > > AVVISO DI RISERVATEZZA Informazioni riservate possono essere contenute nel > messaggio o nei suoi allegati. Se non siete i destinatari indicati nel > messaggio, o responsabili per la sua consegna alla persona, o se avete > ricevuto il messaggio per errore, siete pregati di non trascriverlo, > copiarlo o inviarlo a nessuno. In tal caso vi invitiamo a cancellare il > messaggio ed i suoi allegati. Grazie. CONFIDENTIALITY NOTICE Confidential > information may be contained in this message or in its attachments. If you > are not the addressee indicated in this message, or responsible for message > delivering to that person, or if you have received this message in error, > you may not transcribe, copy or deliver this message to anyone. In that > case, you should delete this message and its attachments. Thank you. > -- ----------------------------------------------------- Yurii S. Aulchenko [ LinkedIn ] [ Twitter] [ Blog ] -------------- next part -------------- An HTML attachment was scrubbed... 
URL:

From yurii.aulchenko at gmail.com Fri Jul 5 11:04:16 2013
From: yurii.aulchenko at gmail.com (Yurii Aulchenko)
Date: Fri, 5 Jul 2013 11:04:16 +0200
Subject: [GenABEL-dev] presentation at UseR!-2013
Message-ID:

Dear All,

I am now drafting my presentation for UseR!-2013 (http://www.edii.uclm.es/~useR-2013/). My presentation, "The GenABEL suite for genome-wide association analyses", is scheduled for the morning of Wed July 10. I will send it to the list for discussion as soon as I have a draft (most likely by Saturday evening).

I thought it might be a good idea to present the evolution of GenABEL in numbers: get the figures per year or per quarter (say, #posts in 2009=x1, 2010=x2...) and present them graphically. For some growth metrics I can easily get the yearly dynamics, but for others I have no idea how, and I hope you can help (maybe also by providing the numbers directly).

Here is a short list of the metrics I thought of:

#packages: very easy to count :)
#posts on GenABEL-devel: possible to count
#posts on forum: no idea how to do that for defined time periods
#number of lines of code in our SVN repo: no idea
#citations (GenA, ProbA...): easy to count thanks to Google Scholar
#mentions on the Web: ???

Any other nice and easily computed metrics?

I will appreciate your help and suggestions, and sorry for the late notice.

best,
Yurii
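A minimal sketch of how the "#posts on GenABEL-devel" metric from the list above might be counted per year. It assumes the list archive has been downloaded as plain text in the usual pipermail layout, where every message starts with a separator line such as "From yurii.aulchenko at gmail.com  Fri Jul  5 11:04:16 2013". The function name posts_per_year and the sample text are illustrative only and are not part of any GenABEL tooling.

```python
import re
from collections import Counter

# Assumed pipermail separator line, e.g.
#   From yurii.aulchenko at gmail.com  Fri Jul  5 11:04:16 2013
# "From:" header lines do not match because the pattern requires "From ".
SEPARATOR = re.compile(
    r"^From \S+ at \S+\s+"                  # obfuscated sender address
    r"(?:Mon|Tue|Wed|Thu|Fri|Sat|Sun) "     # day of the week
    r"[A-Z][a-z]{2}\s+\d{1,2} "             # month name and day
    r"\d{2}:\d{2}:\d{2} (\d{4})$",          # time and four-digit year
    re.MULTILINE,
)


def posts_per_year(archive_text):
    """Count message separator lines per year in a pipermail text archive."""
    return Counter(int(year) for year in SEPARATOR.findall(archive_text))


# Tiny made-up archive: two posts in 2013, one in 2011.
sample = (
    "From a at b.org  Mon Jul  1 09:51:04 2013\n"
    "From: a at b.org (A)\n"
    "some body text\n"
    "From c at d.org  Fri Jul  5 11:04:16 2013\n"
    "From e at f.org  Tue Oct 11 10:00:00 2011\n"
)
print(sorted(posts_per_year(sample).items()))  # [(2011, 1), (2013, 2)]
```

Grouping by quarter instead of by year would only need the month name captured as well; the counting step stays the same.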
From yurii.aulchenko at gmail.com Fri Jul 5 11:12:56 2013
From: yurii.aulchenko at gmail.com (Yurii Aulchenko)
Date: Fri, 5 Jul 2013 11:12:56 +0200
Subject: [GenABEL-dev] presentation at UseR!-2013
In-Reply-To:
References:
Message-ID: <-2742659264291419413@unknownmsgid>

PS: the presentation will be a much-updated version of my previous presentation, http://mga.bionet.nsc.ru/~yurii/courses/ge03-2013/_GenABEL.pdf

If you have some relevant slides, I would greatly appreciate it if you could send them to me (of course, your contribution will be acknowledged).

Best,
Yurii

----------------------
Yurii Aulchenko
(sent from mobile device)

-------------- next part --------------
An HTML attachment was scrubbed...
URL: From m.kooijman at erasmusmc.nl Fri Jul 5 12:24:03 2013 From: m.kooijman at erasmusmc.nl (Maarten Kooyman) Date: Fri, 05 Jul 2013 12:24:03 +0200 Subject: [GenABEL-dev] presentation at UseR!-2013 In-Reply-To: References: Message-ID: <51D69EC3.5040000@erasmusmc.nl> Hi Yurri, You might try to install phpBB statistics. https://www.phpbb.com/customise/db/mod/phpbb_statistics/ Good luck! Maarten Kooyman Erasmus MC Department of Epidemiology Room Na27-18 Postbus 2040 3000 CA Rotterdam The Netherlands phone: +31-10-7038194 mobile: +31-6-28569364 e-mail: m.kooijman at erasmusmc.nl GPG key ID: AA2CAF11 On 07/05/2013 11:04 AM, Yurii Aulchenko wrote: > Dear All, > > I am now drafting my presentation for UseR!-2013 > (http://www.edii.uclm.es/~useR-2013/ > ). My presentation about "The > GenABEL suite for genome-wide association analyses" is scheduled for > Wed July 10 morning. I will send it to the list for the discussion as > soon as I have a draft (most likely by Saturday eve). > > I thought it may be a good idea to present the evolution of the > GenABEL in number, so the idea is to get the numbers by years/quartes > of the year (say, #posts in 2009=x1, 2010=x2...) and present them > graphically. For some of growth metrics I can get the dynamics by > years easily, but for some I have no idea and hope you could help me > (may be also by providing the numbers directly). > > Here a small list of metrics I thought of: > > #packages: very easy to count :) > #posts on GenABEL-devel: possible to count > #posts on forum: no idea how to do that for defined time periods > #number of lines of code in our SVN repo: no idea > #citations (GenA, ProbA...): easy to count thanks to Google Scholar > #mentions on the Web: ??? > > Any other nice and easily computed metrics? > > I will appreciate your help and suggestions, and sorry for late notice. 
>
> best,
> Yurii

From lennart at karssen.org Fri Jul 5 12:30:39 2013
From: lennart at karssen.org (L.C. Karssen)
Date: Fri, 05 Jul 2013 12:30:39 +0200
Subject: [GenABEL-dev] presentation at UseR!-2013
In-Reply-To:
References:
Message-ID: <51D6A04F.7050708@karssen.org>

Hi Yurii,

On 07/05/2013 11:04 AM, Yurii Aulchenko wrote:
> Dear All,
>
> I am now drafting my presentation for UseR!-2013 (
> http://www.edii.uclm.es/~useR-2013/). My presentation about "The GenABEL
> suite for genome-wide association analyses" is scheduled for Wed July 10
> morning. I will send it to the list for the discussion as soon as I have a
> draft (most likely by Saturday eve).
>
> I thought it may be a good idea to present the evolution of the GenABEL in
> number, so the idea is to get the numbers by years/quartes of the year
> (say, #posts in 2009=x1, 2010=x2...) and present them graphically. For some
> of growth metrics I can get the dynamics by years easily, but for some I
> have no idea and hope you could help me (may be also by providing the
> numbers directly).
>
> Here a small list of metrics I thought of:
>
> #packages: very easy to count :)
> #posts on GenABEL-devel: possible to count
> #posts on forum: no idea how to do that for defined time periods

I guess you need to run a query on the database to get those. Our hoster has a phpMyAdmin interface you can use for that (or you could probably use the SSH account and run the MySQL client from the command line).
Probably a query along these lines:

SELECT yearweek(date(from_unixtime(post_time))) AS week,
       COUNT(*) AS num_posts
FROM phpbb_posts
GROUP BY yearweek(date(from_unixtime(post_time)));

> #number of lines of code in our SVN repo: no idea

SLOCCount will probably help: http://www.dwheeler.com/sloccount/

> #citations (GenA, ProbA...): easy to count thanks to Google Scholar
> #mentions on the Web: ???
>
> Any other nice and easily computed metrics?
>
> I will appreciate your help and suggestions, and sorry for late notice.
>

Good luck,

Lennart.

> best,
> Yurii

--
-----------------------------------------------------------------
L.C. Karssen
Utrecht
The Netherlands

lennart at karssen.org
http://blog.karssen.org

Please do not send me Word or PowerPoint files!
See http://www.gnu.org/philosophy/no-word-attachments.nl.html
------------------------------------------------------------------

From yurii.aulchenko at gmail.com Fri Jul 5 12:44:48 2013
From: yurii.aulchenko at gmail.com (Yurii Aulchenko)
Date: Fri, 5 Jul 2013 12:44:48 +0200
Subject: [GenABEL-dev] presentation at UseR!-2013
In-Reply-To: <51D69EC3.5040000@erasmusmc.nl>
References: <51D69EC3.5040000@erasmusmc.nl>
Message-ID:

Thanks, Maarten, good suggestion! I now remember that we linked the site to Google Analytics, which does allow nice summaries per period.

So the question of #visitors to GenABEL.org per time period is solved!

I still wonder about the forum...

Yurii

On Fri, Jul 5, 2013 at 12:24 PM, Maarten Kooyman wrote:

> Hi Yurri,
>
> You might try to install phpBB statistics.
> > https://www.phpbb.com/customise/db/mod/phpbb_statistics/ > > > Good luck! > > Maarten Kooyman > Erasmus MC > Department of Epidemiology > Room Na27-18 > > Postbus 2040 > 3000 CA Rotterdam > The Netherlands > > phone: +31-10-7038194 > mobile: +31-6-28569364 > e-mail: m.kooijman at erasmusmc.nl > GPG key ID: AA2CAF11 > > On 07/05/2013 11:04 AM, Yurii Aulchenko wrote: > > Dear All, > > I am now drafting my presentation for UseR!-2013 ( > http://www.edii.uclm.es/~useR-2013/). My presentation about "The GenABEL > suite for genome-wide association analyses" is scheduled for Wed July 10 > morning. I will send it to the list for the discussion as soon as I have a > draft (most likely by Saturday eve). > > I thought it may be a good idea to present the evolution of the GenABEL > in number, so the idea is to get the numbers by years/quartes of the year > (say, #posts in 2009=x1, 2010=x2...) and present them graphically. For some > of growth metrics I can get the dynamics by years easily, but for some I > have no idea and hope you could help me (may be also by providing the > numbers directly). > > Here a small list of metrics I thought of: > > #packages: very easy to count :) > #posts on GenABEL-devel: possible to count > #posts on forum: no idea how to do that for defined time periods > #number of lines of code in our SVN repo: no idea > #citations (GenA, ProbA...): easy to count thanks to Google Scholar > #mentions on the Web: ??? > > Any other nice and easily computed metrics? > > I will appreciate your help and suggestions, and sorry for late notice. 
> > best, > Yurii > > > _______________________________________________ > genabel-devel mailing listgenabel-devel at lists.r-forge.r-project.orghttps://lists.r-forge.r-project.org/cgi-bin/mailman/listinfo/genabel-devel > > > > _______________________________________________ > genabel-devel mailing list > genabel-devel at lists.r-forge.r-project.org > https://lists.r-forge.r-project.org/cgi-bin/mailman/listinfo/genabel-devel > -- ----------------------------------------------------- Yurii S. Aulchenko [ LinkedIn ] [ Twitter] [ Blog ] -------------- next part -------------- An HTML attachment was scrubbed... URL: From yurii.aulchenko at gmail.com Fri Jul 5 13:42:35 2013 From: yurii.aulchenko at gmail.com (Yurii Aulchenko) Date: Fri, 5 Jul 2013 13:42:35 +0200 Subject: [GenABEL-dev] presentation at UseR!-2013 In-Reply-To: References: <51D69EC3.5040000@erasmusmc.nl> Message-ID: Maarten has kindly agreed to check if he could generate some numbers/graphs using google analytics. Mind that we can probably spend just a couple of slides on the "progress number" for which we need some impressive figures Anyways, even if the numbers/graphs do not make it into presentation, we can probably use the graphs/numbers for the (much under-attended!) "showcase" section on the web-site :) best, and many thanks, YA On Fri, Jul 5, 2013 at 12:44 PM, Yurii Aulchenko wrote: > Thanks, Maarten, good suggestion! I now I remember that we linked the site > to the Google Analytics, which does allow nice summaries per period. > > So the question of #visitors to the GenABEL.org per time period is solved! > > Wonder about forum... > > Yurii > > > On Fri, Jul 5, 2013 at 12:24 PM, Maarten Kooyman wrote: > >> Hi Yurri, >> >> You might try to install phpBB statistics. >> >> https://www.phpbb.com/customise/db/mod/phpbb_statistics/ >> >> >> Good luck! 
>> >> Maarten Kooyman >> Erasmus MC >> Department of Epidemiology >> Room Na27-18 >> >> Postbus 2040 >> 3000 CA Rotterdam >> The Netherlands >> >> phone: +31-10-7038194 >> mobile: +31-6-28569364 >> e-mail: m.kooijman at erasmusmc.nl >> GPG key ID: AA2CAF11 >> >> On 07/05/2013 11:04 AM, Yurii Aulchenko wrote: >> >> Dear All, >> >> I am now drafting my presentation for UseR!-2013 ( >> http://www.edii.uclm.es/~useR-2013/). My presentation about "The GenABEL >> suite for genome-wide association analyses" is scheduled for Wed July 10 >> morning. I will send it to the list for the discussion as soon as I have a >> draft (most likely by Saturday eve). >> >> I thought it may be a good idea to present the evolution of the GenABEL >> in number, so the idea is to get the numbers by years/quartes of the year >> (say, #posts in 2009=x1, 2010=x2...) and present them graphically. For some >> of growth metrics I can get the dynamics by years easily, but for some I >> have no idea and hope you could help me (may be also by providing the >> numbers directly). >> >> Here a small list of metrics I thought of: >> >> #packages: very easy to count :) >> #posts on GenABEL-devel: possible to count >> #posts on forum: no idea how to do that for defined time periods >> #number of lines of code in our SVN repo: no idea >> #citations (GenA, ProbA...): easy to count thanks to Google Scholar >> #mentions on the Web: ??? >> >> Any other nice and easily computed metrics? >> >> I will appreciate your help and suggestions, and sorry for late notice. 
>> >> best, >> Yurii >> >> >> _______________________________________________ >> genabel-devel mailing listgenabel-devel at lists.r-forge.r-project.orghttps://lists.r-forge.r-project.org/cgi-bin/mailman/listinfo/genabel-devel >> >> >> >> _______________________________________________ >> genabel-devel mailing list >> genabel-devel at lists.r-forge.r-project.org >> https://lists.r-forge.r-project.org/cgi-bin/mailman/listinfo/genabel-devel >> > > > > -- > ----------------------------------------------------- > Yurii S. Aulchenko > > [ LinkedIn ] [ Twitter] [ > Blog ] > -- ----------------------------------------------------- Yurii S. Aulchenko [ LinkedIn ] [ Twitter] [ Blog ] -------------- next part -------------- An HTML attachment was scrubbed... URL: From yurii.aulchenko at gmail.com Fri Jul 5 14:04:36 2013 From: yurii.aulchenko at gmail.com (Yurii Aulchenko) Date: Fri, 5 Jul 2013 14:04:36 +0200 Subject: [GenABEL-dev] presentation at UseR!-2013 In-Reply-To: <51D6A04F.7050708@karssen.org> References: <51D6A04F.7050708@karssen.org> Message-ID: On Fri, Jul 5, 2013 at 12:30 PM, L.C. Karssen wrote: > Hi Yurii, > > On 07/05/2013 11:04 AM, Yurii Aulchenko wrote: > > Dear All, > > > > I am now drafting my presentation for UseR!-2013 ( > > http://www.edii.uclm.es/~useR-2013/). My presentation about "The GenABEL > > suite for genome-wide association analyses" is scheduled for Wed July 10 > > morning. I will send it to the list for the discussion as soon as I have > a > > draft (most likely by Saturday eve). > > > > I thought it may be a good idea to present the evolution of the GenABEL > in > > number, so the idea is to get the numbers by years/quartes of the year > > (say, #posts in 2009=x1, 2010=x2...) and present them graphically. For > some > > of growth metrics I can get the dynamics by years easily, but for some I > > have no idea and hope you could help me (may be also by providing the > > numbers directly). 
> >
> > Here a small list of metrics I thought of:
> >
> > #packages: very easy to count :)
> > #posts on GenABEL-devel: possible to count
> > #posts on forum: no idea how to do that for defined time periods
>
> I guess you need to run a query on the database to get those. Our hoster
> has a phpMyAdmin interface you can use for that (or you could probably
> use the SSH account and run the MySQL client from the command line).
> Probably a query along this line:
>
> SELECT yearweek(date(from_unixtime(post_time))) AS week, COUNT(*) AS
> num_posts FROM phpbb_posts GROUP BY
> yearweek(date(from_unixtime(post_time)))
>

Arrgh... I could probably figure this out if I had enough time, but I am going to invest my time in the presentation now. If you or someone else could give a hand, that would be great :)

> > #number of lines of code in our SVN repo: no idea
>
> Probably SLOCcount will help: http://www.dwheeler.com/sloccount/
>

This is a nice one! Two problems: it does not count/recognize R, and I did not see how to use it to look at the dynamics (what was in the repo 2 years ago?).

But I like that, even without counting the R code (which is 148,000 lines), for ~65,000 lines of mostly C/C++ I get output indicating that GenABEL is worth a few million dollars:

Development Effort Estimate, Person-Years (Person-Months) = 15.44 (185.24)
 (Basic COCOMO model, Person-Months = 2.4 * (KSLOC**1.05))
Schedule Estimate, Years (Months) = 1.05 (12.61)
 (Basic COCOMO model, Months = 2.5 * (person-months**0.38))
Total Estimated Cost to Develop = $ 2,085,323
 (average salary = $56,286/year, overhead = 2.40).

So I think I should use these figures in my presentation :)

> > #citations (GenA, ProbA...): easy to count thanks to Google Scholar
> > #mentions on the Web: ???
> >
> > Any other nice and easily computed metrics?
> >
> > I will appreciate your help and suggestions, and sorry for late notice.
> >
>
> Good luck,
>
> Lennart.
> > > best, > > Yurii > > > > > > > > _______________________________________________ > > genabel-devel mailing list > > genabel-devel at lists.r-forge.r-project.org > > > https://lists.r-forge.r-project.org/cgi-bin/mailman/listinfo/genabel-devel > > > > > -- > ----------------------------------------------------------------- > L.C. Karssen > Utrecht > The Netherlands > > lennart at karssen.org > http://blog.karssen.org > > Stuur mij aub geen Word of Powerpoint bestanden! > Zie http://www.gnu.org/philosophy/no-word-attachments.nl.html > ------------------------------------------------------------------ > > > _______________________________________________ > genabel-devel mailing list > genabel-devel at lists.r-forge.r-project.org > https://lists.r-forge.r-project.org/cgi-bin/mailman/listinfo/genabel-devel > -- ----------------------------------------------------- Yurii S. Aulchenko [ LinkedIn ] [ Twitter] [ Blog ] -------------- next part -------------- An HTML attachment was scrubbed... URL: From yurii.aulchenko at gmail.com Fri Jul 5 14:36:42 2013 From: yurii.aulchenko at gmail.com (Yurii Aulchenko) Date: Fri, 5 Jul 2013 14:36:42 +0200 Subject: [GenABEL-dev] presentation at UseR!-2013 In-Reply-To: References: Message-ID: On Fri, Jul 5, 2013 at 11:04 AM, Yurii Aulchenko wrote: > Dear All, > > I am now drafting my presentation for UseR!-2013 ( > http://www.edii.uclm.es/~useR-2013/). My presentation about "The GenABEL > suite for genome-wide association analyses" is scheduled for Wed July 10 > morning. I will send it to the list for the discussion as soon as I have a > draft (most likely by Saturday eve). > > I thought it may be a good idea to present the evolution of the GenABEL in > number, so the idea is to get the numbers by years/quartes of the year > (say, #posts in 2009=x1, 2010=x2...) and present them graphically. 
For some > of growth metrics I can get the dynamics by years easily, but for some I > have no idea and hope you could help me (may be also by providing the > numbers directly). > > Here a small list of metrics I thought of: > > #packages: very easy to count :) > #posts on GenABEL-devel: possible to count > #posts on forum: no idea how to do that for defined time periods > #number of lines of code in our SVN repo: no idea > #citations (GenA, ProbA...): easy to count thanks to Google Scholar > Even easier than I thought: http://scholar.google.nl/citations?view_op=view_citation&hl=en&user=wdqXTTEAAAAJ&citation_for_view=wdqXTTEAAAAJ:UeHWp8X0CEIC http://scholar.google.nl/citations?view_op=view_citation&hl=en&user=wdqXTTEAAAAJ&citation_for_view=wdqXTTEAAAAJ:KlAtU1dfN6UC Will add the "projection" for 2013 + total # citations + arrows for when package/paper got out > #mentions on the Web: ??? > > Any other nice and easily computed metrics? > > I will appreciate your help and suggestions, and sorry for late notice. > > best, > Yurii > -- ----------------------------------------------------- Yurii S. Aulchenko [ LinkedIn ] [ Twitter] [ Blog ] -------------- next part -------------- An HTML attachment was scrubbed... URL: From yurii.aulchenko at gmail.com Fri Jul 5 14:55:23 2013 From: yurii.aulchenko at gmail.com (Yurii Aulchenko) Date: Fri, 5 Jul 2013 14:55:23 +0200 Subject: [GenABEL-dev] layout of GenABEL main page In-Reply-To: <0177E59A-0CA1-4465-8186-A8EC79A20BB4@burlo.trieste.it> References: <51D2C34D.2000907@gmail.com> <0177E59A-0CA1-4465-8186-A8EC79A20BB4@burlo.trieste.it> Message-ID: I suggest that for the moment we go with what we have (Grant's variant); we can change later. Please let me know if you have a strong opinion against! 
- I really would like to use the logo for my presentation and also play a bit how well it fits our pages (genabel.org, facebook, twitter) YA On Tue, Jul 2, 2013 at 4:27 PM, Nicola Pirastu < nicola.pirastu at burlo.trieste.it> wrote: > Just to add my two cents to the discussion, > > I think that the problem is not with the DNA helix but with the font. > I've played around a bit with it and if you use for example Helvetica or > something less comic-sans-like it does look better. Also for some reason > I'm still disturbed by the green but it is a very personal opinion.. > > Nicola > > Dr. Nicola Pirastu PhD > Research Fellow > Medical Sciences, Chirurgical and Health Department > University of Trieste > Medical Genetics > IRCCS Burlo Garofolo > Via dell'Istria 65/1 > 34137 Italy > tel. +390403785539 > > Il giorno 02/lug/2013, alle ore 14:38, Yurii Aulchenko < > yurii.aulchenko at gmail.com> ha scritto: > > Dear All, > > I agree with critique of Maarten, and I actually still not sure if I > like Maarten's or Grant's idea better. Interesting thing is that - not sure > all realize it - Grant's variant is his vision of Maarten's prototype :) > However, Grant's variant has an important advantage - it is ready to serve > as logo. And I actually want to use a logo in my slides for UseR!-2013. > > So I suggest we take Grant's logo as a working variant. No doubt that > the logo is going to evolve with time - as anything we do in the project - > code, documentation; logo is no different, I think. The element which is > going to stay and keep it recognizable is the way of spelling the GenABEL > :) - Like Gnu's horns in the GNU logo. > > What we can do next is to place an open call on site/forum for other > users to contribute, but this is going to take time, and meanwhile I > suggest to stick with Grant's variant. > > Yurii > > On Tue, Jul 2, 2013 at 2:10 PM, Maarten Kooyman wrote: > >> Dear all, >> >> >> It looks really nice ! Credits for who made it. 
However, I have more the >> impression that it looks like a polypeptide chain or a rosary. The >> seventies font is a matter of taste, but it reminds me of Comic >> Sans (including an upside-down e as an a). I wonder if it is readable if you print >> it on a poster: I think this is an important use case for a scientific logo. >> >> Kind regards, >> >> >> Maarten >> >> >> >> >> On 07/02/2013 01:11 PM, Diego Fabregat Traver wrote: >> >>> On 28/06/13, Yurii Aulchenko wrote: >>> >>> How do you like this one? >>>> >>> I like it a lot. >>> >>> What do you think about reducing the font size for the subtitle >>> and right-justifying it? Would it still be readable? I liked that >>> detail from the previous attempts with the "Project" subtitle. >>> >>> In any case, this is just a minor detail. It looks great as it is. >>> >>> Thanks to Grant Borodin! >>> >>> >>>> YA >>>> >>>> >>>> On Thu, Jun 27, 2013 at 1:16 PM, Yurii Aulchenko < >>>> yurii.aulchenko at gmail.com> wrote: >>>> >>>> >>>> Dear Nicola, Diego, Lennart, >>>>> >>>>> >>>>> Thanks for your feedback! I will ask Grant Borodin, who kindly >>>>> designed these logos, if he could change C according to your comment >>>>> (capital "ABEL" and "statistical genomics" as in F). >>>>> >>>>> >>>>> >>>>> >>>>> Yurii >>>>> >>>>> >>>>> >>>>> On Wed, Jun 26, 2013 at 4:16 PM, Diego Fabregat Traver < >>>>> fabregat at aices.rwth-aachen.de> wrote: >>>>> >>>>> >>>>> >>>>> >>>>>> Congrats to whoever designed these logos, they look very nice :) >>>>>> >>>>>> >>>>>> >>>>>> With respect to my preferences, I fully agree with Lennart: "C with >>>>>> capital ABEL and statistical genomics below it" would be my choice. >>>>>> >>>>>> >>>>>> >>>>>> Best, >>>>>> >>>>>> Diego >>>>>> >>>>>> >>>>>> >>>>>> >>>>>> >>>>>> >>>>>> On 20/06/13, "L.C. Karssen" >>>>>> wrote: >>>>>> >>>>>> >>>>>> >>>>>> Wow! Those look really nice! >>>>>>> I like options C and F the most.
Actually a combination would be even >>>>>>> better IMHO: use C with capital ABEL and statistical genomics below >>>>>>> it. >>>>>>> Looking forward to hearing the opinions of others, >>>>>>> Lennart. >>>>>>> On 20-06-13 09:34, Yurii Aulchenko wrote: >>>>>>> >>>>>>>> Please find attached a few more logo variants >>>>>>>> Yurii >>>>>>>> >>>>>>> >>>>>> >>>>> >>>> >>> >> > > -- > ----------------------------------------------------- > Yurii S. Aulchenko > > [ LinkedIn ] [ Twitter] [ > Blog ] > _______________________________________________ > genabel-devel mailing list > genabel-devel at lists.r-forge.r-project.org > https://lists.r-forge.r-project.org/cgi-bin/mailman/listinfo/genabel-devel > > > AVVISO DI RISERVATEZZA Informazioni riservate possono essere contenute nel > messaggio o nei suoi allegati. Se non siete i destinatari indicati nel > messaggio, o responsabili per la sua consegna alla persona, o se avete > ricevuto il messaggio per errore, siete pregati di non trascriverlo, > copiarlo o inviarlo a nessuno. In tal caso vi invitiamo a cancellare il > messaggio ed i suoi allegati. Grazie. CONFIDENTIALITY NOTICE Confidential > information may be contained in this message or in its attachments. If you > are not the addressee indicated in this message, or responsible for message > delivering to that person, or if you have received this message in error, > you may not transcribe, copy or deliver this message to anyone. In that > case, you should delete this message and its attachments. Thank you.
>

--
-----------------------------------------------------
Yurii S. Aulchenko

[ LinkedIn ] [ Twitter] [ Blog ]
-------------- next part --------------
An HTML attachment was scrubbed...
URL:
From nicola.pirastu at burlo.trieste.it Fri Jul 5 15:05:16 2013
From: nicola.pirastu at burlo.trieste.it (Nicola Pirastu)
Date: Fri, 5 Jul 2013 15:05:16 +0200
Subject: [GenABEL-dev] layout of GenABEL main page
In-Reply-To:
References: <51D2C34D.2000907@gmail.com> <0177E59A-0CA1-4465-8186-A8EC79A20BB4@burlo.trieste.it>
Message-ID: <6632A424-420E-423B-957A-3B8481DD0122@burlo.trieste.it>

I agree, in the end it's not the coca-cola logo and we have not been using it for years so I don't think people are going to be confused if the Logo changes in a few months.

I am actually curious to see how it will look on the forum. I do think that if it's not too much work, the colors of the forum and website should match those of the logo though.

Nicola

Dr. Nicola Pirastu PhD
Research Fellow
Medical Sciences, Chirurgical and Health Department
University of Trieste
Medical Genetics
IRCCS Burlo Garofolo
Via dell'Istria 65/1
34137 Italy
tel. +390403785539
-------------- next part --------------
An HTML attachment was scrubbed...
URL:
From yurii.aulchenko at gmail.com Fri Jul 5 15:09:34 2013
From: yurii.aulchenko at gmail.com (Yurii Aulchenko)
Date: Fri, 5 Jul 2013 15:09:34 +0200
Subject: [GenABEL-dev] layout of GenABEL main page
In-Reply-To: <6632A424-420E-423B-957A-3B8481DD0122@burlo.trieste.it>
References: <51D2C34D.2000907@gmail.com> <0177E59A-0CA1-4465-8186-A8EC79A20BB4@burlo.trieste.it> <6632A424-420E-423B-957A-3B8481DD0122@burlo.trieste.it>
Message-ID:

On Fri, Jul 5, 2013 at 3:05 PM, Nicola Pirastu <nicola.pirastu at burlo.trieste.it> wrote:

> I agree, in the end it's not the coca-cola logo and we have not been
> using it for years so I don't think people are going to be confused if the
> Logo changes in a few months.
>

More than that - I really think it should evolve as our project does :)

> I am actually curious to see how it will look on the forum. I do think
> that if it's not too much work, the colors of the forum and website should
> match those of the logo though.
>

Yep. I now start understanding why people were giving the costs estimates of few thousands of euro for the that basic design package: e.g. for facebook we need cover and avatar (latter would do for the twitter as well). So this is whole project :) May be later we should think of inviting some guys from a design school - they must be looking for graduation projects to make, and may be they would be willing to do that for free :)

YA

--
-----------------------------------------------------
Yurii S. Aulchenko

[ LinkedIn ] [ Twitter] [ Blog ]
-------------- next part --------------
An HTML attachment was scrubbed...
URL:
From lennart at karssen.org Sat Jul 6 18:10:48 2013
From: lennart at karssen.org (L.C. Karssen)
Date: Sat, 06 Jul 2013 18:10:48 +0200
Subject: [GenABEL-dev] presentation at UseR!-2013
In-Reply-To:
References: <51D6A04F.7050708@karssen.org>
Message-ID: <51D84188.6010306@karssen.org>

Hi Yurii,

Please find attached the output of the MySQL statement. I added another
column in which the week numbers are separated from the year by a dash,
that makes it easier to read in e.g.
R:

posts <- read.table("tmp/posts_per_week_converted.out", header=TRUE,
                    sep=" ", row.names=NULL)

colnames(posts) <- c("date", "num_posts")

# Convert year-week to year-month-day (the Monday of each week)
posts$weekdate <- as.Date(paste(posts$date, 1), format="%Y-%U %u")

head(posts)
     date num_posts   weekdate
1 2011-01         1 2011-01-03
2 2011-04        15 2011-01-24
3 2011-05         7 2011-01-31
4 2011-06        24 2011-02-07
5 2011-07        10 2011-02-14
6 2011-08         7 2011-02-21

This should help with making a bar plot of "weekdate" vs. "num_posts".

By the way, the SQL script is in the ~/scripts/ directory on the SSH
server of our hoster. You can execute it like this:

mysql -u USERNAME --password=PASSWORD -h HOSTNAME < get_weekly_posts.sql > posts_per_week.out

The user name, password and host name can be found in the backup scripts
in that same directory.

Best,

Lennart.

On 05-07-13 14:04, Yurii Aulchenko wrote:
>
>
> On Fri, Jul 5, 2013 at 12:30 PM, L.C. Karssen wrote:
>
> Hi Yurii,
>
> On 07/05/2013 11:04 AM, Yurii Aulchenko wrote:
> > Dear All,
> >
> > I am now drafting my presentation for UseR!-2013 (
> > http://www.edii.uclm.es/~useR-2013/). My presentation about "The GenABEL
> > suite for genome-wide association analyses" is scheduled for Wed July 10
> > morning. I will send it to the list for the discussion as soon as I have a
> > draft (most likely by Saturday eve).
> >
> > I thought it may be a good idea to present the evolution of the GenABEL in
> > numbers, so the idea is to get the numbers by years/quarters of the year
> > (say, #posts in 2009=x1, 2010=x2...) and present them graphically. For some
> > of growth metrics I can get the dynamics by years easily, but for some I
> > have no idea and hope you could help me (maybe also by providing the
> > numbers directly).
> >
> > Here a small list of metrics I thought of:
> >
> > #packages: very easy to count :)
> > #posts on GenABEL-devel: possible to count
> > #posts on forum: no idea how to do that for defined time periods
>
> I guess you need to run a query on the database to get those. Our hoster
> has a phpmyadmin interface you can use for that (or you could probably
> use the SSH account and run the MySQL client from the command line).
> Probably a query along this line:
>
> SELECT yearweek(date(from_unixtime(post_time))) AS week, COUNT(*) AS
> num_posts FROM phpbb_posts GROUP BY
> yearweek(date(from_unixtime(post_time)))
>
>
> arrgh... I could probably figure this out if I had enough time, but I am
> going to invest in the presentation now. If you/someone could give a hand,
> that would be great :)
>
>
> >
> > #number of lines of code in our SVN repo: no idea
>
> Probably SLOCcount will help: http://www.dwheeler.com/sloccount/
>
>
> This is a nice one! Two problems: it does not count/recognize R; did not
> see how to use it to see the dynamics (what was there in repo 2 years
> ago?..)
>
> But I like that even without the R code counts (which is 148,000 lines),
> for ~65,000 lines of mostly C/C++ I get the message indicating that
> GenABEL is worth a few million dollars:
>
> Development Effort Estimate, Person-Years (Person-Months) = 15.44 (185.24)
>  (Basic COCOMO model, Person-Months = 2.4 * (KSLOC**1.05))
> Schedule Estimate, Years (Months) = 1.05 (12.61)
>  (Basic COCOMO model, Months = 2.5 * (person-months**0.38))
> Total Estimated Cost to Develop = $ 2,085,323
>  (average salary = $56,286/year, overhead = 2.40).
>
> So I think I should use these figures in my presentation :)
>
> > #citations (GenA, ProbA...): easy to count thanks to Google Scholar
> > #mentions on the Web: ???
> >
> > Any other nice and easily computed metrics?
> >
> > I will appreciate your help and suggestions, and sorry for late notice.
> >
>
>
> Good luck,
>
> Lennart.
>
> > best,
> > Yurii

--
-----------------------------------------------------------------
L.C. Karssen
Utrecht
The Netherlands

lennart at karssen.org
http://blog.karssen.org

Stuur mij aub geen Word of Powerpoint bestanden!
Zie http://www.gnu.org/philosophy/no-word-attachments.nl.html
------------------------------------------------------------------
-------------- next part --------------
week num_posts
2011-01 1
2011-04 15
2011-05 7
2011-06 24
2011-07 10
2011-08 7
2011-09 17
2011-10 27
2011-11 11
2011-12 11
2011-13 19
2011-14 4
2011-15 12
2011-16 20
2011-17 6
2011-18 6
2011-19 6
2011-20 12
2011-21 9
2011-22 13
2011-23 4
2011-24 40
2011-25 19
2011-26 6
2011-27 26
2011-28 5
2011-29 3
2011-30 2
2011-31 2
2011-32 4
2011-33 11
2011-34 17
2011-35 1
2011-36 5
2011-37 2
2011-38 3
2011-39 16
2011-40 15
2011-41 4
2011-42 7
2011-43 18
2011-44 3
2011-45 4
2011-46 6
2011-47 5
2011-48 7
2011-49 6
2011-50 3
2011-51 9
2011-52 2
2012-01 1
2012-02 2
2012-03 1
2012-04 4
2012-05 6
2012-06 9
2012-07 12
2012-08 26
2012-09 1
2012-10 2
2012-11 6
2012-12 5
2012-13 4
2012-14 2
2012-15 2
2012-16 5
2012-17 4
2012-18 1
2012-19 3
2012-20 2
2012-22 9
2012-23 8
2012-24 9
2012-25 14
2012-26 6
2012-27 6
2012-28 16
2012-29 5
2012-30 2
2012-32 6
2012-33 1
2012-34 10
2012-35 2
2012-36 1
2012-37 9
2012-38 20
2012-39 8
2012-40 8
2012-41 1
2012-42 9
2012-43 9
2012-44 5
2012-45 1
2012-46 1
2012-47 12
2012-48 2
2012-49 16
2012-50 4
2012-51 7
2012-53 4
2013-01 31
2013-02 9
2013-03 7
2013-04 19
2013-05 5
2013-06 8
2013-07 20
2013-08 7
2013-09 15
2013-10 12
2013-11 2
2013-12 6
2013-13 17
2013-14 18
2013-15 9
2013-16 2
2013-17 10
2013-18 12
2013-19 18
2013-20 7
2013-21 3
2013-22 2
2013-23 2
2013-24 9
2013-25 7
-------------- next part --------------
A non-text attachment was scrubbed...
Name: signature.asc
Type: application/pgp-signature
Size: 230 bytes
Desc: OpenPGP digital signature
URL:
From yurii.aulchenko at gmail.com Sun Jul 7 03:40:06 2013
From: yurii.aulchenko at gmail.com (Yurii Aulchenko)
Date: Sun, 7 Jul 2013 03:40:06 +0200
Subject: [GenABEL-dev] presentation at UseR!-2013
In-Reply-To: <51D84188.6010306@karssen.org>
References: <51D6A04F.7050708@karssen.org> <51D84188.6010306@karssen.org>
Message-ID:

Thank you very much, Lennart! - not sure I will manage to use this data for current presentation - it is getting rather big now, and I am getting tired... I probably can use these numbers to make a figure "how the community sets off", but not sure, did not have time to present these numbers graphically yet.

You can find the current draft of presentation at my public Dropbox,
https://dl.dropboxusercontent.com/u/13260693/GenABEL-1.odp

Comments/suggestions/improvements are welcome!

Note that I have 15-17 minutes for the presentation, so slide count is already too high. Can probably cut short on the "history". Also wonder if this presentation will be interesting for the R people - it is kind of very general one at the moment.

YA

--
-----------------------------------------------------
Yurii S. Aulchenko

[ LinkedIn ] [ Twitter] [ Blog ]
-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From yurii.aulchenko at gmail.com  Wed Jul 10 20:42:45 2013
From: yurii.aulchenko at gmail.com (Yurii Aulchenko)
Date: Wed, 10 Jul 2013 20:42:45 +0200
Subject: [GenABEL-dev] presentation at UseR!-2013
In-Reply-To:
References: <51D6A04F.7050708@karssen.org> <51D84188.6010306@karssen.org>
Message-ID:

Dear All,

The latest version of the presentation - the one I presented at UseR!-2013
this morning - is available at the previous link. The presentation went fine
(though I was slightly over time and therefore was wrapping up a bit too
quickly). Several people contacted me after the talk.

Later, we should probably move that presentation to our web-site. (what
section? showcase?..)

Lennart, Maarten, many thanks for your input!

YA

On Sun, Jul 7, 2013 at 3:40 AM, Yurii Aulchenko wrote:

> Thank you very much, Lennart! - not sure I will manage to use this data
> for current presentation - it is getting rather big now, and I am getting
> tired... I probably can use these numbers to make a figure "how the
> community sets off", but not sure, did not have time to present these
> numbers graphically yet.
>
> You can find the current draft of presentation at my public Dropbox,
> https://dl.dropboxusercontent.com/u/13260693/GenABEL-1.odp
>
> Comments/suggestions/improvements are welcome!
>
> Note that I have 15-17 minutes for the presentation, so slide count is
> already too high. Can probably cut short on the "history". Also wonder if
> this presentation will be interesting for the R people - it is kind of very
> general one at the moment.
>
> YA
>
>
> On Sat, Jul 6, 2013 at 6:10 PM, L.C. Karssen wrote:
>
>> Hi Yurii,
>>
>> Please find attached the output of the MySQL statement. I added another
>> column in which the week numbers are separated from the year by a dash,
>> that makes it easier to read in e.g. R:
>>
>> posts <- read.table("tmp/posts_per_week_converted.out", header=TRUE,
>>                     sep=" ", row.names=NULL)
>>
>> colnames(posts) <- c("date", "num_posts")
>>
>> # Convert year-week to year-month-day
>> posts$weekdate <- as.Date(paste(posts$date, 1), format="%Y-%U %u")
>>
>> head(posts)
>>      date num_posts   weekdate
>> 1 2011-01         1 2011-01-03
>> 2 2011-04        15 2011-01-24
>> 3 2011-05         7 2011-01-31
>> 4 2011-06        24 2011-02-07
>> 5 2011-07        10 2011-02-14
>> 6 2011-08         7 2011-02-21
>>
>>
>> This should help making a bar plot of "weekdate" vs. "num_posts".
>>
>>
>> By the way, the SQL script is in the ~/scripts/ directory on the SSH
>> server of our hoster. You can execute it like this:
>> mysql -u USERNAME --password=PASSWORD -h HOSTNAME < get_weekly_posts.sql > posts_per_week.out
>>
>> The user name, password and host name can be found in the backup scripts
>> in that same directory.
>>
>>
>> Best,
>>
>> Lennart.
>>
>>
>> On 05-07-13 14:04, Yurii Aulchenko wrote:
>> >
>> >
>> > On Fri, Jul 5, 2013 at 12:30 PM, L.C. Karssen wrote:
>> >
>> > Hi Yurii,
>> >
>> > On 07/05/2013 11:04 AM, Yurii Aulchenko wrote:
>> > > Dear All,
>> > >
>> > > I am now drafting my presentation for UseR!-2013 (
>> > > http://www.edii.uclm.es/~useR-2013/). My presentation about "The GenABEL
>> > > suite for genome-wide association analyses" is scheduled for Wed July 10
>> > > morning. I will send it to the list for the discussion as soon as I have a
>> > > draft (most likely by Saturday eve).
>> > >
>> > > I thought it may be a good idea to present the evolution of the GenABEL in
>> > > number, so the idea is to get the numbers by years/quarters of the year
>> > > (say, #posts in 2009=x1, 2010=x2...) and present them graphically. For some
>> > > of growth metrics I can get the dynamics by years easily, but for some I
>> > > have no idea and hope you could help me (may be also by providing the
>> > > numbers directly).
>> > > >> > > Here a small list of metrics I thought of: >> > > >> > > #packages: very easy to count :) >> > > #posts on GenABEL-devel: possible to count >> > > #posts on forum: no idea how to do that for defined time periods >> > >> > I guess you need to run a query on the database to get those. Our >> hoster >> > has a phpmyadmin interface yuo can use for that (or you could >> probably >> > use the SSH account and run the MySQL client from the command line). >> > Probably a query along this line: >> > >> > SELECT yearweek(date(from_unixtime(post_time))) AS week, COUNT(*) >> AS >> > num_posts FROM phpbb_posts GROUP BY >> > yearweek(date(from_unixtime(post_time))) >> > >> > >> > arrgh... probably I can figure this out if I had enough time, but gonna >> > to invest into presentation now. If you/someone could give a hand, would >> > be great :) >> > >> > >> > >> > >> > > #number of lines of code in our SVN repo: no idea >> > >> > Probably SLOCcount will help: http://www.dwheeler.com/sloccount/ >> > >> > >> > This is a nice one! Two problems: it does not count/recognize R; did not >> > see how to use it to see the dynamics (what was there in repo 2 years >> > ago?..) >> > >> > But I like that even without the R code counts (which is 148,000 lines), >> > for ~65,000 lines of mostly C/C++ I get the message indicating that >> > GenABEL is worth few millions of dollars: >> > >> > Development Effort Estimate, Person-Years (Person-Months) = 15.44 >> (185.24) >> > (Basic COCOMO model, Person-Months = 2.4 * (KSLOC**1.05)) >> > Schedule Estimate, Years (Months) = 1.05 (12.61) >> > (Basic COCOMO model, Months = 2.5 * (person-months**0.38)) >> > Total Estimated Cost to Develop = $ 2,085,323 >> > (average salary = $56,286/year, overhead = 2.40). >> > >> > So I think I should use these figures in my presentation :) >> > >> > > #citations (GenA, ProbA...): easy to count thanks to Google >> Scholar >> > > #mentions on the Web: ??? 
>> > > >> > > Any other nice and easily computed metrics? >> > > >> > > I will appreciate your help and suggestions, and sorry for late >> > notice. >> > > >> > >> > >> > Good luck, >> > >> > Lennart. >> > >> > > best, >> > > Yurii >> > > >> > > >> > > >> > > _______________________________________________ >> > > genabel-devel mailing list >> > > genabel-devel at lists.r-forge.r-project.org >> > >> > > >> > >> https://lists.r-forge.r-project.org/cgi-bin/mailman/listinfo/genabel-devel >> > > >> > >> > >> > -- >> > ----------------------------------------------------------------- >> > L.C. Karssen >> > Utrecht >> > The Netherlands >> > >> > lennart at karssen.org >> > http://blog.karssen.org >> > >> > Stuur mij aub geen Word of Powerpoint bestanden! >> > Zie http://www.gnu.org/philosophy/no-word-attachments.nl.html >> > ------------------------------------------------------------------ >> > >> > >> > _______________________________________________ >> > genabel-devel mailing list >> > genabel-devel at lists.r-forge.r-project.org >> > >> > >> https://lists.r-forge.r-project.org/cgi-bin/mailman/listinfo/genabel-devel >> > >> > >> > >> > >> > -- >> > ----------------------------------------------------- >> > Yurii S. Aulchenko >> > >> > [ LinkedIn ] [ Twitter >> > ] [ Blog >> > ] >> >> -- >> ----------------------------------------------------------------- >> L.C. Karssen >> Utrecht >> The Netherlands >> >> lennart at karssen.org >> http://blog.karssen.org >> >> Stuur mij aub geen Word of Powerpoint bestanden! >> Zie http://www.gnu.org/philosophy/no-word-attachments.nl.html >> ------------------------------------------------------------------ >> > > > > -- > ----------------------------------------------------- > Yurii S. Aulchenko > > [ LinkedIn ] [ Twitter] [ > Blog ] > -- ----------------------------------------------------- Yurii S. Aulchenko [ LinkedIn ] [ Twitter] [ Blog ] -------------- next part -------------- An HTML attachment was scrubbed... 
URL:

From lennart at karssen.org  Thu Jul 11 23:56:37 2013
From: lennart at karssen.org (L.C. Karssen)
Date: Thu, 11 Jul 2013 23:56:37 +0200
Subject: [GenABEL-dev] ProbABEL, chi^2, Wald and log-likelihood
Message-ID: <51DF2A15.4020607@karssen.org>

Dear all,

For the upcoming release of ProbABEL I've run into the following. In the
past (~ v 0.1-3) the output of ProbABEL had chi^2 values when doing Cox
regression. These were based on the likelihood ratio test:

  2 * (loglik - loglik_null) ~ chi_1^2

However, at some point, when missing data was allowed in ProbABEL, we ran
into the problem that the null model had to be recalculated for cases with
missing genotype data. To do that 'simply' for each SNP would be time
consuming, so the chi^2 values were removed from the output and replaced by
the loglik values for the full model. (At least, that's how I guess it went).

Now, I would like to get them back. This can be done in two ways:
1) calculate chi^2 as described above, with some smart way of only
   recalculating the null model when a missing value occurs (this shouldn't
   be often with today's imputed data).
2) simply calculate the chi^2 value through the Wald test. We have betas
   and se_betas, so that is easy.

Many of you have more knowledge about statistics than I do, so,
statistically, are these methods equivalent? Or is one better (more
precise/unbiased) than the other?


Another question:
While testing the Wald-type implementation I ran into the following:
I would assume that for the 2df models (where we get beta_SNP_A1A2 and
beta_SNP_A1A1) the final chi^2 value would be the sum of the individual
Wald statistics, which would be distributed as chi_2^2 (so 2 df). Is
that correct? I ask this because if I compare them with the chi^2 values
from the LRT I get different values. In the example data set I get:

name        chi^2_Wald   chi^2_LRT
rs7247199   0.880949     0.452465
rs8102643   0.0116651    0.512709    <- here we have a missing value!
rs8102615   1.51434      0.754701
rs8105536   2.56337      1.33223
rs2312724   0.492364     0.256649

When running the additive model I do get (almost) the same results:

name        chi^2_Wald   chi^2_LRT
rs7247199   0.0101558    0.01012
rs8102643   0.353168     0.492147    <- here we have a missing value!
rs8102615   0.0181841    0.0180033
rs8105536   0.00222781   0.00222216
rs2312724   0.0412005    0.0401556

Shouldn't the chi^2 values be equal in both cases? FYI: the LRT chi^2
values are the same as those obtained with ProbABEL v0.1-3.


Any suggestions?
Thanks,

Lennart.

--
-----------------------------------------------------------------
L.C. Karssen
Utrecht
The Netherlands

lennart at karssen.org
http://blog.karssen.org

Stuur mij aub geen Word of Powerpoint bestanden!
Zie http://www.gnu.org/philosophy/no-word-attachments.nl.html
------------------------------------------------------------------

-------------- next part --------------
A non-text attachment was scrubbed...
Name: signature.asc
Type: application/pgp-signature
Size: 230 bytes
Desc: OpenPGP digital signature
URL:

From yurii.aulchenko at gmail.com  Fri Jul 12 01:41:47 2013
From: yurii.aulchenko at gmail.com (Yurii Aulchenko)
Date: Fri, 12 Jul 2013 01:41:47 +0200
Subject: [GenABEL-dev] ProbABEL, chi^2, Wald and log-likelihood
In-Reply-To: <51DF2A15.4020607@karssen.org>
References: <51DF2A15.4020607@karssen.org>
Message-ID:

In principle score, Wald, and LRT have to give similar answers in
non-extreme cases. LRT is theoretically the best method (if the
underlying model assumptions, e.g. normality, hold). Score / Wald are
approximations to LRT derived at the point of null/alternative,
respectively. They actually ARE derived from quadratic approximations of
the likelihood function at these points :)

As for practical advantages/disadvantages of these, maybe someone else
could comment. I remember there are good/bad sides in both...

Re: Wald on 2df - you cannot add Walds from individual beta/se, you
need to take the covariance into account.
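
A minimal sketch of the covariance point, assuming the 2x2 covariance
matrix of the two genotype effect estimates is available. This is not
ProbABEL code: the names wald_2df, wald_sum, beta and cov are illustrative,
and the numbers are made up. The joint 2-df Wald statistic is
beta' * V^-1 * beta; it equals the sum of the two single-parameter Wald
statistics only when the off-diagonal covariance is zero.

```python
def wald_2df(beta, cov):
    """Joint Wald chi^2 (2 df) for two effect estimates.

    beta: (b1, b2) effect estimates
    cov:  ((v11, v12), (v21, v22)) covariance matrix of the estimates
    """
    b1, b2 = beta
    (v11, v12), (v21, v22) = cov
    det = v11 * v22 - v12 * v21
    if det <= 0:
        raise ValueError("covariance matrix must have a positive determinant")
    # beta' * inverse(cov) * beta, written out for the 2x2 case
    return (b1 * b1 * v22 - b1 * b2 * (v12 + v21) + b2 * b2 * v11) / det


def wald_sum(beta, cov):
    """Naive sum of the per-parameter Wald statistics (ignores covariance)."""
    return beta[0] ** 2 / cov[0][0] + beta[1] ** 2 / cov[1][1]


# Made-up numbers, for illustration only
beta = (0.30, 0.50)
independent = ((0.04, 0.0), (0.0, 0.09))
correlated = ((0.04, 0.03), (0.03, 0.09))

print(wald_2df(beta, independent), wald_sum(beta, independent))
print(wald_2df(beta, correlated), wald_sum(beta, correlated))
```

With the uncorrelated matrix the two functions agree; with a non-zero
covariance they do not, which is why simply adding the two per-genotype
Wald values need not match the 2-df LRT.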
For full treatment of the problem, see http://www.math.chalmers.se/~wermuth/pdfs/86-95/CoxWer90_An_approximation_to_ML.pdf For a simple variant, I think our ProbABEL paper does give some details on score/Wald. Would that be good idea to put this discussion topic to our "Journal club"? - these are kind of topics of general interest irrespective of GenABEL. best, Yurii On Thu, Jul 11, 2013 at 11:56 PM, L.C. Karssen wrote: > Dear all, > > For the upcoming release of ProbABEL I've run into the following. In the > past (~ v 0.1-3) the output of ProbABEL had chi^2 values when doing Cox > regression. These were based on the likelihood ratio test: > 2 * (loglik -loglik_null) ~ chi_1^2 > However, at some point, when having hamissing data was allowed in > ProbABEL, we ran into the problem that the null model had to be > recalculated for cases with missing genotype data. To do that 'simply' > for each SNP would be time consuming, so the chi^2 values were removed > from the output and replaced by the loglik values for the full model. > (At least, that's how I guess it went). > > Now, I would like to get them back. This can be done in two ways: > 1) calculate chi^2 as described above, with some smart way of only > recalculating the null model when a missing value occurs (this shouldn't > be often with today's imputed data). > 2) simply calculate the chi^2 value through the Wald test. We have betas > and se_betas, so that is easy. > > Many of you have more knowledge about statistics than I do, so, > statistically, are these methods equivalent? Or is one better (more > precise/unbiased) than the other? > > > Another question: > While testing the Wald-type implementation I ran into the following: > I would assume that for the 2df models (where we get beta_SNP_A1A2 and > beta_SNP_A1A1) the final chi^2 value would be the sum of the individual > Wald statistics, which would be distributed as chi_2^2 (so 2 df). Is > that correct? 
I ask this because if I compare them with the chi^2 values > from the LRT I get different values. In the example data set I get: > name chi^2_Wald chi^2_LRT > rs7247199 0.880949 0.452465 > rs8102643 0.0116651 0.512709 <- here we have a missing value! > rs8102615 1.51434 0.754701 > rs8105536 2.56337 1.33223 > rs2312724 0.492364 0.256649 > > When running the additive model I do get (almost) the same results: > name chi^2_Wald chi^2_LRT > rs7247199 0.0101558 0.01012 > rs8102643 0.353168 0.492147 <- here we have a missing value! > rs8102615 0.0181841 0.0180033 > rs8105536 0.00222781 0.00222216 > rs2312724 0.0412005 0.0401556 > > Shouldn't the chi_2 values be equal in both cases? FYI: the LRT chi^2 > values are the same as those obtained with ProbABEL v0.1-3. > > > Any suggestions? > Thanks, > > Lennart. > > -- > ----------------------------------------------------------------- > L.C. Karssen > Utrecht > The Netherlands > > lennart at karssen.org > http://blog.karssen.org > > Stuur mij aub geen Word of Powerpoint bestanden! > Zie http://www.gnu.org/philosophy/no-word-attachments.nl.html > ------------------------------------------------------------------ > > > _______________________________________________ > genabel-devel mailing list > genabel-devel at lists.r-forge.r-project.org > https://lists.r-forge.r-project.org/cgi-bin/mailman/listinfo/genabel-devel > -- ----------------------------------------------------- Yurii S. Aulchenko [ LinkedIn ] [ Twitter] [ Blog ] -------------- next part -------------- An HTML attachment was scrubbed... 
URL:

From yurii.aulchenko at gmail.com  Fri Jul 12 12:19:29 2013
From: yurii.aulchenko at gmail.com (Yurii Aulchenko)
Date: Fri, 12 Jul 2013 12:19:29 +0200
Subject: [GenABEL-dev] Fwd: [R-Forge] Downtime 15 July
References: <20130712101528.4B2BA185606@r-forge.r-project.org>
Message-ID: <4309418709146154992@unknownmsgid>

FYI

----------------------
Yurii Aulchenko
(sent from mobile device)

Begin forwarded message:

*From:*
*Date:* 12 July 2013 12:15:28 CEST
*To:* yurii.aulchenko at gmail.com
*Subject:* *[R-Forge] Downtime 15 July*

Dear R-Forge users

Packages are available for download once again and the build process has
been restarted.

A second (complete sitewide) downtime has to be announced. It will be next
Monday on 15th July starting at around 9:00 CEST and will last up to 1 day.
That should be however the last downtime necessary associated with the
relocation of the WU campus.

We apologize for any inconvenience

The R-Forge Team
-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From yurii.aulchenko at gmail.com  Sun Jul 14 19:43:49 2013
From: yurii.aulchenko at gmail.com (Yurii Aulchenko)
Date: Sun, 14 Jul 2013 19:43:49 +0200
Subject: [GenABEL-dev] presentation at UseR!-2013
In-Reply-To:
References: <51D6A04F.7050708@karssen.org> <51D84188.6010306@karssen.org>
Message-ID:

Dear All,

I have composed a short report from UseR! conference, see
http://www.genabel.org/news20130714 for the links.

best wishes, and thanks again for your help,
Yurii

On Wed, Jul 10, 2013 at 8:42 PM, Yurii Aulchenko wrote:

> Dear All,
>
> The last variant of presentation - the one I presented at UseR!-2013 this
> morning - is available at previous link. The presentation went fine (though
> I was slightly over time and therefore was wrapping up a bit too quickly).
> Several people contacted me after the talk.
>
> Later, we should probably move that presentation to our web-site. (what
> section? showcase?..)
>
> Lennart, Maarten, many thanks for your input!
>>> > >>> > > best, >>> > > Yurii >>> > > >>> > > >>> > > >>> > > _______________________________________________ >>> > > genabel-devel mailing list >>> > > genabel-devel at lists.r-forge.r-project.org >>> > >>> > > >>> > >>> https://lists.r-forge.r-project.org/cgi-bin/mailman/listinfo/genabel-devel >>> > > >>> > >>> > >>> > -- >>> > ----------------------------------------------------------------- >>> > L.C. Karssen >>> > Utrecht >>> > The Netherlands >>> > >>> > lennart at karssen.org >>> > http://blog.karssen.org >>> > >>> > Stuur mij aub geen Word of Powerpoint bestanden! >>> > Zie http://www.gnu.org/philosophy/no-word-attachments.nl.html >>> > ------------------------------------------------------------------ >>> > >>> > >>> > _______________________________________________ >>> > genabel-devel mailing list >>> > genabel-devel at lists.r-forge.r-project.org >>> > >>> > >>> https://lists.r-forge.r-project.org/cgi-bin/mailman/listinfo/genabel-devel >>> > >>> > >>> > >>> > >>> > -- >>> > ----------------------------------------------------- >>> > Yurii S. Aulchenko >>> > >>> > [ LinkedIn ] [ Twitter >>> > ] [ Blog >>> > ] >>> >>> -- >>> ----------------------------------------------------------------- >>> L.C. Karssen >>> Utrecht >>> The Netherlands >>> >>> lennart at karssen.org >>> http://blog.karssen.org >>> >>> Stuur mij aub geen Word of Powerpoint bestanden! >>> Zie http://www.gnu.org/philosophy/no-word-attachments.nl.html >>> ------------------------------------------------------------------ >>> >> >> >> >> -- >> ----------------------------------------------------- >> Yurii S. Aulchenko >> >> [ LinkedIn ] [ Twitter] [ >> Blog ] >> > > > > -- > ----------------------------------------------------- > Yurii S. Aulchenko > > [ LinkedIn ] [ Twitter] [ > Blog ] > -- ----------------------------------------------------- Yurii S. Aulchenko [ LinkedIn ] [ Twitter] [ Blog ] -------------- next part -------------- An HTML attachment was scrubbed... 
URL:

From lennart at karssen.org  Sun Jul 14 22:00:38 2013
From: lennart at karssen.org (L.C. Karssen)
Date: Sun, 14 Jul 2013 22:00:38 +0200
Subject: [GenABEL-dev] ProbABEL, chi^2, Wald and log-likelihood
In-Reply-To:
References: <51DF2A15.4020607@karssen.org>
Message-ID: <51E30366.4010201@karssen.org>

Thanks for the explanation Yurii.

On 12-07-13 01:41, Yurii Aulchenko wrote:
> In principle score, Wald, and LRT have to give similar answers in
> non-extreme cases. LRT is theoretically the most superior method (if
> underlying model assumptions, e.g. normality, hold). Score / Wald are
> the approximations to LRT derived at the point of null/alternative,
> respectively. They actually ARE derived from quadratic approximations of
> the likelihood function derived at these points :)

Interesting! I didn't know that.

>
> As for practical advantages/disadvantages of these, may be someone else
> could comment. I remember there are good/bad sides in both...
>
> Re: Wald on 2df - you can not add Walds from individual beta/se, you
> need to take the covariance into account.

I see, I guess adding them is only allowed when the two are independent
(hence no covariance). Right?

> For full treatment of the
> problem, see
>
> http://www.math.chalmers.se/~wermuth/pdfs/86-95/CoxWer90_An_approximation_to_ML.pdf
>

Thanks. Not an easy piece to read...

> For a simple variant, I think our ProbABEL paper does give some details
> on score/Wald.
>
> Would that be good idea to put this discussion topic to our "Journal
> club"? - these are kind of topics of general interest irrespective of
> GenABEL.
>

Good idea. I'll see if I can find the time to start the discussion there.


Best,

Lennart.

> best,
> Yurii
>
> On Thu, Jul 11, 2013 at 11:56 PM, L.C. Karssen wrote:
>
> Dear all,
>
> For the upcoming release of ProbABEL I've run into the following. In the
> past (~ v 0.1-3) the output of ProbABEL had chi^2 values when doing Cox
> regression.
These were based on the likelihood ratio test: > 2 * (loglik -loglik_null) ~ chi_1^2 > However, at some point, when having hamissing data was allowed in > ProbABEL, we ran into the problem that the null model had to be > recalculated for cases with missing genotype data. To do that 'simply' > for each SNP would be time consuming, so the chi^2 values were removed > from the output and replaced by the loglik values for the full model. > (At least, that's how I guess it went). > > Now, I would like to get them back. This can be done in two ways: > 1) calculate chi^2 as described above, with some smart way of only > recalculating the null model when a missing value occurs (this shouldn't > be often with today's imputed data). > 2) simply calculate the chi^2 value through the Wald test. We have betas > and se_betas, so that is easy. > > Many of you have more knowledge about statistics than I do, so, > statistically, are these methods equivalent? Or is one better (more > precise/unbiased) than the other? > > > Another question: > While testing the Wald-type implementation I ran into the following: > I would assume that for the 2df models (where we get beta_SNP_A1A2 and > beta_SNP_A1A1) the final chi^2 value would be the sum of the individual > Wald statistics, which would be distributed as chi_2^2 (so 2 df). Is > that correct? I ask this because if I compare them with the chi^2 values > from the LRT I get different values. In the example data set I get: > name chi^2_Wald chi^2_LRT > rs7247199 0.880949 0.452465 > rs8102643 0.0116651 0.512709 <- here we have a missing value! > rs8102615 1.51434 0.754701 > rs8105536 2.56337 1.33223 > rs2312724 0.492364 0.256649 > > When running the additive model I do get (almost) the same results: > name chi^2_Wald chi^2_LRT > rs7247199 0.0101558 0.01012 > rs8102643 0.353168 0.492147 <- here we have a missing value! 
> rs8102615 0.0181841 0.0180033 > rs8105536 0.00222781 0.00222216 > rs2312724 0.0412005 0.0401556 > > Shouldn't the chi_2 values be equal in both cases? FYI: the LRT chi^2 > values are the same as those obtained with ProbABEL v0.1-3. > > > Any suggestions? > Thanks, > > Lennart. > > -- > ----------------------------------------------------------------- > L.C. Karssen > Utrecht > The Netherlands > > lennart at karssen.org > http://blog.karssen.org > > Stuur mij aub geen Word of Powerpoint bestanden! > Zie http://www.gnu.org/philosophy/no-word-attachments.nl.html > ------------------------------------------------------------------ > > > _______________________________________________ > genabel-devel mailing list > genabel-devel at lists.r-forge.r-project.org > > https://lists.r-forge.r-project.org/cgi-bin/mailman/listinfo/genabel-devel > > > > > -- > ----------------------------------------------------- > Yurii S. Aulchenko > > [ LinkedIn ] [ Twitter > ] [ Blog > ] -- ----------------------------------------------------------------- L.C. Karssen Utrecht The Netherlands lennart at karssen.org http://blog.karssen.org Stuur mij aub geen Word of Powerpoint bestanden! Zie http://www.gnu.org/philosophy/no-word-attachments.nl.html ------------------------------------------------------------------ -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 230 bytes Desc: OpenPGP digital signature URL: From alvaro.frank at rwth-aachen.de Mon Jul 15 17:07:26 2013 From: alvaro.frank at rwth-aachen.de (Alvaro Jesus Frank) Date: Mon, 15 Jul 2013 17:07:26 +0200 Subject: [GenABEL-dev] multiple ProbABEL's palinear runs Message-ID: Dear all, I am working on a high performance implementation of an ordinary linear estimator (OLS model), similar to the one implemented in ProbABEL's palinear (without --mmscore option), where X are SNP given and Y are the phenotypes. 
(As given by the ProbABEl manual on section 7 "Methodology" at http://www.genabel.org/sites/default/files/pdfs/ProbABEL_manual.pdf) b = (X'*X)^-1 * X' * y. The goal is to solve this with multiple design matrices (SNPs??) X and Phenotypes Y. For this we compute the formula as for each X for each Y b=(X'*X)^-1 * X' * y. We want to offer the GenABEL community an Estimator to be used in the same way people use the current tools (ProbABEL in R), but faster, and capable of handling LARGE datasets (in disk & memory). That is why I am writing it in C++, while making sure that it can be called directly from R. My understanding: A few concerns came to mind when researching the workflow in using OMICS data in Linear Estimators. There seems to be a long process before the real life data from MaCH (test.mldose? for X and mlinfo? for Y) that is sitting on files can be used in calculations. The first concern is how to obtain the design matrices X from the files. It is my understanding that there are two types of data, imputed data and databel data. Either way, data seems to be pre-processed early in the workflow; my impression is that this preprocessing is done in R. It also seems that R can't handle large amounts of data loaded in memory at once. >From what I see, data comes with some irregularities in its values (missing values, invalid rows in X/Y matrices), and this makes it difficult to use Linear Estimators right away; this is why the preprocessing exists. DatABEL seems to be the R tool (implemented in C++) that can do fast pre-processing of big sets of data. Well, I think that DatABEL only does the reading and writing of files in C++ (called filevector), while the pre-processing functions are defined and implemented in R. Am I correct? My Problems: This is where my troubles start. 
Since I am trying to make this tool usable for the GenABEL community while still being able to handle TERABYTES of data with fast computations, I would really like to include the preprocessing of X and Y in my C++ workflow. To work around the memory and performance limitations of R, I am trying to load the data from disk from within C++. Since I am running my estimator function in C++, it expects those matrices to contain numbers that can be used for computation. Assuming that data must be preprocessed to get valid matrices with usable numbers, I have the following options: A) For performance reasons, I was considering having the data already pre-processed in files on disk. Is this feasible, or would it be cumbersome? (Preprocessed data would take at most as much disk space as the original data.) B) If there are only a few preprocessing functions that people use, I could re-implement them in C++ and apply them on the fly while loading the data from disk. This would be more difficult if everyone has their own customized R pre-processing functions. C) Another alternative is to allow users to supply their own R functions that pre-process the data. I would then do the preprocessing on the fly from inside C++ by calling back into R. This would be slower and harder to do than B). D) If DatABEL really does all the necessary pre-processing from inside C++, I could just use it directly, or allow the user to specify what to use, and would not need to re-implement the pre-processing functions. It seems, though, that preprocessing the data into DatABEL filevector format takes from 30 minutes to an hour. I would really appreciate any help that would clarify my understanding of how the pre-processing of data works and where it fits in the workflow.
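For illustration, the estimator described in this message, b = (X'*X)^-1 * X' * y evaluated for each design matrix X and each phenotype vector y, can be sketched in a few lines of C++. This is only a minimal sketch under stated assumptions, not code from ProbABEL or OmicABEL: the names ols_fit and ols_all are hypothetical, X is assumed to be small, dense and free of missing values, and the normal equations are solved by Gaussian elimination instead of forming the inverse explicitly. An implementation aimed at large data would more likely call a tuned linear algebra library.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <utility>
#include <vector>

using Matrix = std::vector<std::vector<double>>;  // row-major, n rows x p columns
using Vector = std::vector<double>;

// Hypothetical helper (not part of ProbABEL): solve the normal equations
// (X'X) b = X'y for one design matrix X and one phenotype vector y.
Vector ols_fit(const Matrix& X, const Vector& y) {
    const std::size_t n = X.size();
    assert(n > 0 && y.size() == n);
    const std::size_t p = X[0].size();
    assert(p > 0);

    // Build the augmented system [X'X | X'y].
    Matrix A(p, Vector(p + 1, 0.0));
    for (std::size_t i = 0; i < n; ++i) {
        assert(X[i].size() == p);
        for (std::size_t j = 0; j < p; ++j) {
            for (std::size_t k = 0; k < p; ++k) A[j][k] += X[i][j] * X[i][k];
            A[j][p] += X[i][j] * y[i];
        }
    }

    // Forward elimination with partial pivoting.
    for (std::size_t col = 0; col < p; ++col) {
        std::size_t pivot = col;
        for (std::size_t r = col + 1; r < p; ++r)
            if (std::fabs(A[r][col]) > std::fabs(A[pivot][col])) pivot = r;
        assert(std::fabs(A[pivot][col]) > 1e-12);  // assumes X'X is non-singular
        std::swap(A[col], A[pivot]);
        for (std::size_t r = col + 1; r < p; ++r) {
            const double f = A[r][col] / A[col][col];
            for (std::size_t k = col; k <= p; ++k) A[r][k] -= f * A[col][k];
        }
    }

    // Back substitution.
    Vector b(p, 0.0);
    for (std::size_t j = p; j-- > 0;) {
        double s = A[j][p];
        for (std::size_t k = j + 1; k < p; ++k) s -= A[j][k] * b[k];
        b[j] = s / A[j][j];
    }
    return b;
}

// "For each X, for each Y": result[i][j] holds the coefficients for the
// pair (Xs[i], Ys[j]).
std::vector<std::vector<Vector>> ols_all(const std::vector<Matrix>& Xs,
                                         const std::vector<Vector>& Ys) {
    std::vector<std::vector<Vector>> out;
    for (const Matrix& X : Xs) {
        std::vector<Vector> row;
        for (const Vector& y : Ys) row.push_back(ols_fit(X, y));
        out.push_back(row);
    }
    return out;
}
```

As a small check, with X = [[1,0],[1,1],[1,2]] (an intercept column plus one predictor) and y = [1,3,5], ols_fit returns an intercept of 1 and a slope of 2.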
Best regards, - Alvaro Frank From yurii.aulchenko at gmail.com Mon Jul 15 22:02:09 2013 From: yurii.aulchenko at gmail.com (Yurii Aulchenko) Date: Mon, 15 Jul 2013 22:02:09 +0200 Subject: [GenABEL-dev] layout of GenABEL main page In-Reply-To: References: <51D2C34D.2000907@gmail.com> <0177E59A-0CA1-4465-8186-A8EC79A20BB4@burlo.trieste.it> <6632A424-420E-423B-957A-3B8481DD0122@burlo.trieste.it> Message-ID: Dear All, a small update - I have the original vector graphics files from Grant at my disposal; if some people would like to play with these files, send me a message and I can forward the vector files to you. best, Yurii On Fri, Jul 5, 2013 at 3:09 PM, Yurii Aulchenko wrote: > > > On Fri, Jul 5, 2013 at 3:05 PM, Nicola Pirastu < > nicola.pirastu at burlo.trieste.it> wrote: > >> I agree, in the end it's not the Coca-Cola logo and we have not been >> using it for years so I don't think people are going to be confused if the >> logo changes in a few months. >> >> > More than that - I really think it should evolve as our project does :) > > > >> I am actually curious to see how it will look on the forum. I do think >> that if it's not too much work, the colors of the forum and website should >> match those of the logo though. >> > > Yep. I now start understanding why people were giving cost estimates > of a few thousand euro for that basic design package: e.g. for > facebook we need cover and avatar (latter would do for the twitter as > well). So this is a whole project :) > > Maybe later we should think of inviting some guys from a design school - > they must be looking for graduation projects to make, and maybe they would > be willing to do that for free :) > > YA > > >> >> Nicola >> >> >> Dr. Nicola Pirastu PhD >> Research Fellow >> Medical Sciences, Chirurgical and Health Department >> University of Trieste >> Medical Genetics >> IRCCS Burlo Garofolo >> Via dell'Istria 65/1 >> 34137 Italy >> tel.
+390403785539 >> >> On 05 Jul 2013, at 14:55, Yurii Aulchenko < >> yurii.aulchenko at gmail.com> wrote: >> >> I suggest that for the moment we go with what we have (Grant's variant); >> we can change later. >> >> Please let me know if you have a strong opinion against! - I really >> would like to use the logo for my presentation and also play a bit with how well >> it fits our pages (genabel.org, facebook, twitter) >> >> YA >> >> On Tue, Jul 2, 2013 at 4:27 PM, Nicola Pirastu < >> nicola.pirastu at burlo.trieste.it> wrote: >> >>> Just to add my two cents to the discussion, >>> >>> I think that the problem is not with the DNA helix but with the font. >>> I've played around a bit with it and if you use for example Helvetica or >>> something less comic-sans-like it does look better. Also for some reason >>> I'm still disturbed by the green but it is a very personal opinion.. >>> >>> Nicola >>> >>> Dr. Nicola Pirastu PhD >>> Research Fellow >>> Medical Sciences, Chirurgical and Health Department >>> University of Trieste >>> Medical Genetics >>> IRCCS Burlo Garofolo >>> Via dell'Istria 65/1 >>> 34137 Italy >>> tel. +390403785539 >>> >>> On 02 Jul 2013, at 14:38, Yurii Aulchenko < >>> yurii.aulchenko at gmail.com> wrote: >>> >>> Dear All, >>> >>> I agree with Maarten's critique, and I am actually still not sure if I >>> like Maarten's or Grant's idea better. Interesting thing is that - not sure >>> all realize it - Grant's variant is his vision of Maarten's prototype :) >>> However, Grant's variant has an important advantage - it is ready to serve >>> as logo. And I actually want to use a logo in my slides for UseR!-2013. >>> >>> So I suggest we take Grant's logo as a working variant. No doubt that >>> the logo is going to evolve with time - as anything we do in the project - >>> code, documentation; logo is no different, I think.
The element which is >>> going to stay and keep it recognizable is the way of spelling the GenABEL >>> :) - Like Gnu's horns in the GNU logo. >>> >>> What we can do next is to place an open call on site/forum for other >>> users to contribute, but this is going to take time, and meanwhile I >>> suggest to stick with Grant's variant. >>> >>> Yurii >>> >>> On Tue, Jul 2, 2013 at 2:10 PM, Maarten Kooyman wrote: >>> >>>> Dear all, >>>> >>>> >>>> It looks really nice! Credits to whoever made it. However, I have more >>>> the impression that it looks like a polypeptide chain or a rosary. The >>>> seventies font is a matter of taste, but it reminds me of Comic >>>> Sans (including an upside-down e as an a). I wonder if it is readable if you print >>>> it on a poster: I think this is an important use-case of a scientific logo. >>>> >>>> Kind regards, >>>> >>>> >>>> Maarten >>>> >>>> >>>> >>>> >>>> On 07/02/2013 01:11 PM, Diego Fabregat Traver wrote: >>>> >>>>> On 28/06/13, Yurii Aulchenko wrote: >>>>> >>>>> How do you like this one? >>>>>> >>>>> I like it a lot. >>>>> >>>>> What do you think about reducing the font size for the subtitle >>>>> and right-justifying it? Would it still be readable? I liked that >>>>> detail from the previous attempts with the "Project" subtitle. >>>>> >>>>> In any case, this is just a minor detail. It looks great as it is. >>>>> >>>>> Thanks to Grant Borodin! >>>>> >>>>> >>>>>> YA >>>>>> >>>>>> >>>>>> On Thu, Jun 27, 2013 at 1:16 PM, Yurii Aulchenko < >>>>>> yurii.aulchenko at gmail.com> wrote: >>>>>> >>>>>> >>>>>> Dear Nicola, Diego, Lennart, >>>>>>> >>>>>>> >>>>>>> Thanks for your feedback! I will ask Grant Borodin, who kindly >>>>>>> designed these logos, if he could change C according to your comment >>>>>>> (capital "ABEL" and "statistical genomics" as in F).
>>>>>>> >>>>>>> Yurii >>>>>>> >>>>>>> On Wed, Jun 26, 2013 at 4:16 PM, Diego Fabregat Traver < >>>>>>> fabregat at aices.rwth-aachen.de> wrote: >>>>>>> >>>>>>>> Congrats to whoever designed these logos, they look very nice :) >>>>>>>> >>>>>>>> With respect to my preferences, I fully agree with Lennart: "C with >>>>>>>> capital ABEL and statistical genomics below it" would be my choice. >>>>>>>> >>>>>>>> Best, >>>>>>>> >>>>>>>> Diego >>>>>>>> >>>>>>>> On 20/06/13, "L.C. Karssen" wrote: >>>>>>>> >>>>>>>>> Wow! Those look really nice! >>>>>>>>> I like options C and F the most. Actually a combination would be even >>>>>>>>> better IMHO: use C with capital ABEL and statistical genomics below it. >>>>>>>>> Looking forward to hear the opinion of others, >>>>>>>>> Lennart. >>>>>>>>> On 20-06-13 09:34, Yurii Aulchenko wrote: >>>>>>>>> >>>>>>>>>> Please find attached few more logo variants >>>>>>>>>> Yurii >>>>>>>>>> >>>>> _______________________________________________ >>>>> genabel-devel mailing list >>>>> genabel-devel at lists.r-forge.r-project.org >>>>> https://lists.r-forge.r-project.org/cgi-bin/mailman/listinfo/genabel-devel >>>>> >>> >>> -- >>> ----------------------------------------------------- >>> Yurii S.
Aulchenko >>> >>> [ LinkedIn ] [ Twitter] [ >>> Blog ] >>> _______________________________________________ >>> genabel-devel mailing list >>> genabel-devel at lists.r-forge.r-project.org >>> >>> https://lists.r-forge.r-project.org/cgi-bin/mailman/listinfo/genabel-devel >>> >>> >>> AVVISO DI RISERVATEZZA Informazioni riservate possono essere contenute >>> nel messaggio o nei suoi allegati. Se non siete i destinatari indicati nel >>> messaggio, o responsabili per la sua consegna alla persona, o se avete >>> ricevuto il messaggio per errore, siete pregati di non trascriverlo, >>> copiarlo o inviarlo a nessuno. In tal caso vi invitiamo a cancellare il >>> messaggio ed i suoi allegati. Grazie. CONFIDENTIALITY NOTICE Confidential >>> information may be contained in this message or in its attachments. If you >>> are not the addressee indicated in this message, or responsible for message >>> delivering to that person, or if you have received this message in error, >>> you may not transcribe, copy or deliver this message to anyone. In that >>> case, you should delete this message and its attachments. Thank you. >>> >> >> >> >> -- >> ----------------------------------------------------- >> Yurii S. Aulchenko >> >> [ LinkedIn ] [ Twitter] [ >> Blog ] >> >> >> AVVISO DI RISERVATEZZA Informazioni riservate possono essere contenute >> nel messaggio o nei suoi allegati. Se non siete i destinatari indicati nel >> messaggio, o responsabili per la sua consegna alla persona, o se avete >> ricevuto il messaggio per errore, siete pregati di non trascriverlo, >> copiarlo o inviarlo a nessuno. In tal caso vi invitiamo a cancellare il >> messaggio ed i suoi allegati. Grazie. CONFIDENTIALITY NOTICE Confidential >> information may be contained in this message or in its attachments. 
If you >> are not the addressee indicated in this message, or responsible for message >> delivering to that person, or if you have received this message in error, >> you may not transcribe, copy or deliver this message to anyone. In that >> case, you should delete this message and its attachments. Thank you. >> > > > > -- > ----------------------------------------------------- > Yurii S. Aulchenko > > [ LinkedIn ] [ Twitter] [ > Blog ] > -- ----------------------------------------------------- Yurii S. Aulchenko [ LinkedIn ] [ Twitter] [ Blog ] From yurii.aulchenko at gmail.com Mon Jul 15 10:06:55 2013 From: yurii.aulchenko at gmail.com (Yurii Aulchenko) Date: Mon, 15 Jul 2013 10:06:55 +0200 Subject: [GenABEL-dev] ProbABEL, chi^2, Wald and log-likelihood In-Reply-To: <51E30366.4010201@karssen.org> References: <51DF2A15.4020607@karssen.org> <51E30366.4010201@karssen.org> Message-ID: On Sun, Jul 14, 2013 at 10:00 PM, L.C. Karssen wrote: > Thanks for the explanation Yurii. > > On 12-07-13 01:41, Yurii Aulchenko wrote: > > In principle score, Wald, and LRT have to give similar answers in > > non-extreme cases. LRT is theoretically the most superior method (if > > underlying model assumptions, e.g. normality, hold). Score / Wald are > > the approximations to LRT derived at the point of null/alternative, > > respectively. They actually ARE derived from quadratic approximations of > > the likelihood function derived at these points :) > > Interesting! I didn't know that. > Yep, this is quite interesting. I think David Clayton's book (Statistical Models in Epi?) gives a very simple and clear explanation of how you get to the score and Wald from LRT - very nice reading. > > > > > As for practical advantages/disadvantages of these, maybe someone else > > could comment. I remember there are good/bad sides in both...
> > > > Re: Wald on 2df - you can not add Walds from individual beta/se, you > > need to take the covariance into account. > > I see, I guess adding them is only allowed when the two are independent > (hence no covariance). Right? > True. And zero covariance is definitely not the case with the 2df test :) > > > For full treatment of the > > problem, see > > > > http://www.math.chalmers.se/~wermuth/pdfs/86-95/CoxWer90_An_approximation_to_ML.pdf > > > > Thanks. Not an easy piece to read... > It is not, but in the end it is simple (see the ProbABEL paper)... unfortunately this is one of those "simple" things which are "so simple" after you have figured them out - and after some time you only remember that they were "simple", but not exactly how they work (this is why I refer you to papers). > > > For a simple variant, I think our ProbABEL paper does give some details > > on score/Wald. > > > > Would it be a good idea to put this discussion topic to our "Journal > > club"? - these are the kind of topics of general interest irrespective of > > GenABEL. > > > > Good idea. I'll see if I can find the time to start the discussion there. > > > Best, > > Lennart. > > > > best, > > Yurii > > > > On Thu, Jul 11, 2013 at 11:56 PM, L.C. Karssen > > wrote: > > > > Dear all, > > > > For the upcoming release of ProbABEL I've run into the following. In the > > past (~ v 0.1-3) the output of ProbABEL had chi^2 values when doing Cox > > regression. These were based on the likelihood ratio test: > > 2 * (loglik - loglik_null) ~ chi_1^2 > > However, at some point, when having missing data was allowed in > > ProbABEL, we ran into the problem that the null model had to be > > recalculated for cases with missing genotype data. To do that 'simply' > > for each SNP would be time consuming, so the chi^2 values were removed > > from the output and replaced by the loglik values for the full model. > > (At least, that's how I guess it went). > > > > Now, I would like to get them back.
This can be done in two ways:
> > 1) calculate chi^2 as described above, with some smart way of only recalculating the null model when a missing value occurs (this shouldn't be often with today's imputed data).
> > 2) simply calculate the chi^2 value through the Wald test. We have betas and se_betas, so that is easy.
> >
> > Many of you have more knowledge about statistics than I do, so, statistically, are these methods equivalent? Or is one better (more precise/unbiased) than the other?
> >
> > Another question:
> > While testing the Wald-type implementation I ran into the following: I would assume that for the 2df models (where we get beta_SNP_A1A2 and beta_SNP_A1A1) the final chi^2 value would be the sum of the individual Wald statistics, which would be distributed as chi_2^2 (so 2 df). Is that correct? I ask this because if I compare them with the chi^2 values from the LRT I get different values. In the example data set I get:
> > name        chi^2_Wald   chi^2_LRT
> > rs7247199   0.880949     0.452465
> > rs8102643   0.0116651    0.512709    <- here we have a missing value!
> > rs8102615   1.51434      0.754701
> > rs8105536   2.56337      1.33223
> > rs2312724   0.492364     0.256649
> >
> > When running the additive model I do get (almost) the same results:
> > name        chi^2_Wald   chi^2_LRT
> > rs7247199   0.0101558    0.01012
> > rs8102643   0.353168     0.492147    <- here we have a missing value!
> > rs8102615   0.0181841    0.0180033
> > rs8105536   0.00222781   0.00222216
> > rs2312724   0.0412005    0.0401556
> >
> > Shouldn't the chi^2 values be equal in both cases? FYI: the LRT chi^2 values are the same as those obtained with ProbABEL v0.1-3.
> >
> > Any suggestions?
> > Thanks,
> >
> > Lennart.
> > > > -- > > ----------------------------------------------------------------- > > L.C. Karssen > > Utrecht > > The Netherlands > > > > lennart at karssen.org > > http://blog.karssen.org > > > > Please do not send me Word or PowerPoint files!
> > See http://www.gnu.org/philosophy/no-word-attachments.nl.html > > ------------------------------------------------------------------ > > > > > > _______________________________________________ > > genabel-devel mailing list > > genabel-devel at lists.r-forge.r-project.org > > > > https://lists.r-forge.r-project.org/cgi-bin/mailman/listinfo/genabel-devel > > > > > > > > > > -- > > ----------------------------------------------------- > > Yurii S. Aulchenko > > > > [ LinkedIn ] [ Twitter > > ] [ Blog > > ] > > -- > ----------------------------------------------------------------- > L.C. Karssen > Utrecht > The Netherlands > > lennart at karssen.org > http://blog.karssen.org > > Please do not send me Word or PowerPoint files! > See http://www.gnu.org/philosophy/no-word-attachments.nl.html > ------------------------------------------------------------------ > > -- ----------------------------------------------------- Yurii S. Aulchenko [ LinkedIn ] [ Twitter] [ Blog ] From kooyman at gmail.com Tue Jul 16 09:13:37 2013 From: kooyman at gmail.com (Maarten Kooyman) Date: Tue, 16 Jul 2013 09:13:37 +0200 Subject: [GenABEL-dev] layout of GenABEL main page In-Reply-To: References: <51D2C34D.2000907@gmail.com> <0177E59A-0CA1-4465-8186-A8EC79A20BB4@burlo.trieste.it> <6632A424-420E-423B-957A-3B8481DD0122@burlo.trieste.it> Message-ID: <51E4F2A1.4040002@gmail.com> Hi Yurii, Under what kind of licence are the logos available? Maybe it would be handy to put them on the website for easy access. Kind regards, Maarten On 07/15/2013 10:02 PM, Yurii Aulchenko wrote: > Dear All, > > a small update - I have original vector graphics files from Grant at > my disposal; if some people would like to play with these files, send > me a message and I can forward the vector files to you.
> > best, > Yurii From lennart at karssen.org Tue Jul 16 17:11:56 2013 From: lennart at karssen.org (L.C. Karssen) Date: Tue, 16 Jul 2013 17:11:56 +0200 Subject: [GenABEL-dev] Creation of genabel-announce mailing list Message-ID: <51E562BC.20605@karssen.org> Dear all, I've created a new mailing list on the r-forge page.
The address is genabel-announce at lists.r-forge.r-project.org and it is intended to be used by package maintainers to announce new versions of their packages (or completely new packages), so that users who want to stay up to date only need to subscribe to this list to be informed. By default, e-mails to this list will be held until approved by the list owner or a list moderator. At present I have listed myself and Yurii as list owners. Best, Lennart. From lennart at karssen.org Tue Jul 16 17:25:55 2013 From: lennart at karssen.org (L.C. Karssen) Date: Tue, 16 Jul 2013 17:25:55 +0200 Subject: [GenABEL-dev] Creation of genabel-announce mailing list In-Reply-To: <51E562BC.20605@karssen.org> References: <51E562BC.20605@karssen.org> Message-ID: <51E56603.9080809@karssen.org> I've added an announcement on the GenABEL.org website as well: http://www.genabel.org/node/284 Lennart. From lennart at karssen.org Thu Jul 18 23:33:26 2013 From: lennart at karssen.org (L.C. Karssen) Date: Thu, 18 Jul 2013 23:33:26 +0200 Subject: [GenABEL-dev] multiple ProbABEL's palinear runs In-Reply-To: References: Message-ID: <51E85F26.1030600@karssen.org> Dear Alvaro, Thank you for showing interest in the ProbABEL project! On 15-07-13 17:07, Alvaro Jesus Frank wrote: > > Dear all, > > I am working on a high performance implementation of an ordinary > linear estimator (OLS model), similar to the one implemented in > ProbABEL's palinear (without --mmscore option), where X are SNP given > and Y are the phenotypes. (As given by the ProbABEL manual on section > 7 "Methodology" at > http://www.genabel.org/sites/default/files/pdfs/ProbABEL_manual.pdf) > > > b = (X'*X)^-1 * X' * y. > > The goal is to solve this with multiple design matrices (SNPs??) X Indeed, the design matrix contains both SNP data and other covariates (e.g. sex, age, etc.). > and Phenotypes Y. For this we compute the formula as > > for each X for each Y b=(X'*X)^-1 * X' * y. > > > We want to offer the GenABEL community an Estimator to be used in the > same way people use the current tools (ProbABEL in R) Actually, ProbABEL is a command line tool.
Even though several packages of the GenABEL project are R packages, ProbABEL is not. >, but faster, > and capable of handling LARGE datasets (in disk & memory). That is > why I am writing it in C++, Sounds good! ProbABEL is written in a mixture of C and C++. > while making sure that it can be called > directly from R. I'm not sure that that should be a requirement. At the moment the workflow is roughly the following: 1) Preparation of phenotype data (e.g. specifying covariates, doing QC like removing outliers, log transformation, etc.). This is done by each researcher independently, as they are the experts on their phenotypes. 2) Imputation of genetic data. This is done centrally, as it is a time-consuming task that only needs to be redone if additional individuals have been genotyped or whenever a genomic reference set has been updated. This happens roughly once or twice per year. > > My understanding: A few concerns came to mind when researching the > workflow in using OMICS data in Linear Estimators. There seems to be > a long process before the real life data from MaCH (test.mldose? for > X Just to be sure: for each SNP, X contains the dosage or probability data for that SNP and the covariate data as specified by the researcher. > and mlinfo? for Y) Nope, Y is not taken from the mlinfo file. The data from the mlinfo file is not used in the regression. After the regression is done, the information in the mlinfo file (e.g. SNP name, chromosome number, base pair position) is simply copied to the output file. > that is sitting on files can be used in > calculations. The first concern is how to obtain the design matrices > X from the files. I agree. > > It is my understanding that there are two types of data, imputed data > and databel data. Almost correct. Imputed genotype data "comes out of" the imputation software in the form of (possibly zipped) text files, the test.mldose (basically N_SNPs x N_ids) and test.mlinfo files (N_SNPs x ~7).

The filevector/DatABEL file format is simply a way to store the dosage
data in such a (binary) way that we don't need to load a complete text
file into memory.

FYI: An imputed data set of ~7000 individuals and ~20e6 imputed SNPs
uses 459 GB in DatABEL format, the text-based mlinfo files take up 881
MB and the gzipped dosage text files take up 59 GB.
The top item on my wishlist is a compressed form of the
filevector/DatABEL files, as you can see from these numbers.

> Either way, data seems to be pre-processed early in
> the workflow;

Actually, there isn't too much preprocessing going on. If we only look
at dosage data, the only thing that needs to be done for each SNP is to
add the dosage data for each individual as a column to the (constant)
matrix of covariate data to form the design matrix X.
Because we want to allow for missing (genotype) data we have added some
routines to get the data without missing values.

> my impression is that this preprocessing is done in R.

Usually only for the creation of the phenotype file. For a single
(non-omics) phenotype like height, disease status, a blood lipid level,
etc. this is easy. The researcher usually has these files (N_IDs rows,
one column for the phenotype and a few columns for covariates like age,
sex, age^2, etc).
Of course, for omics data the number of phenotypes is much larger. But
for that scenario OmicABEL is developed.

> It also seems that R can't handle large amounts of data loaded in
> memory at once.

That is another reason why DatABEL (the R library interface to the
filevector format) was developed.

>
> From what I see, data comes with some irregularities in its values
> (missing values, invalid rows in X/Y matrices), and this makes it
> difficult to use Linear Estimators right away; this is why the
> preprocessing exists.

Correct. Most people use imputed genotype data, so there won't be many
NAs there.
On the other hand, since genotype imputation is done centrally
for all genotyped individuals, it is very common to have missing data in
the phenotype file (i.e. Y and covariate data).

> DatABEL seems to be the R tool (implemented in
> C++) that can do fast pre-processing of big sets of data.

A very common use case after running a GWAS (the genome-wide linear
regression we're talking about) is that a researcher wants to know the
exact dosages for all individuals for his or her top ten most
significant hits. This is when (s)he uses grep to find out in which
mlinfo files the SNPs are located. This is necessary because the
genotype data is split up into several files per chromosome to make
handling the files easier (parallel computation of the linear
regression on a multicore cluster is easy that way: we simply submit
one job per 'chunk', so several chunks run in parallel).
Then (s)he starts R and loads the DatABEL library to read the genotype
data from those specific files.

> Well, I
> think that DatABEL only does the reading and writing of files in C++
> (called filevector),

Correct.

> while the pre-processing functions are defined
> and implemented in R. Am I correct?
>

Not quite. Apart from the one-time-only conversion of the text files
with (imputed) genotype data to DatABEL format (which is usually done in
R, but the filevector lib also has command line tools (written in C++)
to do this), the end user doesn't do much with DatABEL (for
pre-processing). Within ProbABEL we do some pre-processing (e.g. removal
of individuals without genotype information), and in the loop over all
SNPs the combining of the genotype information with the other covariates
into the design matrix.

>
> My Problems: This is where my troubles start. Since I am trying to
> make this tool usable for the GenABEL community while still being
> able to handle TERABYTES of data with fast computations, I would
> really like to include the preprocessing of X and Y into my C++
> workflow.
> To solve the memory and performance limitations of R, I am
> trying to load the data from disk from within C++. Since I am
> performing my estimator function in C++, it expects those matrices to
> have numbers that can be used for computation. Assuming that data
> must be preprocessed to be able to get valid matrices with usable
> numbers, I have the following options:
>
> A) For performance reasons, I was considering having the data already
> pre-processed in disk files. Is this feasible, (preprocessed data
> would take at most as much space in disk as original data, is this
> cumbersome)?
>
> B) If there are only a few preprocessing functions that people use, I
> could re-implement them inside C++ and use them on the fly while
> loading the data from disk. This would be more difficult if everyone
> has their own customized R pre-processing functions.
>
> C) Another alternative is to allow users to use their own R
> pre-processing functions that pre-process the data. I would then go
> about preprocessing on the fly from inside C++ by doing calls back to
> R. This would be slower and harder to do than B).
>
> D) If DatABEL really does all the necessary pre-processing from inside
> C++, I could just directly use it or allow the user to specify what
> to use and won't need to re-implement the pre-processing functions.
> It seems, though, that preprocessing of the data takes from 30 mins to
> an hour into DatABEL filevector format.

I think it would be a good idea to rethink the DatABEL/filevector
format. As I already mentioned, if we could store the data in a
compressed way (while still retaining good speed and (relatively) low
RAM usage), life for the user would be much better.

>
>
> I would really appreciate any help that would clarify my
> understanding of how the pre-processing of data works and where it
> fits in the work-flow.

If you like we could set up a Skype call. I think that would help both
of us a lot in understanding each other.
Maybe Yurii and Maarten would like to participate as well?

Thanks again for showing interest in ProbABEL. I think we can learn a
lot from your expertise!


Best regards,

Lennart Karssen
(present maintainer of the ProbABEL package)

>
> Best regards,
>
> - Alvaro Frank

From yurii.aulchenko at gmail.com  Sat Jul 20 10:19:26 2013
From: yurii.aulchenko at gmail.com (Yurii Aulchenko)
Date: Sat, 20 Jul 2013 10:19:26 +0200
Subject: [GenABEL-dev] layout of GenABEL main page
In-Reply-To: <51E4F2A1.4040002@gmail.com>
References: <51D2C34D.2000907@gmail.com>
 <0177E59A-0CA1-4465-8186-A8EC79A20BB4@burlo.trieste.it>
 <6632A424-420E-423B-957A-3B8481DD0122@burlo.trieste.it>
 <51E4F2A1.4040002@gmail.com>
Message-ID:

The license is the one we decide on - I paid for the logo and own the
copyright et al.

So I was thinking we can release it under some license which would
allow people to play with it. At the same time I would like to make
sure that the original logo and the derivatives are used only for the
GenABEL project. Any ideas what a good license for that would be? I am
a bit lost on that... - some variant of a Creative Commons license?

YA

On Tue, Jul 16, 2013 at 9:13 AM, Maarten Kooyman wrote:

> Hi Yurii,
>
> Under what kind of licence are the logo's available?
Maybe it is handy to > put them on the website for easy access. > > Kind regards, > > Maarten > > On 07/15/2013 10:02 PM, Yurii Aulchenko wrote: > > Dear All, > > a small update - I have original vector graphics files from Grant at my > disposal; if some people would like to play with these files, send me a > message and I can forward the vector files to you. > > best, > Yurii > > On Fri, Jul 5, 2013 at 3:09 PM, Yurii Aulchenko > wrote: > >> >> >> On Fri, Jul 5, 2013 at 3:05 PM, Nicola Pirastu < >> nicola.pirastu at burlo.trieste.it> wrote: >> >>> I agree, in the end it's not the coca-cola logo and we have not been >>> using it for years so I don't think people are going to be confused if the >>> Logo changes in a few months. >>> >>> >> More than that - I really think it should evolve as our project does :) >> >> >> >>> I am actually curious to see how it will look on the forum. I do >>> think that if it's not too much work, the colors of the forum and website >>> should match those of the logo though. >>> >> >> Yep. I now start understanding why people were giving the costs >> estimates of few thousands of euro for the that basic design package: e.g. >> for facebook we need cover and avatar (latter would do for the twitter as >> well). So this is whole project :) >> >> May be later we should think of inviting some guys from a design school >> - they must be looking for graduation projects to make, and may be they >> would be willing to do that for free :) >> >> YA >> >> >>> >>> Nicola >>> >>> >>> Dr. Nicola Pirastu PhD >>> Research Fellow >>> Medical Sciences, Chirurgical and Health Department >>> University of Trieste >>> Medical Genetics >>> IRCCS Burlo Garofolo >>> Via dell'Istria 65/1 >>> 34137 Italy >>> tel. +390403785539 >>> >>> Il giorno 05/lug/2013, alle ore 14:55, Yurii Aulchenko < >>> yurii.aulchenko at gmail.com> ha scritto: >>> >>> I suggest that for the moment we go with what we have (Grant's variant); >>> we can change later. 
>>> >>> Please let me know if you have a strong opinion against! - I really >>> would like to use the logo for my presentation and also play a bit how well >>> it fits our pages (genabel.org, facebook, twitter) >>> >>> YA >>> >>> On Tue, Jul 2, 2013 at 4:27 PM, Nicola Pirastu < >>> nicola.pirastu at burlo.trieste.it> wrote: >>> >>>> Just to add my two cents to the discussion, >>>> >>>> I think that the problem is not with the DNA helix but with the font. >>>> I've played around a bit with it and if you use for example Helvetica or >>>> something less comic-sans-like it does look better. Also for some reason >>>> I'm still disturbed by the green but it is a very personal opinion.. >>>> >>>> Nicola >>>> >>>> Dr. Nicola Pirastu PhD >>>> Research Fellow >>>> Medical Sciences, Chirurgical and Health Department >>>> University of Trieste >>>> Medical Genetics >>>> IRCCS Burlo Garofolo >>>> Via dell'Istria 65/1 >>>> 34137 Italy >>>> tel. +390403785539 >>>> >>>> Il giorno 02/lug/2013, alle ore 14:38, Yurii Aulchenko < >>>> yurii.aulchenko at gmail.com> ha scritto: >>>> >>>> Dear All, >>>> >>>> I agree with critique of Maarten, and I actually still not sure if I >>>> like Maarten's or Grant's idea better. Interesting thing is that - not sure >>>> all realize it - Grant's variant is his vision of Maarten's prototype :) >>>> However, Grant's variant has an important advantage - it is ready to serve >>>> as logo. And I actually want to use a logo in my slides for UseR!-2013. >>>> >>>> So I suggest we take Grant's logo as a working variant. No doubt that >>>> the logo is going to evolve with time - as anything we do in the project - >>>> code, documentation; logo is no different, I think. The element which is >>>> going to stay and keep it recognizable is the way of spelling the GenABEL >>>> :) - Like Gnu's horns in the GNU logo. 
>>>> >>>> What we can do next is to place an open call on site/forum for other >>>> users to contribute, but this is going to take time, and meanwhile I >>>> suggest to stick with Grant's variant. >>>> >>>> Yurii >>>> >>>> On Tue, Jul 2, 2013 at 2:10 PM, Maarten Kooyman wrote: >>>> >>>>> Dear all, >>>>> >>>>> >>>>> It looks really nice ! Credits for who made it. However, I have more >>>>> the impression that it looks like a polypeptide chain or a rosary. The >>>>> seventies font is a matter of taste, but it remind me of comic >>>>> sans(including a upside down e as a). I wonder if it readable if you print >>>>> it on a poster: I think this is a important use-case of a scientific logo. >>>>> >>>>> Kind regards, >>>>> >>>>> >>>>> Maarten >>>>> >>>>> >>>>> >>>>> >>>>> On 07/02/2013 01:11 PM, Diego Fabregat Traver wrote: >>>>> >>>>>> On 28/06/13, Yurii Aulchenko wrote: >>>>>> >>>>>> How do you like this one? >>>>>>> >>>>>> I like it a lot. >>>>>> >>>>>> What do you think about reducing the font size for the subtitle >>>>>> and right-justifying it? Would it still be readable? I liked that >>>>>> detail from the previous attempts with the "Project" subtitle. >>>>>> >>>>>> In any case, this is just a minor detail. It looks great as it is. >>>>>> >>>>>> Thanks to Grant Borodin! >>>>>> >>>>>> >>>>>>> YA >>>>>>> >>>>>>> >>>>>>> On Thu, Jun 27, 2013 at 1:16 PM, Yurii Aulchenko < >>>>>>> yurii.aulchenko at gmail.com(javascript:main.compose()> wrote: >>>>>>> >>>>>>> >>>>>>> Dear Nicola, Diego, Lennart, >>>>>>>> >>>>>>>> >>>>>>>> Thanks for your feedback! I will ask Grant Borodin, who kindly >>>>>>>> designed these logos, if he could change C according to your comment >>>>>>>> (capital "ABEL" and "statistical genomics" as in F). 
>>>>>>>> >>>>>>>> >>>>>>>> >>>>>>>> >>>>>>>> Yurii >>>>>>>> >>>>>>>> >>>>>>>> >>>>>>>> On Wed, Jun 26, 2013 at 4:16 PM, Diego Fabregat Traver < >>>>>>>> fabregat at aices.rwth-aachen.de(javascript:main.compose()> wrote: >>>>>>>> >>>>>>>> >>>>>>>> >>>>>>>> >>>>>>>>> Congrats to whoever designed these logos, they look very nice :) >>>>>>>>> >>>>>>>>> >>>>>>>>> >>>>>>>>> With respect to my preferences, I fully agree with Lennart: "C >>>>>>>>> with capital ABEL and statistical genomics below it" would be my choice. >>>>>>>>> >>>>>>>>> >>>>>>>>> >>>>>>>>> Best, >>>>>>>>> >>>>>>>>> Diego >>>>>>>>> >>>>>>>>> >>>>>>>>> >>>>>>>>> >>>>>>>>> >>>>>>>>> >>>>>>>>> On 20/06/13, "L.C. Karssen" >>>>>>>> javascript:main.compose()> wrote: >>>>>>>>> >>>>>>>>> >>>>>>>>> >>>>>>>>> Wow! Those look really nice! >>>>>>>>>> I like options C and F the most. Actually a combination would be >>>>>>>>>> even >>>>>>>>>> better IMHO: use C with capital ABEL and statistical genomics >>>>>>>>>> below it. >>>>>>>>>> Looking forward to head the opinion of others, >>>>>>>>>> Lennart. >>>>>>>>>> On 20-06-13 09:34, Yurii Aulchenko wrote: >>>>>>>>>> >>>>>>>>>>> Please find attached few more logo variants >>>>>>>>>>> Yurii >>>>>>>>>>> >>>>>>>>>> >>>>>>>>> >>>>>>>> >>>>>>>> >>>>>>> >>>>>>> >>>>>>> _______________________________________________ >>>>>> genabel-devel mailing list >>>>>> genabel-devel at lists.r-forge.r-project.org >>>>>> >>>>>> https://lists.r-forge.r-project.org/cgi-bin/mailman/listinfo/genabel-devel >>>>>> >>>>> >>>>> _______________________________________________ >>>>> genabel-devel mailing list >>>>> genabel-devel at lists.r-forge.r-project.org >>>>> >>>>> https://lists.r-forge.r-project.org/cgi-bin/mailman/listinfo/genabel-devel >>>>> >>>> >>>> >>>> >>>> -- >>>> ----------------------------------------------------- >>>> Yurii S. 
Aulchenko >>>> >>>> [ LinkedIn ] [ Twitter] [ >>>> Blog ] >>>> _______________________________________________ >>>> genabel-devel mailing list >>>> genabel-devel at lists.r-forge.r-project.org >>>> >>>> https://lists.r-forge.r-project.org/cgi-bin/mailman/listinfo/genabel-devel >>>> >>>> >>>> AVVISO DI RISERVATEZZA Informazioni riservate possono essere contenute >>>> nel messaggio o nei suoi allegati. Se non siete i destinatari indicati nel >>>> messaggio, o responsabili per la sua consegna alla persona, o se avete >>>> ricevuto il messaggio per errore, siete pregati di non trascriverlo, >>>> copiarlo o inviarlo a nessuno. In tal caso vi invitiamo a cancellare il >>>> messaggio ed i suoi allegati. Grazie. CONFIDENTIALITY NOTICE Confidential >>>> information may be contained in this message or in its attachments. If you >>>> are not the addressee indicated in this message, or responsible for message >>>> delivering to that person, or if you have received this message in error, >>>> you may not transcribe, copy or deliver this message to anyone. In that >>>> case, you should delete this message and its attachments. Thank you. >>>> >>> >>> >>> >>> -- >>> ----------------------------------------------------- >>> Yurii S. Aulchenko >>> >>> [ LinkedIn ] [ Twitter] [ >>> Blog ] >>> >>> >>> AVVISO DI RISERVATEZZA Informazioni riservate possono essere >>> contenute nel messaggio o nei suoi allegati. Se non siete i destinatari >>> indicati nel messaggio, o responsabili per la sua consegna alla persona, o >>> se avete ricevuto il messaggio per errore, siete pregati di non >>> trascriverlo, copiarlo o inviarlo a nessuno. In tal caso vi invitiamo a >>> cancellare il messaggio ed i suoi allegati. Grazie. CONFIDENTIALITY NOTICE >>> Confidential information may be contained in this message or in its >>> attachments. 

--
Yurii S. Aulchenko

From yurii.aulchenko at gmail.com  Sat Jul 20 17:15:41 2013
From: yurii.aulchenko at gmail.com (Yurii Aulchenko)
Date: Sat, 20 Jul 2013 17:15:41 +0200
Subject: [GenABEL-dev] using reshuffle
Message-ID:

Hi Sodbo,

It seems that reshuffle does not work correctly; at least I cannot get
the expected results with it (see below). I use a dataset with ~107k
traits and ~280k SNPs.

Any idea? Am I doing something wrong?

YA

With perl-extractor I get chi2 of 62

ya567666 at cluster:~[167]$ perl extractCell.pl /hpcwork/df938257/natgen/B2 329 209602 | gawk '{print $_,($2/$4)^2}'
-0.165153577923775 0.580845952033997 0.0298683661967516 0.0734809562563896 -0.00155110028572381 62.4845

But this is not the case with reshuffle (and I also do not get any
output with reshuffle /hpcwork/df938257/natgen/B2 --chi=30, while I
know there are such chi2's in the results)

ya567666 at cluster:~[167]$ reshuffle /hpcwork/df938257/natgen/B2 --snps=209602 --traits=329 --chi
Finish iout_file read 0.11 sec
Start_write_chi_data=0.14 sec
End_write_chi_trait spm_1_AND_spmp_23 0.14 sec
Finish_write_chi_data 0.14 sec
Finish reshuffling 0.14 sec
ya567666 at cluster:~[168]$ cat chi_data.txt
SNP Trait beta_1 beta_SNP se_1 se_SNP cov_SNP_1 Chi2
rs4902242 spm_1_AND_spmp_23 -0.00234050769358873 -0.0338250175118446 0.128280490636826 0.0329618416726589 0.0770578160881996 1.05306001371466
ya567666 at cluster:~[169]$

From yurii.aulchenko at gmail.com  Sat Jul 20 17:26:57 2013
From: yurii.aulchenko at gmail.com (Yurii Aulchenko)
Date: Sat, 20 Jul 2013 17:26:57 +0200
Subject: [GenABEL-dev] using reshuffle
In-Reply-To:
References:
Message-ID:

Another point: apparently you do not check boundaries - e.g. when I try
to get results for trait #200,000 (I have only 107,000) I get a core
dump.

YA

On Sat, Jul 20, 2013 at 5:15 PM, Yurii Aulchenko wrote:

> Hi Sodbo,
>
> It seems that reshuffle does not work correctly, at least I can not get to
> the results with it (see below). I use a dataset with ~107k traits and
> ~280k SNPs.
>
> Any idea? - do I do something wrong?

From sharapovsodbo at gmail.com  Sat Jul 20 18:10:15 2013
From: sharapovsodbo at gmail.com (Sodbo Sharapov)
Date: Sat, 20 Jul 2013 23:10:15 +0700
Subject: [GenABEL-dev] using reshuffle
In-Reply-To:
References:
Message-ID:

Hello!
I'll check reshuffle tomorrow.

On 20.07.2013 22:26, "Yurii Aulchenko" wrote:

> Another point: apparently you do not check boundaries - e.g. when I try to
> get results for trait #200,000 (I have 107,000 only) I get the core dump.
>
> YA
>
> On Sat, Jul 20, 2013 at 5:15 PM, Yurii Aulchenko <
> yurii.aulchenko at gmail.com> wrote:
>
>> Hi Sodbo,
>>
>> It seems that reshuffle does not work correctly, at least I can not get
>> to the results with it (see below). I use a dataset with ~107k traits and
>> ~280k SNPs.

From alvaro.frank at rwth-aachen.de  Sun Jul 21 20:28:52 2013
From: alvaro.frank at rwth-aachen.de (Alvaro Jesus Frank)
Date: Sun, 21 Jul 2013 20:28:52 +0200
Subject: [GenABEL-dev] multiple ProbABEL's palinear runs
Message-ID:

Dear Lennart,

Thanks for the reply with all the useful information. Perhaps when I
have a prototype (computational core excluding real data handling)
working we could set up the Skype call?
Here are some follow-up questions.

>
> I'm not sure that that should be a requirement. At the moment the
> workflow is roughly the following:
> 1) prepare phenotype data (e.g. specify covariates, do QC like removing
> outliers, log transformation, etc.).
> This is done by each researcher
> independently, as they are the experts on their phenotypes.

> Usually only for the creation of the phenotype file. For a single
> (non-omics) phenotype like height, disease status, a blood lipid level,
> etc. this is easy. The researcher usually has these files (N_IDs rows,
> one column for the phenotype and a few columns for covariates like age,
> sex, age^2, etc).
> Of course, for omics data the number of phenotypes is much larger. But
> for that scenario OmicABEL is developed.

The purpose is to go along the lines of OmicABEL, where multiple
phenotypes can be used in the computation, while being as flexible as
possible towards existing ways of storing the multiple phenotype data.
I.e.: if the standard already is (for a single phenotype) to have a
.txt file for analysis, simply use the existing files in bulk. If
everyone stores this in their own way, then simply going the way of
OmicABEL would be best, requiring all phenotype files to be re-packaged
in DatABEL format. If everyone uses the same standard for phenotype
files, then I can just support those directly (supporting low memory
usage too, as this does not depend on how data is stored, but on how it
is accessed).

> 2) Imputation of genetic data is done centrally as this is a time
> consuming task,

It takes hours, to my understanding, right?

> that only needs to be redone if additional individuals
> have been genotyped or whenever a genomic reference set has been
> updated. This happens roughly once or twice per year.

Data in files on disk that is used in computations already went through
this process, right? (I.e. it is ready to compute.)

> FYI: An imputed data set of ~7000 individuals and ~20e6 imputed SNPs
> uses 459 GB in DatABEL format, the text-based mlinfo files take up 881
> MB and the gzipped dosage text files take up 59GB.
> The top item on my wishlist is a compressed form of the
> filevector/DatABEL files, as you can see from these numbers.

So the DatABEL binary file takes MORE space than the equivalent raw
dosage text files *.mldose (when gzipped)? What about when they are not
compressed? According to my calculations, if there are N=10^9 entries,
in binary you can store 1 entry in single precision in 32 bits
(4 bytes), for a total of N*4 bytes = 3.72 GiB. In a raw text file each
character requires 1 byte, so storing 9 characters to represent each
number would require at least N*9 bytes = 8.38 GiB, which is more than
double the size.

> Imputed genotype data "comes out of" the imputation
> software in the form of (possibly zipped) text files, the test.mldose
> (basically N_SNPs x N_ids) and test.mlinfo files (N_SNPs x ~7).
>
> The filevector/DatABEL file format is simply a way to store the dosage
> data in such a (binary) way that we don't need to load a complete text
> file into memory.

If users had the choice, what would they rather have the application
do:

a) Use the existing raw text .mldose file (files?) they already have,
without requiring the use of their entire memory at once (similar to
filevector).

b) Force them to transform their files into even more files that use
the filevector format, and have the application use those (also low
memory usage).

c) Something else?

> Actually, there isn't too much preprocessing going on. If we only look
> at dosage data the only thing that needs to be done for each SNP is to
> add the dosage data for each individual as a column to the (constant)
> matrix of covariate data to form the design matrix X.

This is the process that I refer to as X = [ XL | XR ], where the
design matrix X is formed like:
- Covariates XL that is constant (of size N_ids:rows,
  N_covariates:columns)
- XR that is built with dosage data and is different for each ___ what?
  (How? If the dosage data is a big sequence, how do you establish how
  much to take and add to XL to form X?)

> Because we want to allow for missing (genotype) data we have added some
> routines to get the data without missing values.

> That is another reasons why DatABEL (the R library interface to the
> filevector format) was developed.

This is already done in that central process that happens only once or
twice a year, like you mentioned before, right? Data sitting in files
already accounts for these missing values?

> Most people use imputed genotype data, there won't be many NA's
> there. On the other hand, since genotype imputation is done centrally
> for all genotype individuals, it is very common to have missing data in
> the phenotype file (i.e. Y and covariate data).

How does the processing of genotype data create missing pheno data? How
is this then corrected? (By the user/ProbABEL?)

Does this mean that if phenotype data is missing for an individual,
then this individual is simply not used in the calculation? I.e., in
the part of the regression where X' * Y is computed, the calculation is
not performed? Or does "missing data in the phenotype file" mean that Y
has missing rows and data must be dropped/filled (for non-covariate
entries)? I know that OmicABEL does averaging for the missing covariate
entries. Is this done for non-covariate missing entries?

If each phenotype file comes with both covariate data (which is
supposed to be constant) and phenotype data, does this mean that the
constant data is duplicated on disk?

> Not quite. Apart from the one-time only conversion of the text files
> with (imputed) genotype data to DatABEL format (which is done in R
> usually, but the filevector lib also has command line tools (written in
> C++) to do this), the end user doesn't do much with DatABEL (for
> pre-processing). Within ProbABEL we do some pre-processing (e.g. removal
> of individuals without genotype information),

How do you determine which these are? Does this mean that users leave
their phenotypic data uncorrected in files? (See previous question.) So
if genotype data is also missing for Y's that DO exist, these are also
dropped?
What other data manipulations that are not part of the regression
process are done inside ProbABEL?

> and in the loop over all
> SNPs the combining of the genotype information with the other covariates
> into the design matrix.
>

This is the formation of X = [ XL | XR ], right?

> The top item on my wishlist is a compressed form of the
> filevector/DatABEL files, as you can see from these numbers.
>
> I think it would be a good idea to rethink the DatABEL/filevector
> format. As I already mentioned, if we could store the data in a
> compressed way (while still retaining good speed and (relatively) low
> RAM usage life for the user would be much better.

I have looked into this and there are some solutions for data
compression of random floating point data. I am not sure how efficient
they are, but my guess is that disk usage can be reduced to around
60-70%.

It must be stated that loading data into memory is independent of how
it is stored. It is ALWAYS possible to load just parts of files into
memory, be it in filevector format or imputed *.mldose data. The
routines that DatABEL uses to load data into memory are the only thing
that needs to be worked on to support low memory usage, not the format
itself.

On another topic related to OmicABEL, I wish to know to what extent it
is used and, if it is not used widely, what the reason is. What hinders
its adoption to do multiple Xr and Y analysis?

Thanks again for the input!

-Alvaro Frank

From kooyman at gmail.com  Sun Jul 21 21:21:04 2013
From: kooyman at gmail.com (Maarten Kooyman)
Date: Sun, 21 Jul 2013 21:21:04 +0200
Subject: [GenABEL-dev] multiple ProbABEL's palinear runs
In-Reply-To:
References:
Message-ID: <51EC34A0.4060303@gmail.com>

Dear Alvaro,

I did some benchmarking on ProbABEL's palinear (without --mmscore
option) in the past, and as far as I can recall most of the time the
program spent was on getting the genotype data to the OLS part, and not
on the OLS part itself.
I could not find the results of the profiling, so I am not sure this was truly the case. Loading the genotypes only once instead of N times (where N is the number of phenotypes) would give a speed-up. However, be aware that with real-life data, outliers of the phenotypes are removed. If these outliers are not removed from your data, the number of false positives will be high. So the matrix X is unique for every phenotype. Since (X'*X)^-1 * X', which is part of b = (X'*X)^-1 * X' * y, is therefore not the same for each phenotype, the speed-up there will be hard(er) to get. I think that without the ability to censor phenotypes the program will not have much real-life use.

Kind regards,

Maarten

On 07/15/2013 05:07 PM, Alvaro Jesus Frank wrote:
> Dear all,
>
> I am working on a high performance implementation of an ordinary linear estimator (OLS model), similar to the one implemented in ProbABEL's palinear (without --mmscore option), where X are SNP given and Y are the phenotypes.
> (As given by the ProbABEL manual on section 7 "Methodology" at http://www.genabel.org/sites/default/files/pdfs/ProbABEL_manual.pdf)
>
>
> b = (X'*X)^-1 * X' * y.
>
> The goal is to solve this with multiple design matrices (SNPs??) X and Phenotypes Y. For this we compute the formula as
>
> for each X
>     for each Y
>         b=(X'*X)^-1 * X' * y.
>
>
> We want to offer the GenABEL community an Estimator to be used in the same way people use the current tools (ProbABEL in R), but faster, and capable of handling LARGE datasets (in disk & memory).
> That is why I am writing it in C++, while making sure that it can be called directly from R.
>
> My understanding:
> A few concerns came to mind when researching the workflow in using OMICS data in Linear Estimators.
> There seems to be a long process before the real life data from MaCH (test.mldose? for X and mlinfo? for Y) that is sitting on files can be used in calculations. The first concern is how to obtain the design matrices X from the files.
> > It is my understanding that there are two types of data, imputed data and databel data. Either way, data seems to be pre-processed early in the workflow; my impression is that this preprocessing is done in R. It also seems that R can't handle large amounts of data loaded in memory at once. > > From what I see, data comes with some irregularities in its values (missing values, invalid rows in X/Y matrices), and this makes it difficult to use Linear Estimators right away; this is why the preprocessing exists. DatABEL seems to be the R tool (implemented in C++) that can do fast pre-processing of big sets of data. Well, I think that DatABEL only does the reading and writing of files in C++ (called filevector), while the pre-processing functions are defined and implemented in R. Am I correct? > > > My Problems: > This is where my troubles start. Since I am trying to make this tool usable for the GenABEL community while still being able to handle TERABYTES of data with fast computations, I would really like to include the preprocessing of X and Y into my C++ workflow. To solve the memory and performance limitations of R, I am trying to load the data from disk from within C++. Since I am performing my estimator function in C++, it expects those matrices to have numbers that can be used for computation. Assuming that data must be preprocessed to be able to get valid matrices with usable numbers, I have the following options: > > A) > For performance reasons, I was considering having the data already pre-processed in disk files. Is this feasible, (preprocessed data would take at most as much space in disk as original data, is this cumbersome)? > > B) > If there are only a few preprocessing functions that people use, I could re-implement them inside C++ and use them on the fly while loading the data from disk. This would be more difficult if everyone has their own customized R pre-processing functions. 
>
> C)
> Another alternative is to allow users to use their own R pre-processing functions that pre-process the data. I would then go about preprocessing on the fly from inside C++ by doing calls back to R. This would be slower and harder to do than B).
>
> D)
> If DatABEL really does all the necessary pre-processing from inside C++, I could just directly use it or allow the user to specify what to use and won't need to re-implement the pre-processing functions. It seems, though, that preprocessing of the data takes from 30mins to an hour into DatABEL filevector format.
>
>
> I would really appreciate any help that would clarify my understanding of how the pre-processing of data works and where it fits in the work-flow.
>
> Best regards,
>
> - Alvaro Frank
> _______________________________________________
> genabel-devel mailing list
> genabel-devel at lists.r-forge.r-project.org
> https://lists.r-forge.r-project.org/cgi-bin/mailman/listinfo/genabel-devel

From sharapovsodbo at gmail.com Thu Jul 25 08:44:17 2013
From: sharapovsodbo at gmail.com (Sodbo Sharapov)
Date: Thu, 25 Jul 2013 13:44:17 +0700
Subject: [GenABEL-dev] using reshuffle
In-Reply-To:
References:
Message-ID:

Dear all!
I committed the newest version of reshuffle.
Now reshuffle works 2x faster!=)
Reasons:

-- ostringstream oss: caches the output

-- these declarations were moved out of the loops and placed above them:
double* buf = new double[per_trait_per_snp];
char s[30];

-- (int64_t) blablabla instead of (int64_t)bla + (int64_t)bla + (int64_t)bla

To find "hot spots" in reshuffle, I used

GNU Profiler
GNU Coverage testing tool

Very useful tools for finding the right places in a program to optimize!

Now a 5 GB CLAK-GWAS output is converted to 16 GB of txt files in 380 sec, or about 6 minutes.
Machine: Intel Core i7 930; 8 GB RAM (it is not a cluster node; I think on a cluster node reshuffle would run faster=)

There are still problems with extracting heritability and writing slim data.
I'll check them soon.

2013/7/20 Sodbo Sharapov

> Hello!
> I'll will check reshuffle tomorrow. > 20.07.2013 22:26 ???????????? "Yurii Aulchenko" > ???????: > > Another point: apparently you do not check boundaries - e.g. when I try to >> get results for trait #200,000 (I have 107,000 only) I get the core dump. >> >> YA >> >> On Sat, Jul 20, 2013 at 5:15 PM, Yurii Aulchenko < >> yurii.aulchenko at gmail.com> wrote: >> >>> Hi Sodbo, >>> >>> It seems that reshuffle does not work correctly, at least I can not get >>> to the results with it (see below). I use a dataset with ~107k traits and >>> ~280k SNPs. >>> >>> Any idea? - do I do something wrong? >>> >>> YA >>> >>> With perl-extractor I get chi2 of 62 >>> >>> ya567666 at cluster:~[167]$ perl extractCell.pl >>> /hpcwork/df938257/natgen/B2 329 209602 | gawk '{print $_,($2/$4)^2}' >>> -0.165153577923775 0.580845952033997 0.0298683661967516 >>> 0.0734809562563896 -0.00155110028572381 62.4845 >>> >>> But this is not the case with reshuffle (and also I do not get any >>> output with reshuffle /hpcwork/df938257/natgen/B2 --chi=30, while I know >>> there are such chi2's in the results) >>> >>> ya567666 at cluster:~[167]$ reshuffle /hpcwork/df938257/natgen/B2 >>> --snps=209602 --traits=329 --chi >>> Finish iout_file read 0.11 sec >>> Start_write_chi_data=0.14 sec >>> End_write_chi_trait spm_1_AND_spmp_23 0.14 sec >>> Finish_write_chi_data 0.14 sec >>> Finish reshuffling 0.14 sec >>> ya567666 at cluster:~[168]$ cat chi_data.txt >>> SNP Trait beta_1 beta_SNP se_1 se_SNP cov_SNP_1 >>> Chi2 >>> rs4902242 spm_1_AND_spmp_23 -0.00234050769358873 >>> -0.0338250175118446 0.128280490636826 0.0329618416726589 >>> 0.0770578160881996 1.05306001371466 >>> ya567666 at cluster:~[169]$ >>> >> >> >> >> -- >> ----------------------------------------------------- >> Yurii S. Aulchenko >> >> [ LinkedIn ] [ Twitter] [ >> Blog ] >> > -- *_________________________________* * *With best regards Sodbo Zh. 
Sharapov Phone: +79831347688 Email: sharapovsodbo at gmail.com sharapov at bionet.nsc.ru Skype: sharapovsodbo -------------- next part -------------- An HTML attachment was scrubbed... URL: From lennart at karssen.org Thu Jul 25 17:08:17 2013 From: lennart at karssen.org (L.C. Karssen) Date: Thu, 25 Jul 2013 17:08:17 +0200 Subject: [GenABEL-dev] using reshuffle In-Reply-To: References: Message-ID: <51F13F61.7040604@karssen.org> Hi Sodbo, On 25-07-13 08:44, ????? ??????? wrote: > Dear all! > I commited newest version of reshuffle > Now reshuffle works 2x faster!=) That's always good news! > Reasons: > > --ostringstream oss: outputs cache > > --exclude from cycle's and put them upper > double* buf = new double[per_trait_per_snp]; > char s[30]; > > --(int64_t) blablabla instead of (int64_t)bla + (int64_t)bla + > (int64_t)bla > > To find "hot spots" in reshuffle, I used > > GNU Profiler > GNU Coverage testing tool I vaguely remember having heard of the coverage testing tool, but I've never used it. Interesting! > > Very useful tools to find right places in programm to optimizate! > > Now 5Gb CLAK-GWAS output convert to 16 Gb txt files for 380 sec or 6 > minutes. > Machine: Intel Core i7 930; 8Gb RAM (it is not cluster's node, I think on > cluster's node reshuffle's run would be faster=) > > There are problems with extract heritability and write slim data. > I'll check soon > Thanks for all the work! Lennart. > > > > > 2013/7/20 ????? ??????? > >> Hello! >> I'll will check reshuffle tomorrow. >> 20.07.2013 22:26 ???????????? "Yurii Aulchenko" >> ???????: >> >> Another point: apparently you do not check boundaries - e.g. when I try to >>> get results for trait #200,000 (I have 107,000 only) I get the core dump. >>> >>> YA >>> >>> On Sat, Jul 20, 2013 at 5:15 PM, Yurii Aulchenko < >>> yurii.aulchenko at gmail.com> wrote: >>> >>>> Hi Sodbo, >>>> >>>> It seems that reshuffle does not work correctly, at least I can not get >>>> to the results with it (see below). 
I use a dataset with ~107k traits and >>>> ~280k SNPs. >>>> >>>> Any idea? - do I do something wrong? >>>> >>>> YA >>>> >>>> With perl-extractor I get chi2 of 62 >>>> >>>> ya567666 at cluster:~[167]$ perl extractCell.pl >>>> /hpcwork/df938257/natgen/B2 329 209602 | gawk '{print $_,($2/$4)^2}' >>>> -0.165153577923775 0.580845952033997 0.0298683661967516 >>>> 0.0734809562563896 -0.00155110028572381 62.4845 >>>> >>>> But this is not the case with reshuffle (and also I do not get any >>>> output with reshuffle /hpcwork/df938257/natgen/B2 --chi=30, while I know >>>> there are such chi2's in the results) >>>> >>>> ya567666 at cluster:~[167]$ reshuffle /hpcwork/df938257/natgen/B2 >>>> --snps=209602 --traits=329 --chi >>>> Finish iout_file read 0.11 sec >>>> Start_write_chi_data=0.14 sec >>>> End_write_chi_trait spm_1_AND_spmp_23 0.14 sec >>>> Finish_write_chi_data 0.14 sec >>>> Finish reshuffling 0.14 sec >>>> ya567666 at cluster:~[168]$ cat chi_data.txt >>>> SNP Trait beta_1 beta_SNP se_1 se_SNP cov_SNP_1 >>>> Chi2 >>>> rs4902242 spm_1_AND_spmp_23 -0.00234050769358873 >>>> -0.0338250175118446 0.128280490636826 0.0329618416726589 >>>> 0.0770578160881996 1.05306001371466 >>>> ya567666 at cluster:~[169]$ >>>> >>> >>> >>> >>> -- >>> ----------------------------------------------------- >>> Yurii S. Aulchenko >>> >>> [ LinkedIn ] [ Twitter] [ >>> Blog ] >>> >> > > > > > _______________________________________________ > genabel-devel mailing list > genabel-devel at lists.r-forge.r-project.org > https://lists.r-forge.r-project.org/cgi-bin/mailman/listinfo/genabel-devel > -- ----------------------------------------------------------------- L.C. Karssen Utrecht The Netherlands lennart at karssen.org http://blog.karssen.org Stuur mij aub geen Word of Powerpoint bestanden! Zie http://www.gnu.org/philosophy/no-word-attachments.nl.html ------------------------------------------------------------------ -------------- next part -------------- A non-text attachment was scrubbed... 
Name: signature.asc
Type: application/pgp-signature
Size: 230 bytes
Desc: OpenPGP digital signature
URL:

From lennart at karssen.org Tue Jul 30 09:32:42 2013
From: lennart at karssen.org (L.C. Karssen)
Date: Tue, 30 Jul 2013 09:32:42 +0200
Subject: [GenABEL-dev] Precision and scientific notation in ProbABEL
Message-ID: <51F76C1A.2000205@karssen.org>

Dear list,

I'm finalising version 0.4.0 of ProbABEL and there are two things I'd like your opinion on:

1) With what precision should we print the betas, standard errors and Chi^2 values to the output files?

2) Should we use scientific notation in the output (for betas, standard errors and Chi^2)?

In ProbABEL v0.3.0 and earlier, output was simply sent to cout without any explicit formatting. In practice this usually led to 6 significant digits, but sometimes fewer. My proposal is to fix the precision at 6 significant digits.

Regarding item 2): most of the betas I see are in the range between 0 and 10, although in case of no effect betas can be of the order of 1e-2, 1e-3. All in all, I don't think switching to scientific notation will improve the output.

What are your opinions?

Thanks,

Lennart.
--
-----------------------------------------------------------------
L.C. Karssen
Utrecht
The Netherlands

lennart at karssen.org
http://blog.karssen.org

Please do not send me Word or PowerPoint files!
See http://www.gnu.org/philosophy/no-word-attachments.nl.html
------------------------------------------------------------------
-------------- next part --------------
A non-text attachment was scrubbed...
Name: signature.asc
Type: application/pgp-signature
Size: 230 bytes
Desc: OpenPGP digital signature
URL:

From nicola.pirastu at burlo.trieste.it Tue Jul 30 10:33:04 2013
From: nicola.pirastu at burlo.trieste.it (Nicola Pirastu)
Date: Tue, 30 Jul 2013 10:33:04 +0200
Subject: [GenABEL-dev] Precision and scientific notation in ProbABEL
In-Reply-To: <51F76C1A.2000205@karssen.org>
References: <51F76C1A.2000205@karssen.org>
Message-ID: <5B307B30-9137-4A17-8A36-D43FC2818B94@burlo.trieste.it>

Dear Lennart,
I think that switching to scientific notation is not really necessary and could lead to a small loss of precision, unless of course you still use 6 significant digits, in which case it would merely reduce the number of zeros in the printed values.
If, for example, we were to choose scientific notation with 3 significant digits, this would not affect the final results very much, but we could be asked to submit more digits and would not be able to comply.
To summarize: as long as it has no effect on the performance of ProbABEL, I think 6 significant digits without scientific notation is fine.
Best
Nicola

Dr. Nicola Pirastu PhD
Research Fellow
Medical Sciences, Chirurgical and Health Department
University of Trieste
Medical Genetics
IRCCS Burlo Garofolo
Via dell'Istria 65/1
34137 Italy
tel. +390403785539

On 30 Jul 2013, at 09:32, "L.C. Karssen" wrote:

> Dear list,
>
> I'm finalising version 0.4.0 of ProbABEL and there are two things I'd
> like your opinion on:
>
> 1) with what precision should we print the betas, standard errors and
> Chi^2 values to the output files?
>
> 2) Should we use scientific notation in the output (for betas, standard
> errors and Chi^2)?
>
> In ProbABEL v0.3.0 and earlier output was simply sent to cout without
> any explicit formatting. In practice this usually led to 6 significant
> digits, but sometimes fewer. My proposal is to fix the precision at 6
> significant digits.
>
> Regarding item 2): most of the betas I see are in the range between 0
> and 10, although in case of no effect betas can be of the order of
> 1e-2, 1e-3. All in all, I don't think switching to scientific notation
> will improve the output.
>
>
> What are your opinions?
>
>
> Thanks,
>
> Lennart.
> --
> -----------------------------------------------------------------
> L.C. Karssen
> Utrecht
> The Netherlands
>
> lennart at karssen.org
> http://blog.karssen.org
>
> Please do not send me Word or PowerPoint files!
> See http://www.gnu.org/philosophy/no-word-attachments.nl.html
> ------------------------------------------------------------------
>
> _______________________________________________
> genabel-devel mailing list
> genabel-devel at lists.r-forge.r-project.org
> https://lists.r-forge.r-project.org/cgi-bin/mailman/listinfo/genabel-devel

CONFIDENTIALITY NOTICE
Confidential information may be contained in this message or in its attachments. If you are not the addressee indicated in this message, or responsible for message delivering to that person, or if you have received this message in error, you may not transcribe, copy or deliver this message to anyone. In that case, you should delete this message and its attachments. Thank you.
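
For reference, the proposal discussed in this thread (6 significant digits, no forced scientific notation) could be sketched with plain C++ iostreams roughly as follows. This is only an illustration under stated assumptions: format_sig and its pad_zeros parameter are hypothetical names, not ProbABEL code, and the actual v0.4.0 implementation may well differ.

```cpp
#include <iomanip>
#include <sstream>
#include <string>

// Hypothetical helper (not part of ProbABEL): format a value with a given
// number of significant digits in the stream's default floating-point
// notation. Without pad_zeros, trailing zeros are dropped, which is why
// unformatted output sometimes shows fewer than 6 digits. With pad_zeros,
// std::showpoint keeps the trailing zeros so the digit count is constant.
std::string format_sig(double value, int digits = 6, bool pad_zeros = false)
{
    std::ostringstream oss;
    if (pad_zeros) {
        oss << std::showpoint;
    }
    oss << std::setprecision(digits) << value;
    return oss.str();
}
```

With these settings, format_sig(62.48451234) gives "62.4845", format_sig(0.5) gives "0.5", and format_sig(0.5, 6, true) gives "0.500000". Note that the default notation still switches to scientific form on its own for very small or very large magnitudes (below 1e-4, or 1e6 and above when 6 digits are requested), so betas of the order of 1e-2 or 1e-3 stay in ordinary decimal form.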