From sharapovsodbo at gmail.com  Mon Jul  1 09:51:04 2013
From: sharapovsodbo at gmail.com (Sodbo Sharapov)
Date: Mon, 1 Jul 2013 14:51:04 +0700
Subject: [GenABEL-dev] bug_in_OmicABEL_reshuffle_fixed
Message-ID: <CAPF08KtzFsLPqgpTPQ3VYB_6e4Yh3rfjFB+808NfD2Hns3brjQ@mail.gmail.com>

Dear all!
I fixed a bug in OmicABEL_reshuffle.
The bug only affected big data sets. The reason is that, for big output data,
the value of tile_coordinate exceeds max(int).
For example, for data with 1080 ids and 122756 SNPs:
max(tile_coordinate) = 1080 (ids) * 122756 (SNPs) * 8 (sizeof(double)) * 5
(columns: beta_1, se_1, beta_SNP, se_SNP, etc.) = 5 303 059 200
max(int) = 2 147 483 647
max(unsigned int) = 4 294 967 295
Both of these values are lower than max(tile_coordinate). That is why the
tile_coordinates for about half of the data were incorrect and meaningless.
The solution is to change the type of the variables that hold
tile_coordinates: I chose int64_t instead of int.
max(int64_t) = 9,223,372,036,854,775,807. I think this is enough! =)
Now "reshuffle" works correctly with big data. Compilation for Linux and
Windows was successful.
-- 
*_________________________________*
*
*With best regards

Sodbo Zh. Sharapov
Phone:  +79831347688
Email:    sharapovsodbo at gmail.com
             sharapov at bionet.nsc.ru
Skype:   sharapovsodbo

From lennart at karssen.org  Mon Jul  1 11:06:08 2013
From: lennart at karssen.org (L.C. Karssen)
Date: Mon, 01 Jul 2013 11:06:08 +0200
Subject: [GenABEL-dev] bug_in_OmicABEL_reshuffle_fixed
In-Reply-To: <CAPF08KtzFsLPqgpTPQ3VYB_6e4Yh3rfjFB+808NfD2Hns3brjQ@mail.gmail.com>
References: <CAPF08KtzFsLPqgpTPQ3VYB_6e4Yh3rfjFB+808NfD2Hns3brjQ@mail.gmail.com>
Message-ID: <51D14680.4000908@karssen.org>

Thanks Sodbo, good work!

I've got a similar feature request/bug report for ProbABEL. Do you know
what the effect of going from unsigned int to int64 will be on memory
usage? In the case of ProbABEL it is mostly about the counters for SNPs
and samples/IDs, so my guess is that it wouldn't be much of an increase
(only a few extra bits for those counters); all the allelic
dosages/probabilities are stored as doubles, so that won't change.

Of course, going from unsigned int to int64 will mean people can load more
data at the same time, but in my opinion it is their responsibility to
have enough free memory (if they don't, ProbABEL will fail with
an allocation error).


Thanks,

Lennart.


-- 
-----------------------------------------------------------------
L.C. Karssen
Utrecht
The Netherlands

lennart at karssen.org
http://blog.karssen.org

Please do not send me Word or PowerPoint files!
See http://www.gnu.org/philosophy/no-word-attachments.nl.html
------------------------------------------------------------------


From yurii.aulchenko at gmail.com  Mon Jul  1 13:59:32 2013
From: yurii.aulchenko at gmail.com (Yurii Aulchenko)
Date: Mon, 1 Jul 2013 13:59:32 +0200
Subject: [GenABEL-dev] [Genabel-commits] r1264 - in pkg/OmicABEL: . doc
	src src/float2double
In-Reply-To: <20130701085630.EE01B18070E@r-forge.r-project.org>
References: <20130701085630.EE01B18070E@r-forge.r-project.org>
Message-ID: <5083713419523740697@unknownmsgid>

Diego, thanks for reacting so quickly and arranging the float2double
converter for filevector files!

Two questions/suggestions:

1) I wonder if float2double is a good name - could it be that the name is
already taken? Should we be more specific that this is related to
filevector?

2) You check whether inFile is of type float and stop execution if it is
not. Should the program also report what format the data is in? E.g. "The
inFile contains filevector-INT, but I can only convert
filevector-FLOAT to filevector-DOUBLE"?

These are suggestions for discussion - I do not have a strong opinion here.

YA

----------------------
Yurii Aulchenko
(sent from mobile device)

On 1 Jul 2013, at 10:56, "noreply at r-forge.r-project.org"
<noreply at r-forge.r-project.org> wrote:

> Author: dfabregat
> Date: 2013-07-01 10:56:30 +0200 (Mon, 01 Jul 2013)
> New Revision: 1264
>
> Added:
>   pkg/OmicABEL/src/float2double/
>   pkg/OmicABEL/src/float2double/float2double.c
> Modified:
>   pkg/OmicABEL/Makefile
>   pkg/OmicABEL/doc/HOWTO
> Log:
> Adding the program float2double to translate DatABEL
> "float" data into DatABEL "double" data.
>
>
> Modified: pkg/OmicABEL/Makefile
> ===================================================================
> --- pkg/OmicABEL/Makefile    2013-07-01 08:50:00 UTC (rev 1263)
> +++ pkg/OmicABEL/Makefile    2013-07-01 08:56:30 UTC (rev 1264)
> @@ -2,8 +2,10 @@
>
> SRCDIR = ./src
> RESH_SRCDIR = ./src/reshuffle
> +F2D_SRCDIR  = ./src/float2double
> CLAKGWAS  = ./bin/CLAK-GWAS
> RESHUFFLE = ./bin/reshuffle
> +F2D       = ./bin/float2double
>
> #QUICK and DIRTY
> CXX=g++
> @@ -15,11 +17,13 @@
> SRCS = $(SRCDIR)/CLAK_GWAS.c $(SRCDIR)/fgls_chol.c $(SRCDIR)/fgls_eigen.c $(SRCDIR)/wrappers.c $(SRCDIR)/timing.c $(SRCDIR)/statistics.c $(SRCDIR)/REML.c $(SRCDIR)/optimization.c $(SRCDIR)/ooc_BLAS.c $(SRCDIR)/double_buffering.c $(SRCDIR)/utils.c $(SRCDIR)/GWAS.c $(SRCDIR)/databel.c
> OBJS = $(SRCS:.c=.o)
> RESH_SRCS=$(RESH_SRCDIR)/main.cpp $(RESH_SRCDIR)/iout_file.cpp $(RESH_SRCDIR)/Parameters.cpp $(RESH_SRCDIR)/reshuffle.cpp $(RESH_SRCDIR)/test.cpp
> -RESH_OBJS = $(RESH_SRCS:.cpp=.o)
> +RESH_OBJS=$(RESH_SRCS:.cpp=.o)
> +F2D_SRCS=$(F2D_SRCDIR)/float2double.c
> +F2D_OBJS=$(F2D_SRCS:.c=.o) $(SRCDIR)/databel.o $(SRCDIR)/wrappers.o
>
> .PHONY: all clean
>
> -all: ./bin/ $(CLAKGWAS) $(RESHUFFLE)
> +all: ./bin/ $(CLAKGWAS) $(RESHUFFLE) $(F2D)
>
> ./bin:
>    mkdir bin
> @@ -31,15 +35,19 @@
>    cd $(RESH_SRCDIR)
>    $(CXX) $^ -o $@
>
> +$(F2D): $(F2D_OBJS)
> +    cd $(F2D_SRCDIR)
> +    $(CC) $^ -o $@
> +
> # Dirty, improve
> platform=Linux
> bindistDir=OmicABEL-$(platform)-bin
> -bindist: ./bin/ $(CLAKGWAS) $(RESHUFFLE)
> +bindist: ./bin/ $(CLAKGWAS) $(RESHUFFLE) $(F2D)
>    rm -rf $(bindistDir)
>    mkdir $(bindistDir)
>    mkdir $(bindistDir)/bin/
>    mkdir $(bindistDir)/doc/
> -    cp -a $(CLAKGWAS) $(RESHUFFLE) $(bindistDir)/bin/
> +    cp -a $(CLAKGWAS) $(RESHUFFLE) $(F2D) $(bindistDir)/bin/
>    cp -a COPYING LICENSE README DISCLAIMER.$(platform) $(bindistDir)
>    cp -a doc/README-reshuffle doc/INSTALL doc/HOWTO $(bindistDir)/doc
>    tar -czvf $(bindistDir).tgz $(bindistDir)
> @@ -52,6 +60,8 @@
>    $(RM) $(SRCDIR)/*opari_GPU*
>    $(RM) $(RESH_OBJS)
>    $(RM) $(RESHUFFLE)
> +    $(RM) $(F2D_OBJS)
> +    $(RM) $(F2D)
>
>
> src/CLAK_GWAS.o: src/CLAK_GWAS.c src/wrappers.h src/utils.h src/GWAS.h \
>
> Modified: pkg/OmicABEL/doc/HOWTO
> ===================================================================
> --- pkg/OmicABEL/doc/HOWTO    2013-07-01 08:50:00 UTC (rev 1263)
> +++ pkg/OmicABEL/doc/HOWTO    2013-07-01 08:56:30 UTC (rev 1264)
> @@ -5,6 +5,9 @@
>
> * CLAK-GWAS: the program to run GWAS analyses (through CLAK-Chol or CLAK-Eig)
> * reshuffle: the program to extract the output of CLAK-GWAS into text format
> +* float2double: the program to translate databel files (*.fvi, *.fvd)
> +                in single precision "float" format into double precision
> +                "double" format.
>
> The output produced by CLAK-GWAS is kept in a compact binary format
> for performance reasons. The user can then use "reshuffle" to
> @@ -21,6 +24,10 @@
>
> http://www.genabel.org/packages/OmicABEL
>
> +If you already prepared your data in DatABEL format, but you used
> +single precision (float) data. You can make use of float2double
> +to transform it into double precision (double) data.
> +
> If you need help, please contact us, or use the GenABEL project forum
>
> http://forum.genabel.org
> @@ -40,7 +47,7 @@
>
> The example in the tutorial also provides a basic example on using OmicABEL
> to run your GWAS analyses. Here we detail the options of CLAK-GWAS.
> -The complete list of options for CLAK-GWAS is avaliable through the command
> +The complete list of options for CLAK-GWAS is available through the command
>
> ./CLAK-GWAS -h
>
> @@ -84,3 +91,5 @@
>
>
> For a detailed description of "reshuffle", please refer to doc/README-reshuffle
> +
> +
>
> Added: pkg/OmicABEL/src/float2double/float2double.c
> ===================================================================
> --- pkg/OmicABEL/src/float2double/float2double.c                            (rev 0)
> +++ pkg/OmicABEL/src/float2double/float2double.c    2013-07-01 08:56:30 UTC (rev 1264)
> @@ -0,0 +1,142 @@
> +/*
> + * Copyright (c) 2010-2013, Diego Fabregat-Traver and Paolo Bientinesi.
> + * All rights reserved.
> + *
> + * This file is part of OmicABEL.
> + *
> + * OmicABEL is free software: you can redistribute it and/or modify
> + * it under the terms of the GNU General Public License as published by
> + * the Free Software Foundation, either version 3 of the License, or
> + * (at your option) any later version.
> + *
> + * OmicABEL is distributed in the hope that it will be useful,
> + * but WITHOUT ANY WARRANTY; without even the implied warranty of
> + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
> + * GNU General Public License for more details.
> + *
> + * You should have received a copy of the GNU General Public License
> + * along with OmicABEL. If not, see <http://www.gnu.org/licenses/>.
> + *
> + *
> + * Coded by:
> + *   Diego Fabregat-Traver (fabregat at aices.rwth-aachen.de)
> + */
> +
> +#include <stdio.h>
> +#include <stdlib.h>
> +#include <string.h>
> +
> +#include "../wrappers.h"
> +#include "../databel.h"
> +
> +#define MB (1L<<20)
> +#define STR_BUFFER_SIZE 256
> +
> +int main( int argc, char *argv[] )
> +{
> +    char  fin_path_fvi[STR_BUFFER_SIZE],
> +          fin_path_fvd[STR_BUFFER_SIZE],
> +         fout_path_fvi[STR_BUFFER_SIZE],
> +         fout_path_fvd[STR_BUFFER_SIZE];
> +    FILE *fin, *fout;
> +    struct databel_fvi *databel_in, *databel_out;
> +
> +    float *datain;
> +    double *dataout;
> +    size_t buff_size = 256*MB;
> +
> +    long long int nelems;
> +    int nelems_in_buff, nelems_to_write;
> +    int header_data_size;
> +
> +    int i, j, out;
> +
> +    if ( argc != 3 )
> +    {
> +        fprintf( stderr, "Usage: %s floatFileIn doubleFileOut\n", argv[0] );
> +        exit( EXIT_FAILURE );
> +    }
> +
> +    snprintf(  fin_path_fvi, STR_BUFFER_SIZE, "%s.fvi", argv[1] );
> +    snprintf(  fin_path_fvd, STR_BUFFER_SIZE, "%s.fvd", argv[1] );
> +    snprintf( fout_path_fvi, STR_BUFFER_SIZE, "%s.fvi", argv[2] );
> +    snprintf( fout_path_fvd, STR_BUFFER_SIZE, "%s.fvd", argv[2] );
> +
> +    // FVI files
> +    databel_in = load_databel_fvi( fin_path_fvi );
> +    if ( databel_in->fvi_header.type != FLOAT_TYPE )
> +    {
> +        fprintf( stderr, "Input databel file(s) %s should include \"float\" data\n", argv[1]);
> +        exit( EXIT_FAILURE );
> +    }
> +    databel_out = (databel_fvi *) fgls_malloc( sizeof(databel_fvi) );
> +    // Header
> +    databel_out->fvi_header.type = DOUBLE_TYPE;
> +    databel_out->fvi_header.nelements       = databel_in->fvi_header.nelements;
> +    databel_out->fvi_header.numObservations = databel_in->fvi_header.numObservations;
> +    databel_out->fvi_header.numVariables    = databel_in->fvi_header.numVariables;
> +    databel_out->fvi_header.bytesPerRecord  = sizeof( double );
> +    databel_out->fvi_header.bitsPerRecord   = databel_out->fvi_header.bytesPerRecord * 8;
> +    databel_out->fvi_header.namelength      = databel_in->fvi_header.namelength;
> +    for ( i = 0; i < RESERVEDSPACE; i++ )
> +        databel_out->fvi_header.reserved[i] = '\0';
> +    // Labels
> +    header_data_size = (databel_out->fvi_header.numVariables + databel_out->fvi_header.numObservations ) *
> +                        databel_out->fvi_header.namelength * sizeof(char);
> +    databel_out->fvi_data = (char *) fgls_malloc ( header_data_size );
> +    memcpy( databel_out->fvi_data, databel_in->fvi_data, header_data_size );
> +
> +    // Write
> +    fout = fgls_fopen( fout_path_fvi, "wb" );
> +    out = fwrite( &databel_out->fvi_header, sizeof(databel_fvi_header), 1, fout);
> +    if ( out != 1 )
> +    {
> +        fprintf(stderr, "Error writing fvi header\n" );
> +        exit( EXIT_FAILURE );
> +    }
> +    out = fwrite( databel_out->fvi_data,
> +                  databel_out->fvi_header.namelength * sizeof(char),
> +                  databel_out->fvi_header.numVariables + databel_out->fvi_header.numObservations,
> +                  fout);
> +    if ( out != (databel_out->fvi_header.numVariables + databel_out->fvi_header.numObservations) )
> +    {
> +        fprintf(stderr, "Error writing fvi data\n" );
> +        exit( EXIT_FAILURE );
> +    }
> +    fclose( fout );
> +
> +    // FVD
> +    fin  = fgls_fopen(  fin_path_fvd, "rb" );
> +    fout = fgls_fopen( fout_path_fvd, "wb" );
> +    // buff_size determines the size of the buffer for the "double" array.
> +    // For the same amount of elements, float needs half the memory space
> +    datain  = (float *)  fgls_malloc( buff_size / 2 );
> +    dataout = (double *) fgls_malloc( buff_size );
> +
> +    nelems = databel_out->fvi_header.numVariables * databel_out->fvi_header.numObservations; // total elems in file
> +    nelems_in_buff = buff_size / sizeof(double);
> +    for ( i = 0; i < nelems; i += nelems_in_buff )
> +    {
> +        nelems_to_write = ((nelems - i) >= nelems_in_buff) ? nelems_in_buff : nelems - i;
> +        if ( fread( datain, sizeof(float), nelems_to_write, fin ) != nelems_to_write )
> +        {
> +            fprintf( stderr, "Error reading data from %s\n", fin_path_fvd );
> +            exit( EXIT_FAILURE );
> +        }
> +        for ( j = 0; j < nelems_to_write; j++ )
> +            dataout[j] = (double)datain[j];
> +        if ( fwrite( dataout, sizeof(double), nelems_to_write, fout ) != nelems_to_write )
> +        {
> +            fprintf( stderr, "Error writing data to %s\n", fout_path_fvd );
> +            exit( EXIT_FAILURE );
> +        }
> +    }
> +    fclose( fin );
> +    fclose( fout );
> +    free( datain );
> +    free( dataout );
> +    free_databel_fvi( &databel_in );
> +    free_databel_fvi( &databel_out );
> +
> +    return 0;
> +}
>
> _______________________________________________
> Genabel-commits mailing list
> Genabel-commits at lists.r-forge.r-project.org
> https://lists.r-forge.r-project.org/cgi-bin/mailman/listinfo/genabel-commits

From yurii.aulchenko at gmail.com  Mon Jul  1 14:40:01 2013
From: yurii.aulchenko at gmail.com (Yurii Aulchenko)
Date: Mon, 1 Jul 2013 14:40:01 +0200
Subject: [GenABEL-dev] bug_in_OmicABEL_reshuffle_fixed
In-Reply-To: <CAPF08KtzFsLPqgpTPQ3VYB_6e4Yh3rfjFB+808NfD2Hns3brjQ@mail.gmail.com>
References: <CAPF08KtzFsLPqgpTPQ3VYB_6e4Yh3rfjFB+808NfD2Hns3brjQ@mail.gmail.com>
Message-ID: <CAHX9t6LD+bYBO8OjqpFCnzPcDrghpKws592nps0RfafT8Es+UA@mail.gmail.com>

Thanks, Sodbo - it does pass my test now! :)

This is actually very good - I was so depressed not seeing any association,
then happy to discover a bug, and now even more happy to see quite a few
significant hits!

YA


-- 
-----------------------------------------------------
Yurii S. Aulchenko

[ LinkedIn <http://nl.linkedin.com/in/yuriiaulchenko> ] [
Twitter<http://twitter.com/YuriiAulchenko>] [
Blog <http://yurii-aulchenko.blogspot.nl/> ]

From yurii.aulchenko at gmail.com  Mon Jul  1 14:50:44 2013
From: yurii.aulchenko at gmail.com (Yurii Aulchenko)
Date: Mon, 1 Jul 2013 14:50:44 +0200
Subject: [GenABEL-dev] update of OmicABEL binaries on genabel.org
Message-ID: <CAHX9t6KNAAkNh8bHgsTdSqJkyZsUV0k1ZwPPZLB+1OLihufoHQ@mail.gmail.com>

Dear Diego,

can you please compile the (updated) OmicABEL for Linux and push the
bin-dist to genabel.org?

Before that, should we also change the version number so people do not get
confused?

YA

From sharapovsodbo at gmail.com  Mon Jul  1 16:52:17 2013
From: sharapovsodbo at gmail.com (Sodbo Sharapov)
Date: Mon, 1 Jul 2013 21:52:17 +0700
Subject: [GenABEL-dev] bug_in_OmicABEL_reshuffle_fixed
In-Reply-To: <CAHX9t6LD+bYBO8OjqpFCnzPcDrghpKws592nps0RfafT8Es+UA@mail.gmail.com>
References: <CAPF08KtzFsLPqgpTPQ3VYB_6e4Yh3rfjFB+808NfD2Hns3brjQ@mail.gmail.com>
 <CAHX9t6LD+bYBO8OjqpFCnzPcDrghpKws592nps0RfafT8Es+UA@mail.gmail.com>
Message-ID: <CAPF08Ksv5D8ZU2r6fwketagpst44tSM4we3g_PFtFpO=Mu+p8A@mail.gmail.com>

Thank you, Lennart and Yurii=)

>I've got a similar feature request/bug report for ProbABEL, do you know
>what the effect of going from unsigned int to int64 will be on memory
>usage?

int64_t uses 8 bytes instead of the 4 bytes used by int.
In the case of "reshuffle", there are now only three int64_t variables, so,
as you can see, memory size is not a problem.
However, during "reshuffling" the tile_coordinates are computed many times
(about once per 5-10 doubles of data).
As a result, reshuffle's runtime for data [1080 traits; 122756 SNPs; 5 columns]
is now about 21 sec (this runtime is for the --chi=25 operation).
Before the correction, the runtime was about 16 sec, i.e. faster than now.

PS: I found some other bugs in reshuffle (with --heritabilities) and
also ways to optimize the work with big data. I'll fix them as soon as
possible.



-- 
*_________________________________*
*
*With best regards

Sodbo Zh. Sharapov
Phone:  +79831347688
Email:    sharapovsodbo at gmail.com
             sharapov at bionet.nsc.ru
Skype:   sharapovsodbo

From sharapovsodbo at gmail.com  Mon Jul  1 17:39:18 2013
From: sharapovsodbo at gmail.com (Sodbo Sharapov)
Date: Mon, 1 Jul 2013 22:39:18 +0700
Subject: [GenABEL-dev] OmicABEL_float2double_compilation_failed
Message-ID: <CAPF08KuS9JQ2RnmvhyU6oXoCC9RkFdZEWZCp9WBY=2PHyzBbaw@mail.gmail.com>

Hello!

I have a problem compiling float2double on Linux:

lima at mga:~/Sodbo/Packages/OmicABEL/src/float2double$ gcc float2double.c -Wall -o float2double
float2double.c: In function 'main':
float2double.c:67: error: expected expression before ')' token
float2double.c:74: error: expected expression before ';' token

and the same on Windows:

gcc float2double.c
float2double.c: In function 'main':
float2double.c:67:49: error: expected expression before ')' token
float2double.c:74:44: error: expected expression before ';' token

As far as I can judge, the problem is in these expressions:

1) if ( databel_in->fvi_header.type != FLOAT_TYPE ){}
2) databel_out->fvi_header.type = DOUBLE_TYPE;


-- 
*_________________________________*
*
*With best regards

Sodbo Zh. Sharapov
Phone:  +79831347688
Email:    sharapovsodbo at gmail.com
             sharapov at bionet.nsc.ru
Skype:   sharapovsodbo

From yurii.aulchenko at gmail.com  Mon Jul  1 21:13:25 2013
From: yurii.aulchenko at gmail.com (Yurii Aulchenko)
Date: Mon, 1 Jul 2013 21:13:25 +0200
Subject: [GenABEL-dev] OmicABEL_float2double_compilation_failed
In-Reply-To: <CAPF08KuS9JQ2RnmvhyU6oXoCC9RkFdZEWZCp9WBY=2PHyzBbaw@mail.gmail.com>
References: <CAPF08KuS9JQ2RnmvhyU6oXoCC9RkFdZEWZCp9WBY=2PHyzBbaw@mail.gmail.com>
Message-ID: <CAHX9t6KH+AvYZuCn3O4CP3c2PON_OcSeDsfPPpvuiO01qbm2ew@mail.gmail.com>

Sodbo - please check the Makefile - it looks like float2double makes use of
other source files as well!

YA


-- 
-----------------------------------------------------
Yurii S. Aulchenko

[ LinkedIn <http://nl.linkedin.com/in/yuriiaulchenko> ] [
Twitter<http://twitter.com/YuriiAulchenko>] [
Blog <http://yurii-aulchenko.blogspot.nl/> ]

From yurii.aulchenko at gmail.com  Tue Jul  2 09:27:34 2013
From: yurii.aulchenko at gmail.com (Yurii Aulchenko)
Date: Tue, 2 Jul 2013 09:27:34 +0200
Subject: [GenABEL-dev] bug_in_OmicABEL_reshuffle_fixed
In-Reply-To: <CAPF08Ksv5D8ZU2r6fwketagpst44tSM4we3g_PFtFpO=Mu+p8A@mail.gmail.com>
References: <CAPF08KtzFsLPqgpTPQ3VYB_6e4Yh3rfjFB+808NfD2Hns3brjQ@mail.gmail.com>
 <CAHX9t6LD+bYBO8OjqpFCnzPcDrghpKws592nps0RfafT8Es+UA@mail.gmail.com>
 <CAPF08Ksv5D8ZU2r6fwketagpst44tSM4we3g_PFtFpO=Mu+p8A@mail.gmail.com>
Message-ID: <CAHX9t6K6bV9N2XwE_izvWGH9uLqWPwQp=uyGVQ4=aG1wjbEj9w@mail.gmail.com>

On Mon, Jul 1, 2013 at 4:52 PM, Sodbo Sharapov <sharapovsodbo at gmail.com> wrote:

> PS: I found some other bugs in reshuffle (with --heritabilities) and
> also ways to optimize the work with big data. I'll fix them as soon as
> possible.
>
Yep, I noticed that outputs of --heritabilities are a bit strange (some
small negatives for parameters which should be positive) :)

keep us posted!

YA

From fabregat at aices.rwth-aachen.de  Tue Jul  2 10:56:01 2013
From: fabregat at aices.rwth-aachen.de (Diego Fabregat Traver)
Date: Tue, 02 Jul 2013 10:56:01 +0200
Subject: [GenABEL-dev] OmicABEL_float2double_compilation_failed
In-Reply-To: <CAPF08KuS9JQ2RnmvhyU6oXoCC9RkFdZEWZCp9WBY=2PHyzBbaw@mail.gmail.com>
References: <CAPF08KuS9JQ2RnmvhyU6oXoCC9RkFdZEWZCp9WBY=2PHyzBbaw@mail.gmail.com>
Message-ID: <fb9d29c71a4ef2.51d2b1c1@aices.rwth-aachen.de>

Hi Sodbo,

thanks for the report. I didn't commit databel.h, which
defines and assigns a value to the datatype identifiers.

It should work now. Please, let me know.

Best,
Diego



From fabregat at aices.rwth-aachen.de  Tue Jul  2 10:59:28 2013
From: fabregat at aices.rwth-aachen.de (Diego Fabregat Traver)
Date: Tue, 02 Jul 2013 10:59:28 +0200
Subject: [GenABEL-dev] OmicABEL_float2double_compilation_failed
In-Reply-To: <CAHX9t6KH+AvYZuCn3O4CP3c2PON_OcSeDsfPPpvuiO01qbm2ew@mail.gmail.com>
References: <CAPF08KuS9JQ2RnmvhyU6oXoCC9RkFdZEWZCp9WBY=2PHyzBbaw@mail.gmail.com>
 <CAHX9t6KH+AvYZuCn3O4CP3c2PON_OcSeDsfPPpvuiO01qbm2ew@mail.gmail.com>
Message-ID: <fbb5754e1a2948.51d2b290@aices.rwth-aachen.de>


On 01/07/13, Yurii Aulchenko  <yurii.aulchenko at gmail.com> wrote:

> Sodbo - please check the Makefile - it looks like float2double make use of other source files as well!

This is also true. With that compile line you will have linking errors.
For Linux, typing make at OmicABEL's root directory should work fine.


From sharapovsodbo at gmail.com  Tue Jul  2 11:03:58 2013
From: sharapovsodbo at gmail.com (=?KOI8-R?B?88/Ews8g+8HSwdDP1w==?=)
Date: Tue, 2 Jul 2013 16:03:58 +0700
Subject: [GenABEL-dev] OmicABEL_float2double_compilation_failed
In-Reply-To: <fb9d29c71a4ef2.51d2b1c1@aices.rwth-aachen.de>
References: <CAPF08KuS9JQ2RnmvhyU6oXoCC9RkFdZEWZCp9WBY=2PHyzBbaw@mail.gmail.com>
 <fb9d29c71a4ef2.51d2b1c1@aices.rwth-aachen.de>
Message-ID: <CAPF08KvNOCObFX4rFBCr3PS6hggGVTfzH-wmcaN4wGBhx=ZOig@mail.gmail.com>

Great!
float2double now compiles successfully!
Thank you!


2013/7/2 Diego Fabregat Traver <fabregat at aices.rwth-aachen.de>

> Hi Sodbo,
>
> thanks for the report. I didn't commit databel.h, which
> defines and assigns a value to the datatype identifiers.
>
> It should work now. Please, let me know.
>
> Best,
> Diego
>
> > Hello!
> >
> > I have a problem compiling float2double for Linux:
> > lima at mga:~/Sodbo/Packages/OmicABEL/src/float2double$ gcc float2double.c -Wall -o float2double
> > float2double.c: In function 'main':
> > float2double.c:67: error: expected expression before ')' token
> > float2double.c:74: error: expected expression before ';' token
> >
> > and for Windows the same:
> >
> > gcc float2double.c
> > float2double.c: In function 'main':
> > float2double.c:67:49: error: expected expression before ')' token
> > float2double.c:74:44: error: expected expression before ';' token
> >
> > As far as I can judge, the problem is in these expressions:
> >
> > 1) if ( databel_in->fvi_header.type != FLOAT_TYPE ){}
> > 2) databel_out->fvi_header.type = DOUBLE_TYPE;


-- 
With best regards

Sodbo Zh. Sharapov
Phone:  +79831347688
Email:    sharapovsodbo at gmail.com
             sharapov at bionet.nsc.ru
Skype:   sharapovsodbo

From yurii.aulchenko at gmail.com  Tue Jul  2 11:22:48 2013
From: yurii.aulchenko at gmail.com (Yurii Aulchenko)
Date: Tue, 2 Jul 2013 11:22:48 +0200
Subject: [GenABEL-dev] [Genabel-commits] r1267 - pkg/OmicABEL/src
In-Reply-To: <20130702085259.3ECDE184468@r-forge.r-project.org>
References: <20130702085259.3ECDE184468@r-forge.r-project.org>
Message-ID: <6167880795671206958@unknownmsgid>

Diego,

I understand this file is part of filevector. In that case, might it be
better to have a symlink instead of a hard copy? This is what we do
for, say, DatA, MixA and GenA.

Y

----------------------
Yurii Aulchenko
(sent from mobile device)

On 2 Jul 2013, at 10:53, "noreply at r-forge.r-project.org"
<noreply at r-forge.r-project.org> wrote:

> Author: dfabregat
> Date: 2013-07-02 10:52:58 +0200 (Tue, 02 Jul 2013)
> New Revision: 1267
>
> Modified:
>   pkg/OmicABEL/src/databel.h
> Log:
> Defining DatABEL datatypes and their associated value
> for *.fvi headers.
>
>
> Modified: pkg/OmicABEL/src/databel.h
> ===================================================================
> --- pkg/OmicABEL/src/databel.h    2013-07-01 12:55:37 UTC (rev 1266)
> +++ pkg/OmicABEL/src/databel.h    2013-07-02 08:52:58 UTC (rev 1267)
> @@ -25,14 +25,14 @@
> #ifndef DATABEL_H
> #define DATABEL_H
>
> -#define UNSIGNED_SHORT_INT_TYPE
> -#define SHORT_INT_TYPE
> -#define UNSIGNED_INT_TYPE
> -#define INT_TYPE
> -#define FLOAT_TYPE
> -#define DOUBLE_TYPE
> -#define SIGNED_CHAR_TYPE
> -#define UNSIGNED_CHAR_TYPE
> +enum datatype{ UNSIGNED_SHORT_INT_TYPE = 1,
> +               SHORT_INT_TYPE,
> +               UNSIGNED_INT_TYPE,
> +               INT_TYPE,
> +               FLOAT_TYPE,
> +               DOUBLE_TYPE,
> +               SIGNED_CHAR_TYPE,
> +               UNSIGNED_CHAR_TYPE };
>
> #define NAMELENGTH 32
> #define RESERVEDSPACE 5
>
> _______________________________________________
> Genabel-commits mailing list
> Genabel-commits at lists.r-forge.r-project.org
> https://lists.r-forge.r-project.org/cgi-bin/mailman/listinfo/genabel-commits

From fabregat at aices.rwth-aachen.de  Tue Jul  2 11:45:17 2013
From: fabregat at aices.rwth-aachen.de (Diego Fabregat Traver)
Date: Tue, 02 Jul 2013 11:45:17 +0200
Subject: [GenABEL-dev] [Genabel-commits] r1267 - pkg/OmicABEL/src
Message-ID: <fb5e41811a0d8c.51d2bd4d@aices.rwth-aachen.de>


On 02/07/13, Yurii Aulchenko  <yurii.aulchenko at gmail.com> wrote:

> Diego,
> 
> I understand this file is the part of filevector. In that may it be
> better to have a symlink instead of hard copy? - this is what we do
> for say DatA, MixA and GenA.

I am not sure what you mean by "part of". If you mean a copy of a file from
filevector, it is not. If you mean related, yes it is.

databel.{c,h} in OmicABEL is just a small module with a couple of utilities:

https://r-forge.r-project.org/scm/viewvc.php/pkg/OmicABEL/src/databel.h?view=markup&root=genabel
https://r-forge.r-project.org/scm/viewvc.php/pkg/OmicABEL/src/databel.c?view=markup&root=genabel

Diego

> Y
> 
> ----------------------
> Yurii Aulchenko
> (sent from mobile device)
> 
> On 2 Jul 2013, at 10:53, "noreply at r-forge.r-project.org"
> <noreply at r-forge.r-project.org> wrote:
> 
> > Author: dfabregat
> > Date: 2013-07-02 10:52:58 +0200 (Tue, 02 Jul 2013)
> > New Revision: 1267
> >
> > Modified:
> >   pkg/OmicABEL/src/databel.h
> > Log:
> > Defining DatABEL datatypes and their associated value
> > for *.fvi headers.
> >
> >
> > Modified: pkg/OmicABEL/src/databel.h
> > ===================================================================
> > --- pkg/OmicABEL/src/databel.h    2013-07-01 12:55:37 UTC (rev 1266)
> > +++ pkg/OmicABEL/src/databel.h    2013-07-02 08:52:58 UTC (rev 1267)
> > @@ -25,14 +25,14 @@
> > #ifndef DATABEL_H
> > #define DATABEL_H
> >
> > -#define UNSIGNED_SHORT_INT_TYPE
> > -#define SHORT_INT_TYPE
> > -#define UNSIGNED_INT_TYPE
> > -#define INT_TYPE
> > -#define FLOAT_TYPE
> > -#define DOUBLE_TYPE
> > -#define SIGNED_CHAR_TYPE
> > -#define UNSIGNED_CHAR_TYPE
> > +enum datatype{ UNSIGNED_SHORT_INT_TYPE = 1,
> > +               SHORT_INT_TYPE,
> > +               UNSIGNED_INT_TYPE,
> > +               INT_TYPE,
> > +               FLOAT_TYPE,
> > +               DOUBLE_TYPE,
> > +               SIGNED_CHAR_TYPE,
> > +               UNSIGNED_CHAR_TYPE };
> >
> > #define NAMELENGTH 32
> > #define RESERVEDSPACE 5
> >

From yurii.aulchenko at gmail.com  Tue Jul  2 12:38:08 2013
From: yurii.aulchenko at gmail.com (Yurii Aulchenko)
Date: Tue, 2 Jul 2013 12:38:08 +0200
Subject: [GenABEL-dev] [Genabel-commits] r1267 - pkg/OmicABEL/src
In-Reply-To: <fb5e41811a0d8c.51d2bd4d@aices.rwth-aachen.de>
References: <fb5e41811a0d8c.51d2bd4d@aices.rwth-aachen.de>
Message-ID: <CAHX9t6KgSuV5xt9PEOaRwpKYS--iqEL29jR1ewtxqf0N_uaBsg@mail.gmail.com>

Ah, OK, I thought it was a copy; sorry for the confusion.

In principle we should think of tighter integration between OmicA and
filevector/DatA, but this is not something for 5 minutes :)

YA

On Tue, Jul 2, 2013 at 11:45 AM, Diego Fabregat Traver <
fabregat at aices.rwth-aachen.de> wrote:

>
>
> On 02/07/13, Yurii Aulchenko  <yurii.aulchenko at gmail.com> wrote:
>
> > Diego,
> >
> > I understand this file is the part of filevector. In that may it be
> > better to have a symlink instead of hard copy? - this is what we do
> > for say DatA, MixA and GenA.
>
> I am not sure what you mean by "part of". If you mean a copy of a file from
> filevector, it is not. If you mean related, yes it is.
>
> databel.{c,h} is OmicABEL is just a small module with a couple utilities:
>
>
> https://r-forge.r-project.org/scm/viewvc.php/pkg/OmicABEL/src/databel.h?view=markup&root=genabel
>
> https://r-forge.r-project.org/scm/viewvc.php/pkg/OmicABEL/src/databel.c?view=markup&root=genabel
>
> Diego
>
> > Y
> >
> > ----------------------
> > Yurii Aulchenko
> > (sent from mobile device)
> >
> > On 2 Jul 2013, at 10:53, "noreply at r-forge.r-project.org"
> > <noreply at r-forge.r-project.org> wrote:
> >
> > > Author: dfabregat
> > > Date: 2013-07-02 10:52:58 +0200 (Tue, 02 Jul 2013)
> > > New Revision: 1267
> > >
> > > Modified:
> > >   pkg/OmicABEL/src/databel.h
> > > Log:
> > > Defining DatABEL datatypes and their associated value
> > > for *.fvi headers.
> > >
> > >
> > > Modified: pkg/OmicABEL/src/databel.h
> > > ===================================================================
> > > --- pkg/OmicABEL/src/databel.h    2013-07-01 12:55:37 UTC (rev 1266)
> > > +++ pkg/OmicABEL/src/databel.h    2013-07-02 08:52:58 UTC (rev 1267)
> > > @@ -25,14 +25,14 @@
> > > #ifndef DATABEL_H
> > > #define DATABEL_H
> > >
> > > -#define UNSIGNED_SHORT_INT_TYPE
> > > -#define SHORT_INT_TYPE
> > > -#define UNSIGNED_INT_TYPE
> > > -#define INT_TYPE
> > > -#define FLOAT_TYPE
> > > -#define DOUBLE_TYPE
> > > -#define SIGNED_CHAR_TYPE
> > > -#define UNSIGNED_CHAR_TYPE
> > > +enum datatype{ UNSIGNED_SHORT_INT_TYPE = 1,
> > > +               SHORT_INT_TYPE,
> > > +               UNSIGNED_INT_TYPE,
> > > +               INT_TYPE,
> > > +               FLOAT_TYPE,
> > > +               DOUBLE_TYPE,
> > > +               SIGNED_CHAR_TYPE,
> > > +               UNSIGNED_CHAR_TYPE };
> > >
> > > #define NAMELENGTH 32
> > > #define RESERVEDSPACE 5
> > >
>


-- 
-----------------------------------------------------
Yurii S. Aulchenko

[ LinkedIn <http://nl.linkedin.com/in/yuriiaulchenko> ] [
Twitter<http://twitter.com/YuriiAulchenko>] [
Blog <http://yurii-aulchenko.blogspot.nl/> ]

From fabregat at aices.rwth-aachen.de  Tue Jul  2 13:11:51 2013
From: fabregat at aices.rwth-aachen.de (Diego Fabregat Traver)
Date: Tue, 02 Jul 2013 13:11:51 +0200
Subject: [GenABEL-dev] layout of GenABEL main page
Message-ID: <fb5e10b31a0eb5.51d2d197@aices.rwth-aachen.de>

On 28/06/13, Yurii Aulchenko  <yurii.aulchenko at gmail.com> wrote:

> How do you like this one?

I like it a lot. 

What do you think about reducing the font size for the subtitle 
and right-justifying it? Would it still be readable? I liked that 
detail from the previous attempts with the "Project" subtitle.

In any case, this is just a minor detail. It looks great as it is.

Thanks to Grant Borodin!
 
> YA
> 
> 
> On Thu, Jun 27, 2013 at 1:16 PM, Yurii Aulchenko <yurii.aulchenko at gmail.com> wrote:
> 
> > Dear Nicola, Diego, Lennart,
> > 
> > Thanks for your feedback! I will ask Grant Borodin, who kindly designed these logos, if he could change C according to your comment (capital "ABEL" and "statistical genomics" as in F).
> > 
> > Yurii
> > 
> > On Wed, Jun 26, 2013 at 4:16 PM, Diego Fabregat Traver <fabregat at aices.rwth-aachen.de> wrote:
> > 
> > > Congrats to whoever designed these logos, they look very nice :)
> > > 
> > > With respect to my preferences, I fully agree with Lennart: "C with capital ABEL and statistical genomics below it" would be my choice.
> > > 
> > > Best,
> > > Diego
> > > 
> > > On 20/06/13, "L.C. Karssen" <lennart at karssen.org> wrote:
> > > 
> > > > Wow! Those look really nice!
> > > >
> > > > I like options C and F the most. Actually a combination would be even
> > > > better IMHO: use C with capital ABEL and statistical genomics below it.
> > > >
> > > > Looking forward to hearing the opinions of others,
> > > >
> > > > Lennart.
> > > >
> > > > On 20-06-13 09:34, Yurii Aulchenko wrote:
> > > > > Please find attached few more logo variants
> > > > >
> > > > > Yurii

From kooyman at gmail.com  Tue Jul  2 14:10:53 2013
From: kooyman at gmail.com (Maarten Kooyman)
Date: Tue, 02 Jul 2013 14:10:53 +0200
Subject: [GenABEL-dev] layout of GenABEL main page
In-Reply-To: <fb5e10b31a0eb5.51d2d197@aices.rwth-aachen.de>
References: <fb5e10b31a0eb5.51d2d197@aices.rwth-aachen.de>
Message-ID: <51D2C34D.2000907@gmail.com>

Dear all,


It looks really nice! Credit to whoever made it. However, my impression
is that it looks more like a polypeptide chain or a rosary. The
seventies font is a matter of taste, but it reminds me of Comic Sans
(including an upside-down e as an a). I wonder whether it is readable
when printed on a poster; I think this is an important use case for a
scientific logo.

Kind regards,


Maarten


On 07/02/2013 01:11 PM, Diego Fabregat Traver wrote:
> On 28/06/13, Yurii Aulchenko  <yurii.aulchenko at gmail.com> wrote:
>
>> How do you like this one?
> I like it a lot.
>
> What do you think about reducing the font size for the subtitle
> and right-justifying it? Would it still be readable? I liked that
> detail from the previous attempts with the "Project" subtitle.
>
> In any case, this is just a minor detail. It looks great as it is.
>
> Thanks to Grant Borodin!
>   
>> YA
>>
>>
>> On Thu, Jun 27, 2013 at 1:16 PM, Yurii Aulchenko <yurii.aulchenko at gmail.com> wrote:
>>
>>
>>> Dear Nicola, Diego, Lennart,
>>>
>>>
>>> Thanks for your feedback! I will ask Grant Borodin, who kindly designed these logos, if he could change C according to your comment (capital "ABEL" and "statistical genomics" as in F).
>>>
>>>
>>>
>>>
>>> Yurii
>>>
>>>
>>>
>>> On Wed, Jun 26, 2013 at 4:16 PM, Diego Fabregat Traver <fabregat at aices.rwth-aachen.de> wrote:
>>>
>>>
>>>
>>>>
>>>> Congrats to whoever designed these logos, they look very nice :)
>>>>
>>>>
>>>>
>>>> With respect to my preferences, I fully agree with Lennart: "C with capital ABEL and statistical genomics below it" would be my choice.
>>>>
>>>>
>>>>
>>>> Best,
>>>>
>>>> Diego
>>>>
>>>>
>>>>
>>>>
>>>>
>>>>
>>>> On 20/06/13, "L.C. Karssen"  <lennart at karssen.org> wrote:
>>>>
>>>>
>>>>
>>>>> Wow! Those look really nice!
>>>>> I like options C and F the most. Actually a combination would be even
>>>>> better IMHO: use C with capital ABEL and statistical genomics below it.
>>>>> Looking forward to head the opinion of others,
>>>>> Lennart.
>>>>> On 20-06-13 09:34, Yurii Aulchenko wrote:
>>>>>> Please find attached few more logo variants
>>>>>> Yurii
>>>>
>>>
>>>
>>
>>
>>


From yurii.aulchenko at gmail.com  Tue Jul  2 14:38:39 2013
From: yurii.aulchenko at gmail.com (Yurii Aulchenko)
Date: Tue, 2 Jul 2013 14:38:39 +0200
Subject: [GenABEL-dev] layout of GenABEL main page
In-Reply-To: <51D2C34D.2000907@gmail.com>
References: <fb5e10b31a0eb5.51d2d197@aices.rwth-aachen.de>
 <51D2C34D.2000907@gmail.com>
Message-ID: <CAHX9t6LXPDT7UR3+Dn1htv1pWBuvJtSAn_zi3ANmt3FYa5pnLw@mail.gmail.com>

Dear All,

I agree with Maarten's critique, and I am actually still not sure whether
I like Maarten's or Grant's idea better. The interesting thing (I am not
sure everyone realizes it) is that Grant's variant is his vision of
Maarten's prototype :) However, Grant's variant has an important
advantage: it is ready to serve as a logo. And I actually want to use a
logo in my slides for UseR!-2013.

So I suggest we take Grant's logo as a working variant. No doubt the logo
is going to evolve with time, like anything we do in the project (code,
documentation); the logo is no different, I think. The element that is
going to stay and keep it recognizable is the way of spelling GenABEL :),
like the gnu's horns in the GNU logo.

What we can do next is place an open call on the site/forum for other
users to contribute, but this is going to take time, and meanwhile I
suggest we stick with Grant's variant.

Yurii

On Tue, Jul 2, 2013 at 2:10 PM, Maarten Kooyman <kooyman at gmail.com> wrote:

> Dear all,
>
>
> It looks really nice ! Credits for who made it.  However, I have more the
> impression that it looks like a polypeptide chain or a rosary. The
> seventies font is a matter of taste, but it remind me of comic
> sans(including a upside down e as a). I wonder if it readable if you print
> it on a poster: I think this is a important use-case of a scientific logo.
>
> Kind regards,
>
>
> Maarten
>
>
>
>
> On 07/02/2013 01:11 PM, Diego Fabregat Traver wrote:
>
>> On 28/06/13, Yurii Aulchenko  <yurii.aulchenko at gmail.com> wrote:
>>
>>  How do you like this one?
>>>
>> I like it a lot.
>>
>> What do you think about reducing the font size for the subtitle
>> and right-justifying it? Would it still be readable? I liked that
>> detail from the previous attempts with the "Project" subtitle.
>>
>> In any case, this is just a minor detail. It looks great as it is.
>>
>> Thanks to Grant Borodin!
>>
>>
>>> YA
>>>
>>>
>>> On Thu, Jun 27, 2013 at 1:16 PM, Yurii Aulchenko <
>>> yurii.aulchenko at gmail.com> wrote:
>>>
>>>
>>>  Dear Nicola, Diego, Lennart,
>>>>
>>>>
>>>> Thanks for your feedback! I will ask Grant Borodin, who kindly designed
>>>> these logos, if he could change C according to your comment (capital "ABEL"
>>>> and "statistical genomics" as in F).
>>>>
>>>>
>>>>
>>>>
>>>> Yurii
>>>>
>>>>
>>>>
>>>> On Wed, Jun 26, 2013 at 4:16 PM, Diego Fabregat Traver <
>>>> fabregat at aices.rwth-aachen.de> wrote:
>>>>
>>>>
>>>>
>>>>
>>>>> Congrats to whoever designed these logos, they look very nice :)
>>>>>
>>>>>
>>>>>
>>>>> With respect to my preferences, I fully agree with Lennart: "C with
>>>>> capital ABEL and statistical genomics below it" would be my choice.
>>>>>
>>>>>
>>>>>
>>>>> Best,
>>>>>
>>>>> Diego
>>>>>
>>>>>
>>>>>
>>>>>
>>>>>
>>>>>
>>>>> On 20/06/13, "L.C. Karssen"  <lennart at karssen.org>
>>>>> wrote:
>>>>>
>>>>>
>>>>>
>>>>>  Wow! Those look really nice!
>>>>>> I like options C and F the most. Actually a combination would be even
>>>>>> better IMHO: use C with capital ABEL and statistical genomics below
>>>>>> it.
>>>>>> Looking forward to head the opinion of others,
>>>>>> Lennart.
>>>>>> On 20-06-13 09:34, Yurii Aulchenko wrote:
>>>>>>
>>>>>>> Please find attached few more logo variants
>>>>>>> Yurii
>>>>>>>
>>>>>>
>>>>>
>>>>
>>>>
>>>
>>>


-- 
-----------------------------------------------------
Yurii S. Aulchenko

[ LinkedIn <http://nl.linkedin.com/in/yuriiaulchenko> ] [
Twitter<http://twitter.com/YuriiAulchenko>] [
Blog <http://yurii-aulchenko.blogspot.nl/> ]

From nicola.pirastu at burlo.trieste.it  Tue Jul  2 16:27:48 2013
From: nicola.pirastu at burlo.trieste.it (Nicola Pirastu)
Date: Tue, 2 Jul 2013 16:27:48 +0200
Subject: [GenABEL-dev] layout of GenABEL main page
In-Reply-To: <CAHX9t6LXPDT7UR3+Dn1htv1pWBuvJtSAn_zi3ANmt3FYa5pnLw@mail.gmail.com>
References: <fb5e10b31a0eb5.51d2d197@aices.rwth-aachen.de>
 <51D2C34D.2000907@gmail.com>
 <CAHX9t6LXPDT7UR3+Dn1htv1pWBuvJtSAn_zi3ANmt3FYa5pnLw@mail.gmail.com>
Message-ID: <0177E59A-0CA1-4465-8186-A8EC79A20BB4@burlo.trieste.it>

Just to add my two cents to the discussion,

I think that the problem is not with the DNA helix but with the font. I've played around a bit with it, and if you use, for example, Helvetica or something less Comic-Sans-like, it does look better. Also, for some reason I'm still disturbed by the green, but that is a very personal opinion.

Nicola

Dr. Nicola Pirastu PhD
Research Fellow
Medical Sciences, Chirurgical and Health Department
University of Trieste
Medical Genetics
IRCCS Burlo Garofolo
Via dell'Istria 65/1
34137 Italy
tel. +390403785539

On 2 Jul 2013, at 14:38, Yurii Aulchenko <yurii.aulchenko at gmail.com> wrote:

Dear All,

I agree with critique of Maarten, and I actually still not sure if I like Maarten's or Grant's idea better. Interesting thing is that - not sure all realize it - Grant's variant is his vision of Maarten's prototype :) However, Grant's variant has an important advantage - it is ready to serve as logo. And I actually want to use a logo in my slides for UseR!-2013.

So I suggest we take Grant's logo as a working variant. No doubt that the logo is going to evolve with time - as anything we do in the project - code, documentation; logo is no different, I think. The element which is going to stay and keep it recognizable is the way of spelling the GenABEL :) - Like Gnu's horns in the GNU logo.

What we can do next is to place an open call on site/forum for other users to contribute, but this is going to take time, and meanwhile I suggest to stick with Grant's variant.

Yurii

On Tue, Jul 2, 2013 at 2:10 PM, Maarten Kooyman <kooyman at gmail.com> wrote:
Dear all,


It looks really nice ! Credits for who made it.  However, I have more the impression that it looks like a polypeptide chain or a rosary. The seventies font is a matter of taste, but it remind me of comic sans(including a upside down e as a). I wonder if it readable if you print it on a poster: I think this is a important use-case of a scientific logo.

Kind regards,


Maarten


On 07/02/2013 01:11 PM, Diego Fabregat Traver wrote:
On 28/06/13, Yurii Aulchenko  <yurii.aulchenko at gmail.com> wrote:

How do you like this one?
I like it a lot.

What do you think about reducing the font size for the subtitle
and right-justifying it? Would it still be readable? I liked that
detail from the previous attempts with the "Project" subtitle.

In any case, this is just a minor detail. It looks great as it is.

Thanks to Grant Borodin!

YA


On Thu, Jun 27, 2013 at 1:16 PM, Yurii Aulchenko <yurii.aulchenko at gmail.com> wrote:


Dear Nicola, Diego, Lennart,


Thanks for your feedback! I will ask Grant Borodin, who kindly designed these logos, if he could change C according to your comment (capital "ABEL" and "statistical genomics" as in F).


Yurii


On Wed, Jun 26, 2013 at 4:16 PM, Diego Fabregat Traver <fabregat at aices.rwth-aachen.de> wrote:


Congrats to whoever designed these logos, they look very nice :)


With respect to my preferences, I fully agree with Lennart: "C with capital ABEL and statistical genomics below it" would be my choice.


Best,

Diego


On 20/06/13, "L.C. Karssen"  <lennart at karssen.org> wrote:


Wow! Those look really nice!
I like options C and F the most. Actually a combination would be even
better IMHO: use C with capital ABEL and statistical genomics below it.
Looking forward to head the opinion of others,
Lennart.
On 20-06-13 09:34, Yurii Aulchenko wrote:
Please find attached few more logo variants
Yurii




--
-----------------------------------------------------
Yurii S. Aulchenko

[ LinkedIn<http://nl.linkedin.com/in/yuriiaulchenko> ] [ Twitter<http://twitter.com/YuriiAulchenko> ] [ Blog<http://yurii-aulchenko.blogspot.nl/> ]


From lennart at karssen.org  Tue Jul  2 18:05:57 2013
From: lennart at karssen.org (L.C. Karssen)
Date: Tue, 02 Jul 2013 18:05:57 +0200
Subject: [GenABEL-dev] [Genabel-commits] r1267 - pkg/OmicABEL/src
In-Reply-To: <CAHX9t6KgSuV5xt9PEOaRwpKYS--iqEL29jR1ewtxqf0N_uaBsg@mail.gmail.com>
References: <fb5e41811a0d8c.51d2bd4d@aices.rwth-aachen.de>
 <CAHX9t6KgSuV5xt9PEOaRwpKYS--iqEL29jR1ewtxqf0N_uaBsg@mail.gmail.com>
Message-ID: <51D2FA65.7050102@karssen.org>

Dear all,

On 02-07-13 12:38, Yurii Aulchenko wrote:
> ah, ok, I thought it was a copy, sorry for confusion
> 
> in principle we should think of tighter integration
> OmicA-filevector/DatA, but this is not something for 5 minutes :)
> 

Definitely not 5 minutes :-). My idea has always been to turn the
contents of the filevector directory in SVN into a (shared) library.
This would allow us to package it separately without the need to
copy/symlink the directory for each of the other packages. It would also
make versioning of the filevector files easier.
If this all works, we can compile the other ABELs with a -lfilevector
option. For the user, however, it would mean that they need to install
two packages (e.g. for ProbABEL they would need to install the
filevector library and ProbABEL itself) unless we distribute the various
"filevector.so" files with the other packages.

Of course, any ideas/opinions on this are welcome!

Lennart.

> YA
> 
> On Tue, Jul 2, 2013 at 11:45 AM, Diego Fabregat Traver
> <fabregat at aices.rwth-aachen.de> wrote:
> 
> 
> 
>     On 02/07/13, Yurii Aulchenko  <yurii.aulchenko at gmail.com> wrote:
> 
>     > Diego,
>     >
>     > I understand this file is the part of filevector. In that may it be
>     > better to have a symlink instead of hard copy? - this is what we do
>     > for say DatA, MixA and GenA.
> 
>     I am not sure what you mean by "part of". If you mean a copy of a
>     file from
>     filevector, it is not. If you mean related, yes it is.
> 
>     databel.{c,h} is OmicABEL is just a small module with a couple
>     utilities:
> 
>     https://r-forge.r-project.org/scm/viewvc.php/pkg/OmicABEL/src/databel.h?view=markup&root=genabel
>     https://r-forge.r-project.org/scm/viewvc.php/pkg/OmicABEL/src/databel.c?view=markup&root=genabel
> 
>     Diego
> 
>     > Y
>     >
>     > ----------------------
>     > Yurii Aulchenko
>     > (sent from mobile device)
>     >
>     > On 2 Jul 2013, at 10:53, "noreply at r-forge.r-project.org"
>     > <noreply at r-forge.r-project.org> wrote:
>     >
>     > > Author: dfabregat
>     > > Date: 2013-07-02 10:52:58 +0200 (Tue, 02 Jul 2013)
>     > > New Revision: 1267
>     > >
>     > > Modified:
>     > >   pkg/OmicABEL/src/databel.h
>     > > Log:
>     > > Defining DatABEL datatypes and their associated value
>     > > for *.fvi headers.
>     > >
>     > >
>     > > Modified: pkg/OmicABEL/src/databel.h
>     > > ===================================================================
>     > > --- pkg/OmicABEL/src/databel.h    2013-07-01 12:55:37 UTC (rev 1266)
>     > > +++ pkg/OmicABEL/src/databel.h    2013-07-02 08:52:58 UTC (rev 1267)
>     > > @@ -25,14 +25,14 @@
>     > > #ifndef DATABEL_H
>     > > #define DATABEL_H
>     > >
>     > > -#define UNSIGNED_SHORT_INT_TYPE
>     > > -#define SHORT_INT_TYPE
>     > > -#define UNSIGNED_INT_TYPE
>     > > -#define INT_TYPE
>     > > -#define FLOAT_TYPE
>     > > -#define DOUBLE_TYPE
>     > > -#define SIGNED_CHAR_TYPE
>     > > -#define UNSIGNED_CHAR_TYPE
>     > > +enum datatype{ UNSIGNED_SHORT_INT_TYPE = 1,
>     > > +               SHORT_INT_TYPE,
>     > > +               UNSIGNED_INT_TYPE,
>     > > +               INT_TYPE,
>     > > +               FLOAT_TYPE,
>     > > +               DOUBLE_TYPE,
>     > > +               SIGNED_CHAR_TYPE,
>     > > +               UNSIGNED_CHAR_TYPE };
>     > >
>     > > #define NAMELENGTH 32
>     > > #define RESERVEDSPACE 5
>     > >
> 
> 
> 
> 
> -- 
> -----------------------------------------------------
> Yurii S. Aulchenko
> 
> [ LinkedIn <http://nl.linkedin.com/in/yuriiaulchenko> ] [ Twitter
> <http://twitter.com/YuriiAulchenko> ] [ Blog
> <http://yurii-aulchenko.blogspot.nl/> ]
> 
> 
> _______________________________________________
> genabel-devel mailing list
> genabel-devel at lists.r-forge.r-project.org
> https://lists.r-forge.r-project.org/cgi-bin/mailman/listinfo/genabel-devel
> 
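
The r1267 change quoted above replaces eight valueless macros with an enum
whose first member is pinned to 1, so the type codes run from 1 to 8 in
declaration order. A minimal sketch of how such codes might be used,
assuming each code maps to the C type its name suggests; datatype_size and
check_codes are hypothetical helpers, not part of databel.h, and sizeof
reflects the host compiler rather than any guaranteed on-disk width:

```c
#include <assert.h>
#include <stddef.h>

/* Type codes as introduced in r1267: the first enumerator is set to 1,
   so the remaining ones take the values 2..8 in declaration order. */
enum datatype { UNSIGNED_SHORT_INT_TYPE = 1,
                SHORT_INT_TYPE,
                UNSIGNED_INT_TYPE,
                INT_TYPE,
                FLOAT_TYPE,
                DOUBLE_TYPE,
                SIGNED_CHAR_TYPE,
                UNSIGNED_CHAR_TYPE };

/* Hypothetical helper: bytes per element on the host for a type code,
   or 0 if the code is not one of the eight known values. */
size_t datatype_size(int code)
{
    switch (code) {
    case UNSIGNED_SHORT_INT_TYPE: return sizeof(unsigned short);
    case SHORT_INT_TYPE:          return sizeof(short);
    case UNSIGNED_INT_TYPE:       return sizeof(unsigned int);
    case INT_TYPE:                return sizeof(int);
    case FLOAT_TYPE:              return sizeof(float);
    case DOUBLE_TYPE:             return sizeof(double);
    case SIGNED_CHAR_TYPE:        return sizeof(signed char);
    case UNSIGNED_CHAR_TYPE:      return sizeof(unsigned char);
    default:                      return 0;
    }
}

/* Hypothetical sanity check: the numeric values are what would be stored
   in a header, so they matter, not just the names. */
void check_codes(void)
{
    assert(UNSIGNED_SHORT_INT_TYPE == 1);
    assert(UNSIGNED_CHAR_TYPE == 8);
}
```

With the old empty #define lines the names expanded to nothing, so they
could only be tested with #ifdef; with the enum they can be compared
against a number read from a file.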

-- 
-----------------------------------------------------------------
L.C. Karssen
Utrecht
The Netherlands

lennart at karssen.org
http://blog.karssen.org

Please do not send me Word or PowerPoint files!
See http://www.gnu.org/philosophy/no-word-attachments.nl.html
------------------------------------------------------------------


From yurii.aulchenko at gmail.com  Tue Jul  2 21:27:09 2013
From: yurii.aulchenko at gmail.com (Yurii Aulchenko)
Date: Tue, 2 Jul 2013 21:27:09 +0200
Subject: [GenABEL-dev] [Genabel-commits] r1267 - pkg/OmicABEL/src
In-Reply-To: <51D2FA65.7050102@karssen.org>
References: <fb5e41811a0d8c.51d2bd4d@aices.rwth-aachen.de>
 <CAHX9t6KgSuV5xt9PEOaRwpKYS--iqEL29jR1ewtxqf0N_uaBsg@mail.gmail.com>
 <51D2FA65.7050102@karssen.org>
Message-ID: <CAHX9t6KYtbwJSRCcij5=TzF+2G-3t40E8jkzZVKh7tUsDhESCw@mail.gmail.com>

On Tue, Jul 2, 2013 at 6:05 PM, L.C. Karssen <lennart at karssen.org> wrote:

> Dear all,
>
> On 02-07-13 12:38, Yurii Aulchenko wrote:
> > ah, ok, I thought it was a copy, sorry for confusion
> >
> > in principle we should think of tighter integration
> > OmicA-filevector/DatA, but this is not something for 5 minutes :)
> >
>
> Definitely not 5 minutes :-). My idea has always been to turn the
> contents of the filevector directory in SVN into a (shared) library.
> This would allow us to package it separately without the need to
> copy/symlink the directory for each of the other packages. It would also
> make versioning of the filevector files easier.
> If this all works, we can compile the other ABELs with a -lfilevector
> option. For the users, however, it would mean that they need to install
> two packages (e.g. for ProbABEL they would need to install the
> filevector library and ProbABEL itself) unless we distribute the various
> "filevector.so" files with the other packages.
>
> Of course, any ideas/opinions on this are welcome!
>
>

I like the 'library' idea very much; in a way I think this would be as
cool as what Lennart did for ProbABEL (autotools) :)

I am not sure that this will be very easy to do, especially across
different platforms. I am also not sure how it will work with CRAN
submissions. I think we will gain useful experience while solving the
MixABEL case (MixABEL does use GSL, and is not on CRAN).

YA

> Lennart.
>
> > YA
> >
> > On Tue, Jul 2, 2013 at 11:45 AM, Diego Fabregat Traver
> > <fabregat at aices.rwth-aachen.de <mailto:fabregat at aices.rwth-aachen.de>>
> > wrote:
> >
> >
> >
> >     On 02/07/13, Yurii Aulchenko  <yurii.aulchenko at gmail.com
> >     <mailto:yurii.aulchenko at gmail.com>> wrote:
> >
> >     > Diego,
> >     >
> >     > I understand this file is part of filevector. In that case, might
> >     > it be better to have a symlink instead of a hard copy? This is
> >     > what we do for, say, DatA, MixA and GenA.
> >
> >     I am not sure what you mean by "part of". If you mean a copy of a
> >     file from filevector, it is not. If you mean related, yes it is.
> >
> >     databel.{c,h} in OmicABEL is just a small module with a couple of
> >     utilities:
> >
> >
> >     https://r-forge.r-project.org/scm/viewvc.php/pkg/OmicABEL/src/databel.h?view=markup&root=genabel
> >
> >     https://r-forge.r-project.org/scm/viewvc.php/pkg/OmicABEL/src/databel.c?view=markup&root=genabel
> >
> >     Diego
> >
> >     > Y
> >     >
> >     > ----------------------
> >     > Yurii Aulchenko
> >     > (sent from mobile device)
> >     >
> >     > On 2 Jul 2013, at 10:53, "noreply at r-forge.r-project.org
> >     <mailto:noreply at r-forge.r-project.org>"
> >     > <noreply at r-forge.r-project.org
> >     <mailto:noreply at r-forge.r-project.org>> wrote:
> >     >
> >     > > Author: dfabregat
> >     > > Date: 2013-07-02 10:52:58 +0200 (Tue, 02 Jul 2013)
> >     > > New Revision: 1267
> >     > >
> >     > > Modified:
> >     > >   pkg/OmicABEL/src/databel.h
> >     > > Log:
> >     > > Defining DatABEL datatypes and their associated value
> >     > > for *.fvi headers.
> >     > >
> >     > >
> >     > > Modified: pkg/OmicABEL/src/databel.h
> >     > > ===================================================================
> >     > > --- pkg/OmicABEL/src/databel.h    2013-07-01 12:55:37 UTC (rev 1266)
> >     > > +++ pkg/OmicABEL/src/databel.h    2013-07-02 08:52:58 UTC (rev 1267)
> >     > > @@ -25,14 +25,14 @@
> >     > > #ifndef DATABEL_H
> >     > > #define DATABEL_H
> >     > >
> >     > > -#define UNSIGNED_SHORT_INT_TYPE
> >     > > -#define SHORT_INT_TYPE
> >     > > -#define UNSIGNED_INT_TYPE
> >     > > -#define INT_TYPE
> >     > > -#define FLOAT_TYPE
> >     > > -#define DOUBLE_TYPE
> >     > > -#define SIGNED_CHAR_TYPE
> >     > > -#define UNSIGNED_CHAR_TYPE
> >     > > +enum datatype{ UNSIGNED_SHORT_INT_TYPE = 1,
> >     > > +               SHORT_INT_TYPE,
> >     > > +               UNSIGNED_INT_TYPE,
> >     > > +               INT_TYPE,
> >     > > +               FLOAT_TYPE,
> >     > > +               DOUBLE_TYPE,
> >     > > +               SIGNED_CHAR_TYPE,
> >     > > +               UNSIGNED_CHAR_TYPE };
> >     > >
> >     > > #define NAMELENGTH 32
> >     > > #define RESERVEDSPACE 5
> >     > >
> >
> >
> >
> >
> > --
> > -----------------------------------------------------
> > Yurii S. Aulchenko
> >
> > [ LinkedIn <http://nl.linkedin.com/in/yuriiaulchenko> ] [ Twitter
> > <http://twitter.com/YuriiAulchenko> ] [ Blog
> > <http://yurii-aulchenko.blogspot.nl/> ]
> >
> >
> > _______________________________________________
> > genabel-devel mailing list
> > genabel-devel at lists.r-forge.r-project.org
> >
> https://lists.r-forge.r-project.org/cgi-bin/mailman/listinfo/genabel-devel
> >
>
> --
> -----------------------------------------------------------------
> L.C. Karssen
> Utrecht
> The Netherlands
>
> lennart at karssen.org
> http://blog.karssen.org
>
> Please do not send me Word or PowerPoint files!
> See http://www.gnu.org/philosophy/no-word-attachments.nl.html
> ------------------------------------------------------------------
>
>
> _______________________________________________
> genabel-devel mailing list
> genabel-devel at lists.r-forge.r-project.org
> https://lists.r-forge.r-project.org/cgi-bin/mailman/listinfo/genabel-devel
>


-- 
-----------------------------------------------------
Yurii S. Aulchenko

[ LinkedIn <http://nl.linkedin.com/in/yuriiaulchenko> ] [
Twitter<http://twitter.com/YuriiAulchenko>] [
Blog <http://yurii-aulchenko.blogspot.nl/> ]

From yurii.aulchenko at gmail.com  Wed Jul  3 22:44:58 2013
From: yurii.aulchenko at gmail.com (Yurii Aulchenko)
Date: Wed, 3 Jul 2013 22:44:58 +0200
Subject: [GenABEL-dev] joining the GenABEL project - what is the
	procedure?
In-Reply-To: <75533175-C431-46A8-8B57-AC12B67C968E@burlo.trieste.it>
References: <CAHX9t6+LDcyCnSNmhywWY5Wz9y+zrZO93CF+0jS5D6SefQsM9w@mail.gmail.com>
 <51916427.8020307@karssen.org>
 <CB4CF543-7387-4B98-B412-7ED794CEB37B@burlo.trieste.it>
 <CAHX9t6+kfOPUCosuAaOhQsJLWdtef-cTmCH509fRrnj-fuNf7A@mail.gmail.com>
 <E6BAADB6-B7B5-4DBF-BB6C-BED1013740AD@burlo.trieste.it>
 <CAHX9t6Lmb-L9RXDspAD+8L+dQWC4OgwgG-GeHQBX3A9oMqaJ0Q@mail.gmail.com>
 <CAHX9t6+3Bk6tyLbtmBtZX55LWtuzmSY8gj4NFhEzM99U795_QQ@mail.gmail.com>
 <75533175-C431-46A8-8B57-AC12B67C968E@burlo.trieste.it>
Message-ID: <CAHX9t6+5DjwAcG=2LQ2Wy6ULFet2pS_9Rjqc0m3tJXEQ3A2O=Q@mail.gmail.com>

Dear All,

I just discovered that we already had this discussion (from a slightly
different angle) a couple of years ago, see

https://lists.r-forge.r-project.org/pipermail/genabel-devel/2011-October/000367.html

In the light of that discussion I have added the questions

* Legal issues
** Is there a clear (standard) license?
** Is the license GNU GPL-compatible?

to the reviewer's list.

YA


On Thu, Jun 27, 2013 at 11:10 AM, Nicola Pirastu <
nicola.pirastu at burlo.trieste.it> wrote:

>  Sorry for the late reply.
>
>  Great! I'll look at the comments and modify the package accordingly.
>
>  I'll let you know when I'm done.
>
>  Thanks a lot.
>
>  Best
>
>  Nicola
>
>
>
> Dr. Nicola Pirastu PhD
> Research Fellow
> Medical Sciences, Chirurgical and Health Department
> University of Trieste
> Medical Genetics
> IRCCS Burlo Garofolo
> Via dell'Istria 65/1
> 34137 Italy
> tel. +390403785539
>
>  On 20 Jun 2013, at 23:33, Yurii Aulchenko <
> yurii.aulchenko at gmail.com> wrote:
>
> OK, a draft review form (and a review of RegionABEL) is completed at
> http://piratepad.net/9ExdfmuJHV
>
>  Definitely not all comments are equal - some are minor suggestions,
> others are more important.
>
>  Nicola, maybe you can reply to my concerns directly in that form.
>
>  I would really like someone else to do the next round of review - I
> feel I am really biased - any volunteers?
>
>  YA
>
> On Thu, Jun 20, 2013 at 1:09 AM, Yurii Aulchenko <
> yurii.aulchenko at gmail.com> wrote:
>
>> FYI, I started drafting more detailed reviewers' instructions (
>> http://piratepad.net/9ExdfmuJHV) and am going to apply this template to
>> Nicola's package. A few questions will pop up on the way, I am sure.
>>  YA
>>
>>
>> On Tue, May 28, 2013 at 8:52 AM, Nicola Pirastu <
>> nicola.pirastu at burlo.trieste.it> wrote:
>>
>>> Hi,
>>>
>>>  I think this is a very good plan. As for time, I think a couple of
>>> months is fine; I still need to do some work to demonstrate that everything
>>> works fine (simulations, etc.). Actually, if someone would like to
>>> lend a hand on that side, he/she would be more than welcome :).
>>>
>>>  I'll send you the code separately with a tutorial attached so we can
>>> get started.
>>>
>>>  Best.
>>>
>>>  Nicola
>>>
>>>
>>>  On 28 May 2013, at 04:39, Yurii Aulchenko <
>>> yurii.aulchenko at gmail.com> wrote:
>>>
>>> I think it may indeed be a good idea to start with a 'case' and
>>> develop/tune the recommendations on the way. Nicola's new package would
>>> provide a good starting point (then we can actually think of re-reviewing
>>> some of the packages which are in the GenABEL suite already).
>>>
>>>  What about the following plan:
>>>
>>>  1) We (Nicola, Yurii, ...) draft reviewer's instructions (starting
>>> with points made during this discussion) - I made a piratepad
>>> http://piratepad.net/9ExdfmuJHV (at the moment simply a copy of Nicola's
>>> latest email); later we will circulate the draft on the list
>>>
>>>  2) Take RegionABEL as an example (I am volunteering to be the 'test'
>>> reviewer), and explore this case to check the review procedure. Nicola,
>>> maybe you can send me the code already.
>>>
>>>  3) Ask an external person to act as a reviewer - this is for testing
>>> our reviewers' instructions
>>>
>>>  The whole process (especially if we want to go for (3)) may take a
>>> couple of months. Nicola, how much of a hurry are you in with publication?
>>>
>>>  Yurii
>>>
>>>
>>> On Wed, May 22, 2013 at 2:55 PM, Nicola Pirastu <
>>> nicola.pirastu at burlo.trieste.it> wrote:
>>>
>>>> Dear all,
>>>>
>>>> I think that the best way to discuss this is to start with a real
>>>> case. I would propose to start from the package I've just written to
>>>> run gene/region-wide analysis, which I've called RegionABEL.
>>>>
>>>> It basically gives gene-wide values with real or imputed data, with or
>>>> without kinship included. It is not for analyzing rare variants, so it is
>>>> not like SKAT. If you want to think of it in terms of existing software, it
>>>> is like VEGAS or plink-ave. The main advance is that, since it does not use
>>>> simulations/permutations to get p-values, it is much faster (4 hours on
>>>> 1000G data vs 12-16 hours for VEGAS on HapMap 2.5). The other great
>>>> advantage is that it does not require prior knowledge of LD, as other
>>>> methods do.
>>>> I have a beta version of the package and I've written a tutorial to
>>>> explain how to use it.
>>>>
>>>> So how do you think we should proceed now? Should we ask some
>>>> volunteers to review it?
>>>>
>>>>
>>>> Best.
>>>>
>>>> Nicola
>>>>
>>>>
>>>>
>>>>
>>>> On 14 May 2013, at 00:07, L.C. Karssen <
>>>> lennart at karssen.org> wrote:
>>>>
>>>> > Dear all,
>>>> >
>>>> > It's been a while but this mail was still on my todo list. I agree
>>>> > with Yurii that we should start establishing procedures for projects
>>>> > wanting to join the GenABEL project umbrella. Software lifecycle
>>>> > management is too often overlooked when developing a package and we
>>>> > don't want to 'degrade' the GenABEL project brand name by including
>>>> > packages that are not maintained anymore after the initial paper is
>>>> > published. Or, another argument I've come across: we make it open
>>>> > source so everyone can contribute to it (and therefore it will
>>>> > 'somehow' be maintained without us putting more effort into it).
>>>> > That's not how it works. The software ecosystem in which a package
>>>> > lives is dynamic and a package should adapt to that.
>>>> >
>>>> > As Yurii wrote, we discussed this at the EMGM conference and agreed
>>>> > that code review should be part of it. This neatly ties into the
>>>> > discussion we had on this list some time ago about coding standards.
>>>> > This does not mean we force everybody to use four spaces instead of
>>>> > eight when indenting code, but more serious things, like variables
>>>> > named "a" or "df", are not helpful when someone wants to contribute
>>>> > or take over maintenance of the package.
>>>> >
>>>> > I've just committed the draft document of the coding standards to the
>>>> > www folder of the SVN repo (rev. 1215). It's a (plain text) Org-mode
>>>> > file; the HTML file is created from this Org file (using Org-mode
>>>> > allows us to easily export the text in various formats). Those of you
>>>> > who want to convert without ever opening emacs can run the command
>>>> > emacs --batch --eval '(and (find-file "codingstyle.org")
>>>> > (org-export-as-html nil))'
>>>> > from the command line.
>>>> >
>>>> > Looking forward to your comments, both on this e-mail and the coding
>>>> > standards.
>>>> >
>>>> >
>>>> > Lennart.
>>>> >
>>>> > On 02-05-13 15:15, Yurii Aulchenko wrote:
>>>> >> Dear All,
>>>> >>
>>>> >> I have recently received several requests from people who would
>>>> >> like to join the GenABEL project with their software. Given this is
>>>> >> a community-based project, neither I nor anyone else is in a
>>>> >> position to say 'yes' or 'no' - we need to develop some procedure
>>>> >> for how software joins the project.
>>>> >>
>>>> >> We have discussed this with Nicola and Lennart during EMGM-2013, and
>>>> >> we think that we do need a technical review as a part of the
>>>> >> procedure (addressing the issues of license, clarity of the code,
>>>> >> integration with other packages, etc.). We also need to think about
>>>> >> how we do maintenance: the suggestion would be to request that the
>>>> >> author joins the forum and the list. If we see that a package is not
>>>> >> actively maintained (e.g. we cannot reach the maintainer), we should
>>>> >> tag such a package as 'orphaned'.
>>>> >>
>>>> >> In many respects, we can base our procedure on the procedures
>>>> >> developed by Bioconductor. In our procedures we need to achieve two
>>>> >> conflicting goals: a) we do not want to repel potential contributors
>>>> >> by a long list of technical requirements, but at the same time b) for
>>>> >> the sake of maintainability we need the code to comply with some
>>>> >> requirements. Probably we should have 'minimal' and 'complete'
>>>> >> requirements, with packages clearly tagged on the web pages.
>>>> >>
>>>> >> Let us know what you think. I will initiate a PiratePad document
>>>> >> after having an initial response from you.
>>>> >>
>>>> >> best regards,
>>>> >> YA
>>>> >>
>>>> >>
>>>> >> _______________________________________________
>>>> >> genabel-devel mailing list
>>>> >> genabel-devel at lists.r-forge.r-project.org
>>>> >>
>>>> https://lists.r-forge.r-project.org/cgi-bin/mailman/listinfo/genabel-devel
>>>> >>
>>>> >
>>>> > --
>>>> > -----------------------------------------------------------------
>>>> > L.C. Karssen
>>>> > Utrecht
>>>> > The Netherlands
>>>> >
>>>> > lennart at karssen.org
>>>> > http://blog.karssen.org
>>>> >
>>>> > Please do not send me Word or PowerPoint files!
>>>> > See http://www.gnu.org/philosophy/no-word-attachments.nl.html
>>>> > ------------------------------------------------------------------
>>>> >
>>>> > _______________________________________________
>>>> > genabel-devel mailing list
>>>> > genabel-devel at lists.r-forge.r-project.org
>>>> >
>>>> https://lists.r-forge.r-project.org/cgi-bin/mailman/listinfo/genabel-devel
>>>>
>>>>  AVVISO DI RISERVATEZZA Informazioni riservate possono essere
>>>> contenute nel messaggio o nei suoi allegati. Se non siete i destinatari
>>>> indicati nel messaggio, o responsabili per la sua consegna alla persona, o
>>>> se avete ricevuto il messaggio per errore, siete pregati di non
>>>> trascriverlo, copiarlo o inviarlo a nessuno. In tal caso vi invitiamo a
>>>> cancellare il messaggio ed i suoi allegati. Grazie. CONFIDENTIALITY NOTICE
>>>> Confidential information may be contained in this message or in its
>>>> attachments. If you are not the addressee indicated in this message, or
>>>> responsible for message delivering to that person, or if you have received
>>>> this message in error, you may not transcribe, copy or deliver this message
>>>> to anyone. In that case, you should delete this message and its
>>>> attachments. Thank you.
>>>>  _______________________________________________
>>>> genabel-devel mailing list
>>>> genabel-devel at lists.r-forge.r-project.org
>>>>
>>>> https://lists.r-forge.r-project.org/cgi-bin/mailman/listinfo/genabel-devel
>>>>
>>>
>>>
>>>
>>>  --
>>> -----------------------------------------------------
>>> Yurii S. Aulchenko
>>>
>>>  [ LinkedIn <http://nl.linkedin.com/in/yuriiaulchenko> ] [ Twitter<http://twitter.com/YuriiAulchenko>] [
>>> Blog <http://yurii-aulchenko.blogspot.nl/> ]
>>>
>>>
>>>
>>
>>
>>
>>  --
>> -----------------------------------------------------
>> Yurii S. Aulchenko
>>
>>  [ LinkedIn <http://nl.linkedin.com/in/yuriiaulchenko> ] [ Twitter<http://twitter.com/YuriiAulchenko>] [
>> Blog <http://yurii-aulchenko.blogspot.nl/> ]
>>
>
>
>
>  --
> -----------------------------------------------------
> Yurii S. Aulchenko
>
>  [ LinkedIn <http://nl.linkedin.com/in/yuriiaulchenko> ] [ Twitter<http://twitter.com/YuriiAulchenko>] [
> Blog <http://yurii-aulchenko.blogspot.nl/> ]
>  _______________________________________________
> genabel-devel mailing list
> genabel-devel at lists.r-forge.r-project.org
> https://lists.r-forge.r-project.org/cgi-bin/mailman/listinfo/genabel-devel
>
>
>


-- 
-----------------------------------------------------
Yurii S. Aulchenko

[ LinkedIn <http://nl.linkedin.com/in/yuriiaulchenko> ] [
Twitter<http://twitter.com/YuriiAulchenko>] [
Blog <http://yurii-aulchenko.blogspot.nl/> ]

From yurii.aulchenko at gmail.com  Fri Jul  5 11:04:16 2013
From: yurii.aulchenko at gmail.com (Yurii Aulchenko)
Date: Fri, 5 Jul 2013 11:04:16 +0200
Subject: [GenABEL-dev] presentation at UseR!-2013
Message-ID: <CAHX9t6+_vyEwPnS9GKBg6SeizZe4NJTSEA6-ZwTQWE3QkgkPKQ@mail.gmail.com>

Dear All,

I am now drafting my presentation for UseR!-2013 (
http://www.edii.uclm.es/~useR-2013/). My presentation about "The GenABEL
suite for genome-wide association analyses" is scheduled for Wed July 10
morning. I will send it to the list for the discussion as soon as I have a
draft (most likely by Saturday eve).

I thought it may be a good idea to present the evolution of GenABEL in
numbers, so the idea is to get the numbers by years/quarters of the year
(say, #posts in 2009=x1, 2010=x2...) and present them graphically. For some
of the growth metrics I can get the dynamics by years easily, but for some I
have no idea and hope you could help me (maybe also by providing the
numbers directly).

Here is a small list of metrics I thought of:

#packages: very easy to count :)
#posts on GenABEL-devel: possible to count
#posts on forum: no idea how to do that for defined time periods
#number of lines of code in our SVN repo: no idea
#citations (GenA, ProbA...): easy to count thanks to Google Scholar
#mentions on the Web: ???
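
For the "#posts on GenABEL-devel" metric, one option is to count the
message separator lines in the list's plain-text archive. A minimal sketch
in C, assuming the archive is mbox-style text in which every message starts
with a line of the form "From <address>  <weekday> <month> <day> <time>
<year>"; the function name mbox_separator_year is hypothetical:

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* Return the 4-digit year that ends an mbox separator line such as
   "From user at example.org  Fri Jul  5 11:04:16 2013", or 0 if the line
   does not look like a separator. "From: ..." header lines are rejected
   because their fifth character is ':' rather than a space. */
int mbox_separator_year(const char *line)
{
    assert(line != NULL);
    if (strncmp(line, "From ", 5) != 0)
        return 0;

    size_t n = strlen(line);
    while (n > 0 && isspace((unsigned char)line[n - 1]))
        n--;                          /* ignore trailing newline/spaces */

    if (n < 10 || line[n - 5] != ' ') /* need "From " plus " YYYY" */
        return 0;

    int year = 0;
    for (size_t i = n - 4; i < n; i++) {
        if (!isdigit((unsigned char)line[i]))
            return 0;
        year = year * 10 + (line[i] - '0');
    }
    return year;
}
```

Calling this on every line of an archive file and incrementing a per-year
counter whenever the result is non-zero gives the posts-per-year series.
The check is deliberately loose: a body line that starts with "From " and
happens to end in four digits would be miscounted.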

Any other nice and easily computed metrics?

I would appreciate your help and suggestions, and sorry for the late notice.

best,
Yurii

From yurii.aulchenko at gmail.com  Fri Jul  5 11:12:56 2013
From: yurii.aulchenko at gmail.com (Yurii Aulchenko)
Date: Fri, 5 Jul 2013 11:12:56 +0200
Subject: [GenABEL-dev] presentation at UseR!-2013
In-Reply-To: <CAHX9t6+_vyEwPnS9GKBg6SeizZe4NJTSEA6-ZwTQWE3QkgkPKQ@mail.gmail.com>
References: <CAHX9t6+_vyEwPnS9GKBg6SeizZe4NJTSEA6-ZwTQWE3QkgkPKQ@mail.gmail.com>
Message-ID: <-2742659264291419413@unknownmsgid>

PS the presentation will be a much-updated version of my previous
presentation

 http://mga.bionet.nsc.ru/~yurii/courses/ge03-2013/_GenABEL.pdf

If you have some relevant slides, I would greatly appreciate it if you
could send them to me (of course, your contribution will be acknowledged).

Best,
Yurii

----------------------
Yurii Aulchenko
(sent from mobile device)


From m.kooijman at erasmusmc.nl  Fri Jul  5 12:24:03 2013
From: m.kooijman at erasmusmc.nl (Maarten Kooyman)
Date: Fri, 05 Jul 2013 12:24:03 +0200
Subject: [GenABEL-dev] presentation at UseR!-2013
In-Reply-To: <CAHX9t6+_vyEwPnS9GKBg6SeizZe4NJTSEA6-ZwTQWE3QkgkPKQ@mail.gmail.com>
References: <CAHX9t6+_vyEwPnS9GKBg6SeizZe4NJTSEA6-ZwTQWE3QkgkPKQ@mail.gmail.com>
Message-ID: <51D69EC3.5040000@erasmusmc.nl>

Hi Yurii,

You might try to install phpBB statistics.

https://www.phpbb.com/customise/db/mod/phpbb_statistics/


Good luck!

Maarten Kooyman
Erasmus MC
Department of Epidemiology
Room Na27-18

Postbus 2040
3000 CA Rotterdam
The Netherlands

phone: +31-10-7038194
mobile: +31-6-28569364
e-mail: m.kooijman at erasmusmc.nl
GPG key ID: AA2CAF11

On 07/05/2013 11:04 AM, Yurii Aulchenko wrote:
> Dear All,
>
> I am now drafting my presentation for UseR!-2013
> (http://www.edii.uclm.es/~useR-2013/
> <http://www.edii.uclm.es/%7EuseR-2013/>). My presentation about "The
> GenABEL suite for genome-wide association analyses" is scheduled for
> Wed July 10 morning. I will send it to the list for the discussion as
> soon as I have a draft (most likely by Saturday eve).
>
> I thought it may be a good idea to present the evolution of GenABEL
> in numbers, so the idea is to get the numbers by years/quarters of the
> year (say, #posts in 2009=x1, 2010=x2...) and present them
> graphically. For some of the growth metrics I can get the dynamics by
> years easily, but for some I have no idea and hope you could help me
> (maybe also by providing the numbers directly).
>
> Here is a small list of metrics I thought of:
>
> #packages: very easy to count :)
> #posts on GenABEL-devel: possible to count
> #posts on forum: no idea how to do that for defined time periods
> #number of lines of code in our SVN repo: no idea
> #citations (GenA, ProbA...): easy to count thanks to Google Scholar
> #mentions on the Web: ???
>
> Any other nice and easily computed metrics?
>
> I will appreciate your help and suggestions, and sorry for late notice. 
>
> best,
> Yurii
>
>
> _______________________________________________
> genabel-devel mailing list
> genabel-devel at lists.r-forge.r-project.org
> https://lists.r-forge.r-project.org/cgi-bin/mailman/listinfo/genabel-devel


From lennart at karssen.org  Fri Jul  5 12:30:39 2013
From: lennart at karssen.org (L.C. Karssen)
Date: Fri, 05 Jul 2013 12:30:39 +0200
Subject: [GenABEL-dev] presentation at UseR!-2013
In-Reply-To: <CAHX9t6+_vyEwPnS9GKBg6SeizZe4NJTSEA6-ZwTQWE3QkgkPKQ@mail.gmail.com>
References: <CAHX9t6+_vyEwPnS9GKBg6SeizZe4NJTSEA6-ZwTQWE3QkgkPKQ@mail.gmail.com>
Message-ID: <51D6A04F.7050708@karssen.org>

Hi Yurii,

On 07/05/2013 11:04 AM, Yurii Aulchenko wrote:
> Dear All,
> 
> I am now drafting my presentation for UseR!-2013 (
> http://www.edii.uclm.es/~useR-2013/). My presentation about "The GenABEL
> suite for genome-wide association analyses" is scheduled for Wed July 10
> morning. I will send it to the list for the discussion as soon as I have a
> draft (most likely by Saturday eve).
> 
> I thought it may be a good idea to present the evolution of GenABEL in
> numbers, so the idea is to get the numbers by years/quarters of the year
> (say, #posts in 2009=x1, 2010=x2...) and present them graphically. For some
> of the growth metrics I can get the dynamics by years easily, but for some I
> have no idea and hope you could help me (maybe also by providing the
> numbers directly).
> 
> Here is a small list of metrics I thought of:
> 
> #packages: very easy to count :)
> #posts on GenABEL-devel: possible to count
> #posts on forum: no idea how to do that for defined time periods

I guess you need to run a query on the database to get those. Our hosting
provider has a phpMyAdmin interface you can use for that (or you could
probably use the SSH account and run the MySQL client from the command
line). Probably a query along these lines:

 SELECT yearweek(date(from_unixtime(post_time))) AS week,
        COUNT(*) AS num_posts
   FROM phpbb_posts
  GROUP BY yearweek(date(from_unixtime(post_time)));
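
The query above buckets posts by week. For the year/quarter buckets asked
for in the original message, the same idea can be sketched in C, assuming
the post_time values have been exported as Unix timestamps; struct quarter
and quarter_of are hypothetical names, and the conversion uses UTC whereas
MySQL's FROM_UNIXTIME uses the session time zone:

```c
#include <assert.h>
#include <stddef.h>
#include <time.h>

/* One histogram bucket: calendar year and quarter (1..4). */
struct quarter {
    int year;
    int q;
};

/* Map a Unix timestamp (seconds since 1970-01-01 00:00:00 UTC) to its
   UTC year and quarter. Returns 0 on success, -1 if the timestamp
   cannot be converted. */
int quarter_of(time_t post_time, struct quarter *out)
{
    assert(out != NULL);

    struct tm *utc = gmtime(&post_time);
    if (utc == NULL)
        return -1;

    out->year = utc->tm_year + 1900;  /* tm_year counts years since 1900 */
    out->q    = utc->tm_mon / 3 + 1;  /* tm_mon is 0..11 */
    return 0;
}
```

Counting how many timestamps fall into each (year, q) pair then gives the
per-quarter series to plot.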


> #number of lines of code in our SVN repo: no idea

Probably SLOCCount will help: http://www.dwheeler.com/sloccount/

> #citations (GenA, ProbA...): easy to count thanks to Google Scholar
> #mentions on the Web: ???
> 
> Any other nice and easily computed metrics?
> 
> I will appreciate your help and suggestions, and sorry for late notice.
> 


Good luck,

Lennart.

> best,
> Yurii
> 
> 
> 
> _______________________________________________
> genabel-devel mailing list
> genabel-devel at lists.r-forge.r-project.org
> https://lists.r-forge.r-project.org/cgi-bin/mailman/listinfo/genabel-devel
> 


-- 
-----------------------------------------------------------------
L.C. Karssen
Utrecht
The Netherlands

lennart at karssen.org
http://blog.karssen.org

Please do not send me Word or PowerPoint files!
See http://www.gnu.org/philosophy/no-word-attachments.nl.html
------------------------------------------------------------------

-------------- next part --------------
A non-text attachment was scrubbed...
Name: signature.asc
Type: application/pgp-signature
Size: 230 bytes
Desc: OpenPGP digital signature
URL: <http://lists.r-forge.r-project.org/pipermail/genabel-devel/attachments/20130705/6666001b/attachment.sig>

From yurii.aulchenko at gmail.com  Fri Jul  5 12:44:48 2013
From: yurii.aulchenko at gmail.com (Yurii Aulchenko)
Date: Fri, 5 Jul 2013 12:44:48 +0200
Subject: [GenABEL-dev] presentation at UseR!-2013
In-Reply-To: <51D69EC3.5040000@erasmusmc.nl>
References: <CAHX9t6+_vyEwPnS9GKBg6SeizZe4NJTSEA6-ZwTQWE3QkgkPKQ@mail.gmail.com>
 <51D69EC3.5040000@erasmusmc.nl>
Message-ID: <CAHX9t6+3g4A96qCo9ar-peuu2eq-TxwYesxLS0XFaccTgOz81A@mail.gmail.com>

Thanks, Maarten, good suggestion! Now I remember that we linked the site
to Google Analytics, which allows nice summaries per period.

So the question of #visitors to GenABEL.org per time period is solved!

Still wondering about the forum...

Yurii

On Fri, Jul 5, 2013 at 12:24 PM, Maarten Kooyman <m.kooijman at erasmusmc.nl> wrote:

>  Hi Yurii,
>
> You might try to install phpBB statistics.
>
> https://www.phpbb.com/customise/db/mod/phpbb_statistics/
>
>
> Good luck!
>
> Maarten Kooyman
> Erasmus MC
> Department of Epidemiology
> Room Na27-18
>
> Postbus 2040
> 3000 CA Rotterdam
> The Netherlands
>
> phone: +31-10-7038194
> mobile: +31-6-28569364
> e-mail: m.kooijman at erasmusmc.nl
> GPG key ID: AA2CAF11
>
> On 07/05/2013 11:04 AM, Yurii Aulchenko wrote:
>
> Dear All,
>
>  I am now drafting my presentation for UseR!-2013 (
> http://www.edii.uclm.es/~useR-2013/). My presentation about "The GenABEL
> suite for genome-wide association analyses" is scheduled for Wed July 10
> morning. I will send it to the list for the discussion as soon as I have a
> draft (most likely by Saturday eve).
>
>  I thought it might be a good idea to present the evolution of GenABEL in
> numbers, so the idea is to get the numbers by year or quarter (say,
> #posts in 2009=x1, 2010=x2...) and present them graphically. For some of
> the growth metrics I can get the dynamics by year easily, but for others I
> have no idea and hope you could help me (maybe also by providing the
> numbers directly).
>
>  Here is a small list of metrics I thought of:
>
>  #packages: very easy to count :)
> #posts on GenABEL-devel: possible to count
> #posts on forum: no idea how to do that for defined time periods
> #number of lines of code in our SVN repo: no idea
> #citations (GenA, ProbA...): easy to count thanks to Google Scholar
> #mentions on the Web: ???
>
>  Any other nice and easily computed metrics?
>
>  I will appreciate your help and suggestions, and sorry for late notice.
>
>  best,
> Yurii
>
>
> _______________________________________________
> genabel-devel mailing list
> genabel-devel at lists.r-forge.r-project.org
> https://lists.r-forge.r-project.org/cgi-bin/mailman/listinfo/genabel-devel
>
>
>
> _______________________________________________
> genabel-devel mailing list
> genabel-devel at lists.r-forge.r-project.org
> https://lists.r-forge.r-project.org/cgi-bin/mailman/listinfo/genabel-devel
>


-- 
-----------------------------------------------------
Yurii S. Aulchenko

[ LinkedIn <http://nl.linkedin.com/in/yuriiaulchenko> ] [
Twitter<http://twitter.com/YuriiAulchenko>] [
Blog <http://yurii-aulchenko.blogspot.nl/> ]
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.r-forge.r-project.org/pipermail/genabel-devel/attachments/20130705/7d85ac65/attachment.html>

From yurii.aulchenko at gmail.com  Fri Jul  5 13:42:35 2013
From: yurii.aulchenko at gmail.com (Yurii Aulchenko)
Date: Fri, 5 Jul 2013 13:42:35 +0200
Subject: [GenABEL-dev] presentation at UseR!-2013
In-Reply-To: <CAHX9t6+3g4A96qCo9ar-peuu2eq-TxwYesxLS0XFaccTgOz81A@mail.gmail.com>
References: <CAHX9t6+_vyEwPnS9GKBg6SeizZe4NJTSEA6-ZwTQWE3QkgkPKQ@mail.gmail.com>
 <51D69EC3.5040000@erasmusmc.nl>
 <CAHX9t6+3g4A96qCo9ar-peuu2eq-TxwYesxLS0XFaccTgOz81A@mail.gmail.com>
Message-ID: <CAHX9t6K6111mN-whBPGdH=2oxmZ7qUPREMO0zDbvzwrjZYsT2Q@mail.gmail.com>

Maarten has kindly agreed to check whether he can generate some
numbers/graphs using Google Analytics.

Mind that we can probably spend just a couple of slides on the "progress
numbers", for which we need some impressive figures.

Anyway, even if the numbers/graphs do not make it into the presentation, we
can probably use them for the (much neglected!) "showcase"
section <http://www.genabel.org/showcase> on the website :)

best, and many thanks,
YA

On Fri, Jul 5, 2013 at 12:44 PM, Yurii Aulchenko
<yurii.aulchenko at gmail.com> wrote:

> Thanks, Maarten, good suggestion! I now I remember that we linked the site
> to the Google Analytics, which does allow nice summaries per period.
>
> So the question of #visitors to the GenABEL.org per time period is solved!
>
> Wonder about forum...
>
> Yurii
>
>
> On Fri, Jul 5, 2013 at 12:24 PM, Maarten Kooyman <m.kooijman at erasmusmc.nl> wrote:
>
>>  Hi Yurii,
>>
>> You might try to install phpBB statistics.
>>
>> https://www.phpbb.com/customise/db/mod/phpbb_statistics/
>>
>>
>> Good luck!
>>
>> Maarten Kooyman
>> Erasmus MC
>> Department of Epidemiology
>> Room Na27-18
>>
>> Postbus 2040
>> 3000 CA Rotterdam
>> The Netherlands
>>
>> phone: +31-10-7038194
>> mobile: +31-6-28569364
>> e-mail: m.kooijman at erasmusmc.nl
>> GPG key ID: AA2CAF11
>>
>> On 07/05/2013 11:04 AM, Yurii Aulchenko wrote:
>>
>> Dear All,
>>
>>  I am now drafting my presentation for UseR!-2013 (
>> http://www.edii.uclm.es/~useR-2013/). My presentation about "The GenABEL
>> suite for genome-wide association analyses" is scheduled for Wed July 10
>> morning. I will send it to the list for the discussion as soon as I have a
>> draft (most likely by Saturday eve).
>>
>>  I thought it might be a good idea to present the evolution of GenABEL in
>> numbers, so the idea is to get the numbers by year or quarter (say,
>> #posts in 2009=x1, 2010=x2...) and present them graphically. For some of
>> the growth metrics I can get the dynamics by year easily, but for others I
>> have no idea and hope you could help me (maybe also by providing the
>> numbers directly).
>>
>>  Here is a small list of metrics I thought of:
>>
>>  #packages: very easy to count :)
>> #posts on GenABEL-devel: possible to count
>> #posts on forum: no idea how to do that for defined time periods
>> #number of lines of code in our SVN repo: no idea
>> #citations (GenA, ProbA...): easy to count thanks to Google Scholar
>> #mentions on the Web: ???
>>
>>  Any other nice and easily computed metrics?
>>
>>  I will appreciate your help and suggestions, and sorry for late notice.
>>
>>  best,
>> Yurii
>>
>>
>> _______________________________________________
>> genabel-devel mailing list
>> genabel-devel at lists.r-forge.r-project.org
>> https://lists.r-forge.r-project.org/cgi-bin/mailman/listinfo/genabel-devel
>>
>>
>>
>> _______________________________________________
>> genabel-devel mailing list
>> genabel-devel at lists.r-forge.r-project.org
>> https://lists.r-forge.r-project.org/cgi-bin/mailman/listinfo/genabel-devel
>>
>
>
>
> --
> -----------------------------------------------------
> Yurii S. Aulchenko
>
> [ LinkedIn <http://nl.linkedin.com/in/yuriiaulchenko> ] [ Twitter<http://twitter.com/YuriiAulchenko>] [
> Blog <http://yurii-aulchenko.blogspot.nl/> ]
>


-- 
-----------------------------------------------------
Yurii S. Aulchenko

[ LinkedIn <http://nl.linkedin.com/in/yuriiaulchenko> ] [
Twitter<http://twitter.com/YuriiAulchenko>] [
Blog <http://yurii-aulchenko.blogspot.nl/> ]
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.r-forge.r-project.org/pipermail/genabel-devel/attachments/20130705/f34f5449/attachment-0001.html>

From yurii.aulchenko at gmail.com  Fri Jul  5 14:04:36 2013
From: yurii.aulchenko at gmail.com (Yurii Aulchenko)
Date: Fri, 5 Jul 2013 14:04:36 +0200
Subject: [GenABEL-dev] presentation at UseR!-2013
In-Reply-To: <51D6A04F.7050708@karssen.org>
References: <CAHX9t6+_vyEwPnS9GKBg6SeizZe4NJTSEA6-ZwTQWE3QkgkPKQ@mail.gmail.com>
 <51D6A04F.7050708@karssen.org>
Message-ID: <CAHX9t6+p25aXJiJOAxhAF8=VNjRcKX1s2R69hVhZFk7QEEnVbg@mail.gmail.com>

On Fri, Jul 5, 2013 at 12:30 PM, L.C. Karssen <lennart at karssen.org> wrote:

> Hi Yurii,
>
> On 07/05/2013 11:04 AM, Yurii Aulchenko wrote:
> > Dear All,
> >
> > I am now drafting my presentation for UseR!-2013 (
> > http://www.edii.uclm.es/~useR-2013/). My presentation about "The GenABEL
> > suite for genome-wide association analyses" is scheduled for Wed July 10
> > morning. I will send it to the list for the discussion as soon as I
> > have a draft (most likely by Saturday eve).
> >
> > I thought it might be a good idea to present the evolution of GenABEL
> > in numbers, so the idea is to get the numbers by year or quarter (say,
> > #posts in 2009=x1, 2010=x2...) and present them graphically. For some
> > of the growth metrics I can get the dynamics by year easily, but for
> > others I have no idea and hope you could help me (maybe also by
> > providing the numbers directly).
> >
> > Here is a small list of metrics I thought of:
> >
> > #packages: very easy to count :)
> > #posts on GenABEL-devel: possible to count
> > #posts on forum: no idea how to do that for defined time periods
>
> I guess you need to run a query on the database to get those. Our hosting
> provider has a phpMyAdmin interface you can use for that (or you could
> probably use the SSH account and run the MySQL client from the command
> line). Probably a query along these lines:
>
>  SELECT yearweek(date(from_unixtime(post_time))) AS week,
>         COUNT(*) AS num_posts
>  FROM phpbb_posts
>  GROUP BY yearweek(date(from_unixtime(post_time)));
>
>
Arrgh... I could probably figure this out if I had enough time, but I am
going to invest my time in the presentation now. If you or someone else
could give a hand, that would be great :)


>
> > #number of lines of code in our SVN repo: no idea
>
> Probably SLOCcount will help: http://www.dwheeler.com/sloccount/
>
>
This is a nice one! Two problems: it does not count/recognize R, and I did
not see how to use it to look at the dynamics (what was in the repo 2 years
ago?..)
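
For the R part, a rough count is easy to script. The sketch below is an illustration only, not what sloccount does: it counts non-blank lines that are not pure comment lines in *.R / *.r files under a directory. The function name count_r_lines is made up. Running it on checkouts of older revisions (for example via svn's date-based -r {DATE} syntax) would give the dynamics over time.

```python
import os


def count_r_lines(root):
    """Count non-blank, non-comment lines in *.R / *.r files below root."""
    total = 0
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            if not name.endswith((".R", ".r")):
                continue
            path = os.path.join(dirpath, name)
            # errors="replace" keeps the count going on odd encodings
            with open(path, encoding="utf-8", errors="replace") as handle:
                for line in handle:
                    stripped = line.strip()
                    # Skip empty lines and lines that are only a comment
                    if stripped and not stripped.startswith("#"):
                        total += 1
    return total


# Usage (the path is a placeholder for a local working copy):
# print(count_r_lines("path/to/working/copy"))
```

Lines with code followed by a trailing comment are counted as code, so the result is a rough figure rather than a strict SLOC count.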

But I like that even without the R code counts (which is 148,000 lines),
for ~65,000 lines of mostly C/C++ I get a message indicating that GenABEL
is worth a couple of million dollars:

Development Effort Estimate, Person-Years (Person-Months) = 15.44 (185.24)
 (Basic COCOMO model, Person-Months = 2.4 * (KSLOC**1.05))
Schedule Estimate, Years (Months)                         = 1.05 (12.61)
 (Basic COCOMO model, Months = 2.5 * (person-months**0.38))
Total Estimated Cost to Develop                           = $ 2,085,323
 (average salary = $56,286/year, overhead = 2.40).

So I think I should use these figures in my presentation :)
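
The arithmetic behind these figures can be checked with a few lines of Python. This is a minimal sketch using only the formulas and constants printed in the sloccount output above (the Basic COCOMO coefficients 2.4, 1.05, 2.5, 0.38, the average salary and the overhead factor); the function names are made up for illustration.

```python
def effort_person_months(ksloc):
    """Basic COCOMO effort: person-months = 2.4 * KSLOC**1.05."""
    return 2.4 * ksloc ** 1.05


def schedule_months(person_months):
    """Basic COCOMO schedule: months = 2.5 * person_months**0.38."""
    return 2.5 * person_months ** 0.38


def development_cost(person_months, salary_per_year=56286, overhead=2.40):
    """Cost = person-years * average yearly salary * overhead factor."""
    return person_months / 12 * salary_per_year * overhead


# Effort figure as reported in the sloccount output (person-months)
reported_effort = 185.24

print(f"cost from reported effort: ${development_cost(reported_effort):,.0f}")
print(f"effort if 65 KSLOC were one component: "
      f"{effort_person_months(65):.1f} person-months")
```

The cost computed from 185.24 person-months lands within a few tens of dollars of the reported total; the small gap comes from the effort figure being rounded to two decimals. Applying the schedule formula to the total effort gives roughly 18 months rather than the reported 12.61, presumably because sloccount estimates each top-level directory separately (an assumption, not something confirmed in this thread).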

> > #citations (GenA, ProbA...): easy to count thanks to Google Scholar
> > #mentions on the Web: ???
> >
> > Any other nice and easily computed metrics?
> >
> > I will appreciate your help and suggestions, and sorry for late notice.
> >
>
>
> Good luck,
>
> Lennart.
>
> > best,
> > Yurii
> >
> >
> >
> > _______________________________________________
> > genabel-devel mailing list
> > genabel-devel at lists.r-forge.r-project.org
> > https://lists.r-forge.r-project.org/cgi-bin/mailman/listinfo/genabel-devel
> >
>
>
> --
> -----------------------------------------------------------------
> L.C. Karssen
> Utrecht
> The Netherlands
>
> lennart at karssen.org
> http://blog.karssen.org
>
> Please do not send me Word or PowerPoint files!
> See http://www.gnu.org/philosophy/no-word-attachments.nl.html
> ------------------------------------------------------------------
>
>
> _______________________________________________
> genabel-devel mailing list
> genabel-devel at lists.r-forge.r-project.org
> https://lists.r-forge.r-project.org/cgi-bin/mailman/listinfo/genabel-devel
>


-- 
-----------------------------------------------------
Yurii S. Aulchenko

[ LinkedIn <http://nl.linkedin.com/in/yuriiaulchenko> ] [
Twitter<http://twitter.com/YuriiAulchenko>] [
Blog <http://yurii-aulchenko.blogspot.nl/> ]
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.r-forge.r-project.org/pipermail/genabel-devel/attachments/20130705/be458dea/attachment.html>

From yurii.aulchenko at gmail.com  Fri Jul  5 14:36:42 2013
From: yurii.aulchenko at gmail.com (Yurii Aulchenko)
Date: Fri, 5 Jul 2013 14:36:42 +0200
Subject: [GenABEL-dev] presentation at UseR!-2013
In-Reply-To: <CAHX9t6+_vyEwPnS9GKBg6SeizZe4NJTSEA6-ZwTQWE3QkgkPKQ@mail.gmail.com>
References: <CAHX9t6+_vyEwPnS9GKBg6SeizZe4NJTSEA6-ZwTQWE3QkgkPKQ@mail.gmail.com>
Message-ID: <CAHX9t6KV6Nc4hnFTRfXvi=WrMTNee-hMPe-xBdrhfN+8z1RV0g@mail.gmail.com>

On Fri, Jul 5, 2013 at 11:04 AM, Yurii Aulchenko
<yurii.aulchenko at gmail.com> wrote:

> Dear All,
>
> I am now drafting my presentation for UseR!-2013 (
> http://www.edii.uclm.es/~useR-2013/). My presentation about "The GenABEL
> suite for genome-wide association analyses" is scheduled for Wed July 10
> morning. I will send it to the list for the discussion as soon as I have a
> draft (most likely by Saturday eve).
>
> I thought it might be a good idea to present the evolution of GenABEL in
> numbers, so the idea is to get the numbers by year or quarter (say,
> #posts in 2009=x1, 2010=x2...) and present them graphically. For some of
> the growth metrics I can get the dynamics by year easily, but for others I
> have no idea and hope you could help me (maybe also by providing the
> numbers directly).
>
> Here is a small list of metrics I thought of:
>
> #packages: very easy to count :)
> #posts on GenABEL-devel: possible to count
> #posts on forum: no idea how to do that for defined time periods
> #number of lines of code in our SVN repo: no idea
> #citations (GenA, ProbA...): easy to count thanks to Google Scholar
>

Even easier than I thought:

http://scholar.google.nl/citations?view_op=view_citation&hl=en&user=wdqXTTEAAAAJ&citation_for_view=wdqXTTEAAAAJ:UeHWp8X0CEIC

http://scholar.google.nl/citations?view_op=view_citation&hl=en&user=wdqXTTEAAAAJ&citation_for_view=wdqXTTEAAAAJ:KlAtU1dfN6UC

I will add a "projection" for 2013, the total number of citations, and
arrows marking when each package/paper came out.


> #mentions on the Web: ???
>
> Any other nice and easily computed metrics?
>
> I will appreciate your help and suggestions, and sorry for late notice.
>
> best,
> Yurii
>


-- 
-----------------------------------------------------
Yurii S. Aulchenko

[ LinkedIn <http://nl.linkedin.com/in/yuriiaulchenko> ] [
Twitter<http://twitter.com/YuriiAulchenko>] [
Blog <http://yurii-aulchenko.blogspot.nl/> ]
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.r-forge.r-project.org/pipermail/genabel-devel/attachments/20130705/91b55345/attachment.html>

From yurii.aulchenko at gmail.com  Fri Jul  5 14:55:23 2013
From: yurii.aulchenko at gmail.com (Yurii Aulchenko)
Date: Fri, 5 Jul 2013 14:55:23 +0200
Subject: [GenABEL-dev] layout of GenABEL main page
In-Reply-To: <0177E59A-0CA1-4465-8186-A8EC79A20BB4@burlo.trieste.it>
References: <fb5e10b31a0eb5.51d2d197@aices.rwth-aachen.de>
 <51D2C34D.2000907@gmail.com>
 <CAHX9t6LXPDT7UR3+Dn1htv1pWBuvJtSAn_zi3ANmt3FYa5pnLw@mail.gmail.com>
 <0177E59A-0CA1-4465-8186-A8EC79A20BB4@burlo.trieste.it>
Message-ID: <CAHX9t6KDjHXVM8PLWcNxr0CNGJg=LcKvLcmMSOYfotd5scXomA@mail.gmail.com>

I suggest that for the moment we go with what we have (Grant's variant); we
can change it later.

Please let me know if you are strongly against this! I would really like to
use the logo for my presentation and also to try out how well it fits our
pages (genabel.org, Facebook, Twitter).

YA

On Tue, Jul 2, 2013 at 4:27 PM, Nicola Pirastu <
nicola.pirastu at burlo.trieste.it> wrote:

>  Just to add my two cents to the discussion,
>
>  I think that the problem is not with the DNA helix but with the font.
> I've played around a bit with it and if you use for example Helvetica or
> something less comic-sans-like it does look better. Also for some reason
> I'm still disturbed by the green but it is a very personal opinion..
>
>  Nicola
>
>  Dr. Nicola Pirastu PhD
> Research Fellow
> Medical Sciences, Chirurgical and Health Department
> University of Trieste
> Medical Genetics
> IRCCS Burlo Garofolo
> Via dell'Istria 65/1
> 34137 Italy
> tel. +390403785539
>
>  On 2 Jul 2013, at 14:38, Yurii Aulchenko <
> yurii.aulchenko at gmail.com> wrote:
>
> Dear All,
>
>  I agree with critique of Maarten, and I actually still not sure if I
> like Maarten's or Grant's idea better. Interesting thing is that - not sure
> all realize it - Grant's variant is his vision of Maarten's prototype :)
> However, Grant's variant has an important advantage - it is ready to serve
> as logo. And I actually want to use a logo in my slides for UseR!-2013.
>
>  So I suggest we take Grant's logo as a working variant. No doubt that
> the logo is going to evolve with time - as anything we do in the project -
> code, documentation; logo is no different, I think. The element which is
> going to stay and keep it recognizable is the way of spelling the GenABEL
> :) - Like Gnu's horns in the GNU logo.
>
>  What we can do next is to place an open call on site/forum for other
> users to contribute, but this is going to take time, and meanwhile I
> suggest to stick with Grant's variant.
>
>  Yurii
>
> On Tue, Jul 2, 2013 at 2:10 PM, Maarten Kooyman <kooyman at gmail.com> wrote:
>
>> Dear all,
>>
>>
>> It looks really nice ! Credits for who made it.  However, I have more the
>> impression that it looks like a polypeptide chain or a rosary. The
>> seventies font is a matter of taste, but it remind me of comic
>> sans(including a upside down e as a). I wonder if it readable if you print
>> it on a poster: I think this is a important use-case of a scientific logo.
>>
>> Kind regards,
>>
>>
>> Maarten
>>
>>
>>
>>
>> On 07/02/2013 01:11 PM, Diego Fabregat Traver wrote:
>>
>>>  On 28/06/13, Yurii Aulchenko  <yurii.aulchenko at gmail.com> wrote:
>>>
>>>  How do you like this one?
>>>>
>>> I like it a lot.
>>>
>>> What do you think about reducing the font size for the subtitle
>>> and right-justifying it? Would it still be readable? I liked that
>>> detail from the previous attempts with the "Project" subtitle.
>>>
>>> In any case, this is just a minor detail. It looks great as it is.
>>>
>>> Thanks to Grant Borodin!
>>>
>>>
>>>> YA
>>>>
>>>>
> >>>> On Thu, Jun 27, 2013 at 1:16 PM, Yurii Aulchenko <
> >>>> yurii.aulchenko at gmail.com> wrote:
>>>>
>>>>
>>>>  Dear Nicola, Diego, Lennart,
>>>>>
>>>>>
>>>>> Thanks for your feedback! I will ask Grant Borodin, who kindly
>>>>> designed these logos, if he could change C according to your comment
>>>>> (capital "ABEL" and "statistical genomics" as in F).
>>>>>
>>>>>
>>>>>
>>>>>
>>>>> Yurii
>>>>>
>>>>>
>>>>>
> >>>>> On Wed, Jun 26, 2013 at 4:16 PM, Diego Fabregat Traver <
> >>>>> fabregat at aices.rwth-aachen.de> wrote:
>>>>>
>>>>>
>>>>>
>>>>>
>>>>>> Congrats to whoever designed these logos, they look very nice :)
>>>>>>
>>>>>>
>>>>>>
>>>>>> With respect to my preferences, I fully agree with Lennart: "C with
>>>>>> capital ABEL and statistical genomics below it" would be my choice.
>>>>>>
>>>>>>
>>>>>>
>>>>>> Best,
>>>>>>
>>>>>> Diego
>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>>
> >>>>>> On 20/06/13, "L.C. Karssen" <lennart at karssen.org> wrote:
>>>>>>
>>>>>>
>>>>>>
>>>>>>  Wow! Those look really nice!
>>>>>>> I like options C and F the most. Actually a combination would be even
>>>>>>> better IMHO: use C with capital ABEL and statistical genomics below
>>>>>>> it.
>>>>>>> Looking forward to head the opinion of others,
>>>>>>> Lennart.
>>>>>>> On 20-06-13 09:34, Yurii Aulchenko wrote:
>>>>>>>
>>>>>>>> Please find attached few more logo variants
>>>>>>>> Yurii
>>>>>>>>
>>>>>>>
>>>>>>
>>>>>
>>>>>
>>>>
>>>>
> >>> _______________________________________________
> >>> genabel-devel mailing list
> >>> genabel-devel at lists.r-forge.r-project.org
> >>> https://lists.r-forge.r-project.org/cgi-bin/mailman/listinfo/genabel-devel
>>>
>>
> >> _______________________________________________
> >> genabel-devel mailing list
> >> genabel-devel at lists.r-forge.r-project.org
> >> https://lists.r-forge.r-project.org/cgi-bin/mailman/listinfo/genabel-devel
>>
>
>
>
>  --
> -----------------------------------------------------
> Yurii S. Aulchenko
>
>  [ LinkedIn <http://nl.linkedin.com/in/yuriiaulchenko> ] [ Twitter<http://twitter.com/YuriiAulchenko>] [
> Blog <http://yurii-aulchenko.blogspot.nl/> ]
>  _______________________________________________
> genabel-devel mailing list
> genabel-devel at lists.r-forge.r-project.org
> https://lists.r-forge.r-project.org/cgi-bin/mailman/listinfo/genabel-devel
>
>
> CONFIDENTIALITY NOTICE Confidential information may be contained in this
> message or in its attachments. If you are not the addressee indicated in
> this message, or responsible for delivering the message to that person, or
> if you have received this message in error, you may not transcribe, copy
> or deliver this message to anyone. In that case, you should delete this
> message and its attachments. Thank you.
>


-- 
-----------------------------------------------------
Yurii S. Aulchenko

[ LinkedIn <http://nl.linkedin.com/in/yuriiaulchenko> ] [
Twitter<http://twitter.com/YuriiAulchenko>] [
Blog <http://yurii-aulchenko.blogspot.nl/> ]
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.r-forge.r-project.org/pipermail/genabel-devel/attachments/20130705/0f022b4a/attachment-0001.html>

From nicola.pirastu at burlo.trieste.it  Fri Jul  5 15:05:16 2013
From: nicola.pirastu at burlo.trieste.it (Nicola Pirastu)
Date: Fri, 5 Jul 2013 15:05:16 +0200
Subject: [GenABEL-dev] layout of GenABEL main page
In-Reply-To: <CAHX9t6KDjHXVM8PLWcNxr0CNGJg=LcKvLcmMSOYfotd5scXomA@mail.gmail.com>
References: <fb5e10b31a0eb5.51d2d197@aices.rwth-aachen.de>
 <51D2C34D.2000907@gmail.com>
 <CAHX9t6LXPDT7UR3+Dn1htv1pWBuvJtSAn_zi3ANmt3FYa5pnLw@mail.gmail.com>
 <0177E59A-0CA1-4465-8186-A8EC79A20BB4@burlo.trieste.it>
 <CAHX9t6KDjHXVM8PLWcNxr0CNGJg=LcKvLcmMSOYfotd5scXomA@mail.gmail.com>
Message-ID: <6632A424-420E-423B-957A-3B8481DD0122@burlo.trieste.it>

I agree, in the end it's not the Coca-Cola logo and we have not been using it for years, so I don't think people are going to be confused if the logo changes in a few months.

I am actually curious to see how it will look on the forum. I do think that, if it's not too much work, the colors of the forum and website should match those of the logo though.

Nicola


Dr. Nicola Pirastu PhD
Research Fellow
Medical Sciences, Chirurgical and Health Department
University of Trieste
Medical Genetics
IRCCS Burlo Garofolo
Via dell'Istria 65/1
34137 Italy
tel. +390403785539

On 5 Jul 2013, at 14:55, Yurii Aulchenko <yurii.aulchenko at gmail.com> wrote:

I suggest that for the moment we go with what we have (Grant's variant); we can change later.

Please let me know if you have a strong opinion against! - I really would like to use the logo for my presentation and also play a bit how well it fits our pages (genabel.org, facebook, twitter)

YA

On Tue, Jul 2, 2013 at 4:27 PM, Nicola Pirastu <nicola.pirastu at burlo.trieste.it> wrote:
Just to add my two cents to the discussion,

I think that the problem is not with the DNA helix but with the font. I've played around a bit with it and if you use for example Helvetica or something less comic-sans-like it does look better. Also for some reason I'm still disturbed by the green but it is a very personal opinion..

Nicola

Dr. Nicola Pirastu PhD
Research Fellow
Medical Sciences, Chirurgical and Health Department
University of Trieste
Medical Genetics
IRCCS Burlo Garofolo
Via dell'Istria 65/1
34137 Italy
tel. +390403785539

On 2 Jul 2013, at 14:38, Yurii Aulchenko <yurii.aulchenko at gmail.com> wrote:

Dear All,

I agree with critique of Maarten, and I actually still not sure if I like Maarten's or Grant's idea better. Interesting thing is that - not sure all realize it - Grant's variant is his vision of Maarten's prototype :) However, Grant's variant has an important advantage - it is ready to serve as logo. And I actually want to use a logo in my slides for UseR!-2013.

So I suggest we take Grant's logo as a working variant. No doubt that the logo is going to evolve with time - as anything we do in the project - code, documentation; logo is no different, I think. The element which is going to stay and keep it recognizable is the way of spelling the GenABEL :) - Like Gnu's horns in the GNU logo.

What we can do next is to place an open call on site/forum for other users to contribute, but this is going to take time, and meanwhile I suggest to stick with Grant's variant.

Yurii

On Tue, Jul 2, 2013 at 2:10 PM, Maarten Kooyman <kooyman at gmail.com> wrote:
Dear all,


It looks really nice ! Credits for who made it.  However, I have more the impression that it looks like a polypeptide chain or a rosary. The seventies font is a matter of taste, but it remind me of comic sans(including a upside down e as a). I wonder if it readable if you print it on a poster: I think this is a important use-case of a scientific logo.

Kind regards,


Maarten


On 07/02/2013 01:11 PM, Diego Fabregat Traver wrote:
On 28/06/13, Yurii Aulchenko <yurii.aulchenko at gmail.com> wrote:

How do you like this one?
I like it a lot.

What do you think about reducing the font size for the subtitle
and right-justifying it? Would it still be readable? I liked that
detail from the previous attempts with the "Project" subtitle.

In any case, this is just a minor detail. It looks great as it is.

Thanks to Grant Borodin!

YA


On Thu, Jun 27, 2013 at 1:16 PM, Yurii Aulchenko <yurii.aulchenko at gmail.com> wrote:


Dear Nicola, Diego, Lennart,


Thanks for your feedback! I will ask Grant Borodin, who kindly designed these logos, if he could change C according to your comment (capital "ABEL" and "statistical genomics" as in F).


Yurii


On Wed, Jun 26, 2013 at 4:16 PM, Diego Fabregat Traver <fabregat at aices.rwth-aachen.de> wrote:


Congrats to whoever designed these logos, they look very nice :)


With respect to my preferences, I fully agree with Lennart: "C with capital ABEL and statistical genomics below it" would be my choice.


Best,

Diego


On 20/06/13, "L.C. Karssen" <lennart at karssen.org> wrote:


Wow! Those look really nice!
I like options C and F the most. Actually a combination would be even
better IMHO: use C with capital ABEL and statistical genomics below it.
Looking forward to head the opinion of others,
Lennart.
On 20-06-13 09:34, Yurii Aulchenko wrote:
Please find attached few more logo variants
Yurii


_______________________________________________
genabel-devel mailing list
genabel-devel at lists.r-forge.r-project.org
https://lists.r-forge.r-project.org/cgi-bin/mailman/listinfo/genabel-devel

_______________________________________________
genabel-devel mailing list
genabel-devel at lists.r-forge.r-project.org
https://lists.r-forge.r-project.org/cgi-bin/mailman/listinfo/genabel-devel


--
-----------------------------------------------------
Yurii S. Aulchenko

[ LinkedIn<http://nl.linkedin.com/in/yuriiaulchenko> ] [ Twitter<http://twitter.com/YuriiAulchenko> ] [ Blog<http://yurii-aulchenko.blogspot.nl/> ]
_______________________________________________
genabel-devel mailing list
genabel-devel at lists.r-forge.r-project.org
https://lists.r-forge.r-project.org/cgi-bin/mailman/listinfo/genabel-devel


--
-----------------------------------------------------
Yurii S. Aulchenko

[ LinkedIn<http://nl.linkedin.com/in/yuriiaulchenko> ] [ Twitter<http://twitter.com/YuriiAulchenko> ] [ Blog<http://yurii-aulchenko.blogspot.nl/> ]

CONFIDENTIALITY NOTICE Confidential information may be contained in this message or in its attachments. If you are not the addressee indicated in this message, or responsible for delivering the message to that person, or if you have received this message in error, you may not transcribe, copy or deliver this message to anyone. In that case, you should delete this message and its attachments. Thank you.
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.r-forge.r-project.org/pipermail/genabel-devel/attachments/20130705/588e1d5b/attachment.html>

From yurii.aulchenko at gmail.com  Fri Jul  5 15:09:34 2013
From: yurii.aulchenko at gmail.com (Yurii Aulchenko)
Date: Fri, 5 Jul 2013 15:09:34 +0200
Subject: [GenABEL-dev] layout of GenABEL main page
In-Reply-To: <6632A424-420E-423B-957A-3B8481DD0122@burlo.trieste.it>
References: <fb5e10b31a0eb5.51d2d197@aices.rwth-aachen.de>
 <51D2C34D.2000907@gmail.com>
 <CAHX9t6LXPDT7UR3+Dn1htv1pWBuvJtSAn_zi3ANmt3FYa5pnLw@mail.gmail.com>
 <0177E59A-0CA1-4465-8186-A8EC79A20BB4@burlo.trieste.it>
 <CAHX9t6KDjHXVM8PLWcNxr0CNGJg=LcKvLcmMSOYfotd5scXomA@mail.gmail.com>
 <6632A424-420E-423B-957A-3B8481DD0122@burlo.trieste.it>
Message-ID: <CAHX9t6KV69R_KLLJ_0Kqnys+nndAs=J-_s3kZHUd7N4dEVObgQ@mail.gmail.com>

On Fri, Jul 5, 2013 at 3:05 PM, Nicola Pirastu <
nicola.pirastu at burlo.trieste.it> wrote:

>  I agree, in the end it's not the coca-cola logo and we have not been
> using it for years so I don't think people are going to be confused if the
> Logo changes in a few months.
>
>
More than that - I really think it should evolve as our project does :)


>  I am actually curious to see how it will look on the forum. I do think
> that if it's not too much work, the colors of the forum and website should
> match those of the logo though.
>

Yep. I am now starting to understand why people were giving cost estimates
of a few thousand euros for that basic design package: e.g. for
Facebook we need a cover and an avatar (the latter would do for Twitter as
well). So this is a whole project :)

Maybe later we should think of inviting some people from a design school -
they must be looking for graduation projects, and maybe they would
be willing to do that for free :)

YA


>
>  Nicola
>
>
> Dr. Nicola Pirastu PhD
> Research Fellow
> Medical Sciences, Chirurgical and Health Department
> University of Trieste
> Medical Genetics
> IRCCS Burlo Garofolo
> Via dell'Istria 65/1
> 34137 Italy
> tel. +390403785539
>
>  On 05 Jul 2013, at 14:55, Yurii Aulchenko <
> yurii.aulchenko at gmail.com> wrote:
>
> I suggest that for the moment we go with what we have (Grant's variant);
> we can change later.
>
>  Please let me know if you have a strong opinion against it! - I would
> really like to use the logo for my presentation and also to see how well
> it fits our pages (genabel.org, facebook, twitter)
>
>  YA
>
> On Tue, Jul 2, 2013 at 4:27 PM, Nicola Pirastu <
> nicola.pirastu at burlo.trieste.it> wrote:
>
>> Just to add my two cents to the discussion,
>>
>>  I think that the problem is not with the DNA helix but with the font.
>> I've played around a bit with it and if you use for example Helvetica or
>> something less comic-sans-like it does look better. Also for some reason
>> I'm still disturbed by the green but it is a very personal opinion..
>>
>>  Nicola
>>
>>  Dr. Nicola Pirastu PhD
>> Research Fellow
>> Medical Sciences, Chirurgical and Health Department
>> University of Trieste
>> Medical Genetics
>> IRCCS Burlo Garofolo
>> Via dell'Istria 65/1
>> 34137 Italy
>> tel. +390403785539
>>
>>  On 02 Jul 2013, at 14:38, Yurii Aulchenko <
>> yurii.aulchenko at gmail.com> wrote:
>>
>> Dear All,
>>
>>  I agree with Maarten's critique, and I am actually still not sure whether
>> I like Maarten's or Grant's idea better. The interesting thing is that - not
>> sure everyone realizes it - Grant's variant is his vision of Maarten's
>> prototype :) However, Grant's variant has an important advantage - it is
>> ready to serve as a logo. And I actually want to use a logo in my slides
>> for UseR!-2013.
>>
>>  So I suggest we take Grant's logo as a working variant. No doubt
>> the logo is going to evolve with time - as does anything we do in the
>> project - code, documentation; the logo is no different, I think. The
>> element which is going to stay and keep it recognizable is the way of
>> spelling GenABEL :) - like the gnu's horns in the GNU logo.
>>
>>  What we can do next is to place an open call on the site/forum for other
>> users to contribute, but this is going to take time, and meanwhile I
>> suggest we stick with Grant's variant.
>>
>>  Yurii
>>
>> On Tue, Jul 2, 2013 at 2:10 PM, Maarten Kooyman <kooyman at gmail.com> wrote:
>>
>>> Dear all,
>>>
>>>
>>> It looks really nice! Credit to whoever made it. However, I have more
>>> the impression that it looks like a polypeptide chain or a rosary. The
>>> seventies font is a matter of taste, but it reminds me of Comic
>>> Sans (including an upside-down e used as an a). I wonder if it will be
>>> readable if you print it on a poster: I think this is an important use
>>> case for a scientific logo.
>>>
>>> Kind regards,
>>>
>>>
>>> Maarten
>>>
>>>
>>>
>>>
>>> On 07/02/2013 01:11 PM, Diego Fabregat Traver wrote:
>>>
>>>>  On 28/06/13, Yurii Aulchenko  <yurii.aulchenko at gmail.com> wrote:
>>>>
>>>>  How do you like this one?
>>>>>
>>>> I like it a lot.
>>>>
>>>> What do you think about reducing the font size for the subtitle
>>>> and right-justifying it? Would it still be readable? I liked that
>>>> detail from the previous attempts with the "Project" subtitle.
>>>>
>>>> In any case, this is just a minor detail. It looks great as it is.
>>>>
>>>> Thanks to Grant Borodin!
>>>>
>>>>
>>>>> YA
>>>>>
>>>>>
>>>>> On Thu, Jun 27, 2013 at 1:16 PM, Yurii Aulchenko <
>>>>> yurii.aulchenko at gmail.com> wrote:
>>>>>
>>>>>
>>>>>  Dear Nicola, Diego, Lennart,
>>>>>>
>>>>>>
>>>>>> Thanks for your feedback! I will ask Grant Borodin, who kindly
>>>>>> designed these logos, if he could change C according to your comment
>>>>>> (capital "ABEL" and "statistical genomics" as in F).
>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>> Yurii
>>>>>>
>>>>>>
>>>>>>
>>>>>> On Wed, Jun 26, 2013 at 4:16 PM, Diego Fabregat Traver <
>>>>>> fabregat at aices.rwth-aachen.de> wrote:
>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>>> Congrats to whoever designed these logos, they look very nice :)
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> With respect to my preferences, I fully agree with Lennart: "C with
>>>>>>> capital ABEL and statistical genomics below it" would be my choice.
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> Best,
>>>>>>>
>>>>>>> Diego
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> On 20/06/13, "L.C. Karssen" <lennart at karssen.org> wrote:
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>  Wow! Those look really nice!
>>>>>>>> I like options C and F the most. Actually a combination would be
>>>>>>>> even
>>>>>>>> better IMHO: use C with capital ABEL and statistical genomics below
>>>>>>>> it.
>>>>>>>> Looking forward to hearing the opinion of others,
>>>>>>>> Lennart.
>>>>>>>> On 20-06-13 09:34, Yurii Aulchenko wrote:
>>>>>>>>
>>>>>>>>> Please find attached a few more logo variants
>>>>>>>>> Yurii
>>>>>>>>>
>>>>>>>>
>>>>>>>
>>>>>>
>>>>>>
>>>>>
>>>>>
>>>> _______________________________________________
>>>> genabel-devel mailing list
>>>> genabel-devel at lists.r-forge.r-project.org
>>>> https://lists.r-forge.r-project.org/cgi-bin/mailman/listinfo/genabel-devel
>>>>
>>>
>>>
>>
>>
>>
>>  --
>> -----------------------------------------------------
>> Yurii S. Aulchenko
>>
>>  [ LinkedIn <http://nl.linkedin.com/in/yuriiaulchenko> ] [ Twitter<http://twitter.com/YuriiAulchenko>] [
>> Blog <http://yurii-aulchenko.blogspot.nl/> ]
>>
>>
>>
>
>
>
>  --
> -----------------------------------------------------
> Yurii S. Aulchenko
>
>  [ LinkedIn <http://nl.linkedin.com/in/yuriiaulchenko> ] [ Twitter<http://twitter.com/YuriiAulchenko>] [
> Blog <http://yurii-aulchenko.blogspot.nl/> ]
>
>
>


-- 
-----------------------------------------------------
Yurii S. Aulchenko

[ LinkedIn <http://nl.linkedin.com/in/yuriiaulchenko> ] [
Twitter<http://twitter.com/YuriiAulchenko>] [
Blog <http://yurii-aulchenko.blogspot.nl/> ]
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.r-forge.r-project.org/pipermail/genabel-devel/attachments/20130705/cf3724da/attachment-0001.html>

From lennart at karssen.org  Sat Jul  6 18:10:48 2013
From: lennart at karssen.org (L.C. Karssen)
Date: Sat, 06 Jul 2013 18:10:48 +0200
Subject: [GenABEL-dev] presentation at UseR!-2013
In-Reply-To: <CAHX9t6+p25aXJiJOAxhAF8=VNjRcKX1s2R69hVhZFk7QEEnVbg@mail.gmail.com>
References: <CAHX9t6+_vyEwPnS9GKBg6SeizZe4NJTSEA6-ZwTQWE3QkgkPKQ@mail.gmail.com>
 <51D6A04F.7050708@karssen.org>
 <CAHX9t6+p25aXJiJOAxhAF8=VNjRcKX1s2R69hVhZFk7QEEnVbg@mail.gmail.com>
Message-ID: <51D84188.6010306@karssen.org>

Hi Yurii,

Please find attached the output of the MySQL statement. I added another
column in which the week numbers are separated from the year by a dash,
which makes it easier to read in e.g. R:

# Read the converted output; columns are separated by a single space
posts <- read.table("tmp/posts_per_week_converted.out", header=TRUE,
                    sep=" ", row.names=NULL)

colnames(posts) <- c("date", "num_posts")

# Convert year-week to year-month-day (the Monday of that week)
posts$weekdate <- as.Date(paste(posts$date, 1), format="%Y-%U %u")

head(posts)
     date num_posts   weekdate
1 2011-01         1 2011-01-03
2 2011-04        15 2011-01-24
3 2011-05         7 2011-01-31
4 2011-06        24 2011-02-07
5 2011-07        10 2011-02-14
6 2011-08         7 2011-02-21


This should help with making a bar plot of "weekdate" vs. "num_posts".
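
A minimal sketch of such a plot, assuming only the six rows shown by
head(posts) above (typed in by hand here so that the snippet runs on its own;
the real input would be the full converted file). The per-quarter totals are
an extra illustration, since the numbers were asked for by year/quarter; the
names "quarter" and "per_quarter" are made up for this example:

```r
# First six rows from head(posts) above, entered by hand for illustration
posts <- data.frame(
  date = c("2011-01", "2011-04", "2011-05", "2011-06", "2011-07", "2011-08"),
  num_posts = c(1, 15, 7, 24, 10, 7),
  stringsAsFactors = FALSE
)

# Same conversion as above: the Monday ("1") of each year-week
posts$weekdate <- as.Date(paste(posts$date, 1), format = "%Y-%U %u")

# Bar plot of posts per week, labelled with the start date of each week
barplot(posts$num_posts, names.arg = format(posts$weekdate, "%Y-%m-%d"),
        las = 2, ylab = "Number of forum posts")

# Totals per quarter, e.g. for a coarser figure (hypothetical helper columns)
posts$quarter <- paste(format(posts$weekdate, "%Y"),
                       quarters(posts$weekdate), sep = "-")
per_quarter <- tapply(posts$num_posts, posts$quarter, sum)
print(per_quarter)
```

With the full data set the same tapply() call would give one total per
quarter, which may be easier to show on a slide than 120+ weekly bars.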


By the way, the SQL script is in the ~/scripts/ directory on the SSH
server of our hosting provider. You can execute it like this:
 mysql -u USERNAME --password=PASSWORD -h HOSTNAME < get_weekly_posts.sql > posts_per_week.out

The user name, password and host name can be found in the backup scripts
in that same directory.


Best,

Lennart.


On 05-07-13 14:04, Yurii Aulchenko wrote:
> 
> 
> On Fri, Jul 5, 2013 at 12:30 PM, L.C. Karssen <lennart at karssen.org
> <mailto:lennart at karssen.org>> wrote:
> 
>     Hi Yurii,
> 
>     On 07/05/2013 11:04 AM, Yurii Aulchenko wrote:
>     > Dear All,
>     >
>     > I am now drafting my presentation for UseR!-2013 (
>     > http://www.edii.uclm.es/~useR-2013/). My presentation about "The
>     > GenABEL suite for genome-wide association analyses" is scheduled for
>     > Wed July 10 morning. I will send it to the list for discussion as
>     > soon as I have a draft (most likely by Saturday eve).
>     >
>     > I thought it may be a good idea to present the evolution of
>     > GenABEL in numbers, so the idea is to get the numbers by
>     > years/quarters of the year (say, #posts in 2009=x1, 2010=x2...) and
>     > present them graphically. For some of the growth metrics I can get
>     > the dynamics by years easily, but for some I have no idea and hope
>     > you could help me (maybe also by providing the numbers directly).
>     >
>     > Here a small list of metrics I thought of:
>     >
>     > #packages: very easy to count :)
>     > #posts on GenABEL-devel: possible to count
>     > #posts on forum: no idea how to do that for defined time periods
> 
>     I guess you need to run a query on the database to get those. Our
>     hosting provider has a phpMyAdmin interface you can use for that (or
>     you could probably use the SSH account and run the MySQL client from
>     the command line). Probably a query along these lines:
>
>      SELECT yearweek(date(from_unixtime(post_time))) AS week,
>             COUNT(*) AS num_posts
>      FROM phpbb_posts
>      GROUP BY yearweek(date(from_unixtime(post_time)))
> 
> 
> Arrgh... I could probably figure this out if I had enough time, but I am
> going to invest it in the presentation now. If you/someone could give a
> hand, that would be great :)
> 
>  
> 
> 
>     > #number of lines of code in our SVN repo: no idea
> 
>     Probably SLOCcount will help: http://www.dwheeler.com/sloccount/
> 
> 
> This is a nice one! Two problems: it does not count/recognize R, and I did
> not see how to use it to see the dynamics (what was there in the repo 2
> years ago?..)
>
> But I like that even without the R code counts (which is 148,000 lines),
> for ~65,000 lines of mostly C/C++ I get a message indicating that
> GenABEL is worth a few million dollars:
> 
> Development Effort Estimate, Person-Years (Person-Months) = 15.44 (185.24)
>  (Basic COCOMO model, Person-Months = 2.4 * (KSLOC**1.05))
> Schedule Estimate, Years (Months)                         = 1.05 (12.61)
>  (Basic COCOMO model, Months = 2.5 * (person-months**0.38))
> Total Estimated Cost to Develop                           = $ 2,085,323
>  (average salary = $56,286/year, overhead = 2.40).
> 
> So I think I should use these figures in my presentation :) 
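
The arithmetic behind these SLOCCount figures can be sketched in R as below.
The formulas and the salary/overhead values are copied from the output quoted
above; the function names are made up for this illustration. Note that
plugging the total effort (185.24 person-months) into the schedule formula
does not give the reported 12.61 months, so the reported totals are presumably
not derived from the grand total alone (that is an assumption on my side, the
output itself does not say):

```r
# Basic COCOMO formulas as printed in the SLOCCount output quoted above
cocomo_effort <- function(ksloc) 2.4 * ksloc^1.05                    # person-months
cocomo_schedule <- function(person_months) 2.5 * person_months^0.38  # months

# Cost = person-years * average salary * overhead factor
cocomo_cost <- function(person_months, salary = 56286, overhead = 2.40) {
  (person_months / 12) * salary * overhead
}

effort_reported <- 185.24                     # person-months, from the output above
cost <- cocomo_cost(effort_reported)          # roughly 2.085 million dollars
schedule <- cocomo_schedule(effort_reported)  # longer than the reported 12.61 months

cat(sprintf("cost: $%.0f, schedule from total effort: %.2f months\n",
            cost, schedule))
cat(sprintf("effort for 65 KSLOC treated as one project: %.1f person-months\n",
            cocomo_effort(65)))
```

The cost line reproduces the reported total to within rounding of the effort
figure, so that number at least can be quoted with the stated salary and
overhead assumptions.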
> 
>     > #citations (GenA, ProbA...): easy to count thanks to Google Scholar
>     > #mentions on the Web: ???
>     >
>     > Any other nice and easily computed metrics?
>     >
>     > I will appreciate your help and suggestions, and sorry for the late
>     > notice.
>     >
> 
> 
>     Good luck,
> 
>     Lennart.
> 
>     > best,
>     > Yurii
>     >
>     >
>     >
>     > _______________________________________________
>     > genabel-devel mailing list
>     > genabel-devel at lists.r-forge.r-project.org
>     <mailto:genabel-devel at lists.r-forge.r-project.org>
>     >
>     https://lists.r-forge.r-project.org/cgi-bin/mailman/listinfo/genabel-devel
>     >
> 
> 
>     --
>     -----------------------------------------------------------------
>     L.C. Karssen
>     Utrecht
>     The Netherlands
> 
>     lennart at karssen.org <mailto:lennart at karssen.org>
>     http://blog.karssen.org
> 
>     Stuur mij aub geen Word of Powerpoint bestanden!
>     Zie http://www.gnu.org/philosophy/no-word-attachments.nl.html
>     ------------------------------------------------------------------
> 
> 
>     _______________________________________________
>     genabel-devel mailing list
>     genabel-devel at lists.r-forge.r-project.org
>     <mailto:genabel-devel at lists.r-forge.r-project.org>
>     https://lists.r-forge.r-project.org/cgi-bin/mailman/listinfo/genabel-devel
> 
> 
> 
> 
> -- 
> -----------------------------------------------------
> Yurii S. Aulchenko
> 
> [ LinkedIn <http://nl.linkedin.com/in/yuriiaulchenko> ] [ Twitter
> <http://twitter.com/YuriiAulchenko> ] [ Blog
> <http://yurii-aulchenko.blogspot.nl/> ]

-- 
-----------------------------------------------------------------
L.C. Karssen
Utrecht
The Netherlands

lennart at karssen.org
http://blog.karssen.org

Stuur mij aub geen Word of Powerpoint bestanden!
Zie http://www.gnu.org/philosophy/no-word-attachments.nl.html
------------------------------------------------------------------
-------------- next part --------------
week	num_posts
2011-01 1
2011-04 15
2011-05 7
2011-06 24
2011-07 10
2011-08 7
2011-09 17
2011-10 27
2011-11 11
2011-12 11
2011-13 19
2011-14 4
2011-15 12
2011-16 20
2011-17 6
2011-18 6
2011-19 6
2011-20 12
2011-21 9
2011-22 13
2011-23 4
2011-24 40
2011-25 19
2011-26 6
2011-27 26
2011-28 5
2011-29 3
2011-30 2
2011-31 2
2011-32 4
2011-33 11
2011-34 17
2011-35 1
2011-36 5
2011-37 2
2011-38 3
2011-39 16
2011-40 15
2011-41 4
2011-42 7
2011-43 18
2011-44 3
2011-45 4
2011-46 6
2011-47 5
2011-48 7
2011-49 6
2011-50 3
2011-51 9
2011-52 2
2012-01 1
2012-02 2
2012-03 1
2012-04 4
2012-05 6
2012-06 9
2012-07 12
2012-08 26
2012-09 1
2012-10 2
2012-11 6
2012-12 5
2012-13 4
2012-14 2
2012-15 2
2012-16 5
2012-17 4
2012-18 1
2012-19 3
2012-20 2
2012-22 9
2012-23 8
2012-24 9
2012-25 14
2012-26 6
2012-27 6
2012-28 16
2012-29 5
2012-30 2
2012-32 6
2012-33 1
2012-34 10
2012-35 2
2012-36 1
2012-37 9
2012-38 20
2012-39 8
2012-40 8
2012-41 1
2012-42 9
2012-43 9
2012-44 5
2012-45 1
2012-46 1
2012-47 12
2012-48 2
2012-49 16
2012-50 4
2012-51 7
2012-53 4
2013-01 31
2013-02 9
2013-03 7
2013-04 19
2013-05 5
2013-06 8
2013-07 20
2013-08 7
2013-09 15
2013-10 12
2013-11 2
2013-12 6
2013-13 17
2013-14 18
2013-15 9
2013-16 2
2013-17 10
2013-18 12
2013-19 18
2013-20 7
2013-21 3
2013-22 2
2013-23 2
2013-24 9
2013-25 7
-------------- next part --------------
A non-text attachment was scrubbed...
Name: signature.asc
Type: application/pgp-signature
Size: 230 bytes
Desc: OpenPGP digital signature
URL: <http://lists.r-forge.r-project.org/pipermail/genabel-devel/attachments/20130706/e9b57d1b/attachment.sig>

From yurii.aulchenko at gmail.com  Sun Jul  7 03:40:06 2013
From: yurii.aulchenko at gmail.com (Yurii Aulchenko)
Date: Sun, 7 Jul 2013 03:40:06 +0200
Subject: [GenABEL-dev] presentation at UseR!-2013
In-Reply-To: <51D84188.6010306@karssen.org>
References: <CAHX9t6+_vyEwPnS9GKBg6SeizZe4NJTSEA6-ZwTQWE3QkgkPKQ@mail.gmail.com>
 <51D6A04F.7050708@karssen.org>
 <CAHX9t6+p25aXJiJOAxhAF8=VNjRcKX1s2R69hVhZFk7QEEnVbg@mail.gmail.com>
 <51D84188.6010306@karssen.org>
Message-ID: <CAHX9t6+2kBd0c1UYmLR0VNK4mfgSvt7B+uE_SXzGa0Fvze_8Dg@mail.gmail.com>

Thank you very much, Lennart! I am not sure I will manage to use this data for
the current presentation - it is getting rather big now, and I am getting
tired... I can probably use these numbers to make a figure "how the
community sets off", but I am not sure; I have not had time to present these
numbers graphically yet.

You can find the current draft of the presentation in my public Dropbox,
https://dl.dropboxusercontent.com/u/13260693/GenABEL-1.odp

Comments/suggestions/improvements are welcome!

Note that I have 15-17 minutes for the presentation, so the slide count is
already too high. I can probably cut the "history" short. I also wonder
whether this presentation will be interesting for the R people - it is
rather general at the moment.

YA

On Sat, Jul 6, 2013 at 6:10 PM, L.C. Karssen <lennart at karssen.org> wrote:

> Hi Yurii,
>
> Please find attached the output of the MySQL statement. I added another
> column in which the week numbers are separated from the year by a dash,
> that makes it easier to read in e.g. R:
>
> posts <- read.table("tmp/posts_per_week_converted.out", header=TRUE,
> sep=" ", row.names=NULL)
>
> colnames(posts) <- c("date", "num_posts")
>
> # Convert year-week to year-month-day
> posts$weekdate <- as.Date(paste(posts$date, 1), format="%Y-%U %u")
>
> head(posts)
>      date num_posts   weekdate
> 1 2011-01         1 2011-01-03
> 2 2011-04        15 2011-01-24
> 3 2011-05         7 2011-01-31
> 4 2011-06        24 2011-02-07
> 5 2011-07        10 2011-02-14
> 6 2011-08         7 2011-02-21
>
>
> This should help making a bar plot of "weekdate" vs. "num_posts".
>
>
> By the way, the SQL script is in the ~/scripts/ directory on the SSH
> server of our hoster. You can execute it like this:
>  mysql -u USERNAME --password=PASSWORD -h HOSTNAME <
> get_weekly_posts.sql > posts_per_week.out
>
> The user name, password and host name can be found in the backup scripts
> in that same directory.
>
>
> Best,
>
> Lennart.
>
>
> On 05-07-13 14:04, Yurii Aulchenko wrote:
> >
> >
> > On Fri, Jul 5, 2013 at 12:30 PM, L.C. Karssen <lennart at karssen.org
> > <mailto:lennart at karssen.org>> wrote:
> >
> >     Hi Yurii,
> >
> >     On 07/05/2013 11:04 AM, Yurii Aulchenko wrote:
> >     > Dear All,
> >     >
> >     > I am now drafting my presentation for UseR!-2013 (
> >     > http://www.edii.uclm.es/~useR-2013/). My presentation about "The
> >     > GenABEL suite for genome-wide association analyses" is scheduled for
> >     > Wed July 10 morning. I will send it to the list for discussion as
> >     > soon as I have a draft (most likely by Saturday eve).
> >     >
> >     > I thought it may be a good idea to present the evolution of
> >     > GenABEL in numbers, so the idea is to get the numbers by
> >     > years/quarters of the year (say, #posts in 2009=x1, 2010=x2...) and
> >     > present them graphically. For some of the growth metrics I can get
> >     > the dynamics by years easily, but for some I have no idea and hope
> >     > you could help me (maybe also by providing the numbers directly).
> >     >
> >     > Here a small list of metrics I thought of:
> >     >
> >     > #packages: very easy to count :)
> >     > #posts on GenABEL-devel: possible to count
> >     > #posts on forum: no idea how to do that for defined time periods
> >
> >     I guess you need to run a query on the database to get those. Our
> >     hosting provider has a phpMyAdmin interface you can use for that (or
> >     you could probably use the SSH account and run the MySQL client from
> >     the command line). Probably a query along these lines:
> >
> >      SELECT yearweek(date(from_unixtime(post_time))) AS week,
> >             COUNT(*) AS num_posts
> >      FROM phpbb_posts
> >      GROUP BY yearweek(date(from_unixtime(post_time)))
> >
> >
> > arrgh... probably I can figure this out if I had enough time, but gonna
> > to invest into presentation now. If you/someone could give a hand, would
> > be great :)
> >
> >
> >
> >
> >     > #number of lines of code in our SVN repo: no idea
> >
> >     Probably SLOCcount will help: http://www.dwheeler.com/sloccount/
> >
> >
> > This is a nice one! Two problems: it does not count/recognize R; did not
> > see how to use it to see the dynamics (what was there in repo 2 years
> > ago?..)
> >
> > But I like that even without the R code counts (which is 148,000 lines),
> > for ~65,000 lines of mostly C/C++ I get the message indicating that
> > GenABEL is worth few millions of dollars:
> >
> > Development Effort Estimate, Person-Years (Person-Months) = 15.44 (185.24)
> >  (Basic COCOMO model, Person-Months = 2.4 * (KSLOC**1.05))
> > Schedule Estimate, Years (Months)                         = 1.05 (12.61)
> >  (Basic COCOMO model, Months = 2.5 * (person-months**0.38))
> > Total Estimated Cost to Develop                           = $ 2,085,323
> >  (average salary = $56,286/year, overhead = 2.40).
> >
> > So I think I should use these figures in my presentation :)
> >
> >     > #citations (GenA, ProbA...): easy to count thanks to Google Scholar
> >     > #mentions on the Web: ???
> >     >
> >     > Any other nice and easily computed metrics?
> >     >
> >     > I will appreciate your help and suggestions, and sorry for late
> >     notice.
> >     >
> >
> >
> >     Good luck,
> >
> >     Lennart.
> >
> >     > best,
> >     > Yurii
> >     >
> >     >
> >     >
> >     > _______________________________________________
> >     > genabel-devel mailing list
> >     > genabel-devel at lists.r-forge.r-project.org
> >     <mailto:genabel-devel at lists.r-forge.r-project.org>
> >     >
> >
> https://lists.r-forge.r-project.org/cgi-bin/mailman/listinfo/genabel-devel
> >     >
> >
> >
> >     --
> >     -----------------------------------------------------------------
> >     L.C. Karssen
> >     Utrecht
> >     The Netherlands
> >
> >     lennart at karssen.org <mailto:lennart at karssen.org>
> >     http://blog.karssen.org
> >
> >     Stuur mij aub geen Word of Powerpoint bestanden!
> >     Zie http://www.gnu.org/philosophy/no-word-attachments.nl.html
> >     ------------------------------------------------------------------
> >
> >
> >     _______________________________________________
> >     genabel-devel mailing list
> >     genabel-devel at lists.r-forge.r-project.org
> >     <mailto:genabel-devel at lists.r-forge.r-project.org>
> >
> https://lists.r-forge.r-project.org/cgi-bin/mailman/listinfo/genabel-devel
> >
> >
> >
> >
> > --
> > -----------------------------------------------------
> > Yurii S. Aulchenko
> >
> > [ LinkedIn <http://nl.linkedin.com/in/yuriiaulchenko> ] [ Twitter
> > <http://twitter.com/YuriiAulchenko> ] [ Blog
> > <http://yurii-aulchenko.blogspot.nl/> ]
>
> --
> -----------------------------------------------------------------
> L.C. Karssen
> Utrecht
> The Netherlands
>
> lennart at karssen.org
> http://blog.karssen.org
>
> Stuur mij aub geen Word of Powerpoint bestanden!
> Zie http://www.gnu.org/philosophy/no-word-attachments.nl.html
> ------------------------------------------------------------------
>


-- 
-----------------------------------------------------
Yurii S. Aulchenko

[ LinkedIn <http://nl.linkedin.com/in/yuriiaulchenko> ] [
Twitter<http://twitter.com/YuriiAulchenko>] [
Blog <http://yurii-aulchenko.blogspot.nl/> ]
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.r-forge.r-project.org/pipermail/genabel-devel/attachments/20130707/efb43c32/attachment-0001.html>

From yurii.aulchenko at gmail.com  Wed Jul 10 20:42:45 2013
From: yurii.aulchenko at gmail.com (Yurii Aulchenko)
Date: Wed, 10 Jul 2013 20:42:45 +0200
Subject: [GenABEL-dev] presentation at UseR!-2013
In-Reply-To: <CAHX9t6+2kBd0c1UYmLR0VNK4mfgSvt7B+uE_SXzGa0Fvze_8Dg@mail.gmail.com>
References: <CAHX9t6+_vyEwPnS9GKBg6SeizZe4NJTSEA6-ZwTQWE3QkgkPKQ@mail.gmail.com>
 <51D6A04F.7050708@karssen.org>
 <CAHX9t6+p25aXJiJOAxhAF8=VNjRcKX1s2R69hVhZFk7QEEnVbg@mail.gmail.com>
 <51D84188.6010306@karssen.org>
 <CAHX9t6+2kBd0c1UYmLR0VNK4mfgSvt7B+uE_SXzGa0Fvze_8Dg@mail.gmail.com>
Message-ID: <CAHX9t6KqDGPJVHiyPKiYAVz5-F8F39ROqOjapR7qK8C7pgM_hg@mail.gmail.com>

Dear All,

The final version of the presentation - the one I presented at UseR!-2013 this
morning - is available at the previous link. The presentation went fine (though
I was slightly over time and therefore wrapped up a bit too quickly).
Several people contacted me after the talk.

Later, we should probably move that presentation to our website. (Which
section? Showcase?)

Lennart, Maarten, many thanks for your input!

YA

On Sun, Jul 7, 2013 at 3:40 AM, Yurii Aulchenko
<yurii.aulchenko at gmail.com> wrote:

> Thank you very much, Lennart! - not sure I will manage to use this data
> for current presentation - it is getting rather big now, and I am getting
> tired... I probably can use these numbers to make a figure "how the
> community sets off", but not sure, did not have time to present these
> numbers graphically yet.
>
> You can find the current draft of presentation at my public Dropbox,
> https://dl.dropboxusercontent.com/u/13260693/GenABEL-1.odp
>
> Comments/suggestions/improvements are welcome!
>
> Note that I have 15-17 minutes for the presentation, so slide count is
> already too high. Can probably cut short on the "history". Also wonder if
> this presentation will be interesting for the R people - it is kind of very
> general one at the moment.
>
> YA
>
>
> On Sat, Jul 6, 2013 at 6:10 PM, L.C. Karssen <lennart at karssen.org> wrote:
>
>> Hi Yurii,
>>
>> Please find attached the output of the MySQL statement. I added another
>> column in which the week numbers are separated from the year by a dash,
>> that makes it easier to read in e.g. R:
>>
>> posts <- read.table("tmp/posts_per_week_converted.out", header=TRUE,
>> sep=" ", row.names=NULL)
>>
>> colnames(posts) <- c("date", "num_posts")
>>
>> # Convert year-week to year-month-day
>> posts$weekdate <- as.Date(paste(posts$date, 1), format="%Y-%U %u")
>>
>> head(posts)
>>      date num_posts   weekdate
>> 1 2011-01         1 2011-01-03
>> 2 2011-04        15 2011-01-24
>> 3 2011-05         7 2011-01-31
>> 4 2011-06        24 2011-02-07
>> 5 2011-07        10 2011-02-14
>> 6 2011-08         7 2011-02-21
>>
>>
>> This should help making a bar plot of "weekdate" vs. "num_posts".
>>
>>
>> By the way, the SQL script is in the ~/scripts/ directory on the SSH
>> server of our hoster. You can execute it like this:
>>  mysql -u USERNAME --password=PASSWORD -h HOSTNAME <
>> get_weekly_posts.sql > posts_per_week.out
>>
>> The user name, password and host name can be found in the backup scripts
>> in that same directory.
>>
>>
>> Best,
>>
>> Lennart.
>>
>>
>> On 05-07-13 14:04, Yurii Aulchenko wrote:
>> >
>> >
>> > On Fri, Jul 5, 2013 at 12:30 PM, L.C. Karssen <lennart at karssen.org
>> > <mailto:lennart at karssen.org>> wrote:
>> >
>> >     Hi Yurii,
>> >
>> >     On 07/05/2013 11:04 AM, Yurii Aulchenko wrote:
>> >     > Dear All,
>> >     >
>> >     > I am now drafting my presentation for UseR!-2013 (
>> >     > http://www.edii.uclm.es/~useR-2013/). My presentation about "The
>> >     > GenABEL suite for genome-wide association analyses" is scheduled for
>> >     > Wed July 10 morning. I will send it to the list for discussion as
>> >     > soon as I have a draft (most likely by Saturday eve).
>> >     >
>> >     > I thought it may be a good idea to present the evolution of
>> >     > GenABEL in numbers, so the idea is to get the numbers by
>> >     > years/quarters of the year (say, #posts in 2009=x1, 2010=x2...) and
>> >     > present them graphically. For some of the growth metrics I can get
>> >     > the dynamics by years easily, but for some I have no idea and hope
>> >     > you could help me (maybe also by providing the numbers directly).
>> >     >
>> >     > Here a small list of metrics I thought of:
>> >     >
>> >     > #packages: very easy to count :)
>> >     > #posts on GenABEL-devel: possible to count
>> >     > #posts on forum: no idea how to do that for defined time periods
>> >
>> >     I guess you need to run a query on the database to get those. Our
>> >     hosting provider has a phpMyAdmin interface you can use for that (or
>> >     you could probably use the SSH account and run the MySQL client from
>> >     the command line). Probably a query along these lines:
>> >
>> >      SELECT yearweek(date(from_unixtime(post_time))) AS week,
>> >             COUNT(*) AS num_posts
>> >      FROM phpbb_posts
>> >      GROUP BY yearweek(date(from_unixtime(post_time)))
>> >
>> >
>> > arrgh... probably I can figure this out if I had enough time, but gonna
>> > to invest into presentation now. If you/someone could give a hand, would
>> > be great :)
>> >
>> >
>> >
>> >
>> >     > #number of lines of code in our SVN repo: no idea
>> >
>> >     Probably SLOCcount will help: http://www.dwheeler.com/sloccount/
>> >
>> >
>> > This is a nice one! Two problems: it does not count/recognize R; did not
>> > see how to use it to see the dynamics (what was there in repo 2 years
>> > ago?..)
>> >
>> > But I like that even without the R code counts (which is 148,000 lines),
>> > for ~65,000 lines of mostly C/C++ I get the message indicating that
>> > GenABEL is worth few millions of dollars:
>> >
>> > Development Effort Estimate, Person-Years (Person-Months) = 15.44 (185.24)
>> >  (Basic COCOMO model, Person-Months = 2.4 * (KSLOC**1.05))
>> > Schedule Estimate, Years (Months)                         = 1.05 (12.61)
>> >  (Basic COCOMO model, Months = 2.5 * (person-months**0.38))
>> > Total Estimated Cost to Develop                           = $ 2,085,323
>> >  (average salary = $56,286/year, overhead = 2.40).
>> >
>> > So I think I should use these figures in my presentation :)
>> >
>> >     > #citations (GenA, ProbA...): easy to count thanks to Google Scholar
>> >     > #mentions on the Web: ???
>> >     >
>> >     > Any other nice and easily computed metrics?
>> >     >
>> >     > I will appreciate your help and suggestions, and sorry for late
>> >     notice.
>> >     >
>> >
>> >
>> >     Good luck,
>> >
>> >     Lennart.
>> >
>> >     > best,
>> >     > Yurii
>> >     >
>> >     >
>> >     >
>> >     > _______________________________________________
>> >     > genabel-devel mailing list
>> >     > genabel-devel at lists.r-forge.r-project.org
>> >     <mailto:genabel-devel at lists.r-forge.r-project.org>
>> >     >
>> >
>> https://lists.r-forge.r-project.org/cgi-bin/mailman/listinfo/genabel-devel
>> >     >
>> >
>> >
>> >     --
>> >     -----------------------------------------------------------------
>> >     L.C. Karssen
>> >     Utrecht
>> >     The Netherlands
>> >
>> >     lennart at karssen.org <mailto:lennart at karssen.org>
>> >     http://blog.karssen.org
>> >
>> >     Stuur mij aub geen Word of Powerpoint bestanden!
>> >     Zie http://www.gnu.org/philosophy/no-word-attachments.nl.html
>> >     ------------------------------------------------------------------
>> >
>> >
>> >     _______________________________________________
>> >     genabel-devel mailing list
>> >     genabel-devel at lists.r-forge.r-project.org
>> >     <mailto:genabel-devel at lists.r-forge.r-project.org>
>> >
>> https://lists.r-forge.r-project.org/cgi-bin/mailman/listinfo/genabel-devel
>> >
>> >
>> >
>> >
>> > --
>> > -----------------------------------------------------
>> > Yurii S. Aulchenko
>> >
>> > [ LinkedIn <http://nl.linkedin.com/in/yuriiaulchenko> ] [ Twitter
>> > <http://twitter.com/YuriiAulchenko> ] [ Blog
>> > <http://yurii-aulchenko.blogspot.nl/> ]
>>
>> --
>> -----------------------------------------------------------------
>> L.C. Karssen
>> Utrecht
>> The Netherlands
>>
>> lennart at karssen.org
>> http://blog.karssen.org
>>
>> Stuur mij aub geen Word of Powerpoint bestanden!
>> Zie http://www.gnu.org/philosophy/no-word-attachments.nl.html
>> ------------------------------------------------------------------
>>
>
>
>
> --
> -----------------------------------------------------
> Yurii S. Aulchenko
>
> [ LinkedIn <http://nl.linkedin.com/in/yuriiaulchenko> ] [ Twitter<http://twitter.com/YuriiAulchenko>] [
> Blog <http://yurii-aulchenko.blogspot.nl/> ]
>


-- 
-----------------------------------------------------
Yurii S. Aulchenko

[ LinkedIn <http://nl.linkedin.com/in/yuriiaulchenko> ] [
Twitter<http://twitter.com/YuriiAulchenko>] [
Blog <http://yurii-aulchenko.blogspot.nl/> ]

From lennart at karssen.org  Thu Jul 11 23:56:37 2013
From: lennart at karssen.org (L.C. Karssen)
Date: Thu, 11 Jul 2013 23:56:37 +0200
Subject: [GenABEL-dev] ProbABEL, chi^2, Wald and log-likelihood
Message-ID: <51DF2A15.4020607@karssen.org>

Dear all,

For the upcoming release of ProbABEL I've run into the following. In the
past (~ v 0.1-3) the output of ProbABEL had chi^2 values when doing Cox
regression. These were based on the likelihood ratio test:
 2 * (loglik - loglik_null) ~ chi_1^2
However, at some point, when missing data was allowed in
ProbABEL, we ran into the problem that the null model had to be
recalculated for cases with missing genotype data. To do that 'simply'
for each SNP would be time consuming, so the chi^2 values were removed
from the output and replaced by the loglik values for the full model.
(At least, that's how I guess it went).

Now, I would like to get them back. This can be done in two ways:
1) calculate chi^2 as described above, with some smart way of only
recalculating the null model when a missing value occurs (this shouldn't
be often with today's imputed data).
2) simply calculate the chi^2 value through the Wald test. We have betas
and se_betas, so that is easy.

Many of you have more knowledge about statistics than I do, so,
statistically, are these methods equivalent? Or is one better (more
precise/unbiased) than the other?
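
For reference, the two candidate statistics can be written down in a few
lines. This is only a minimal sketch in Python, not ProbABEL code: the
function names are made up for illustration, and the numbers in the usage
lines are invented. It assumes the null and full models were fitted on the
same set of individuals, which is exactly what breaks when a SNP has missing
genotypes.

```python
import math


def lrt_chi2(loglik, loglik_null):
    """Likelihood ratio statistic: 2 * (loglik - loglik_null)."""
    return 2.0 * (loglik - loglik_null)


def wald_chi2(beta, se_beta):
    """1-df Wald statistic: (beta / se_beta)^2."""
    return (beta / se_beta) ** 2


def chi2_1df_pvalue(chi2):
    """Upper-tail p-value of a chi^2 statistic with 1 degree of freedom."""
    return math.erfc(math.sqrt(chi2 / 2.0))


# Invented numbers, for illustration only.
print(lrt_chi2(-100.0, -102.0))
print(wald_chi2(0.2, 0.1))
print(chi2_1df_pvalue(3.841458820694124))
```

Both statistics are referred to the same chi_1^2 distribution, so the same
p-value helper applies to either of them.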


Another question:
While testing the Wald-type implementation I ran into the following:
I would assume that for the 2df models (where we get beta_SNP_A1A2 and
beta_SNP_A1A1) the final chi^2 value would be the sum of the individual
Wald statistics, which would be distributed as chi_2^2 (so 2 df). Is
that correct? I ask this because if I compare them with the chi^2 values
from the LRT I get different values. In the example data set I get:
name       chi^2_Wald  chi^2_LRT
rs7247199  0.880949    0.452465
rs8102643  0.0116651   0.512709   <- here we have a missing value!
rs8102615  1.51434     0.754701
rs8105536  2.56337     1.33223
rs2312724  0.492364    0.256649

When running the additive model I do get (almost) the same results:
name       chi^2_Wald  chi^2_LRT
rs7247199  0.0101558   0.01012
rs8102643  0.353168    0.492147   <- here we have a missing value!
rs8102615  0.0181841   0.0180033
rs8105536  0.00222781  0.00222216
rs2312724  0.0412005   0.0401556

Shouldn't the chi^2 values be equal in both cases? FYI: the LRT chi^2
values are the same as those obtained with ProbABEL v0.1-3.


Any suggestions?
Thanks,

Lennart.

--
-----------------------------------------------------------------
L.C. Karssen
Utrecht
The Netherlands

lennart at karssen.org
http://blog.karssen.org

Stuur mij aub geen Word of Powerpoint bestanden!
Zie http://www.gnu.org/philosophy/no-word-attachments.nl.html
------------------------------------------------------------------


From yurii.aulchenko at gmail.com  Fri Jul 12 01:41:47 2013
From: yurii.aulchenko at gmail.com (Yurii Aulchenko)
Date: Fri, 12 Jul 2013 01:41:47 +0200
Subject: [GenABEL-dev] ProbABEL, chi^2, Wald and log-likelihood
In-Reply-To: <51DF2A15.4020607@karssen.org>
References: <51DF2A15.4020607@karssen.org>
Message-ID: <CAHX9t6+sPZyK3aCB9wErLbAy7ret=43c3bKhPa_H0nzYTQ2P0g@mail.gmail.com>

In principle, the score, Wald, and likelihood ratio (LRT) tests should give
similar answers in non-extreme cases. The LRT is theoretically the best of
the three (if the underlying model assumptions, e.g. normality, hold). The
score and Wald tests are approximations to the LRT evaluated at the null
and at the alternative, respectively. They actually ARE derived from
quadratic approximations of the likelihood function at these points :)

As for the practical advantages/disadvantages of these, maybe someone else
could comment. I remember there are good and bad sides to each...

Re: Wald on 2df - you cannot simply add the Wald statistics from the
individual beta/se pairs; you need to take the covariance between the two
estimates into account. For a full treatment of the problem, see

http://www.math.chalmers.se/~wermuth/pdfs/86-95/CoxWer90_An_approximation_to_ML.pdf

For a simple variant, I think our ProbABEL paper does give some details on
score/Wald.
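
To make the covariance point concrete, here is a minimal Python sketch of the
2-df Wald statistic W = b' V^-1 b for two estimates (say beta_SNP_A1A2 and
beta_SNP_A1A1) with a 2x2 covariance matrix V. The function names are
hypothetical, and it is an assumption that the covariance between the two
estimates is available from the fit; the thread only mentions betas and
se_betas.

```python
import math


def wald_chi2_2df(b1, b2, var1, var2, cov12):
    """2-df Wald statistic b' V^-1 b, V = [[var1, cov12], [cov12, var2]]."""
    det = var1 * var2 - cov12 ** 2
    if det <= 0.0:
        raise ValueError("covariance matrix is not positive definite")
    # Closed-form inverse of the 2x2 matrix applied to (b1, b2).
    return (b1 * b1 * var2 - 2.0 * b1 * b2 * cov12 + b2 * b2 * var1) / det


def naive_sum(b1, b2, var1, var2):
    """Sum of the two 1-df Wald statistics; ignores the covariance."""
    return b1 * b1 / var1 + b2 * b2 / var2


def chi2_2df_pvalue(chi2):
    """Upper-tail p-value of a chi^2 statistic with 2 degrees of freedom."""
    return math.exp(-chi2 / 2.0)


# Invented numbers: with a non-zero covariance the two statistics differ.
print(wald_chi2_2df(1.0, 1.0, 1.0, 1.0, 0.5))
print(naive_sum(1.0, 1.0, 1.0, 1.0))
```

When cov12 is zero the two functions agree, so the plain sum is only valid
for uncorrelated estimates.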

Would it be a good idea to bring this topic to our "Journal club"? These
are topics of general interest, irrespective of GenABEL.

best,
Yurii



-- 
-----------------------------------------------------
Yurii S. Aulchenko

[ LinkedIn <http://nl.linkedin.com/in/yuriiaulchenko> ] [
Twitter<http://twitter.com/YuriiAulchenko>] [
Blog <http://yurii-aulchenko.blogspot.nl/> ]

From yurii.aulchenko at gmail.com  Fri Jul 12 12:19:29 2013
From: yurii.aulchenko at gmail.com (Yurii Aulchenko)
Date: Fri, 12 Jul 2013 12:19:29 +0200
Subject: [GenABEL-dev] Fwd: [R-Forge] Downtime 15 July
References: <20130712101528.4B2BA185606@r-forge.r-project.org>
Message-ID: <4309418709146154992@unknownmsgid>

FYI

----------------------
Yurii Aulchenko
(sent from mobile device)

Begin forwarded message:

*From:* <noreply at r-forge.r-project.org>
*Date:* 12 July 2013 12:15:28 CEST
*To:* yurii.aulchenko at gmail.com
*Subject:* *[R-Forge] Downtime 15 July*

Dear R-Forge users

Packages are available for download once again and the build process has
been restarted.

A second (complete sitewide) downtime has to be announced. It will be next
Monday on 15th July starting at around 9:00 CEST and will last up to 1 day.
That should be however the last downtime necessary associated with the
relocation of the WU campus.

We apologize for any inconvenience

The R-Forge Team

From yurii.aulchenko at gmail.com  Sun Jul 14 19:43:49 2013
From: yurii.aulchenko at gmail.com (Yurii Aulchenko)
Date: Sun, 14 Jul 2013 19:43:49 +0200
Subject: [GenABEL-dev] presentation at UseR!-2013
In-Reply-To: <CAHX9t6KqDGPJVHiyPKiYAVz5-F8F39ROqOjapR7qK8C7pgM_hg@mail.gmail.com>
References: <CAHX9t6+_vyEwPnS9GKBg6SeizZe4NJTSEA6-ZwTQWE3QkgkPKQ@mail.gmail.com>
 <51D6A04F.7050708@karssen.org>
 <CAHX9t6+p25aXJiJOAxhAF8=VNjRcKX1s2R69hVhZFk7QEEnVbg@mail.gmail.com>
 <51D84188.6010306@karssen.org>
 <CAHX9t6+2kBd0c1UYmLR0VNK4mfgSvt7B+uE_SXzGa0Fvze_8Dg@mail.gmail.com>
 <CAHX9t6KqDGPJVHiyPKiYAVz5-F8F39ROqOjapR7qK8C7pgM_hg@mail.gmail.com>
Message-ID: <CAHX9t6Lk5GUe1U_B6yfnkpbS2At0_3X=mOq9Bgp85kZWTphzCA@mail.gmail.com>

Dear All,

I have written a short report on the UseR! conference; see

http://www.genabel.org/news20130714

for the links.

best wishes, and thanks again for your help,

Yurii




-- 
-----------------------------------------------------
Yurii S. Aulchenko

[ LinkedIn <http://nl.linkedin.com/in/yuriiaulchenko> ] [
Twitter<http://twitter.com/YuriiAulchenko>] [
Blog <http://yurii-aulchenko.blogspot.nl/> ]

From lennart at karssen.org  Sun Jul 14 22:00:38 2013
From: lennart at karssen.org (L.C. Karssen)
Date: Sun, 14 Jul 2013 22:00:38 +0200
Subject: [GenABEL-dev] ProbABEL, chi^2, Wald and log-likelihood
In-Reply-To: <CAHX9t6+sPZyK3aCB9wErLbAy7ret=43c3bKhPa_H0nzYTQ2P0g@mail.gmail.com>
References: <51DF2A15.4020607@karssen.org>
 <CAHX9t6+sPZyK3aCB9wErLbAy7ret=43c3bKhPa_H0nzYTQ2P0g@mail.gmail.com>
Message-ID: <51E30366.4010201@karssen.org>

Thanks for the explanation Yurii.

On 12-07-13 01:41, Yurii Aulchenko wrote:
> In principle score, Wald, and LRT have to give similar answers in
> non-extreme cases. LRT is theoretically the most superior method (if
> underlying model assumptions, e.g. normality, hold).  Score / Wald are
> the approximations to LRT derived at the point of null/alternative,
> respectively. They actually ARE derived from quadratic approximations of
> the likelihood function derived at these points :)

Interesting! I didn't know that.

> 
> As for practical advantages/disadvantages of these, maybe someone else
> could comment. I remember there are good/bad sides in both...
> 
> Re: Wald on 2df - you cannot add Walds from individual beta/se, you
> need to take the covariance into account.

I see, I guess adding them is only allowed when the two are independent
(hence no covariance). Right?

> For full treatment of the
> problem, see
> 
> http://www.math.chalmers.se/~wermuth/pdfs/86-95/CoxWer90_An_approximation_to_ML.pdf
> 

Thanks. Not an easy piece to read...

> For a simple variant, I think our ProbABEL paper does give some details
> on score/Wald. 
> 
> Would it be a good idea to put this discussion topic to our "Journal
> club"? - these are kind of topics of general interest irrespective of
> GenABEL.
> 

Good idea. I'll see if I can find the time to start the discussion there.


Best,

Lennart.



-- 
-----------------------------------------------------------------
L.C. Karssen
Utrecht
The Netherlands

lennart at karssen.org
http://blog.karssen.org

Stuur mij aub geen Word of Powerpoint bestanden!
Zie http://www.gnu.org/philosophy/no-word-attachments.nl.html
------------------------------------------------------------------


From alvaro.frank at rwth-aachen.de  Mon Jul 15 17:07:26 2013
From: alvaro.frank at rwth-aachen.de (Alvaro Jesus Frank)
Date: Mon, 15 Jul 2013 17:07:26 +0200
Subject: [GenABEL-dev]  multiple ProbABEL's palinear runs
Message-ID: <fbb490a416c2be.51e42c4e@rwth-aachen.de>


Dear all,

I am working on a high-performance implementation of an ordinary least squares (OLS) estimator, similar to the one implemented in ProbABEL's palinear (without the --mmscore option), where X holds the given SNPs and Y the phenotypes.
(As given in section 7, "Methodology", of the ProbABEL manual at http://www.genabel.org/sites/default/files/pdfs/ProbABEL_manual.pdf)


 b = (X'*X)^-1 * X' * y.

The goal is to solve this with multiple design matrices (SNPs??) X and Phenotypes Y. For this we compute the formula as

for each X
   for each Y
       b=(X'*X)^-1 * X' * y.


We want to offer the GenABEL community an Estimator to be used in the same way people use the current tools (ProbABEL in R), but faster, and capable of handling LARGE datasets (in disk & memory).
That is why I am writing it in C++, while making sure that it can be called directly from R.
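
As a point of reference for the double loop above, here is a small Python
sketch that solves the normal equations (X'X) b = X'y for every pair of
design matrix and phenotype. It is an illustration under stated assumptions,
not the planned C++ implementation: the names ols and fit_all are
hypothetical, the matrices are plain lists of rows, and the data are assumed
to be complete (no missing values).

```python
def ols(X, y):
    """Solve (X'X) b = X'y by Gauss-Jordan elimination with pivoting."""
    n, p = len(X), len(X[0])
    # Augmented normal equations [X'X | X'y], one row per coefficient.
    A = [[sum(X[k][i] * X[k][j] for k in range(n)) for j in range(p)]
         + [sum(X[k][i] * y[k] for k in range(n))]
         for i in range(p)]
    for col in range(p):
        # Partial pivoting: pick the largest remaining entry in this column.
        pivot = max(range(col, p), key=lambda r: abs(A[r][col]))
        if abs(A[pivot][col]) < 1e-12:
            raise ValueError("X'X is singular")
        A[col], A[pivot] = A[pivot], A[col]
        for r in range(p):
            if r != col:
                f = A[r][col] / A[col][col]
                A[r] = [a - f * c for a, c in zip(A[r], A[col])]
    return [A[i][p] / A[i][i] for i in range(p)]


def fit_all(Xs, Ys):
    """Estimate b for every (design matrix, phenotype) pair."""
    return {(i, j): ols(X, y)
            for i, X in enumerate(Xs)
            for j, y in enumerate(Ys)}


# Toy data: an intercept column plus one SNP dosage column, two phenotypes.
Xs = [[[1.0, 0.0], [1.0, 1.0], [1.0, 2.0]]]
Ys = [[1.0, 3.0, 5.0], [0.0, 1.0, 2.0]]
print(fit_all(Xs, Ys))
```

Since X'X depends only on X, a real implementation would presumably compute
or factor it once per design matrix and reuse it for all phenotypes instead
of rebuilding it inside the inner loop as this sketch does.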

My understanding:
A few concerns came to mind when researching the workflow in using OMICS data in Linear Estimators.
There seems to be a long process before the real-life data from MaCH (test.mldose? for X and mlinfo? for Y) that is sitting in files can be used in calculations. The first concern is how to obtain the design matrices X from the files.

It is my understanding that there are two types of data, imputed data and databel data. Either way, data seems to be pre-processed early in the workflow; my impression is that this preprocessing is done in R. It also seems that R can't handle large amounts of data loaded in memory at once.

From what I see, data comes with some irregularities in its values (missing values, invalid rows in X/Y matrices), and this makes it difficult to use Linear Estimators right away; this is why the preprocessing exists. DatABEL seems to be the R tool (implemented in C++) that can do fast pre-processing of big sets of data. Well, I think that DatABEL only does the reading and writing of files in C++ (called filevector), while the pre-processing functions are defined and implemented in R. Am I correct?


My Problems:
This is where my troubles start. Since I am trying to make this tool usable for the GenABEL community while still being able to handle TERABYTES of data with fast computations, I would really like to include the preprocessing of X and Y into my C++ workflow. To solve the memory and performance limitations of R, I am trying to load the data from disk from within C++. Since I am performing my estimator function in C++, it expects those matrices to have numbers that can be used for computation. Assuming that data must be preprocessed to be able to get valid matrices with usable numbers, I have the following options:

A)
For performance reasons, I was considering having the data already pre-processed in disk files. Is this feasible? (Preprocessed data would take at most as much disk space as the original data; is this cumbersome?)

B)
If there are only a few preprocessing functions that people use, I could re-implement them inside C++ and use them on the fly while loading the data from disk. This would be more difficult if everyone has their own customized R pre-processing functions.

C)
Another alternative is to allow users to use their own R pre-processing functions that pre-process the data. I would then go about preprocessing on the fly from inside C++ by doing calls back to R. This would be slower and harder to do than B).

D)
If DatABEL really does all the necessary pre-processing from inside C++, I could just use it directly, or allow the user to specify what to use, and I would not need to re-implement the pre-processing functions. It seems, though, that preprocessing the data into the DatABEL filevector format takes from 30 minutes to an hour.


I would really appreciate any help that would clarify my understanding of how the pre-processing of data works and where it fits in the work-flow.
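
To make option B) more concrete, here is a minimal sketch of one pre-processing step applied to a block of data right after it has been read from disk. It is only an illustration under several assumptions: the block is stored column-major, missing values are encoded as NaN, and mean imputation is the wanted treatment. The function name impute_column_means is hypothetical and is not part of DatABEL, filevector or OmicABEL.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical helper (assumption, not an existing GenABEL API):
// replaces NaN entries in each column of a column-major n_rows x n_cols
// block with the mean of the non-missing entries of that column.
// Returns the number of values that were imputed.
std::size_t impute_column_means(std::vector<double>& block,
                                std::size_t n_rows, std::size_t n_cols)
{
    assert(block.size() == n_rows * n_cols);
    std::size_t imputed = 0;
    for (std::size_t j = 0; j < n_cols; ++j) {
        double* col = block.data() + j * n_rows;

        // First pass: mean of the observed (non-NaN) values.
        double sum = 0.0;
        std::size_t n_obs = 0;
        for (std::size_t i = 0; i < n_rows; ++i) {
            if (!std::isnan(col[i])) {
                sum += col[i];
                ++n_obs;
            }
        }
        if (n_obs == 0) {
            continue;  // column entirely missing: left untouched
        }
        const double mean = sum / static_cast<double>(n_obs);

        // Second pass: fill in the missing entries.
        for (std::size_t i = 0; i < n_rows; ++i) {
            if (std::isnan(col[i])) {
                col[i] = mean;
                ++imputed;
            }
        }
    }
    return imputed;
}
```

Such a function could be called on each block as it arrives from disk, so that the estimator only ever sees usable numbers. Whether mean imputation is appropriate depends on what the existing R pre-processing actually does.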

Best regards,

- Alvaro Frank

From yurii.aulchenko at gmail.com  Mon Jul 15 22:02:09 2013
From: yurii.aulchenko at gmail.com (Yurii Aulchenko)
Date: Mon, 15 Jul 2013 22:02:09 +0200
Subject: [GenABEL-dev] layout of GenABEL main page
In-Reply-To: <CAHX9t6KV69R_KLLJ_0Kqnys+nndAs=J-_s3kZHUd7N4dEVObgQ@mail.gmail.com>
References: <fb5e10b31a0eb5.51d2d197@aices.rwth-aachen.de>
 <51D2C34D.2000907@gmail.com>
 <CAHX9t6LXPDT7UR3+Dn1htv1pWBuvJtSAn_zi3ANmt3FYa5pnLw@mail.gmail.com>
 <0177E59A-0CA1-4465-8186-A8EC79A20BB4@burlo.trieste.it>
 <CAHX9t6KDjHXVM8PLWcNxr0CNGJg=LcKvLcmMSOYfotd5scXomA@mail.gmail.com>
 <6632A424-420E-423B-957A-3B8481DD0122@burlo.trieste.it>
 <CAHX9t6KV69R_KLLJ_0Kqnys+nndAs=J-_s3kZHUd7N4dEVObgQ@mail.gmail.com>
Message-ID: <CAHX9t6+z+rRpaZMWUyyN2H+MH86tBbZ6sSX+4PoUzsYXuG2deA@mail.gmail.com>

Dear All,

a small update - I have original vector graphics files from Grant at my
disposal; if some people would like to play with these files, send me a
message and I can forward the vector files to you.

best,
Yurii

On Fri, Jul 5, 2013 at 3:09 PM, Yurii Aulchenko
<yurii.aulchenko at gmail.com> wrote:

>
>
> On Fri, Jul 5, 2013 at 3:05 PM, Nicola Pirastu <
> nicola.pirastu at burlo.trieste.it> wrote:
>
>>  I agree, in the end it's not the coca-cola logo and we have not been
>> using it for years so I don't think people are going to be confused if the
>> Logo changes in a few months.
>>
>>
> More than that - I really think it should evolve as our project does :)
>
>
>
>>  I am actually curious to see how it will look on the forum. I do think
>> that if it's not too much work, the colors of the forum and website should
>> match those of the logo though.
>>
>
> Yep. I am now starting to understand why people were giving cost estimates
> of a few thousand euro for that basic design package: e.g. for
> facebook we need a cover and an avatar (the latter would do for twitter as
> well). So this is a whole project :)
>
> Maybe later we should think of inviting some guys from a design school -
> they must be looking for graduation projects, and maybe they would
> be willing to do that for free :)
>
> YA
>
>
>>
>>  Nicola
>>
>>
>> Dr. Nicola Pirastu PhD
>> Research Fellow
>> Medical Sciences, Chirurgical and Health Department
>> University of Trieste
>> Medical Genetics
>> IRCCS Burlo Garofolo
>> Via dell'Istria 65/1
>> 34137 Italy
>> tel. +390403785539
>>
>>  On 05/Jul/2013, at 14:55, Yurii Aulchenko <
>> yurii.aulchenko at gmail.com> wrote:
>>
>> I suggest that for the moment we go with what we have (Grant's variant);
>> we can change later.
>>
>>  Please let me know if you have a strong opinion against! - I really
>> would like to use the logo for my presentation and also play a bit how well
>> it fits our pages (genabel.org, facebook, twitter)
>>
>>  YA
>>
>> On Tue, Jul 2, 2013 at 4:27 PM, Nicola Pirastu <
>> nicola.pirastu at burlo.trieste.it> wrote:
>>
>>> Just to add my two cents to the discussion,
>>>
>>>  I think that the problem is not with the DNA helix but with the font.
>>> I've played around a bit with it and if you use for example Helvetica or
>>> something less comic-sans-like it does look better. Also for some reason
>>> I'm still disturbed by the green but it is a very personal opinion..
>>>
>>>  Nicola
>>>
>>>  Dr. Nicola Pirastu PhD
>>> Research Fellow
>>> Medical Sciences, Chirurgical and Health Department
>>> University of Trieste
>>> Medical Genetics
>>> IRCCS Burlo Garofolo
>>> Via dell'Istria 65/1
>>> 34137 Italy
>>> tel. +390403785539
>>>
>>>  On 02/Jul/2013, at 14:38, Yurii Aulchenko <
>>> yurii.aulchenko at gmail.com> wrote:
>>>
>>> Dear All,
>>>
>>>  I agree with Maarten's critique, and I am actually still not sure whether
>>> I like Maarten's or Grant's idea better. The interesting thing is that - not
>>> sure everyone realizes it - Grant's variant is his vision of Maarten's
>>> prototype :) However, Grant's variant has an important advantage - it is
>>> ready to serve as a logo. And I actually want to use a logo in my slides for
>>> UseR!-2013.
>>>
>>>  So I suggest we take Grant's logo as a working variant. No doubt that
>>> the logo is going to evolve with time - as anything we do in the project -
>>> code, documentation; logo is no different, I think. The element which is
>>> going to stay and keep it recognizable is the way of spelling the GenABEL
>>> :) - Like Gnu's horns in the GNU logo.
>>>
>>>  What we can do next is to place an open call on site/forum for other
>>> users to contribute, but this is going to take time, and meanwhile I
>>> suggest to stick with Grant's variant.
>>>
>>>  Yurii
>>>
>>> On Tue, Jul 2, 2013 at 2:10 PM, Maarten Kooyman <kooyman at gmail.com> wrote:
>>>
>>>> Dear all,
>>>>
>>>>
>>>> It looks really nice! Credit to whoever made it. However, I get more
>>>> the impression that it looks like a polypeptide chain or a rosary. The
>>>> seventies font is a matter of taste, but it reminds me of Comic
>>>> Sans (including an upside-down e as an a). I wonder if it is readable if
>>>> you print it on a poster: I think this is an important use case for a
>>>> scientific logo.
>>>>
>>>> Kind regards,
>>>>
>>>>
>>>> Maarten
>>>>
>>>>
>>>>
>>>>
>>>> On 07/02/2013 01:11 PM, Diego Fabregat Traver wrote:
>>>>
>>>>>  On 28/06/13, Yurii Aulchenko  <yurii.aulchenko at gmail.com> wrote:
>>>>>
>>>>>  How do you like this one?
>>>>>>
>>>>> I like it a lot.
>>>>>
>>>>> What do you think about reducing the font size for the subtitle
>>>>> and right-justifying it? Would it still be readable? I liked that
>>>>> detail from the previous attempts with the "Project" subtitle.
>>>>>
>>>>> In any case, this is just a minor detail. It looks great as it is.
>>>>>
>>>>> Thanks to Grant Borodin!
>>>>>
>>>>>
>>>>>> YA
>>>>>>
>>>>>>
>>>>>> On Thu, Jun 27, 2013 at 1:16 PM, Yurii Aulchenko <
>>>>>> yurii.aulchenko at gmail.com> wrote:
>>>>>>
>>>>>>
>>>>>>  Dear Nicola, Diego, Lennart,
>>>>>>>
>>>>>>>
>>>>>>> Thanks for your feedback! I will ask Grant Borodin, who kindly
>>>>>>> designed these logos, if he could change C according to your comment
>>>>>>> (capital "ABEL" and "statistical genomics" as in F).
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> Yurii
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> On Wed, Jun 26, 2013 at 4:16 PM, Diego Fabregat Traver <
>>>>>>> fabregat at aices.rwth-aachen.de> wrote:
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>> Congrats to whoever designed these logos, they look very nice :)
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>> With respect to my preferences, I fully agree with Lennart: "C with
>>>>>>>> capital ABEL and statistical genomics below it" would be my choice.
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>> Best,
>>>>>>>>
>>>>>>>> Diego
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>> On 20/06/13, "L.C. Karssen"  <lennart at karssen.org>
>>>>>>>> wrote:
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>>  Wow! Those look really nice!
>>>>>>>>> I like options C and F the most. Actually a combination would be
>>>>>>>>> even
>>>>>>>>> better IMHO: use C with capital ABEL and statistical genomics
>>>>>>>>> below it.
>>>>>>>>> Looking forward to hear the opinion of others,
>>>>>>>>> Lennart.
>>>>>>>>> On 20-06-13 09:34, Yurii Aulchenko wrote:
>>>>>>>>>
>>>>>>>>>> Please find attached few more logo variants
>>>>>>>>>> Yurii
>>>>>>>>>>
>>>>>>>>>
>>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>
>>>>>>
>>>>>>   _______________________________________________
>>>>> genabel-devel mailing list
>>>>> genabel-devel at lists.r-forge.r-project.org
>>>>> https://lists.r-forge.r-project.org/cgi-bin/mailman/listinfo/genabel-devel
>>>>>
>>>>
>>>>
>>>
>>>
>>>
>>>  --
>>> -----------------------------------------------------
>>> Yurii S. Aulchenko
>>>
>>>  [ LinkedIn <http://nl.linkedin.com/in/yuriiaulchenko> ] [ Twitter<http://twitter.com/YuriiAulchenko>] [
>>> Blog <http://yurii-aulchenko.blogspot.nl/> ]
>>>
>>>
>>> AVVISO DI RISERVATEZZA Informazioni riservate possono essere contenute
>>> nel messaggio o nei suoi allegati. Se non siete i destinatari indicati nel
>>> messaggio, o responsabili per la sua consegna alla persona, o se avete
>>> ricevuto il messaggio per errore, siete pregati di non trascriverlo,
>>> copiarlo o inviarlo a nessuno. In tal caso vi invitiamo a cancellare il
>>> messaggio ed i suoi allegati. Grazie. CONFIDENTIALITY NOTICE Confidential
>>> information may be contained in this message or in its attachments. If you
>>> are not the addressee indicated in this message, or responsible for message
>>> delivering to that person, or if you have received this message in error,
>>> you may not transcribe, copy or deliver this message to anyone. In that
>>> case, you should delete this message and its attachments. Thank you.
>>>
>>
>>
>>
>>  --
>> -----------------------------------------------------
>> Yurii S. Aulchenko
>>
>>  [ LinkedIn <http://nl.linkedin.com/in/yuriiaulchenko> ] [ Twitter<http://twitter.com/YuriiAulchenko>] [
>> Blog <http://yurii-aulchenko.blogspot.nl/> ]
>>
>>
>>
>
>
>
> --
> -----------------------------------------------------
> Yurii S. Aulchenko
>
> [ LinkedIn <http://nl.linkedin.com/in/yuriiaulchenko> ] [ Twitter<http://twitter.com/YuriiAulchenko>] [
> Blog <http://yurii-aulchenko.blogspot.nl/> ]
>


-- 
-----------------------------------------------------
Yurii S. Aulchenko

[ LinkedIn <http://nl.linkedin.com/in/yuriiaulchenko> ] [
Twitter<http://twitter.com/YuriiAulchenko>] [
Blog <http://yurii-aulchenko.blogspot.nl/> ]

From yurii.aulchenko at gmail.com  Mon Jul 15 10:06:55 2013
From: yurii.aulchenko at gmail.com (Yurii Aulchenko)
Date: Mon, 15 Jul 2013 10:06:55 +0200
Subject: [GenABEL-dev] ProbABEL, chi^2, Wald and log-likelihood
In-Reply-To: <51E30366.4010201@karssen.org>
References: <51DF2A15.4020607@karssen.org>
 <CAHX9t6+sPZyK3aCB9wErLbAy7ret=43c3bKhPa_H0nzYTQ2P0g@mail.gmail.com>
 <51E30366.4010201@karssen.org>
Message-ID: <CAHX9t6LF=D38wruFAhgoHZM=e4MMPAFmh4AMBusN3+bujDHHwA@mail.gmail.com>

On Sun, Jul 14, 2013 at 10:00 PM, L.C. Karssen <lennart at karssen.org> wrote:

> Thanks for the explanation Yurii.
>
> On 12-07-13 01:41, Yurii Aulchenko wrote:
> > In principle score, Wald, and LRT have to give similar answers in
> > non-extreme cases. LRT is theoretically the most superior method (if
> > underlying model assumptions, e.g. normality, hold).  Score / Wald are
> > the approximations to LRT derived at the point of null/alternative,
> > respectively. They actually ARE derived from quadratic approximations of
> > the likelihood function derived at these points :)
>
> Interesting! I didn't know that.
>

Yep, this is quite interesting. I think David Clayton's book (Statistical
Models in Epidemiology?) gives a very simple and clear explanation of how you
get to the score and Wald from LRT - very nice reading.
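
A brief sketch of the standard one-parameter argument, given here only as an illustration and not as the book's exact presentation. It assumes a null hypothesis beta = 0 and the usual regularity conditions; l is the log-likelihood, U = l' the score, I = -l'' the observed information, and se(beta_hat)^2 = 1/I(beta_hat):

```latex
% Wald: second-order expansion of the log-likelihood at the MLE,
% where the first derivative vanishes.
\ell(\beta) \approx \ell(\hat\beta) - \tfrac{1}{2}\, I(\hat\beta)\,(\beta - \hat\beta)^2
\quad\Longrightarrow\quad
2\,[\ell(\hat\beta) - \ell(0)] \approx I(\hat\beta)\,\hat\beta^{2}
  = \left( \frac{\hat\beta}{\mathrm{se}(\hat\beta)} \right)^{2}

% Score: second-order expansion at the null value beta = 0,
% maximised at \tilde\beta = U(0) / I(0).
\ell(\beta) \approx \ell(0) + U(0)\,\beta - \tfrac{1}{2}\, I(0)\,\beta^{2}
\quad\Longrightarrow\quad
2\,[\ell(\tilde\beta) - \ell(0)] \approx \frac{U(0)^{2}}{I(0)}
```

Both right-hand sides are compared with a chi^2 distribution with 1 df, just like the LRT itself.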


>
> >
> > As for practical advantages/disadvantages of these, maybe someone else
> > could comment. I remember there are good/bad sides in both...
> >
> > Re: Wald on 2df - you cannot add the Walds from the individual beta/se;
> > you need to take the covariance into account.
>
> I see, I guess adding them is only allowed when the two are independent
> (hence no covariance). Right?
>

True. And zero-covariance is definitely not the case with the 2df test :)
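
As an illustration of the point about covariance, here is a minimal C++ sketch of a 2-df Wald statistic that uses the full 2x2 covariance matrix of the two estimates. The function name and its arguments are hypothetical and are not taken from ProbABEL; the formula is the standard quadratic form b' V^{-1} b written out for the 2x2 case.

```cpp
#include <cassert>

// Hypothetical helper (assumption, not ProbABEL code):
// 2-df Wald statistic  W = b' V^{-1} b  for estimates (b1, b2) with
// covariance matrix V = [[var1, cov12], [cov12, var2]].
// Under the null hypothesis W is compared with a chi^2 distribution, 2 df.
double wald_2df(double b1, double b2,
                double var1, double var2, double cov12)
{
    const double det = var1 * var2 - cov12 * cov12;
    assert(det > 0.0);  // V must be positive definite
    return (b1 * b1 * var2 - 2.0 * b1 * b2 * cov12 + b2 * b2 * var1) / det;
}
```

With cov12 = 0 this reduces to (b1^2 / var1) + (b2^2 / var2), i.e. the sum of the two individual Wald statistics. With a non-zero covariance the two values differ: for b1 = b2 = 1, var1 = var2 = 1 and cov12 = 0.5 the function gives 4/3, while the naive sum gives 2.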


>
> > For full treatment of the
> > problem, see
> >
> >
> http://www.math.chalmers.se/~wermuth/pdfs/86-95/CoxWer90_An_approximation_to_ML.pdf
> >
>
> Thanks. Not an easy piece to read...
>

It is not, but in the end it is simple (see the ProbABEL paper)...
unfortunately this is one of those "simple" things which are "so simple"
after you have figured them out - and after some time you only remember
that they were "simple", but not exactly how they work (this is why I
refer you to papers).


>
> > For a simple variant, I think our ProbABEL paper does give some details
> > on score/Wald.
> >
> > Would it be a good idea to put this discussion topic to our "Journal
> > club"? - these are kind of topics of general interest irrespective of
> > GenABEL.
> >
>
> Good idea. I'll see if I can find the time to start the discussion there.
>
>
> Best,
>
> Lennart.
>
>
> > best,
> > Yurii
> >
> > On Thu, Jul 11, 2013 at 11:56 PM, L.C. Karssen <lennart at karssen.org
> > <mailto:lennart at karssen.org>> wrote:
> >
> >     Dear all,
> >
> >     For the upcoming release of ProbABEL I've run into the following.
> >     In the past (~ v 0.1-3) the output of ProbABEL had chi^2 values when
> >     doing Cox regression. These were based on the likelihood ratio test:
> >      2 * (loglik - loglik_null) ~ chi_1^2
> >     However, at some point, when having missing data was allowed in
> >     ProbABEL, we ran into the problem that the null model had to be
> >     recalculated for cases with missing genotype data. To do that 'simply'
> >     for each SNP would be time consuming, so the chi^2 values were removed
> >     from the output and replaced by the loglik values for the full model.
> >     (At least, that's how I guess it went).
> >
> >     Now, I would like to get them back. This can be done in two ways:
> >     1) calculate chi^2 as described above, with some smart way of only
> >     recalculating the null model when a missing value occurs (this
> >     shouldn't be often with today's imputed data).
> >     2) simply calculate the chi^2 value through the Wald test. We have
> >     betas and se_betas, so that is easy.
> >
> >     Many of you have more knowledge about statistics than I do, so,
> >     statistically, are these methods equivalent? Or is one better (more
> >     precise/unbiased) than the other?
> >
> >
> >     Another question:
> >     While testing the Wald-type implementation I ran into the following:
> >     I would assume that for the 2df models (where we get beta_SNP_A1A2 and
> >     beta_SNP_A1A1) the final chi^2 value would be the sum of the individual
> >     Wald statistics, which would be distributed as chi_2^2 (so 2 df). Is
> >     that correct? I ask this because if I compare them with the chi^2
> >     values from the LRT I get different values. In the example data set
> >     I get:
> >     name       chi^2_Wald   chi^2_LRT
> >     rs7247199  0.880949     0.452465
> >     rs8102643  0.0116651    0.512709    <- here we have a missing value!
> >     rs8102615  1.51434      0.754701
> >     rs8105536  2.56337      1.33223
> >     rs2312724  0.492364     0.256649
> >
> >     When running the additive model I do get (almost) the same results:
> >     name       chi^2_Wald   chi^2_LRT
> >     rs7247199  0.0101558    0.01012
> >     rs8102643  0.353168     0.492147    <- here we have a missing value!
> >     rs8102615  0.0181841    0.0180033
> >     rs8105536  0.00222781   0.00222216
> >     rs2312724  0.0412005    0.0401556
> >
> >     Shouldn't the chi^2 values be equal in both cases? FYI: the LRT chi^2
> >     values are the same as those obtained with ProbABEL v0.1-3.
> >
> >
> >     Any suggestions?
> >     Thanks,
> >
> >     Lennart.
> >
> >     --
> >     -----------------------------------------------------------------
> >     L.C. Karssen
> >     Utrecht
> >     The Netherlands
> >
> >     lennart at karssen.org <mailto:lennart at karssen.org>
> >     http://blog.karssen.org
> >
> >     Please do not send me Word or PowerPoint files!
> >     See http://www.gnu.org/philosophy/no-word-attachments.nl.html
> >     ------------------------------------------------------------------
> >
> >
> >
> >
> >
> >
> > --
> > -----------------------------------------------------
> > Yurii S. Aulchenko
> >
> > [ LinkedIn <http://nl.linkedin.com/in/yuriiaulchenko> ] [ Twitter
> > <http://twitter.com/YuriiAulchenko> ] [ Blog
> > <http://yurii-aulchenko.blogspot.nl/> ]
>
> --
> -----------------------------------------------------------------
> L.C. Karssen
> Utrecht
> The Netherlands
>
> lennart at karssen.org
> http://blog.karssen.org
>
> Please do not send me Word or PowerPoint files!
> See http://www.gnu.org/philosophy/no-word-attachments.nl.html
> ------------------------------------------------------------------
>
>


-- 
-----------------------------------------------------
Yurii S. Aulchenko

[ LinkedIn <http://nl.linkedin.com/in/yuriiaulchenko> ] [
Twitter<http://twitter.com/YuriiAulchenko>] [
Blog <http://yurii-aulchenko.blogspot.nl/> ]

From kooyman at gmail.com  Tue Jul 16 09:13:37 2013
From: kooyman at gmail.com (Maarten Kooyman)
Date: Tue, 16 Jul 2013 09:13:37 +0200
Subject: [GenABEL-dev] layout of GenABEL main page
In-Reply-To: <CAHX9t6+z+rRpaZMWUyyN2H+MH86tBbZ6sSX+4PoUzsYXuG2deA@mail.gmail.com>
References: <fb5e10b31a0eb5.51d2d197@aices.rwth-aachen.de>
 <51D2C34D.2000907@gmail.com>
 <CAHX9t6LXPDT7UR3+Dn1htv1pWBuvJtSAn_zi3ANmt3FYa5pnLw@mail.gmail.com>
 <0177E59A-0CA1-4465-8186-A8EC79A20BB4@burlo.trieste.it>
 <CAHX9t6KDjHXVM8PLWcNxr0CNGJg=LcKvLcmMSOYfotd5scXomA@mail.gmail.com>
 <6632A424-420E-423B-957A-3B8481DD0122@burlo.trieste.it>
 <CAHX9t6KV69R_KLLJ_0Kqnys+nndAs=J-_s3kZHUd7N4dEVObgQ@mail.gmail.com>
 <CAHX9t6+z+rRpaZMWUyyN2H+MH86tBbZ6sSX+4PoUzsYXuG2deA@mail.gmail.com>
Message-ID: <51E4F2A1.4040002@gmail.com>

Hi Yurii,

Under what kind of licence are the logos available? Maybe it would be handy
to put them on the website for easy access.

Kind regards,

Maarten

On 07/15/2013 10:02 PM, Yurii Aulchenko wrote:
> Dear All,
>
> a small update - I have original vector graphics files from Grant at 
> my disposal; if some people would like to play with these files, send 
> me a message and I can forward the vector files to you.
>
> best,
> Yurii
>
> On Fri, Jul 5, 2013 at 3:09 PM, Yurii Aulchenko 
> <yurii.aulchenko at gmail.com <mailto:yurii.aulchenko at gmail.com>> wrote:
>
>
>
>     On Fri, Jul 5, 2013 at 3:05 PM, Nicola Pirastu
>     <nicola.pirastu at burlo.trieste.it
>     <mailto:nicola.pirastu at burlo.trieste.it>> wrote:
>
>         I agree, in the end it's not the coca-cola logo and we have
>         not been using it for years so I don't think people are going
>         to be confused if the Logo changes in a few months.
>
>
>     More than that - I really think it should evolve as our project
>     does :)
>
>         I am actually curious to see how it will look on the forum. I
>         do think that if it's not too much work, the colors of the
>         forum and website should match those of the logo though.
>
>
>     Yep. I am now starting to understand why people were giving cost
>     estimates of a few thousand euro for that basic design
>     package: e.g. for facebook we need a cover and an avatar (the latter
>     would do for twitter as well). So this is a whole project :)
>
>     Maybe later we should think of inviting some guys from a design
>     school - they must be looking for graduation projects, and
>     maybe they would be willing to do that for free :)
>
>     YA
>
>
>         Nicola
>
>
>         Dr. Nicola Pirastu PhD
>         Research Fellow
>         Medical Sciences, Chirurgical and Health Department
>         University of Trieste
>         Medical Genetics
>         IRCCS Burlo Garofolo
>         Via dell'Istria 65/1
>         34137 Italy
>         tel. +390403785539
>
>         On 05/Jul/2013, at 14:55, Yurii Aulchenko
>         <yurii.aulchenko at gmail.com <mailto:yurii.aulchenko at gmail.com>>
>         wrote:
>
>>         I suggest that for the moment we go with what we have
>>         (Grant's variant); we can change later.
>>
>>         Please let me know if you have a strong opinion against! - I
>>         really would like to use the logo for my presentation and
>>         also play a bit how well it fits our pages (genabel.org
>>         <http://genabel.org/>, facebook, twitter)
>>
>>         YA
>>
>>         On Tue, Jul 2, 2013 at 4:27 PM, Nicola Pirastu
>>         <nicola.pirastu at burlo.trieste.it
>>         <mailto:nicola.pirastu at burlo.trieste.it>> wrote:
>>
>>             Just to add my two cents to the discussion,
>>
>>             I think that the problem is not with the DNA helix but
>>             with the font. I've played around a bit with it and if
>>             you use for example Helvetica or something less
>>             comic-sans-like it does look better. Also for some reason
>>             I'm still disturbed by the green but it is a very
>>             personal opinion..
>>
>>             Nicola
>>
>>             Dr. Nicola Pirastu PhD
>>             Research Fellow
>>             Medical Sciences, Chirurgical and Health Department
>>             University of Trieste
>>             Medical Genetics
>>             IRCCS Burlo Garofolo
>>             Via dell'Istria 65/1
>>             34137 Italy
>>             tel. +390403785539
>>
>>             On 02/Jul/2013, at 14:38, Yurii Aulchenko
>>             <yurii.aulchenko at gmail.com
>>             <mailto:yurii.aulchenko at gmail.com>> wrote:
>>
>>>             Dear All,
>>>
>>>             I agree with Maarten's critique, and I am actually still
>>>             not sure whether I like Maarten's or Grant's idea better.
>>>             The interesting thing is that - not sure everyone realizes
>>>             it - Grant's variant is his vision of Maarten's prototype :)
>>>             However, Grant's variant has an important advantage - it
>>>             is ready to serve as a logo. And I actually want to use a
>>>             logo in my slides for UseR!-2013.
>>>
>>>             So I suggest we take Grant's logo as a working variant.
>>>             No doubt that the logo is going to evolve with time - as
>>>             anything we do in the project - code, documentation;
>>>             logo is no different, I think. The element which is
>>>             going to stay and keep it recognizable is the way of
>>>             spelling the GenABEL :) - Like Gnu's horns in the GNU logo.
>>>
>>>             What we can do next is to place an open call on
>>>             site/forum for other users to contribute, but this is
>>>             going to take time, and meanwhile I suggest to stick
>>>             with Grant's variant.
>>>
>>>             Yurii
>>>
>>>             On Tue, Jul 2, 2013 at 2:10 PM, Maarten Kooyman
>>>             <kooyman at gmail.com <mailto:kooyman at gmail.com>> wrote:
>>>
>>>                 Dear all,
>>>
>>>
>>>                 It looks really nice! Credit to whoever made it.
>>>                 However, I get more the impression that it looks
>>>                 like a polypeptide chain or a rosary. The seventies
>>>                 font is a matter of taste, but it reminds me of Comic
>>>                 Sans (including an upside-down e as an a). I wonder if
>>>                 it is readable if you print it on a poster: I think this
>>>                 is an important use case for a scientific logo.
>>>
>>>                 Kind regards,
>>>
>>>
>>>                 Maarten
>>>
>>>
>>>
>>>
>>>                 On 07/02/2013 01:11 PM, Diego Fabregat Traver wrote:
>>>
>>>                     On 28/06/13, Yurii Aulchenko
>>>                      <yurii.aulchenko at gmail.com
>>>                     <mailto:yurii.aulchenko at gmail.com>> wrote:
>>>
>>>                         How do you like this one?
>>>
>>>                     I like it a lot.
>>>
>>>                     What do you think about reducing the font size
>>>                     for the subtitle
>>>                     and right-justifying it? Would it still be
>>>                     readable? I liked that
>>>                     detail from the previous attempts with the
>>>                     "Project" subtitle.
>>>
>>>                     In any case, this is just a minor detail. It
>>>                     looks great as it is.
>>>
>>>                     Thanks to Grant Borodin!
>>>
>>>                         YA
>>>
>>>
>>>                         On Thu, Jun 27, 2013 at 1:16 PM, Yurii
>>>                         Aulchenko <yurii.aulchenko at gmail.com
>>>                         <mailto:yurii.aulchenko at gmail.com>>
>>>                         wrote:
>>>
>>>
>>>                             Dear Nicola, Diego, Lennart,
>>>
>>>
>>>                             Thanks for your feedback! I will ask
>>>                             Grant Borodin, who kindly designed these
>>>                             logos, if he could change C according to
>>>                             your comment (capital "ABEL" and
>>>                             "statistical genomics" as in F).
>>>
>>>
>>>
>>>
>>>                             Yurii
>>>
>>>
>>>
>>>                             On Wed, Jun 26, 2013 at 4:16 PM, Diego
>>>                             Fabregat Traver
>>>                             <fabregat at aices.rwth-aachen.de
>>>                             <mailto:fabregat at aices.rwth-aachen.de>>
>>>                             wrote:
>>>
>>>
>>>
>>>
>>>                                 Congrats to whoever designed these
>>>                                 logos, they look very nice :)
>>>
>>>
>>>
>>>                                 With respect to my preferences, I
>>>                                 fully agree with Lennart: "C with
>>>                                 capital ABEL and statistical
>>>                                 genomics below it" would be my choice.
>>>
>>>
>>>
>>>                                 Best,
>>>
>>>                                 Diego
>>>
>>>
>>>
>>>
>>>
>>>
>>>                                 On 20/06/13, "L.C. Karssen"
>>>                                  <lennart at karssen.org
>>>                                 <mailto:lennart at karssen.org>>
>>>                                 wrote:
>>>
>>>
>>>
>>>                                     Wow! Those look really nice!
>>>                                     I like options C and F the most.
>>>                                     Actually a combination would be even
>>>                                     better IMHO: use C with capital
>>>                                     ABEL and statistical genomics
>>>                                     below it.
>>>                                     Looking forward to hear the
>>>                                     opinion of others,
>>>                                     Lennart.
>>>                                     On 20-06-13 09:34, Yurii
>>>                                     Aulchenko wrote:
>>>
>>>                                         Please find attached few
>>>                                         more logo variants
>>>                                         Yurii
>>>
>>>
>>>
>>>
>>>
>>>
>>>
>>>
>>>
>>>
>>>
>>>
>>>
>>>             -- 
>>>             -----------------------------------------------------
>>>             Yurii S. Aulchenko
>>>
>>>             [ LinkedIn
>>>             <http://nl.linkedin.com/in/yuriiaulchenko> ] [ Twitter
>>>             <http://twitter.com/YuriiAulchenko> ] [ Blog
>>>             <http://yurii-aulchenko.blogspot.nl/> ]
>>
>>
>>
>>
>>
>>         -- 
>>         -----------------------------------------------------
>>         Yurii S. Aulchenko
>>
>>         [ LinkedIn <http://nl.linkedin.com/in/yuriiaulchenko> ] [
>>         Twitter <http://twitter.com/YuriiAulchenko> ] [ Blog
>>         <http://yurii-aulchenko.blogspot.nl/> ]
>
>         AVVISO DI RISERVATEZZA Informazioni riservate possono essere
>         contenute nel messaggio o nei suoi allegati. Se non siete i
>         destinatari indicati nel messaggio, o responsabili per la sua
>         consegna alla persona, o se avete ricevuto il messaggio per
>         errore, siete pregati di non trascriverlo, copiarlo o inviarlo
>         a nessuno. In tal caso vi invitiamo a cancellare il messaggio
>         ed i suoi allegati. Grazie. CONFIDENTIALITY NOTICE
>         Confidential information may be contained in this message or
>         in its attachments. If you are not the addressee indicated in
>         this message, or responsible for message delivering to that
>         person, or if you have received this message in error, you may
>         not transcribe, copy or deliver this message to anyone. In
>         that case, you should delete this message and its attachments.
>         Thank you.
>
>
>
>
>     -- 
>     -----------------------------------------------------
>     Yurii S. Aulchenko
>
>     [ LinkedIn <http://nl.linkedin.com/in/yuriiaulchenko> ] [ Twitter
>     <http://twitter.com/YuriiAulchenko> ] [ Blog
>     <http://yurii-aulchenko.blogspot.nl/> ]
>
>
>
>
> -- 
> -----------------------------------------------------
> Yurii S. Aulchenko
>
> [ LinkedIn <http://nl.linkedin.com/in/yuriiaulchenko> ] [ Twitter 
> <http://twitter.com/YuriiAulchenko> ] [ Blog 
> <http://yurii-aulchenko.blogspot.nl/> ]
>
>
> _______________________________________________
> genabel-devel mailing list
> genabel-devel at lists.r-forge.r-project.org
> https://lists.r-forge.r-project.org/cgi-bin/mailman/listinfo/genabel-devel

-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.r-forge.r-project.org/pipermail/genabel-devel/attachments/20130716/40b0db6b/attachment-0001.html>

From lennart at karssen.org  Tue Jul 16 17:11:56 2013
From: lennart at karssen.org (L.C. Karssen)
Date: Tue, 16 Jul 2013 17:11:56 +0200
Subject: [GenABEL-dev] Creation of genabel-announce mailing list
Message-ID: <51E562BC.20605@karssen.org>

Dear all,

I've created a new mailing list on the r-forge page. The address is
genabel-announce at lists.r-forge.r-project.org and it's intended to be used
by package maintainers to announce new versions of their packages (or
completely new packages), so that users who want to stay up to date only
need to subscribe to this list to be informed.

By default e-mails to this list will be held until approved by the list
owner or a list moderator.
At present I have listed myself and Yurii as list-owners.


Best,

Lennart.


-- 
-----------------------------------------------------------------
L.C. Karssen
Utrecht
The Netherlands

lennart at karssen.org
http://blog.karssen.org

Please do not send me Word or PowerPoint files!
See http://www.gnu.org/philosophy/no-word-attachments.nl.html
------------------------------------------------------------------


From lennart at karssen.org  Tue Jul 16 17:25:55 2013
From: lennart at karssen.org (L.C. Karssen)
Date: Tue, 16 Jul 2013 17:25:55 +0200
Subject: [GenABEL-dev] Creation of genabel-announce mailing list
In-Reply-To: <51E562BC.20605@karssen.org>
References: <51E562BC.20605@karssen.org>
Message-ID: <51E56603.9080809@karssen.org>

I've added an announcement on the GenABEL.org website as well:
http://www.genabel.org/node/284


Lennart.



From lennart at karssen.org  Thu Jul 18 23:33:26 2013
From: lennart at karssen.org (L.C. Karssen)
Date: Thu, 18 Jul 2013 23:33:26 +0200
Subject: [GenABEL-dev] multiple ProbABEL's palinear runs
In-Reply-To: <fbb490a416c2be.51e42c4e@rwth-aachen.de>
References: <fbb490a416c2be.51e42c4e@rwth-aachen.de>
Message-ID: <51E85F26.1030600@karssen.org>

Dear Alvaro,

Thank you for showing interest in the ProbABEL project!

On 15-07-13 17:07, Alvaro Jesus Frank wrote:
>
> Dear all,
>
> I am working on a high-performance implementation of an ordinary
> least squares (OLS) linear estimator, similar to the one implemented in
> ProbABEL's palinear (without the --mmscore option), where X holds the
> given SNPs and Y holds the phenotypes (as described in section 7,
> "Methodology", of the ProbABEL manual at
> http://www.genabel.org/sites/default/files/pdfs/ProbABEL_manual.pdf).
>
>
> b = (X'*X)^-1 * X' * y.
>
> The goal is to solve this with multiple design matrices (SNPs??) X

Indeed, the design matrix contains both SNP data and other covariates
(e.g. sex, age, etc.).

> and Phenotypes Y. For this we compute the formula as
>
> for each X, for each Y: b = (X'*X)^-1 * X' * y.
>
>
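
A minimal sketch of the nested loop quoted above, in plain Python
without any libraries. It is not ProbABEL's code: the function names
(solve, ols, ols_all) are made up for illustration, and solving the
normal equations by Gaussian elimination is just one simple way to
evaluate b = (X'*X)^-1 * X' * y for small examples.

```python
def solve(a, b):
    """Solve the square system a * x = b by Gaussian elimination
    with partial pivoting."""
    n = len(b)
    # Augmented matrix [a | b]; copied so the inputs stay untouched.
    m = [list(row) + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("X'X is singular")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    # Back substitution on the upper-triangular system.
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x


def ols(X, y):
    """Return b = (X'X)^-1 X'y; X is a list of rows, y a list."""
    p = len(X[0])
    xtx = [[sum(row[i] * row[j] for row in X) for j in range(p)]
           for i in range(p)]
    xty = [sum(row[i] * yi for row, yi in zip(X, y)) for i in range(p)]
    return solve(xtx, xty)


def ols_all(designs, phenotypes):
    """For each X, for each y: estimate b.
    Returns a dict {(x_index, y_index): b}."""
    return {(i, j): ols(X, y)
            for i, X in enumerate(designs)
            for j, y in enumerate(phenotypes)}
```

For example, with X = [[1, 0], [1, 1], [1, 2]] (intercept plus one
predictor) and y = [1, 3, 5], ols(X, y) gives an intercept of 1 and a
slope of 2.
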
> We want to offer the GenABEL community an Estimator to be used in the
> same way people use the current tools (ProbABEL in R)

Actually, ProbABEL is a command line tool. Even though several packages
of the GenABEL project are R packages, ProbABEL is not.

>, but faster,
> and capable of handling LARGE datasets (in disk & memory). That is
> why I am writing it in C++,

Sounds good! ProbABEL is written in a mixture of C and C++.

> while making sure that it can be called
> directly from R.

I'm not sure that that should be a requirement. At the moment the
workflow is roughly the following:
1) prepare phenotype data (e.g. specify covariates, do QC like removing
outliers, log transformation, etc.). This is done by each researcher
independently, as they are the experts on their phenotypes.
2) Imputation of genetic data is done centrally as this is a time
consuming task, that only needs to be redone if additional individuals
have been genotyped or whenever a genomic reference set has been
updated. This happens roughly once or twice per year.

>
> My understanding: A few concerns came to mind when researching the
> workflow in using OMICS data in Linear Estimators. There seems to be
> a long process before the real life data from MaCH (test.mldose? for
> X

Just to be sure: for each SNP, X contains the dosage or probability data
for that SNP and the covariate data as specified by the researcher.

> and mlinfo? for Y)

Nope, Y is not taken from the mlinfo file. The data from the mlinfo file
is not used in the regression. After the regression is done, the
information in the mlinfo file (e.g. SNP name, chromosome number, base
pair position) is simply copied to the output file.


> that is sitting on files can be used in
> calculations. The first concern is how to obtain the design matrices
> X from the files.

I agree.

>
> It is my understanding that there are two types of data, imputed data
> and databel data.

Almost correct. Imputed genotype data "comes out of" the imputation
software in the form of (possibly zipped) text files, the test.mldose
(basically N_SNPs x N_ids) and test.mlinfo files (N_SNPs x ~7).

The filevector/DatABEL file format is simply a way to store the dosage
data in such a (binary) way that we don't need to load a complete text
file into memory.

FYI: An imputed data set of ~7000 individuals and ~20e6 imputed SNPs
uses 459 GB in DatABEL format, the text-based mlinfo files take up 881
MB and the gzipped dosage text files take up 59 GB.
As you can see from these numbers, the top item on my wishlist is a
compressed form of the filevector/DatABEL files.

> Either way, data seems to be pre-processed early in
> the workflow;

Actually, there isn't too much preprocessing going on. If we only look
at dosage data the only thing that needs to be done for each SNP is to
add the dosage data for each individual as a column to the (constant)
matrix of covariate data to form the design matrix X.
Because we want to allow for missing (genotype) data we have added some
routines to get the data without missing values.
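
As a rough illustration of that step (a sketch, not the actual ProbABEL
routines): the function below appends one SNP's dosages as a column to
the constant covariate matrix and drops individuals with a missing
value anywhere. The names design_matrix and is_missing, the use of
None/NaN as the missing-value marker, and the leading intercept column
are all assumptions made for this example.

```python
import math


def is_missing(v):
    """Assumption: missing values are encoded as None or NaN."""
    return v is None or (isinstance(v, float) and math.isnan(v))


def design_matrix(covariates, dosage, phenotype):
    """Build X and y for one SNP.

    covariates: one row of covariate values per individual (constant
                across SNPs)
    dosage:     one dosage value per individual for the current SNP
    phenotype:  one phenotype value per individual

    Individuals with a missing covariate, dosage or phenotype are
    skipped, so X and y contain complete cases only.
    """
    X, y = [], []
    for cov_row, d, yi in zip(covariates, dosage, phenotype):
        if any(is_missing(v) for v in list(cov_row) + [d, yi]):
            continue
        # Column order: intercept, covariates, SNP dosage.
        X.append([1.0] + list(cov_row) + [d])
        y.append(yi)
    return X, y
```

In a loop over all SNPs only the dosage argument changes; the
covariate matrix and the phenotype stay the same.
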

> my impression is that this preprocessing is done in R.

Usually only for the creation of the phenotype file. For a single
(non-omics) phenotype like height, disease status, a blood lipid level,
etc. this is easy. The researcher usually has these files (N_IDs rows,
one column for the phenotype and a few columns for covariates like age,
sex, age^2, etc).

Of course, for omics data the number of phenotypes is much larger. But
for that scenario OmicABEL is developed.

> It also seems that R can't handle large amounts of data loaded in
> memory at once.

That is another reason why DatABEL (the R library interface to the
filevector format) was developed.


>
> From what I see, data comes with some irregularities in its values
> (missing values, invalid rows in X/Y matrices), and this makes it
> difficult to use Linear Estimators right away; this is why the
> preprocessing exists.

Correct. Most people use imputed genotype data, so there won't be many NAs
there. On the other hand, since genotype imputation is done centrally
for all genotyped individuals, it is very common to have missing data in
the phenotype file (i.e. Y and covariate data).

> DatABEL seems to be the R tool (implemented in
> C++) that can do fast pre-processing of big sets of data. 

A very common use case after running a GWAS (the genome-wide linear
regression we're talking about) is that a researcher wants to know the
exact dosages for all individuals for the top ten most significant
hits. (S)he then uses grep to find out in which mlinfo files the
SNPs are located. This is necessary because the genotype data is split
up into several files per chromosome to make handling the files easier
(parallel computation of the linear regression on a multicore cluster is
easy that way: we simply submit one job per 'chunk', so that
several chunks run in parallel). Then (s)he starts R and loads the DatABEL
library to read the genotype data from those specific files.

> Well, I
> think that DatABEL only does the reading and writing of files in C++
> (called filevector), 

Correct.

> while the pre-processing functions are defined
> and implemented in R. Am I correct?
>

Not quite. Apart from the one-time only conversion of the text files
with (imputed) genotype data to DatABEL format (which is done in R
usually, but the filevector lib also has command line tools (written in
C++) to do this), the end user doesn't do much with DatABEL (for
pre-processing). Within ProbABEL we do some pre-processing (e.g. removal
of individuals without genotype information), and in the loop over all
SNPs the combining of the genotype information with the other covariates
into the design matrix.

>
> My Problems: This is where my troubles start. Since I am trying to
> make this tool usable for the GenABEL community while still being
> able to handle TERABYTES of data with fast computations, I would
> really like to include the preprocessing of X and Y into my C++
> workflow. To solve the memory and performance limitations of R, I am
> trying to load the data from disk from within C++. Since I am
> performing my estimator function in C++, it expects those matrices to
> have numbers that can be used for computation. Assuming that data
> must be preprocessed to be able to get valid matrices with usable
> numbers, I have the following options:
>
> A) For performance reasons, I was considering having the data already
> pre-processed in disk files. Is this feasible, (preprocessed data
> would take at most as much space in disk as original data, is this
> cumbersome)?
>
> B) If there are only a few preprocessing functions that people use, I
> could re-implement them inside C++ and use them on the fly while
> loading the data from disk. This would be more difficult if everyone
> has their own customized R pre-processing functions.

>
> C) Another alternative is to allow users to use their own R
> pre-processing functions that pre-process the data. I would then go
> about preprocessing on the fly from inside C++ by doing calls back to
> R. This would be slower and harder to do than B).
>
> D) If DatABEL really does all the necessary pre-processing from inside
> C++, I could just directly use it or allow the user to specify what
> to use and won't need to re-implement the pre-processing functions.
> It seems, though, that preprocessing the data into DatABEL filevector
> format takes from 30 minutes to an hour.

I think it would be a good idea to rethink the DatABEL/filevector
format. As I already mentioned, if we could store the data in a
compressed way (while still retaining good speed and (relatively) low
RAM usage), life for the user would be much better.

>
>
> I would really appreciate any help that would clarify my
> understanding of how the pre-processing of data works and where it
> fits in the work-flow.

If you like we could set up a Skype call. I think that would help both
of us a lot in understanding each other. Maybe Yurii and Maarten would
like to participate as well?


Thanks again for showing interest in ProbABEL. I think we can learn a
lot from your expertise!


Best regards,

Lennart Karssen
(present maintainer of the ProbABEL package)

>
> Best regards,
>
> - Alvaro Frank
>

From yurii.aulchenko at gmail.com  Sat Jul 20 10:19:26 2013
From: yurii.aulchenko at gmail.com (Yurii Aulchenko)
Date: Sat, 20 Jul 2013 10:19:26 +0200
Subject: [GenABEL-dev] layout of GenABEL main page
In-Reply-To: <51E4F2A1.4040002@gmail.com>
References: <fb5e10b31a0eb5.51d2d197@aices.rwth-aachen.de>
 <51D2C34D.2000907@gmail.com>
 <CAHX9t6LXPDT7UR3+Dn1htv1pWBuvJtSAn_zi3ANmt3FYa5pnLw@mail.gmail.com>
 <0177E59A-0CA1-4465-8186-A8EC79A20BB4@burlo.trieste.it>
 <CAHX9t6KDjHXVM8PLWcNxr0CNGJg=LcKvLcmMSOYfotd5scXomA@mail.gmail.com>
 <6632A424-420E-423B-957A-3B8481DD0122@burlo.trieste.it>
 <CAHX9t6KV69R_KLLJ_0Kqnys+nndAs=J-_s3kZHUd7N4dEVObgQ@mail.gmail.com>
 <CAHX9t6+z+rRpaZMWUyyN2H+MH86tBbZ6sSX+4PoUzsYXuG2deA@mail.gmail.com>
 <51E4F2A1.4040002@gmail.com>
Message-ID: <CAHX9t6KWA=dJS4Ydg-nN7ox=j2g5EQe5jDB5kuNOE+KtD1NMMg@mail.gmail.com>

The license is whichever one we decide on - I paid for the logo and own the
copyright etc.

So I was thinking we could release it under a license that allows
people to play with it. At the same time, I would like to make sure that the
original logo and its derivatives are used only for the GenABEL project.

Any ideas on what would be a good license for that? I am a bit lost here.
Perhaps some variant of a Creative Commons license?

YA

On Tue, Jul 16, 2013 at 9:13 AM, Maarten Kooyman <kooyman at gmail.com> wrote:

>  Hi Yurii,
>
> Under what kind of licence are the logos available? Maybe it would be handy to
> put them on the website for easy access.
>
> Kind regards,
>
> Maarten
>
> On 07/15/2013 10:02 PM, Yurii Aulchenko wrote:
>
> Dear All,
>
>  a small update - I have original vector graphics files from Grant at my
> disposal; if some people would like to play with these files, send me a
> message and I can forward the vector files to you.
>
>  best,
> Yurii
>
> On Fri, Jul 5, 2013 at 3:09 PM, Yurii Aulchenko <yurii.aulchenko at gmail.com
> > wrote:
>
>>
>>
>>  On Fri, Jul 5, 2013 at 3:05 PM, Nicola Pirastu <
>> nicola.pirastu at burlo.trieste.it> wrote:
>>
>>>  I agree, in the end it's not the coca-cola logo and we have not been
>>> using it for years so I don't think people are going to be confused if the
>>> Logo changes in a few months.
>>>
>>>
>>  More than that - I really think it should evolve as our project does :)
>>
>>
>>
>>>   I am actually curious to see how it will look on the forum. I do
>>> think that if it's not too much work, the colors of the forum and website
>>> should match those of the logo though.
>>>
>>
>>  Yep. I am now starting to understand why people were giving cost
>> estimates of a few thousand euros for that basic design package: e.g.
>> for facebook we need a cover and an avatar (the latter would do for twitter as
>> well). So this is a whole project :)
>>
>>  Maybe later we should think of inviting some people from a design school
>> - they must be looking for graduation projects, and maybe they
>> would be willing to do that for free :)
>>
>>  YA
>>
>>
>>>
>>>  Nicola
>>>
>>>
>>> Dr. Nicola Pirastu PhD
>>> Research Fellow
>>> Medical Sciences, Chirurgical and Health Department
>>> University of Trieste
>>> Medical Genetics
>>> IRCCS Burlo Garofolo
>>> Via dell'Istria 65/1
>>> 34137 Italy
>>> tel. +390403785539
>>>
>>>  On 5 Jul 2013, at 14:55, Yurii Aulchenko <
>>> yurii.aulchenko at gmail.com> wrote:
>>>
>>> I suggest that for the moment we go with what we have (Grant's variant);
>>> we can change later.
>>>
>>>  Please let me know if you have a strong opinion against! - I really
>>> would like to use the logo for my presentation and also play a bit how well
>>> it fits our pages (genabel.org, facebook, twitter)
>>>
>>>  YA
>>>
>>> On Tue, Jul 2, 2013 at 4:27 PM, Nicola Pirastu <
>>> nicola.pirastu at burlo.trieste.it> wrote:
>>>
>>>> Just to add my two cents to the discussion,
>>>>
>>>>  I think that the problem is not with the DNA helix but with the font.
>>>> I've played around a bit with it and if you use for example Helvetica or
>>>> something less comic-sans-like it does look better. Also for some reason
>>>> I'm still disturbed by the green but it is a very personal opinion..
>>>>
>>>>  Nicola
>>>>
>>>>  Dr. Nicola Pirastu PhD
>>>> Research Fellow
>>>> Medical Sciences, Chirurgical and Health Department
>>>> University of Trieste
>>>> Medical Genetics
>>>> IRCCS Burlo Garofolo
>>>> Via dell'Istria 65/1
>>>> 34137 Italy
>>>> tel. +390403785539
>>>>
>>>>  On 2 Jul 2013, at 14:38, Yurii Aulchenko <
>>>> yurii.aulchenko at gmail.com> wrote:
>>>>
>>>> Dear All,
>>>>
>>>>  I agree with Maarten's critique, and I am actually still not sure whether I
>>>> like Maarten's or Grant's idea better. The interesting thing is that - not sure
>>>> all realize it - Grant's variant is his vision of Maarten's prototype :)
>>>> However, Grant's variant has an important advantage - it is ready to serve
>>>> as a logo. And I actually want to use a logo in my slides for UseR!-2013.
>>>>
>>>>  So I suggest we take Grant's logo as a working variant. No doubt that
>>>> the logo is going to evolve with time - as anything we do in the project -
>>>> code, documentation; logo is no different, I think. The element which is
>>>> going to stay and keep it recognizable is the way of spelling the GenABEL
>>>> :) - Like Gnu's horns in the GNU logo.
>>>>
>>>>  What we can do next is to place an open call on site/forum for other
>>>> users to contribute, but this is going to take time, and meanwhile I
>>>> suggest to stick with Grant's variant.
>>>>
>>>>  Yurii
>>>>
>>>> On Tue, Jul 2, 2013 at 2:10 PM, Maarten Kooyman <kooyman at gmail.com>wrote:
>>>>
>>>>> Dear all,
>>>>>
>>>>>
>>>>> It looks really nice! Credit to whoever made it. However, I have more
>>>>> the impression that it looks like a polypeptide chain or a rosary. The
>>>>> seventies font is a matter of taste, but it reminds me of Comic
>>>>> Sans (including an upside-down e as an a). I wonder if it is readable if you print
>>>>> it on a poster: I think this is an important use case for a scientific logo.
>>>>>
>>>>> Kind regards,
>>>>>
>>>>>
>>>>> Maarten
>>>>>
>>>>>
>>>>>
>>>>>
>>>>> On 07/02/2013 01:11 PM, Diego Fabregat Traver wrote:
>>>>>
>>>>>>  On 28/06/13, Yurii Aulchenko  <yurii.aulchenko at gmail.com> wrote:
>>>>>>
>>>>>>  How do you like this one?
>>>>>>>
>>>>>> I like it a lot.
>>>>>>
>>>>>> What do you think about reducing the font size for the subtitle
>>>>>> and right-justifying it? Would it still be readable? I liked that
>>>>>> detail from the previous attempts with the "Project" subtitle.
>>>>>>
>>>>>> In any case, this is just a minor detail. It looks great as it is.
>>>>>>
>>>>>> Thanks to Grant Borodin!
>>>>>>
>>>>>>
>>>>>>> YA
>>>>>>>
>>>>>>>
>>>>>>> On Thu, Jun 27, 2013 at 1:16 PM, Yurii Aulchenko <
>>>>>>> yurii.aulchenko at gmail.com> wrote:
>>>>>>>
>>>>>>>
>>>>>>>  Dear Nicola, Diego, Lennart,
>>>>>>>>
>>>>>>>>
>>>>>>>> Thanks for your feedback! I will ask Grant Borodin, who kindly
>>>>>>>> designed these logos, if he could change C according to your comment
>>>>>>>> (capital "ABEL" and "statistical genomics" as in F).
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>> Yurii
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>> On Wed, Jun 26, 2013 at 4:16 PM, Diego Fabregat Traver <
>>>>>>>> fabregat at aices.rwth-aachen.de> wrote:
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>>> Congrats to whoever designed these logos, they look very nice :)
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> With respect to my preferences, I fully agree with Lennart: "C
>>>>>>>>> with capital ABEL and statistical genomics below it" would be my choice.
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> Best,
>>>>>>>>>
>>>>>>>>> Diego
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> On 20/06/13, "L.C. Karssen" <lennart at karssen.org>
>>>>>>>>> wrote:
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>  Wow! Those look really nice!
>>>>>>>>>> I like options C and F the most. Actually a combination would be
>>>>>>>>>> even
>>>>>>>>>> better IMHO: use C with capital ABEL and statistical genomics
>>>>>>>>>> below it.
>>>>>>>>>> Looking forward to hearing the opinions of others,
>>>>>>>>>> Lennart.
>>>>>>>>>> On 20-06-13 09:34, Yurii Aulchenko wrote:
>>>>>>>>>>
>>>>>>>>>>> Please find attached a few more logo variants
>>>>>>>>>>> Yurii


-- 
-----------------------------------------------------
Yurii S. Aulchenko

[ LinkedIn <http://nl.linkedin.com/in/yuriiaulchenko> ] [
Twitter<http://twitter.com/YuriiAulchenko>] [
Blog <http://yurii-aulchenko.blogspot.nl/> ]

From yurii.aulchenko at gmail.com  Sat Jul 20 17:15:41 2013
From: yurii.aulchenko at gmail.com (Yurii Aulchenko)
Date: Sat, 20 Jul 2013 17:15:41 +0200
Subject: [GenABEL-dev] using reshuffle
Message-ID: <CAHX9t6JoBdSrNeNyWytYrJUKej5Uhb8LvZYK-6z8JcZHr9U8+g@mail.gmail.com>

Hi Sodbo,

It seems that reshuffle does not work correctly; at least, I cannot get
the expected results with it (see below). I use a dataset with ~107k traits and
~280k SNPs.

Any idea? Am I doing something wrong?

YA

With the Perl extractor I get a chi2 of 62:

ya567666 at cluster:~[167]$ perl extractCell.pl /hpcwork/df938257/natgen/B2 329 209602 | gawk '{print $_,($2/$4)^2}'
 -0.165153577923775 0.580845952033997 0.0298683661967516 0.0734809562563896 -0.00155110028572381 62.4845

But this is not the case with reshuffle (I also do not get any output
with reshuffle /hpcwork/df938257/natgen/B2 --chi=30, although I know there are
such chi2 values in the results):

ya567666 at cluster:~[167]$ reshuffle /hpcwork/df938257/natgen/B2 --snps=209602 --traits=329 --chi
Finish iout_file read   0.11 sec
Start_write_chi_data=0.14 sec
End_write_chi_trait     spm_1_AND_spmp_23 0.14 sec
Finish_write_chi_data   0.14 sec
Finish reshuffling 0.14 sec
ya567666 at cluster:~[168]$ cat chi_data.txt
SNP     Trait   beta_1  beta_SNP        se_1    se_SNP  cov_SNP_1       Chi2
rs4902242       spm_1_AND_spmp_23       -0.00234050769358873    -0.0338250175118446     0.128280490636826       0.0329618416726589      0.0770578160881996      1.05306001371466
ya567666 at cluster:~[169]$
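
For reference, both chi2 values above follow from the same formula,
chi2 = (beta / se)^2, which is what the gawk expression ($2/$4)^2
computes. A small sketch to recheck the numbers by hand (the function
name chi2 is only for illustration and is not part of reshuffle):

```python
def chi2(beta, se):
    """1-df chi-square statistic computed as (beta / se)^2."""
    if se <= 0:
        raise ValueError("standard error must be positive")
    return (beta / se) ** 2


# Columns 2 and 4 of the extractCell.pl output shown above.
perl_value = chi2(0.580845952033997, 0.0734809562563896)

# beta_SNP and se_SNP from the chi_data.txt row shown above.
reshuffle_value = chi2(-0.0338250175118446, 0.0329618416726589)

print(round(perl_value, 4), round(reshuffle_value, 4))
```

Both numbers are reproduced from their own beta and se columns, so
the two tools seem to agree on the formula and differ in which cell of
the results they return.
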

From yurii.aulchenko at gmail.com  Sat Jul 20 17:26:57 2013
From: yurii.aulchenko at gmail.com (Yurii Aulchenko)
Date: Sat, 20 Jul 2013 17:26:57 +0200
Subject: [GenABEL-dev] using reshuffle
In-Reply-To: <CAHX9t6JoBdSrNeNyWytYrJUKej5Uhb8LvZYK-6z8JcZHr9U8+g@mail.gmail.com>
References: <CAHX9t6JoBdSrNeNyWytYrJUKej5Uhb8LvZYK-6z8JcZHr9U8+g@mail.gmail.com>
Message-ID: <CAHX9t6LC7AKS=tZO1PK69cNkWcNwtE3X6pfNtEiwZfQ-y8akMQ@mail.gmail.com>

Another point: apparently you do not check boundaries - e.g. when I try to
get results for trait #200,000 (I have 107,000 only) I get a core dump.
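
A sketch of the kind of check I mean, in Python for brevity. The
function names, the 1-based indexing and the simple trait-major layout
with 5 doubles per cell are assumptions for this illustration only;
they are not the actual reshuffle code or file format.

```python
def check_index(name, value, count):
    """Raise IndexError unless 1 <= value <= count."""
    if not 1 <= value <= count:
        raise IndexError(
            f"{name} index {value} out of range 1..{count}")
    return value


def cell_offset(trait, snp, n_traits, n_snps,
                values_per_cell=5, bytes_per_value=8):
    """Byte offset of one (trait, SNP) cell in a hypothetical
    trait-major file; indices are validated before any arithmetic."""
    check_index("trait", trait, n_traits)
    check_index("SNP", snp, n_snps)
    cell = (trait - 1) * n_snps + (snp - 1)
    return cell * values_per_cell * bytes_per_value
```

With 107,000 traits, asking for trait 200,000 then gives a readable
error message instead of a crash. Note also that for a dataset of this
size the largest offset is far beyond the 32-bit integer range, so in
C++ the offset has to be held in a 64-bit type such as int64_t.
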

YA



-- 
-----------------------------------------------------
Yurii S. Aulchenko

[ LinkedIn <http://nl.linkedin.com/in/yuriiaulchenko> ] [
Twitter<http://twitter.com/YuriiAulchenko>] [
Blog <http://yurii-aulchenko.blogspot.nl/> ]

From sharapovsodbo at gmail.com  Sat Jul 20 18:10:15 2013
From: sharapovsodbo at gmail.com (=?KOI8-R?B?88/Ews8g+8HSwdDP1w==?=)
Date: Sat, 20 Jul 2013 23:10:15 +0700
Subject: [GenABEL-dev] using reshuffle
In-Reply-To: <CAHX9t6LC7AKS=tZO1PK69cNkWcNwtE3X6pfNtEiwZfQ-y8akMQ@mail.gmail.com>
References: <CAHX9t6JoBdSrNeNyWytYrJUKej5Uhb8LvZYK-6z8JcZHr9U8+g@mail.gmail.com>
 <CAHX9t6LC7AKS=tZO1PK69cNkWcNwtE3X6pfNtEiwZfQ-y8akMQ@mail.gmail.com>
Message-ID: <CAPF08KuVbwYG2adNfsnMcZLjTM7SAm34p0iAquuX4W-onw_sgg@mail.gmail.com>

Hello!
I'll check reshuffle tomorrow.

From alvaro.frank at rwth-aachen.de  Sun Jul 21 20:28:52 2013
From: alvaro.frank at rwth-aachen.de (Alvaro Jesus Frank)
Date: Sun, 21 Jul 2013 20:28:52 +0200
Subject: [GenABEL-dev] multiple ProbABEL's palinear runs
Message-ID: <fb988b4b1ba7fd.51ec4484@rwth-aachen.de>


Dear Lennart,

Thanks for the reply with all the useful information. Perhaps once I have a prototype working (the computational core, excluding real data handling) we could set up the Skype call?
Here are some follow-up questions.
> 
> I'm not sure that that should be a requirement. At the moment the
> workflow is roughly the following:
> 1) prepare phenotype data (e.g. specify covariates, do QC like removing
> outliers, log transformation, etc.). This is done by each researcher
> independently, as they are the experts on their phenotypes.
> Usually only for the creation of the phenotype file. For a single
> (non-omics) phenotype like height, disease status, a blood lipid level,
> etc. this is easy. The researcher usually has these files (N_IDs rows,
> one column for the phenotype and a few columns for covariates like age,
> sex, age^2, etc).
> Of course, for omics data the number of phenotypes is much larger. But
> for that scenario OmicABEL is developed.

The purpose is to follow the lines of OmicABEL, where multiple phenotypes can be used in one computation, while being as flexible as possible about the existing ways of storing multiple-phenotype data. I.e., if the standard (for a single phenotype) already is a .txt file for analysis, simply use those existing files in bulk. If everyone stores this in their own way, then simply going the way of OmicABEL would be best, requiring all phenotype files to be repackaged in DatABEL format.
If everyone uses the same standard for phenotype files, then I can just support those directly (supporting low memory usage too, as this does not depend on how data is stored, but on how it is accessed).

> 2) Imputation of genetic data is done centrally as this is a time
> consuming task,

To my understanding it takes hours, right?

> that only needs to be redone if additional individuals
> have been genotyped or whenever a genomic reference set has been
> updated. This happens roughly once or twice per year.

The data in the files on disk that are used in computations has already gone through this process, right? (I.e., it is ready to compute on.)

> FYI: An imputed data set of ~7000 individuals and ~20e6 imputed SNPs
> uses 459 GB in DatABEL format, the text-based mlinfo files take up 881
> MB and the gzipped dosage text files take up 59GB.
> The top item on my wishlist is a compressed form of the
> filevector/DatABEL files, as you can see from these numbers.

So the DatABEL binary file takes MORE space than the equivalent raw dosage text files *.mldose (when gzipped)?
What about when they are not compressed?
According to my calculations, if there are N = 10^9 entries, in binary you can store one entry in single precision in 32 bits (4 bytes),
for a total of 3.72 GiB (N*4 bytes). In a raw text file each character requires 1 byte, so with 9 characters to represent each number it would require at least N*9 bytes = 8.38 GiB, which is more than double the size.
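
The arithmetic above can be reproduced with a small sketch. The function names are hypothetical, and the 9 characters per value is the figure assumed in the paragraph above, not a property of any particular .mldose file.

```cpp
#include <cassert>
#include <cmath>
#include <cstdint>

// Size in bytes of n_values stored as fixed-width binary numbers.
std::uint64_t binary_bytes(std::uint64_t n_values, std::uint64_t bytes_per_value) {
    assert(bytes_per_value > 0);
    return n_values * bytes_per_value;
}

// Size in bytes of n_values stored as text, one byte per character.
std::uint64_t text_bytes(std::uint64_t n_values, std::uint64_t chars_per_value) {
    assert(chars_per_value > 0);
    return n_values * chars_per_value;
}

// Convert a byte count to GiB (2^30 bytes).
double to_gib(std::uint64_t bytes) {
    return static_cast<double>(bytes) / (1024.0 * 1024.0 * 1024.0);
}
```

For N = 10^9 this gives roughly 3.73 GiB for 4-byte floats and 8.38 GiB for 9-character text entries, a ratio of 9/4 = 2.25.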

> Imputed genotype data "comes out of" the imputation
> software in the form of (possibly zipped) text files, the test.mldose
> (basically N_SNPs x N_ids) and test.mlinfo files (N_SNPs x ~7).
> 
> The filevector/DatABEL file format is simply a way to store the dosage
> data in such a (binary) way that we don't need to load a complete text
> file into memory.


If users had the choice, what would they rather have the application do:
a) Use the existing raw text .mldose file (or files?) they already have, without having to load everything into memory at once (similar to filevector).
b) Force them to convert their files into yet more files in the filevector format, which the application would then use (also with low memory usage).
c) Something else?


> Actually, there isn't too much preprocessing going on. If we only look
> at dosage data the only thing that needs to be done for each SNP is to
> add the dosage data for each individual as a column to the (constant)
> matrix of covariate data to form the design matrix X.

This is the process that I refer to as X = [ XL | XR ], where the design matrix X is formed from:
- XL, the covariates, which are constant (of size N_ids rows by N_covariates columns)
- XR, which is built from the dosage data and is different for each ___ what? (And how? If the dosage data is one big sequence, how do you establish how much to take and append to XL to form X?)
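
Going by the quoted description (for each SNP, the dosage of every individual is added as one column to the constant covariate matrix), the construction of X = [ XL | XR ] could be sketched as follows. This is an illustration under that reading, not ProbABEL's code; the function name and the row-major std::vector storage are assumptions.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Build the design matrix for one SNP: the constant covariate block XL
// (n_ids rows, n_cov columns, row-major) with the SNP's dosage values
// appended as one extra column. The result has n_cov + 1 columns.
std::vector<double> make_design_matrix(const std::vector<double>& XL,
                                       std::size_t n_ids, std::size_t n_cov,
                                       const std::vector<double>& dosage) {
    assert(XL.size() == n_ids * n_cov);
    assert(dosage.size() == n_ids);
    const std::size_t n_col = n_cov + 1;
    std::vector<double> X(n_ids * n_col);
    for (std::size_t i = 0; i < n_ids; ++i) {
        for (std::size_t j = 0; j < n_cov; ++j) {
            X[i * n_col + j] = XL[i * n_cov + j];  // copy the covariates
        }
        X[i * n_col + n_cov] = dosage[i];          // append the dosage column
    }
    return X;
}
```

In this reading XR is a single column of length N_ids, and it changes once per SNP while XL stays fixed.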

> Because we want to allow for missing (genotype) data we have added some
> routines to get the data without missing values.
> That is another reasons why DatABEL (the R library interface to the
> filevector format) was developed.

This is already done in that central process that happens only once or twice a year, as you mentioned before, right?
Do the data sitting in the files already account for these missing values?
  
> Most people use imputed genotype data, there won't be many NA's
> there. On the other hand, since genotype imputation is done centrally
> for all genotype individuals, it is very common to have missing data in
> the phenotype file (i.e. Y and covariate data).

How does the processing of genotype data create missing phenotype data?
How is this then corrected? (By the user or by ProbABEL?)

Does this mean that if phenotype data is missing for an individual, then this individual is simply not used in the calculation?
I.e., in the part of the regression that computes X' * Y, the calculation is not performed for that individual?
Or does "missing data in the phenotype file" mean that Y has missing rows and data must be dropped or filled in (for non-covariate entries)?

I know that OmicABEL does averaging for the missing covariate entries. Is this also done for missing non-covariate entries?
If each phenotype file comes with both covariate data (which is supposed to be constant) and phenotype data, does this mean that the constant data is duplicated on disk?
 
> Not quite. Apart from the one-time only conversion of the text files
> with (imputed) genotype data to DatABEL format (which is done in R
> usually, but the filevector lib also has command line tools (written in
> C++) to do this), the end user doesn't do much with DatABEL (for
> pre-processing). Within ProbABEL we do some pre-processing (e.g. removal
> of individuals without genotype information), 

How do you determine which individuals these are? Does this mean that users leave their phenotypic data uncorrected in the files? (See my previous question.)
So if genotype data is also missing for Y's that DO exist, these are dropped as well?

What other data manipulations that are not part of the regression itself are done inside ProbABEL?

> and in the loop over all
> SNPs the combining of the genotype information with the other covariates
> into the design matrix.
> 
This is the formation of
X = [ XL | XR ], right?

> The top item on my wishlist is a compressed form of the
> filevector/DatABEL files, as you can see from these numbers.
> 
> I think it would be a good idea to rethink the DatABEL/filevector
> format. As I already mentioned, if we could store the data in a
> compressed way (while still retaining good speed and (relatively) low
> RAM usage), life for the user would be much better.

I have looked into this and there are some solutions for compressing random floating-point data. I am not sure how efficient they are, but my guess is that disk usage can be reduced to around 60-70%. It must be stated that loading data into memory is independent of how it is stored. It is ALWAYS possible to load just parts of files into memory, whether they are in filevector format or imputed *.mldose data.
The routines that DatABEL uses to load data into memory are the only thing that needs to be worked on to support low memory usage, not the format itself.
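
As an illustration of loading only part of a file, here is a minimal sketch. It assumes a headerless binary file of raw 4-byte floats, which is a simplification and not the actual filevector layout; the function name read_slice is hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <fstream>
#include <stdexcept>
#include <string>
#include <vector>

// Read `count` floats starting at element index `first` from a headerless
// binary file of raw floats. Only the requested slice is held in memory.
std::vector<float> read_slice(const std::string& path, std::uint64_t first,
                              std::size_t count) {
    assert(count > 0);
    std::ifstream in(path, std::ios::binary);
    if (!in) {
        throw std::runtime_error("cannot open " + path);
    }
    // Jump straight to the first requested element.
    in.seekg(static_cast<std::streamoff>(first * sizeof(float)));
    std::vector<float> out(count);
    in.read(reinterpret_cast<char*>(out.data()),
            static_cast<std::streamsize>(count * sizeof(float)));
    if (static_cast<std::size_t>(in.gcount()) != count * sizeof(float)) {
        throw std::runtime_error("requested slice runs past the end of " + path);
    }
    return out;
}
```

Memory use here depends on count, not on the size of the file. Doing the same on a text .mldose file is also possible, but finding the byte position of a given entry is harder because text entries do not have a fixed width.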

On another topic related to OmicABEL, I would like to know to what extent it is used and, if it is not used widely, what the reason is.
What hinders its adoption for analyses with multiple XR and Y?

Thanks again for the input!

-Alvaro Frank


From kooyman at gmail.com  Sun Jul 21 21:21:04 2013
From: kooyman at gmail.com (Maarten Kooyman)
Date: Sun, 21 Jul 2013 21:21:04 +0200
Subject: [GenABEL-dev] multiple ProbABEL's palinear runs
In-Reply-To: <fbb490a416c2be.51e42c4e@rwth-aachen.de>
References: <fbb490a416c2be.51e42c4e@rwth-aachen.de>
Message-ID: <51EC34A0.4060303@gmail.com>

Dear Alvaro,

I did some benchmarking of ProbABEL's palinear (without the --mmscore
option) in the past, and as I recall the program spent most of its time
on getting the genotype data to the OLS part, not on the OLS part itself.
I could not find the profiling results, so I am not sure this was
truly the case. Loading the genotypes only once instead of N
times (where N is the number of phenotypes) would give a speed-up. However,
be aware that with real-life data, outliers in the phenotypes are
removed. If these outliers are not removed from your data, the number of
false positives will be high. Removing outliers drops different
individuals for different phenotypes, so the matrix X is unique for
every phenotype.  Since the

  (X'*X)^-1 * X'


which  is a part of

  b = (X'*X)^-1 * X' * y.

is not the same for each phenotype, the speed-up there will be hard(er) 
to get.

I think without the ability to censor phenotypes the program will not 
have much real life use.
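
To illustrate why censoring makes X'X phenotype-specific, here is a minimal sketch that solves the normal equations for one phenotype and skips rows whose phenotype value is NaN. It is not ProbABEL's implementation: the function name, the NaN convention for censored values and the plain Gauss-Jordan solver are assumptions chosen for brevity, and a production version would use a numerically more robust factorization.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <stdexcept>
#include <utility>
#include <vector>

// Solve b = (X'X)^-1 X'y for one phenotype. X is row-major (n rows,
// p columns). Rows where y is NaN (censored or missing) are skipped, so
// X'X has to be accumulated again for every phenotype.
std::vector<double> ols_skip_missing(const std::vector<double>& X,
                                     const std::vector<double>& y,
                                     std::size_t n, std::size_t p) {
    assert(X.size() == n * p && y.size() == n);
    const std::size_t w = p + 1;  // width of the augmented system [X'X | X'y]
    std::vector<double> a(p * w, 0.0);
    for (std::size_t i = 0; i < n; ++i) {
        if (std::isnan(y[i])) continue;  // individual not used for this phenotype
        for (std::size_t r = 0; r < p; ++r) {
            for (std::size_t c = 0; c < p; ++c) {
                a[r * w + c] += X[i * p + r] * X[i * p + c];
            }
            a[r * w + p] += X[i * p + r] * y[i];
        }
    }
    // Gauss-Jordan elimination with partial pivoting.
    for (std::size_t k = 0; k < p; ++k) {
        std::size_t piv = k;
        for (std::size_t r = k + 1; r < p; ++r) {
            if (std::fabs(a[r * w + k]) > std::fabs(a[piv * w + k])) piv = r;
        }
        if (std::fabs(a[piv * w + k]) < 1e-12) {
            throw std::runtime_error("X'X is singular");
        }
        if (piv != k) {
            for (std::size_t c = 0; c < w; ++c) std::swap(a[k * w + c], a[piv * w + c]);
        }
        for (std::size_t r = 0; r < p; ++r) {
            if (r == k) continue;
            const double f = a[r * w + k] / a[k * w + k];
            for (std::size_t c = k; c < w; ++c) a[r * w + c] -= f * a[k * w + c];
        }
    }
    std::vector<double> b(p);
    for (std::size_t k = 0; k < p; ++k) b[k] = a[k * w + p] / a[k * w + k];
    return b;
}
```

Two phenotypes with different NaN patterns lead to different X'X matrices even though the genotype and covariate data are identical, which is the obstacle to reusing (X'*X)^-1 * X' described above.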


Kind regards,

Maarten


On 07/15/2013 05:07 PM, Alvaro Jesus Frank wrote:
> Dear all,
>
> I am working on a high performance implementation of an ordinary linear estimator (OLS model), similar to the one implemented in ProbABEL's palinear (without --mmscore option), where X are SNP given and Y are the phenotypes.
> (As given in section 7, "Methodology", of the ProbABEL manual at http://www.genabel.org/sites/default/files/pdfs/ProbABEL_manual.pdf)
>
>
>   b = (X'*X)^-1 * X' * y.
>
> The goal is to solve this with multiple design matrices (SNPs??) X and Phenotypes Y. For this we compute the formula as
>
> for each X
>     for each Y
>         b=(X'*X)^-1 * X' * y.
>
>
> We want to offer the GenABEL community an Estimator to be used in the same way people use the current tools (ProbABEL in R), but faster, and capable of handling LARGE datasets (in disk & memory).
> That is why I am writing it in C++, while making sure that it can be called directly from R.
>
> My understanding:
> A few concerns came to mind when researching the workflow in using OMICS data in Linear Estimators.
> There seems to be a long process before the real life data from MaCH (test.mldose? for X and mlinfo? for Y) that is sitting on files can be used in calculations. The first concern is how to obtain the design matrices X from the files.
>
> It is my understanding that there are two types of data, imputed data and databel data. Either way, data seems to be pre-processed early in the workflow; my impression is that this preprocessing is done in R. It also seems that R can't handle large amounts of data loaded in memory at once.
>
>  From what I see, data comes with some irregularities in its values (missing values, invalid rows in X/Y matrices), and this makes it difficult to use Linear Estimators right away; this is why the preprocessing exists. DatABEL seems to be the R tool (implemented in C++) that can do fast pre-processing of big sets of data. Well, I think that DatABEL only does the reading and writing of files in C++ (called filevector), while the pre-processing functions are defined and implemented in R. Am I correct?
>
>
> My Problems:
> This is where my troubles start. Since I am trying to make this tool usable for the GenABEL community while still being able to handle TERABYTES of data with fast computations, I would really like to include the preprocessing of X and Y into my C++ workflow. To solve the memory and performance limitations of R, I am trying to load the data from disk from within C++. Since I am performing my estimator function in C++, it expects those matrices to have numbers that can be used for computation. Assuming that data must be preprocessed to be able to get valid matrices with usable numbers, I have the following options:
>
> A)
> For performance reasons, I was considering having the data already pre-processed in disk files. Is this feasible, (preprocessed data would take at most as much space in disk as original data, is this cumbersome)?
>
> B)
> If there are only a few preprocessing functions that people use, I could re-implement them inside C++ and use them on the fly while loading the data from disk. This would be more difficult if everyone has their own customized R pre-processing functions.
>
> C)
> Another alternative is to allow users to use their own R pre-processing functions that pre-process the data. I would then go about preprocessing on the fly from inside C++ by doing calls back to R. This would be slower and harder to do than B).
>
> D)
> If DatABEL really does all the necessary pre-processing from inside C++, I could just use it directly, or allow the user to specify what to use, and would not need to re-implement the pre-processing functions. It seems, though, that pre-processing the data into DatABEL filevector format takes from 30 minutes to an hour.
>
>
> I would really appreciate any help that would clarify my understanding of how the pre-processing of data works and where it fits in the work-flow.
>
> Best regards,
>
> - Alvaro Frank


From sharapovsodbo at gmail.com  Thu Jul 25 08:44:17 2013
From: sharapovsodbo at gmail.com (=?KOI8-R?B?88/Ews8g+8HSwdDP1w==?=)
Date: Thu, 25 Jul 2013 13:44:17 +0700
Subject: [GenABEL-dev] using reshuffle
In-Reply-To: <CAPF08KuVbwYG2adNfsnMcZLjTM7SAm34p0iAquuX4W-onw_sgg@mail.gmail.com>
References: <CAHX9t6JoBdSrNeNyWytYrJUKej5Uhb8LvZYK-6z8JcZHr9U8+g@mail.gmail.com>
 <CAHX9t6LC7AKS=tZO1PK69cNkWcNwtE3X6pfNtEiwZfQ-y8akMQ@mail.gmail.com>
 <CAPF08KuVbwYG2adNfsnMcZLjTM7SAm34p0iAquuX4W-onw_sgg@mail.gmail.com>
Message-ID: <CAPF08Ks8demySdbMimZcmuCkNwhVakvrFQ-71j6gnEi+hyGq8w@mail.gmail.com>

Dear all!
I committed the newest version of reshuffle.
Now reshuffle works 2x faster! =)
Reasons:

  -- ostringstream oss: caches the output

  -- moved these out of the loops and placed them above:
     double* buf = new double[per_trait_per_snp];
     char s[30];

  --(int64_t) blablabla instead of (int64_t)bla + (int64_t)bla +
(int64_t)bla
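
The first two points above can be sketched as follows. This is an illustration, not the code that was committed: the function name convert_records and the "%.15g" format are assumptions, and a std::vector stands in for the new[]-allocated buffer.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdio>
#include <ios>
#include <istream>
#include <sstream>
#include <string>
#include <vector>

// Convert binary records of `per_trait_per_snp` doubles each into
// tab-separated text lines. The read buffer and the scratch string are
// created once, before the loops, and all text is collected in one
// ostringstream so that it can be written out in a single operation.
std::string convert_records(std::istream& in, std::size_t per_trait_per_snp) {
    assert(per_trait_per_snp > 0);
    std::vector<double> buf(per_trait_per_snp);  // allocated once, reused per record
    char s[30];                                  // scratch space, also outside the loops
    std::ostringstream oss;                      // output cache
    const std::streamsize nbytes =
        static_cast<std::streamsize>(per_trait_per_snp * sizeof(double));
    while (in.read(reinterpret_cast<char*>(buf.data()), nbytes)) {
        for (std::size_t j = 0; j < per_trait_per_snp; ++j) {
            std::snprintf(s, sizeof s, "%.15g", buf[j]);
            oss << s << (j + 1 < per_trait_per_snp ? '\t' : '\n');
        }
    }
    return oss.str();
}
```

Allocating inside the inner loop would pay for new/delete once per record. Collecting the text first replaces many small writes with one large one, at the cost of holding the formatted text in memory.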

To find the "hot spots" in reshuffle, I used

the GNU profiler (gprof)
the GNU coverage testing tool (gcov)

Very useful tools for finding the right places in a program to optimize!

Now 5 GB of CLAK-GWAS output is converted to 16 GB of txt files in 380 sec,
or about 6 minutes.
Machine: Intel Core i7 930; 8 GB RAM (it is not a cluster node; I think on
a cluster node reshuffle would run faster =)

There are still problems with extracting heritability and writing slim data.
I'll check them soon.




-- 
*_________________________________*
*
*With best regards

Sodbo Zh. Sharapov
Phone:  +79831347688
Email:    sharapovsodbo at gmail.com
             sharapov at bionet.nsc.ru
Skype:   sharapovsodbo

From lennart at karssen.org  Thu Jul 25 17:08:17 2013
From: lennart at karssen.org (L.C. Karssen)
Date: Thu, 25 Jul 2013 17:08:17 +0200
Subject: [GenABEL-dev] using reshuffle
In-Reply-To: <CAPF08Ks8demySdbMimZcmuCkNwhVakvrFQ-71j6gnEi+hyGq8w@mail.gmail.com>
References: <CAHX9t6JoBdSrNeNyWytYrJUKej5Uhb8LvZYK-6z8JcZHr9U8+g@mail.gmail.com>
 <CAHX9t6LC7AKS=tZO1PK69cNkWcNwtE3X6pfNtEiwZfQ-y8akMQ@mail.gmail.com>
 <CAPF08KuVbwYG2adNfsnMcZLjTM7SAm34p0iAquuX4W-onw_sgg@mail.gmail.com>
 <CAPF08Ks8demySdbMimZcmuCkNwhVakvrFQ-71j6gnEi+hyGq8w@mail.gmail.com>
Message-ID: <51F13F61.7040604@karssen.org>

Hi Sodbo,

On 25-07-13 08:44, Sodbo Sharapov wrote:
> Dear all!
> I commited newest version of reshuffle
> Now reshuffle works 2x faster!=)

That's always good news!

> Reasons:
> 
>   --ostringstream oss: outputs cache
> 
>   --exclude from cycle's and  put them upper
>      double* buf = new double[per_trait_per_snp];
>      char s[30];
> 
>   --(int64_t) blablabla instead of (int64_t)bla + (int64_t)bla +
> (int64_t)bla
> 
> To find "hot spots" in reshuffle, I used
> 
> GNU Profiler
> GNU Coverage testing tool

I vaguely remember having heard of the coverage testing tool, but I've
never used it. Interesting!

> 
> Very useful tools to find right places in programm to optimizate!
> 
> Now 5Gb CLAK-GWAS output convert to 16 Gb txt files for 380 sec or 6
> minutes.
> Machine: Intel Core i7 930; 8Gb RAM (it is not cluster's node, I think on
> cluster's node reshuffle's run would be faster=)
> 
> There are problems with extract heritability and write slim data.
> I'll check soon
> 

Thanks for all the work!

Lennart.



-- 
-----------------------------------------------------------------
L.C. Karssen
Utrecht
The Netherlands

lennart at karssen.org
http://blog.karssen.org

Please do not send me Word or PowerPoint files!
See http://www.gnu.org/philosophy/no-word-attachments.nl.html
------------------------------------------------------------------


From lennart at karssen.org  Tue Jul 30 09:32:42 2013
From: lennart at karssen.org (L.C. Karssen)
Date: Tue, 30 Jul 2013 09:32:42 +0200
Subject: [GenABEL-dev] Precision and scientific notation in ProbABEL
Message-ID: <51F76C1A.2000205@karssen.org>

Dear list,

I'm finalising version 0.4.0 of ProbABEL and there are two things I'd
like your opinion on:

1) with what precision should we print the betas, standard errors and
Chi^2 values to the output files?

2) Should we use scientific notation in the output (for betas, standard
errors and Chi^2)?

In ProbABEL v0.3.0 and earlier, output was simply sent to cout without
any explicit formatting. In practice this usually led to 6 significant
digits, but sometimes fewer. My proposal is to fix the precision at 6
significant digits.

Regarding item 2): most of the betas I see are in the range between 0
and 10, although in case of no effect betas can be of the order of
1e-2, 1e-3. All in all, I don't think switching to scientific notation
will improve the output.
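
For reference, a small sketch of the two formatting options under discussion. The helper names are hypothetical and this is not ProbABEL's output code. With the default (general) floating-point format, the stream precision is the maximum number of significant digits and trailing zeros are dropped, which is consistent with sometimes seeing fewer than 6 digits. With std::scientific, the precision counts the digits after the decimal point.

```cpp
#include <cassert>
#include <iomanip>
#include <sstream>
#include <string>

// Default (general) format: at most `sig` significant digits, trailing
// zeros removed, and automatic switch to exponent form for very small
// or very large values.
std::string format_general(double v, int sig) {
    assert(sig > 0);
    std::ostringstream oss;
    oss << std::setprecision(sig) << v;
    return oss.str();
}

// Scientific format: always exactly `sig` significant digits
// (one before the decimal point, sig - 1 after it).
std::string format_scientific(double v, int sig) {
    assert(sig > 0);
    std::ostringstream oss;
    oss << std::scientific << std::setprecision(sig - 1) << v;
    return oss.str();
}
```

For example, format_general(0.0338250175118446, 6) yields "0.033825" (five digits shown, because the sixth is a trailing zero), while format_scientific(0.0338250175118446, 6) yields "3.38250e-02".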


What are your opinions?


Thanks,

Lennart.


From nicola.pirastu at burlo.trieste.it  Tue Jul 30 10:33:04 2013
From: nicola.pirastu at burlo.trieste.it (Nicola Pirastu)
Date: Tue, 30 Jul 2013 10:33:04 +0200
Subject: [GenABEL-dev] Precision and scientific notation in ProbABEL
In-Reply-To: <51F76C1A.2000205@karssen.org>
References: <51F76C1A.2000205@karssen.org>
Message-ID: <5B307B30-9137-4A17-8A36-D43FC2818B94@burlo.trieste.it>

Dear Lennart,

I think that switching to scientific notation is not really necessary and could lead to a small loss of precision, unless of course you still
use 6 significant digits, in which case it would only reduce the number of leading zeros in the values.
If, for example, we chose scientific notation with 3 significant digits, this would not affect the final results much, but we could be asked to submit more digits and
would not be able to comply.
So, to summarize: as long as it has no effect on the performance of ProbABEL, 6 significant digits without scientific notation is fine.

Best

Nicola


Dr. Nicola Pirastu PhD
Research Fellow
Medical Sciences, Chirurgical and Health Department
University of Trieste
Medical Genetics
IRCCS Burlo Garofolo
Via dell'Istria 65/1
34137 Italy
tel. +390403785539

