[GenABEL-dev] [Genabel-commits] r1264 - in pkg/OmicABEL: . doc src src/float2double

Mon Jul 1 13:59:32 CEST 2013

Diego, thanks for reacting so quickly and arranging the float2double
converter for filevector files!

Two questions/suggestions:

1) I wonder whether float2double is a good name - could it be that the
name is already taken? Should we make it more specific, so it is clear
that this is related to filevector?

2) You check whether the inFile type is != float and abort execution if
it is. Should the program also report what format the data is in? E.g.
"The inFile contains filevector-INT, but I can only convert
filevector-FLOAT to filevector-DOUBLE"?
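
To make suggestion 2 concrete, here is a minimal sketch of how the
message could be built. The numeric type codes, the FV_* names, and the
helpers fv_type_name and check_float_input are assumptions for
illustration only; they would need to be checked against (or replaced
by) the constants actually defined in databel.h.

```c
#include <stdio.h>

/* ASSUMED filevector type codes; verify against databel.h before use. */
enum fv_type {
    FV_UNSIGNED_SHORT_INT = 1,
    FV_SHORT_INT          = 2,
    FV_UNSIGNED_INT       = 3,
    FV_INT                = 4,
    FV_FLOAT              = 5,
    FV_DOUBLE             = 6,
    FV_SIGNED_CHAR        = 7,
    FV_UNSIGNED_CHAR      = 8
};

/* Hypothetical helper: map a type code to a readable name. */
static const char *fv_type_name( int type )
{
    switch ( type )
    {
        case FV_UNSIGNED_SHORT_INT: return "UNSIGNED_SHORT_INT";
        case FV_SHORT_INT:          return "SHORT_INT";
        case FV_UNSIGNED_INT:       return "UNSIGNED_INT";
        case FV_INT:                return "INT";
        case FV_FLOAT:              return "FLOAT";
        case FV_DOUBLE:             return "DOUBLE";
        case FV_SIGNED_CHAR:        return "SIGNED_CHAR";
        case FV_UNSIGNED_CHAR:      return "UNSIGNED_CHAR";
        default:                    return "UNKNOWN";
    }
}

/* Hypothetical helper: returns 0 and an empty message if the input is
 * float; otherwise returns -1 and writes an explanation into msg. */
static int check_float_input( int type, char *msg, size_t msg_size )
{
    if ( type == FV_FLOAT )
    {
        if ( msg_size > 0 )
            msg[0] = '\0';
        return 0;
    }
    snprintf( msg, msg_size,
              "The inFile contains filevector-%s, but I can only convert "
              "filevector-FLOAT to filevector-DOUBLE",
              fv_type_name( type ) );
    return -1;
}
```

In float2double.c this could be called with databel_in->fvi_header.type
right after load_databel_fvi, printing msg to stderr before exiting.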

These are suggestions for discussion - I do not have a strong opinion here.

YA

----------------------
Yurii Aulchenko
(sent from mobile device)

On 1 Jul 2013, at 10:56, "noreply at r-forge.r-project.org"
<noreply at r-forge.r-project.org> wrote:

> Author: dfabregat
> Date: 2013-07-01 10:56:30 +0200 (Mon, 01 Jul 2013)
> New Revision: 1264
>
> Added:
>   pkg/OmicABEL/src/float2double/
>   pkg/OmicABEL/src/float2double/float2double.c
> Modified:
>   pkg/OmicABEL/Makefile
>   pkg/OmicABEL/doc/HOWTO
> Log:
> Adding the program float2double to translate DatABEL
> "float" data into DatABEL "double" data.
>
>
> Modified: pkg/OmicABEL/Makefile
> ===================================================================
> --- pkg/OmicABEL/Makefile    2013-07-01 08:50:00 UTC (rev 1263)
> +++ pkg/OmicABEL/Makefile    2013-07-01 08:56:30 UTC (rev 1264)
> @@ -2,8 +2,10 @@
>
> SRCDIR = ./src
> RESH_SRCDIR = ./src/reshuffle
> +F2D_SRCDIR  = ./src/float2double
> CLAKGWAS  = ./bin/CLAK-GWAS
> RESHUFFLE = ./bin/reshuffle
> +F2D       = ./bin/float2double
>
> #QUICK and DIRTY
> CXX=g++
> @@ -15,11 +17,13 @@
> SRCS = $(SRCDIR)/CLAK_GWAS.c $(SRCDIR)/fgls_chol.c $(SRCDIR)/fgls_eigen.c $(SRCDIR)/wrappers.c $(SRCDIR)/timing.c $(SRCDIR)/statistics.c $(SRCDIR)/REML.c $(SRCDIR)/optimization.c $(SRCDIR)/ooc_BLAS.c $(SRCDIR)/double_buffering.c $(SRCDIR)/utils.c $(SRCDIR)/GWAS.c $(SRCDIR)/databel.c
> OBJS = $(SRCS:.c=.o)
> RESH_SRCS=$(RESH_SRCDIR)/main.cpp $(RESH_SRCDIR)/iout_file.cpp $(RESH_SRCDIR)/Parameters.cpp $(RESH_SRCDIR)/reshuffle.cpp $(RESH_SRCDIR)/test.cpp
> -RESH_OBJS = $(RESH_SRCS:.cpp=.o)
> +RESH_OBJS=$(RESH_SRCS:.cpp=.o)
> +F2D_SRCS=$(F2D_SRCDIR)/float2double.c
> +F2D_OBJS=$(F2D_SRCS:.c=.o) $(SRCDIR)/databel.o $(SRCDIR)/wrappers.o
>
> .PHONY: all clean
>
> -all: ./bin/ $(CLAKGWAS) $(RESHUFFLE)
> +all: ./bin/ $(CLAKGWAS) $(RESHUFFLE) $(F2D)
>
> ./bin:
>    mkdir bin
> @@ -31,15 +35,19 @@
>    cd $(RESH_SRCDIR)
>    $(CXX) $^ -o $@
>
> +$(F2D): $(F2D_OBJS)
> +    cd $(F2D_SRCDIR)
> +    $(CC) $^ -o $@
> +
> # Dirty, improve
> platform=Linux
> bindistDir=OmicABEL-$(platform)-bin
> -bindist: ./bin/ $(CLAKGWAS) $(RESHUFFLE)
> +bindist: ./bin/ $(CLAKGWAS) $(RESHUFFLE) $(F2D)
>    rm -rf $(bindistDir)
>    mkdir $(bindistDir)
>    mkdir $(bindistDir)/bin/
>    mkdir $(bindistDir)/doc/
> -    cp -a $(CLAKGWAS) $(RESHUFFLE) $(bindistDir)/bin/
> +    cp -a $(CLAKGWAS) $(RESHUFFLE) $(F2D) $(bindistDir)/bin/
>    cp -a COPYING LICENSE README DISCLAIMER.$(platform) $(bindistDir)
>    cp -a doc/README-reshuffle doc/INSTALL doc/HOWTO $(bindistDir)/doc
>    tar -czvf $(bindistDir).tgz $(bindistDir)
> @@ -52,6 +60,8 @@
>    $(RM) $(SRCDIR)/*opari_GPU*
>    $(RM) $(RESH_OBJS)
>    $(RM) $(RESHUFFLE)
> +    $(RM) $(F2D_OBJS)
> +    $(RM) $(F2D)
>
>
> src/CLAK_GWAS.o: src/CLAK_GWAS.c src/wrappers.h src/utils.h src/GWAS.h \
>
> Modified: pkg/OmicABEL/doc/HOWTO
> ===================================================================
> --- pkg/OmicABEL/doc/HOWTO    2013-07-01 08:50:00 UTC (rev 1263)
> +++ pkg/OmicABEL/doc/HOWTO    2013-07-01 08:56:30 UTC (rev 1264)
> @@ -5,6 +5,9 @@
>
> * CLAK-GWAS: the program to run GWAS analyses (through CLAK-Chol or CLAK-Eig)
> * reshuffle: the program to extract the output of CLAK-GWAS into text format
> +* float2double: the program to translate databel files (*.fvi, *.fvd)
> +                in single precision "float" format into double precision
> +                "double" format.
>
> The output produced by CLAK-GWAS is kept in a compact binary format
> for performance reasons. The user can then use "reshuffle" to
> @@ -21,6 +24,10 @@
>
> http://www.genabel.org/packages/OmicABEL
>
> +If you already prepared your data in DatABEL format but used
> +single precision (float) data, you can use float2double
> +to transform it into double precision (double) data.
> +
> If you need help, please contact us, or use the GenABEL project forum
>
> http://forum.genabel.org
> @@ -40,7 +47,7 @@
>
> The example in the tutorial also provides a basic example on using OmicABEL
> to run your GWAS analyses. Here we detail the options of CLAK-GWAS.
> -The complete list of options for CLAK-GWAS is avaliable through the command
> +The complete list of options for CLAK-GWAS is available through the command
>
> ./CLAK-GWAS -h
>
> @@ -84,3 +91,5 @@
>
>
> For a detailed description of "reshuffle", please refer to doc/README-reshuffle
> +
> +
>
> Added: pkg/OmicABEL/src/float2double/float2double.c
> ===================================================================
> --- pkg/OmicABEL/src/float2double/float2double.c                            (rev 0)
> +++ pkg/OmicABEL/src/float2double/float2double.c    2013-07-01 08:56:30 UTC (rev 1264)
> @@ -0,0 +1,142 @@
> +/*
> + * Copyright (c) 2010-2013, Diego Fabregat-Traver and Paolo Bientinesi.
> + * All rights reserved.
> + *
> + * This file is part of OmicABEL.
> + *
> + * OmicABEL is free software: you can redistribute it and/or modify
> + * it under the terms of the GNU General Public License as published by
> + * the Free Software Foundation, either version 3 of the License, or
> + * (at your option) any later version.
> + *
> + * OmicABEL is distributed in the hope that it will be useful,
> + * but WITHOUT ANY WARRANTY; without even the implied warranty of
> + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
> + * GNU General Public License for more details.
> + *
> + * You should have received a copy of the GNU General Public License
> + * along with OmicABEL. If not, see <http://www.gnu.org/licenses/>.
> + *
> + *
> + * Coded by:
> + *   Diego Fabregat-Traver (fabregat at aices.rwth-aachen.de)
> + */
> +
> +#include <stdio.h>
> +#include <stdlib.h>
> +#include <string.h>
> +
> +#include "../wrappers.h"
> +#include "../databel.h"
> +
> +#define MB (1L<<20)
> +#define STR_BUFFER_SIZE 256
> +
> +int main( int argc, char *argv[] )
> +{
> +    char  fin_path_fvi[STR_BUFFER_SIZE],
> +          fin_path_fvd[STR_BUFFER_SIZE],
> +         fout_path_fvi[STR_BUFFER_SIZE],
> +         fout_path_fvd[STR_BUFFER_SIZE];
> +    FILE *fin, *fout;
> +    struct databel_fvi *databel_in, *databel_out;
> +
> +    float *datain;
> +    double *dataout;
> +    size_t buff_size = 256*MB;
> +
> +    long long int nelems;
> +    int nelems_in_buff, nelems_to_write;
> +    int header_data_size;
> +
> +    int i, j, out;
> +
> +    if ( argc != 3 )
> +    {
> +        fprintf( stderr, "Usage: %s floatFileIn doubleFileOut\n", argv[0] );
> +        exit( EXIT_FAILURE );
> +    }
> +
> +    snprintf(  fin_path_fvi, STR_BUFFER_SIZE, "%s.fvi", argv[1] );
> +    snprintf(  fin_path_fvd, STR_BUFFER_SIZE, "%s.fvd", argv[1] );
> +    snprintf( fout_path_fvi, STR_BUFFER_SIZE, "%s.fvi", argv[2] );
> +    snprintf( fout_path_fvd, STR_BUFFER_SIZE, "%s.fvd", argv[2] );
> +
> +    // FVI files
> +    databel_in = load_databel_fvi( fin_path_fvi );
> +    if ( databel_in->fvi_header.type != FLOAT_TYPE )
> +    {
> +        fprintf( stderr, "Input databel file(s) %s should include \"float\" data\n", argv[1]);
> +        exit( EXIT_FAILURE );
> +    }
> +    databel_out = (databel_fvi *) fgls_malloc( sizeof(databel_fvi) );
> +    // Header
> +    databel_out->fvi_header.type = DOUBLE_TYPE;
> +    databel_out->fvi_header.nelements       = databel_in->fvi_header.nelements;
> +    databel_out->fvi_header.numObservations = databel_in->fvi_header.numObservations;
> +    databel_out->fvi_header.numVariables    = databel_in->fvi_header.numVariables;
> +    databel_out->fvi_header.bytesPerRecord  = sizeof( double );
> +    databel_out->fvi_header.bitsPerRecord   = databel_out->fvi_header.bytesPerRecord * 8;
> +    databel_out->fvi_header.namelength      = databel_in->fvi_header.namelength;
> +    for ( i = 0; i < RESERVEDSPACE; i++ )
> +        databel_out->fvi_header.reserved[i] = '\0';
> +    // Labels
> +    header_data_size = (databel_out->fvi_header.numVariables + databel_out->fvi_header.numObservations ) *
> +                        databel_out->fvi_header.namelength * sizeof(char);
> +    databel_out->fvi_data = (char *) fgls_malloc ( header_data_size );
> +    memcpy( databel_out->fvi_data, databel_in->fvi_data, header_data_size );
> +
> +    // Write
> +    fout = fgls_fopen( fout_path_fvi, "wb" );
> +    out = fwrite( &databel_out->fvi_header, sizeof(databel_fvi_header), 1, fout);
> +    if ( out != 1 )
> +    {
> +        fprintf(stderr, "Error writing fvi header\n" );
> +        exit( EXIT_FAILURE );
> +    }
> +    out = fwrite( databel_out->fvi_data,
> +                  databel_out->fvi_header.namelength * sizeof(char),
> +                  databel_out->fvi_header.numVariables + databel_out->fvi_header.numObservations,
> +                  fout);
> +    if ( out != (databel_out->fvi_header.numVariables + databel_out->fvi_header.numObservations) )
> +    {
> +        fprintf(stderr, "Error writing fvi data\n" );
> +        exit( EXIT_FAILURE );
> +    }
> +    fclose( fout );
> +
> +    // FVD
> +    fin  = fgls_fopen(  fin_path_fvd, "rb" );
> +    fout = fgls_fopen( fout_path_fvd, "wb" );
> +    // buff_size determines the size of the buffer for the "double" array.
> +    // For the same amount of elements, float needs half the memory space
> +    datain  = (float *)  fgls_malloc( buff_size / 2 );
> +    dataout = (double *) fgls_malloc( buff_size );
> +
> +    nelems = (long long int) databel_out->fvi_header.numVariables * databel_out->fvi_header.numObservations; // elems left to convert (64-bit product)
> +    nelems_in_buff = buff_size / sizeof(double);
> +    for ( ; nelems > 0; nelems -= nelems_to_write )
> +    {
> +        nelems_to_write = (nelems >= nelems_in_buff) ? nelems_in_buff : (int) nelems;
> +        if ( fread( datain, sizeof(float), nelems_to_write, fin ) != nelems_to_write )
> +        {
> +            fprintf( stderr, "Error reading data from %s\n", fin_path_fvd );
> +            exit( EXIT_FAILURE );
> +        }
> +        for ( j = 0; j < nelems_to_write; j++ )
> +            dataout[j] = (double)datain[j];
> +        if ( fwrite( dataout, sizeof(double), nelems_to_write, fout ) != nelems_to_write )
> +        {
> +            fprintf( stderr, "Error writing data to %s\n", fout_path_fvd );
> +            exit( EXIT_FAILURE );
> +        }
> +    }
> +    fclose( fin );
> +    fclose( fout );
> +    free( datain );
> +    free( dataout );
> +    free_databel_fvi( &databel_in );
> +    free_databel_fvi( &databel_out );
> +
> +    return 0;
> +}
>