[GenABEL-dev] [Genabel-commits] r1264 - in pkg/OmicABEL: . doc src src/float2double
Yurii Aulchenko
yurii.aulchenko at gmail.com
Mon Jul 1 13:59:32 CEST 2013
Diego, thanks for reacting so quickly and arranging the float2double
converter for filevector files!
Two questions/suggestions:
1) I wonder whether float2double is a good name - could that name
already be taken? Should we make it more specific, to show that it is
related to filevector?
2) You check whether the inFile type is float and abort execution if
it is not. Should the program also report which format the data are
actually in, e.g. "The inFile contains filevector-INT, but I can only
convert filevector-FLOAT to filevector-DOUBLE"?
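A minimal sketch of how such a message could be built. The enum values
and the names fv_type_name and fv_format_type_error are made up for
illustration; the real type codes would have to come from databel.h
(or the filevector headers), and only FLOAT_TYPE and DOUBLE_TYPE are
known from the commit itself.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical type codes, for illustration only. The real values must be
 * taken from databel.h / the filevector headers. */
enum fv_type {
    FV_UNSIGNED_SHORT_INT = 1,
    FV_SHORT_INT,
    FV_UNSIGNED_INT,
    FV_INT,
    FV_FLOAT,
    FV_DOUBLE
};

/* Map a type code to a printable name; unknown codes get a fallback. */
const char *fv_type_name(int type)
{
    switch (type) {
    case FV_UNSIGNED_SHORT_INT: return "UNSIGNED_SHORT_INT";
    case FV_SHORT_INT:          return "SHORT_INT";
    case FV_UNSIGNED_INT:       return "UNSIGNED_INT";
    case FV_INT:                return "INT";
    case FV_FLOAT:              return "FLOAT";
    case FV_DOUBLE:             return "DOUBLE";
    default:                    return "UNKNOWN";
    }
}

/* Write the diagnostic into buf and return the snprintf result, i.e. the
 * number of characters the full message needs (excluding the final '\0'). */
int fv_format_type_error(char *buf, size_t len, const char *in_file, int type)
{
    assert(buf != NULL && len > 0);
    return snprintf(buf, len,
                    "%s contains filevector-%s, but I can only convert "
                    "filevector-FLOAT to filevector-DOUBLE",
                    in_file, fv_type_name(type));
}
```

In float2double.c the result could be passed to fprintf( stderr, ... )
in the branch that currently prints the "should include \"float\" data"
message, before exit( EXIT_FAILURE ).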
These are suggestions for discussion - I do not have a strong opinion here.
YA
----------------------
Yurii Aulchenko
(sent from mobile device)
On 1 Jul 2013, at 10:56, "noreply at r-forge.r-project.org"
<noreply at r-forge.r-project.org> wrote:
> Author: dfabregat
> Date: 2013-07-01 10:56:30 +0200 (Mon, 01 Jul 2013)
> New Revision: 1264
>
> Added:
> pkg/OmicABEL/src/float2double/
> pkg/OmicABEL/src/float2double/float2double.c
> Modified:
> pkg/OmicABEL/Makefile
> pkg/OmicABEL/doc/HOWTO
> Log:
> Adding the program float2double to translate DatABEL
> "float" data into DatABEL "double" data.
>
>
> Modified: pkg/OmicABEL/Makefile
> ===================================================================
> --- pkg/OmicABEL/Makefile 2013-07-01 08:50:00 UTC (rev 1263)
> +++ pkg/OmicABEL/Makefile 2013-07-01 08:56:30 UTC (rev 1264)
> @@ -2,8 +2,10 @@
>
> SRCDIR = ./src
> RESH_SRCDIR = ./src/reshuffle
> +F2D_SRCDIR = ./src/float2double
> CLAKGWAS = ./bin/CLAK-GWAS
> RESHUFFLE = ./bin/reshuffle
> +F2D = ./bin/float2double
>
> #QUICK and DIRTY
> CXX=g++
> @@ -15,11 +17,13 @@
> SRCS = $(SRCDIR)/CLAK_GWAS.c $(SRCDIR)/fgls_chol.c $(SRCDIR)/fgls_eigen.c $(SRCDIR)/wrappers.c $(SRCDIR)/timing.c $(SRCDIR)/statistics.c $(SRCDIR)/REML.c $(SRCDIR)/optimization.c $(SRCDIR)/ooc_BLAS.c $(SRCDIR)/double_buffering.c $(SRCDIR)/utils.c $(SRCDIR)/GWAS.c $(SRCDIR)/databel.c
> OBJS = $(SRCS:.c=.o)
> RESH_SRCS=$(RESH_SRCDIR)/main.cpp $(RESH_SRCDIR)/iout_file.cpp $(RESH_SRCDIR)/Parameters.cpp $(RESH_SRCDIR)/reshuffle.cpp $(RESH_SRCDIR)/test.cpp
> -RESH_OBJS = $(RESH_SRCS:.cpp=.o)
> +RESH_OBJS=$(RESH_SRCS:.cpp=.o)
> +F2D_SRCS=$(F2D_SRCDIR)/float2double.c
> +F2D_OBJS=$(F2D_SRCS:.c=.o) $(SRCDIR)/databel.o $(SRCDIR)/wrappers.o
>
> .PHONY: all clean
>
> -all: ./bin/ $(CLAKGWAS) $(RESHUFFLE)
> +all: ./bin/ $(CLAKGWAS) $(RESHUFFLE) $(F2D)
>
> ./bin:
> mkdir bin
> @@ -31,15 +35,19 @@
> cd $(RESH_SRCDIR)
> $(CXX) $^ -o $@
>
> +$(F2D): $(F2D_OBJS)
> + cd $(F2D_SRCDIR)
> + $(CC) $^ -o $@
> +
> # Dirty, improve
> platform=Linux
> bindistDir=OmicABEL-$(platform)-bin
> -bindist: ./bin/ $(CLAKGWAS) $(RESHUFFLE)
> +bindist: ./bin/ $(CLAKGWAS) $(RESHUFFLE) $(F2D)
> rm -rf $(bindistDir)
> mkdir $(bindistDir)
> mkdir $(bindistDir)/bin/
> mkdir $(bindistDir)/doc/
> - cp -a $(CLAKGWAS) $(RESHUFFLE) $(bindistDir)/bin/
> + cp -a $(CLAKGWAS) $(RESHUFFLE) $(F2D) $(bindistDir)/bin/
> cp -a COPYING LICENSE README DISCLAIMER.$(platform) $(bindistDir)
> cp -a doc/README-reshuffle doc/INSTALL doc/HOWTO $(bindistDir)/doc
> tar -czvf $(bindistDir).tgz $(bindistDir)
> @@ -52,6 +60,8 @@
> $(RM) $(SRCDIR)/*opari_GPU*
> $(RM) $(RESH_OBJS)
> $(RM) $(RESHUFFLE)
> + $(RM) $(F2D_OBJS)
> + $(RM) $(F2D)
>
>
> src/CLAK_GWAS.o: src/CLAK_GWAS.c src/wrappers.h src/utils.h src/GWAS.h \
>
> Modified: pkg/OmicABEL/doc/HOWTO
> ===================================================================
> --- pkg/OmicABEL/doc/HOWTO 2013-07-01 08:50:00 UTC (rev 1263)
> +++ pkg/OmicABEL/doc/HOWTO 2013-07-01 08:56:30 UTC (rev 1264)
> @@ -5,6 +5,9 @@
>
> * CLAK-GWAS: the program to run GWAS analyses (through CLAK-Chol or CLAK-Eig)
> * reshuffle: the program to extract the output of CLAK-GWAS into text format
> +* float2double: the program to translate databel files (*.fvi, *.fvd)
> + in single precision "float" format into double precision
> + "double" format.
>
> The output produced by CLAK-GWAS is kept in a compact binary format
> for performance reasons. The user can then use "reshuffle" to
> @@ -21,6 +24,10 @@
>
> http://www.genabel.org/packages/OmicABEL
>
> +If you already prepared your data in DatABEL format, but you used
> +single precision (float) data. You can make use of float2double
> +to transform it into double precision (double) data.
> +
> If you need help, please contact us, or use the GenABEL project forum
>
> http://forum.genabel.org
> @@ -40,7 +47,7 @@
>
> The example in the tutorial also provides a basic example on using OmicABEL
> to run your GWAS analyses. Here we detail the options of CLAK-GWAS.
> -The complete list of options for CLAK-GWAS is avaliable through the command
> +The complete list of options for CLAK-GWAS is available through the command
>
> ./CLAK-GWAS -h
>
> @@ -84,3 +91,5 @@
>
>
> For a detailed description of "reshuffle", please refer to doc/README-reshuffle
> +
> +
>
> Added: pkg/OmicABEL/src/float2double/float2double.c
> ===================================================================
> --- pkg/OmicABEL/src/float2double/float2double.c (rev 0)
> +++ pkg/OmicABEL/src/float2double/float2double.c 2013-07-01 08:56:30 UTC (rev 1264)
> @@ -0,0 +1,142 @@
> +/*
> + * Copyright (c) 2010-2013, Diego Fabregat-Traver and Paolo Bientinesi.
> + * All rights reserved.
> + *
> + * This file is part of OmicABEL.
> + *
> + * OmicABEL is free software: you can redistribute it and/or modify
> + * it under the terms of the GNU General Public License as published by
> + * the Free Software Foundation, either version 3 of the License, or
> + * (at your option) any later version.
> + *
> + * OmicABEL is distributed in the hope that it will be useful,
> + * but WITHOUT ANY WARRANTY; without even the implied warranty of
> + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
> + * GNU General Public License for more details.
> + *
> + * You should have received a copy of the GNU General Public License
> + * along with OmicABEL. If not, see <http://www.gnu.org/licenses/>.
> + *
> + *
> + * Coded by:
> + * Diego Fabregat-Traver (fabregat at aices.rwth-aachen.de)
> + */
> +
> +#include <stdio.h>
> +#include <stdlib.h>
> +#include <string.h>
> +
> +#include "../wrappers.h"
> +#include "../databel.h"
> +
> +#define MB (1L<<20)
> +#define STR_BUFFER_SIZE 256
> +
> +int main( int argc, char *argv[] )
> +{
> + char fin_path_fvi[STR_BUFFER_SIZE],
> + fin_path_fvd[STR_BUFFER_SIZE],
> + fout_path_fvi[STR_BUFFER_SIZE],
> + fout_path_fvd[STR_BUFFER_SIZE];
> + FILE *fin, *fout;
> + struct databel_fvi *databel_in, *databel_out;
> +
> + float *datain;
> + double *dataout;
> + size_t buff_size = 256*MB;
> +
> + long long int nelems;
> + int nelems_in_buff, nelems_to_write;
> + int header_data_size;
> +
> + int i, j, out;
> +
> + if ( argc != 3 )
> + {
> + fprintf( stderr, "Usage: %s floatFileIn doubleFileOut\n", argv[0] );
> + exit( EXIT_FAILURE );
> + }
> +
> + snprintf( fin_path_fvi, STR_BUFFER_SIZE, "%s.fvi", argv[1] );
> + snprintf( fin_path_fvd, STR_BUFFER_SIZE, "%s.fvd", argv[1] );
> + snprintf( fout_path_fvi, STR_BUFFER_SIZE, "%s.fvi", argv[2] );
> + snprintf( fout_path_fvd, STR_BUFFER_SIZE, "%s.fvd", argv[2] );
> +
> + // FVI files
> + databel_in = load_databel_fvi( fin_path_fvi );
> + if ( databel_in->fvi_header.type != FLOAT_TYPE )
> + {
> + fprintf( stderr, "Input databel file(s) %s should include \"float\" data\n", argv[1]);
> + exit( EXIT_FAILURE );
> + }
> + databel_out = (databel_fvi *) fgls_malloc( sizeof(databel_fvi) );
> + // Header
> + databel_out->fvi_header.type = DOUBLE_TYPE;
> + databel_out->fvi_header.nelements = databel_in->fvi_header.nelements;
> + databel_out->fvi_header.numObservations = databel_in->fvi_header.numObservations;
> + databel_out->fvi_header.numVariables = databel_in->fvi_header.numVariables;
> + databel_out->fvi_header.bytesPerRecord = sizeof( double );
> + databel_out->fvi_header.bitsPerRecord = databel_out->fvi_header.bytesPerRecord * 8;
> + databel_out->fvi_header.namelength = databel_in->fvi_header.namelength;
> + for ( i = 0; i < RESERVEDSPACE; i++ )
> + databel_out->fvi_header.reserved[i] = '\0';
> + // Labels
> + header_data_size = (databel_out->fvi_header.numVariables + databel_out->fvi_header.numObservations ) *
> + databel_out->fvi_header.namelength * sizeof(char);
> + databel_out->fvi_data = (char *) fgls_malloc ( header_data_size );
> + memcpy( databel_out->fvi_data, databel_in->fvi_data, header_data_size );
> +
> + // Write
> + fout = fgls_fopen( fout_path_fvi, "wb" );
> + out = fwrite( &databel_out->fvi_header, sizeof(databel_fvi_header), 1, fout);
> + if ( out != 1 )
> + {
> + fprintf(stderr, "Error writing fvi header\n" );
> + exit( EXIT_FAILURE );
> + }
> + out = fwrite( databel_out->fvi_data,
> + databel_out->fvi_header.namelength * sizeof(char),
> + databel_out->fvi_header.numVariables + databel_out->fvi_header.numObservations,
> + fout);
> + if ( out != (databel_out->fvi_header.numVariables + databel_out->fvi_header.numObservations) )
> + {
> + fprintf(stderr, "Error writing fvi data\n" );
> + exit( EXIT_FAILURE );
> + }
> + fclose( fout );
> +
> + // FVD
> + fin = fgls_fopen( fin_path_fvd, "rb" );
> + fout = fgls_fopen( fout_path_fvd, "wb" );
> + // buff_size determines the size of the buffer for the "double" array.
> + // For the same amount of elements, float needs half the memory space
> + datain = (float *) fgls_malloc( buff_size / 2 );
> + dataout = (double *) fgls_malloc( buff_size );
> +
> + nelems = databel_out->fvi_header.numVariables * databel_out->fvi_header.numObservations; // total elems in file
> + nelems_in_buff = buff_size / sizeof(double);
> + for ( i = 0; i < nelems; i += nelems_in_buff )
> + {
> + nelems_to_write = ((nelems - i) >= nelems_in_buff) ? nelems_in_buff : nelems - i;
> + if ( fread( datain, sizeof(float), nelems_to_write, fin ) != nelems_to_write )
> + {
> + fprintf( stderr, "Error reading data from %s\n", fin_path_fvd );
> + exit( EXIT_FAILURE );
> + }
> + for ( j = 0; j < nelems_to_write; j++ )
> + dataout[j] = (double)datain[j];
> + if ( fwrite( dataout, sizeof(double), nelems_to_write, fout ) != nelems_to_write )
> + {
> + fprintf( stderr, "Error writing data to %s\n", fout_path_fvd );
> + exit( EXIT_FAILURE );
> + }
> + }
> + fclose( fin );
> + fclose( fout );
> + free( datain );
> + free( dataout );
> + free_databel_fvi( &databel_in );
> + free_databel_fvi( &databel_out );
> +
> + return 0;
> +}
>
> _______________________________________________
> Genabel-commits mailing list
> Genabel-commits at lists.r-forge.r-project.org
> https://lists.r-forge.r-project.org/cgi-bin/mailman/listinfo/genabel-commits