[datatable-help] fread (sep2) on data with a comma as decimal delimiter

Matthew Dowle mdowle at mdowle.plus.com
Tue Apr 30 20:48:32 CEST 2013


Hi,

Ah yes, fread is locale aware. So if you call Sys.setlocale() on the
LC_NUMERIC category to select a locale whose decimal separator is a
comma, then fread should heed that. This has come up before, either on
S.O. or datatable-help, with an example, and it was successful. Try
searching for "[data.table] Sys.setlocale".

We could add this locale change as an option to data.table, but it
depends on choosing a particular installed locale that has the comma as
its decimal separator, and doing this in a cross-platform way is not
something I know a huge amount about. There was a concern that locale
changes are global, but as far as I know a change only affects the
current R session, and switching back via on.exit() should be safe
enough (as a way to build it in). fread uses a stdlib call to read
floating point numbers (unlike R, which does it itself in its own C
code). It's that stdlib call that is locale aware, and it is quite
convenient (and fast) from the point of view of fread's internals.
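
A rough sketch of the set-then-restore idea, in case it helps. The
helper name with_numeric_locale is made up for illustration and is not
part of data.table. The locale names in the usage comment are
assumptions: they vary by platform and must be installed. It also
assumes fread heeds LC_NUMERIC as described above.

```r
# Hypothetical helper (not part of data.table): evaluate `expr` with
# LC_NUMERIC temporarily switched, restoring the old setting on exit.
with_numeric_locale <- function(locale, expr) {
  old <- Sys.getlocale("LC_NUMERIC")
  # Restore the previous locale even if `expr` throws an error.
  on.exit(suppressWarnings(Sys.setlocale("LC_NUMERIC", old)), add = TRUE)
  # Sys.setlocale() returns "" (with a warning) if the request fails.
  new <- suppressWarnings(Sys.setlocale("LC_NUMERIC", locale))
  if (identical(new, "")) stop("locale not available: ", locale)
  # `expr` is a lazy promise, so it is first evaluated here,
  # after the locale has been changed.
  force(expr)
}

# Usage (assumed locale names; e.g. "German" on Windows instead):
# library(data.table)
# dt <- with_numeric_locale("de_DE.UTF-8", fread(file1, sep = "\t"))
```

Stopping when the locale is unavailable seems safer than reading on
silently with the old separator.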

Matthew


On 30.04.2013 19:38, ravi wrote:
> Hi,
> I have a huge excel file that I have converted to a tab delimited
> file. The numerical data have a comma as a decimal delimiter. I made a
> compressed version of the file by just taking the first 100 rows. On
> this, I have confirmed that the following command works fine :
> 
> df<-read.table(file=file1,header=TRUE,sep="\t",dec=",",encoding="latin1")
> The following data.table also appears to work OK :
> dt<-fread(file1,sep="\t")
> But the numerical data end up as characters. I would like to have help
> with the most efficient method of converting these into numeric class. I
> note that sep2 has not been implemented yet. Is there any workaround?
> Can I specify the encoding also?
> Would appreciate any help that I can get.
> Thanks,
> Ravi
> _______________________________________________
> datatable-help mailing list
> datatable-help at lists.r-forge.r-project.org
> 
> https://lists.r-forge.r-project.org/cgi-bin/mailman/listinfo/datatable-help