[datatable-help] fread for flat files

Mark Danese mark at outins.com
Sun Apr 20 09:04:41 CEST 2014


Thanks Michael.  The flat file format doesn’t have spaces between fields.
They are all concatenated.  It may be possible to use sed with a vector of
widths, but I am not a command-line person (yet).

It just may be one of those things that isn’t easy to implement in fread.
In healthcare in the US there are still a lot of flat files out there.  We
usually use SAS but I am trying to get away from that.  And R can read
flat files (read.fwf), but it is pretty slow.  From what I understand,
read.fwf actually inserts a separator between the fields (a tab by default)
and then reads the result with read.table.  So, it might be possible to
hack read.fwf and fread together somehow.
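
A minimal sketch of one way such a hack could look, offered as an
assumption rather than a feature of fread: let fread load every line as a
single string, then cut the strings at fixed positions with substring().
The helper name read_fwf_fread, its arguments, and the sample records are
hypothetical.  All columns come back as untrimmed character vectors, so
trimming and type conversion are left to the caller.

```r
library(data.table)

# Hypothetical helper (not part of data.table): read a fixed-width file by
# loading each line as one string with fread(), then slicing by position.
read_fwf_fread <- function(path, widths,
                           col_names = paste0("V", seq_along(widths))) {
  stopifnot(length(widths) == length(col_names))

  # Convert the vector of widths into start/end character positions.
  ends   <- cumsum(widths)
  starts <- ends - widths + 1L

  # sep = "\n" makes fread treat each line as a single field.
  # strip.white = FALSE and quote = "" keep the positions intact;
  # colClasses = "character" preserves leading zeros.
  raw <- fread(path, sep = "\n", header = FALSE, colClasses = "character",
               strip.white = FALSE, quote = "")
  lines <- raw[[1L]]

  # One vectorised substring() call per column.
  cols <- lapply(seq_along(widths),
                 function(i) substring(lines, starts[i], ends[i]))
  names(cols) <- col_names
  as.data.table(cols)
}

# Usage with made-up records: 5-char id, 10-char name, 1-char sex.
tmp <- tempfile(fileext = ".txt")
writeLines(c("00123SMITH     M",
             "04567NGUYEN    F"), tmp)
dt <- read_fwf_fread(tmp, widths = c(5, 10, 1),
                     col_names = c("id", "name", "sex"))
print(dt)
```

The slicing is done in R rather than inside fread, so this will not match
fread's speed on delimited files, but it avoids writing an intermediate
delimited copy of the data the way read.fwf does.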

My first experience with fread was to read in a 1.6 GB file in 30 seconds.
That was pretty impressive.


On 4/19/14, 5:07 AM, "Michael Smith" <my.r.help at gmail.com> wrote:

>You could probably do this from the Linux command line using `sed`, e.g.
>by replacing each run of spaces with a comma.
>
>https://www.google.com/search?q=sed+replace+space+with+comma
>
>If you're on Windows, you probably can do the same using Cygwin.
>
>M
>
>
>On 04/19/2014 12:37 AM, Mark Danese wrote:
>> Is it possible to pass a vector of column widths to have fread read in a
>> flat file?  I saw that someone suggested using csvkit to add commas and
>> then using data.table, but that is beyond my skill set.


