[datatable-help] Fread Skip Question

stanasa stanasa at latinumnetwork.com
Thu May 8 20:50:03 CEST 2014


First of all, thank you very much for creating, maintaining and updating this
package! Discovering "fread" and the data.table package has made my life a
lot easier.

I'm using fread to read large (2-4 GB) .CSV files for subsequent RMySQL
bulk loads and, since the computer I use is a bit memory limited, decided to
read each file in chunks, using skip and nrows. I'm noticing that as I go
through the file (with a for loop), each individual read takes a bit longer
than the one before. I'm guessing that fread parses through the file line by
line to reach the skip location, so later chunks pay for a longer scan.
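
For concreteness, here is a minimal sketch of the kind of loop described
above. It is not my production code: the temporary file, the chunk size, the
known row count and the variable names are all placeholders chosen for
illustration, and the bulk-load step is left as a comment.

```r
library(data.table)

# Hypothetical stand-in for the real 2-4 GB file: a tiny CSV with 10 rows.
csv_path <- tempfile(fileext = ".csv")
fwrite(data.table(id = 1:10, value = (1:10) * 1.5), csv_path)

n_rows     <- 10L   # assumed to be known in advance (e.g. from a line count)
chunk_size <- 4L    # rows per read; placeholder value

# Read the header once so every chunk can reuse the column names.
col_names <- names(fread(csv_path, nrows = 0L))

rows_seen <- 0L
for (start in seq(0L, n_rows - 1L, by = chunk_size)) {
  # skip = header line + data rows already consumed. Each call opens the
  # file afresh, so it presumably has to locate line skip + 1 again.
  chunk <- fread(csv_path,
                 skip      = start + 1L,
                 nrows     = min(chunk_size, n_rows - start),
                 header    = FALSE,
                 col.names = col_names)

  # ... the RMySQL bulk load of `chunk` would go here ...
  rows_seen <- rows_seen + nrow(chunk)
}

unlink(csv_path)
```

The same variable (chunk) is overwritten on every pass, which is the reuse
pattern behind my memory question further down.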

Is there any way to make fread "remember" where the last read ended, so that
the next iteration can start from there?
I would guess it would speed up my reads from minutes to seconds.

Also, should I worry that assigning each chunk to the same data.table
variable in a for loop will cause memory issues?

Many thanks,



Serban Tanasa, Ph.D.
Senior Analyst
Latinum Network

(o) (240) 482-8259
(f)  (240) 482-8265




--
View this message in context: http://r.789695.n4.nabble.com/Fread-Skip-Question-tp4690205.html
Sent from the datatable-help mailing list archive at Nabble.com.

