<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN">

<html><body>


<p>Hi,</p>

<p>fread memory-maps the entire uncompressed file, and this is baked into the way it works: for example, it jumps to the beginning, middle and last 5 rows to detect column types before it starts reading the rows in. That is where the convenience and speed come from, and it is why fread can't be pointed at a compressed file or a connection such as gzfile().</p>

<p>Probably the fastest way is to uncompress the .gz to a ramdisk first and then fread the uncompressed file from that ramdisk. That should still be pretty quick, and I'd guess not much slower than anything we could build into fread (provided you use a ramdisk).</p>
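<p>A minimal sketch of that approach, in case it helps. The helper name fread_gz and its dir argument are made up for illustration and are not part of data.table; it only uses base R connections (gzfile, readBin, writeBin) plus fread. Pointing dir at a ramdisk is up to you: on many Linux systems /dev/shm is a tmpfs mount, but that is an assumption about your machine.</p>

```r
library(data.table)

# Hypothetical helper (not part of data.table): decompress a .gz file into
# `dir`, fread the uncompressed copy, then delete that copy.
fread_gz <- function(gz_path, dir = tempdir()) {
  # e.g. "filename.csv.gz" -> "<dir>/filename.csv"
  out_path <- file.path(dir, sub("\\.gz$", "", basename(gz_path)))

  src <- gzfile(gz_path, open = "rb")   # decompressing input connection
  dst <- file(out_path, open = "wb")    # plain output file
  repeat {
    chunk <- readBin(src, what = "raw", n = 1e6)  # up to ~1 MB per pass
    if (length(chunk) == 0L) break                # end of input
    writeBin(chunk, dst)
  }
  close(src)
  close(dst)                            # flush to disk before fread maps it

  on.exit(unlink(out_path))             # remove the uncompressed copy afterwards
  fread(out_path)
}
```

<p>Usage would then be something like DT &lt;- fread_gz("filename.csv.gz", dir = "/dev/shm"), assuming /dev/shm exists and has room for the uncompressed file. With the default dir it falls back to R's temporary directory, which works but is usually on ordinary disk.</p>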

<p>Matthew</p>

<p> </p>

<p>On 02.04.2013 19:30, Nathaniel Graham wrote:</p>

<blockquote type="cite" style="padding-left:5px; border-left:#1010ff 2px solid; margin-left:5px; width:100%">

<div dir="ltr">I have a moderately large csv file that's gzipped, but not in a tar archive, so it's "filename.csv.gz" that I want to read into a data.table. I'd like to use fread(), but I can't seem to make it work.  I'm currently using the following:

<div><code>data.table(read.csv(gzfile("filename.csv.gz","r")))</code></div>

<div>Various combinations of gzfile, gzcon, file, readLines, and textConnection all produce an error (invalid input).  Is there a better way to read in large, compressed files?</div>

<div>

<div>

<div>-------<br />Nathaniel Graham<br /><a href="mailto:npgraham1@gmail.com">npgraham1@gmail.com</a><br /><a href="mailto:npgraham1@uky.edu">npgraham1@uky.edu</a></div>

</div>

</div>

</div>

</blockquote>


</body></html>