<div dir="ltr"><div style>Problem with fread on a large file</div><div><br></div>The file is 8 GB, just short of 200,000 lines, produced as SQL output and modified with Cygwin/Perl to remove the second line.<br><div class="gmail_quote">
<div dir="ltr"><div><br></div><div>Using data.table 1.8.8 on R 3.0.0, I get an fread error:</div><div><br></div>
<div><div>fread("data/spd_all_fixed.csv",sep=",")</div><div>Error in fread("data/spd_all_fixed.csv", sep = ",") : </div><div> Expected sep (',') but '0' ends field 5 on line 6 when detecting types: 204038,2617097,20110803,0,0</div>
<div><br></div><div>Looking for the offending line, with line numbers in the output (the match is millions of lines in, so I'm guessing "line 6" refers to the mid-file chunk examined):</div><div><br></div><div><div>$ grep -n '204038,2617097,201108' spd_all_fixed.csv</div>
<div>8316105:204038,2617097,20110801,0,0,0.64220529999999998,0,0,0</div>
<div>8751106:204038,2617097,20110802,1,0,0.65744469999999999,0,0,0</div><div>9186294:204038,2617097,20110803,0,0,0.49455500000000002,0,0,0</div><div>9621619:204038,2617097,20110804,0,0,0.3461342,0,0,0</div><div>10057189:204038,2617097,20110805,0,0,0.34128710000000001,0,0,0</div>
<div><br></div><div>and comparing it to the surrounding lines and the first ten lines:</div><div><br></div><div><div>$ head spd_all_fixed.csv</div><div>s_key,i_key,p_key,q,pq,d,l,epi,class</div><div>203974,1107181,20110713,0,0,0.13700080000000001,0,0,0</div>
<div>203975,1107181,20110713,0,0,5.8352899999999999E-2,0,0,0</div><div>203976,1107181,20110713,0,0,7.1298999999999998E-3,0,0,0</div><div>203978,1107181,20110713,0,0,0.78346819999999995,0,0,0</div><div>203979,1107181,20110713,0,0,0.61627779999999999,0,0,0</div>
<div>203981,1107181,20110713,1,0,0.38610509999999998,0,0,0</div><div>203982,1107181,20110713,0,0,4.0657899999999997E-2,0,0,0</div><div>203983,1107181,20110713,2,0,0.71278109999999995,0,0,0</div><div>203984,1107181,20110713,0,0,0.42634430000000001,0.42634430000000001,2,13</div>
<div><br></div><div>I can't see any difference. I wonder if this is a bug? I have no problems on a small test data set run through an identical process and using the same fread command.</div><div><br></div>
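<div>One further check that might narrow this down, as a rough sketch only: count the comma-separated fields on every line and report any line whose count differs from the header's. The file sample.csv below is a made-up stand-in for spd_all_fixed.csv (its short third row mimics the truncated text in the error message), and the approach assumes no quoted fields containing commas.</div><div><br></div>

```shell
# Hypothetical small sample standing in for the real 8 GB file
printf '%s\n' \
  's_key,i_key,p_key,q,pq,d,l,epi,class' \
  '203974,1107181,20110713,0,0,0.13700080000000001,0,0,0' \
  '204038,2617097,20110803,0,0' \
  '203975,1107181,20110713,0,0,5.8352899999999999E-2,0,0,0' > sample.csv

# Remember the header's field count, then print the line number,
# field count and content of every line that does not match it
bad=$(awk -F',' 'NR == 1 { n = NF; next } NF != n { print NR ": " NF " fields: " $0 }' sample.csv)
echo "$bad"
```

<div>If this prints nothing for the real file, every line has the same number of fields as the header, and the cause is more likely to be something other than a short row.</div><div><br></div>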
<div>Regards</div><span class="HOEnZb"><font color="#888888"><div>Paul</div><div><br></div></font></span></div></div></div></div>
</div><br></div>