<div dir="ltr"><div class="gmail_default"><div class="gmail_default"><font face="arial, helvetica, sans-serif"><div class="gmail_default">Input contains a \n (or is ""), taking this to be text input (not a filename) </div>
<div class="gmail_default">Detected eol as \n only (no \r afterwards), the UNIX and Mac standard. </div><div class="gmail_default">Using line 30 to detect sep (the last non blank line in the first 30) ... '\t' </div>
<div class="gmail_default">Found 2 columns </div><div class="gmail_default">First row with 2 fields occurs on line 1 (either column names or first row of data) </div>
<div class="gmail_default">All the fields on line 1 are character fields. Treating as the column names. </div><div class="gmail_default">Count of eol after first data row: 1023 </div>
<div class="gmail_default">Subtracted 1 for last eol and any trailing empty lines, leaving 1022 data rows </div><div class="gmail_default">Type codes: 33 (first 5 rows) </div>
<div class="gmail_default">Type codes: 33 (+middle 5 rows) </div><div class="gmail_default">Type codes: 33 (+last 5 rows) </div>
<div class="gmail_default"> 0.000s (-nan%) Memory map (rerun may be quicker) </div><div class="gmail_default"> 0.000s (-nan%) sep and header detection </div>
<div class="gmail_default"> 0.000s (-nan%) Count rows (wc -l) </div><div class="gmail_default"> 0.000s (-nan%) Column type detection (first, middle and last 5 rows) </div>
<div class="gmail_default"> 0.000s (-nan%) Allocation of 1022x2 result (xMB) in RAM </div><div class="gmail_default"> 0.000s (-nan%) Reading data </div>
<div class="gmail_default"> 0.000s (-nan%) Allocation for type bumps (if any), including gc time if triggered</div><div class="gmail_default"> 0.000s (-nan%) Coercing data already read in type bumps (if any) </div>
<div class="gmail_default"> 0.000s (-nan%) Changing na.strings to NA </div><div class="gmail_default"> 0.000s Total </div>
<div class="gmail_default">4092 1022 </div><div>Input contains a \n (or is ""), taking this to be text input (not a filename) <br>
</div>
</font></div><div class="gmail_default"><font face="arial, helvetica, sans-serif">Detected eol as \n only (no \r afterwards), the UNIX and Mac standard. </font></div><div class="gmail_default"><font face="arial, helvetica, sans-serif">Using line 30 to detect sep (the last non blank line in the first 30) ... '\t' </font></div>
<div class="gmail_default"><font face="arial, helvetica, sans-serif">Found 2 columns </font></div><div class="gmail_default"><font face="arial, helvetica, sans-serif">First row with 2 fields occurs on line 1 (either column names or first row of data) </font></div>
<div class="gmail_default"><font face="arial, helvetica, sans-serif">All the fields on line 1 are character fields. Treating as the column names. </font></div><div class="gmail_default"><font face="arial, helvetica, sans-serif">Count of eol after first data row: 1023 </font></div>
<div class="gmail_default"><font face="arial, helvetica, sans-serif">Subtracted 0 for last eol and any trailing empty lines, leaving 1023 data rows </font></div><div class="gmail_default"><font face="arial, helvetica, sans-serif">Type codes: 33 (first 5 rows) </font></div>
<div class="gmail_default"><font face="arial, helvetica, sans-serif">Type codes: 33 (+middle 5 rows) </font></div><div class="gmail_default"><font face="arial, helvetica, sans-serif">Type codes: 33 (+last 5 rows) </font></div>
<div class="gmail_default"><font face="arial, helvetica, sans-serif"> 0.000s (-nan%) Memory map (rerun may be quicker) </font></div><div class="gmail_default"><font face="arial, helvetica, sans-serif"> 0.000s (-nan%) sep and header detection </font></div>
<div class="gmail_default"><font face="arial, helvetica, sans-serif"> 0.000s (-nan%) Count rows (wc -l) </font></div><div class="gmail_default"><font face="arial, helvetica, sans-serif"> 0.000s (-nan%) Column type detection (first, middle and last 5 rows) </font></div>
<div class="gmail_default"><font face="arial, helvetica, sans-serif"> 0.000s (-nan%) Allocation of 1023x2 result (xMB) in RAM </font></div><div class="gmail_default"><font face="arial, helvetica, sans-serif"> 0.000s (-nan%) Reading data </font></div>
<div class="gmail_default"><font face="arial, helvetica, sans-serif"> 0.000s (-nan%) Allocation for type bumps (if any), including gc time if triggered</font></div><div class="gmail_default"><font face="arial, helvetica, sans-serif"> 0.000s (-nan%) Coercing data already read in type bumps (if any) </font></div>
<div class="gmail_default"><font face="arial, helvetica, sans-serif"> 0.000s (-nan%) Changing na.strings to NA </font></div><div class="gmail_default"><font face="arial, helvetica, sans-serif"> 0.000s Total </font></div>
<div class="gmail_default"><font face="arial, helvetica, sans-serif">4096 1023 </font></div><div class="gmail_default"><font face="arial, helvetica, sans-serif">Input contains a \n (or is ""), taking this to be text input (not a filename) </font></div>
<div class="gmail_default"><font face="arial, helvetica, sans-serif">Detected eol as \n only (no \r afterwards), the UNIX and Mac standard. </font></div><div class="gmail_default"><font face="arial, helvetica, sans-serif">Using line 30 to detect sep (the last non blank line in the first 30) ... '\t' </font></div>
<div class="gmail_default"><font face="arial, helvetica, sans-serif">Found 2 columns </font></div><div class="gmail_default"><font face="arial, helvetica, sans-serif">First row with 2 fields occurs on line 1 (either column names or first row of data) </font></div>
<div class="gmail_default"><font face="arial, helvetica, sans-serif">All the fields on line 1 are character fields. Treating as the column names. </font></div><div class="gmail_default"><font face="arial, helvetica, sans-serif">Count of eol after first data row: 1023 </font></div>
<div class="gmail_default"><font face="arial, helvetica, sans-serif">Subtracted 0 for last eol and any trailing empty lines, leaving 1023 data rows </font></div><div class="gmail_default"><font face="arial, helvetica, sans-serif">Type codes: 33 (first 5 rows) </font></div>
<div class="gmail_default"><font face="arial, helvetica, sans-serif">Type codes: 33 (+middle 5 rows) </font></div><div class="gmail_default"><font face="arial, helvetica, sans-serif">Type codes: 33 (+last 5 rows) </font></div>
<div class="gmail_default"><font face="arial, helvetica, sans-serif"> 0.000s (-nan%) Memory map (rerun may be quicker) </font></div><div class="gmail_default"><font face="arial, helvetica, sans-serif"> 0.000s (-nan%) sep and header detection </font></div>
<div class="gmail_default"><font face="arial, helvetica, sans-serif"> 0.000s (-nan%) Count rows (wc -l) </font></div><div class="gmail_default"><font face="arial, helvetica, sans-serif"> 0.000s (-nan%) Column type detection (first, middle and last 5 rows) </font></div>
<div class="gmail_default"><font face="arial, helvetica, sans-serif"> 0.000s (-nan%) Allocation of 1023x2 result (xMB) in RAM </font></div><div class="gmail_default"><font face="arial, helvetica, sans-serif"> 0.000s (-nan%) Reading data </font></div>
<div class="gmail_default"><font face="arial, helvetica, sans-serif"> 0.000s (-nan%) Allocation for type bumps (if any), including gc time if triggered</font></div><div class="gmail_default"><font face="arial, helvetica, sans-serif"> 0.000s (-nan%) Coercing data already read in type bumps (if any) </font></div>
<div class="gmail_default"><font face="arial, helvetica, sans-serif"> 0.000s (-nan%) Changing na.strings to NA </font></div><div class="gmail_default"><font face="arial, helvetica, sans-serif"> 0.000s Total </font></div>
<div class="gmail_default"><font face="arial, helvetica, sans-serif">4100 1023 </font></div><div class="gmail_default"><font face="arial, helvetica, sans-serif">Input contains a \n (or is ""), taking this to be text input (not a filename) </font></div>
<div class="gmail_default"><font face="arial, helvetica, sans-serif">Detected eol as \n only (no \r afterwards), the UNIX and Mac standard. </font></div><div class="gmail_default"><font face="arial, helvetica, sans-serif">Using line 30 to detect sep (the last non blank line in the first 30) ... '\t' </font></div>
<div class="gmail_default"><font face="arial, helvetica, sans-serif">Found 2 columns </font></div><div class="gmail_default"><font face="arial, helvetica, sans-serif">First row with 2 fields occurs on line 1 (either column names or first row of data) </font></div>
<div class="gmail_default"><font face="arial, helvetica, sans-serif">All the fields on line 1 are character fields. Treating as the column names. </font></div><div class="gmail_default"><font face="arial, helvetica, sans-serif">Count of eol after first data row: 1023 </font></div>
<div class="gmail_default"><font face="arial, helvetica, sans-serif">Subtracted 0 for last eol and any trailing empty lines, leaving 1023 data rows </font></div><div class="gmail_default"><font face="arial, helvetica, sans-serif">Type codes: 33 (first 5 rows) </font></div>
<div class="gmail_default"><font face="arial, helvetica, sans-serif">Type codes: 33 (+middle 5 rows) </font></div><div class="gmail_default"><font face="arial, helvetica, sans-serif">Type codes: 33 (+last 5 rows) </font></div>
<div class="gmail_default"><font face="arial, helvetica, sans-serif"> 0.000s (-nan%) Memory map (rerun may be quicker) </font></div><div class="gmail_default"><font face="arial, helvetica, sans-serif"> 0.000s (-nan%) sep and header detection </font></div>
<div class="gmail_default"><font face="arial, helvetica, sans-serif"> 0.000s (-nan%) Count rows (wc -l) </font></div><div class="gmail_default"><font face="arial, helvetica, sans-serif"> 0.000s (-nan%) Column type detection (first, middle and last 5 rows) </font></div>
<div class="gmail_default"><font face="arial, helvetica, sans-serif"> 0.000s (-nan%) Allocation of 1023x2 result (xMB) in RAM </font></div><div class="gmail_default"><font face="arial, helvetica, sans-serif"> 0.000s (-nan%) Reading data </font></div>
<div class="gmail_default"><font face="arial, helvetica, sans-serif"> 0.000s (-nan%) Allocation for type bumps (if any), including gc time if triggered</font></div><div class="gmail_default"><font face="arial, helvetica, sans-serif"> 0.000s (-nan%) Coercing data already read in type bumps (if any) </font></div>
<div class="gmail_default"><font face="arial, helvetica, sans-serif"> 0.000s (-nan%) Changing na.strings to NA </font></div><div class="gmail_default"><font face="arial, helvetica, sans-serif"> 0.000s Total </font></div>
<div class="gmail_default"><font face="arial, helvetica, sans-serif">40000 1023 </font></div><div style="font-family:arial,helvetica,sans-serif"><br>
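<p>Reading the four summary lines above (4092 1022, 4096 1023, 4100 1023, 40000 1023): each 'a\tb\n' line is 4 characters, and the verbose output reports line 1 as the column names, so n input lines should give n - 1 data rows. A minimal base-R sketch of the expected figures, for comparison only: it does not call fread, and expected_rows is a name introduced here for illustration.</p>
<pre># Each 'a\tb\n' line is 4 characters; line 1 is taken as the header,
# so n input lines should yield n - 1 data rows.
for (n in c(1023:1025, 10000)) {
  input = paste(rep('a\tb\n', n), collapse='')
  expected_rows = n - 1   # illustrative name, not part of data.table
  cat(nchar(input), expected_rows, "\n")
}</pre>
<p>This prints 4092 1022, 4096 1023, 4100 1024 and 40000 9999. The runs above match for the first two inputs but stay at 1023 rows for the last two, so the row count seems to stop growing once the text input passes 4096 characters (1024 lines of 4 characters).</p>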
</div></div></div><div class="gmail_extra"><br><br><div class="gmail_quote">On Thu, Mar 28, 2013 at 2:55 PM, Matthew Dowle <span dir="ltr">&lt;<a href="mailto:mdowle@mdowle.plus.com" target="_blank">mdowle@mdowle.plus.com</a>&gt;</span> wrote:<br>
<blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><u></u>
<div>
<p> </p>
<p>Hm this is odd.</p>
<p>Could you run the following and paste back the (verbose) results please.</p>
<pre><div class="im">for (n in c(1023:1025, 10000)) {<br></div> input = paste( rep('a\tb\n', n), collapse='')<br> A = fread(input,verbose=TRUE)<br> cat(nchar(input), nrow(A), "\n")<br>}</pre><div>
<div class="h5">
<p>On 28.03.2013 14:38, Timothée Carayol wrote:</p>
<blockquote type="cite" style="padding-left:5px;border-left:#1010ff 2px solid;margin-left:5px;width:100%">
<div dir="ltr">
<div class="gmail_default">
<div class="gmail_default"><span style="font-family:arial,helvetica,sans-serif">Curiouser and curiouser..</span></div>
<div class="gmail_default"><span style="font-family:arial,helvetica,sans-serif"><br></span></div>
<div class="gmail_default"><span style="font-family:arial,helvetica,sans-serif">I can reproduce on two computers with different versions of R and of data.table.</span></div>
<div class="gmail_default"><span style="font-family:arial,helvetica,sans-serif"><br></span></div>
<div class="gmail_default"><span style="font-family:arial,helvetica,sans-serif"><br></span></div>
<div class="gmail_default"><span style="font-family:arial,helvetica,sans-serif"><br></span></div>
<div class="gmail_default"><span style="font-family:arial,helvetica,sans-serif">Computer 1 (it says unknown-linux but is actually ubuntu):</span></div>
<div class="gmail_default"><span style="font-family:arial,helvetica,sans-serif"><br></span></div>
<div class="gmail_default"><span style="font-family:arial,helvetica,sans-serif">R version 2.15.3 (2013-03-01) </span></div>
<div class="gmail_default"><span style="font-family:arial,helvetica,sans-serif">Platform: x86_64-unknown-linux-gnu (64-bit) </span></div>
<div class="gmail_default"><span style="font-family:arial,helvetica,sans-serif"> </span></div>
<div class="gmail_default"><span style="font-family:arial,helvetica,sans-serif">locale: </span></div>
<div class="gmail_default"><span style="font-family:arial,helvetica,sans-serif"> [1] LC_CTYPE=en_GB.UTF-8 LC_NUMERIC=C LC_TIME=en_GB.UTF-8 LC_COLLATE=en_GB.UTF-8 LC_MONETARY=en_GB.UTF-8</span></div>
<div class="gmail_default"><span style="font-family:arial,helvetica,sans-serif"> LC_MESSAGES=en_GB.UTF-8 LC_PAPER=C LC_NAME=C LC_ADDRESS=C </span></div>
<div class="gmail_default"><span style="font-family:arial,helvetica,sans-serif">[10] LC_TELEPHONE=C LC_MEASUREMENT=en_GB.UTF-8 LC_IDENTIFICATION=C </span></div>
<div class="gmail_default"><span style="font-family:arial,helvetica,sans-serif"> </span></div>
<div class="gmail_default"><span style="font-family:arial,helvetica,sans-serif">attached base packages: </span></div>
<div class="gmail_default"><span style="font-family:arial,helvetica,sans-serif">[1] stats graphics grDevices utils datasets methods base </span></div>
<div class="gmail_default"><span style="font-family:arial,helvetica,sans-serif"> </span></div>
<div class="gmail_default"><span style="font-family:arial,helvetica,sans-serif">other attached packages: </span></div>
<div class="gmail_default"><span style="font-family:arial,helvetica,sans-serif">[1] bit64_0.9-2 bit_1.1-10 data.table_1.8.9 colorout_1.0-0 </span></div>
<div style="font-family:arial,helvetica,sans-serif">Computer 2:</div>
<div>
<div><span style="font-family:arial,helvetica,sans-serif">R version 2.15.2 (2012-10-26) </span></div>
<div><span style="font-family:arial,helvetica,sans-serif">Platform: x86_64-redhat-linux-gnu (64-bit) </span></div>
<div><span style="font-family:arial,helvetica,sans-serif"> </span></div>
<div><span style="font-family:arial,helvetica,sans-serif">locale: </span></div>
<div><span style="font-family:arial,helvetica,sans-serif"> [1] LC_CTYPE=en_GB.UTF-8 LC_NUMERIC=C </span></div>
<div><span style="font-family:arial,helvetica,sans-serif"> [3] LC_TIME=en_GB.UTF-8 LC_COLLATE=en_GB.UTF-8 </span></div>
<div><span style="font-family:arial,helvetica,sans-serif"> [5] LC_MONETARY=en_GB.UTF-8 LC_MESSAGES=en_GB.UTF-8 </span></div>
<div><span style="font-family:arial,helvetica,sans-serif"> [7] LC_PAPER=C LC_NAME=C </span></div>
<div><span style="font-family:arial,helvetica,sans-serif"> [9] LC_ADDRESS=C LC_TELEPHONE=C </span></div>
<div><span style="font-family:arial,helvetica,sans-serif">[11] LC_MEASUREMENT=en_GB.UTF-8 LC_IDENTIFICATION=C </span></div>
<div><span style="font-family:arial,helvetica,sans-serif"> </span></div>
<div><span style="font-family:arial,helvetica,sans-serif">attached base packages: </span></div>
<div><span style="font-family:arial,helvetica,sans-serif">[1] stats graphics grDevices utils datasets methods base</span></div>
<div><span style="font-family:arial,helvetica,sans-serif"> </span></div>
<div><span style="font-family:arial,helvetica,sans-serif">other attached packages: </span></div>
<div><span style="font-family:arial,helvetica,sans-serif">[1] data.table_1.8.8 </span></div>
<div><span style="font-family:arial,helvetica,sans-serif"> </span></div>
<div><span style="font-family:arial,helvetica,sans-serif">loaded via a namespace (and not attached): </span></div>
<div><span style="font-family:arial,helvetica,sans-serif">[1] tools_2.15.2 </span></div>
</div>
</div>
</div>
<div class="gmail_extra"><br><br>
<div class="gmail_quote">On Thu, Mar 28, 2013 at 2:31 PM, Matthew Dowle <span>&lt;<a href="mailto:mdowle@mdowle.plus.com" target="_blank">mdowle@mdowle.plus.com</a>&gt;</span> wrote:<br>
<blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><span style="text-decoration:underline"></span>
<div>
<p> </p>
<p>Interesting, what's your sessionInfo() please?</p>
<p>For me it seems to work ok :</p>
<pre>[1] 1022
[1] 1023
[1] 1024
[1] 9999<br><br></pre>
<pre>> sessionInfo()<br>R version 2.15.2 (2012-10-26)<br>Platform: x86_64-w64-mingw32/x64 (64-bit)</pre>
<div>
<p> </p>
<p>On 27.03.2013 22:49, Timothée Carayol wrote:</p>
</div>
<blockquote style="padding-left:5px;border-left:#1010ff 2px solid;margin-left:5px;width:100%">
<div dir="ltr">
<div>
<div class="gmail_default" style="font-family:arial,helvetica,sans-serif">Agree with Muhammad, longer character strings are definitely permitted in R.</div>
<div class="gmail_default" style="font-family:arial,helvetica,sans-serif">A minimal example that shows something strange happening with fread:</div>
</div>
<div class="gmail_default">
<div>
<div class="gmail_default"><span style="font-family:arial,helvetica,sans-serif">for (n in c(1023:1025, 10000)) {</span></div>
<div class="gmail_default"><span style="font-family:arial,helvetica,sans-serif">  A = fread(</span></div>
</div>
<div>
<div>
<div class="gmail_default"><span style="font-family:arial,helvetica,sans-serif"> paste(</span></div>
<div class="gmail_default"><span style="font-family:arial,helvetica,sans-serif"> rep('a\tb\n', n),</span></div>
<div class="gmail_default"><span style="font-family:arial,helvetica,sans-serif"> collapse=''</span></div>
<div class="gmail_default"><span style="font-family:arial,helvetica,sans-serif"> ),</span></div>
<div class="gmail_default"><span style="font-family:arial,helvetica,sans-serif"> sep='\t'</span></div>
<div class="gmail_default"><span style="font-family:arial,helvetica,sans-serif"> )</span></div>
<div class="gmail_default" style="font-family:arial,helvetica,sans-serif"> print(nrow(A))</div>
<div class="gmail_default" style="font-family:arial,helvetica,sans-serif">}</div>
<div class="gmail_default" style="font-family:arial,helvetica,sans-serif">On my computer, I obtain:</div>
<div class="gmail_default">
<div class="gmail_default"><span style="font-family:arial,helvetica,sans-serif">[1] 1022</span></div>
<div class="gmail_default"><span style="font-family:arial,helvetica,sans-serif">[1] 1023</span></div>
<div class="gmail_default"><span style="font-family:arial,helvetica,sans-serif">[1] 1023</span></div>
<div class="gmail_default"><span style="font-family:arial,helvetica,sans-serif">[1] 1023</span></div>
</div>
<div class="gmail_default" style="font-family:arial,helvetica,sans-serif">Hope this helps</div>
<div class="gmail_default" style="font-family:arial,helvetica,sans-serif">Timothée</div>
</div>
</div>
</div>
</div>
<div>
<div>
<div class="gmail_extra"><br><br>
<div class="gmail_quote">On Wed, Mar 27, 2013 at 9:23 PM, Matthew Dowle <span>&lt;<a href="mailto:mdowle@mdowle.plus.com" target="_blank">mdowle@mdowle.plus.com</a>&gt;</span> wrote:<br>
<blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">Hi,<br> Nice to hear from you. Nope not known to me. Obviously 4096 is 4k, is that<br> the R limit for a character string length? What happens at 4097?<br>
Matthew<br>
<div>
<div><br> > Hi,<br> ><br> > I have an example of a string of 4097 characters which can't be parsed by<br> > fread; however, if I remove any character, it can be parsed just fine. Is<br> > that a known limitation?<br>
><br> > (If I write the string to a file and then fread the file name, it works<br> > too.)<br> ><br> > Let me know if you need the string and/or a bug report.<br> ><br> > Thanks<br> > Timothée</div>
</div>
> _______________________________________________<br> > datatable-help mailing list<br> > <a href="mailto:datatable-help@lists.r-forge.r-project.org" target="_blank">datatable-help@lists.r-forge.r-project.org</a><br>
> <a href="https://lists.r-forge.r-project.org/cgi-bin/mailman/listinfo/datatable-help" target="_blank">https://lists.r-forge.r-project.org/cgi-bin/mailman/listinfo/datatable-help</a><br><br><br></blockquote>
</div>
</div>
</div>
</div>
</blockquote>
<p> </p>
<div> </div>
</div>
</blockquote>
</div>
</div>
</blockquote>
<p> </p>
<div> </div>
</div></div></div>
</blockquote></div><br></div>