<div dir="ltr"><div class="gmail_default"><div class="gmail_default"><font face="arial, helvetica, sans-serif"><div class="gmail_default">Input contains a \n (or is ""), taking this to be text input (not a filename)       </div>

<div class="gmail_default">Detected eol as \n only (no \r afterwards), the UNIX and Mac standard.              </div><div class="gmail_default">Using line 30 to detect sep (the last non blank line in the first 30) ... '\t'      </div>

<div class="gmail_default">Found 2 columns                                                                     </div><div class="gmail_default">First row with 2 fields occurs on line 1 (either column names or first row of data) </div>

<div class="gmail_default">All the fields on line 1 are character fields. Treating as the column names.        </div><div class="gmail_default">Count of eol after first data row: 1023                                             </div>

<div class="gmail_default">Subtracted 1 for last eol and any trailing empty lines, leaving 1022 data rows      </div><div class="gmail_default">Type codes: 33 (first 5 rows)                                                       </div>

<div class="gmail_default">Type codes: 33 (+middle 5 rows)                                                     </div><div class="gmail_default">Type codes: 33 (+last 5 rows)                                                       </div>

<div class="gmail_default">   0.000s (-nan%) Memory map (rerun may be quicker)                                 </div><div class="gmail_default">   0.000s (-nan%) sep and header detection                                          </div>

<div class="gmail_default">   0.000s (-nan%) Count rows (wc -l)                                                </div><div class="gmail_default">   0.000s (-nan%) Column type detection (first, middle and last 5 rows)             </div>

<div class="gmail_default">   0.000s (-nan%) Allocation of 1022x2 result (xMB) in RAM                          </div><div class="gmail_default">   0.000s (-nan%) Reading data                                                      </div>

<div class="gmail_default">   0.000s (-nan%) Allocation for type bumps (if any), including gc time if triggered</div><div class="gmail_default">   0.000s (-nan%) Coercing data already read in type bumps (if any)                 </div>

<div class="gmail_default">   0.000s (-nan%) Changing na.strings to NA                                         </div><div class="gmail_default">   0.000s        Total                                                              </div>

<div class="gmail_default">4092 1022                                                                           </div><div>Input contains a \n (or is ""), taking this to be text input (not a filename)       <br>
</div>
</font></div><div class="gmail_default"><font face="arial, helvetica, sans-serif">Detected eol as \n only (no \r afterwards), the UNIX and Mac standard.              </font></div><div class="gmail_default"><font face="arial, helvetica, sans-serif">Using line 30 to detect sep (the last non blank line in the first 30) ... '\t'      </font></div>

<div class="gmail_default"><font face="arial, helvetica, sans-serif">Found 2 columns                                                                     </font></div><div class="gmail_default"><font face="arial, helvetica, sans-serif">First row with 2 fields occurs on line 1 (either column names or first row of data) </font></div>

<div class="gmail_default"><font face="arial, helvetica, sans-serif">All the fields on line 1 are character fields. Treating as the column names.        </font></div><div class="gmail_default"><font face="arial, helvetica, sans-serif">Count of eol after first data row: 1023                                             </font></div>

<div class="gmail_default"><font face="arial, helvetica, sans-serif">Subtracted 0 for last eol and any trailing empty lines, leaving 1023 data rows      </font></div><div class="gmail_default"><font face="arial, helvetica, sans-serif">Type codes: 33 (first 5 rows)                                                       </font></div>

<div class="gmail_default"><font face="arial, helvetica, sans-serif">Type codes: 33 (+middle 5 rows)                                                     </font></div><div class="gmail_default"><font face="arial, helvetica, sans-serif">Type codes: 33 (+last 5 rows)                                                       </font></div>

<div class="gmail_default"><font face="arial, helvetica, sans-serif">   0.000s (-nan%) Memory map (rerun may be quicker)                                 </font></div><div class="gmail_default"><font face="arial, helvetica, sans-serif">   0.000s (-nan%) sep and header detection                                          </font></div>

<div class="gmail_default"><font face="arial, helvetica, sans-serif">   0.000s (-nan%) Count rows (wc -l)                                                </font></div><div class="gmail_default"><font face="arial, helvetica, sans-serif">   0.000s (-nan%) Column type detection (first, middle and last 5 rows)             </font></div>

<div class="gmail_default"><font face="arial, helvetica, sans-serif">   0.000s (-nan%) Allocation of 1023x2 result (xMB) in RAM                          </font></div><div class="gmail_default"><font face="arial, helvetica, sans-serif">   0.000s (-nan%) Reading data                                                      </font></div>

<div class="gmail_default"><font face="arial, helvetica, sans-serif">   0.000s (-nan%) Allocation for type bumps (if any), including gc time if triggered</font></div><div class="gmail_default"><font face="arial, helvetica, sans-serif">   0.000s (-nan%) Coercing data already read in type bumps (if any)                 </font></div>

<div class="gmail_default"><font face="arial, helvetica, sans-serif">   0.000s (-nan%) Changing na.strings to NA                                         </font></div><div class="gmail_default"><font face="arial, helvetica, sans-serif">   0.000s        Total                                                              </font></div>

<div class="gmail_default"><font face="arial, helvetica, sans-serif">4096 1023                                                                           </font></div><div class="gmail_default"><font face="arial, helvetica, sans-serif">Input contains a \n (or is ""), taking this to be text input (not a filename)       </font></div>

<div class="gmail_default"><font face="arial, helvetica, sans-serif">Detected eol as \n only (no \r afterwards), the UNIX and Mac standard.              </font></div><div class="gmail_default"><font face="arial, helvetica, sans-serif">Using line 30 to detect sep (the last non blank line in the first 30) ... '\t'      </font></div>

<div class="gmail_default"><font face="arial, helvetica, sans-serif">Found 2 columns                                                                     </font></div><div class="gmail_default"><font face="arial, helvetica, sans-serif">First row with 2 fields occurs on line 1 (either column names or first row of data) </font></div>

<div class="gmail_default"><font face="arial, helvetica, sans-serif">All the fields on line 1 are character fields. Treating as the column names.        </font></div><div class="gmail_default"><font face="arial, helvetica, sans-serif">Count of eol after first data row: 1023                                             </font></div>

<div class="gmail_default"><font face="arial, helvetica, sans-serif">Subtracted 0 for last eol and any trailing empty lines, leaving 1023 data rows      </font></div><div class="gmail_default"><font face="arial, helvetica, sans-serif">Type codes: 33 (first 5 rows)                                                       </font></div>

<div class="gmail_default"><font face="arial, helvetica, sans-serif">Type codes: 33 (+middle 5 rows)                                                     </font></div><div class="gmail_default"><font face="arial, helvetica, sans-serif">Type codes: 33 (+last 5 rows)                                                       </font></div>

<div class="gmail_default"><font face="arial, helvetica, sans-serif">   0.000s (-nan%) Memory map (rerun may be quicker)                                 </font></div><div class="gmail_default"><font face="arial, helvetica, sans-serif">   0.000s (-nan%) sep and header detection                                          </font></div>

<div class="gmail_default"><font face="arial, helvetica, sans-serif">   0.000s (-nan%) Count rows (wc -l)                                                </font></div><div class="gmail_default"><font face="arial, helvetica, sans-serif">   0.000s (-nan%) Column type detection (first, middle and last 5 rows)             </font></div>

<div class="gmail_default"><font face="arial, helvetica, sans-serif">   0.000s (-nan%) Allocation of 1023x2 result (xMB) in RAM                          </font></div><div class="gmail_default"><font face="arial, helvetica, sans-serif">   0.000s (-nan%) Reading data                                                      </font></div>

<div class="gmail_default"><font face="arial, helvetica, sans-serif">   0.000s (-nan%) Allocation for type bumps (if any), including gc time if triggered</font></div><div class="gmail_default"><font face="arial, helvetica, sans-serif">   0.000s (-nan%) Coercing data already read in type bumps (if any)                 </font></div>

<div class="gmail_default"><font face="arial, helvetica, sans-serif">   0.000s (-nan%) Changing na.strings to NA                                         </font></div><div class="gmail_default"><font face="arial, helvetica, sans-serif">   0.000s        Total                                                              </font></div>

<div class="gmail_default"><font face="arial, helvetica, sans-serif">4100 1023                                                                           </font></div><div class="gmail_default"><font face="arial, helvetica, sans-serif">Input contains a \n (or is ""), taking this to be text input (not a filename)       </font></div>

<div class="gmail_default"><font face="arial, helvetica, sans-serif">Detected eol as \n only (no \r afterwards), the UNIX and Mac standard.              </font></div><div class="gmail_default"><font face="arial, helvetica, sans-serif">Using line 30 to detect sep (the last non blank line in the first 30) ... '\t'      </font></div>

<div class="gmail_default"><font face="arial, helvetica, sans-serif">Found 2 columns                                                                     </font></div><div class="gmail_default"><font face="arial, helvetica, sans-serif">First row with 2 fields occurs on line 1 (either column names or first row of data) </font></div>

<div class="gmail_default"><font face="arial, helvetica, sans-serif">All the fields on line 1 are character fields. Treating as the column names.        </font></div><div class="gmail_default"><font face="arial, helvetica, sans-serif">Count of eol after first data row: 1023                                             </font></div>

<div class="gmail_default"><font face="arial, helvetica, sans-serif">Subtracted 0 for last eol and any trailing empty lines, leaving 1023 data rows      </font></div><div class="gmail_default"><font face="arial, helvetica, sans-serif">Type codes: 33 (first 5 rows)                                                       </font></div>

<div class="gmail_default"><font face="arial, helvetica, sans-serif">Type codes: 33 (+middle 5 rows)                                                     </font></div><div class="gmail_default"><font face="arial, helvetica, sans-serif">Type codes: 33 (+last 5 rows)                                                       </font></div>

<div class="gmail_default"><font face="arial, helvetica, sans-serif">   0.000s (-nan%) Memory map (rerun may be quicker)                                 </font></div><div class="gmail_default"><font face="arial, helvetica, sans-serif">   0.000s (-nan%) sep and header detection                                          </font></div>

<div class="gmail_default"><font face="arial, helvetica, sans-serif">   0.000s (-nan%) Count rows (wc -l)                                                </font></div><div class="gmail_default"><font face="arial, helvetica, sans-serif">   0.000s (-nan%) Column type detection (first, middle and last 5 rows)             </font></div>

<div class="gmail_default"><font face="arial, helvetica, sans-serif">   0.000s (-nan%) Allocation of 1023x2 result (xMB) in RAM                          </font></div><div class="gmail_default"><font face="arial, helvetica, sans-serif">   0.000s (-nan%) Reading data                                                      </font></div>

<div class="gmail_default"><font face="arial, helvetica, sans-serif">   0.000s (-nan%) Allocation for type bumps (if any), including gc time if triggered</font></div><div class="gmail_default"><font face="arial, helvetica, sans-serif">   0.000s (-nan%) Coercing data already read in type bumps (if any)                 </font></div>

<div class="gmail_default"><font face="arial, helvetica, sans-serif">   0.000s (-nan%) Changing na.strings to NA                                         </font></div><div class="gmail_default"><font face="arial, helvetica, sans-serif">   0.000s        Total                                                              </font></div>

<div class="gmail_default"><font face="arial, helvetica, sans-serif">40000 1023                                                                          </font></div><div style="font-family:arial,helvetica,sans-serif"><br>

</div></div></div><div class="gmail_extra"><br><br><div class="gmail_quote">On Thu, Mar 28, 2013 at 2:55 PM, Matthew Dowle <span dir="ltr">&lt;<a href="mailto:mdowle@mdowle.plus.com" target="_blank">mdowle@mdowle.plus.com</a>&gt;</span> wrote:<br>

<blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><u></u>
<div>
<p> </p>
<p>Hm this is odd.</p>
<p>Could you run the following and paste back the (verbose) results please.</p>
<pre><div class="im">for (n in c(1023:1025, 10000)) {<br></div> input = paste( rep('a\tb\n', n), collapse='')<br> A = fread(input,verbose=TRUE)<br> cat(nchar(input), nrow(A), "\n")<br>}</pre><div>

<div class="h5">
<p> </p>
<p> </p>
<p>On 28.03.2013 14:38, Timothée Carayol wrote:</p>
<blockquote type="cite" style="padding-left:5px;border-left:#1010ff 2px solid;margin-left:5px;width:100%">
<div dir="ltr">
<div class="gmail_default">
<div class="gmail_default"><span style="font-family:arial,helvetica,sans-serif">Curiouser and curiouser..</span></div>
<div class="gmail_default"><span style="font-family:arial,helvetica,sans-serif"><br></span></div>
<div class="gmail_default"><span style="font-family:arial,helvetica,sans-serif">I can reproduce on two computers with different versions of R and of data.table.</span></div>
<div class="gmail_default"><span style="font-family:arial,helvetica,sans-serif"><br></span></div>
<div class="gmail_default"><span style="font-family:arial,helvetica,sans-serif"><br></span></div>
<div class="gmail_default"><span style="font-family:arial,helvetica,sans-serif"><br></span></div>
<div class="gmail_default"><span style="font-family:arial,helvetica,sans-serif">Computer 1 (it says unknown-linux but is actually Ubuntu):</span></div>
<div class="gmail_default"><span style="font-family:arial,helvetica,sans-serif"><br></span></div>
<div class="gmail_default"><span style="font-family:arial,helvetica,sans-serif">R version 2.15.3 (2013-03-01)                                                                                                           </span></div>


<div class="gmail_default"><span style="font-family:arial,helvetica,sans-serif">Platform: x86_64-unknown-linux-gnu (64-bit)                                                                                             </span></div>


<div class="gmail_default"><span style="font-family:arial,helvetica,sans-serif">                                                                                                                                        </span></div>


<div class="gmail_default"><span style="font-family:arial,helvetica,sans-serif">locale:                                                                                                                                 </span></div>


<div class="gmail_default"><span style="font-family:arial,helvetica,sans-serif"> [1] LC_CTYPE=en_GB.UTF-8       LC_NUMERIC=C               LC_TIME=en_GB.UTF-8        LC_COLLATE=en_GB.UTF-8     LC_MONETARY=en_GB.UTF-8</span></div>


<div class="gmail_default"><span style="font-family:arial,helvetica,sans-serif">   LC_MESSAGES=en_GB.UTF-8    LC_PAPER=C                 LC_NAME=C                  LC_ADDRESS=C                                        </span></div>


<div class="gmail_default"><span style="font-family:arial,helvetica,sans-serif">[10] LC_TELEPHONE=C             LC_MEASUREMENT=en_GB.UTF-8 LC_IDENTIFICATION=C                                                          </span></div>


<div class="gmail_default"><span style="font-family:arial,helvetica,sans-serif">                                                                                                                                        </span></div>


<div class="gmail_default"><span style="font-family:arial,helvetica,sans-serif">attached base packages:                                                                                                                 </span></div>


<div class="gmail_default"><span style="font-family:arial,helvetica,sans-serif">[1] stats     graphics  grDevices utils     datasets  methods   base                                                                    </span></div>


<div class="gmail_default"><span style="font-family:arial,helvetica,sans-serif">                                                                                                                                        </span></div>


<div class="gmail_default"><span style="font-family:arial,helvetica,sans-serif">other attached packages:                                                                                                                </span></div>


<div class="gmail_default"><span style="font-family:arial,helvetica,sans-serif">[1] bit64_0.9-2      bit_1.1-10       data.table_1.8.9 colorout_1.0-0                                                                   </span></div>


<div style="font-family:arial,helvetica,sans-serif">Computer 2:</div>
<div>
<div><span style="font-family:arial,helvetica,sans-serif">R version 2.15.2 (2012-10-26)                                       </span></div>
<div><span style="font-family:arial,helvetica,sans-serif">Platform: x86_64-redhat-linux-gnu (64-bit)                          </span></div>
<div><span style="font-family:arial,helvetica,sans-serif">                                                                    </span></div>
<div><span style="font-family:arial,helvetica,sans-serif">locale:                                                             </span></div>
<div><span style="font-family:arial,helvetica,sans-serif"> [1] LC_CTYPE=en_GB.UTF-8       LC_NUMERIC=C                        </span></div>
<div><span style="font-family:arial,helvetica,sans-serif"> [3] LC_TIME=en_GB.UTF-8        LC_COLLATE=en_GB.UTF-8              </span></div>
<div><span style="font-family:arial,helvetica,sans-serif"> [5] LC_MONETARY=en_GB.UTF-8    LC_MESSAGES=en_GB.UTF-8             </span></div>
<div><span style="font-family:arial,helvetica,sans-serif"> [7] LC_PAPER=C                 LC_NAME=C                           </span></div>
<div><span style="font-family:arial,helvetica,sans-serif"> [9] LC_ADDRESS=C               LC_TELEPHONE=C                      </span></div>
<div><span style="font-family:arial,helvetica,sans-serif">[11] LC_MEASUREMENT=en_GB.UTF-8 LC_IDENTIFICATION=C                 </span></div>
<div><span style="font-family:arial,helvetica,sans-serif">                                                                    </span></div>
<div><span style="font-family:arial,helvetica,sans-serif">attached base packages:                                             </span></div>
<div><span style="font-family:arial,helvetica,sans-serif">[1] stats     graphics  grDevices utils     datasets  methods   base</span></div>
<div><span style="font-family:arial,helvetica,sans-serif">                                                                    </span></div>
<div><span style="font-family:arial,helvetica,sans-serif">other attached packages:                                            </span></div>
<div><span style="font-family:arial,helvetica,sans-serif">[1] data.table_1.8.8                                                </span></div>
<div><span style="font-family:arial,helvetica,sans-serif">                                                                    </span></div>
<div><span style="font-family:arial,helvetica,sans-serif">loaded via a namespace (and not attached):                          </span></div>
<div><span style="font-family:arial,helvetica,sans-serif">[1] tools_2.15.2                                                    </span></div>
</div>
</div>
</div>
<div class="gmail_extra"><br><br>
<div class="gmail_quote">On Thu, Mar 28, 2013 at 2:31 PM, Matthew Dowle <span>&lt;<a href="mailto:mdowle@mdowle.plus.com" target="_blank">mdowle@mdowle.plus.com</a>&gt;</span> wrote:<br>
<blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><span style="text-decoration:underline"></span>
<div>
<p> </p>
<p>Interesting, what's your sessionInfo() please?</p>
<p>For me it seems to work ok :</p>
<pre>[1] 1022
[1] 1023
[1] 1024   
[1] 9999<br><br></pre>
<pre>> sessionInfo()<br>R version 2.15.2 (2012-10-26)<br>Platform: x86_64-w64-mingw32/x64 (64-bit)</pre>
<div>
<p> </p>
<p>On 27.03.2013 22:49, Timothée Carayol wrote:</p>
</div>
<blockquote style="padding-left:5px;border-left:#1010ff 2px solid;margin-left:5px;width:100%">
<div dir="ltr">
<div>
<div class="gmail_default" style="font-family:arial,helvetica,sans-serif">Agree with Muhammad, longer character strings are definitely permitted in R.</div>
<div class="gmail_default" style="font-family:arial,helvetica,sans-serif">A minimal example that shows something strange happening with fread:</div>
</div>
<div class="gmail_default">
<div>
<div class="gmail_default"><span style="font-family:arial,helvetica,sans-serif">for (n in c(1023:1025, 10000)) {</span></div>
<div class="gmail_default"><span style="font-family:arial,helvetica,sans-serif">  A &lt;- fread(</span></div>
</div>
<div>
<div>
<div class="gmail_default"><span style="font-family:arial,helvetica,sans-serif">           paste(</span></div>
<div class="gmail_default"><span style="font-family:arial,helvetica,sans-serif">                 rep('a\tb\n', n),</span></div>
<div class="gmail_default"><span style="font-family:arial,helvetica,sans-serif">                 collapse=''</span></div>
<div class="gmail_default"><span style="font-family:arial,helvetica,sans-serif">                 ),</span></div>
<div class="gmail_default"><span style="font-family:arial,helvetica,sans-serif">           sep='\t'</span></div>
<div class="gmail_default"><span style="font-family:arial,helvetica,sans-serif">           )</span></div>
<div class="gmail_default" style="font-family:arial,helvetica,sans-serif">  print(nrow(A))</div>
<div class="gmail_default" style="font-family:arial,helvetica,sans-serif">}</div>
<div class="gmail_default" style="font-family:arial,helvetica,sans-serif">On my computer, I obtain:</div>
<div class="gmail_default">
<div class="gmail_default"><span style="font-family:arial,helvetica,sans-serif">[1] 1022</span></div>
<div class="gmail_default"><span style="font-family:arial,helvetica,sans-serif">[1] 1023</span></div>
<div class="gmail_default"><span style="font-family:arial,helvetica,sans-serif">[1] 1023</span></div>
<div class="gmail_default"><span style="font-family:arial,helvetica,sans-serif">[1] 1023</span></div>
</div>
<div class="gmail_default" style="font-family:arial,helvetica,sans-serif">Hope this helps</div>
<div class="gmail_default" style="font-family:arial,helvetica,sans-serif">Timothée</div>
</div>
</div>
</div>
</div>
<div>
<div>
<div class="gmail_extra"><br><br>
<div class="gmail_quote">On Wed, Mar 27, 2013 at 9:23 PM, Matthew Dowle <span>&lt;<a href="mailto:mdowle@mdowle.plus.com" target="_blank">mdowle@mdowle.plus.com</a>&gt;</span> wrote:<br>
<blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">Hi,<br> Nice to hear from you. Nope, not known to me. Obviously 4096 is 4k; is that<br> the R limit for a character string length? What happens at 4097?<br>

 Matthew<br>
<div>
<div><br> > Hi,<br> ><br> > I have an example of a string of 4097 characters which can't be parsed by<br> > fread; however, if I remove any character, it can be parsed just fine. Is<br> > that a known limitation?<br>

 ><br> > (If I write the string to a file and then fread the file name, it works<br> > too.)<br> ><br> > Let me know if you need the string and/or a bug report.<br> ><br> > Thanks<br> > Timothée</div>


</div>
</blockquote>
</div>
</div>
</div>
</div>
</blockquote>
<p> </p>
<div> </div>
</div>
</blockquote>
</div>
</div>
</blockquote>
<p> </p>
<div> </div>
</div></div></div>
</blockquote></div><br></div>
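<div>
<p>A small base-R sketch of the arithmetic behind the numbers in this thread. It assumes, without confirmation from anything above, that on the affected machines only the first 4096 characters of the text input are being read. Each repetition of 'a\tb\n' is 4 characters and one line, and line 1 is taken as the column names, so an intact input of n repetitions should give n - 1 data rows. The <code>limit</code> value and the variable names are illustrative only; nothing here calls fread.</p>
<pre># Sketch only: `limit` is an assumed truncation point, not a documented fread constant.
limit = 4096
n = c(1023, 1024, 1025, 10000)

# Same construction as the test loop in this thread: n copies of a 4-character line.
input_chars = sapply(n, function(k) nchar(paste(rep('a\tb\n', k), collapse='')))

# Line 1 becomes the column names, so an intact input has n - 1 data rows.
rows_if_intact = n - 1

# If only the first `limit` characters were read, at most limit/4 lines survive.
rows_if_truncated = pmin(input_chars, limit) %/% 4 - 1

print(data.frame(n, input_chars, rows_if_intact, rows_if_truncated))</pre>
<p>The input_chars column comes out as 4092, 4096, 4100, 40000, the same lengths printed by the verbose runs. rows_if_truncated is 1022, 1023, 1023, 1023, which matches the row counts reported on the two Linux machines, while rows_if_intact is 1022, 1023, 1024, 9999, which matches the output on the machine where it works. That is consistent with a cut-off at 4096 characters, but it does not by itself show where the cut-off happens.</p>
</div>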