<html><head><style>body{font-family:Helvetica,Arial;font-size:13px}</style></head><body style="word-wrap: break-word; -webkit-nbsp-mode: space; -webkit-line-break: after-white-space; "><div id="bloop_customfont" style="font-family:Helvetica,Arial;font-size:13px; color: rgba(0,0,0,1.0); margin: 0px; line-height: auto;">Hi Martin,</div><div id="bloop_customfont" style="font-family:Helvetica,Arial;font-size:13px; color: rgba(0,0,0,1.0); margin: 0px; line-height: auto;"><br></div><div id="bloop_customfont" style="font-family:Helvetica,Arial;font-size:13px; color: rgba(0,0,0,1.0); margin: 0px; line-height: auto;">I'd recommend first trying the current development version to see whether this has already been fixed. Matt has already fixed several recurring fread bugs.</div><div id="bloop_customfont" style="font-family:Helvetica,Arial;font-size:13px; color: rgba(0,0,0,1.0); margin: 0px; line-height: auto;">You can get it from here: <a href="https://github.com/Rdatatable/data.table">https://github.com/Rdatatable/data.table</a>. Scroll down for the installation instructions.</div><div id="bloop_customfont" style="font-family:Helvetica,Arial;font-size:13px; color: rgba(0,0,0,1.0); margin: 0px; line-height: auto;"><br></div><div id="bloop_customfont" style="font-family:Helvetica,Arial;font-size:13px; color: rgba(0,0,0,1.0); margin: 0px; line-height: auto;">If you still get the error, could you please file a bug report at <a href="https://github.com/Rdatatable/data.table/issues">https://github.com/Rdatatable/data.table/issues</a> with a *reproducible example*? 
If necessary, you can also link to a *minimal* file that reproduces the issue; that would be very helpful.</div><div id="bloop_customfont" style="font-family:Helvetica,Arial;font-size:13px; color: rgba(0,0,0,1.0); margin: 0px; line-height: auto;"><br></div><div id="bloop_customfont" style="font-family:Helvetica,Arial;font-size:13px; color: rgba(0,0,0,1.0); margin: 0px; line-height: auto;">Thanks,</div> <div id="bloop_sign_1409959081516913920" class="bloop_sign"><div style="font-family:helvetica,arial;font-size:13px">Arun</div></div> <div style="color:black"><br>From: <span style="color:black">Martin Watts</span> <a href="mailto:martin.dunelm@gmail.com">&lt;martin.dunelm@gmail.com&gt;</a><br>Reply: <span style="color:black">Martin Watts</span> <a href="mailto:martin.dunelm@gmail.com">&lt;martin.dunelm@gmail.com&gt;</a><br>Date: <span style="color:black">September 4, 2014 at 3:09:13 PM</span><br>To: <span style="color:black">datatable-help@lists.r-forge.r-project.org</span> <a href="mailto:datatable-help@lists.r-forge.r-project.org">&lt;datatable-help@lists.r-forge.r-project.org&gt;</a><br>Subject: <span style="color:black"> [datatable-help] Unexpected Result Reading in Data File using fread <br></span></div><br> <blockquote type="cite" class="clean_bq"><span><div><div></div><div>
<div dir="ltr">All
<div><br></div>
<div>I am trying to read in a data file using fread()</div>
<div><br></div>
<div>I am getting several warnings saying that a non-numeric
entry was found in a numeric field and that, as a result, the column is
being converted to a character vector. However, the non-numeric
entry is one of the declared na.strings, and the specific
entry is indeed returned as NA.</div>
<div><br></div>
<div>I expected that the "?" entry would be recognised as NA and
the column would be read as a numeric vector. I have tried the same
thing with read.table() and it works as I expected.</div>
<div><br></div>
<div>I am using:</div>
<div>R version 3.1.1 (pre-compiled)</div>
<div>RStudio Version 0.98.983</div>
<div>data.table package v1.9.2</div>
<div>locale is: en_GB.UTF-8</div>
<div>on:</div>
<div> OS-X Version 10.9.4</div>
<div><br></div>
<div>the code I am using is:</div>
<div><br></div>
<div>
<div>library("data.table")</div>
<div><br></div>
<div>column.class <- c(rep("character",2),
rep("numeric",7))</div>
<div>data2 <-
fread("./data/household_power_consumption.txt",</div>
<div>
sep=";",</div>
<div>
na.strings=c("?",""),</div>
<div>
colClasses=column.class,</div>
<div>
header=TRUE,</div>
<div>
nrows=7000,</div>
<div>
verbose=TRUE</div>
<div>)</div>
</div>
<div><br></div>
<div>The first line in the data file that causes the problem, shown
after the line that precedes it:</div>
<div>
<div>
21/12/2006;11:22:00;0.244;0.000;242.290;1.000;0.000;0.000;0.000</div>
<div>21/12/2006;11:23:00;?;?;?;?;?;?;</div>
</div>
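<div><br></div>
<div>A minimal sketch of a reproducible example built from just these two lines, in case it helps. The temporary file <code>tmp</code>, the absence of a header row (hence <code>header=FALSE</code>) and the object names <code>base_df</code> and <code>dt</code> are assumptions made for illustration and are not part of the call above; the result of the fread() call may differ between data.table versions, so no output is shown for it:</div>
<pre>
```r
# Hypothetical minimal file: the two lines quoted above, with no header row.
tmp <- tempfile(fileext = ".txt")
writeLines(c(
  "21/12/2006;11:22:00;0.244;0.000;242.290;1.000;0.000;0.000;0.000",
  "21/12/2006;11:23:00;?;?;?;?;?;?;"
), tmp)

column.class <- c(rep("character", 2), rep("numeric", 7))

# Base R reader: "?" and empty fields are declared as NA strings.
base_df <- read.table(tmp, sep = ";", na.strings = c("?", ""),
                      colClasses = column.class, header = FALSE)
str(base_df$V3)

# Same arguments with fread(), only if data.table is installed.
if (requireNamespace("data.table", quietly = TRUE)) {
  dt <- data.table::fread(tmp, sep = ";", na.strings = c("?", ""),
                          colClasses = column.class, header = FALSE)
  str(dt$V3)
}
```
</pre>
<div>Comparing the two str() results shows directly whether the third column stays numeric in each reader.</div>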
<div><br></div>
<div>The 1st warning is:</div>
<div>
<div>1: In fread("./data/household_power_consumption.txt",
na.strings = "?") :</div>
<div> Bumped column 3 to type character on data row 6840,
field contains '?'. Coercing previously read values in this column
from integer or numeric back to character which may not be
lossless; e.g., if '00' and '000' occurred before they will now be
just '0', and there may be inconsistencies with treatment of ',,'
and ',NA,' too (if they occurred in this column before the bump).
If this matters please rerun and set 'colClasses' to 'character'
for this column. Please note that column type detection uses the
first 5 rows, the middle 5 rows and the last 5 rows, so hopefully
this message should be very rare. If reporting to datatable-help,
please rerun and include the output from verbose=TRUE.</div>
</div>
<div style="font-family:arial,sans-serif;font-size:13px">
<span class=""><font color="#888888"><br></font></span></div>
<div style="font-family:arial,sans-serif;font-size:13px">
<span class=""><font color="#888888">Martin</font></span></div>
<div><span class=""><font color="#888888"><br></font></span></div>
</div>
_______________________________________________
<br>datatable-help mailing list
<br>datatable-help@lists.r-forge.r-project.org
<br>https://lists.r-forge.r-project.org/cgi-bin/mailman/listinfo/datatable-help</div></div></span></blockquote></body></html>