<div dir="ltr">All<div><br></div><div>I am trying to read in a data file using fread().</div><div><br></div><div>I am getting several warnings saying that a non-numeric entry was found in a numeric field and that, as a result, the column is being converted to a character vector. However, the non-numeric entry is one of the declared na.strings, and the specific entry is indeed returned as NA.</div>
<div><br></div><div>I expected the "?" entry to be recognised as NA and the column to be read as a numeric vector. I tried the same thing with read.table() and it works as I expected.</div><div><br>
</div><div>I am using:</div><div>R version 3.1.1 (pre-compiled)</div><div>RStudio Version 0.98.983</div><div>data.table package v1.92</div><div>locale is: en_GB.UTF-8</div><div>on:</div><div> OS-X Version 10.9.4</div><div>
<br></div><div>The code I am using is:</div><div><br></div><div><div>library("data.table")</div><div><br></div><div>column.class <- c(rep("character",2), rep("numeric",7))</div><div>
data2 <- fread("./data/household_power_consumption.txt",</div><div> sep=";",</div><div> na.strings=c("?",""),</div><div> colClasses=column.class,</div>
<div> header=TRUE,</div><div> nrows=7000,</div><div> verbose=TRUE</div><div>)</div></div><div><br></div><div>The first line in the data file that triggers the warning, and the line just before it, are:</div>
<div><div>21/12/2006;11:22:00;0.244;0.000;242.290;1.000;0.000;0.000;0.000</div><div>21/12/2006;11:23:00;?;?;?;?;?;?;</div></div><div><br></div><div>The 1st warning is:</div><div><div>1: In fread("./data/household_power_consumption.txt", na.strings = "?") :</div>
<div> Bumped column 3 to type character on data row 6840, field contains '?'. Coercing previously read values in this column from integer or numeric back to character which may not be lossless; e.g., if '00' and '000' occurred before they will now be just '0', and there may be inconsistencies with treatment of ',,' and ',NA,' too (if they occurred in this column before the bump). If this matters please rerun and set 'colClasses' to 'character' for this column. Please note that column type detection uses the first 5 rows, the middle 5 rows and the last 5 rows, so hopefully this message should be very rare. If reporting to datatable-help, please rerun and include the output from verbose=TRUE.</div>
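<div><br></div><div>In case it helps to reproduce this, here is a minimal sketch of the comparison using only the two data lines quoted above. It assumes a temporary file and header=FALSE (the header line of the real file is not shown here, so the columns come out as V1 to V9). The helper to.numeric() is a hypothetical name of my own, and reading everything as character and converting afterwards is only a possible workaround, not a fix for the fread() behaviour itself.</div>

```r
library(data.table)

# Two sample rows copied from the post. The file's header line is not shown
# there, so this sketch reads them with header = FALSE (columns V1..V9).
sample.lines <- c(
  "21/12/2006;11:22:00;0.244;0.000;242.290;1.000;0.000;0.000;0.000",
  "21/12/2006;11:23:00;?;?;?;?;?;?;"
)
tmp <- tempfile(fileext = ".txt")
writeLines(sample.lines, tmp)

column.class <- c(rep("character", 2), rep("numeric", 7))

# read.table(): "?" and "" are declared as NA strings, and columns 3 to 9
# are requested as numeric.
df <- read.table(tmp, sep = ";", header = FALSE,
                 na.strings = c("?", ""), colClasses = column.class)

# Possible workaround for fread(): read every column as character first,
# so no type bump can occur, then convert columns 3 to 9 by hand.
dt <- fread(tmp, sep = ";", header = FALSE,
            na.strings = c("?", ""), colClasses = rep("character", 9))

# Hypothetical helper: map "?" and "" to NA explicitly before converting,
# so the result does not depend on how na.strings was applied on reading.
to.numeric <- function(x) {
  x[x %in% c("?", "")] <- NA
  as.numeric(x)
}

num.cols <- paste0("V", 3:9)
dt[, (num.cols) := lapply(.SD, to.numeric), .SDcols = num.cols]

str(df)
str(dt)
```

<div>Comparing the two str() outputs shows whether columns 3 to 9 end up numeric in both cases, with NA in the second row.</div>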
</div><span class=""><font color="#888888"><div style="font-family:arial,sans-serif;font-size:13px">
<br></div><div style="font-family:arial,sans-serif;font-size:13px">Martin</div><div><br></div></font></span></div>