[datatable-help] fread() coercing to character when seeing NA

Matthew Dowle mdowle at mdowle.plus.com
Mon Sep 30 20:58:10 CEST 2013


Yes, exactly. Bug #2660, "Improve fread na.strings handling", is on the
bug list:

https://r-forge.r-project.org/tracker/index.php?func=detail&aid=2660&group_id=240&atid=975

which points to:

http://stackoverflow.com/questions/15784138/bad-interpretation-of-n-a-using-fread

Matthew

On 30/09/13 15:06, Julien Barnier wrote:
> Hi,
>
>> dt3 <- fread( "a\n2\n4\n?\n5", na.strings=c("?"), colClasses=c(a="integer"))
> I think that running fread with verbose=TRUE answers your question:
>
> R> dt3 <- fread( "a\n2\n4\n?\n5", na.strings=c("?"),colClasses=c(a="integer"),
> verbose=TRUE)
> ... <snip> ...
> Column 1 ('a') has been detected as type 'character'. Ignoring request from
> colClasses to read as 'integer' (a lower type) since NAs would result.
>     0.000s (  0%) Memory map (rerun may be quicker)
>     0.000s (  0%) sep and header detection
>     0.000s (  0%) Count rows (wc -l)
>     0.000s (  0%) Column type detection (first, middle and last 5 rows)
>     0.000s (  0%) Allocation of 4x1 result (xMB) in RAM
>     0.000s (  0%) Reading data
>     0.000s (  0%) Allocation for type bumps (if any), including gc time if
> triggered
>     0.000s (  0%) Coercing data already read in type bumps (if any)
>     0.000s (  0%) Changing na.strings to NA
>     0.000s        Total
>
> As your «a» column contains the character string "?", fread detects the
> column as character. colClasses is then ignored, because reading the column
> as integer would produce possibly unwanted NA values. All of this, as I
> understand it, is because the replacement of na.strings by NA happens as the
> last step of fread, after the column type has been set.
>
> So it seems that the only workarounds are either to change your data,
> replacing your missing value code with a numerical value (like -9999 or
> anything else), or to convert your column back to numeric after using fread.
>
> Regards,
>
> Julien
>
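
For reference, the second workaround Julien describes (converting the column
after fread) might look like the sketch below. It is only an illustration
under a few assumptions: the column is read as character on purpose, the
explicit replacement of "?" by NA is my own addition and not something from
this thread, and more recent data.table versions may apply na.strings before
type detection, in which case none of this is needed.

```r
library(data.table)

# Read the column as text so the "?" placeholder survives type detection.
dt3 <- fread("a\n2\n4\n?\n5", colClasses = c(a = "character"))

# Replace the missing-value code with NA (assumed step, not from the thread),
# then convert the whole column to integer by reference.
dt3[a == "?", a := NA_character_]
dt3[, a := as.integer(a)]
```

Replacing "?" explicitly before the conversion avoids the "NAs introduced by
coercion" warning that as.integer() would otherwise raise.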
