[datatable-help] fread: coercion of class from integer to character due to NA string.

Matthew Dowle mdowle at mdowle.plus.com
Thu Apr 25 03:25:50 CEST 2013


Hi,
Thanks for reporting. Yes, this is all known and will be tackled. colClasses
should hopefully be in the next commit.
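
Until then, one possible workaround is to let fread read the column as
character and convert it afterwards by reference. A minimal sketch, assuming
the SIC column is called "sic" (a made-up name, since the real header was not
posted) and using a small temporary file as a stand-in for data6281.csv:

```r
library(data.table)

# Hypothetical stand-in for data6281.csv; the column name "sic" is assumed.
tmp <- tempfile(fileext = ".csv")
writeLines(c("id,sic", "1,6281", "2,Z", "3,6282"), tmp)

# The "Z" entry makes fread type the sic column as character.
DT <- fread(tmp, header = TRUE)

# Same placeholder strings as in the original na.strings argument.
na_codes <- c("C", ".", "B", "Z", "")

DT[, sic := as.character(sic)]               # no-op if already character
DT[sic %in% na_codes, sic := NA_character_]  # blank out the placeholders
DT[, sic := as.integer(sic)]                 # convert the whole column

unlink(tmp)
```

This keeps the speed of fread for the read itself; the extra cost is one pass
over each affected column.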
Matthew

> Hi there,
>
> I think this is probably a known issue, but just in case, here it is.
>
> I am trying to use fread to read a very large csv file, but I am having
> problems due to the fact that NAs in a numeric column are represented with
> some letters. For example, in my column of SIC codes I have "Z" to
> represent NAs. Even though I explicitly set those to be NAs in the
> command:
>
> data6281 <- fread("data6281.csv",header=TRUE,
> na.strings=c("C",".","B","Z",""))
>
> I get the warning message that the column was changed to character even
> though it is supposed to be integer.
>
> With read.csv I have no problem when I use the command
>
> data6281 <- data.table(read.csv("data6281.csv",header=TRUE,
> colClasses=c("integer","integer","integer","integer","integer","factor","character","factor","numeric","numeric","integer"),
> na.strings=c("C",".","B","Z","")))
>
> but fread does not allow me to set the column classes since it doesn't
> accept the argument colClasses.
>
> A shame really. fread is much faster, and I love that it shows the %
> progress.
>
> I don't suppose there is a way around this, but if there is I would be
> glad to know.
>
> I would also be happy to provide an example if that's necessary.
>
> Cheers,
>
> Vivianne Siqueira Campos Vilar
> ----------------------------------------------
> “Don't worry about the world coming to an end today. It is already
> tomorrow in Australia.”
> _______________________________________________
> datatable-help mailing list
> datatable-help at lists.r-forge.r-project.org
> https://lists.r-forge.r-project.org/cgi-bin/mailman/listinfo/datatable-help