[datatable-help] fread() coercion bug?

Matt Dowle mdowle at mdowle.plus.com
Sun May 4 10:50:32 CEST 2014


Reproduced, thanks. Can't think why that is, but will fix.  Please file 
as a bug so it's not forgotten.

In the meantime, setting the class for that column manually (via the 
colClasses argument) works in this example:

fread( paste0( strData, collapse="\n" ), integer64="character",
       colClasses=list(numeric="b") )

Is that workable for the full example?  I've used the list syntax for 
colClasses so that you can more easily pass a vector of column names to 
be read as numeric, if need be.
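
A minimal sketch of that list syntax with more than one column, in case it 
helps. The extra column "c" and the result name dtR are hypothetical, added 
only for illustration; the rest follows the example from the original report:

```r
library(data.table)

# Same shape as the reported example, plus a hypothetical second
# numeric column 'c' (not part of the original report).
dtT <- data.table( a = 1:72, b = 0, c = 0 )
dtT[ 13, b := 2464.77 ]
dtT[ 20, c := 3.5 ]

strData <- capture.output( write.table( dtT, row.names=FALSE,
                                        quote=FALSE, sep="\t" ) )

# The list name is the target class; the value is a vector of the
# column names to be read as that class.
dtR <- fread( paste0( strData, collapse="\n" ), integer64="character",
              colClasses=list(numeric=c("b", "c")) )

# Inspect the resulting column classes.
sapply( dtR, class )
```

Naming the columns rather than giving positions should keep the call valid 
if the column order in the file changes.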

Matt


On 03/05/14 02:08, Harish wrote:
> I was trying to use fread() to read data when I got the following 
> warning:
>
> In fread(paste0(strData, collapse = "\n"), integer64 = "character") :
>    Bumped column 2 to type character on data row 13, field contains '2464.77'. Coercing previously read values in this column from integer or numeric back to character which may not be lossless; e.g., if '00' and '000' occurred before they will now be just '0', and there may be inconsistencies with treatment of ',,' and ',NA,' too (if they occurred in this column before the bump). If this matters please rerun and set 'colClasses' to 'character' for this column. Please note that column type detection uses the first 5 rows, the middle 5 rows and the last 5 rows, so hopefully this message should be very rare. If reporting to datatable-help, please rerun and include the output from verbose=TRUE.
>
> This makes no sense, because "2464.77" is a perfectly legitimate number 
> and there is no reason to coerce the column to character for it.
>
> Here is how to reproduce it:
>
>    library(data.table)
>
>    dtT <- data.table( a = 1:72, b=0 )
>    dtT[ 13, b := 2464.77 ]
>
>    strData <- capture.output( write.table( dtT, row.names=FALSE,
>                                            quote=FALSE, sep="\t" ) )
>    fread( paste0( strData, collapse="\n" ), integer64="character" )
>
> Note that the following works okay without the integer64="character" 
> argument:
>    dtT <- data.table( a = 1:72, b=0 )
>    dtT[ 13, b := 2464.77 ]
>
>    strData <- capture.output( write.table( dtT, row.names=FALSE,
>                                            quote=FALSE, sep="\t" ) )
>    fread( paste0( strData, collapse="\n" ) )
>
> I would appreciate it if you could suggest a workaround for this.  The 
> reason I am using the integer64="character" argument is that I 
> sometimes have large numbers which seem to cause issues once they are 
> read as integer64.  That might have nothing to do with data.table, but 
> I have not had time to look into it.  My workaround for that issue was 
> to read them as character, but then I run into the above issue.
>
> Thanks for your help.
>
> Regards,
> Harish