[datatable-help] New function fread() in v1.8.7

Matthew Dowle mdowle at mdowle.plus.com
Mon Jan 7 22:06:37 CET 2013


No, that wasn't known. Now added to the to-do list.

Thanks!
Matthew

On 05.01.2013 21:05, patricknic wrote:
> Hit a snag reading some imperfect data. I'm not sure what it was
> exported from, but the file has some lines with consecutive quotation
> marks (i.e., a character field actually contained quotation marks
> before it was written to a text file). Not sure if this is a known
> issue. A reproducible example:
>
> text <- paste(rep(c('a,b,c,d,e,f\na,b,c,"d",e,f\na,b,c,""d"",e,f'), 10000),
>               collapse="\n")
> f <- tempfile()
> writeLines(text, f)
>
> df <- read.table(f, sep=",")
> dt <- fread(f, sep=",", header=FALSE)
>
>
> No error for read.table, but I get this error for fread:
>
> Error in fread(f, sep = ",", header = FALSE) :
>   Expected sep (',') but 'd' ends field 4 on line 30 when detecting types:
> a,b,c,""d
>
>
> This also gave me an idea for a suggestion: text replacement in
> readfile.c. (I'm no C programmer, so I don't know if this would be
> more trouble than it's worth. Also, not sure if it is in your project
> scope.) An R mock-up (still using fread) of this would be something
> like:
>
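> freadWrapper <- function(input=f, eliminate='"', ...) {
>   # read from the 'input' argument, not the global f
>   A <- readLines(input)
>   # fixed=TRUE: treat 'eliminate' as a literal string, not a regex
>   B <- gsub(eliminate, "", A, fixed=TRUE)
>   C <- paste(B, collapse="\n")
>   fread(C, ...)
> }
> freadWrapper(f, sep=",", stringsAsFactors=FALSE, header=FALSE)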
>
> --
> View this message in context:
> 
> http://r.789695.n4.nabble.com/New-function-fread-in-v1-8-7-tp4653745p4654754.html
> Sent from the datatable-help mailing list archive at Nabble.com.
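
For anyone who hits the same error before this is handled inside fread, the quoted workaround can be sketched with base R only. This is a rough sketch, not anything that ships with data.table: clean_text() is a hypothetical helper name, and the example file is a cut-down copy of the one in the report (3 repetitions instead of 10000). It assumes every quotation mark in the file is unwanted, which holds for the sample data but may not hold for real exports with embedded separators.

```r
# Hypothetical helper (not part of data.table): remove a literal string from
# every line of a file and return the cleaned text as a single string.
clean_text <- function(path, eliminate = '"') {
  lines <- readLines(path)
  # fixed = TRUE treats `eliminate` as a literal string, not a regex
  cleaned <- gsub(eliminate, "", lines, fixed = TRUE)
  paste(cleaned, collapse = "\n")
}

# Cut-down version of the reproducible example from the report
text <- paste(rep('a,b,c,d,e,f\na,b,c,"d",e,f\na,b,c,""d"",e,f', 3),
              collapse = "\n")
f <- tempfile()
writeLines(text, f)

cleaned <- clean_text(f)

# Base R reader, so the sketch runs without data.table installed
df <- read.table(text = cleaned, sep = ",", header = FALSE,
                 stringsAsFactors = FALSE)

# Assumption: with data.table loaded, the cleaned text could be passed to
# fread in place of a file name, as in the quoted mock-up:
# dt <- data.table::fread(cleaned, sep = ",", header = FALSE)
```

Keeping the cleaning step separate from the reader makes it easy to inspect the cleaned text first, and to switch between read.table and fread without touching the substitution.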


