[datatable-help] Reading corrupt csv and replace wrong value
Allan Engelhardt
allane at cybaea.com
Sat Jun 18 01:11:14 CEST 2011
Probably easier to change it outside of R, e.g.
perl -pe 's{0742076391\?39524}{whatever}g' file > newfile
but you may want to check that it really is a '?' character and not just
printed that way.
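One way to check which character it really is, sketched in R. The sample line below is hypothetical and stands in for the corrupt row of the real file, and the set of "expected" characters (digits, ';' and ',') is an assumption about what a clean row contains:

```r
# Hypothetical sample line standing in for the corrupt row of the real file
bad_line <- "1;0,0742076391?39524;2"

# Split into single characters and keep those that are not digits, ';' or ','
chars <- strsplit(bad_line, "")[[1]]
suspect <- chars[!grepl("[0-9;,]", chars)]

# Show the raw byte value(s) of each suspect; a genuine '?' is the single byte 3f
suspect_bytes <- lapply(suspect, charToRaw)
print(suspect)
print(suspect_bytes)
```

If the bytes turn out to be anything other than 3f, the search string in the replacement has to use that character instead.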
You could of course write this in R along the lines of
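while (length(line <- readLines(con_in, n = 1L)) > 0) {
  # fixed = TRUE makes '?' match literally instead of acting as a regex quantifier
  line <- sub("0,0742076391?39524", "whatever", line, fixed = TRUE)
  writeLines(line, con_out)
}
for suitable connections con_in and con_out ('in' is a reserved word in R,
so it cannot be used as a variable name). Both connections must already be
open; on an unopened connection each readLines() call would start again at
the first line.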
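A fuller sketch of the same loop, with the connections opened and closed explicitly. The file names and the small sample file are hypothetical stand-ins for the real CSV, and the replacement value "0" is only a placeholder (it is what the original question suggested, not necessarily the right number):

```r
# Hypothetical paths; a tiny temp file stands in for the real CSV
in_path  <- tempfile(fileext = ".csv")
out_path <- tempfile(fileext = ".csv")
writeLines(c("a;b", "1,5;0,0742076391?39524", "2,5;3,5"), in_path)

# Open both connections up front so readLines() advances through the file
con_in  <- file(in_path,  open = "r")
con_out <- file(out_path, open = "w")

while (length(line <- readLines(con_in, n = 1L)) > 0) {
  # Literal (non-regex) replacement of the corrupt value; "0" is a placeholder
  line <- sub("0,0742076391?39524", "0", line, fixed = TRUE)
  writeLines(line, con_out)
}

close(con_in)
close(con_out)

# The cleaned file should now parse with decimal commas as numeric columns
fixed <- read.csv2(out_path, stringsAsFactors = FALSE)
str(fixed)
```

Reading one line at a time keeps memory use low for a very large file; passing a larger n to readLines() would process it in bigger chunks, since sub() is vectorised over lines.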
HTH
Allan
On 16/06/11 22:40, DanMik wrote:
> I'm fairly new to R.
>
> I have a huge csv file, of 400.000+ K, and now it looks like one of the
> values is corrupt (it contains a ?, so one value becomes
> "0,0742076391?39524").
> Because of the size I can't edit it in a text editor, and the file took
> several days to create (many calculations).
>
> When I read the file, it can't be converted to numbers because of this one
> value, which I found with scan() and whose coordinates I know.
>
> I'm reading the file with:
>
> x <- read.csv2("filename.csv", stringsAsFactors = FALSE)
>
> Can I read the file with everything as numeric, and replace non-numeric
> values with 0?
>
> Or somehow correct this one value?
>
> I have tried first reading the file, then setting the value to 0, and then
> using as.matrix and afterwards as.numeric. This just creates a lot of NAs.
>