<div dir="ltr">Hi there,<br><br>I think this is probably a known issue, but just in case, here it is.<br><br>I am trying to use fread to read a very large CSV file, but I am having problems because NAs in a numeric column are represented by letters. For example, in my column of SIC codes I have "Z" to represent NAs. Even though I explicitly set those to be NAs in the command:<br>
<br>data6281 <- fread("data6281.csv",header=TRUE, na.strings=c("C",".","B","Z",""))<br><div><br></div><div>I get a warning that the column was converted to character, even though it is supposed to be integer.<br>
<br></div><div>With read.csv I have no problem when I use the command<br><br>data6281 <- data.table(read.csv("data6281.csv",header=TRUE, colClasses=c("integer","integer","integer","integer","integer","factor","character","factor","numeric","numeric","integer"), na.strings=c("C",".","B","Z","")))<br>
<br></div><div>but fread does not allow me to set the column classes since it doesn't accept the argument colClasses.<br><br></div><div>A shame really. fread is much faster, and I love that it shows the % progress.<br>
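<br>For reference, here is a minimal sketch of the read.csv behaviour I am relying on. The sample data and the column names (sic, name, sales) are made up for illustration and are not taken from data6281.csv:<br>

```r
# Made-up sample standing in for data6281.csv (assumption, not the real file).
# "Z" and "." mark missing values, as in the na.strings used above.
csv_text <- "sic,name,sales
1234,Acme,10.5
Z,Bolt,.
5678,Core,3.2"

sample_df <- read.csv(
  text = csv_text,
  header = TRUE,
  colClasses = c("integer", "character", "numeric"),
  na.strings = c("C", ".", "B", "Z", "")
)

# Inspect the resulting column types.
str(sample_df)

# The NA codes are turned into NA before the type conversion,
# so the columns keep the requested classes.
stopifnot(is.integer(sample_df$sic), is.na(sample_df$sic[2]))
stopifnot(is.numeric(sample_df$sales), is.na(sample_df$sales[2]))
```

<br>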
<br></div><div>I don't suppose there is a way around this, but if there is I would be glad to know.<br><br></div><div>I would also be happy to provide an example if that's necessary.<br></div><br><div>Cheers,<br>
</div><div><br clear="all"><div><div dir="ltr">Vivianne Siqueira Campos Vilar<div>----------------------------------------------</div>
<div>“Don't worry about the world coming to an end today. It is already tomorrow in Australia.”<br></div></div></div>
</div></div>