[datatable-help] data.table on existing data.frame list

iembry iruckaE at mail2world.com
Tue Aug 6 04:12:48 CEST 2013


Hi Matthew, thank you for your prompt and great assistance.

Yes, setting autostart = 40 does work, and it did detect the column
names.

To read in the .exsa.rdb files, I created the following function:

getDataRatingDepotFiles <- function(file, hasHeader = TRUE, separator = "\t")
{
    # Read the delimited file; comment.char = "#" drops the banner lines
    RDdatatmp <- as.matrix(read.table(file, sep = separator, fill = TRUE,
        comment.char = "#", header = hasHeader, stringsAsFactors = FALSE,
        na.strings = "NA", col.names = c("y", "shift", "x", "stor")))
    # Drop the first row and the fourth column ("stor");
    # drop = FALSE keeps a matrix even if only one row remains
    RDdatatmp <- RDdatatmp[-1, -4, drop = FALSE]
    RDdatatmp <- as.data.frame(RDdatatmp, stringsAsFactors = FALSE)
    # Convert the remaining columns to numeric
    RDdatatmp$y <- as.numeric(RDdatatmp$y)
    RDdatatmp$x <- as.numeric(RDdatatmp$x)
    RDdatatmp$shift <- as.numeric(RDdatatmp$shift)
    return(RDdatatmp)
}

I created an object called sitefiles that lists the files with the file
extension that I want. In the same folder there are files with two other
file extensions that I do not want to use in this project.

sitefiles <- list.files(path = "/tried", pattern = "\\.exsa\\.rdb$",
    full.names = TRUE)
getratings <- lapply(sitefiles, getDataRatingDepotFiles)
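
For illustration only, here is a minimal base R sketch of the same select-then-read pattern on throwaway files. The directory, file names, file contents, and the helper read_one are all hypothetical; only the ".exsa.rdb" suffix and the column names come from the code above. It shows that pattern is a regular expression, so the dots are escaped and the match is anchored at the end of the name.

```r
# Hypothetical demo directory and files (not the real rating depot files)
demo_dir <- file.path(tempdir(), "rating_demo")
dir.create(demo_dir, showWarnings = FALSE)

# Two small tab-separated files with a "#" banner, plus one file to be ignored
banner <- c("# banner line 1", "# banner line 2")
body <- c("y\tshift\tx\tstor",
          "fmt\tfmt\tfmt\tfmt",    # placeholder row, dropped after reading
          "1.5\t0\t10\ta",
          "2.5\t0\t20\tb")
writeLines(c(banner, body), file.path(demo_dir, "a.exsa.rdb"))
writeLines(c(banner, body), file.path(demo_dir, "b.exsa.rdb"))
writeLines("unrelated", file.path(demo_dir, "a.other.rdb"))

# Keep only names that end in ".exsa.rdb"
demo_files <- list.files(path = demo_dir, pattern = "\\.exsa\\.rdb$",
                         full.names = TRUE)

# Hypothetical reader: skip the banner, drop the first row and the stor column
read_one <- function(file) {
    tab <- read.table(file, sep = "\t", header = TRUE, comment.char = "#",
                      fill = TRUE, stringsAsFactors = FALSE)
    tab <- tab[-1, c("y", "shift", "x")]
    tab[] <- lapply(tab, as.numeric)   # character columns to numeric
    tab
}

# One data.frame per matching file
demo_ratings <- lapply(demo_files, read_one)
```

The result is a list with one data.frame per matching file, which is the same shape as getratings.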

Is there any way to replicate the above with fread?

Irucka








It seems the comments are really a banner at the start of the file, so this
is all built into fread already. But the banner in the example is 34 rows,
so the default of autostart=30 isn't enough.  Try:

    fread("03217500.exsa.rdb", autostart=40)

That should do it in one shot, including detecting the column names. I've
just increased autostart a bit to be within the data block.  See ?fread for
a detailed description of autostart and the procedure.

Btw, if there is more than one table in a single file, then setting
autostart to a line within each table is how to read each one in.  And
provided there is no footer, you can set autostart to be very large, too
(with the downside of the time needed to seek back from the end to find the
column names).

Matthew



--
View this message in context: http://r.789695.n4.nabble.com/data-table-on-existing-data-frame-list-tp4673142p4673201.html
Sent from the datatable-help mailing list archive at Nabble.com.

