[datatable-help] data.table on existing data.frame list

Matthew Dowle mdowle at mdowle.plus.com
Tue Aug 6 19:41:44 CEST 2013


What I previously suggested should work; i.e.,

big = rbindlist(lapply(fileNameVector, fread, autostart=40))
big[,y:=y+shift]
big[,shift:=NULL]

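Just replace 'fileNameVector' with 'sitefiles'. fread reads one file per
call, which is why passing the whole vector gave that error; lapply calls
it once for each file name.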
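
In case it helps, here is a small self-contained sketch of that pattern. The
two temporary files and their contents are made up purely for illustration
(they have no banner, so autostart is left out here); for the real
.exsa.rdb files, pass autostart=40 through lapply as above.

```r
library(data.table)

# Hypothetical stand-ins for the real .exsa.rdb files: two small
# tab-separated files written to a temporary directory.
tmpdir <- tempfile("ratings")
dir.create(tmpdir)
writeLines(c("y\tshift\tx", "1.0\t0.5\t10", "2.0\t0.5\t20"),
           file.path(tmpdir, "a.exsa.rdb"))
writeLines(c("y\tshift\tx", "3.0\t-0.25\t30"),
           file.path(tmpdir, "b.exsa.rdb"))

# A character vector of paths, as list.files() returns.
sitefiles <- list.files(tmpdir, pattern = "\\.exsa\\.rdb$", full.names = TRUE)

# fread() takes one file at a time, so call it once per path and
# stack the resulting tables into one.
big <- rbindlist(lapply(sitefiles, fread))
big[, y := y + shift]   # apply the shift to y by reference
big[, shift := NULL]    # then drop the shift column
print(big)
```

The result is a single data.table with columns y and x, rather than a list
of data.frames.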


On 06/08/13 16:56, Irucka Embry wrote:
> Hi Matthew, how are you?
>
> Thank you for the notes on fread. I had tried fread to read sitefiles 
> (see the previous e-mail), but this error message was returned:
>
> Error in fread(sitefiles) :
> 'input' must be a single character string containing a file name, full 
> path to a file, a URL starting 'http://' or 'file://', or the input 
> data itself
>
> Is there a workaround to get fread to read a vector of file paths like sitefiles?
>
> I was detailing what I was doing with read.table to make sure that 
> fread could also accomplish those same objectives with the files.
>
> Thank you.
>
> Irucka
>
>
>
> <-----Original Message----->
> >From: Matthew Dowle [mdowle at mdowle.plus.com]
> >Sent: 8/6/2013 3:49:44 AM
> >To: iruckaE at mail2world.com
> >Cc: datatable-help at lists.r-forge.r-project.org
> >Subject: Re: [datatable-help] data.table on existing data.frame list
> >
> >On 06/08/13 03:12, iembry wrote:
> >> Hi Matthew, thank you for your prompt and great assistance.
> >>
> >> Yes, moving the autostart = 40 does work. Yes, it did detect the column
> >> names.
> >Great.
> >>
> >> In order to read in the .exsa.rdb files I created the following function:
> >>
> >> getDataRatingDepotFiles <- function (file, hasHeader = TRUE, separator = "\t")
> >> {
> >> RDdatatmp <- as.matrix(read.table(file, sep = "\t", fill = TRUE,
> >> comment.char = "#", header = T, as.is = TRUE, stringsAsFactors = FALSE,
> >> na.strings = "NA", col.names = c("y", "shift", "x", "stor")))
> >> RDdatatmp <- as.matrix(RDdatatmp[c(-1), c(-4)])
> >> RDdatatmp <- as.data.frame(RDdatatmp, stringsAsFactors = FALSE)
> >> RDdatatmp$y <- as.numeric(as.character(RDdatatmp$y))
> >> RDdatatmp$x <- as.numeric(as.character(RDdatatmp$x))
> >> RDdatatmp$shift <- as.numeric(as.character(RDdatatmp$shift))
> >> return(RDdatatmp)
> >> }
> >>
> >> I created an object called sitefiles that has the pattern of the file
> >> extension that I want. In the same folder there are files with two other
> >> file extensions that I do not want to use in this project.
> >>
> >> sitefiles <- list.files(path = "/tried", pattern = "\\.exsa\\.rdb$", full.names = TRUE)
> >> getratings <- lapply(sitefiles, getDataRatingDepotFiles)
> >>
> >> Is there any way to replicate the above with fread?
> >I don't follow. fread reads the file. 'select' arg can be used to
> >select columns, or you can use setnames() afterwards to rename them.
> >fread doesn't create factors anyway. The numeric columns should be
> >detected automatically but you can pass 'colClasses' manually to fread
> >if you need to read integer data as a numeric type, in the latest
> >version. Or are you asking if fread can read multiple files?
> >
> >
> >>
> >> Irucka
> >>
> >>
> >>
> >>
> >>
> >>
> >>
> >>
> >> The comments are really a banner at the start of the file it seems. So
> >> this is all built in to fread already. But the banner in the example is
> >> 34 rows, so the default of autostart=30 isn't enough. Try:
> >>
> >> fread("03217500.exsa.rdb", autostart=40)
> >>
> >> That should do it in one shot, including detecting the column names.
> >> I've just increased autostart a bit to be within the data block. See
> >> ?fread for a detailed description of autostart and the procedure.
> >>
> >> Btw, if there is more than one table in a single file, then setting
> >> autostart to be within each one is how to read each one in. And provided
> >> there is no footer, you can set autostart to be very large, too (with
> >> downside of time to seek back from the end to find the column names).
> >>
> >> Matthew
> >>
> >>
> >>
> >>
> >
