[datatable-help] .SD without by seems to fail

Matthew Dowle mdowle at mdowle.plus.com
Wed Jan 11 22:22:23 CET 2012


Thanks. .N and .BY added to FR#1732.

On Tue, 2012-01-10 at 17:28 -0500, Joseph Voelkel wrote:
> Same issue with .N and .BY, not surprisingly.
> 
> Regarding
>   "as.data.table(lapply(DT, sum)) works and doesn't seem much less elegant?"
> 
> In my example, I was writing several statements using .SD with by=, and then wanted a statement
> without the by=. It just seemed that I shouldn't need to change .SD to DT ... Same fundamental point as Chris's, but not in as elegant a context.
> 
> 
> -----Original Message-----
> From: Matthew Dowle [mailto:mdowlenoreply at virginmedia.com] On Behalf Of Matthew Dowle
> Sent: Monday, January 09, 2012 5:15 PM
> To: Chris Neff
> Cc: timothee.carayol at gmail.com; Joseph Voelkel; datatable-help at lists.r-forge.r-project.org
> Subject: Re: [datatable-help] .SD without by seems to fail
> 
> Ok, very persuasive. I've raised a bug report:
> 
> https://r-forge.r-project.org/tracker/index.php?func=detail&aid=1732&group_id=240&atid=975
> 
> 
> On Mon, 2012-01-09 at 14:01 -0500, Chris Neff wrote:
> > I think the original example should be able to work.  There are times
> > where I am dynamically building the by list based on the number of
> > subslice keys I have for my data.table. Sometimes I have to aggregate
> > over a certain key, so if my keys are "x" and "y" and I have value
> > columns V1, V2, V3 I will do something like:
> > 
> > by.cols = setdiff(key(DT), "x")
> > DT[, sum(V1), by=by.cols]
> > 
> > Now if I only have one key column "x", I want this to aggregate over
> > the whole data.table. This didn't work in the past until I submitted a
> > bug report and it was fixed, but I have never had to do it with the
> > whole .SD.  But I could easily see wanting to do
> > 
> > DT[, lapply(.SD, sum), by=by.cols]
> > 
> > especially if I include .SDcols with it too.  And I don't want to have
> > to check for an empty by.cols and do some different code just for that
> > case.  It makes sense to have it all consistent.
> > 
> > 
> > 2012/1/9 Timothée Carayol <timothee.carayol at gmail.com>:
> > > as.data.table(lapply(DT, sum)) works and doesn't seem much less elegant?
> > >
> > > (Well -- it would work if you were not trying to take the sum of a factor
> > > ;-))
> > >
> > > t
> > >
> > > On Mon, Jan 9, 2012 at 4:45 PM, Joseph Voelkel <jgvcqa at rit.edu> wrote:
> > >>
> > >> # from  help(data.table)
> > >>
> > >> DT = data.table(x=rep(c("a","b","c"),each=3), y=c(1,3,6), v=1:9)
> > >>
> > >> DT[,lapply(.SD,sum),by=x]  # this works fine
> > >>
> > >>
> > >>
> > >> # but this fails
> > >>
> > >> DT[,lapply(.SD,sum)]
> > >>
> > >> # with this message: Error in lapply(.SD, sum) : object '.SD' not found
> > >>
> > >>
> > >>
> > >> # Am I missing something obvious here?
> > >>
> > >>
> > >>
> > >>
> > >> _______________________________________________
> > >> datatable-help mailing list
> > >> datatable-help at lists.r-forge.r-project.org
> > >>
> > >> https://lists.r-forge.r-project.org/cgi-bin/mailman/listinfo/datatable-help
> > >
> > >
> > >
> 
> 
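For anyone reading this in the archive on a version that still gives the "object '.SD' not found" error, here is a rough sketch of how the two suggestions in this thread might be combined so that one call covers both an empty and a non-empty by list. The helper name sum_by and its arguments are made up for illustration and are not part of data.table. Restricting the sums to numeric columns is also my assumption, to sidestep the sum-of-a-factor problem mentioned above.

```r
library(data.table)

# Example table from help(data.table), as used earlier in the thread
DT = data.table(x=rep(c("a","b","c"),each=3), y=c(1,3,6), v=1:9)

# Hypothetical helper (not a data.table function): sum the numeric,
# non-grouping columns of DT, grouped by by.cols when there are any.
sum_by = function(DT, by.cols = character(0)) {
  # sum() fails on character/factor columns, so keep numeric ones only
  is.num = vapply(DT, is.numeric, logical(1))
  val.cols = setdiff(names(DT)[is.num], by.cols)
  if (length(by.cols) == 0L) {
    # empty by: whole-table form, avoiding .SD altogether
    as.data.table(lapply(DT[, val.cols, with=FALSE], sum))
  } else {
    # non-empty by: the usual grouped form with .SD and .SDcols
    DT[, lapply(.SD, sum), by=by.cols, .SDcols=val.cols]
  }
}

by.cols = setdiff("x", "x")   # empty, as when "x" is the only key column
sum_by(DT, by.cols)           # one row of column totals
sum_by(DT, "x")               # one row per value of x
```

The branch on length(by.cols) is exactly the special-casing Chris would rather not write by hand, so this only tucks it away in one place until the grouped and ungrouped forms behave consistently.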



