[datatable-help] .SD without by seems to fail

Joseph Voelkel jgvcqa at rit.edu
Tue Jan 10 23:28:45 CET 2012


Same issue with .N and .BY, not surprisingly.

Regarding
  "as.data.table(lapply(DT, sum)) works and doesn't seem much less elegant?"

In my example, I was writing several statements using .SD with by=, and then wanted a statement
without the by=. It just seemed that I shouldn't need to change .SD to DT ... Same fundamental point as Chris's, but not in as elegant a context.
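
For reference, the interim workaround discussed in this thread (branching on whether by is empty) might look roughly like the sketch below. It is only an illustration under stated assumptions: the helper name sum_cols and its arguments are hypothetical and not part of data.table, y is built with rep() rather than relying on recycling, and the factor/character column x is left out of the sum, as Timothée pointed out.

```r
library(data.table)

# Example table modelled on the one from help(data.table) quoted below;
# y is spelled out with rep() instead of relying on recycling.
DT <- data.table(x = rep(c("a", "b", "c"), each = 3),
                 y = rep(c(1, 3, 6), times = 3),
                 v = 1:9)

# Hypothetical helper (not a data.table function): sum the columns in
# `cols`, grouped by `by.cols` when there are any, otherwise over the
# whole table.
sum_cols <- function(DT, cols, by.cols = character(0)) {
  if (length(by.cols) == 0L) {
    # No grouping columns: avoid .SD and use the as.data.table(lapply(...))
    # form suggested in the thread, restricted to the value columns.
    as.data.table(lapply(DT[, cols, with = FALSE], sum))
  } else {
    DT[, lapply(.SD, sum), by = by.cols, .SDcols = cols]
  }
}

total   <- sum_cols(DT, c("y", "v"))                # one row: y = 30, v = 45
grouped <- sum_cols(DT, c("y", "v"), by.cols = "x") # one row per value of x
```

The branch on length(by.cols) is exactly the special case that Chris would rather not have to write, which is the argument for letting .SD work without by=.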


-----Original Message-----
From: Matthew Dowle [mailto:mdowlenoreply at virginmedia.com] On Behalf Of Matthew Dowle
Sent: Monday, January 09, 2012 5:15 PM
To: Chris Neff
Cc: timothee.carayol at gmail.com; Joseph Voelkel; datatable-help at lists.r-forge.r-project.org
Subject: Re: [datatable-help] .SD without by seems to fail

OK, very persuasive. I've raised a bug report:

https://r-forge.r-project.org/tracker/index.php?func=detail&aid=1732&group_id=240&atid=975


On Mon, 2012-01-09 at 14:01 -0500, Chris Neff wrote:
> I think the original example should be able to work.  There are times
> where I am dynamically building the by list based on the number of
> subslice keys I have for my data.table. Sometimes I have to aggregate
> over a certain key, so if my keys are "x" and "y" and I have value
> columns V1, V2, V3 I will do something like:
> 
> by.cols = setdiff(key(DT), "x")
> DT[, sum(V1), by=by.cols]
> 
> Now if I only have one key column "x", I want this to aggregate over
> the whole data.table. This in the past didn't work until I submitted a
> bug report and it was fixed, but I never have had to do it with the
> whole .SD.  But I could easily see wanting to do
> 
> DT[, lapply(.SD, sum), by=by.cols]
> 
> especially if I include .SDcols with it too.  And I don't want to have
> to check for an empty by.cols and do some different code just for that
> case.  It makes sense to have it all consistent.
> 
> 
> 2012/1/9 Timothée Carayol <timothee.carayol at gmail.com>:
> > as.data.table(lapply(DT, sum)) works and doesn't seem much less elegant?
> >
> > (Well -- it would work if you were not trying to take the sum of a factor
> > ;-))
> >
> > t
> >
> > On Mon, Jan 9, 2012 at 4:45 PM, Joseph Voelkel <jgvcqa at rit.edu> wrote:
> >>
> >> # from  help(data.table)
> >>
> >> DT = data.table(x=rep(c("a","b","c"),each=3), y=c(1,3,6), v=1:9)
> >>
> >> DT[,lapply(.SD,sum),by=x]  # this works fine
> >>
> >>
> >>
> >> # but this fails
> >>
> >> DT[,lapply(.SD,sum)]
> >>
> >> # with this message: Error in lapply(.SD, sum) : object '.SD' not found
> >>
> >>
> >>
> >> # Am I missing something obvious here?
> >>
> >>
> >>
> >>
> >> _______________________________________________
> >> datatable-help mailing list
> >> datatable-help at lists.r-forge.r-project.org
> >>
> >> https://lists.r-forge.r-project.org/cgi-bin/mailman/listinfo/datatable-help
> >
> >
> >



