[datatable-help] Feature Idea

Steve Lianoglou mailinglist.honeypot at gmail.com
Fri Jul 8 20:29:01 CEST 2011


Hi,

I often find myself wanting to use a data.table for its quick
aggregate & summary mojo, but I want to keep the ordering of my data as
I have it, and not as it would be if I set the appropriate keys for my
aggregation/summary.

How would you folks feel if I added a `by` (or dt.by) method for data.table, e.g.:

result <- by(some.data.table, would.be.keys, {
 ## stuff
}, ...)

This does the aggregate/summary encoded within { ... }, but the
result is returned in the same order as `some.data.table` was in when
it was passed into the function -- if { ... } returns as many rows as
were in the original data.table, then it's 1-for-1, but if you are
summarizing groups of rows, the summaries would be in the same
(appearance) order as the groups are in `some.data.table`.

The { ... } block would essentially be anything you can put in the `j`
part of a data.table[i, j, ...].

The `...` dots after { ... } may be extra params that get passed
into a "normal" data.table[i,j,...] call (haven't thought about that
yet, tho).
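
To make the idea concrete, here is a rough sketch of the behaviour I have in
mind. The name `by_in_order` and its arguments are made up for illustration
(nothing like it exists in data.table), and for simplicity it takes a function
of the per-group subset rather than a { ... } block. It also assumes the
function returns at least one column:

```r
library(data.table)

# Hypothetical helper, not part of data.table: run `fun` on each group and
# return the combined result in the appearance order of `dt`.
by_in_order <- function(dt, by_cols, fun) {
  res <- dt[, {
    out <- as.list(fun(.SD))
    n_out <- length(out[[1L]])  # assumes `fun` returns at least one column
    # 1-for-1 results keep their own row numbers; summaries are placed at
    # the row where their group first appears
    pos <- if (n_out == .N) .I else rep(.I[1L], n_out)
    c(list(.pos = pos), out)
  }, by = by_cols]
  setorderv(res, ".pos")     # restore the original (appearance) order
  res[, ".pos" := NULL]      # drop the bookkeeping column
  res[]
}

dt <- data.table(g = c("b", "a", "b", "a"), x = c(1, 2, 3, 4))

# Summary per group: groups come back as "b" then "a", as they appear in dt
summ <- by_in_order(dt, "g", function(sd) list(total = sum(sd$x)))

# One row out per row in: rows come back in the same order as dt
one2one <- by_in_order(dt, "g", function(sd) list(centered = sd$x - mean(sd$x)))
```

The extra row-number column is only bookkeeping; a real implementation would
presumably do the reordering internally and accept a proper `j` expression.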

If I can get some consensus on whether or not it's worthwhile to put
such a function into the data.table package, I'll go ahead and add an
initial implementation -- otherwise I can just keep it in my personal
utility belt whenever I need to use it.

Thanks,
-steve

-- 
Steve Lianoglou
Graduate Student: Computational Systems Biology
 | Memorial Sloan-Kettering Cancer Center
 | Weill Medical College of Cornell University
Contact Info: http://cbio.mskcc.org/~lianos/contact

