[datatable-help] Fast first row of each group

Steve Lianoglou mailinglist.honeypot at gmail.com
Fri Mar 11 06:21:38 CET 2011


Hey Matthew,

On Mon, Mar 7, 2011 at 8:08 PM, Matthew Dowle <mdowle at mdowle.plus.com> wrote:
>
> Hi Steve,
>
> Have posted a follow up to your answer here :
>
> http://stats.stackexchange.com/questions/7884/fast-ways-in-r-to-get-the-first-row-of-a-data-frame-grouped-by-an-identifier/7985#7985
>
> Thought it might be of interest on list as it's such a large difference.
>
> I realise this probably isn't clear in the documentation or FAQs, so
> have added todo to make that clearer.
>
> Btw, could somebody vote me up please - I have 1 point and can't make
> comments!

As I commented on stats.stackexchange: Nicely done! (I upvoted then, too.)

I was just curious about what determines when the `.SD` object is built.

I often do things like:

my.data.table[, {
  ## some block of code
  list(a=whatever, b=something)
}, by='some.key']

If I never reference `.SD` and only reference some subset of the
columns of `my.data.table` in my "block of code", can data.table still
avoid building `.SD` and copy only the columns I reference in that
block? Or does this magic only kick in when my group summary is
something I can evaluate within a single list(...), like:

my.data.table[, list(a=whatever, b=something), by='some.key']

Hopefully my example isn't too confusing w/o a real code sample --
which I can provide if so.
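
In the meantime, here is a minimal sketch of the two forms I mean. The
table `dt` and the columns `grp`, `x` and `y` are made up purely for
illustration (they are not from my real data). It only shows that both
forms give the same answer; it says nothing about whether `.SD` gets
built in either case, which is the part I'm asking about.

```r
library(data.table)

# Hypothetical toy table; all names here are invented for illustration
dt <- data.table(grp = c("a", "a", "b", "b", "b"),
                 x   = c(1, 2, 3, 4, 5),
                 y   = c(10, 20, 30, 40, 50))
setkey(dt, grp)

# Form 1: j is a braced block whose last expression is list(...)
res.block <- dt[, {
  total <- sum(x)   # only column x is referenced; y and .SD never are
  list(total = total, n = length(x))
}, by = "grp"]

# Form 2: j is a single list(...) call
res.list <- dt[, list(total = sum(x), n = length(x)), by = "grp"]

# Both forms should produce the same per-group summary
print(all.equal(res.block, res.list))
```

Note that column `y` is never touched in either form, so it is the
kind of column I would hope does not get copied per group.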

Thanks,
-steve


-- 
Steve Lianoglou
Graduate Student: Computational Systems Biology
 | Memorial Sloan-Kettering Cancer Center
 | Weill Medical College of Cornell University
Contact Info: http://cbio.mskcc.org/~lianos/contact

