[datatable-help] Feature Idea
Alexander Peterhansl
APeterhansl at GAINCapital.com
Mon Jul 11 21:42:06 CEST 2011
I've found that "by" does not need a key. For example,
> temp <- data.table(Index1=1:4,Index2=c(4,2,2,1),Values=c(10,10,10,30)) # no key set here!
> temp[,sum(Values),by=Index2,bysameorder=TRUE]
     Index2 V1
[1,]      4 10
[2,]      2 20
[3,]      1 30
> temp[,sum(Values),by=Index2,bysameorder=FALSE]
     Index2 V1
[1,]      1 30
[2,]      2 20
[3,]      4 10
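Note, however, that "bysameorder" controls the ordering of the result: TRUE keeps the groups in order of first appearance, while FALSE returns them sorted by Index2.
But, more generally, is there a way to attach a key "on the fly"?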
Suppose I wanted to extract all table values where Index2 is equal to 1. Is there a better way to do this than:
> setkey(temp,"Index2")
> temp[J(1),]
     Index2 Index1 Values
[1,]      1      4     30
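
For readers following along, here is a minimal sketch of two ways to pull out the Index2 == 1 rows without calling setkey first. It reuses the `temp` table from above. The names `scan_result` and `join_result` are illustrative only. The `on` argument is an assumption about the installed version: to my knowledge it arrived in data.table 1.9.6, well after this thread, so only the vector scan is likely to work on a 2011 installation.

```r
library(data.table)

temp <- data.table(Index1 = 1:4, Index2 = c(4, 2, 2, 1), Values = c(10, 10, 10, 30))

# Vector scan: an ordinary logical subset in i. No key is required and
# temp itself is left unkeyed and in its original row order.
scan_result <- temp[Index2 == 1]

# Ad hoc join: `on` names the join column for this one call only
# (assumes data.table >= 1.9.6; not available when this thread was written).
join_result <- temp[.(1), on = "Index2"]
```

The vector scan is the simplest option for a one-off lookup; a key (or `on`) tends to pay off when the same column is queried repeatedly on a large table.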
Thanks,
Alex
-----Original Message-----
From: datatable-help-bounces at r-forge.wu-wien.ac.at [mailto:datatable-help-bounces at r-forge.wu-wien.ac.at] On Behalf Of Matthew Dowle
Sent: Saturday, July 09, 2011 3:54 AM
To: Steve Lianoglou
Cc: datatable-help at lists.r-forge.r-project.org
Subject: Re: [datatable-help] Feature Idea
(I think) it already does that. It's just that it sets a key on the result by default (which does the re-ordering of the grouped results at the end). If that's true, then we could provide a way to not call setkey at the end. There is also the 'bysameorder' argument, which might already be doing something similar.
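
As an illustration of the two behaviours discussed here (grouping with and without a final setkey on the result), the sketch below uses `by` and `keyby` as they behave in current data.table releases. This is an assumption about the reader's version, not a description of the 2011 code: `keyby` did not exist when this message was written, and the sketch deliberately avoids 'bysameorder', which may not be present in later releases. The names `unsorted` and `sorted` are illustrative.

```r
library(data.table)

temp <- data.table(Index1 = 1:4, Index2 = c(4, 2, 2, 1), Values = c(10, 10, 10, 30))

# by: groups are returned in order of first appearance in temp (4, 2, 1),
# and no key is set on the result.
unsorted <- temp[, sum(Values), by = Index2]

# keyby: same aggregation, but the result is sorted by Index2 and keyed on it,
# i.e. the "setkey at the end" behaviour.
sorted <- temp[, sum(Values), keyby = Index2]
```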
Matthew
On Fri, 2011-07-08 at 14:29 -0400, Steve Lianoglou wrote:
> Hi,
>
> I find myself often wanting to use a data.table for its quick
> aggregate&summary mojo, but I want to keep the ordering of my data as
> I have it, and not as it would be if I set the appropriate keys for my
> aggregation/summary.
>
> How would you folks feel if I add a `by` (or dt.by) method for a data.table, eg:
>
> result <- by(some.data.table, would.be.keys, { ## stuff }, ...)
>
> Which does the aggregate/summary encoded within { ... }, but the
> result is returned in the same order as `some.data.table` was in when
> it was passed into the function -- if { ... } returned as many rows as
> were in the original data.table, then it's 1-for-1, but if you are
> summarizing groups of rows, the summary would be in the same
> (appearance) order as it is in `some.data.table`.
>
> The { ... } block would essentially be anything you can put in the `j`
> part of a data.table[i, j, ...].
>
> The `...` dots after { ... } may be extra params that can get passed
> into a "normal" data.table[i,j,...] call (haven't thought about that
> yet, tho).
>
> If I can get some consensus on whether or not it's worthwhile to put
> such a function into the data.table package, I'll go ahead and add an
> initial implementation -- otherwise I can just keep it in my personal
> utility belt whenever I need to use it.
>
> Thanks,
> -steve
>
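
A minimal sketch of the "1-for-1" case described in the quoted proposal, where the grouped computation returns one value per original row and the original row order must be kept. It uses `:=` together with `by`, which works in current data.table releases; whether that combination was available in the 2011 version is not something this thread confirms. The column name `GroupTotal` is hypothetical, and the table is the `temp` example from earlier in the thread.

```r
library(data.table)

temp <- data.table(Index1 = 1:4, Index2 = c(4, 2, 2, 1), Values = c(10, 10, 10, 30))

# Compute a per-group total and attach it as a new column by reference.
# No key is set and the rows of temp stay in their original order.
temp[, GroupTotal := sum(Values), by = Index2]
```

Because the column is added in place, no join back to the original table is needed to line the group totals up with the rows.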