[datatable-help] select * and getting the full sub data.table/frame

Akhil Behl akhil at igidr.ac.in
Thu Jan 17 17:53:08 CET 2013


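If I am not wrong, you are looking for .SD. Inside j, .SD is a
data.table holding the subset of data for the current group (every
column except the grouping ones), so you can pass it to the exact
function you were throwing at ddply earlier. There are other special
names like .SD that you can find in the data.table FAQs.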

Let's see:
R> require(plyr)
Loading required package: plyr

R> require(data.table)
Loading required package: data.table
data.table 1.8.7  For help type: help("data.table")

R> x.df <- data.frame(x=letters[1:2], y=1:10)
R> x.dt <- data.table(x.df)
R>
R> my.func <- function (d) { # Define a function on the subset
+ sum(sqrt(d[["y"]]))
+ }
R>
R> # The plyr way:
R> ddply(x.df, "x", my.func) -> ans.plyr
R>
R> # The data.table way:
R> x.dt[ , my.func(.SD), by=x] -> ans.dt
R>
R> ans.plyr
  x       V1
1 a 10.61387
2 b 11.85441

R> ans.dt
   x       V1
1: a 10.61387
2: b 11.85441
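
To make it a bit clearer what .SD carries, here is a small sketch along
the same lines. It rebuilds the toy data with an explicit rep() so that
it runs on its own; the names sd.info, my.summary, ans.multi and
ans.cols are made up for this illustration and are not part of
data.table. It assumes a function that returns a named list, which j
turns into one column per element, and it uses .SDcols to limit the
columns handed to .SD.

```r
library(data.table)

# Same toy data as in the session above, with the recycling written out
x.dt <- data.table(x = rep(letters[1:2], 5), y = 1:10)

# .SD is itself a data.table: the rows of the current group,
# with every column except the grouping column(s)
sd.info <- x.dt[, list(n.rows = nrow(.SD),
                       cols   = paste(names(.SD), collapse = ",")),
                by = x]

# Hypothetical summary function: it returns a named list,
# so each element becomes a column of the grouped result
my.summary <- function(d) {
  list(n        = nrow(d),
       total    = sum(d[["y"]]),
       root.sum = sum(sqrt(d[["y"]])))
}
ans.multi <- x.dt[, my.summary(.SD), by = x]

# .SDcols restricts which columns .SD contains (here only "y")
ans.cols <- x.dt[, lapply(.SD, sum), by = x, .SDcols = "y"]
```

Printing sd.info should show that the grouping column x is not inside
.SD. If the function only needs a few columns, naming them in .SDcols
is usually cheaper than handing over the whole subset.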

For more help, try this on an R prompt:

R> vignette('datatable-faq')

--
ASB.

On Thu, Jan 17, 2013 at 9:49 PM, David Bellot <david.bellot at gmail.com> wrote:
> Hi,
>
> I've been looking all around the web without a clear answer to this trivial
> problem. I'm sure I'm not looking where I should:
>
> In fact, I want to replace my use of ddply from the plyr package with
> data.table. One of my main uses is to group a big data.frame by a set of
> variables and do something with each sub data.frame:
>
> ddply( my_df, my_grouping_var, function (d)   { do something with d } )
> ----> d is a data.frame again
>
> and it's slow on big data.frames.
>
>
> However, I don't really understand how to redo the same thing with a
> data.table. Basically if "j" in a data.table is equivalent to the select
> clause in SQL, then how do I do SELECT * FROM etc...
>
> I want to be able to pass a function like in ddply that will receive not
> only a few columns but the full subset that is selected by the "by" clause.
>
> Thanks...
> Best,
> David

