[datatable-help] Data Table Subset Question

Frank Erickson fperickson at wisc.edu
Thu Aug 17 00:23:42 CEST 2017


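One idiom for testing group-level conditions is to return .SD only for the
groups that pass the test (groups where j evaluates to NULL are dropped):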

data[, if (mean(x) < 10) .SD, by=g]

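This might be slower in the special case of taking a mean: as far as I know,
a bare mean(x) in j with by= can be optimized by GForce, but mean(x) wrapped
inside if () cannot. See ?GForce.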
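
For what it's worth, here is a rough sketch of both routes on the example
data from the question. The names res1, res2, keep and m are just
placeholders for illustration, and the second route is only one possible way
to keep the aggregation in a form GForce can handle; I have not benchmarked
it.

```r
library(data.table)

# Example data from the original question
data <- data.table(x = 1:20, g = rep(c("a", "b"), each = 10))

# Route 1: the idiom above. j returns .SD for groups that pass the test
# and NULL (so no rows) for groups that fail it.
res1 <- data[, if (mean(x) < 10) .SD, by = g]

# Route 2 (an assumed alternative, not from the thread): aggregate first
# with a bare mean(x), filter the groups, then join the surviving groups
# back to the original table.
keep <- data[, .(m = mean(x)), by = g][m < 10, .(g)]
res2 <- data[keep, on = "g"]
```

Neither route adds a column to data. Note that res1 has the grouping column
g first, while res2 keeps the original column order.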

There's a request for an idiom like SQL HAVING over here:
https://github.com/Rdatatable/data.table/issues/788

--Frank

On Wed, Aug 16, 2017 at 4:44 PM, Bernstein, Elliot J <EJBernstein at wellington.com> wrote:

> Is there a way to subset a data table by the result of a grouped
> aggregation without adding an interim column to the table? For example, if
> I want to select all rows for which the group mean value of x is less than
> 10, I can do the following:
>
> data <- data.table(x = 1:20, g = rep(c("a", "b"), each = 10))
>
> data[, mean.x := mean(x), by = .(g)]
>
> data[mean.x < 10,]
>
> But I’m not really interested in “mean.x”. Can I do the same thing without
> adding it to the table?
>
> Thanks.
>
> - Elliot