[datatable-help] checking an approach to filtering rows in a data.table
Arunkumar Srinivasan
aragorn168b at gmail.com
Mon Mar 10 14:08:51 CET 2014
Hi Vincent,
Have you checked out the special variable `.I`? Have a look at `?data.table`. This SO post may also be relevant: http://stackoverflow.com/questions/21198937/subset-data-table-using-min-condition/21199009#21199009
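As a minimal sketch of the `.I` idea on the mtcars example from the original message (the name `idx` is only illustrative, and this may differ in detail from the answer in the linked SO post):

```r
library(data.table)

ddt <- data.table(mtcars)

# .I holds row numbers of ddt; within each cyl group, keep the
# row number where mpg is largest (which.max takes the first on ties)
idx <- ddt[, .I[which.max(mpg)], by = cyl]$V1

# subset the original table by those row numbers
ddt[idx]
```

Replacing `which.max(mpg)` with `mpg == max(mpg)` should keep every tied row instead of only the first one per group.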
Arun
From: Vincent Carey
Reply: Vincent Carey stvjc at channing.harvard.edu
Date: March 10, 2014 at 4:33:27 AM
To: datatable-help at lists.r-forge.r-project.org
Subject: [datatable-help] checking an approach to filtering rows in a data.table
I have looked around for code on row filtering with data.table, but have
not found anything addressing this use case.
I want to retrieve the rows satisfying a certain condition within groups, in this case having the maximum value for a specific variable. The following
seems to work, but I wonder if there is a more direct approach.
rowsWmaxVinG = function(dt, V, by) {
  #
  # filter dt to the rows possessing max value of
  # variable V within groups formed using by
  #
  # example: data(mtcars)
  #          ddt = data.table(mtcars)
  # > rowsWmaxVinG( ddt, by="cyl", V="mpg")
  #     mpg cyl  disp  hp drat    wt  qsec vs am gear carb
  # 1: 33.9   4  71.1  65 4.22 1.835 19.90  1  1    4    1
  # 2: 21.4   6 258.0 110 3.08 3.215 19.44  1  0    3    1
  # 3: 19.2   8 400.0 175 3.08 3.845 17.05  0  0    3    2
  #
  setkeyv(dt, c(by, V))                # sort within groups
  dt[ cumsum(dt[, .N, by=by]$N), ]     # take last row from each group
}
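For comparison, here is a rough sketch of the same function written with the special variable `.I` instead of `setkeyv()`. The name `rowsWmaxVinG_I` is hypothetical, and the sketch assumes `V` and `by` are passed as character strings exactly as in the function above, and that `dt` has no column literally named `V`. Since `setkeyv()` sorts `dt` by reference, the version above reorders the caller's table; this variant leaves it untouched.

```r
library(data.table)

# hypothetical name; same arguments as rowsWmaxVinG above
rowsWmaxVinG_I <- function(dt, V, by) {
  # row number of the max of column V within each group of `by`;
  # which.max keeps the first row on ties
  idx <- dt[, .I[which.max(get(V))], by = by]$V1
  dt[idx]  # dt itself is not re-keyed or reordered
}

ddt <- data.table(mtcars)
res <- rowsWmaxVinG_I(ddt, V = "mpg", by = "cyl")
res[order(cyl)]  # groups come back in order of first appearance, so sort for display
```

On mtcars there are no ties for the maximum mpg within any cyl group, so this should select the same three rows as the example output shown above.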