[datatable-help] checking an approach to filtering rows in a data.table

Arunkumar Srinivasan aragorn168b at gmail.com
Mon Mar 10 14:08:51 CET 2014


Hi Vincent,

Have you checked out the special variable `.I`? Have a look at `?data.table`. This SO post may also be relevant: http://stackoverflow.com/questions/21198937/subset-data-table-using-min-condition/21199009#21199009
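
For reference, a minimal sketch of the `.I` idiom along the lines of that SO post, using the `mtcars` example from the quoted message below. It assumes the maximum is wanted per `cyl` group and that one row per group is enough (`which.max` returns the first maximum only). The names `idx` and `res` are illustrative, not from the thread.

```r
library(data.table)

ddt <- data.table(mtcars)

# Within each group, .I holds the row numbers of ddt that belong to it,
# so .I[which.max(mpg)] is the row number of the first max-mpg row per cyl.
# An unnamed j expression is returned in a column called V1.
idx <- ddt[, .I[which.max(mpg)], by = cyl]$V1

# Subset the original table with those row numbers.
res <- ddt[idx]
res
```

Unlike the keyed approach, this leaves the row order and key of `ddt` untouched; groups come back in order of first appearance rather than sorted.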
Arun
From: Vincent Carey
Reply: Vincent Carey stvjc at channing.harvard.edu
Date: March 10, 2014 at 4:33:27 AM
To: datatable-help at lists.r-forge.r-project.org
Subject: [datatable-help] checking an approach to filtering rows in a data.table
I have looked around for code on row filtering with data.table, but have
not found anything addressing this use case.

I want to retrieve the rows satisfying a certain condition within groups, in this case having the maximum value for a specific variable.  The following
seems to work, but I wonder if there is a more direct approach.

rowsWmaxVinG = function(dt, V, by) {
#
# filter dt to the rows possessing max value of
# variable V within groups formed using by
#
# example: data(mtcars)
# ddt = data.table(mtcars)
#> rowsWmaxVinG( ddt, by="cyl", V="mpg")
#    mpg cyl  disp  hp drat    wt  qsec vs am gear carb
#1: 33.9   4  71.1  65 4.22 1.835 19.90  1  1    4    1
#2: 21.4   6 258.0 110 3.08 3.215 19.44  1  0    3    1
#3: 19.2   8 400.0 175 3.08 3.845 17.05  0  0    3    2
#
 setkeyv(dt, c(by, V)) # sort by group, then by V within group (re-keys dt by reference)
 dt[ cumsum(dt[, .N, by=by]$N), ]  # last row of each sorted group holds the max of V
}
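
One possible variant of the function above, sketched with `.I` instead of `setkeyv`. It is not from the thread: the name `rowsWmaxVinG_I` is hypothetical, and the sketch assumes `V` names a single column with no `NA` values. It differs from the original in two ways: it does not re-key `dt`, and it keeps every row tied for the group maximum rather than just one.

```r
library(data.table)

# Hypothetical helper: rows of dt whose value of column V equals the
# maximum of V within each group defined by `by`.
rowsWmaxVinG_I <- function(dt, V, by) {
  # .SDcols = V restricts .SD to the single column named by V, so
  # .SD[[1L]] is that column's values for the current group.
  idx <- dt[, .I[.SD[[1L]] == max(.SD[[1L]])], by = by, .SDcols = V]$V1
  dt[idx]
}

ddt <- data.table(mtcars)
out <- rowsWmaxVinG_I(ddt, V = "mpg", by = "cyl")
out
```

On `mtcars` this returns the same three rows as the example in the comments, in order of first group appearance rather than sorted by `cyl`.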
_______________________________________________  
datatable-help mailing list  
datatable-help at lists.r-forge.r-project.org  
https://lists.r-forge.r-project.org/cgi-bin/mailman/listinfo/datatable-help