<div dir="ltr"><div>I have looked around for code on row filtering with data.table, but have not found anything addressing this use case.</div><div><br></div><div>I want to retrieve the rows within each group that satisfy a certain condition, in this case having the maximum value of a specific variable. The following seems to work, but I wonder if there is a more direct approach.</div>
<div><br></div><div>rowsWmaxVinG = function(dt, V, by) {</div><div>#</div><div># filter dt to the rows possessing max value of</div><div># variable V within groups formed using by</div>
<div>#</div><div># example: library(data.table); data(mtcars)</div><div># ddt = data.table(mtcars)</div><div>#> rowsWmaxVinG(ddt, by="cyl", V="mpg")</div><div># mpg cyl disp hp drat wt qsec vs am gear carb</div>
<div>#1: 33.9 4 71.1 65 4.22 1.835 19.90 1 1 4 1</div><div>#2: 21.4 6 258.0 110 3.08 3.215 19.44 1 0 3 1</div><div>#3: 19.2 8 400.0 175 3.08 3.845 17.05 0 0 3 2</div><div>#</div><div> setkeyv(dt, c(by, V)) # sort by group, then by V (note: reorders dt by reference)</div>
<div> dt[ cumsum(dt[, .N, by=by]$N), ] # take last row of each group (one row per group, even if the max is tied)</div><div>}</div></div>