[datatable-help] problem returning single row for each subset

Matthew Finkbeiner matthew.finkbeiner at gmail.com
Wed Jun 6 08:11:06 CEST 2012


I have a data.table from which I am trying to pull, for each subset,
the row with the maximum value in one column.

Here is an example:

> library(data.table)
data.table 1.8.0  For help type: help("data.table")
> tables()
     NAME    NROW MB COLS                                                                             KEY
[1,] one  165,798 15 Subj,wordPos,correctResp,LiftOff,resp,RT,BeepTargSOA,presT,key,pathoffset,xvel,S Subj,cnd,key
Total: 15MB
> one
      Subj wordPos correctResp LiftOff resp  RT BeepTargSOA     presT
key   pathoffset      xvel Sample  cnd
 [1,]    1       0           0     -13    1 657         694 -196.3333
131 -0.002970074 0.2758665      1 cong
 [2,]    1       0           0     -13    1 657         694 -192.1667
131 -0.002987193 0.2615092      2 cong
 [3,]    1       0           0     -13    1 657         694 -188.0000
131 -0.003003959 0.2463627      3 cong
 [4,]    1       0           0     -13    1 657         694 -183.8333
131 -0.003020312 0.2301467      4 cong
 [5,]    1       0           0     -13    1 657         694 -179.6667
131 -0.003036162 0.2127339      5 cong
 [6,]    1       0           0     -13    1 657         694 -175.5000
131 -0.003051398 0.1941322      6 cong
 [7,]    1       0           0     -13    1 657         694 -171.3333
131 -0.003065893 0.1744955      7 cong
 [8,]    1       0           0     -13    1 657         694 -167.1667
131 -0.003079516 0.1541903      8 cong
 [9,]    1       0           0     -13    1 657         694 -163.0000
131 -0.003092148 0.1338630      9 cong
[10,]    1       0           0     -13    1 657         694 -158.8333
131 -0.003103696 0.1144705     10 cong
First 10 rows of 165798 printed.


Each subset is defined by a unique combination of values in the "Subj"
and "key" columns, and I use the following to retrieve the row with the
max "pathoffset" value from each subset:

> mx<- one[one[,pathoffset==max(pathoffset),by='Subj,key'][[3]]]
> head(mx)
     Subj wordPos correctResp LiftOff resp  RT BeepTargSOA    presT
key pathoffset     xvel Sample  cnd
[1,]    1       0           0     -13    1 657         694 657.8333
131          0 1.617281    206 cong
[2,]    1       0           0     -13    1 657         694 657.8333
131          0 1.617281    207 cong
[3,]    1       0           0     -13    1 657         694 657.8333
131          0 1.617281    208 cong
[4,]    1       0           0     -13    1 657         694 657.8333
131          0 1.617281    209 cong
[5,]    1       0           0     -13    1 657         694 657.8333
131          0 1.617281    210 cong
[6,]    1       0           0     -13    1 657         694 657.8333
131          0 1.617281    211 cong
>

The columns have probably wrapped, so it may not be clear, but
this did not return a single row per subset.  Nor does this:

> mx<- one[, .SD[which(pathoffset == max(pathoffset)),], by = list(Subj, key)]
> head(mx)
     Subj key wordPos correctResp LiftOff resp  RT BeepTargSOA
presT pathoffset     xvel Sample  cnd
[1,]    1 131       0           0     -13    1 657         694
657.8333          0 1.617281    206 cong
[2,]    1 131       0           0     -13    1 657         694
657.8333          0 1.617281    207 cong
[3,]    1 131       0           0     -13    1 657         694
657.8333          0 1.617281    208 cong
[4,]    1 131       0           0     -13    1 657         694
657.8333          0 1.617281    209 cong
[5,]    1 131       0           0     -13    1 657         694
657.8333          0 1.617281    210 cong
[6,]    1 131       0           0     -13    1 657         694
657.8333          0 1.617281    211 cong
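
One possible reading of the output above, sketched on made-up data
rather than the posted file: if several rows in a subset share the
maximum "pathoffset" (every row shown has pathoffset 0), then a
comparison with == keeps all of the tied rows. The toy table and the
names toy, all_ties and one_each are assumptions for illustration only;
which.max() is base R and returns just the first position of the
maximum.

```r
library(data.table)

# Hypothetical toy data, not the poster's file. Built from a data.frame
# because a column is named "key", which would otherwise clash with the
# key= argument of data.table().
toy <- as.data.table(data.frame(
  Subj       = c(1, 1, 1, 2, 2),
  key        = c(131, 131, 131, 140, 140),
  pathoffset = c(-0.003, 0, 0, -0.5, -0.1),  # max is tied in group (1, 131)
  Sample     = c(1, 2, 3, 1, 2)
))

# Comparing with == keeps every row that equals the group maximum,
# so the tied group contributes two rows.
all_ties <- toy[, .SD[which(pathoffset == max(pathoffset))],
                by = c("Subj", "key")]

# which.max() gives only the first index of the maximum,
# so each group contributes exactly one row.
one_each <- toy[, .SD[which.max(pathoffset)], by = c("Subj", "key")]

print(all_ties)
print(one_each)
```

Whether ties are what is happening in the real data would need to be
checked, for example by counting rows per subset that equal the subset
maximum.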


Is this a bug?  Or have I done something wrong?  I've posted a small
subset of the data in case that helps.  You can get it here:
http://personal.maccs.mq.edu.au/~mfinkbei/Rdata/one.RData

Thanks,

Matthew

