[datatable-help] problem returning single row for each subset
Matthew Finkbeiner
matthew.finkbeiner at gmail.com
Wed Jun 6 08:11:06 CEST 2012
I have a data.table from which I am trying to pull, for each subset,
the row with the maximum value in a given column.
Here is an example:
> library(data.table)
data.table 1.8.0 For help type: help("data.table")
> tables()
     NAME    NROW MB COLS                                                                             KEY
[1,] one  165,798 15 Subj,wordPos,correctResp,LiftOff,resp,RT,BeepTargSOA,presT,key,pathoffset,xvel,S Subj,cnd,key
Total: 15MB
> one
      Subj wordPos correctResp LiftOff resp  RT BeepTargSOA     presT key   pathoffset      xvel Sample  cnd
 [1,]    1       0           0     -13    1 657         694 -196.3333 131 -0.002970074 0.2758665      1 cong
 [2,]    1       0           0     -13    1 657         694 -192.1667 131 -0.002987193 0.2615092      2 cong
 [3,]    1       0           0     -13    1 657         694 -188.0000 131 -0.003003959 0.2463627      3 cong
 [4,]    1       0           0     -13    1 657         694 -183.8333 131 -0.003020312 0.2301467      4 cong
 [5,]    1       0           0     -13    1 657         694 -179.6667 131 -0.003036162 0.2127339      5 cong
 [6,]    1       0           0     -13    1 657         694 -175.5000 131 -0.003051398 0.1941322      6 cong
 [7,]    1       0           0     -13    1 657         694 -171.3333 131 -0.003065893 0.1744955      7 cong
 [8,]    1       0           0     -13    1 657         694 -167.1667 131 -0.003079516 0.1541903      8 cong
 [9,]    1       0           0     -13    1 657         694 -163.0000 131 -0.003092148 0.1338630      9 cong
[10,]    1       0           0     -13    1 657         694 -158.8333 131 -0.003103696 0.1144705     10 cong
First 10 rows of 165798 printed.
Each subset is defined by a unique combination of values in the "Subj"
and "key" columns, and I use this to retrieve the row with the max
"pathoffset" value from each subset:
> mx<- one[one[,pathoffset==max(pathoffset),by='Subj,key'][[3]]]
> head(mx)
     Subj wordPos correctResp LiftOff resp  RT BeepTargSOA    presT key pathoffset     xvel Sample  cnd
[1,]    1       0           0     -13    1 657         694 657.8333 131          0 1.617281    206 cong
[2,]    1       0           0     -13    1 657         694 657.8333 131          0 1.617281    207 cong
[3,]    1       0           0     -13    1 657         694 657.8333 131          0 1.617281    208 cong
[4,]    1       0           0     -13    1 657         694 657.8333 131          0 1.617281    209 cong
[5,]    1       0           0     -13    1 657         694 657.8333 131          0 1.617281    210 cong
[6,]    1       0           0     -13    1 657         694 657.8333 131          0 1.617281    211 cong
>
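Judging only from the rows printed above, one possible explanation is that the maximum "pathoffset" value (0) is shared by several rows within the same Subj/key subset, in which case `pathoffset == max(pathoffset)` is TRUE for every tied row. The sketch below is a way to check for such ties; the `toy` table and the `n_at_max` column name are made up for illustration and are not the real data.

```r
library(data.table)

# Hypothetical toy data: in the subset (Subj = 1, key = 131) the maximum
# pathoffset of 0 occurs on three rows, mimicking the output above.
toy <- data.table(
  Subj       = c(1, 1, 1, 1, 2, 2),
  key        = c(131, 131, 131, 131, 140, 140),
  pathoffset = c(-0.003, 0, 0, 0, -0.5, -0.1),
  Sample     = c(1, 206, 207, 208, 1, 2)
)

# For each subset, count how many rows equal that subset's maximum.
# Any count above 1 means an `== max()` filter returns several rows.
ties <- toy[, list(n_at_max = sum(pathoffset == max(pathoffset))),
            by = list(Subj, key)]
print(ties)
# subset (1, 131) has 3 rows at its maximum; subset (2, 140) has 1
```

Running the same count on the real table would show whether ties account for the repeated rows.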
The columns are likely to have shifted, so it may not be clear, but
this did not return a single row for each subset. Nor does this:
> mx <- one[, .SD[which(pathoffset == max(pathoffset)),], by = list(Subj, key)]
> head(mx)
     Subj key wordPos correctResp LiftOff resp  RT BeepTargSOA    presT pathoffset     xvel Sample  cnd
[1,]    1 131       0           0     -13    1 657         694 657.8333          0 1.617281    206 cong
[2,]    1 131       0           0     -13    1 657         694 657.8333          0 1.617281    207 cong
[3,]    1 131       0           0     -13    1 657         694 657.8333          0 1.617281    208 cong
[4,]    1 131       0           0     -13    1 657         694 657.8333          0 1.617281    209 cong
[5,]    1 131       0           0     -13    1 657         694 657.8333          0 1.617281    210 cong
[6,]    1 131       0           0     -13    1 657         694 657.8333          0 1.617281    211 cong
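If tied maxima are the cause (an assumption based on the repeated pathoffset of 0 in the output), one option is base R's `which.max()`, which returns only the index of the first maximum, in place of `which(pathoffset == max(pathoffset))`. A minimal sketch on made-up data follows; the `toy` table is hypothetical, and keeping the first tied row is a choice that may or may not suit the analysis.

```r
library(data.table)

# Hypothetical toy data with a tied maximum in the first subset.
toy <- data.table(
  Subj       = c(1, 1, 1, 1, 2, 2),
  key        = c(131, 131, 131, 131, 140, 140),
  pathoffset = c(-0.003, 0, 0, 0, -0.5, -0.1),
  Sample     = c(1, 206, 207, 208, 1, 2)
)

# which.max() yields a single index per subset (the first maximum),
# so .SD[...] contributes exactly one row for each Subj/key pair.
mx <- toy[, .SD[which.max(pathoffset)], by = list(Subj, key)]
print(mx)
# one row per subset: Sample 206 for (1, 131) and Sample 2 for (2, 140)
```

If a different tied row is wanted (for example the last one), the subset could be reordered before calling `which.max()`.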
Is this a bug? Or have I done something wrong? I've posted a small
set of the data if that helps. You can get it here:
http://personal.maccs.mq.edu.au/~mfinkbei/Rdata/one.RData
Thanks,
Matthew