[datatable-help] problem returning single row for each subset

Matthew Finkbeiner matthew.finkbeiner at gmail.com
Wed Jun 6 11:53:16 CEST 2012


You are right.  It's my data, not data.table.  Thanks for that.
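
For the archive, here is a minimal sketch of the difference, using made-up
toy data rather than the posted one.RData (the values and the names toy,
all_ties and first_max are illustrative assumptions, not taken from the
real data set). Comparing against max() keeps every tied row, whereas
which.max() returns only the first index of the maximum, so it gives a
single row per Subj/key group:

```r
library(data.table)

# Hypothetical toy data: group (Subj 1, key 131) reaches its maximum
# pathoffset of 0 on three rows; group (Subj 1, key 141) has a unique maximum.
# Built via data.frame() because data.table() has its own key= argument,
# which would clash with a column named "key".
toy <- as.data.table(data.frame(
  Subj       = c(1, 1, 1, 1, 1, 1),
  key        = c(131, 131, 131, 131, 141, 141),
  pathoffset = c(-0.003, 0, 0, 0, 0.05, 0.10344),
  Sample     = c(1, 206, 207, 208, 155, 156)
))

# == max() is TRUE for every tied row, so group (1, 131) returns three rows.
all_ties <- toy[, .SD[pathoffset == max(pathoffset)], by = c("Subj", "key")]

# which.max() gives the position of the first maximum only: one row per group.
first_max <- toy[, .SD[which.max(pathoffset)], by = c("Subj", "key")]

print(nrow(all_ties))   # 4
print(nrow(first_max))  # 2
```

Whether the first tied row is the right one to keep depends on the data;
with ties like these it may make more sense to break them on another
column such as Sample.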

Matthew



On Wed, Jun 6, 2012 at 5:48 PM, Christoph Jäckel
<christoph.jaeckel at wi.tum.de> wrote:
> I had a look at your example and for me it works. Note that
> pathoffset==max(pathoffset) returns every row for which pathoffset is equal
> to max(pathoffset) for this particular Subj/key combination. And it just
> happens that your first combination, Subj==1 & key==131, has
> max(pathoffset)==0 and many rows with a pathoffset of 0.
>
> If, for example, you look at mx[key==141], you get back only one row:
>
>      Subj wordPos correctResp LiftOff resp  RT BeepTargSOA    presT key
> [1,]    1       0           0      64    1 835         694 448.1333 141
>      pathoffset     xvel Sample  cnd
> [1,]    0.10344 47.68907    156 cong
>
> So I guess everything is working out here. I hope I didn't miss anything.
>
> Christoph
>
> On Wed, Jun 6, 2012 at 8:11 AM, Matthew Finkbeiner
> <matthew.finkbeiner at gmail.com> wrote:
>>
>> I have a data.table from which I am trying to pull, for each subset, the
>> row with the maximum value in a given column.
>>
>> here is an example:
>>
>> > library(data.table)
>> data.table 1.8.0  For help type: help("data.table")
>> > tables()
>>      NAME    NROW MB
>> [1,] one  165,798 15
>>      COLS
>> [1,] Subj,wordPos,correctResp,LiftOff,resp,RT,BeepTargSOA,presT,key,pathoffset,xvel,S
>>      KEY
>> [1,] Subj,cnd,key
>> Total: 15MB
>> > one
>>       Subj wordPos correctResp LiftOff resp  RT BeepTargSOA     presT
>>  [1,]    1       0           0     -13    1 657         694 -196.3333
>>  [2,]    1       0           0     -13    1 657         694 -192.1667
>>  [3,]    1       0           0     -13    1 657         694 -188.0000
>>  [4,]    1       0           0     -13    1 657         694 -183.8333
>>  [5,]    1       0           0     -13    1 657         694 -179.6667
>>  [6,]    1       0           0     -13    1 657         694 -175.5000
>>  [7,]    1       0           0     -13    1 657         694 -171.3333
>>  [8,]    1       0           0     -13    1 657         694 -167.1667
>>  [9,]    1       0           0     -13    1 657         694 -163.0000
>> [10,]    1       0           0     -13    1 657         694 -158.8333
>>       key   pathoffset      xvel Sample  cnd
>>  [1,] 131 -0.002970074 0.2758665      1 cong
>>  [2,] 131 -0.002987193 0.2615092      2 cong
>>  [3,] 131 -0.003003959 0.2463627      3 cong
>>  [4,] 131 -0.003020312 0.2301467      4 cong
>>  [5,] 131 -0.003036162 0.2127339      5 cong
>>  [6,] 131 -0.003051398 0.1941322      6 cong
>>  [7,] 131 -0.003065893 0.1744955      7 cong
>>  [8,] 131 -0.003079516 0.1541903      8 cong
>>  [9,] 131 -0.003092148 0.1338630      9 cong
>> [10,] 131 -0.003103696 0.1144705     10 cong
>> First 10 rows of 165798 printed.
>>
>>
>> Each subset is defined by the unique combinations of values in the "Subj"
>> and "key" columns, and I use this to retrieve the row with the max
>> "pathoffset" value from each subset:
>>
>> > mx<- one[one[,pathoffset==max(pathoffset),by='Subj,key'][[3]]]
>> > head(mx)
>>      Subj wordPos correctResp LiftOff resp  RT BeepTargSOA    presT key
>> [1,]    1       0           0     -13    1 657         694 657.8333 131
>> [2,]    1       0           0     -13    1 657         694 657.8333 131
>> [3,]    1       0           0     -13    1 657         694 657.8333 131
>> [4,]    1       0           0     -13    1 657         694 657.8333 131
>> [5,]    1       0           0     -13    1 657         694 657.8333 131
>> [6,]    1       0           0     -13    1 657         694 657.8333 131
>>      pathoffset     xvel Sample  cnd
>> [1,]          0 1.617281    206 cong
>> [2,]          0 1.617281    207 cong
>> [3,]          0 1.617281    208 cong
>> [4,]          0 1.617281    209 cong
>> [5,]          0 1.617281    210 cong
>> [6,]          0 1.617281    211 cong
>> >
>>
>> The columns are likely to have shifted, so it may not be clear, but
>> this did not return a single row per subset.  Nor does this:
>>
>> > mx <- one[, .SD[which(pathoffset == max(pathoffset)),], by = list(Subj, key)]
>> > head(mx)
>>      Subj key wordPos correctResp LiftOff resp  RT BeepTargSOA    presT
>> [1,]    1 131       0           0     -13    1 657         694 657.8333
>> [2,]    1 131       0           0     -13    1 657         694 657.8333
>> [3,]    1 131       0           0     -13    1 657         694 657.8333
>> [4,]    1 131       0           0     -13    1 657         694 657.8333
>> [5,]    1 131       0           0     -13    1 657         694 657.8333
>> [6,]    1 131       0           0     -13    1 657         694 657.8333
>>      pathoffset     xvel Sample  cnd
>> [1,]          0 1.617281    206 cong
>> [2,]          0 1.617281    207 cong
>> [3,]          0 1.617281    208 cong
>> [4,]          0 1.617281    209 cong
>> [5,]          0 1.617281    210 cong
>> [6,]          0 1.617281    211 cong
>>
>>
>> Is this a bug?  Or have I done something wrong?  I've posted a small
>> set of the data if that helps.  You can get it here:
>> http://personal.maccs.mq.edu.au/~mfinkbei/Rdata/one.RData
>>
>> thanks
>>
>> Matthew
>> _______________________________________________
>> datatable-help mailing list
>> datatable-help at lists.r-forge.r-project.org
>>
>> https://lists.r-forge.r-project.org/cgi-bin/mailman/listinfo/datatable-help

