[datatable-help] problem returning single row for each subset

Christoph Jäckel christoph.jaeckel at wi.tum.de
Wed Jun 6 09:48:34 CEST 2012


I had a look at your example, and for me it works. Note that
pathoffset==max(pathoffset) returns every row for which pathoffset equals
max(pathoffset) within that particular Subj/key combination. It just so
happens that your first combination, Subj==1 & key==131, has
max(pathoffset)==0 and many rows with a pathoffset of 0.
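
To make the tie behaviour concrete, here is a minimal sketch on made-up
data. The table dt and its values are hypothetical and are not taken from
one.RData; only the column names follow your example.

```r
library(data.table)

# Hypothetical toy data: two Subj/key groups, the first with a tied maximum
dt <- data.table(
  Subj       = c(1, 1, 1, 1, 1, 1),
  key        = c(131, 131, 131, 141, 141, 141),
  pathoffset = c(-0.003, 0, 0, 0.05, 0.10344, 0.02),
  Sample     = 1:6
)

# A logical comparison keeps every row that ties for the group maximum
ties <- dt[, .SD[pathoffset == max(pathoffset)], by = list(Subj, key)]

nrow(ties)            # 3: two tied rows for key 131, one row for key 141
nrow(ties[key == 141]) # 1
```

The group with key 131 comes back with two rows because both of them hold
the maximum value 0, which mirrors what you see in head(mx).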

If you look at mx[key==141], for example, you get back only one row:

     Subj wordPos correctResp LiftOff resp  RT BeepTargSOA    presT key pathoffset     xvel Sample  cnd
[1,]    1       0           0      64    1 835         694 448.1333 141    0.10344 47.68907    156 cong

So I guess everything is working out here. I hope I didn't miss anything.
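
If exactly one row per Subj/key combination is what you want, one option is
which.max(), which returns the index of the first maximum only. This is just
a sketch on made-up data (dt and its values are hypothetical), and it
silently drops the other tied rows, so it only fits if any one of the tied
rows will do.

```r
library(data.table)

# Hypothetical toy data: the first group has two rows tied for the maximum
dt <- data.table(
  Subj       = c(1, 1, 1, 1, 1, 1),
  key        = c(131, 131, 131, 141, 141, 141),
  pathoffset = c(-0.003, 0, 0, 0.05, 0.10344, 0.02),
  Sample     = 1:6
)

# which.max() gives a single index per group: the first row holding the maximum
first <- dt[, .SD[which.max(pathoffset)], by = list(Subj, key)]

nrow(first)    # 2: one row per Subj/key group
first$Sample   # 2 5: the first of the tied rows is kept for key 131
```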

Christoph

On Wed, Jun 6, 2012 at 8:11 AM, Matthew Finkbeiner <
matthew.finkbeiner at gmail.com> wrote:

> I have a data.table from which I am trying to pull, for each subset, the
> row with the maximum value in one column.
>
> here is an example:
>
> > library(data.table)
> data.table 1.8.0  For help type: help("data.table")
> > tables()
>      NAME    NROW MB COLS                                                                             KEY
> [1,] one  165,798 15 Subj,wordPos,correctResp,LiftOff,resp,RT,BeepTargSOA,presT,key,pathoffset,xvel,S Subj,cnd,key
> Total: 15MB
> > one
>       Subj wordPos correctResp LiftOff resp  RT BeepTargSOA     presT key   pathoffset      xvel Sample  cnd
>  [1,]    1       0           0     -13    1 657         694 -196.3333 131 -0.002970074 0.2758665      1 cong
>  [2,]    1       0           0     -13    1 657         694 -192.1667 131 -0.002987193 0.2615092      2 cong
>  [3,]    1       0           0     -13    1 657         694 -188.0000 131 -0.003003959 0.2463627      3 cong
>  [4,]    1       0           0     -13    1 657         694 -183.8333 131 -0.003020312 0.2301467      4 cong
>  [5,]    1       0           0     -13    1 657         694 -179.6667 131 -0.003036162 0.2127339      5 cong
>  [6,]    1       0           0     -13    1 657         694 -175.5000 131 -0.003051398 0.1941322      6 cong
>  [7,]    1       0           0     -13    1 657         694 -171.3333 131 -0.003065893 0.1744955      7 cong
>  [8,]    1       0           0     -13    1 657         694 -167.1667 131 -0.003079516 0.1541903      8 cong
>  [9,]    1       0           0     -13    1 657         694 -163.0000 131 -0.003092148 0.1338630      9 cong
> [10,]    1       0           0     -13    1 657         694 -158.8333 131 -0.003103696 0.1144705     10 cong
> First 10 rows of 165798 printed.
>
>
> each subset is defined by unique values in the "Subj" and "key" column
> and I use this to retrieve the row with the max "pathoffset" value
> from each subset:
>
> > mx<- one[one[,pathoffset==max(pathoffset),by='Subj,key'][[3]]]
> > head(mx)
>      Subj wordPos correctResp LiftOff resp  RT BeepTargSOA    presT key pathoffset     xvel Sample  cnd
> [1,]    1       0           0     -13    1 657         694 657.8333 131          0 1.617281    206 cong
> [2,]    1       0           0     -13    1 657         694 657.8333 131          0 1.617281    207 cong
> [3,]    1       0           0     -13    1 657         694 657.8333 131          0 1.617281    208 cong
> [4,]    1       0           0     -13    1 657         694 657.8333 131          0 1.617281    209 cong
> [5,]    1       0           0     -13    1 657         694 657.8333 131          0 1.617281    210 cong
> [6,]    1       0           0     -13    1 657         694 657.8333 131          0 1.617281    211 cong
> >
>
> The columns are likely to have shifted, so it may not be clear, but
> this did not return a single row per subset. Nor does this:
>
> > mx <- one[, .SD[which(pathoffset == max(pathoffset)),], by = list(Subj, key)]
> >
> > head(mx)
>      Subj key wordPos correctResp LiftOff resp  RT BeepTargSOA    presT pathoffset     xvel Sample  cnd
> [1,]    1 131       0           0     -13    1 657         694 657.8333          0 1.617281    206 cong
> [2,]    1 131       0           0     -13    1 657         694 657.8333          0 1.617281    207 cong
> [3,]    1 131       0           0     -13    1 657         694 657.8333          0 1.617281    208 cong
> [4,]    1 131       0           0     -13    1 657         694 657.8333          0 1.617281    209 cong
> [5,]    1 131       0           0     -13    1 657         694 657.8333          0 1.617281    210 cong
> [6,]    1 131       0           0     -13    1 657         694 657.8333          0 1.617281    211 cong
>
>
> Is this a bug?  Or have I done something wrong?  I've posted a small
> set of the data if that helps.  You can get it here:
> http://personal.maccs.mq.edu.au/~mfinkbei/Rdata/one.RData
>
> thanks
>
> Matthew