[datatable-help] Select from second key but not first

Matthew Dowle mdowle at mdowle.plus.com
Wed Jul 13 02:53:34 CEST 2011


Hi Chris,

Welcome.

What you're describing is a 'secondary key'. FR#1007 is the feature request to build secondary keys in :
https://r-forge.r-project.org/tracker/index.php?func=detail&aid=1007&group_id=240&atid=978

In the meantime you can do a 'manual' secondary key :

  idx = dt[,list(z,y,i=1:nrow(dt))]
  setkey(idx,z,y)
  dt[idx[J(3),i]$i]

Btw, that [,i]$i is ugly and I'm coming around to the idea of making
that more consistent, as requested by a previous poster (can't find it
now).
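
For anyone wanting to try the manual secondary key end to end, here is a
minimal sketch. It is an illustration under assumptions, not a definitive
recipe: the table is the toy one from the question (written with rep() so
the recycling is explicit), the names rows and res are made up for this
example, and the positions are pulled out with [["i"]] from the joined rows
because that form does not depend on what type idx[J(3),i] returns.

```r
library(data.table)

# Toy table from the question, keyed on (y, z).
dt <- data.table(x = 1:100, y = rep(1:2, 50), z = rep(1:4, 25))
setkey(dt, y, z)

# Lookup table keyed on z first; i stores each row's position in dt.
idx <- dt[, list(z, y, i = seq_len(nrow(dt)))]
setkey(idx, z, y)

# Binary search on z in idx, then subset dt by the stored positions.
rows <- idx[J(3L)][["i"]]   # hypothetical name: positions where z == 3
res  <- dt[rows]            # hypothetical name: the selected rows of dt
```

Note that idx has to be rebuilt whenever dt is re-keyed or its rows are
reordered, since the stored positions refer to the row order of dt at the
time idx was created.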

I'm thinking it should work like this :

    DT[c("a","b"),j]
    # Always returns a vector, for consistency, even though two groups
    # are joined to via the mult="all" default and the correspondence
    # might be lost. mult="first" and mult="last" would then return a
    # vector too, consistent with mult="all".

    DT[c("a","b"),list(j)]
    # list() is needed to retain the group columns and return a
    # data.table rather than a vector. The same type (i.e. data.table)
    # is returned for all values of mult.
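
To make the two return types concrete, here is a small sketch. The table DT
and its columns g and v are hypothetical, invented for this illustration.
The behaviour shown is what I would expect from data.table releases that
return a plain vector for a bare column in j, so treat that as an assumption
and check it against your installed version. The sketch only looks at the
type of each result and makes no claim about which columns list() retains.

```r
library(data.table)

# Hypothetical toy table, keyed on the character column g.
DT <- data.table(g = c("a", "a", "b", "c"), v = 1:4)
setkey(DT, g)

# Bare column in j: a plain vector; which group each value came from is lost.
vec <- DT[c("a", "b"), v]

# list() in j: a data.table is returned instead of a vector.
tab <- DT[c("a", "b"), list(v)]

# mult picks which matching rows are kept; the result is still a vector.
first_vec <- DT[c("a", "b"), v, mult = "first"]
```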

Would that be better?  Throwing that out to all.
   
Btw, posting from googlegroups does work then, that's good. This thread
should be mirrored in all places; it shouldn't matter where you post
from or to, but any probs please let me know as it's the first time.

Matthew 


On Tue, 2011-07-12 at 16:21 -0700, Chris Neff wrote:
> Hi all,
> 
> 
> I'm really new to data.table and something really simple has me
> stumped.
> 
> 
> Let's say I have the following (but much bigger so timing matters)
> 
> 
> dt = data.table(x=1:100,y=1:2,z=1:4,key="y,z")
> 
> 
> From the documentation, I understand that
> 
> 
> dt[J(1,3)]
> 
> 
> is significantly faster than
> 
> 
> dt[y==1 & z==3]
> 
> 
> 
> 
> and I could do
> 
> 
> dt[J(1)]
> 
> 
> instead of 
> 
> 
> dt[y==1]
> 
> 
> but is there any way to do 
> 
> 
> dt[z==3]
> 
> 
> faster?  I want to do something like
> 
> 
> dt[J( ,3)]
> 
> 
> but I know that doesn't make sense. Is it because z is not the primary
> key that I can't seem to figure out how to use J to do this? Since it
> isn't sorted on z anyway, I doubt I can get a speed-up, right?