[datatable-help] subsetting by second key

G See gsee000 at gmail.com
Sun Jun 15 18:03:13 CEST 2014


Thank you Arun.  Should that answer be updated to use CJ(.), then?  Is
there an advantage to using J(.) over CJ(.) if you know that you're
only looking for one value in the second column?
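
A minimal sketch of the single-value case, assuming the iris table keyed as in the quoted message below. The names sp, by_J and by_CJ are illustrative only. With a single value in the second column, J(.) recycles that length-1 item across the three species and CJ(.) crosses 3 x 1 values, so both should look up the same three pairs:

```r
library(data.table)

DT <- as.data.table(iris)
setkey(DT, Species, Petal.Width)

sp <- unique(DT$Species)   # all three species, as a factor

# One value for the second key column: J() recycles the length-1 item,
# CJ() crosses 3 x 1 values. Either way, three (Species, 1.5) pairs.
by_J  <- DT[J(sp, 1.5),  nomatch = 0L]
by_CJ <- DT[CJ(sp, 1.5), nomatch = 0L]

# Compare the two results.
nrow(by_J) == nrow(by_CJ)
```

The two forms only diverge once the second argument has more than one value, because J(.) pairs items positionally while CJ(.) builds every combination.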

On Sun, Jun 15, 2014 at 10:56 AM, Arunkumar Srinivasan
<aragorn168b at gmail.com> wrote:
> unique(Species) is of length 3, whereas the 2nd entry c(1.5, 2) is of
> length 2.
>
> J in J(.) is replaced with list(.) internally (using lazy evaluation),
> following which it’s converted to a data.table using as.data.table(list(.)).
>
> And here your list is:
>
> list(c("setosa", "versicolor", "virginica") , c(1.5, 2.0)) which results in
> the warning because it has to recycle to convert it to a data.table.
>
> In the example you’ve linked, J(.) and CJ(.) will return the same result
> (because there’s just one value in the 2nd column), so the results don’t
> change. But the general approach is to use CJ(.) along with nomatch=0L, as
> you’ve done.
>
> Those two expressions are equivalent, yes.
>
>
> Arun
>
> From: G See gsee000 at gmail.com
> Reply: G See gsee000 at gmail.com
> Date: June 15, 2014 at 5:45:11 PM
> To: datatable-help at lists.r-forge.r-project.org
> Subject: [datatable-help] subsetting by second key
>
> Hi,
>
> I want to subset a data.table using only its second key, which is
> demonstrated here
> http://stackoverflow.com/questions/15597685/subsetting-data-table-by-2nd-column-only-of-a-2-column-key-using-binary-search/15597713#15597713
>
> However, I need to subset with more than one value in the secondary key.
>
> Is this warning expected? What exactly is it telling me?
>
> library(data.table)
> DT <- data.table(iris, key="Species,Petal.Width")
> DT[J(unique(Species), c(1.5, 2.0)), nomatch=0L]
> # Sepal.Length Sepal.Width Petal.Length Petal.Width Species
> #1: 6.0 2.2 5.0 1.5 virginica
> #2: 6.3 2.8 5.1 1.5 virginica
> #Warning message:
> #In as.data.table.list(i) :
> # Item 2 is of size 2 but maximum size is 3 (recycled leaving a
> # remainder of 1 items)
>
>
> It looks like I can get what I want with either of these; can you
> confirm that both of these will always return the same result?
>
> DT[Petal.Width %in% c(1.5, 2.0)] # vector scan
> DT[CJ(unique(Species), c(1.5, 2.0)), nomatch=0L]
>
>
> Thanks,
> Garrett
> _______________________________________________
> datatable-help mailing list
> datatable-help at lists.r-forge.r-project.org
> https://lists.r-forge.r-project.org/cgi-bin/mailman/listinfo/datatable-help
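
The difference between the two lookups discussed in the thread can be sketched as follows. This is an illustration under stated assumptions, not data.table's internal code: rep_len() is used only to emulate the recycling described above, and the names sp, pw, recycled, crossed, joined and scanned are hypothetical.

```r
library(data.table)

DT <- as.data.table(iris)
setkey(DT, Species, Petal.Width)

sp <- unique(DT$Species)   # 3 species (factor)
pw <- c(1.5, 2.0)          # 2 widths

# Emulates the recycling described above: the shorter item is repeated
# to length 3, so only 3 of the 6 possible pairs are looked up.
recycled <- data.table(Species = sp, Petal.Width = rep_len(pw, length(sp)))

# CJ() builds every combination: 3 x 2 = 6 pairs.
crossed <- CJ(Species = sp, Petal.Width = pw)

# Binary-search join on all pairs vs. a vector scan on Petal.Width.
# nomatch = 0L drops pairs that do not occur in the data.
joined  <- DT[crossed, nomatch = 0L]
scanned <- DT[Petal.Width %in% pw]

nrow(recycled)
nrow(crossed)
nrow(joined) == nrow(scanned)
```

Because the recycled table pairs setosa with 1.5, versicolor with 2.0 and virginica with 1.5, a lookup on it misses rows such as versicolor with 1.5, which is why the J(.) call in the question returned only the two virginica rows.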

