[datatable-help] indexing with nomatch=0

Gabor Grothendieck ggrothendieck at gmail.com
Sat May 4 04:18:33 CEST 2013


On Fri, May 3, 2013 at 8:20 PM, Arunkumar Srinivasan
<aragorn168b at gmail.com> wrote:
> "Indexing is merging with row numbers, so indeed there's a merging going on"
> - I hadn't seen it this way until now. But I like this. I see why you expect
> `nomatch=0` to work on indexing as well. And it makes sense to me.
>
> But I am not so much inclined towards the implementation of `merge`-like
> operations in X[Y] syntax. I'd love to be convinced. I just can't get my
> mind around the usage X[Y, all.X = TRUE] and even more X[Y, list(2 columns
> of X, 1 column of Y), all.X=TRUE]. I could just do Y[X, …] which makes more
> sense here. I am unable to wrap my head around the need for this feature...
>

I think many people find data.table confusing until they have put
substantial time into it. If they can leverage their existing
knowledge of R, it should be easier to understand.  all.y= would
have exactly the same meaning in merge and in [.data.table, so anyone
who knew merge would immediately know what to expect.  I don't think
the same can be said for nomatch, since match() is not really the same
thing as merge.
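
To make the correspondence concrete, here is a small sketch. The tables
X and Y and their columns are made up for illustration, and both are
assumed to be keyed on id. It only shows how the existing nomatch=
argument lines up with merge's all.y=; it is not a proposal for how an
all.y= argument to [.data.table would be implemented.

```r
library(data.table)

# Hypothetical toy tables, both keyed on id.
X <- data.table(id = c("a", "b", "c"), x = 1:3, key = "id")
Y <- data.table(id = c("b", "c", "d"), y = 4:6, key = "id")

# X[Y] looks up each row of Y in X. With the default nomatch = NA the
# unmatched row "d" is kept, with NA in X's column.
outer_sub   <- X[Y]
outer_merge <- merge(X, Y, all.y = TRUE)

# nomatch = 0 drops the unmatched row, like merge's default all.y = FALSE.
inner_sub   <- X[Y, nomatch = 0]
inner_merge <- merge(X, Y)
```

In each pair the two results contain the same rows, which is the sense
in which all.y= would say directly what nomatch= says indirectly.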

The downsides seem to be:
- To be consistent with how subscripting works, all.y = TRUE would
need to be the default for [.data.table, whereas all.y = FALSE is the
default for merge.
- all.y seems important to have; all.x is less important, although
it might be included for completeness and symmetry even if it is less
useful.
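
The subscripting behaviour referred to in the first point can be seen in
base R alone. This is a minimal sketch with a made-up named vector v: an
index that matches nothing still produces an element (NA), which is the
analogue of all.y = TRUE, while nomatch = 0 in match() produces a zero
index that subscripting silently drops.

```r
# Hypothetical named vector used only for illustration.
v <- c(a = 10, b = 20, c = 30)
wanted <- c("b", "d")

# Plain subscripting keeps one element per requested name; "d" gives NA.
kept <- v[wanted]

# match() returns NA for "d" by default, or 0 when nomatch = 0.
pos_na   <- match(wanted, names(v))
pos_zero <- match(wanted, names(v), nomatch = 0)

# A zero index selects nothing, so the unmatched name disappears.
dropped <- v[pos_zero]
```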

-- 
Statistics & Software Consulting
GKX Group, GKX Associates Inc.
tel: 1-877-GKX-GROUP
email: ggrothendieck at gmail.com

