[datatable-help] indexing with nomatch=0

Eduard Antonyan eduard.antonyan at gmail.com
Mon May 6 17:56:36 CEST 2013


+1; I especially like #2 and the slight conceptual shift it implies


On Sat, May 4, 2013 at 6:26 AM, Gabor Grothendieck
<ggrothendieck at gmail.com> wrote:

> The proposal at this point would be:
>
> 1. nomatch= would be replaced by all.i= such that
>      X[Y,,nomatch=NA] is the same as X[Y,,all.i=TRUE]
>      X[Y,,nomatch=0] is the same as X[Y,,all.i=FALSE]
> nomatch= would be deprecated and ultimately removed.
>
> Note that #1 is simple to implement as it only involves changing names
> and values of arguments and does not really change any behavior;
> however, it's easier to think about because X[Y,,all.i=Z] now has the
> same behavior as merge(X, Y, all.y=Z) and so can be quickly understood
> by anyone who knows merge in R.  In contrast nomatch= did not even
> have the same meaning as in match() since match matches the first
> occurrence whereas with mult="all", the default, matching in
> data.table matches all occurrences.  Note that the default of merge's
> all.y= is all.y=FALSE but the default of all.i= is all.i=TRUE in order
> that the default behave as indices do.  Also note that this solves the
> problem that nomatch= can only be 0 or NA since a logical can only
> have two non-NA values anyways.
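
A rough sketch of the correspondence in #1, using only what exists today. Since all.i= is only proposed, the code sticks to nomatch= and merge(); the tables X and Y and their columns are made up for illustration.

```r
library(data.table)

# Illustrative tables (not from the thread): "d" in Y has no match in X
X <- data.table(k = c("a", "b", "c"), x = 1:3, key = "k")
Y <- data.table(k = c("b", "c", "d"), y = 4:6, key = "k")

# nomatch=NA keeps every row of i (here Y); x is NA for "d"
keep_all <- X[Y, nomatch = NA]
# nomatch=0 drops the rows of i that have no match in X
keep_matched <- X[Y, nomatch = 0]

# The merge() calls the proposal lines these up with
m_all     <- merge(X, Y, by = "k", all.y = TRUE)
m_matched <- merge(X, Y, by = "k", all.y = FALSE)

# match() only ever reports the first occurrence, so "a" maps to 1, not 1 and 2
first_only <- match(c("a", "z"), c("a", "a", "b"), nomatch = 0)
```

Under the proposal, the two nomatch= calls would presumably be spelled all.i=TRUE and all.i=FALSE, with the same results.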
>
> 2. If Y were a numeric index vector then all.i= would have the same
> effect as if Y were a data.table with Y as its column, merged
> with the row numbers of X.  e.g.  X[1:4,,all.i=FALSE] would be the
> same as X[1:3] if X only had 3 rows since 4 does not match a row
> number of X and is dropped because all.i=FALSE.  If Y were a numeric
> vector with negative values it would be converted to one with positive
> values in such a way as to have the established meaning and then the
> same strategy is applied. If Y were logical then it is recycled giving
> YY and the same strategy is applied to which(YY). This description is
> intended to be conceptual and the actual internal mechanism could be
> different.
>
> Thus #2 allows one to think of **all** i indexing as merging rather
> than as multiple separate concepts (which I believe is consistent with
> the original intention of data.table).
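
To make the conceptual model in #2 concrete, here is a minimal sketch in base R. The helper index_by_merge() and the row_id column are hypothetical names used only for illustration; this is not how data.table works internally, just the "indexing is merging with row numbers" idea written out.

```r
# Hypothetical helper (not part of data.table): index a data.frame by
# merging i against its row numbers, following the description in #2.
index_by_merge <- function(X, i, all.i = TRUE) {
  n <- nrow(X)
  if (is.logical(i)) {
    i <- which(rep(i, length.out = n))  # recycle to n rows, then which()
  } else if (any(i < 0)) {
    i <- seq_len(n)[i]                  # negative values -> positive row numbers
  }
  X$row_id <- seq_len(n)
  Y <- data.frame(row_id = i)
  # all.y plays the role of the proposed all.i
  out <- merge(X, Y, by = "row_id", all.y = all.i)
  out <- out[order(out$row_id), , drop = FALSE]
  out$row_id <- NULL
  rownames(out) <- NULL
  out
}

X <- data.frame(a = letters[1:3], stringsAsFactors = FALSE)

r_all  <- index_by_merge(X, 1:4, all.i = TRUE)   # row 4 has no match, a is NA
r_drop <- index_by_merge(X, 1:4, all.i = FALSE)  # row 4 is dropped
r_neg  <- index_by_merge(X, -2)                  # rows 1 and 3
r_lgl  <- index_by_merge(X, c(TRUE, FALSE))      # recycled to TRUE, FALSE, TRUE
```

One caveat of this sketch: merge() returns rows in row_id order, so an unsorted i such as c(3, 1) would come back as rows 1 and 3, unlike real indexing.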
>
> On Fri, May 3, 2013 at 8:02 PM, Eduard Antonyan
> <eduard.antonyan at gmail.com> wrote:
> > I think I like this proposal - maybe you should write up a few examples of
> > what current behavior is, vs the proposed behavior.
> >
> >
> > On Fri, May 3, 2013 at 6:54 PM, Gabor Grothendieck
> > <ggrothendieck at gmail.com> wrote:
> >>
> >> data.table is supposed to generalize indexing and although not
> >> explicitly stated the generalization seems to be that indexing is
> >> merging with the row numbers so there is indeed merging going on and
> >> that merging should respect nomatch= for consistency.
> >>
> >> On Fri, May 3, 2013 at 6:54 PM, Eduard Antonyan
> >> <eduard.antonyan at gmail.com> wrote:
> >> > There is no joining happening here, thus nomatch=0 has no effect.
> >> >
> >> >
> >> > On Fri, May 3, 2013 at 5:52 PM, Gabor Grothendieck
> >> > <ggrothendieck at gmail.com>
> >> > wrote:
> >> >>
> >> >> The definition of DT was left out by mistake.  It should be:
> >> >>
> >> >> DT <- data.table(a=letters[1:3])
> >> >>
> >> >>
> >> >> On Fri, May 3, 2013 at 6:50 PM, Gabor Grothendieck
> >> >> <ggrothendieck at gmail.com> wrote:
> >> >> > Consider this example:
> >> >> >
> >> >> >> DT[1:4,,nomatch=0]
> >> >> >     a
> >> >> > 1:  a
> >> >> > 2:  b
> >> >> > 3:  c
> >> >> > 4: NA
> >> >> >
> >> >> > Should it not return only the first 3 rows?  It seems to be
> >> >> > ignoring the nomatch=0.
> >> >> >
> >> >> > --
> >> >> > Statistics & Software Consulting
> >> >> > GKX Group, GKX Associates Inc.
> >> >> > tel: 1-877-GKX-GROUP
> >> >> > email: ggrothendieck at gmail.com