[datatable-help] changing data.table by-without-by syntax to require a "by"

eddi eduard.antonyan at gmail.com
Fri Apr 19 21:54:38 CEST 2013


Matthew Dowle suggested I put this up for discussion here.
This is a continuation of the discussion that started on SO
<http://stackoverflow.com/questions/16093289/data-table-join-and-j-expression-unexpected-behavior/>
and resulted in FR2696
<https://r-forge.r-project.org/tracker/index.php?func=detail&aid=2696&group_id=240&atid=978>
(I recommend reading the latter first, as it is much clearer).
My case for the change boils down to the following: I believe *d[i, j, by =
b]* should always be understood to mean
*"take d, apply i, return j by b"*
instead of the much more complicated current behavior, which is:
*"take d, apply i; if i was not a merge, return j by b; if i was a merge,
then if there is no by, return j by key; else if b is given and b == key,
complain and return j by b; else return j by b"*
I believe this change, while disruptive to some current users, will make
data.table much more user-friendly for future users (one piece of evidence I
would offer for this, besides my plea, is that FAQs 1.13-1.14 (and part of
1.12) would become completely unnecessary).
This is regarding syntax only, and I do NOT propose any changes to the
underlying behavior. In particular, the speed-up when you do a "by" by the
key of the join should stay (and should be done iff by=key is present).
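
To make the distinction concrete, here is a minimal sketch. The tables d and
lookup and their columns k, g, v are made up for illustration and do not come
from the SO question or the FR. The first call is the by-without-by case: i is
a merge and no by is given. The remaining calls spell out what I would like
the syntax to mean. I am not showing output for the first call, since what it
returns depends on the data.table version in use.

```r
library(data.table)

# Hypothetical toy data, keyed by k (an assumption for illustration only).
d <- data.table(k = c("a", "a", "b", "b", "c"),
                g = c(1, 2, 1, 2, 1),
                v = 1:5,
                key = "k")
lookup <- data.table(k = c("a", "b"))

# i is a merge and there is no by: this is the case under discussion,
# where j may be evaluated implicitly "by key" (one result per row of i).
implicit <- d[lookup, sum(v)]

# "take d, apply i, return j by b" written as two explicit steps:
# first the merge, then j grouped by k.
explicit <- d[lookup][, sum(v), by = k]

# The same thing in one call, with the by stated rather than implied.
with_by <- d[lookup, sum(v), by = k]

# No by after the merge means no grouping: a single sum over matched rows.
ungrouped <- d[lookup][, sum(v)]
```

Under the proposal, implicit would behave like ungrouped, and anyone who wants
the per-key result would write the with_by form.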


