<div>

                    In retrospect, `.join` is also confusing/untrue (as the data.table join is still being done). I find `cross.apply` clearer.

                </div><div><br></div><div><div>Arun</div><div><br></div></div>

                <p style="color: #A0A0A8;">On Thursday, May 2, 2013 at 12:33 AM, Arunkumar Srinivasan wrote:</p>

                <blockquote type="cite" style="border-left-style:solid;border-width:1px;margin-left:0px;padding-left:10px;">

                    <span><div><div>

                <div>

                    Eduard,

                </div><div><br></div><div>Yes, that clears it up. If `.join` is FALSE, then there's no `by-without-by`, basically. `drop` really serves another purpose.</div><div><br></div><div>Once again, I find `each.i = TRUE/FALSE` confusing, since it seems to mean that it applies to *any* `i` operation (clearing this up was one of the intended purposes of this post to begin with). Unless that is true, I'd like to stick to `.join`, as it's what we are setting to FALSE/TRUE here.</div>

                <div><div><br></div><div>Thanks for the patient clarifications.</div><div><br></div><div>Arun</div><div><br></div></div>

                <p style="color: #A0A0A8;">On Thursday, May 2, 2013 at 12:28 AM, Eduard Antonyan wrote:</p><blockquote type="cite"><div>

                    <span><div><div><div dir="ltr">Arun, from my previous email:<div><br></div><div>"<span style="font-family:arial,sans-serif;font-size:13px">Take 'dt' and apply 'i' and return 'j' (for any 'i' and 'j') by 'b':</span></div>

<div style="font-size:13px;font-family:arial,sans-serif">  dt[i, j, by = b] <-> dt[i][, j, by = b] in general, but also dt[i, j, by = b] if 'i' is not a join, and can also be dt[i, j, by = b] if 'i' is a join in some cases but not others</div>

<div style="font-size:13px;font-family:arial,sans-serif"><br></div><div style="font-size:13px;font-family:arial,sans-serif">Take 'dt' and apply 'i' and return j, applying cross-apply/by-without-by (will do cross-apply only when 'i' is a join):</div>

<div style="font-size:13px;font-family:arial,sans-serif">  dt[i, j, each.i = TRUE] <-> dt[i, j]"</div><div style="font-size:13px;font-family:arial,sans-serif"><br></div><div style="font-size:13px;font-family:arial,sans-serif">

Together with the default being each.i=FALSE, you can see that the answer to your question will be:</div><div style="font-size:13px;font-family:arial,sans-serif"><br></div><div style="font-size:13px;font-family:arial,sans-serif">

DT1[DT2, sum(y), each.i = FALSE, allow.cartesian = TRUE] <-> DT1[DT2, allow.cartesian=TRUE][, sum(y)], i.e.</div><div style="font-size:13px;font-family:arial,sans-serif">[1] 21</div><div style="font-size:13px;font-family:arial,sans-serif">

<br></div><div style="font-size:13px;font-family:arial,sans-serif">and</div><div style="font-size:13px;font-family:arial,sans-serif">DT1[DT2, sum(y), each.i = TRUE, allow.cartesian = TRUE] <-> DT1[DT2, sum(y), allow.cartesian=TRUE], i.e.</div>

<div><div><font face="arial, sans-serif">   x V1</font></div><div><font face="arial, sans-serif">1: 1  6</font></div><div><font face="arial, sans-serif">2: 2  9</font></div><div><font face="arial, sans-serif">3: 1  6</font></div>

<div style="font-family:arial,sans-serif;font-size:13px"><br></div></div></div><div><br><br><div>On Wed, May 1, 2013 at 5:23 PM, Arunkumar Srinivasan <span dir="ltr"><<a href="mailto:aragorn168b@gmail.com" target="_blank">aragorn168b@gmail.com</a>></span> wrote:<br><blockquote type="cite"><div>

                <div>eddi,

                </div><div><br></div><div>sorry again, I am a bit confused now. </div><div><br></div><div>DT1 <- data.table(x=c(1,1,1,2,2), y=1:5)</div><div><div>DT2 <- data.table(x=c(1,2,1))</div><div>

setkey(DT1, "x")</div><div><br></div></div><div>What's the intended result for `DT1[DT2, sum(y), allow.cartesian = TRUE, .join = FALSE]` ? c(6,9,6) or 21?</div><div><br></div>

                <div><div><br></div><div>Arun</div><div><br></div></div><div><div>

                <p style="color:#a0a0a8">On Thursday, May 2, 2013 at 12:20 AM, Arunkumar Srinivasan wrote:</p><blockquote type="cite"><div>

                    <span><div><div>

                <div>

                    Sorry the proposed result was a wrong paste in the last message:

                </div><div><br></div><div><div>    # proposed way and the result:</div><div>    DT1[DT2, sum(y), .join = FALSE]</div><div>    [1] 6 9 6</div></div><div><br></div>

                <div><div>And the last part that it *should* be a data.table is quite obvious then.</div><div><br></div><div>Arun</div><div><br></div></div>

                <p style="color:#a0a0a8">On Thursday, May 2, 2013 at 12:16 AM, Arunkumar Srinivasan wrote:</p><blockquote type="cite"><div>

                    <span><div><div>

                <div>

                    Eduard,

                </div><div><br></div><div>Great. That explains the difference between `drop` and `.join` to me. </div><div>I don't *need* this feature (I can't recall the last time I used a `data.table` for `i` and had to reduce with a function, say, sum), but I think it can only improve the usage.</div>

<div><br></div><div>However, there's one point on which *I think* I would still disagree with @eddi here, though I'm not sure. </div><div><br></div><div>    DT1 <- data.table(x=c(1,1,1,2,2), y=1:5)</div><div>    DT2 <- data.table(x=c(1,2,1))</div>

<div>    setkey(DT1, "x")</div><div><br></div><div>    # proposed way and the result:</div><div>    DT1[DT2, sum(y), .join = FALSE]</div><div>    [1] 21</div><div><br></div><div><br></div><div>So far, so good. However, the operation `DT1[DT2, sum(y), .join = TRUE]` *should* result in a `data.table` output as follows (it's even clearer now that .join is set to TRUE, meaning it's a data.table join):</div>

<div><br></div><div><div>   x V1</div><div>1: 1  6</div><div>2: 2  9</div><div>3: 1  6</div></div><div><br></div><div>Basically, `.join = TRUE` is the current functionality unchanged and nice to be default (as Matthew hinted).</div>

<div>    </div><div><div>Arun</div><div><br></div></div>

                <p style="color:#a0a0a8">On Tuesday, April 30, 2013 at 5:03 PM, Eduard Antonyan wrote:</p><blockquote type="cite"><div>

                    <span><div><div><div dir="ltr"><div><div><div><div>Arun,<br><br></div>Yes, DT1[DT2, y, .JOIN = FALSE] would do the same as DT1[DT2][, y] does currently.<br></div>No, DT1[DT2, y, .JOIN=FALSE], will NOT do a by-without-by, which is literally a 'by' by each of the rows of DT2 that are in the join (thus each.i! - the operation 'y' will be performed for each of the rows of 'i' and then combined and returned). There is no efficiency issue here that I can see, but Matthew can correct me on this. As far as I understand the efficiency comes into play when e.g. the rows of 'i' are unique, and after the join you'd like to do a 'by' by those, then DT1[DT2][, j, by = key(DT1)] would be less efficient since the 'by' could've already been done while joining.<br>

<br></div>DT1[DT2, .JOIN=FALSE] would be equivalent to both current and future DT1[DT2] - in this expression there is no by-without-by happening in either case.<br><br></div><div>The purpose of this is NOT for j just being a column or an expression that gets evaluated into a single column. It applies to any j. The extra 'by-without-by' column is currently output independently of how many columns you output in your j-expression. The behavior is very similar to when you specify a by=., except that the 'by' happens by a very special expression that only exists when joining two data-tables and that generally doesn't exist before or after the join.<br>

</div><div><br></div>Hope this answers your questions.<br><div><br><br><div>On Tue, Apr 30, 2013 at 8:48 AM, Arunkumar Srinivasan <span dir="ltr"><<a href="mailto:aragorn168b@gmail.com" target="_blank">aragorn168b@gmail.com</a>></span> wrote:<br><blockquote type="cite"><div>

                <div>

                    Eduard, thanks for your reply. But some things are still unclear to me. I'll try to explain them below.

                </div><div><br></div><div>First I prefer .JOIN (or cross.apply) just because `each.i` seems general (that it is applicable to *every* i operation, which as of now seems untrue). .JOIN is specific to data.table type for `i`.</div>

<div><br></div><div>From what I understand from your reply, if (.JOIN = FALSE), then,</div><div><br></div><div>    DT1[DT2, y, .JOIN = FALSE] <=> DT1[DT2][, y]</div><div><br></div><div>Is this right? It's a bit confusing because I think you're okay with "by-without-by" and I got the impression from Sadao that he finds the syntax of "by-without-by" inaccessible/advanced for basic users. So, just to clarify, here DT1[DT2, y, .JOIN=FALSE] will still do the "by-without-by" and then result in a "vector", right?  </div>

<div><br></div><div>Matthew explains in the current documentation that DT1[DT2][, y] would "join" all columns of DT1 and DT2 and then subset. I assume the implementation underneath is *not* DT1[DT2][, y] but rather an efficient equivalent. In that case, it of course seems alright to me.</div>

<div><br></div><div>If what I've said so far is right, then the syntax `DT1[DT2, .JOIN=FALSE]` doesn't make sense/has no purpose to me. At least I can't think of any at the moment. </div><div><br></div><div>To conclude, IMHO, if the purpose of `.JOIN` is to provide the same as DT1[i, j] for DT1[DT2, j] (j being a column or an expression that evaluates to a scalar for every group in the current by-without-by syntax), then I find this is covered by `drop = TRUE/FALSE`. Correct me if I am wrong. But one could do `DT1[DT2, j, drop=TRUE]` instead of `DT1[DT2, j, .JOIN=FALSE]`, and DT1[i, j, drop=FALSE] instead of DT1[i, list(x,y)].</div>

<div><br></div>

                <div><div>If you/anyone believes that's wrong, I'd be all ears for a clarification of what the purpose of `drop` is then (and also how it *doesn't* suit here as compared to .JOIN).</div><div><br></div>

<div>Arun</div><div><br></div></div><div><div>

                <p style="color:#a0a0a8">On Tuesday, April 30, 2013 at 2:54 PM, Eduard Antonyan wrote:</p><blockquote type="cite"><div>

                    <span><div><div><div>Arun,</div><div><br></div><div>If the new boolean is false, the result would be the same as without it and would be equal to current behavior of d[i][, j]. If it's true, it will only have an effect if i is a join (I think each.i= fits slightly better for this description than .join=) - this will replicate current underlying behavior. If you think the cross-apply is something that could work not just for i being a data-table but other things as well, then it would make perfect sense to implement that action too when the bool is true.</div>

<div><br>On Apr 30, 2013, at 2:58 AM, Arunkumar Srinivasan <<a href="mailto:aragorn168b@gmail.com" target="_blank">aragorn168b@gmail.com</a>> wrote:<br><br></div><blockquote type="cite"><div>

                <div>(The earlier message was too long and was rejected.)

                </div><div><div>So, from the discussion so far, I see that Matthew is nice enough to implement `.JOIN` or `cross.apply`. I've a couple of questions. Suppose,</div><div><br></div><div>    DT1 <- data.table(x=c(1,1,2,3,3), y=1:5, z=6:10)</div>

<div>    setkey(DT1, "x")</div><div>    DT2 <- data.table(x=1)</div><div>    DT1[DT2, y, .JOIN=TRUE] # I guess the syntax is something like this. I expect here the same output as current DT1[DT2, y]</div><div>

<br></div><div>The above syntax seems "okay". But my first question is what is `.JOIN=FALSE` supposed to do under these two circumstances? Suppose, </div><div><br></div><div>    DT1 <- data.table(x=c(1,1,2,3,3), y=1:5, z=6:10)</div>

<div>    setkey(DT1, "x")</div><div>    DT2 <- data.table(x=c(1,2,1), w=c(11:13))</div><div>    # what's the output supposed to be for?</div><div>    DT1[DT2, y, .JOIN=FALSE]</div><div>    DT1[DT2, .JOIN = FALSE]</div>

<div><br></div><div>Depending on this I'd have to think about `drop = TRUE/FALSE`. Also, how does it work with `subset`? </div><div><br></div><div>    DT1[x %in% c(1,2,1), y, .JOIN=TRUE] # .JOIN is ignored?</div><div>

<span style="white-space:pre-wrap">     </span></div><div>Is this supposed to also do a "cross-apply" on the logical subset? I guess not. So, .JOIN is an "extra" parameter that comes into play *only* when `i` is a `data.table`? </div>

<div><br></div><div>I'd love to have some replies to these questions for me to take a stance on `.JOIN`. Thank you.</div><div><br></div><div>Best,</div><div>Arun.</div></div><div><br></div>

                <p style="color:#a0a0a8"><br></p>

            </div></blockquote></div></div></span>

                </div></blockquote><div>

                    <br>

                </div>

            </div></div></div></blockquote></div><br></div></div>

</div></div></span>

                </div></blockquote><div>

                    <br>

                </div>

            </div></div></span>

                </div></blockquote><div>

                    <br>

                </div>

            </div></div></span>

                </div></blockquote><div>

                    <br>

                </div>

            </div></div></div></blockquote></div><br></div>

</div></div></span>

                </div></blockquote><div>

                    <br>

                </div>

            </div></div></span>

                </blockquote>

                <div>

                    <br>

                </div>