<html><head><style>body{font-family:Helvetica,Arial;font-size:13px}</style></head><body style="word-wrap: break-word; -webkit-nbsp-mode: space; -webkit-line-break: after-white-space; "><div id="bloop_customfont" style="font-family:Helvetica,Arial;font-size:13px; color: rgba(0,0,0,1.0); margin: 0px; line-height: auto;"><blockquote type="cite" class="clean_bq"><div dir="ltr">Arun, sometimes it helps to have nomatch = 0 when using :=<div><br></div><div>dta = data.table(idx = c(1,1), key = "idx")<br></div><div>dtb = data.table(idx = c(1,2), val = c("a", "b"), key = "idx")<br></div><div><br></div><div>This fails because of cartesian join not allowed by default</div></div></blockquote><div><div dir="ltr"><div><br></div><div>That's because of a bug. `allow.cartesian` error shouldn't occur with `:=` at all, as the number of rows will *never* exceed `x`. IIRC, the allow.cartesian bugs are scheduled to be fixed for the next-next release (after 1.9.4).</div></div></div></div> <div id="bloop_sign_1410991381005714944" class="bloop_sign"><div style="font-family:helvetica,arial;font-size:13px"><br></div><div style="font-family:helvetica,arial;font-size:13px">Arun</div></div> <div style="color:black"><br>From: <span style="color:black">Juan Manuel Truppia</span> <a href="mailto:jmtruppia@gmail.com"><jmtruppia@gmail.com></a><br>Reply: <span style="color:black">Juan Manuel Truppia</span> <a href="mailto:jmtruppia@gmail.com"><jmtruppia@gmail.com>></a><br>Date: <span style="color:black">September 15, 2014 at 7:52:26 PM</span><br>To: <span style="color:black">Arunkumar Srinivasan</span> <a href="mailto:aragorn168b@gmail.com"><aragorn168b@gmail.com>></a><br>Cc: <span style="color:black">datatable-help@lists.r-forge.r-project.org</span> <a href="mailto:datatable-help@lists.r-forge.r-project.org"><datatable-help@lists.r-forge.r-project.org>></a><br>Subject: <span style="color:black"> Re: [datatable-help] Update table from other table <br></span></div><br> <blockquote type="cite" class="clean_bq"><span><div><div></div><div>
<title></title>
<div dir="ltr">Arun, sometimes it helps to have nomatch = 0 when
using :=
<div><br></div>
<div>dta = data.table(idx = c(1,1), key = "idx")<br></div>
<div>dtb = data.table(idx = c(1,2), val = c("a", "b"), key =
"idx")<br></div>
<div><br></div>
<div>This fails because of cartesian join not allowed by
default</div>
<div><br></div>
<div>dta[dtb, val := i.val]<br></div>
<div><br></div>
<div>but this doesn't</div>
<div><br></div>
<div>dta[dtb, val := i.val, nomatch = 0]<br></div>
<div><br></div>
<div>This is the same as doing</div>
<div><br></div>
<div>dta[dtb, val := i.val, allow.cartesian = TRUE]<br></div>
</div>
<div class="gmail_extra"><br>
<div class="gmail_quote">On Fri, Sep 12, 2014 at 8:08 PM, Arunkumar
Srinivasan <span dir="ltr"><<a href="mailto:aragorn168b@gmail.com" target="_blank">aragorn168b@gmail.com</a>></span> wrote:<br>
<blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">
<div style="word-wrap:break-word">
<div style="font-family:Helvetica,Arial;font-size:13px;color:rgba(0,0,0,1.0);margin:0px;line-height:auto">
Glad it helped.</div>
<div style="font-family:Helvetica,Arial;font-size:13px;color:rgba(0,0,0,1.0);margin:0px;line-height:auto">
Always welcome "pull requests" :).</div>
<div style="font-family:Helvetica,Arial;font-size:13px;color:rgba(0,0,0,1.0);margin:0px;line-height:auto">
<br></div>
<div>
<div style="font-family:helvetica,arial;font-size:13px">Arun</div>
</div>
<div style="color:black"><span class=""><br>
From: <span style="color:black">Juan Manuel Truppia</span>
<a href="mailto:jmtruppia@gmail.com" target="_blank"><jmtruppia@gmail.com></a><br>
Reply: <span style="color:black">Juan Manuel Truppia</span>
<a href="mailto:jmtruppia@gmail.com" target="_blank"><jmtruppia@gmail.com>></a><br></span>Date: <span style="color:black">September
12, 2014 at 5:46:59 PM</span><br>
To: <span style="color:black">Arunkumar Srinivasan</span>
<a href="mailto:aragorn168b@gmail.com" target="_blank"><aragorn168b@gmail.com>></a><br>
Cc: <span style="color:black"><a href="mailto:datatable-help@lists.r-forge.r-project.org" target="_blank">datatable-help@lists.r-forge.r-project.org</a></span>
<a href="mailto:datatable-help@lists.r-forge.r-project.org" target="_blank"><datatable-help@lists.r-forge.r-project.org>></a><br>
Subject: <span style="color:black">Re: [datatable-help]
Update table from other table<br></span></div>
<div>
<div class="h5"><br>
<blockquote type="cite">
<div>
<div>
<div dir="ltr"><span>Great! (sorry, .EACHI = TRUE was an old
definition).</span>
<div><span>It's good to know also that nomatch = 0 is irrelevant
when using :=, I always used is to avoid the rows in dtb creeping
in dtb as NAs.</span></div>
<div><span>Also, it's really useful to know that by = EACHI should
be used when the calculations you are perfoming depend on the group
or not. This came in really in handy yesterday, and should be
emphasized in .EACHI description. Should I perform a pull
request?</span></div>
</div>
<div class="gmail_extra"><span><br></span>
<div class="gmail_quote"><span>On Fri, Sep 12, 2014 at 12:14 PM,
Arunkumar Srinivasan <span dir="ltr"><<a href="mailto:aragorn168b@gmail.com" target="_blank">aragorn168b@gmail.com</a>></span> wrote:<br></span>
<blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">
<div style="word-wrap:break-word">
<div>
<p>I think you mean:</p>
<pre><code>dta[dtb, b:=b, by=.EACHI]
</code>
</pre>
<p>and not <code>.EACHI = TRUE</code>. Not sure what’s the use of
<code>nomatch=0L</code> along with <code>:=</code>.</p>
<p><code>by=.EACHI</code> does exactly what it means, really. It
evaluates <code>j</code> for each <code>i</code> match. Let’s first
see the matches:</p>
<pre><code>dta[dtb, which=TRUE]
# [1] 1 1 3
</code>
</pre>
<p>So, first row of <code>dtb</code> matches with first of
<code>dta</code>. The second of <code>dtb</code> matches with 1st
of <code>dta</code> and so on.</p>
<p>When you add <code>by=.EACHI</code>, as shown on the top,
<code>j-expression</code> is evaluated on each of these matches.
So, it’ll be evaluated 3-times here. On the other hand, without it,
<code>j</code> is evaluated once. In this case, it doesn’t make a
difference either way. So you should avoid <code>by=.EACHI</code>,
as it’ll be slower with it.</p>
<p>It’s particularly useful when you’d like to perform operations
in <code>j</code>, that depends on the values in <code>j</code> on
<em>that</em> group. For example, consider these data.tables
<code>dt1</code> and <code>dt2</code>:</p>
<pre><code>dt1 = data.table(x=rep(1:4, each=2), y=1:8, key="x")
dt2 = data.table(x=3:5, z=10, key="x")
</code>
</pre>
<p>And, you’d like to get <code>sum(y)*z</code> while joining.. If
not for the <code>by=.EACHI</code> feature.. you’d approach the
problem like this:</p>
<pre><code>dt1[dt2][, list(agg = sum(y)*z[1]), by=x]
</code>
</pre>
<p>With <code>by=.EACHI</code>, this is simply:</p>
<pre><code>dt1[dt2, list(agg=sum(y)*z), by=.EACHI]
</code>
</pre>
<p>Here, your expression is evaluated on each <code>i</code>.</p>
<p>Another interesting use case is, say, you’d like to create a
lagged vector of <code>y</code>:</p>
<pre><code>dt1[dt2, list(y=y, lagy = c(NA, head(y,-1)), z=z), by=.EACHI]
</code>
</pre>
<p>It’s that simple.. really. Basically, as long as the operation
you’re performing in <code>j</code> affects it depending on whether
j is executed for that group or as a whole, then you’re most likely
looking for <code>by=.EACHI</code>. If not, <code>by=.EACHI</code>
has no effect, and therefore you’re wanting to use a <code>normal
join</code> there..</p>
<p>This is not a text book definition, rather my understanding of
this awesome feature!</p>
<p>Hope this helps.</p>
</div>
<div>
<div style="font-family:Helvetica,Arial;font-size:13px;color:rgba(0,0,0,1.0);margin:0px;line-height:auto">
<br></div>
<div>
<div style="font-family:helvetica,arial;font-size:13px">Arun</div>
</div>
<div style="color:black"><br>
From: <span style="color:black">Juan Manuel Truppia</span>
<a href="mailto:jmtruppia@gmail.com" target="_blank"><jmtruppia@gmail.com></a><br>
Reply: <span style="color:black">Juan Manuel Truppia</span>
<a href="mailto:jmtruppia@gmail.com" target="_blank"><jmtruppia@gmail.com>></a><br>
Date: <span style="color:black">September 11, 2014 at 10:16:41
PM</span><br>
To: <span style="color:black"><a href="mailto:datatable-help@lists.r-forge.r-project.org" target="_blank">datatable-help@lists.r-forge.r-project.org</a></span>
<a href="mailto:datatable-help@lists.r-forge.r-project.org" target="_blank"><datatable-help@lists.r-forge.r-project.org>></a><br>
Subject: <span style="color:black">[datatable-help] Update
table from other table<br></span></div>
<br>
<blockquote type="cite">
<div>
<div>
<div>
<div>
<div dir="ltr"><span>What is the best data.table way of doing
something similar to UPDATE FROM in SQL?</span>
<div><span><br></span>
<div><span>I used to do something like</span></div>
<div><span><br></span></div>
<div><span>dta = data.table(idx = c(1, 2, 3), a = runif(3), key =
"idx")</span></div>
<div><span>dtb = data.table(idx = c(1, 3), b = runif(3), key =
"idx")</span></div>
<div><span>dta[dtb, b := b]</span></div>
<div><span><br></span></div>
<div><span>However, after the 1.9.3 and the explicit .EACHI, it
fails sometimes, but I can't determine when.</span></div>
<div><span><br></span></div>
<div><span>So, just to be sure, I do </span></div>
<div><span><br></span></div>
<div><span>dta[dtb, b := b, .EACHI = TRUE, nomatch =
0]<br></span></div>
</div>
<div><span><br></span></div>
<div><span>Is the .EACHI and the nomatch necessary?</span></div>
<div><span><br></span></div>
<div><span>In this case, I want the row with idx 1 and 3 (the
matching ones) to end with a b value from the matching b column in
dtb, and the row with idx 2 (the one that isn't in dtb) to end up
with NA in column b.</span></div>
<div><span><br></span></div>
<div><span><br></span></div>
</div>
</div>
</div>
<span>_______________________________________________<br>
datatable-help mailing list<br>
<a href="mailto:datatable-help@lists.r-forge.r-project.org" target="_blank">datatable-help@lists.r-forge.r-project.org</a><br>
<a href="https://lists.r-forge.r-project.org/cgi-bin/mailman/listinfo/datatable-help" target="_blank">https://lists.r-forge.r-project.org/cgi-bin/mailman/listinfo/datatable-help</a></span></div>
</div>
</blockquote>
</div>
</div>
</blockquote>
</div>
<br></div>
</div>
</div>
</blockquote>
</div>
</div>
</div>
</blockquote>
</div>
<br></div>
</div></div></span></blockquote></body></html>