<html><head><style>body{font-family:Helvetica,Arial;font-size:13px}</style></head><body style="word-wrap: break-word; -webkit-nbsp-mode: space; -webkit-line-break: after-white-space; "><div id="bloop_customfont" style="font-family:Helvetica,Arial;font-size:13px; color: rgba(0,0,0,1.0); margin: 0px; line-height: auto;">In base R, `NA` matches only `NA`, and `NaN` matches only `NaN`:</div><div id="bloop_customfont" style="font-family:Helvetica,Arial;font-size:13px; color: rgba(0,0,0,1.0); margin: 0px; line-height: auto;">match(NA, c(1:5, NA))</div><div id="bloop_customfont" style="font-family:Helvetica,Arial;font-size:13px; color: rgba(0,0,0,1.0); margin: 0px; line-height: auto;"># [1] 6</div><div id="bloop_customfont" style="font-family:Helvetica,Arial;font-size:13px; color: rgba(0,0,0,1.0); margin: 0px; line-height: auto;"><br></div><div id="bloop_customfont" style="font-family:Helvetica,Arial;font-size:13px; color: rgba(0,0,0,1.0); margin: 0px; line-height: auto;"><span style="font-family: sans-serif; ">data.table</span><span style="font-family: sans-serif; "> matches the same way, by design, through binary search. </span>`?match` itself says: "<span style="font-family: sans-serif; ">Exactly what matches what is to some extent a matter of definition." Matching these values may not make sense in every operation, but by design we always consider Inf = Inf, -Inf = -Inf, NaN = NaN and NA = NA. 
Do you think it'd help to state this explicitly in `?data.table`?</span></div><div id="bloop_customfont" style="font-family:Helvetica,Arial;font-size:13px; color: rgba(0,0,0,1.0); margin: 0px; line-height: auto;"><span style="font-family: sans-serif; "><br></span></div><div id="bloop_customfont" style="font-family:Helvetica,Arial;font-size:13px; color: rgba(0,0,0,1.0); margin: 0px; line-height: auto;"><span style="font-family: sans-serif; "><br></span></div> <div id="bloop_sign_1411066455802876928" class="bloop_sign"><div style="font-family:helvetica,arial;font-size:13px">Arun</div></div> <div style="color:black"><br>From: <span style="color:black">Juan Manuel Truppia</span> <a href="mailto:jmtruppia@gmail.com">&lt;jmtruppia@gmail.com&gt;</a><br>Reply: <span style="color:black">Juan Manuel Truppia</span> <a href="mailto:jmtruppia@gmail.com">&lt;jmtruppia@gmail.com&gt;</a><br>Date: <span style="color:black">September 18, 2014 at 6:14:56 PM</span><br>To: <span style="color:black">datatable-help@lists.r-forge.r-project.org</span> <a href="mailto:datatable-help@lists.r-forge.r-project.org">&lt;datatable-help@lists.r-forge.r-project.org&gt;</a><br>Subject: <span style="color:black"> [datatable-help] NA in joins <br></span></div><br> <blockquote type="cite" class="clean_bq"><span><div><div></div><div>
<div dir="ltr">Hi, this must have been discussed before, but I
couldn't find anything.
<div><br></div>
<div>In my opinion, NA shouldn't join with anything, including
another NA (mirroring what we expect from SQL, where NULL doesn't
join with NULL).</div>
<div><br></div>
<div>However, with data.table, NA matches other NA.</div>
<div><br></div>
<div>I.e., this should return an empty data.table:</div>
<div><br></div>
<div>data.table(idx = NA_real_, key = "idx")[data.table(idx =
NA_real_, val = "a", key = "idx"), nomatch = 0]<br></div>
<div><br></div>
<div>Assuming we can't change this behavior, would it be
possible to add a parameter to avoid NA matching NA in [.data.table
and merge?</div>
</div>
_______________________________________________
<br>datatable-help mailing list
<br>datatable-help@lists.r-forge.r-project.org
<br>https://lists.r-forge.r-project.org/cgi-bin/mailman/listinfo/datatable-help</div></div></span></blockquote></body></html>