[datatable-help] NA in joins

Juan Manuel Truppia jmtruppia at gmail.com
Thu Sep 18 22:00:46 CEST 2014


Issues 818 and 819 created.

On Thu, Sep 18, 2014 at 4:34 PM, Arunkumar Srinivasan
<aragorn168b at gmail.com> wrote:

> Thanks. It'd also be great if you could add an issue for adding the
> documentation.
> On NA non-matching: yes, you could add an FR; there isn't one to my
> recollection. However, much of this year has been spent tweaking quite a
> lot of things in the internal ordering and binary search, so I'd not be
> surprised if it is not attended to anytime soon.
>
> Arun
>
> From: Juan Manuel Truppia <jmtruppia at gmail.com>
> Reply: Juan Manuel Truppia <jmtruppia at gmail.com>
> Date: September 18, 2014 at 9:14:42 PM
> To: Arunkumar Srinivasan <aragorn168b at gmail.com>
> Cc: datatable-help at lists.r-forge.r-project.org
> Subject: Re: [datatable-help] NA in joins
>
>  It might help, especially where data.table is compared to SQL. However, I
> think merge (and maybe [.data.table) should have an argument to avoid
> NA matching. Is there an FR already created for this? I can create it
> otherwise.
>
> On Thu, Sep 18, 2014 at 4:00 PM, Arunkumar Srinivasan <
> aragorn168b at gmail.com> wrote:
>
>>  In base R `NA` matches `NA` alone, and `NaN` matches `NaN` alone:
>>  match(NA, c(1:5, NA))
>>  # [1] 6
>>
>>  By design, data.table's binary search matches in the same way. `?match`
>> itself notes: "Exactly what matches what is to some extent a matter of
>> definition." In some operations this may not make sense, but by design we
>> always consider Inf = Inf, -Inf = -Inf, NaN = NaN and NA = NA. Do you think
>> it'd help to state this explicitly in `?data.table`?
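
The base R matching rules described above can be checked directly. A minimal sketch; the names `x` and `pos` are only illustrative and are not taken from the thread:

```r
# A double vector whose positions 2 to 5 hold NA, NaN, Inf and -Inf.
x <- c(1, NA, NaN, Inf, -Inf)

# Each special value matches only itself: NA does not match NaN,
# and NaN does not match NA.
pos <- match(c(NA, NaN, Inf, -Inf), x)
print(pos)
# [1] 2 3 4 5
```
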
>>
>>
>>  Arun
>>
>> From: Juan Manuel Truppia <jmtruppia at gmail.com>
>> Reply: Juan Manuel Truppia <jmtruppia at gmail.com>
>> Date: September 18, 2014 at 6:14:56 PM
>> To: datatable-help at lists.r-forge.r-project.org
>> Subject: [datatable-help] NA in joins
>>
>>    Hi, this must have been discussed before, but I couldn't find
>> anything.
>>
>> In my opinion, NA shouldn't join with anything, including other NAs (to
>> mirror what we expect from SQL, where NULL doesn't join with NULL).
>>
>> However, with data.table, NA matches other NA.
>>
>> For example, this should return an empty data.table:
>>
>> data.table(idx = NA_real_, key = "idx")[
>>   data.table(idx = NA_real_, val = "a", key = "idx"), nomatch = 0]
>>
>> Assuming that we can't change this behavior, would it be possible to
>> add a parameter to avoid NA matching NA in [.data.table and merge?
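
Without such a parameter, one way to approximate the SQL behaviour is to drop rows with NA keys before joining. This is only a sketch of a workaround, assuming the data.table package is installed; the tables `x` and `y` and the result names are hypothetical and do not come from the thread:

```r
library(data.table)

# Two keyed tables that both contain an NA key (illustrative data).
x <- data.table(idx = c(1, NA), key = "idx")
y <- data.table(idx = c(1, NA), val = c("a", "b"), key = "idx")

# Default behaviour: the NA key in y matches the NA key in x.
default_join <- x[y, nomatch = 0]

# Workaround: remove NA keys from the lookup table first, so NA never matches.
sql_like_join <- x[y[!is.na(idx)], nomatch = 0]

# The same idea with merge(): filter one side before the inner join.
sql_like_merge <- merge(x, y[!is.na(idx)], by = "idx")

print(nrow(default_join))   # 2 rows, including the NA-to-NA match
print(nrow(sql_like_join))  # 1 row, the match on idx == 1
```

Filtering one side is enough for an inner join, because an NA key that is absent from either table cannot appear in the result.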