[datatable-help] unique.data.frame should create a copy, right?

Ricardo Saporta saporta at scarletmail.rutgers.edu
Mon Aug 12 20:12:48 CEST 2013


Steve,

I like your suggestion a lot.  I can see putting column specification to
good use.

As for the argument name, perhaps
   'use.columns'

A value of NULL or FALSE would then yield the same results as
`unique.data.frame`:

    use.columns=key(x)   # default behavior
    use.columns=c("col1name", "col7name")   # etc.
    use.columns=NULL     # all columns, same as unique.data.frame
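
For what it's worth, here is a rough base-R sketch of the semantics I have
in mind. The name `unique_by_columns` is made up for illustration, and it
works on a plain data.frame, so it is not how unique.data.table is or would
be implemented. The key(x) default is left out because a base data.frame has
no key; here the default simply compares whole rows.

```r
# Hypothetical helper (illustration only, not data.table code): keeps the
# first row for each distinct combination of values in `use.columns`.
unique_by_columns <- function(x, use.columns = NULL) {
  # NULL or FALSE: compare whole rows, as unique.data.frame does
  if (is.null(use.columns) || identical(use.columns, FALSE)) {
    use.columns <- names(x)
  }
  stopifnot(all(use.columns %in% names(x)))
  # duplicated() on the selected columns flags repeats of earlier rows;
  # all columns of the surviving rows are returned
  x[!duplicated(x[, use.columns, drop = FALSE]), , drop = FALSE]
}

df <- data.frame(col1name = c("a", "a", "b", "b"),
                 col7name = c(1, 1, 2, 3),
                 other    = c(10, 10, 30, 40))

by_all <- unique_by_columns(df)                            # whole-row comparison
by_two <- unique_by_columns(df, c("col1name", "col7name")) # two-column "key"
by_one <- unique_by_columns(df, "col1name")                # one-column "key"
```

With this toy data, the whole-row and two-column calls drop only the second
row, while the one-column call keeps just the first "a" row and the first
"b" row.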


Thanks as always,
Rick



On Mon, Aug 12, 2013 at 1:51 PM, Steve Lianoglou <
mailinglist.honeypot at gmail.com> wrote:

> Hi folks,
>
> I actually want to revisit the fix I made here.
>
> Instead of having `use.key` in the signature of unique.data.table (and
> duplicated.data.table), as in:
>
> function(x,
>              incomparables=FALSE,
>              tolerance=.Machine$double.eps ^ 0.5,
>              use.key=TRUE, ...)
>
> How about we switch out use.key for a parameter that specifies the
> column names to use in the uniqueness check, which defaults to key(x)
> to keep backwards compatibility.
>
> For argument's sake (like that?), let's call this parameter `columns`
> (by.columns? with.columns? whatever) so:
>
> function(x,
>              incomparables=FALSE,
>              tolerance=.Machine$double.eps ^ 0.5,
>              columns=key(x), ...)
>
> Then:
>
> (1) leaving it alone is the backward compatible behavior;
> (2) Perhaps setting it to NULL will use all columns, and make it
> equivalent to unique.data.frame (also the same when x has no key); and
> (3) setting it to any other combo of columns uses those columns as the
> uniqueness key and filters the rows (only) out of x accordingly.
>
> What do you folks think? Personally I think this is better on all
> counts than just specifying to use the key or not and the only
> question in my mind is the name of the argument -- happy to hear other
> world views, however, so don't be shy.
>
> Thanks,
> -steve
>
> --
> Steve Lianoglou
> Computational Biologist
> Bioinformatics and Computational Biology
> Genentech
>