[datatable-help] Unexpected behavior in setnames()

Arunkumar Srinivasan aragorn168b at gmail.com
Sat Nov 2 00:10:41 CET 2013


Hm, I've not encountered that use myself, so I can't comment there. Perhaps duplicate names should then be allowed everywhere except where it is ambiguous which column is meant? E.g. subsetting, aggregating, grouping, by-without-by, etc. should result in an error (given the time, one could check whether the duplicated column is actually in use and only then issue an error or warning).
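
For illustration only, the kind of name-collision check Alexandre asks about could be sketched in base R roughly as follows. The helpers `names_after_rename` and `duplicated_after_rename` are hypothetical and not part of data.table; they work on plain character vectors of column names, and this is not how `setnames` behaves or is implemented.

```r
# Hypothetical helper (an assumption, not a data.table function): compute the
# column names that would result from renaming `old` to `new`.
names_after_rename <- function(current, old, new) {
  stopifnot(length(old) == length(new), all(old %in% current))
  result <- current
  # match() picks the first occurrence of each old name
  result[match(old, current)] <- new
  result
}

# Hypothetical helper: which names would appear more than once after the rename?
duplicated_after_rename <- function(current, old, new) {
  result <- names_after_rename(current, old, new)
  unique(result[duplicated(result)])
}

# Alexandre's example: columns a, b, c with "a" renamed to "b".
dups <- duplicated_after_rename(c("a", "b", "c"), "a", "b")
if (length(dups) > 0) {
  message("rename would create duplicate column names: ",
          paste(dups, collapse = ", "))
}
```

With a data.table `d`, one might call `duplicated_after_rename(names(d), "a", "b")` before `setnames(d, "a", "b")` and decide whether to warn, stop, or proceed.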

At the moment, I'm not convinced that it's worth that much trouble to help data presentation.  

Arun


On Saturday, November 2, 2013 at 12:05 AM, Eduard Antonyan wrote:

> Because it's very useful for e.g. data presentation purposes.
>  
>  
> On Fri, Nov 1, 2013 at 6:02 PM, Arunkumar Srinivasan <aragorn168b at gmail.com> wrote:
> > Yes, it chooses the first. But we won't be able to perform any operation as intended. So why allow duplicate names (ex: in `setnames` as Alexandre asks)?  
> >  
> > Arun
> >  
> >  
> > On Friday, November 1, 2013 at 11:57 PM, Eduard Antonyan wrote:
> >  
> > > I think currently it chooses the first "x", but it's definitely a good idea to add a warning there.
> > >  
> > >  
> > > On Fri, Nov 1, 2013 at 5:51 PM, Arunkumar Srinivasan <aragorn168b at gmail.com> wrote:
> > > > Ricardo added a bug report here on this topic: https://r-forge.r-project.org/tracker/index.php?func=detail&aid=5008&group_id=240&atid=975  
> > > > But I don't think having duplicate names is an easy-to-implement concept. For ex:
> > > >  
> > > > dt <- data.table(x=1:3, x=4:6, y=c(1,1,2))
> > > > dt[, print(.SD), by=y]
> > > >    x
> > > > 1: 1
> > > > 2: 2
> > > >    x
> > > > 1: 3
> > > >  
> > > >  
> > > > .SD loses the second "x". Also, some other questions become difficult to handle. Ex:  
> > > >  
> > > > dt <- data.table(x=c(1,1,2,2), y=c(1,2,3,4), x=c(2,2,1,1))  
> > > > dt[, list(x=x/x[1], y=y), by=x]
> > > >  
> > > >  
> > > > Which "x" should be chosen for which operation?
> > > >  
> > > > Arun
> > > >  
> > > >  
> > > > On Friday, November 1, 2013 at 10:59 PM, Eduard Antonyan wrote:
> > > >  
> > > > > Having duplicate names is allowed and not that unusual in the data.table framework, so there is no need to signal anything here.
> > > > >  
> > > > > A different question is whether there should be a warning here:  
> > > > >  
> > > > >   dt = data.table(a = 1, a = 2)
> > > > >   dt[, a]
> > > > >  
> > > > > and I think that'd be a pretty good FR to have.
> > > > >  
> > > > >  
> > > > > On Fri, Nov 1, 2013 at 4:49 PM, Alexandre Sieira <alexandre.sieira at gmail.com> wrote:
> > > > > > I found this behavior during a debugging session:  
> > > > > >  
> > > > > > > d = data.table(a=1, b=2, c=3)
> > > > > > > setnames(d, "a", "b")
> > > > > > > d
> > > > > >    b b c
> > > > > > 1: 1 2 3
> > > > > >  
> > > > > > Shouldn’t setnames() check if the new column names already exist before renaming, and signal an error or at least a warning if they do?
> > > > > > --  
> > > > > > Alexandre Sieira
> > > > > > CISA, CISSP, ISO 27001 Lead Auditor
> > > > > >  
> > > > > > "The truth is rarely pure and never simple."
> > > > > > Oscar Wilde, The Importance of Being Earnest, 1895, Act I
> > > > > > _______________________________________________
> > > > > > datatable-help mailing list
> > > > > > datatable-help at lists.r-forge.r-project.org (mailto:datatable-help at lists.r-forge.r-project.org)
> > > > > > https://lists.r-forge.r-project.org/cgi-bin/mailman/listinfo/datatable-help
> > > > >  
> > > > > _______________________________________________
> > > > > datatable-help mailing list
> > > > > datatable-help at lists.r-forge.r-project.org (mailto:datatable-help at lists.r-forge.r-project.org)
> > > > > https://lists.r-forge.r-project.org/cgi-bin/mailman/listinfo/datatable-help
> > > > >  
> > > > >  
> > > > >  
> > > >  
> > > >  
> > >  
> >  
>  
