[datatable-help] Unexpected behavior in setnames()

Arunkumar Srinivasan aragorn168b at gmail.com
Sun Nov 3 01:36:35 CET 2013


Eddi, 
If it is essential to keep names intact while loading the data, we could add an argument such as "asis=TRUE". But I don't see a reason for `data.table` to do anything else with duplicate names, or to try to catch errors, when nothing meaningful can be done with them. Besides data presentation, can you name any other use for them?

Arun


On Sunday, November 3, 2013 at 1:31 AM, Eduard Antonyan wrote:

> The main use case I've personally encountered is data presentation (for either myself or others), where I would sometimes organize data like so:
> 
> category1 name,colname1,colname2,category2 name,colname1,colname2
> ....numbersandstuff....
> 
> Also, in general there are many cases I brought up above that generate duplicate names, and I definitely don't want either lost columns or renamed columns as a result - both are data loss that I don't appreciate.
> 
> 
> On Sat, Nov 2, 2013 at 7:10 PM, Steve Lianoglou <lianoglou.steve at gene.com> wrote:
> > Hi,
> > 
> > On Sat, Nov 2, 2013 at 8:41 AM, Arunkumar Srinivasan
> > <aragorn168b at gmail.com> wrote:
> > [snip]
> > > Overall, I agree keeping duplicate names may help some users. But then, the
> > > potential side-effects should be marked with warnings/errors distinctly, in
> > > all cases (and preferably documented).
> > [/snip]
> > 
> > I guess I must have missed it, but has anyone anywhere (in this
> > thread, a FR or something) presented a concrete, compelling
> > situation where allowing duplicate column names was actually useful?
> > 
> > I'm hard pressed to come up with any situation where (purposefully)
> > keeping duplicate column names in a data.table has more benefit than
> > downside. Seems to me that if this ever happens, it most certainly
> > would be by mistake.
> > 
> > Can someone help me out here?
> > 
> > In the case of cbinding two data.tables together that end up having
> > two duplicate names, I'd imagine unique-ing the names of the
> > data.tables and firing a warning that this was done would be most
> > useful (uniqueness priority would be from left to right as the
> > data.tables are passed into the cbind call)
> > 
> > -steve
> > 
> > --
> > Steve Lianoglou
> > Computational Biologist
> > Bioinformatics and Computational Biology
> > Genentech
> 
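
For reference, the renaming Steve suggests for the cbind case (make names unique left to right and warn that it was done) could look roughly like the sketch below. It is base R only and is not how `data.table` behaves; `uniquify_names` is a hypothetical helper name, and the `.1`-style suffix is simply what base R's `make.unique()` produces.

```r
# Hypothetical helper, NOT part of data.table: de-duplicate column names
# left to right and warn whenever a name had to be changed.
uniquify_names <- function(nms) {
  fixed <- make.unique(nms)      # base R: later duplicates get a numeric suffix
  changed <- fixed != nms        # TRUE where a name was altered
  if (any(changed)) {
    warning("Duplicate column names renamed: ",
            paste0(nms[changed], " -> ", fixed[changed], collapse = ", "))
  }
  fixed
}

# Names as they might look after cbinding two tables with the same columns.
uniquify_names(c("id", "value", "id", "value"))
```

The first occurrence of each name is left alone and only later duplicates are renamed, which matches the left-to-right priority described in the quoted message. Note that this is exactly the renaming Eddi objects to above, so it illustrates one side of the discussion rather than settling it.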
