[datatable-help] Unexpected behavior in setnames()

Arunkumar Srinivasan aragorn168b at gmail.com
Sun Nov 3 01:31:28 CET 2013


> I guess I must have missed it, but has anyone anywhere (in this
> thread, a FR or something) actually present a (concrete) compelling
> situation where allowing duplicate column names was actually useful?

True, there have been no really compelling situations so far. The only example I've seen (in this thread) concerns data presentation (from eddi), and I still don't know exactly how duplicate names help there. I can understand, though, that the data itself may sometimes arrive in such a format. But one can always make the names unique while loading.


> I'm hard pressed to come up with any situation where (purposefully)
> keeping duplicate column names in a data.table has more benefit than
> downside. Seems to me that if this ever happens, it most certainly
> would be by mistake.


I agree.


> In the case of cbinding two data.tables together that end up having
> two duplicate names, I'd imagine unique-ing the names of the
> data.tables and firing a warning that this was done would be most
> useful (uniqueness priority would be from left to right as the
> data.tables are passed into the cbind call)


Unless there's a good argument for why unique-ing the names would be bad, or a case in which keeping duplicate names would be good, I agree with you on this point as well.
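
To make the proposal concrete, here is a rough sketch of what "unique the names, left to right, and warn" could look like from the user's side. This is only an illustration: cbind_unique() is a hypothetical helper, not part of data.table, and the suffix scheme is simply whatever base R's make.unique() produces. It assumes cbind() on data.tables leaves duplicate names in place.

```r
library(data.table)

# Base R's make.unique() appends a numeric suffix to repeated names,
# so the leftmost occurrence keeps its original name.
make.unique(c("x", "y", "x"))
# [1] "x"   "y"   "x.1"

# Hypothetical helper (NOT part of data.table): cbind the inputs, then
# make the resulting column names unique and warn if anything changed.
cbind_unique <- function(...) {
  out <- cbind(...)
  old <- names(out)
  new <- make.unique(old)
  if (!identical(old, new)) {
    warning("duplicate column names made unique: ",
            paste(unique(old[duplicated(old)]), collapse = ", "))
    setnames(out, new)  # rename by reference, all columns at once
  }
  out
}

DT1 <- data.table(x = 1:2, y = 3:4)
DT2 <- data.table(x = 5:6)
res <- cbind_unique(DT1, DT2)
names(res)
# [1] "x"   "y"   "x.1"
```

The same make.unique() call could be applied to names(DT) right after loading, which is what I meant above by making names unique while loading.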



Arun


On Sunday, November 3, 2013 at 1:10 AM, Steve Lianoglou wrote:

> Hi,
> 
> On Sat, Nov 2, 2013 at 8:41 AM, Arunkumar Srinivasan
> <aragorn168b at gmail.com> wrote:
> [snip]
> > Overall, I agree keeping duplicate names may help some users. But then, the
> > potential side-effects should be marked with warnings/errors distinctly, in
> > all cases (and preferably documented).
> > 
> 
> [/snip]
> 
> I guess I must have missed it, but has anyone anywhere (in this
> thread, a FR or something) actually present a (concrete) compelling
> situation where allowing duplicate column names was actually useful?
> 
> I'm hard pressed to come up with any situation where (purposefully)
> keeping duplicate column names in a data.table has more benefit than
> downside. Seems to me that if this ever happens, it most certainly
> would be by mistake.
> 
> Can someone help me out here?
> 
> In the case of cbinding two data.tables together that end up having
> two duplicate names, I'd imagine unique-ing the names of the
> data.tables and firing a warning that this was done would be most
> useful (uniqueness priority would be from left to right as the
> data.tables are passed into the cbind call)
> 
> -steve
> 
> -- 
> Steve Lianoglou
> Computational Biologist
> Bioinformatics and Computational Biology
> Genentech
> 
> 

