[datatable-help] Something seems funky. I think with character-to-factor conversion for keys (?)
Steve Lianoglou
mailinglist.honeypot at gmail.com
Fri Mar 4 23:46:17 CET 2011
I apologize in advance: I can't yet create a reproducible example for
this behavior, but I'll keep trying. Please bear with me.
Somehow I've ended up with a data.table `m2` that looks like this:
R> m2
      entrez.id total.tags.liver cds.liver intron.liver utr.liver
 [1,]         9               27         0            0         0
 [2,]        10              347         0            0         0
 [3,]        12             5076         0           17         0
 [4,]        13             2445         0            0         0
 [5,]        18             2076         0            0         0
 [6,]        20               15         0            0         0
 [7,]        25               62         0            0         0
 [8,]        32              320         0            0         0
 [9,]        34             1377         0            0         0
[10,]        35              757         0            0         0
First 10 rows of 5236 printed.
R> key(m2)
[1] "entrez.id"
R> any(duplicated(m2$entrez.id))
[1] FALSE
So far so good -- I stumbled on the following problem when `merge`-ing
two large data tables, which was giving me a strange error. In the
process of trying to smoke out the problem, I noticed this unexpected
behavior:
## This is expected
R> subset(m2, entrez.id == '9')
     entrez.id total.tags.liver cds.liver intron.liver utr.liver
[1,]         9               27         0            0         0
## This isn't
R> m2['9']
     entrez.id total.tags.liver cds.liver intron.liver utr.liver
[1,]         9               NA        NA           NA        NA
Whoops! Isn't that supposed to return the same row as above?
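One way to narrow this down is to look at how the key column is actually
stored and whether the rows are still in the order the key promises. The
following is only a sketch written against the current data.table API
(`setkey`, `copy`), which differs from the 2011 release used above. The
small table is a hypothetical stand-in built from the first rows printed
earlier, not the real `m2`.

```r
library(data.table)

# Hypothetical stand-in for m2: values copied from the rows printed above.
m2 <- data.table(
  entrez.id        = c("9", "10", "12", "13", "18"),
  total.tags.liver = c(27L, 347L, 5076L, 2445L, 2076L)
)
setkey(m2, entrez.id)

# How is the key column stored (character, factor, integer)?
key_class <- class(m2$entrez.id)

# Rebuild the key on a copy and compare the row order with the original.
# If the two differ, the original table is marked as keyed but is not
# physically sorted the way a fresh setkey() would sort it.
rekeyed <- copy(m2)
setkey(rekeyed, NULL)        # drop the key attribute on the copy
setkey(rekeyed, entrez.id)   # sort again from scratch
same_order <- identical(rekeyed$entrez.id, m2$entrez.id)

# Keyed lookup, for comparison with subset(m2, entrez.id == "9")
hit <- m2["9"]
```

If `same_order` came back FALSE on the real table, the key attribute and
the physical row order would disagree, which would be consistent with a
keyed lookup missing a row that `subset` finds.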
I can fix `m2` by manipulating the key column:
R> key(m2) <- NULL ## probably not necessary
R> m2$entrez.id <- as.character(m2$entrez.id)
R> key(m2) <- 'entrez.id'
R> m2['9']
     entrez.id total.tags.liver cds.liver intron.liver utr.liver
[1,]         9               27         0            0         0
(side note: the bug I mentioned when I try to `merge` this w/ another
data.table is gone after I did the above fix).
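For anyone trying the same workaround on a recent data.table, here is a
minimal sketch of the equivalent steps. It assumes the key column was
stored as a factor, which is a guess on my part, and it uses `:=` and
`setkey`, which as far as I know are the current replacements for the
`key(m2) <- ...` assignment form shown above. The table is a
hypothetical three-row stand-in, not the real `m2`.

```r
library(data.table)

# Hypothetical stand-in with a factor key column (an assumption about
# what the broken m2 looked like).
m2 <- data.table(
  entrez.id        = factor(c("9", "10", "12")),
  total.tags.liver = c(27L, 347L, 5076L)
)

# Convert the factor to its character labels, in place.
m2[, entrez.id := as.character(entrez.id)]

# Re-key on the converted column; this re-sorts the rows by entrez.id.
setkey(m2, entrez.id)

# The keyed lookup should now find the row.
hit <- m2["9"]
```

Converting with `as.character` keeps the labels ("9", "10", ...), whereas
`as.integer` on a factor would return the internal level codes instead.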
So -- I guess my point is that I'm not exactly sure how I got `m2` to
have a funky key, but the fact that it could get messed up like this at
all is, I think, undesired behavior, no?
Does this point to something (maybe obvious) that happened on the way
to building up `m2`?
Thanks,
-steve
--
Steve Lianoglou
Graduate Student: Computational Systems Biology
| Memorial Sloan-Kettering Cancer Center
| Weill Medical College of Cornell University
Contact Info: http://cbio.mskcc.org/~lianos/contact