[datatable-help] Error inside [.data.table
Andreas Borg
andreas.borg at unimedizin-mainz.de
Mon Apr 11 15:05:28 CEST 2011
Hello,
I get an error that is hard to understand when using the [ operator on a
data.table object. Basically, what I want to do is the following: I have a
data.table with a number of binary patterns, possibly including IDs or
other additional columns. I want to process them grouped by the values
of some variables, e.g. count the number of rows that have 1s in
some specified columns. Unfortunately, I cannot reproduce the error with
simple test data, only with data from our package RecordLinkage:
install.packages("RecordLinkage") # just for convenience, as this is not a common package
library(RecordLinkage)
library(data.table)
data(RLdata500)
rpairs <- compare.dedup(RLdata500, blockfld=list(5,6,7))
p <- rpairs$pairs
p[is.na(p)] <- 0
p <- data.table(p)
# Everything up to here sets up test data. Now define a key and
# aggregate by the key columns
keyCol <- c("fname_c1", "lname_c2")
key(p) <- keyCol
p[, length(fname_c2), by=keyCol]
The last expression results in:
Error in `[[<-.data.frame`(`*tmp*`, jj, value = 0:1) :
replacement has 2 rows, data has 15627
However, when I construct a similar table manually, everything works fine:
expr <- quote(as.numeric(sample(0:1, 1000, replace=TRUE)))
dt <- data.table(fname_c1=eval(expr), fname_c2=eval(expr))
key(dt) <- c("fname_c1", "fname_c2")
dt[, length(fname_c2), by=c("fname_c1", "fname_c2")]
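For comparison, the manual example can also be written with setkeyv() and the .N symbol. This is only a sketch, assuming a current data.table release where setkeyv() sets a key from a character vector and .N gives the number of rows per group. The seed, the variable n and the result column name n are arbitrary choices for illustration, and the sketch does not explain the error above.

```r
library(data.table)

set.seed(1)  # arbitrary seed so the sketch is repeatable
n <- 1000
dt <- data.table(
  fname_c1 = as.numeric(sample(0:1, n, replace = TRUE)),
  fname_c2 = as.numeric(sample(0:1, n, replace = TRUE))
)

keyCol <- c("fname_c1", "fname_c2")
setkeyv(dt, keyCol)  # set the key from a character vector of column names

# .N is the number of rows in each group defined by the key columns
counts <- dt[, .(n = .N), by = keyCol]
print(counts)
```

The group sizes should sum to the number of rows in dt, which is a quick sanity check on the aggregation.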
The first example also works with other combinations of columns:
keyCol <- c("fname_c1", "lname_c2")
key(p) <- keyCol
p[, length(fname_c2), by=keyCol]
I am not even sure if the error is strictly reproducible.
Any idea what could be wrong?
Best regards,
Andreas
--
Andreas Borg
Medizinische Informatik
UNIVERSITÄTSMEDIZIN
der Johannes Gutenberg-Universität
Institut für Medizinische Biometrie, Epidemiologie und Informatik
Obere Zahlbacher Straße 69, 55131 Mainz
www.imbei.uni-mainz.de
Phone +49 (0) 6131 175062
E-Mail: borg at imbei.uni-mainz.de