[datatable-help] Error inside [.data.table

Steve Lianoglou mailinglist.honeypot at gmail.com
Mon Apr 11 16:24:35 CEST 2011


Hi Andreas,

On Mon, Apr 11, 2011 at 9:05 AM, Andreas Borg
<andreas.borg at unimedizin-mainz.de> wrote:
> Hello,
>
> I get an error using the [-operator for a data table object that is hard to
> understand. Basically, what I want to do is the following: I have a
> data.table with a number of binary patterns, possibly including IDs or other
> additional columns. I want to process them grouped by the outcome of some
> variables, e.g. count the number of examples that have 1s in some specified
> columns. Unfortunately, I cannot reproduce the error with simple test data,
> but with data from our package RecordLinkage:
>
> # just for convenience, as this is not a common package
> install.packages("RecordLinkage")
> library(RecordLinkage)
> library(data.table)
> data(RLdata500)
> rpairs <- compare.dedup(RLdata500, blockfld=list(5,6,7))
> p <- rpairs$pairs
> p[is.na(p)] <- 0
> p <- data.table(p)
> # Everything up to here sets up test data. Now define a key and
> # aggregate by the key columns
> keyCol <- c("fname_c1", "lname_c2")
> key(p) <- keyCol
> p[, length(fname_c2), by=keyCol]
>
> The last expression results in:
>
> Error in `[[<-.data.frame`(`*tmp*`, jj, value = 0:1) :
>  replacement has 2 rows, data has 15627

I believe this is a bug that has been fixed since the last release of data.table.

Assuming you are using data.table 1.5.3 (you didn't provide
sessionInfo :-), explicitly naming the key columns will work, e.g.:

R> p[, length(fname_c2), by=list(fname_c1, lname_c2)]
     fname_c1 lname_c2    V1
[1,]        0        0 15454
[2,]        1        0   173

If you install data.table from R-forge, your current incantation will
work as well, e.g.:

R> install.packages('data.table', repos='http://r-forge.r-project.org')
R> # ... reload all your data and what not
R> p[, length(fname_c2), by=keyCol]
     fname_c1 lname_c2    V1
[1,]        0        0 15454
[2,]        1        0   173

R> sessionInfo()
R version 2.12.1 (2010-12-16)
Platform: x86_64-apple-darwin9.8.0/x86_64 (64-bit)
...
other attached packages:
 [1] data.table_1.5.4    RecordLinkage_0.3-1 RSQLite_0.9-4
...

Hope that helps,

-steve

-- 
Steve Lianoglou
Graduate Student: Computational Systems Biology
 | Memorial Sloan-Kettering Cancer Center
 | Weill Medical College of Cornell University
Contact Info: http://cbio.mskcc.org/~lianos/contact

