[datatable-help] (no subject)

DUPREZ Cédric Cedric.DUPREZ at ign.fr
Tue Feb 7 14:34:51 CET 2012


Dear all,

I am looking for the best way to fill in missing values in a data.table, according to a particular rule.

Given the following data.table:
library(data.table)
# id2 is written out with rep() rather than relying on recycling
DT <- data.table('id1'=c(1,1,1,1,2,2,2,2,3,3,3,3)
      , 'id2'=rep(c(1,2,3,4), times=3)
      , 'val'=c(1,2,NA,5,1,NA,NA,4,NA,2,4,6)
      , key=c("id1", "id2"))

I get: 
      id1 id2 val
 [1,]   1   1   1
 [2,]   1   2   2
 [3,]   1   3  NA
 [4,]   1   4   5
 [5,]   2   1   1
 [6,]   2   2  NA
 [7,]   2   3  NA
 [8,]   2   4   4
 [9,]   3   1  NA
[10,]   3   2   2
[11,]   3   3   4
[12,]   3   4   6

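The rule for filling in missing values is the following: replace each missing value (val) with the nearest preceding non-missing value within the same id1 group, in id2 order. If the group has no preceding non-missing value, the value stays NA.
In my example, the lines with missing values are: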
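
To make the rule concrete, here is a minimal base R sketch of it applied to a single vector, i.e. the val column of one id1 group. The helper name locf is hypothetical (it is not a data.table function) and is only meant to illustrate the rule, not to be the best way to do it.

```r
# Hypothetical helper: carry the last non-missing value forward in one vector.
locf <- function(x) {
  # For each position, the index of the most recent non-missing element
  # (0 when nothing non-missing has been seen yet).
  idx <- cummax(ifelse(is.na(x), 0L, seq_along(x)))
  # Positions with no earlier non-missing value must stay NA.
  idx[idx == 0] <- NA
  x[idx]
}

locf(c(1, 2, NA, 5))    # val for id1 = 1: 1 2 2 5
locf(c(1, NA, NA, 4))   # val for id1 = 2: 1 1 1 4
locf(c(NA, 2, 4, 6))    # val for id1 = 3: NA 2 4 6
```

The leading NA of the third group is left untouched because there is nothing before it to carry forward.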

DT[is.na(val)]

     id1 id2 val
[1,]   1   3  NA
[2,]   2   2  NA
[3,]   2   3  NA
[4,]   3   1  NA

The final result for my datatable should be:

DT
      id1 id2 val
 [1,]   1   1   1
 [2,]   1   2   2
 [3,]   1   3   2
 [4,]   1   4   5
 [5,]   2   1   1
 [6,]   2   2   1
 [7,]   2   3   1
 [8,]   2   4   4
 [9,]   3   1  NA
[10,]   3   2   2
[11,]   3   3   4
[12,]   3   4   6
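
For reference, the result above could probably be produced by a grouped assignment instead of a join. This is only a sketch under one assumption: that the installed data.table is recent enough to provide nafill() (I believe 1.12.4 or later). With an older version, a hand-written carry-forward function would have to be used in its place.

```r
library(data.table)  # assumes a version that provides nafill()

DT <- data.table(id1 = rep(c(1, 2, 3), each = 4),
                 id2 = rep(c(1, 2, 3, 4), times = 3),
                 val = c(1, 2, NA, 5, 1, NA, NA, 4, NA, 2, 4, 6),
                 key = c("id1", "id2"))

# Within each id1 group, replace NA by the last non-missing val.
# The key keeps rows sorted by (id1, id2), so "preceding" follows id2 order.
DT[, val := nafill(val, type = "locf"), by = id1]

print(DT)
```

Grouping with by = id1 is what keeps a value from one id1 from leaking into the next group, so the first row of id1 = 3 remains NA.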

What is the best and easiest way to fill in missing values with such a rule? I tried joins and the := operator, but I often get error messages like "combining bywithoutby with := in j is not yet implemented."

Thanks in advance for your help,

Cedric


