[datatable-help] (no subject)
DUPREZ Cédric
Cedric.DUPREZ at ign.fr
Tue Feb 7 14:34:51 CET 2012
Dear all,
I am looking for the best way to fill in missing values in a data.table according to a particular rule.
Given the following data.table:
DT <- data.table('id1'=c(1,1,1,1,2,2,2,2,3,3,3,3)
, 'id2'= c(1,2,3,4)
, 'val'=c(1,2,NA,5,1,NA,NA,4,NA,2,4,6)
, key=c("id1", "id2"))
I get:
      id1 id2 val
 [1,]   1   1   1
 [2,]   1   2   2
 [3,]   1   3  NA
 [4,]   1   4   5
 [5,]   2   1   1
 [6,]   2   2  NA
 [7,]   2   3  NA
 [8,]   2   4   4
 [9,]   3   1  NA
[10,]   3   2   2
[11,]   3   3   4
[12,]   3   4   6
The rule for filling in missing values is the following: replace each NA with the immediately preceding non-missing value of val within the same id1 group. If the group has no preceding non-missing value, the NA stays as it is.
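To make the rule concrete, here is a minimal base R sketch on plain vectors. It is only an illustration of the rule, not a data.table solution, and the helper name fill_forward is hypothetical (it does not come from any package).

```r
# Hypothetical helper: carry the last non-missing value forward within one vector.
fill_forward <- function(x) {
  idx <- cumsum(!is.na(x))        # how many non-NA values seen so far (0 = none yet)
  c(NA, x[!is.na(x)])[idx + 1]    # slot 1 holds NA, so leading gaps stay NA
}

# Same values as the id1 and val columns of DT
id1 <- c(1,1,1,1,2,2,2,2,3,3,3,3)
val <- c(1,2,NA,5,1,NA,NA,4,NA,2,4,6)

# ave() applies fill_forward separately within each id1 group
# and returns the results in the original row order
filled <- ave(val, id1, FUN = fill_forward)
print(filled)
```

With a data.table release that supports := together with by, something like DT[, val := fill_forward(val), by = id1] might apply the same rule in place; that is an assumption about the installed version, not something confirmed here.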
In my example, the rows with missing values are:
DT[is.na(val)]
     id1 id2 val
[1,]   1   3  NA
[2,]   2   2  NA
[3,]   2   3  NA
[4,]   3   1  NA
The final result for my data.table should be:
DT
      id1 id2 val
 [1,]   1   1   1
 [2,]   1   2   2
 [3,]   1   3   2
 [4,]   1   4   5
 [5,]   2   1   1
 [6,]   2   2   1
 [7,]   2   3   1
 [8,]   2   4   4
 [9,]   3   1  NA
[10,]   3   2   2
[11,]   3   3   4
[12,]   3   4   6
What is the best and easiest way to fill in missing values with such a rule? I tried using joins and the := operator, but I often get error messages like "combining bywithoutby with := in j is not yet implemented."
Thanks in advance for your help,
Cedric