[datatable-help] Assignment by reference on a datatable subset

DUPREZ Cédric Cedric.DUPREZ at ign.fr
Wed Feb 8 14:18:40 CET 2012


Dear all,

I have a new question about data completion within a datatable.

Given the following datatable:
DT <- data.table("id1" = c("n1", "n1", "n1", "n1", "n1", "n1", "n1", "n1", "n1", "n1", "n1", "n1", "n2", "n2", "n2", "n2")
	, 'id2'=c(1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 1, 2, 3, 4)
	, val=c(NA, NA, 2, 2, NA, NA, 7, 7, NA, NA, NA, 11, NA, NA, NA, NA)
	, key = c("id1", "id2"))

I get:
      id1 id2 val
 [1,]  n1   1  NA
 [2,]  n1   2  NA
 [3,]  n1   3   2
 [4,]  n1   4   2
 [5,]  n1   5  NA
 [6,]  n1   6  NA
 [7,]  n1   7   7
 [8,]  n1   8   7
 [9,]  n1   9  NA
[10,]  n1  10  NA
[11,]  n1  11  NA
[12,]  n1  12  11
[13,]  n2   1  NA
[14,]  n2   2  NA
[15,]  n2   3  NA
[16,]  n2   4  NA

Within each id1, the val column contains id2 values.
For every id2 that is referenced by some val of the same id1, I would like the row carrying that id2 to have its own val filled in with its id2 whenever val is still NA.
In my example, the final datatable should look like this:
      id1 id2 val
 [1,]  n1   1  NA
 [2,]  n1   2   2
 [3,]  n1   3   2
 [4,]  n1   4   2
 [5,]  n1   5  NA
 [6,]  n1   6  NA
 [7,]  n1   7   7
 [8,]  n1   8   7
 [9,]  n1   9  NA
[10,]  n1  10  NA
[11,]  n1  11  11
[12,]  n1  12  11
[13,]  n2   1  NA
[14,]  n2   2  NA
[15,]  n2   3  NA
[16,]  n2   4  NA
As you can see, val on lines 2 and 11 has been completed with the id2 value.
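
For reference, the rule described above can be written as a grouped assignment. This is only a minimal sketch: it assumes a recent data.table release in which := can be combined with by, and it rebuilds the example table so that it runs on its own.

```r
library(data.table)

# Same example table as above
DT <- data.table(
  id1 = c(rep("n1", 12), rep("n2", 4)),
  id2 = c(1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 1, 2, 3, 4),
  val = c(NA, NA, 2, 2, NA, NA, 7, 7, NA, NA, NA, 11, NA, NA, NA, NA),
  key = c("id1", "id2")
)

# Within each id1 group, a row whose id2 appears among the group's
# non-NA val values receives its own id2 as val; all other rows keep
# their current val (assumes := with by is available).
DT[, val := ifelse(id2 %in% val[!is.na(val)], id2, val), by = id1]
```

Because the assignment is done by reference on the val column only, the key on (id1, id2) is left untouched.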

I tried like this:
DT2 <- unique(DT[!is.na(val), c("id1", "val"), with = FALSE])
DT2$id2 <- DT2$val
setkeyv(DT2, c("id1", "id2"))
DT[DT2, val:=val.1]

But I get the following message: "combining bywithoutby with := in j is not yet implemented."
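
For comparison, here is a sketch of how this kind of join-and-assign is usually written in later data.table releases. It assumes a version that supports the on argument and the i. prefix for columns of the joined table; neither is available in the version that produced the message above.

```r
library(data.table)

# Same example table as above
DT <- data.table(
  id1 = c(rep("n1", 12), rep("n2", 4)),
  id2 = c(1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 1, 2, 3, 4),
  val = c(NA, NA, 2, 2, NA, NA, 7, 7, NA, NA, NA, 11, NA, NA, NA, NA),
  key = c("id1", "id2")
)

# Lookup table: one row per (id1, val) pair, with val copied into id2
DT2 <- unique(DT[!is.na(val), .(id1, id2 = val, val)])

# Update join: for every DT row matching a DT2 row on (id1, id2),
# overwrite val by reference with the value from DT2 (i.val).
# Assumes a data.table version supporting `on` and the `i.` prefix.
DT[DT2, on = .(id1, id2), val := i.val]
```

Rows of DT without a match in DT2 are not touched, so their val stays as it was.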

Here is the solution I finally found:
DT <- data.table("id1" = c("n1", "n1", "n1", "n1", "n1", "n1", "n1", "n1", "n1", "n1", "n1", "n1", "n2", "n2", "n2", "n2"), 'id2'=c(1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 1, 2, 3, 4), val=c(NA, NA, 2, 2, NA, NA, 7, 7, NA, NA, NA, 11, NA, NA, NA, NA), key = c("id1", "id2"))
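cle <- key(DT)  # remember the key so it can be restored at the end
# Lookup table: one row per (id1, val) pair, with val copied into id2
DT2 <- unique(DT[!is.na(val), c("id1", "val"), with = FALSE])
DT2$id2 <- DT2$val
setkeyv(DT2, c("id1", "id2"))
# Join on (id1, id2): val is the looked-up value from DT2, val.1 the original val from DT
X <- DT2[DT]
# Where the original val is missing, take the looked-up value
X[is.na(val.1), val.1:=val]
# Keep the original columns only and restore name and key
DT <- X[,list(id1, id2, val.1)]
setnames(DT, 3, "val")
setkeyv(DT, cle)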

Is there a faster way to complete my data?

Thanks in advance for your help.

Regards,
Cedric

