[datatable-help] Wonder whether there is an easier way to change part of data.table values

Short, Tom TShort at epri.com
Thu Jul 29 21:38:32 CEST 2010


Branson,

[<-.data.table and $<-.data.table both need a bit of work. 

> DT <- data.table(A = c("A", "Z"), Z = 1:10, key = "A")
> 

You've found the DT$column[index] = new value approach, but if
you use that on a key column, DT may no longer be sorted by its key:

> DT$A[10] <- "A"
> DT
      A   Z
 [1,] A   1
 [2,] A   3
 [3,] A   5
 [4,] A   7
 [5,] A   9
 [6,] Z   2
 [7,] Z   4
 [8,] Z   6
 [9,] Z   8
[10,] A 100
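
A quick way to detect this kind of breakage is to test the key column
directly. The following is only a sketch in base R, using a plain
data.frame as a stand-in for DT; the names DF, sorted_before and
sorted_after are illustrative and not part of data.table.

```r
# Stand-in for the keyed table: column A plays the role of the key.
DF <- data.frame(A = rep(c("A", "Z"), each = 5),
                 Z = c(1, 3, 5, 7, 9, 2, 4, 6, 8, 10),
                 stringsAsFactors = FALSE)

# All "A" rows precede all "Z" rows, so the column is sorted.
sorted_before <- !is.unsorted(DF$A)

# Same kind of change as above: overwrite the key value in the last row.
DF$A[10] <- "A"

# An "A" now follows the "Z" block, so the column is no longer sorted.
sorted_after <- !is.unsorted(DF$A)
```

Here sorted_before is TRUE and sorted_after is FALSE, which is the
condition a keyed table would need to guard against.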

Since data.table now inherits from data.frame, we can just reuse
[<-.data.frame. It still has the problem that it won't remove the
key when a key column changes.

> 
> `[<-.data.table` <- `[<-.data.frame`
> DT[9,"Z"] <- 22
> DT
      A   Z
 [1,] A   1
 [2,] A   3
 [3,] A   5
 [4,] A   7
 [5,] A   9
 [6,] Z   2
 [7,] Z   4
 [8,] Z   6
 [9,] Z  22
[10,] A 100
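
The fall-through to the data.frame method can be illustrated without
data.table at all. A minimal sketch, assuming a hypothetical S3 class
"mytable" that inherits from data.frame and defines no `[<-` method
of its own:

```r
# Hypothetical class "mytable" layered on top of data.frame.
MT <- data.frame(A = rep(c("A", "Z"), each = 5), Z = 1:10,
                 stringsAsFactors = FALSE)
class(MT) <- c("mytable", "data.frame")

# No `[<-.mytable` exists, so S3 dispatch falls through to
# `[<-.data.frame`, which performs the assignment.
MT[9, "Z"] <- 22L

new_value <- MT$Z[9]    # the value just assigned
kept_class <- class(MT) # the class vector survives the assignment
```

The assignment works and the object keeps its class, but nothing in
the inherited method knows about a key, which is why the key is left
in place.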

I'm not sure we want to be able to do DT[select,] <- something.

Something like the following will work for a simple select:

> DT <- data.table(A = c("A", "Z"), Z = 1:10, key = "A")
> `[<-.data.table` <- `[<-.data.frame`
> DT[DT[J("A"), which=TRUE, mult="all"], "Z"] <- 44
> DT
      A  Z
 [1,] A 44
 [2,] A 44
 [3,] A 44
 [4,] A 44
 [5,] A 44
 [6,] Z  2
 [7,] Z  4
 [8,] Z  6
 [9,] Z  8
[10,] Z 10

The following is equivalent:

> DT$Z[DT[J("A"), which=TRUE, mult="all"]] <- 55
> DT
      A  Z
 [1,] A 55
 [2,] A 55
 [3,] A 55
 [4,] A 55
 [5,] A 55
 [6,] Z  2
 [7,] Z  4
 [8,] Z  6
 [9,] Z  8
[10,] Z 10
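
The equivalence of the two forms can be checked on a plain data.frame.
This is only a base R sketch: DF1, DF2 and idx are illustrative names,
and which() stands in for the DT[J("A"), which=TRUE, mult="all"]
lookup used above.

```r
DF1 <- data.frame(A = rep(c("A", "Z"), each = 5),
                  Z = c(1, 3, 5, 7, 9, 2, 4, 6, 8, 10),
                  stringsAsFactors = FALSE)
DF2 <- DF1

# Row numbers of the "A" group (rows 1 to 5 here).
idx <- which(DF1$A == "A")

DF1[idx, "Z"] <- 55   # matrix-style form, handled by `[<-.data.frame`
DF2$Z[idx]    <- 55   # column form

# Both forms leave the same values in column Z.
same <- identical(DF1$Z, DF2$Z)
```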

I'd prefer to use as much of [<-.data.frame and $<-.data.frame as
possible.  $<-.data.frame is pretty easy:

> "$<-.data.table" = function (x, name, value) {
+     res <- `$<-.data.frame`(x, name, value)
+     if (any(name %in% key(x)))
+         key(res) <- NULL
+     res
+ }
> DT <- data.table(A = c("A", "Z"), Z = 1:10, key = "A")
> DT$Z[3] <- 33
> key(DT)
[1] "A"
> DT$A[10] <- "A"
> key(DT)
NULL
> DT
      A  Z
 [1,] A  1
 [2,] A  3
 [3,] A 33
 [4,] A  7
 [5,] A  9
 [6,] Z  2
 [7,] Z  4
 [8,] Z  6
 [9,] Z  8
[10,] A 10

This doesn't allow x to be a data.table-style select. If we want 
that, I could experiment some.

[<-.data.table is more challenging, but I could take a shot at it on a
plane ride next week.
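
As a rough illustration only, the same wrap-and-drop-the-key idea can
be sketched for [<- in base R. Everything here is an assumption made
for the example: "keyed_df" is a hypothetical class with a "key"
attribute, not data.table's internals, and the sketch handles only the
x[i, j] <- value form.

```r
# Hypothetical "keyed_df" class: a data.frame whose "key" attribute
# names the column it is sorted by.
`[<-.keyed_df` <- function(x, i, j, value) {
  key_cols <- attr(x, "key")
  # Work out which columns are being written to.
  if (missing(j)) {
    touched <- names(x)        # x[i, ] <- value touches every column
  } else if (is.character(j)) {
    touched <- j
  } else {
    touched <- names(x)[j]
  }
  # Reuse the data.frame method for the actual assignment.
  res <- `[<-.data.frame`(x, i, j, value)
  # Drop the key if a key column was modified, otherwise keep it.
  attr(res, "key") <- if (any(touched %in% key_cols)) NULL else key_cols
  res
}

KD <- data.frame(A = rep(c("A", "Z"), each = 5), Z = 1:10,
                 stringsAsFactors = FALSE)
attr(KD, "key") <- "A"
class(KD) <- c("keyed_df", "data.frame")

KD[9, "Z"] <- 22L
key_after_Z <- attr(KD, "key")   # non-key column changed: key kept

KD[10, "A"] <- "A"
key_after_A <- attr(KD, "key")   # key column changed: key dropped
```

A data.table-style select in i would need more than this, since the
inherited method only understands data.frame-style indices.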

- Tom 

 

> -----Original Message-----
> From: datatable-help-bounces at lists.r-forge.r-project.org 
> [mailto:datatable-help-bounces at lists.r-forge.r-project.org] 
> On Behalf Of Branson Owen
> Sent: Thursday, July 29, 2010 14:39
> To: datatable-help at lists.r-forge.r-project.org
> Subject: [datatable-help] Wonder whether there is an easier 
> way to change part of data.table values
> 
> I thought I had no more questions, but ... Please take your 
> time to respond; I don't want to take up too much of your time.
> 
> ** I want to only change values for certain rows of selected 
> columns. **
> 
> In data.frame, I can do something like:
> 
> > DF[row index, "column"] = new value.
> 
> In data.table, this has been disabled even using "with = FALSE"
> 
> > DT[3, "Z", with = FALSE]
> 
>        Z
> [1,] 20
> 
> > DT[3,"Z", with = FALSE] <- 1
> Error in `[<-.data.table`(`*tmp*`, 3, "Z", with = FALSE, value = 1) :
>   unused argument(s) (with = FALSE)
> 
> 
> Actually, I found that there is no way to edit values using 
> [,] in DT. The only way I found to change a value is 
> DT$column[index] = new value
> 
> This would make the following task difficult:
> 
> # DOES NOT WORK #
> > DT[join/select, {
> columnA <- calculation based on columnB, C, D, ...
> }]
> # DOES NOT WORK #
> 
> It doesn't complain, but it doesn't change the values at all. I 
> guess this is because the expression is evaluated like with() on a 
> data.frame, where such an assignment doesn't work, either.
> 
> At this moment, my solution is:
> > DT[join/select, {
> DT$columnA[index] <<- calculation based on columnB, C, D, ...
> }]
> 
> This relies on DT$columnA[index] and the superassignment operator 
> <<-. We also need to either get the index ourselves, e.g. with 
> DT[select/join, which = TRUE], or store it first. I'm not sure 
> whether this is the best solution.
> 
> In DF, it would be
> 
> > index = using scan
> > DF[index, "columnA"] = with(DF, calculation based on columnB, C, D, ...)
> 
> Note that this doesn't work for DT. At this moment, the only 
> way I found to edit DT is
> > DT$column[index] = new value
> 
> I don't think my example is uncommon, but I can't find a common 
> solution using data.table. Maybe I missed something.
> 
> Any comments will be highly appreciated. Thank you very much 
> again for your help.
> 
> Best regards,

