[datatable-help] mapply cannot modify in place when iterating over list of DTs

Ricardo Saporta saporta at scarletmail.rutgers.edu
Fri Sep 20 15:48:11 CEST 2013


I've encountered the following issue when iterating over a list of data.tables.
The issue occurs only with mapply, not with lapply.


Given a list of data.tables, iterating over the list directly with mapply
does not modify the tables in place.

When attempting to add a new column, we also get an "Invalid
.internal.selfref" warning.
Modifying an existing column does not issue a warning, but still fails to
modify in place.

WORKAROUND:
----------
The workaround is to iterate over an index into the list, and then
  modify each data.table via list.of.DTs[[i]][ .. ]

**Interestingly, this issue occurs with `mapply`, but not `lapply`.**
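
One possible way to check whether the iterating function receives the original
table or a copy is to compare memory addresses using data.table's `address()`.
This is a minimal sketch and not part of the original report: the names
addr.list, addr.lapply and addr.mapply are made up for illustration, and the
result may well differ between R and data.table versions.

```r
library(data.table)

list.DT <- list(
  DT1 = data.table(Col1 = 111:115, Col2 = 121:125),
  DT2 = data.table(Col1 = 211:215, Col2 = 221:225)
)

# Addresses of the tables as stored in the list (illustrative names)
addr.list <- sapply(seq_along(list.DT), function(i) address(list.DT[[i]]))

# Addresses of the objects that the iterated function actually receives
addr.lapply <- unlist(lapply(list.DT, function(DT) address(DT)))
addr.mapply <- mapply(function(DT, k) address(DT), list.DT, seq_along(list.DT))

# TRUE: the function saw the original table; FALSE: it saw a copy
data.frame(lapply = addr.lapply == addr.list,
           mapply = addr.mapply == addr.list)
```

A FALSE in the mapply column would be consistent with the tables being copied
before the function sees them, which would explain both the warning and the
lack of modification in place.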


EXAMPLE:
--------
  # Given a list of DT's and two lists of vectors,
  #   we want to add the corresponding vectors as columns to the DT.

## ---------------- ##
##   SAMPLE DATA:   ##
## ---------------- ##
  library(data.table)

  # list of data.tables
  list.DT <- list(
    DT1=data.table(Col1=111:115, Col2=121:125),
    DT2=data.table(Col1=211:215, Col2=221:225)
    )

  # lists of columns to add
  list.Col3 <- list(131:135, 231:235)
  list.Col4 <- list(141:145, 241:245)


## ------------------------------------ ##
##   Iterating over the list elements   ##
##     adding a new column              ##
## ------------------------------------ ##
##   Will issue warning and             ##
##     will fail to modify in place     ##
## ------------------------------------ ##
  mapply (
      function(DT, C3, C4)
         DT[, c("Col3", "Col4") := list(C3, C4)],

      list.DT,  # iterating over the list
      list.Col3, list.Col4,
      SIMPLIFY=FALSE
    )

  ## Note the lack of change
  list.DT


## ------------------------------------ ##
##   Iterating over an index            ##
## ------------------------------------ ##
  mapply (
      function(i, C3, C4)
         list.DT[[i]] [, c("Col3", "Col4") := list(C3, C4)],

      seq_along(list.DT),   # iterating over an index into the list
      list.Col3, list.Col4,
      SIMPLIFY=FALSE
    )

  ## Note each DT _has_ been modified
  list.DT

## ------------------------------------ ##
##   Iterating over the list elements   ##
##     modifying existing column        ##
## ------------------------------------ ##
##   No warning issued, but             ##
##     Will fail to modify in place     ##
## ------------------------------------ ##
  mapply (
      function(DT, C3, C4)
         DT[, c("Col3", "Col4") := list(Col3*1e3, Col4*1e4)],

      list.DT,  # iterating over the list
      list.Col3, list.Col4,
      SIMPLIFY=FALSE
    )

  ## Note the lack of change (compare with the output returned by `mapply` above)
  list.DT

## ------------------------------------ ##
##                                      ##
##   `lapply` works as expected.        ##
##                                      ##
## ------------------------------------ ##

  ## NOW WITH lapply
  lapply(list.DT,
    function(DT)
      DT[, newCol := LETTERS[1:5]]
  )

  ## Note the new column:
  list.DT



# ========================== #

##   NON-WORKAROUNDS   ##
##
## I also tried all of the following alternatives
##   in hopes of being able to iterate over the list
##   directly, using `mapply`.
## None of these worked.

# (1) Creating the DTs first, then creating the list from them
    DT1 <- data.table(Col1=111:115, Col2=121:125)
    DT2 <- data.table(Col1=211:215, Col2=221:225)

    list.DT <- list(DT1=DT1,DT2=DT2 )


# (2) Same as 1, and using `copy()` in the call to `list()`
    list.DT <- list(DT1=copy(DT1),
                    DT2=copy(DT2) )

# (3) lapply'ing `copy` and then iterating over that list
    list.DT <- lapply(list.DT, copy)

# (4) Not naming the list elements
    list.DT <- list(DT1, DT2)
    # and tried
    list.DT <- list(copy(DT1), copy(DT2))

## All of the above still failed to modify in place
##   (and also issued the same warning if trying to add a column)
##    when iterating using mapply

  mapply(function(DT, C3, C4)
    DT[, c("Col3", "Col4") := list(C3, C4)],
    list.DT, list.Col3, list.Col4,
    SIMPLIFY=FALSE)


# ========================== #
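
If rebinding list.DT is acceptable, another option is to stop relying on
modification by reference and assign the value returned by mapply back to the
list. This is a sketch that was not tried in the original report. It relies on
`:=` returning the modified data.table invisibly, so each element of the result
should carry the new columns; in versions that issue the selfref warning, the
warning may still appear.

```r
library(data.table)

list.DT <- list(
  DT1 = data.table(Col1 = 111:115, Col2 = 121:125),
  DT2 = data.table(Col1 = 211:215, Col2 = 221:225)
)
list.Col3 <- list(131:135, 231:235)
list.Col4 <- list(141:145, 241:245)

# Keep whatever mapply returns instead of depending on in-place modification
list.DT <- mapply(
  function(DT, C3, C4)
    DT[, c("Col3", "Col4") := list(C3, C4)],
  list.DT, list.Col3, list.Col4,
  SIMPLIFY = FALSE
)

# Each element should now have Col3 and Col4
lapply(list.DT, names)
```

The cost is that any other variable still pointing at the original tables is
not guaranteed to see the new columns, so this is only a substitute when the
list itself is the sole handle on the data.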


Ricardo Saporta
Rutgers University, New Jersey
e: saporta at rutgers.edu