<div dir="ltr"><br><div class="gmail_extra"><br><div class="gmail_quote">On Mon, Sep 23, 2013 at 9:42 PM, Matthew Dowle <span dir="ltr"><<a href="mailto:mdowle@mdowle.plus.com" target="_blank">mdowle@mdowle.plus.com</a>></span> wrote:<br>
<blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">
  
    
  
  <div bgcolor="#FFFFFF" text="#000000">
    <div><br>
      Hi,<br>
      Basically, adding columns by reference to a data.table that is a
      member of a list of data.tables is really difficult to handle
      internally.  I had to add a special case internally to get around
      list() copying, so that the binding can change inside the list on the
      shallow copy when [[ is used.  A for loop is the way to add
      columns by reference inside a list of data.tables, and that should
      work ok using [[.  But doing that via lapply and mapply is really
      stretching it.  </div></div></blockquote><div><br></div><div>That makes sense.  I took a whack at it, but couldn't even come close.  </div><div><br></div><div> </div><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">
<div bgcolor="#FFFFFF" text="#000000"><div>Even catching user expectations in this area is
      difficult.  Ideally we'd catch mapply, yes, but really data.table
      prefers the tables to be rbindlist()-ed, with operations then working on a single large
      data.table.  </div></div></blockquote><div><br></div><div>Agreed.  In the application where this came up, I am dealing with a list of tables with different dims (hence not rbinding)</div><div> </div><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">
<div bgcolor="#FFFFFF" text="#000000"><div>We can add advice to the warning message not to use
      mapply or lapply to add columns by reference to a list of
      data.tables (use a for loop instead)?</div></div></blockquote><div><br></div><div>Perhaps a warning that modifications to the DTs in the list are likely not to have stuck, plus a suggestion to use rbindlist when possible?</div>
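<div><br></div><div>For reference, a minimal sketch of the for-loop approach suggested above, assuming a recent R and data.table. The sample tables and column names are only illustrative (they are not from my application), and the two tables deliberately have different row counts to mirror the case where rbinding is not an option.</div>
<pre>
```r
library(data.table)

# Illustrative data: two tables with different numbers of rows
list.DT <- list(
  DT1 = data.table(Col1 = 111:115, Col2 = 121:125),
  DT2 = data.table(Col1 = 211:213, Col2 = 221:223)
)
list.Col3 <- list(131:135, 231:233)
list.Col4 <- list(141:145, 241:243)

# Index into the list with [[ so that := acts on the list member itself
for (i in seq_along(list.DT)) {
  list.DT[[i]][, c("Col3", "Col4") := list(list.Col3[[i]], list.Col4[[i]])]
}

# Inspect the column names of each table to check the additions
lapply(list.DT, names)
```
</pre>
<div>Checking names() after the loop is a cheap way to catch the case where the modifications silently did not stick.</div>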
<div>  </div><div> </div><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><div bgcolor="#FFFFFF" text="#000000"><div><span class="HOEnZb"><font color="#888888"><br>
      Matthew</font></span><div><div class="h5"><br>
      <br>
      <br>
      On 22/09/13 03:02, Ricardo Saporta wrote:<br>
    </div></div></div><div><div class="h5">
    <blockquote type="cite">
      <div dir="ltr">Matthew, 
        <div><br>
        </div>
        <div>I did notice the warning, but something doesn't add up: </div>
        <div><br>
        </div>
        <div>If the issue is simply that it is being copied when
          created, then wouldn't we expect the same warning to arise when
          we try to modify the table using either `mapply` or `lapply`? (The
          latter does not produce a warning.) </div>
        <div><br>
        </div>
        <div>If, on the other hand, the issue pertains specifically to
          mapply (which I assume it does), then why is it only a problem
          when we iterate over the list directly, whereas iterating
          indirectly by using an index does not produce any warnings? </div>
        <div> </div>
        <div class="gmail_extra">
          <div>
            <div style="color:rgb(34,34,34);font-size:13px;font-family:arial,sans-serif">
              <div style="font-size:13px">While this is minor overall
                if one is aware of the issue, I think it might allow
                unnoticed bugs to creep into someone's code,
                specifically if a user applies mapply to modify a list of DTs
                without realizing that the modifications are not
                being retained. </div>
              <div style="font-size:13px"><br>
              </div>
              <div style="font-size:13px">That being said, I'm not sure
                how this could even be addressed if the root is in
                mapply, but is it worth trying to address? </div>
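              <div style="font-size:13px"><br>
              </div>
              <div style="font-size:13px">One defensive pattern, sketched here under the assumption that keeping the result matters more than strict modify-in-place semantics: assign the value returned by Map() (mapply with SIMPLIFY=FALSE) back to the list, and then check explicitly that the columns are present. The data below just reuses the sample values from the original example.</div>
<pre>
```r
library(data.table)

list.DT <- list(
  DT1 = data.table(Col1 = 111:115, Col2 = 121:125),
  DT2 = data.table(Col1 = 211:215, Col2 = 221:225)
)
list.Col3 <- list(131:135, 231:235)
list.Col4 <- list(141:145, 241:245)

# Assign the value returned by Map() back to the list, so the result is
# kept whether or not := managed to modify each table in place
list.DT <- Map(
  function(DT, C3, C4) DT[, c("Col3", "Col4") := list(C3, C4)],
  list.DT, list.Col3, list.Col4
)

# Fail loudly if any table is missing the new columns
has.cols <- vapply(
  list.DT,
  function(DT) all(c("Col3", "Col4") %in% names(DT)),
  logical(1)
)
stopifnot(all(has.cols))
```
</pre>
              <div style="font-size:13px">This does not avoid a copy if one is taken, but it does mean the modifications cannot be lost without anyone noticing.</div>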
              <div style="font-size:13px">
                <br>
              </div>
              <div style="font-size:13px">Rick</div>
              <div style="font-size:13px"><br>
              </div>
            </div>
          </div>
          <br>
          <div class="gmail_quote">On Fri, Sep 20, 2013 at 2:18 PM,
            Matthew Dowle <span dir="ltr"><<a href="mailto:mdowle@mdowle.plus.com" target="_blank">mdowle@mdowle.plus.com</a>></span>
            wrote:<br>
            <blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">
              <div bgcolor="#FFFFFF" text="#000000">
                <div>Does this sentence from the warning help?
                  <div><br>
                    <br>
                    " Also, in R<v3.1.0, list(DT1,DT2) copied the
                    entire DT1 and DT2 (R's list() used to copy named
                    objects); please upgrade to R>=v3.1.0 if that is
                    biting. "<br>
                    <br>
                  </div>
                  <span><font color="#888888"> Matthew</font></span>
                  <div>
                    <div><br>
                      <br>
                      On 20/09/13 19:01, Ricardo Saporta wrote:<br>
                    </div>
                  </div>
                </div>
                <div>
                  <div>
                    <blockquote type="cite">
                      <div dir="ltr">One warning per DT in the list
                        <div>  (I added the line breaks) 
                          <div>-Rick</div>
                          <div>=============================================</div>
                          <div>
                            <div>Warning messages:</div>
                            <div><br>
                            </div>
                            <div>1: In `[.data.table`(DT, ,
                              `:=`(c("Col3", "Col4"), list(C3, C4))) :</div>
                            <div><br>
                            </div>
                            <div>  Invalid .internal.selfref detected
                              and fixed by taking a copy of the whole
                              table so that := can add this new column
                              by reference. At an earlier point, this
                              data.table has been copied by R (or been
                              created manually using structure() or
                              similar). Avoid key<-, names<- and
                              attr<- which in R currently (and oddly)
                              may copy the whole data.table. Use set*
                              syntax instead to avoid copying: ?set,
                              ?setnames and ?setattr. Also, in
                              R<v3.1.0, list(DT1,DT2) copied the
                              entire DT1 and DT2 (R's list() used to
                              copy named objects); please upgrade to
                              R>=v3.1.0 if that is biting. If this
                              message doesn't help, please report to
                              datatable-help so the root cause can be
                              fixed.</div>
                            <div><br>
                            </div>
                            <div>2: In `[.data.table`(DT, ,
                              `:=`(c("Col3", "Col4"), list(C3, C4))) :</div>
                            <div><br>
                            </div>
                            <div>  Invalid .internal.selfref detected
                              and fixed by taking a copy of the whole
                              table so that := can add this new column
                              by reference. At an earlier point, this
                              data.table has been copied by R (or been
                              created manually using structure() or
                              similar). Avoid key<-, names<- and
                              attr<- which in R currently (and oddly)
                              may copy the whole data.table. Use set*
                              syntax instead to avoid copying: ?set,
                              ?setnames and ?setattr. Also, in
                              R<v3.1.0, list(DT1,DT2) copied the
                              entire DT1 and DT2 (R's list() used to
                              copy named objects); please upgrade to
                              R>=v3.1.0 if that is biting. If this
                              message doesn't help, please report to
                              datatable-help so the root cause can be
                              fixed.</div>
                          </div>
                          <div>=============================================<br>
                          </div>
                          <div><br>
                          </div>
                        </div>
                        <div class="gmail_extra"><br>
                          <br>
                          <br>
                          <div class="gmail_quote">On Fri, Sep 20, 2013
                            at 12:49 PM, Matthew Dowle <span dir="ltr"><<a href="mailto:mdowle@mdowle.plus.com" target="_blank">mdowle@mdowle.plus.com</a>></span>
                            wrote:<br>
                            <blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">
                              <div bgcolor="#FFFFFF" text="#000000">
                                <div><br>
                                  Hi,<br>
                                  <br>
                                  What's the warning?<br>
                                  <br>
                                  Matthew
                                  <div>
                                    <div><br>
                                      <br>
                                      <br>
                                      On 20/09/13 14:48, Ricardo Saporta
                                      wrote:<br>
                                    </div>
                                  </div>
                                </div>
                                <blockquote type="cite">
                                  <div>
                                    <div>
                                      <div dir="ltr">
                                        <div>
                                          <div>I've encountered the
                                            following issue iterating
                                            over a list of data.tables. </div>
                                          <div>The issue is only with
                                            mapply, not with lapply.</div>
                                          <div><br>
                                          </div>
                                          <div> </div>
                                          <div>Given a list of
                                            data.tables, mapply'ing
                                            over the list directly </div>
                                          <div>cannot modify them in place. </div>
                                          <div><br>
                                          </div>
                                          <div>Also if attempting to add
                                            a new column, we get an
                                            "Invalid .internal.selfref"
                                            warning. </div>
                                          <div>Modifying an existing
                                            column does not issue a
                                            warning, but still fails to
                                            modify-in-place</div>
                                          <div><br>
                                          </div>
                                          <div>WORKAROUND: </div>
                                          <div>----------</div>
                                          <div>The workaround is to
                                            iterate over an index to the
                                            list, then to </div>
                                          <div>  modify each data.table
                                            via list.of.DTs[[i]][ .. ]</div>
                                          <div><br>
                                          </div>
                                          <div>**Interestingly, this
                                            issue occurs with `mapply`,
                                            but not `lapply`.**</div>
                                          <div><br>
                                          </div>
                                          <div> </div>
                                          <div>EXAMPLE:</div>
                                          <div>-------- </div>
                                          <div>  # Given a list of DT's
                                            and two lists of vectors, </div>
                                          <div>  #   we want to add the
                                            corresponding vectors as
                                            columns to the DT. </div>
                                          <div><br>
                                          </div>
                                          <div>## ---------------- ##</div>
                                          <div>##   SAMPLE DATA:   ##</div>
                                          <div>## ---------------- ##</div>
                                          <div>  # list of data.tables</div>
                                          <div>  list.DT <- list(</div>
                                          <div>   
                                            DT1=data.table(Col1=111:115,
                                            Col2=121:125),</div>
                                          <div>   
                                            DT2=data.table(Col1=211:215,
                                            Col2=221:225)</div>
                                          <div>    )</div>
                                          <div><br>
                                          </div>
                                          <div>  # lists of columns to
                                            add</div>
                                          <div>  list.Col3 <-
                                            list(131:135, 231:235)</div>
                                          <div>  list.Col4 <-
                                            list(141:145, 241:245)</div>
                                          <div><br>
                                          </div>
                                          <div><br>
                                          </div>
                                          <div>##
                                            ------------------------------------
                                            ##</div>
                                          <div>##   Iterating over the
                                            list elements   ##</div>
                                          <div>##     adding a new
                                            column              ##</div>
                                          <div>##
                                            ------------------------------------
                                            ##</div>
                                          <div>##   Will issue warning
                                            and             ##</div>
                                          <div>##     will fail to
                                            modify in place     ##</div>
                                          <div>##
                                            ------------------------------------
                                            ##</div>
                                          <div>  mapply (</div>
                                          <div>      function(DT, C3,
                                            C4)</div>
                                          <div>          DT[, c("Col3",
                                            "Col4") := list(C3, C4)],</div>
                                          <div>      </div>
                                          <div>      list.DT,  #
                                            iterating over the list</div>
                                          <div>      list.Col3,
                                            list.Col4,</div>
                                          <div>      SIMPLIFY=FALSE</div>
                                          <div>    ) </div>
                                          <div><br>
                                          </div>
                                          <div>  ## Note the lack of
                                            change</div>
                                          <div>  list.DT</div>
                                          <div><br>
                                          </div>
                                          <div><br>
                                          </div>
                                          <div>##
                                            ------------------------------------
                                            ##</div>
                                          <div>##   Iterating over an
                                            index            ##</div>
                                          <div>##
                                            ------------------------------------
                                            ##</div>
                                          <div>  mapply (</div>
                                          <div>      function(i, C3, C4)</div>
                                          <div>         list.DT[[i]] [,
                                            c("Col3", "Col4") :=
                                            list(C3, C4)],</div>
                                          <div>     </div>
                                          <div>      seq_along(list.DT),   #
                                            iterating over an index to
                                            the list</div>
                                          <div>      list.Col3,
                                            list.Col4,</div>
                                          <div>      SIMPLIFY=FALSE</div>
                                          <div>    )</div>
                                          <div><br>
                                          </div>
                                          <div>  ## Note each DT _has_
                                            been modified</div>
                                          <div>  list.DT</div>
                                          <div><br>
                                          </div>
                                          <div>##
                                            ------------------------------------
                                            ##</div>
                                          <div>##   Iterating over the
                                            list elements   ##</div>
                                          <div>##     modifying existing
                                            column        ##</div>
                                          <div>##
                                            ------------------------------------
                                            ##</div>
                                          <div>##   No warning issued,
                                            but             ##</div>
                                          <div>##     Will fail to
                                            modify in place     ##</div>
                                          <div>##
                                            ------------------------------------
                                            ##</div>
                                          <div>  mapply (</div>
                                          <div>      function(DT, C3,
                                            C4)</div>
                                          <div>         DT[, c("Col3",
                                            "Col4") := list(Col3*1e3,
                                            Col4*1e4)],</div>
                                          <div><br>
                                          </div>
                                          <div>       list.DT,  #
                                            iterating over the list</div>
                                          <div>      list.Col3,
                                            list.Col4,</div>
                                          <div>      SIMPLIFY=FALSE</div>
                                          <div>    ) </div>
                                          <div><br>
                                          </div>
                                          <div>  ## Note the lack of
                                            change (compare with output
                                            from `mapply`)</div>
                                          <div>  list.DT</div>
                                          <div><br>
                                          </div>
                                          <div>##
                                            ------------------------------------
                                            ##</div>
                                          <div>##                      
                                                           ##</div>
                                          <div>##   `lapply` works as
                                            expected.        ##</div>
                                          <div>##                      
                                                           ##</div>
                                          <div>##
                                            ------------------------------------
                                            ##</div>
                                          <div>  </div>
                                          <div>  ## NOW WITH lapply</div>
                                          <div>  lapply(list.DT, </div>
                                          <div>    function(DT)</div>
                                          <div>      DT[, newCol :=
                                            LETTERS[1:5]]</div>
                                          <div>  )</div>
                                          <div><br>
                                          </div>
                                          <div>  ## Note the new
                                            column: </div>
                                          <div>  list.DT</div>
                                          <div><br>
                                          </div>
                                          <div><br>
                                          </div>
                                          <div><br>
                                          </div>
                                          <div>#
                                            ==========================
                                            # </div>
                                          <div><br>
                                          </div>
                                          <div>##   NON-WORKAROUNDS  
                                            ## </div>
                                          <div>##</div>
                                          <div>## I also tried all of
                                            the following alternatives</div>
                                          <div>##   in hopes of being
                                            able to iterate over the
                                            list </div>
                                          <div>##   directly, using
                                            `mapply`.  </div>
                                          <div>## None of these worked. </div>
                                          <div><br>
                                          </div>
                                          <div># (1) Creating the DTs
                                            First, then creating the
                                            list from them</div>
                                          <div>    DT1 <-
                                            data.table(Col1=111:115,
                                            Col2=121:125)</div>
                                          <div>    DT2 <-
                                            data.table(Col1=211:215,
                                            Col2=221:225)</div>
                                          <div><br>
                                          </div>
                                          <div>    list.DT <-
                                            list(DT1=DT1,DT2=DT2 )</div>
                                          <div><br>
                                          </div>
                                          <div><br>
                                          </div>
                                          <div># (2) Same as 1, and
                                            using `copy()` in the call
                                            to `list()`</div>
                                          <div>    list.DT <-
                                            list(DT1=copy(DT1), </div>
                                          <div>                   
                                            DT2=copy(DT2) )</div>
                                          <div><br>
                                          </div>
                                          <div># (3) lapply'ing `copy`
                                            and then iterating over that
                                            list</div>
                                          <div>    list.DT <-
                                            lapply(list.DT, copy)</div>
                                          <div><br>
                                          </div>
                                          <div># (4) Not naming the list
                                            elements</div>
                                          <div>    list.DT <-
                                            list(DT1, DT2)</div>
                                          <div>    # and tried</div>
                                          <div>    list.DT <-
                                            list(copy(DT1), copy(DT2))</div>
                                          <div><br>
                                          </div>
                                          <div>## All of the above still
                                            failed to modify in place</div>
                                          <div>##   (and also issued the
                                            same warning if trying to
                                            add a column)</div>
                                          <div>##    when iterating
                                            using mapply</div>
                                          <div><br>
                                          </div>
                                          <div>  mapply(function(DT, C3,
                                            C4)</div>
                                          <div>    DT[, c("Col3",
                                            "Col4") := list(C3, C4)],</div>
                                          <div>    list.DT, list.Col3,
                                            list.Col4,</div>
                                          <div>    SIMPLIFY=FALSE)</div>
                                          <div><br>
                                          </div>
                                          <div><br>
                                          </div>
                                          <div>#
                                            ==========================
                                            # </div>
                                        </div>
                                        <div><br>
                                        </div>
                                        <br clear="all">
                                        <div>
                                          <div style="color:rgb(34,34,34);font-size:13px;font-family:arial,sans-serif">
                                            <div style="font-size:13px">Ricardo
                                              Saporta</div>
                                            <div style="font-size:13px">
                                              Rutgers University, New
                                              Jersey<br>
                                            </div>
                                            <div style="font-size:13px"><span style="font-size:13px">e: </span><a href="mailto:saporta@rutgers.edu" style="color:rgb(17,85,204);font-size:13px" target="_blank">saporta@rutgers.edu</a></div>

                                            <div><br>
                                            </div>
                                          </div>
                                        </div>
                                      </div>
                                      <br>
                                      <br>
                                    </div>
                                  </div>
                                  <pre>_______________________________________________
datatable-help mailing list
<a href="mailto:datatable-help@lists.r-forge.r-project.org" target="_blank">datatable-help@lists.r-forge.r-project.org</a>
<a href="https://lists.r-forge.r-project.org/cgi-bin/mailman/listinfo/datatable-help" target="_blank">https://lists.r-forge.r-project.org/cgi-bin/mailman/listinfo/datatable-help</a></pre>
                                </blockquote>
                                <br>
                              </div>
                            </blockquote>
                          </div>
                          <br>
                        </div>
                      </div>
                    </blockquote>
                    <br>
                  </div>
                </div>
              </div>
            </blockquote>
          </div>
          <br>
        </div>
      </div>
      <br>
    </blockquote>
    <br>
  </div></div></div>

</blockquote></div><br></div></div>