<html>
<head>
<meta content="text/html; charset=ISO-8859-1"
http-equiv="Content-Type">
</head>
<body bgcolor="#FFFFFF" text="#000000">
<div class="moz-cite-prefix"><br>
Hi,<br>
Basically, adding columns by reference to a data.table when it is a
member of a list of data.tables is really difficult to handle
internally. I had to add a special case internally to get around list()
copying, so that the binding can change inside the list on the
shallow copy when [[ is used. A for loop is the way to add
columns by reference inside a list of data.tables, and that should
work OK using [[. But doing that via lapply and mapply is really
stretching it. Even catching user expectations in this area is
difficult. Ideally we'd catch mapply, yes, but really data.table
likes the tables to be rbindlist()-ed so that operations work on a
single large data.table. Could we add advice to the warning message
not to use mapply or lapply to add columns by reference to a list of
data.tables (use a for loop instead)?<br>
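<br>
A minimal sketch of that for loop, reusing the sample data from the original report quoted below. The rbindlist() step assumes a data.table version whose rbindlist() accepts idcol, and the "id" column name is only illustrative:<br>
<pre>library(data.table)

# Sample data copied from the original report
list.DT = list(
  DT1 = data.table(Col1 = 111:115, Col2 = 121:125),
  DT2 = data.table(Col1 = 211:215, Col2 = 221:225)
)
list.Col3 = list(131:135, 231:235)
list.Col4 = list(141:145, 241:245)

# Add the columns by reference, reaching each table through [[
for (i in seq_along(list.DT)) {
  list.DT[[i]][, c("Col3", "Col4") := list(list.Col3[[i]], list.Col4[[i]])]
}

# Alternative: stack everything into one large data.table and work on that.
# "id" is an illustrative name; idcol is assumed to be available.
big = rbindlist(list.DT, idcol = "id")
</pre>
Reaching each table through list.DT[[i]] means := operates on the object held in the list rather than on a copy made while iterating.<br>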
Matthew<br>
<br>
<br>
On 22/09/13 03:02, Ricardo Saporta wrote:<br>
</div>
<blockquote
cite="mid:CAE7Aa4TStQqej+j=UMQo6NQJs5Z8214bLoCa0oGJnw+RBSOk+Q@mail.gmail.com"
type="cite">
<div dir="ltr">Matthew,
<div><br>
</div>
<div>I did notice the warning, but something doesn't add up: </div>
<div><br>
</div>
<div>If the issue is simply that it is being copied when
created, then wouldn't we expect the same warning to arise when
we try to modify the table using either `mapply` or `lapply`? (The
latter does not produce a warning.) </div>
<div><br>
</div>
<div>If, on the other hand, the issue pertains specifically to
mapply (which I assume it does), then why is it only a problem
when we iterate over the list directly, whereas iterating
indirectly by using an index does not produce any warnings? </div>
<div> </div>
<div class="gmail_extra">
<div>
<div
style="color:rgb(34,34,34);font-family:arial,sans-serif;font-size:13px;background-color:rgb(255,255,255)">
<div style="font-size:13px">While overall this is minor
if one is aware of the issue, I think it might allow
unnoticed bugs to creep into someone's code,
specifically if a user modifies a list of DTs with mapply
without realizing that the modifications are not
being kept. </div>
<div style="font-size:13px"><br>
</div>
<div style="font-size:13px">That being said, I'm not sure
how this could even be addressed if the root is in
mapply, but is it worth trying to address? </div>
<div style="font-size:13px">
<br>
</div>
<div style="font-size:13px">Rick</div>
<div style="font-size:13px"><br>
</div>
</div>
</div>
<br>
<div class="gmail_quote">On Fri, Sep 20, 2013 at 2:18 PM,
Matthew Dowle <span dir="ltr"><<a moz-do-not-send="true"
href="mailto:mdowle@mdowle.plus.com" target="_blank">mdowle@mdowle.plus.com</a>></span>
wrote:<br>
<blockquote class="gmail_quote" style="margin:0 0 0
.8ex;border-left:1px #ccc solid;padding-left:1ex">
<div bgcolor="#FFFFFF" text="#000000">
<div>Does this sentence from the warning help?
<div class="im"><br>
<br>
"Also, in R&lt;v3.1.0, list(DT1,DT2) copied the
entire DT1 and DT2 (R's list() used to copy named
objects); please upgrade to R&gt;=v3.1.0 if that is
biting."<br>
<br>
</div>
<span class="HOEnZb"><font color="#888888"> Matthew</font></span>
<div>
<div class="h5"><br>
<br>
On 20/09/13 19:01, Ricardo Saporta wrote:<br>
</div>
</div>
</div>
<div>
<div class="h5">
<blockquote type="cite">
<div dir="ltr">One warning per DT in the list
<div> (I added the line breaks)
<div>-Rick</div>
<div>=============================================</div>
<div>
<div>Warning messages:</div>
<div><br>
</div>
<div>1: In `[.data.table`(DT, ,
`:=`(c("Col3", "Col4"), list(C3, C4))) :</div>
<div><br>
</div>
<div> Invalid .internal.selfref detected
and fixed by taking a copy of the whole
table so that := can add this new column
by reference. At an earlier point, this
data.table has been copied by R (or been
created manually using structure() or
similar). Avoid key<-, names<- and
attr<- which in R currently (and oddly)
may copy the whole data.table. Use set*
syntax instead to avoid copying: ?set,
?setnames and ?setattr. Also, in
R&lt;v3.1.0, list(DT1,DT2) copied the
entire DT1 and DT2 (R's list() used to
copy named objects); please upgrade to
R>=v3.1.0 if that is biting. If this
message doesn't help, please report to
datatable-help so the root cause can be
fixed.</div>
<div><br>
</div>
<div>2: In `[.data.table`(DT, ,
`:=`(c("Col3", "Col4"), list(C3, C4))) :</div>
<div><br>
</div>
<div> Invalid .internal.selfref detected
and fixed by taking a copy of the whole
table so that := can add this new column
by reference. At an earlier point, this
data.table has been copied by R (or been
created manually using structure() or
similar). Avoid key<-, names<- and
attr<- which in R currently (and oddly)
may copy the whole data.table. Use set*
syntax instead to avoid copying: ?set,
?setnames and ?setattr. Also, in
R&lt;v3.1.0, list(DT1,DT2) copied the
entire DT1 and DT2 (R's list() used to
copy named objects); please upgrade to
R>=v3.1.0 if that is biting. If this
message doesn't help, please report to
datatable-help so the root cause can be
fixed.</div>
</div>
<div>=============================================<br>
</div>
<div><br>
</div>
</div>
<div class="gmail_extra"><br>
<br>
<br>
<div class="gmail_quote">On Fri, Sep 20, 2013
at 12:49 PM, Matthew Dowle <span dir="ltr"><<a
moz-do-not-send="true"
href="mailto:mdowle@mdowle.plus.com"
target="_blank">mdowle@mdowle.plus.com</a>></span>
wrote:<br>
<blockquote class="gmail_quote"
style="margin:0 0 0 .8ex;border-left:1px
#ccc solid;padding-left:1ex">
<div bgcolor="#FFFFFF" text="#000000">
<div><br>
Hi,<br>
<br>
What's the warning?<br>
<br>
Matthew
<div>
<div><br>
<br>
<br>
On 20/09/13 14:48, Ricardo Saporta
wrote:<br>
</div>
</div>
</div>
<blockquote type="cite">
<div>
<div>
<div dir="ltr">
<div>
<div>I've encountered the
following issue iterating
over a list of data.tables. </div>
<div>The issue is only with
mapply, not with lapply.</div>
<div><br>
</div>
<div> </div>
<div>Given a list of
data.tables, mapply'ing
over the list directly </div>
<div>cannot modify in place. </div>
<div><br>
</div>
<div>Also if attempting to add
a new column, we get an
"Invalid .internal.selfref"
warning. </div>
<div>Modifying an existing
column does not issue a
warning, but still fails to
modify in place.</div>
<div><br>
</div>
<div>WORKAROUND: </div>
<div>----------</div>
<div>The workaround is to
iterate over an index to the
list, then to </div>
<div> modify each data.table
via list.of.DTs[[i]][ .. ]</div>
<div><br>
</div>
<div>**Interestingly, this
issue occurs with `mapply`,
but not `lapply`.**</div>
<div><br>
</div>
<div> </div>
<div>EXAMPLE:</div>
<div>-------- </div>
<div> # Given a list of DT's
and two lists of vectors, </div>
<div> # we want to add the
corresponding vectors as
columns to the DT. </div>
<div><br>
</div>
<div>## ---------------- ##</div>
<div>## SAMPLE DATA: ##</div>
<div>## ---------------- ##</div>
<div> # list of data.tables</div>
<div> list.DT <- list(</div>
<div>
DT1=data.table(Col1=111:115,
Col2=121:125),</div>
<div>
DT2=data.table(Col1=211:215,
Col2=221:225)</div>
<div> )</div>
<div><br>
</div>
<div> # lists of columns to
add</div>
<div> list.Col3 <-
list(131:135, 231:235)</div>
<div> list.Col4 <-
list(141:145, 241:245)</div>
<div><br>
</div>
<div><br>
</div>
<div>##
------------------------------------
##</div>
<div>## Iterating over the
list elements ##</div>
<div>## adding a new
column ##</div>
<div>##
------------------------------------
##</div>
<div>## Will issue warning
and ##</div>
<div>## will fail to
modify in place ##</div>
<div>##
------------------------------------
##</div>
<div> mapply (</div>
<div> function(DT, C3,
C4)</div>
<div> DT[, c("Col3",
"Col4") := list(C3, C4)],</div>
<div> </div>
<div> list.DT, #
iterating over the list</div>
<div> list.Col3,
list.Col4,</div>
<div> SIMPLIFY=FALSE</div>
<div> ) </div>
<div><br>
</div>
<div> ## Note the lack of
change</div>
<div> list.DT</div>
<div><br>
</div>
<div><br>
</div>
<div>##
------------------------------------
##</div>
<div>## Iterating over an
index ##</div>
<div>##
------------------------------------
##</div>
<div> mapply (</div>
<div> function(i, C3, C4)</div>
<div> list.DT[[i]] [,
c("Col3", "Col4") :=
list(C3, C4)],</div>
<div> </div>
<div> seq(list.DT), #
iterating over an index to
the list</div>
<div> list.Col3,
list.Col4,</div>
<div> SIMPLIFY=FALSE</div>
<div> )</div>
<div><br>
</div>
<div> ## Note each DT _has_
been modified</div>
<div> list.DT</div>
<div><br>
</div>
<div>##
------------------------------------
##</div>
<div>## Iterating over the
list elements ##</div>
<div>## modifying existing
column ##</div>
<div>##
------------------------------------
##</div>
<div>## No warning issued,
but ##</div>
<div>## Will fail to
modify in place ##</div>
<div>##
------------------------------------
##</div>
<div> mapply (</div>
<div> function(DT, C3,
C4)</div>
<div> DT[, c("Col3",
"Col4") := list(Col3*1e3,
Col4*1e4)],</div>
<div><br>
</div>
<div> list.DT, #
iterating over the list</div>
<div> list.Col3,
list.Col4,</div>
<div> SIMPLIFY=FALSE</div>
<div> ) </div>
<div><br>
</div>
<div> ## Note the lack of
change (compare with output
from `mapply`)</div>
<div> list.DT</div>
<div><br>
</div>
<div>##
------------------------------------
##</div>
<div>##
##</div>
<div>## `lapply` works as
expected. ##</div>
<div>##
##</div>
<div>##
------------------------------------
##</div>
<div> </div>
<div> ## NOW WITH lapply</div>
<div> lapply(list.DT, </div>
<div> function(DT)</div>
<div> DT[, newCol :=
LETTERS[1:5]]</div>
<div> )</div>
<div><br>
</div>
<div> ## Note the new
column: </div>
<div> list.DT</div>
<div><br>
</div>
<div><br>
</div>
<div><br>
</div>
<div>#
==========================
# </div>
<div><br>
</div>
<div>## NON-WORKAROUNDS
## </div>
<div>##</div>
<div>## I also tried all of
the following alternatives</div>
<div>## in hopes of being
able to iterate over the
list </div>
<div>## directly, using
`mapply`. </div>
<div>## None of these worked. </div>
<div><br>
</div>
<div># (1) Creating the DTs
First, then creating the
list from them</div>
<div> DT1 <-
data.table(Col1=111:115,
Col2=121:125)</div>
<div> DT2 <-
data.table(Col1=211:215,
Col2=221:225)</div>
<div><br>
</div>
<div> list.DT <-
list(DT1=DT1,DT2=DT2 )</div>
<div><br>
</div>
<div><br>
</div>
<div># (2) Same as 1, and
using `copy()` in the call
to `list()`</div>
<div> list.DT <-
list(DT1=copy(DT1), </div>
<div>
DT2=copy(DT2) )</div>
<div><br>
</div>
<div># (3) lapply'ing `copy`
and then iterating over that
list</div>
<div> list.DT <-
lapply(list.DT, copy)</div>
<div><br>
</div>
<div># (4) Not naming the list
elements</div>
<div> list.DT <-
list(DT1, DT2)</div>
<div> # and tried</div>
<div> list.DT <-
list(copy(DT1), copy(DT2))</div>
<div><br>
</div>
<div>## All of the above still
failed to modify in place</div>
<div>## (and also issued the
same warning if trying to
add a column)</div>
<div>## when iterating
using mapply</div>
<div><br>
</div>
<div> mapply(function(DT, C3,
C4)</div>
<div> DT[, c("Col3",
"Col4") := list(C3, C4)],</div>
<div> list.DT, list.Col3,
list.Col4,</div>
<div> SIMPLIFY=FALSE)</div>
<div><br>
</div>
<div><br>
</div>
<div>#
==========================
# </div>
</div>
<div><br>
</div>
<br clear="all">
<div>
<div
style="color:rgb(34,34,34);font-size:13px;font-family:arial,sans-serif">
<div style="font-size:13px">Ricardo
Saporta</div>
<div style="font-size:13px">
Rutgers University, New
Jersey<br>
</div>
<div style="font-size:13px"><span
style="font-size:13px">e: </span><a
moz-do-not-send="true"
href="mailto:saporta@rutgers.edu"
style="color:rgb(17,85,204);font-size:13px" target="_blank">saporta@rutgers.edu</a></div>
<div><br>
</div>
</div>
</div>
</div>
<br>
<fieldset></fieldset>
<br>
</div>
</div>
<pre>_______________________________________________
datatable-help mailing list
<a moz-do-not-send="true" href="mailto:datatable-help@lists.r-forge.r-project.org" target="_blank">datatable-help@lists.r-forge.r-project.org</a>
<a moz-do-not-send="true" href="https://lists.r-forge.r-project.org/cgi-bin/mailman/listinfo/datatable-help" target="_blank">https://lists.r-forge.r-project.org/cgi-bin/mailman/listinfo/datatable-help</a></pre>
</blockquote>
<br>
</div>
</blockquote>
</div>
<br>
</div>
</div>
</blockquote>
<br>
</div>
</div>
</div>
</blockquote>
</div>
<br>
</div>
</div>
<br>
</blockquote>
<br>
</body>
</html>