<div dir="ltr">The over-allocation of column pointers gets (silently) lost when a data.table is modified with `<-` in R (for example `DT$new <- value`). That's why we insist on using `:=`, and why we use an external pointer to detect whether the over-allocation has been lost. <div><br></div><div>Once it is lost, `:=` can't add the column in place, as the over-allocation is gone. So we have to take a shallow copy and over-allocate once again. The message should say that the data was "shallow" copied, so it shouldn't affect performance (unless you do this repeatedly). </div><div><br></div><div>But it's best to avoid the message altogether by using data.table the idiomatic way.</div><div><br></div><div>I'm not sure whether the R version matters here.</div><div><br></div><div>HTH</div><div>Arun.</div><div><br></div></div><div class="gmail_extra"><br><div class="gmail_quote">On Sun, Jan 25, 2015 at 12:33 PM, statquant3 <span dir="ltr"><<a href="mailto:statquant@outlook.com" target="_blank">statquant@outlook.com</a>></span> wrote:<br><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">I am experiencing a problem with data.table and can't find out what the<br>
problem is.<br>
I am getting the warning "this data table had to be copied over"<br>
First, I cannot reproduce the bug on a small example, so I realize this post is<br>
mostly a bottle thrown into the sea...<br>
<br>
Here is what I do:<br>
<br>
1.<br>
I have several tables on a remote kdb server that I retrieve through a<br>
proprietary package.<br>
Those tables are copied and some data.frames are created. (I think it has<br>
nothing to do with the problem, but you never know.)<br>
Because I have several tables on that server, I created a function wrapping<br>
the "fetching" of a table: a data.frame is created and<br>
I call setDT on it to make it a data.table.<br>
2.<br>
I call this function within an sapply command and end up with a named list of<br>
data.tables.<br>
3.<br>
I modify some of those tables in an R function, once again using lapply, so I<br>
end up with a modified list of data.tables.<br>
4.<br>
I use attach() to be able to work on each data.table by name<br>
<br>
**5**<br>
Later, when I try to add a column with ":=", I get a warning saying<br>
that the whole table had to be copied.<br>
Am I doing something obviously wrong here?<br>
<br>
<br>
pseudo code:<br>
<br>
fetchingData = function(tableName, connectionToServer){<br>
DF = connectAndFetchTable(connectionToServer, tableName)<br>
DT = setDT(DF)<br>
return(DT)<br>
}<br>
<br>
modifyDataTable = function(DT){<br>
if('thisColName' %in% colnames(DT)){<br>
DT[,thisColName:=someTransformation(thisColName)]<br>
}<br>
...<br>
}<br>
<br>
In the main code:<br>
<br>
myDataTableList = sapply(c('tableA','tableB','tableC','tableD'),<br>
FUN=fetchingData, connectionToServer=myCon)<br>
myDataTableList = lapply(myDataTableList, FUN=modifyDataTable)<br>
attach(myDataTableList)<br>
<br>
tableB[,newColumn:=1L]<br>
*** getting the warning here***<br>
<br>
Note: I am using R 3.0.2.<br>
<br>
Finally, let me say that I have been using data.table for a long time now, so<br>
I know about := and the set function (forgetting them usually triggers<br>
this warning, as when doing DT$new = stuff).<br>
<br>
If any of you understands what is going on, I'd be grateful (sorry I can't<br>
reproduce it properly).<br>
<br>
<br>
<br>
<br>
--<br>
View this message in context: <a href="http://r.789695.n4.nabble.com/whole-data-table-copied-warning-tp4702267.html" target="_blank">http://r.789695.n4.nabble.com/whole-data-table-copied-warning-tp4702267.html</a><br>
Sent from the datatable-help mailing list archive at Nabble.com.<br>
_______________________________________________<br>
datatable-help mailing list<br>
<a href="mailto:datatable-help@lists.r-forge.r-project.org">datatable-help@lists.r-forge.r-project.org</a><br>
<a href="https://lists.r-forge.r-project.org/cgi-bin/mailman/listinfo/datatable-help" target="_blank">https://lists.r-forge.r-project.org/cgi-bin/mailman/listinfo/datatable-help</a><br>
</blockquote></div><br></div>
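<div><br></div><div>To illustrate the over-allocation point from the reply, here is a minimal sketch, assuming the data.table package is installed. The table and column names are made up for illustration and are not taken from the original code. `truelength()` reports how many column slots are allocated and `length()` how many are in use; while spare slots remain, `:=` and `set()` can add columns in place.</div>

```r
library(data.table)

# Hypothetical toy table, standing in for one of the fetched tables.
DT <- data.table(a = 1:3)

# Spare column slots exist when truelength() exceeds length().
has_spare_slots <- truelength(DT) > length(DT)

# Idiomatic in-place updates: columns are added by reference,
# so the table object itself is not reallocated.
addr_before <- address(DT)
DT[, b := a * 2L]
set(DT, j = "c", value = 0L)
same_object <- identical(addr_before, address(DT))

# Calling setDT() on an existing data.table re-establishes the
# over-allocation, e.g. after the table has been through a copy.
setDT(DT)
```

<div>Whether this removes the warning in the list/attach() workflow from the original post is untested here; it only shows what "over-allocated" means and which operations respect it.</div>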