<div dir="ltr">For what it's worth, I use the with=FALSE version frequently without knowing how many columns I have selected, so I like the implicit wrapping of the columns in a list() (or implicit drop=FALSE). An example (almost) from something I did yesterday:<div>
<br></div><div><font face="courier new, monospace">mycols <- grep("^Vbar",names(DT),value=TRUE)</font></div><div><font face="courier new, monospace">DT1 <- DT[,mycols,with=FALSE]</font></div><div><font face="arial, helvetica, sans-serif"><br>
</font></div><div><font face="arial, helvetica, sans-serif">-- Frank</font></div></div><div class="gmail_extra"><br><br><div class="gmail_quote">On Thu, Nov 14, 2013 at 11:59 AM, Eduard Antonyan <span dir="ltr"><<a href="mailto:eduard.antonyan@gmail.com" target="_blank">eduard.antonyan@gmail.com</a>></span> wrote:<br>
<blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><div dir="ltr"><div>Perhaps a simple sentence along the lines of "the drop argument is absent and should be considered FALSE when comparing with data.frame in with=FALSE mode" would suffice. The fact that the i-expression is a full-on data.table i-expression in with=FALSE mode will probably also cause inconsistencies.</div>
</div><div class="HOEnZb"><div class="h5"><div class="gmail_extra"><br><br><div class="gmail_quote">On Thu, Nov 14, 2013 at 10:47 AM, Arunkumar Srinivasan <span dir="ltr"><<a href="mailto:aragorn168b@gmail.com" target="_blank">aragorn168b@gmail.com</a>></span> wrote:<br>
<blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">
<div>
I'll try to make a list of places where data.table operations differ from data.frame operations.
</div>
<div><div><br></div><div>Arun</div><div><br></div></div><div><div>
<p style="color:#a0a0a8">On Thursday, November 14, 2013 at 5:46 PM, Arunkumar Srinivasan wrote:</p>
<blockquote type="cite" style="border-left-style:solid;border-width:1px;margin-left:0px;padding-left:10px">
<span><div><div>
<div>Glad that we agree on improving the documentation. However, I don't find it a sound argument that we deviate from data.frame because the design is bad, *when we inherit from data.frame*. The choice is already made! Too many such trivial inconsistencies pile up pretty quickly and could result in a steep learning curve, as there is a different set of rules to be memorised.
</div><div><br></div><div>If it can't be avoided that we "inherit from data.frame" *but* this, this, this... and many other things are different, then those differences should be *very clearly* documented (at the beginning, maybe as a cheat sheet) so that people aren't confused.</div>
<div><br></div>
<div><div><br></div><div>Arun</div><div><br></div></div>
<p style="color:#a0a0a8">On Thursday, November 14, 2013 at 5:39 PM, Eduard Antonyan wrote:</p><blockquote type="cite"><div>
<span><div><div><div dir="ltr">I agree that it's inconsistent with data.frame, and imo that's a good thing. We don't replicate the drop argument, so if a single column with with=FALSE returned a vector, there would be no way to get a data.table back. Either way, drop=TRUE by default is a bad design choice in data.frame and matrix (one that is unlikely to change given R-core's attitude towards that type of thing).<div>
<div><br></div><div>I'm always pro more and better documentation :)</div></div></div><div><br><br><div>On Thu, Nov 14, 2013 at 10:33 AM, Arunkumar Srinivasan <span dir="ltr"><<a href="mailto:aragorn168b@gmail.com" target="_blank">aragorn168b@gmail.com</a>></span> wrote:<br>
<blockquote type="cite"><div>
<div>
Eddi, at the least, I think the documentation needs to be clearer on the use of "with=FALSE". It does feel inconsistent with the fact that "j" with a single column should return a vector. In data.frames, where "j" holds column names, a single column name returns a vector unless drop = FALSE. That is, DF[, "y"] will return a vector while DF[, c("x", "y")] will return a data.frame. So, it is inconsistent with data.frame here, I think.
</div><div><br></div>
<div><div><br></div><div>Arun</div><div><br></div></div><div><div>
<p style="color:#a0a0a8">On Thursday, November 14, 2013 at 5:25 PM, Eduard Antonyan wrote:</p><blockquote type="cite"><div>
<span><div><div><div dir="ltr">DT[, y] returning a vector is, I think, the only correct behavior, given the understanding of the j-expression as something evaluated in the DT environment. If users want a data.table, they should simply use DT[, list(y)] or DT[, data.table(y)].<div>
<br></div><div>I haven't thought about DT[, "y", with = FALSE] before, as I pretty much never use that form, but I see an argument for it staying as is: "y" and c("y") are the same, and we all presumably agree that DT[, c("y", "z"), with = FALSE] should return a data.table. If DT[, c("y"), with = FALSE] returned a different type, that would mean inconsistent return types, which makes life much harder for users (as evidenced by the periodic drop=FALSE questions that come up on SO).</div>
<div><br></div><div>Going back to DT[, y], note that y and list(y) actually produce *different* results (in e.g. base_env), so there is no type consistency issue there between DT[, y] and DT[, list(y, z)].</div></div><div>
<br><br><div>On Thu, Nov 14, 2013 at 6:09 AM, Arunkumar Srinivasan <span dir="ltr"><<a href="mailto:aragorn168b@gmail.com" target="_blank">aragorn168b@gmail.com</a>></span> wrote:<br><blockquote type="cite"><div>
<div>
Hi everybody,
</div><div><br></div><div>It'd be nice if you could weigh in on the bug report filed by Bill here: </div><div><a href="https://r-forge.r-project.org/tracker/index.php?func=detail&aid=5100&group_id=240&atid=975" target="_blank">https://r-forge.r-project.org/tracker/index.php?func=detail&aid=5100&group_id=240&atid=975</a></div>
<div><br></div><div>The gist of it is:</div><div><br></div><div>require(data.table)</div><div>DT <- data.table(x=1:5, y=6:10, z=11:15)</div><div>DT[, y] # returns a vector</div><div>DT[, "y", with=FALSE] # returns a data.table</div>
<div><br></div><div>The question from the bug report basically is: "why is it that in the first case, 'j' has only one column and we get a vector, but in the second case, we get a data.table?"</div><div><br>
</div><div>My question is: Is this behaviour okay, or would you prefer that the first one return a data.table as well, or that the second one (with "with=FALSE") return a vector?</div>
<div><div><br></div><div>Thank you,</div><div>Arun</div><div><br></div></div>
<br>_______________________________________________<br>
datatable-help mailing list<br>
<a href="mailto:datatable-help@lists.r-forge.r-project.org" target="_blank">datatable-help@lists.r-forge.r-project.org</a><br>
<a href="https://lists.r-forge.r-project.org/cgi-bin/mailman/listinfo/datatable-help" target="_blank">https://lists.r-forge.r-project.org/cgi-bin/mailman/listinfo/datatable-help</a><br></div></blockquote></div><br></div>
</div></div></span>
</div></blockquote><div>
<br>
</div>
</div></div></div></blockquote></div><br></div>
</div></div></span>
</div></blockquote><div>
<br>
</div>
</div></div></span>
</blockquote>
<div>
<br>
</div>
</div></div></blockquote></div><br></div>
</div></div><br></blockquote></div><br></div>