[datatable-help] Is assignment such as DT[, a:=7] supposed to print DT when surrounded by braces?
Steve Lianoglou
lianoglou.steve at gene.com
Wed Mar 12 18:47:39 CET 2014
Hi,
On Wed, Mar 12, 2014 at 3:59 AM, Todd A. Johnson <tjohnson at src.riken.jp> wrote:
> I am using data.table Version 1.9.2 with R 3.0.2 on Mac OS 10.6.8.
>
> I've looked through 6 months worth of the mailing list as well as the Bug
> reports and of course the FAQ vignette. However, while my question seems
> related to FAQ 2.21, that answer seems to say that returning DT when
> assigning DT[i,col:=value] was made invisible in v1.8.3.
>
> My question comes from observing different behavior for assignment by
> reference to a column when a data.table DT is surrounded by braces compared
> to without braces (such as within an if..else statement).
>
> Here's a simple test program:
>
> library(data.table)
> DT <- data.table(a=c(1,2,3), b=c(4,5,6))
> DT[,d:=7]
>
> DT <- data.table(a=c(1,2,3), b=c(4,5,6))
> if( nrow(DT)>0 ){DT[,d:=7]}
I can reproduce what you're seeing, but I don't think it has anything
to do with DT being surrounded by {}. A simple:
  if (nrow(DT) > 0) DT[, d := 7]
will trigger a dump to the console as well.
> So, should the second assignment within the 'if' statement print out DT?
I don't think it should. Note that if the := isn't the last clause in
the expression block, nothing is printed, e.g. this will be silent:
  if (nrow(DT) > 0) {
    DT[, d := 7]
    x <- 1
  }
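If you would rather keep the := as the last expression, wrapping it in
invisible() should also keep things quiet. This is only a minimal sketch
using the DT from your test program; it relies on base R's visibility
rules (an `if` passes on the visibility of its last evaluated expression),
and I have not checked it against every data.table version:

```r
library(data.table)

DT <- data.table(a = c(1, 2, 3), b = c(4, 5, 6))

# invisible() marks the value of the `if` expression as not to be
# auto-printed, so nothing is dumped to the console at top level.
if (nrow(DT) > 0) invisible(DT[, d := 7])

# The column is still added by reference.
names(DT)
```

The extra `x <- 1` trick above works for the same reason: the value of an
assignment with <- is invisible.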
> To
> get rid of this effect in my scripts (which potentially could result in
> printing out tens-of-thousands of rows of data into a log file...),
That wouldn't happen: data.table "dumps" are always trimmed if they
are too long (this is configured by the 'datatable.print.nrows' and
'datatable.print.topn' options).
By default, if the data.table has more than 100 rows, only the top 5
and bottom 5 rows are printed.
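To see the trimming for yourself, something like the following should do.
It is a rough sketch: the table `big` and its columns are made up for
illustration, and the exact number of printed lines may differ between
data.table versions, so I only check that it is much smaller than the
number of rows:

```r
library(data.table)

# A hypothetical table with far more rows than the print threshold.
big <- data.table(id = 1:1000, value = rnorm(1000))

# Current values of the two options that control trimming.
getOption("datatable.print.nrows")
getOption("datatable.print.topn")

# Capture what print() would write; the output is trimmed, not 1000 lines.
out <- capture.output(print(big))
length(out)

# topn can also be passed per call, here showing 2 rows from each end.
print(big, topn = 2)
```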
In fact, as a workaround for you, if you set:
  options(datatable.print.nrows=0)
your "problem" will go away, meaning that:
  if (nrow(DT) > 0) DT[, d := 7]
will be silent.
But so will all of your data.table "console dumps": just typing `DT`
would no longer print anything to the console. You would have to pass
'nrows' explicitly in a call to `print` to see your data.table, e.g.
`print(DT, nrows=100)`, so you could still explore the data.table on
the console.
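If you only want the silence for part of a script, you could save the old
option value and put it back afterwards. A small sketch, assuming that
print.data.table returns without output when nrows is 0 or less, as
described above (the variable names `old`, `silent`, `shown` and
`restored` are just for illustration):

```r
library(data.table)

DT <- data.table(a = c(1, 2, 3), b = c(4, 5, 6))

# options() returns the previous values, so they can be restored later.
old <- options(datatable.print.nrows = 0)

# With the option at 0, a plain print is expected to produce no output.
silent <- capture.output(print(DT))

# Passing nrows explicitly overrides the option for this one call.
shown <- capture.output(print(DT, nrows = 100))

# Restore the previous setting; printing behaves normally again.
options(old)
restored <- capture.output(print(DT))
```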
There are people who say you should never dump a data.table or
data.frame to the console, but rather look at str(dt) ... not sure
that I agree with that, but that is another thing to consider if you
hammer datatable.print.nrows to 0.
HTH,
-steve
--
Steve Lianoglou
Computational Biologist
Genentech