[datatable-help] Is assignment such as DT[, a:=7] supposed to print DT when surrounded by braces?

Steve Lianoglou lianoglou.steve at gene.com
Wed Mar 12 18:47:39 CET 2014


Hi,

On Wed, Mar 12, 2014 at 3:59 AM, Todd A. Johnson <tjohnson at src.riken.jp> wrote:
> I am using data.table Version 1.9.2 with R 3.0.2 on Mac OS 10.6.8.
>
> I've looked through 6 months worth of the mailing list as well as the Bug
> reports and of course the FAQ vignette.  However, while my question seems
> related to FAQ 2.21, that answer seems to say that returning DT when
> assigning DT[i,col:=value] was made invisible in v1.8.3.
>
> My question comes from observing different behavior for assignment by
> reference to a column when a data.table DT is surrounded by braces compared
> to without braces (such as within an if..else statement).
>
> Here's a simple test program:
>
> library(data.table)
> DT <- data.table(a=c(1,2,3), b=c(4,5,6))
> DT[,d:=7]
>
> DT <- data.table(a=c(1,2,3), b=c(4,5,6))
> if( nrow(DT)>0 ){DT[,d:=7]}

I can reproduce what you're seeing, but I don't think it has anything
to do with DT being surrounded by {}. A simple:

    if (nrow(DT) > 0) DT[, d := 7]

will trigger a dump to the console as well.

> So, should the second assignment within the 'if' statement print out DT?

I don't think it should. Note that if the := isn't the last clause in
the expression block, nothing is printed, e.g. this will be silent:

    if (nrow(DT) > 0) {
      DT[, d := 7]
      x <- 1
    }
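
Another way to keep a trailing := quiet, without adding a dummy
statement after it, is to wrap the call in base R's invisible(). This
is only a minimal sketch using the DT from your example, and I have
not checked it against every data.table version:

```r
library(data.table)

DT <- data.table(a = c(1, 2, 3), b = c(4, 5, 6))

# invisible() marks the value as not to be auto-printed, so the
# `if` expression returns DT without echoing it at top level.
if (nrow(DT) > 0) invisible(DT[, d := 7])

# The column was still added by reference.
names(DT)
```

The assignment itself is unchanged; only the auto-printing of the
value returned by the `if` is suppressed.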

>  To
> get rid of this effect in my scripts (which potentially could result in
> printing out tens-of-thousands of rows of data into a log file...),

That wouldn't happen: data.table "dumps" are always trimmed if they
are too long (this is configured by the 'datatable.print.nrows' and
'datatable.print.topn' options).

By default, if the data.table has more than 100 rows, only the top 5
and bottom 5 rows are printed.
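
To see which limits are in effect in your session, you can query the
two options directly. A small sketch (the table `big` and its columns
are made up for illustration):

```r
library(data.table)

# Current print limits (data.table sets defaults when it is loaded).
getOption("datatable.print.nrows")
getOption("datatable.print.topn")

# A table longer than the nrows limit is printed in truncated form:
# only the first and last `topn` rows are shown.
big <- data.table(id = 1:1000, value = rnorm(1000))
print(big)
```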

In fact, as a workaround for you, if you set:

    options(datatable.print.nrows=0)

your "problem" will go away, meaning:

    if (nrow(DT) > 0) DT[, d := 7]

will be silent.

But so will all of your data.table "console dumps". Which is to say,
just typing `DT` would not print anything to the console. You'd now
have to explicitly set the 'nrows' option in a call to `print` to see
your data.table, eg: `print(DT, nrows=100)` so you could explore the
data.table on the console.
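
If you would rather not silence printing for the whole session, one
option is to change the setting only around the noisy part of a script
and restore it afterwards. A rough sketch, relying on the fact that
base R's options() returns the previous values (the variable name
`old` is just for illustration):

```r
library(data.table)

DT <- data.table(a = c(1, 2, 3), b = c(4, 5, 6))

# options() returns the previous values, so they can be restored later.
old <- options(datatable.print.nrows = 0)

# The part of the script that should stay quiet.
if (nrow(DT) > 0) DT[, d := 7]

# Put the previous print limit back.
options(old)

# Explicit print with its own row limit, independent of the option.
print(DT, nrows = 100)
```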

There are people who say you should never dump a data.table or
data.frame to the console, but rather look at str(dt) ... not sure
that I agree with that, but that is another thing to consider if you
hammer datatable.print.nrows to 0.

HTH,
-steve

-- 
Steve Lianoglou
Computational Biologist
Genentech

