[datatable-help] Is assignment such as DT[, a:=7] supposed to print DT when surrounded by braces?

Todd A. Johnson tjohnson at src.riken.jp
Fri Mar 14 12:48:32 CET 2014


Hi Steve,

Thanks for your thorough answer.  I suppose that my problem was that for
some iterations of my script, the last update of DT was to a DT with just
under 100 rows, so the not-so-silent column update then printed those rows
into my log file, making the size of certain log files very different from
the others.   Setting options(datatable.print.nrows=0) at the top of my
script seems like a more elegant way than finding the last DT[,d:=7] update
in a script and surrounding it with 'invisible'. :-)
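
For reference, the two approaches compared above can be sketched roughly as follows. This is only a minimal illustration using the toy table from this thread; the extra column `e` and the variable `old_opts` are made up for the example, and the silencing behaviour of `datatable.print.nrows=0` is assumed to work as Steve describes below.

```r
library(data.table)

DT <- data.table(a = c(1, 2, 3), b = c(4, 5, 6))

# Approach 1: wrap the trailing := in invisible(), so the value of the
# whole if-expression is invisible and nothing is auto-printed.
if (nrow(DT) > 0) invisible(DT[, d := 7])

# Approach 2: switch off data.table printing via the option, run the
# update, then restore whatever the option was set to before.
old_opts <- options(datatable.print.nrows = 0)
if (nrow(DT) > 0) DT[, e := 8]   # 'e' is a hypothetical extra column
options(old_opts)
```

Restoring the saved options afterwards keeps interactive printing of `DT` available later in the session, which a single `options()` call at the top of a script would not.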


Todd


On 3/13/14 2:47 AM, "Steve Lianoglou" <lianoglou.steve at gene.com> wrote:

> Hi,
> 
> On Wed, Mar 12, 2014 at 3:59 AM, Todd A. Johnson <tjohnson at src.riken.jp>
> wrote:
>> I am using data.table Version 1.9.2 with R 3.0.2 on Mac OS 10.6.8.
>> 
>> I've looked through 6 months' worth of the mailing list as well as the Bug
>> reports and of course the FAQ vignette.  However, while my question seems
>> related to FAQ 2.21, that answer seems to say that returning DT when
>> assigning DT[i,col:=value] was made invisible in v1.8.3.
>> 
>> My question comes from observing different behavior for assignment by
>> reference to a column when a data.table DT is surrounded by braces compared
>> to without braces (such as within an if..else statement).
>> 
>> Here's a simple test program:
>> 
>> library(data.table)
>> DT <- data.table(a=c(1,2,3), b=c(4,5,6))
>> DT[,d:=7]
>> 
>> DT <- data.table(a=c(1,2,3), b=c(4,5,6))
>> if( nrow(DT)>0 ){DT[,d:=7]}
> 
> I can reproduce what you're seeing, but I don't think it has anything
> to do with DT being surrounded by {}, a simple:
> 
>     if (nrow(DT) > 0) DT[, d := 7]
> 
> will trigger a dump to the console as well.
> 
>> So, should the second assignment within the 'if' statement print out DT?
> 
> I don't think it should. Note that if the := isn't the last clause in
> the expression block, nothing is printed, eg. this will be silent:
> 
>     if (nrow(DT) > 0) {
>       DT[, d := 7]
>       x <- 1
>     }
> 
>>  To
>> get rid of this effect in my scripts (which potentially could result in
>> printing out tens-of-thousands of rows of data into a log file...),
> 
> That wouldn't happen: data.table "dumps" are always trimmed if they
> are too long (this is configured by the 'datatable.print.nrows' and
> 'datatable.print.topn' options).
> 
> By default, if the data.table has more than 100 rows, only the
> top 5 and bottom 5 rows are printed.
> 
> In fact, as a workaround for you, if you set:
> 
>     options(datatable.print.nrows=0)
> 
> Your "problem" will now go away, meaning:
> 
>     if (nrow(DT) > 0) DT[, d := 7]
> 
> will be silent.
> 
> But so will all of your data.table "console dumps". Which is to say,
> just typing `DT` would not print anything to the console. You'd now
> have to explicitly set the 'nrows' option in a call to `print` to see
> your data.table, eg: `print(DT, nrows=100)` so you could explore the
> data.table on the console.
> 
> There are people who say you should never dump a data.table or
> data.frame to the console, but rather look at str(dt) ... not sure
> that I agree with that, but that is another thing to consider if you
> hammer datatable.print.nrows to 0.
> 
> HTH,
> -steve

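As a rough illustration of the printing behaviour described in the quoted reply, the sketch below compares trimmed, silenced, and explicit printing. The 200-row table `big` and the variable names are invented for the example, and the default values (nrows 100, topn 5) are passed explicitly rather than relied upon.

```r
library(data.table)

# A hypothetical table with more rows than the 100-row print threshold.
big <- data.table(id = 1:200, value = (1:200) * 2)

# With the thresholds described in the thread, only head and tail rows appear.
trimmed <- capture.output(print(big, nrows = 100, topn = 5))

# With the option set to 0, a plain print() is expected to emit nothing,
# while an explicit nrows argument still shows the table.
old_opts <- options(datatable.print.nrows = 0)
silent   <- capture.output(print(big))
explicit <- capture.output(print(big, nrows = 300))
options(old_opts)
```

Comparing `length(trimmed)`, `length(silent)`, and `length(explicit)` gives a quick check of how many lines each call would have written to a log file.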


