[datatable-help] Is assignment such as DT[, a:=7] supposed to print DT when surrounded by braces?

Matt Dowle mdowle at mdowle.plus.com
Fri Mar 14 15:03:03 CET 2014


Interesting.  This happens because DT[,d:=7] returns DT itself.  That's
what lets compound statements work, e.g.

     DT[is.na(d),d:=0][,sum(a),by=d]
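
As a minimal sketch of why that chaining works (the toy values below are
made up for illustration, not taken from this thread): the first [ ] fills
in the NAs by reference and hands back DT, and the second [ ] then
aggregates on it.

```r
library(data.table)

# Toy data, assumed for illustration only
DT <- data.table(a = c(1, 2, 3), d = c(NA, 5, NA))

# First [ ]: replace NA in d with 0, by reference; the result is DT itself.
# Second [ ]: sum a within each value of d.
res <- DT[is.na(d), d := 0][, sum(a), by = d]
```

res then holds one row per distinct value of d, and DT keeps the updated
d column.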

If DT[,d:=7] is the last line of a function or the last line inside
braces, then R prints that result.  It's not DT[,:=] printing, per se.
You don't have to wrap the assignment in invisible();  another common
option is to end the function with invisible() on its own, with no
arguments.
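
A rough sketch of both options, reusing the DT from the test program
quoted below.  The helper name add_e and the column e are hypothetical,
introduced only for this illustration.

```r
library(data.table)

DT <- data.table(a = c(1, 2, 3), b = c(4, 5, 6))

# Option 1: wrap the assignment itself so its value is not auto-printed
if (nrow(DT) > 0) invisible(DT[, d := 7])

# Option 2: hypothetical helper whose last line is a bare invisible()
add_e <- function(dt) {
  dt[, e := 8]   # updates the caller's data.table by reference
  invisible()    # returns NULL invisibly, so nothing is auto-printed
}
add_e(DT)
```

Either way DT ends up with the new columns;  only the return value of
the last expression changes.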

I'll take a look to see if we can suppress printing when DT[,:=] is the
return value.  Could you file an item on the tracker, please?  It's a
new one, so I haven't considered it before.

Matt

On 14/03/14 11:48, Todd A. Johnson wrote:
> Hi Steve,
>
> Thanks for your thorough answer.  I suppose that my problem was that for
> some iterations of my script, the last update of DT was to a DT with just
> under 100 rows, so the not-so-silent column update then printed those rows
> into my log file, making the size of certain log files very different from
> the others.   Setting options(datatable.print.nrows=0) at the top of my
> script seems like a more elegant way than finding the last DT[,d:=7] update
> in a script and surrounding it with 'invisible'. :-)
>
>
> Todd
>
>
> On 3/13/14 2:47 AM, "Steve Lianoglou" <lianoglou.steve at gene.com> wrote:
>
>> Hi,
>>
>> On Wed, Mar 12, 2014 at 3:59 AM, Todd A. Johnson <tjohnson at src.riken.jp>
>> wrote:
>>> I am using data.table Version 1.9.2 with R 3.0.2 on Mac OS 10.6.8.
>>>
>>> I've looked through 6 months' worth of the mailing list as well as the Bug
>>> reports and of course the FAQ vignette.  However, while my question seems
>>> related to FAQ 2.21, that answer seems to say that returning DT when
>>> assigning DT[i,col:=value] was made invisible in v1.8.3.
>>>
>>> My question comes from observing different behavior for assignment by
>>> reference to a column when a data.table DT is surrounded by braces compared
>>> to without braces (such as within an if..else statement).
>>>
>>> Here's a simple test program:
>>>
>>> library(data.table)
>>> DT <- data.table(a=c(1,2,3), b=c(4,5,6))
>>> DT[,d:=7]
>>>
>>> DT <- data.table(a=c(1,2,3), b=c(4,5,6))
>>> if( nrow(DT)>0 ){DT[,d:=7]}
>> I can reproduce what you're seeing, but I don't think it has anything
>> to do with DT being surrounded by {}, a simple:
>>
>>      if (nrow(DT) > 0) DT[, d := 7]
>>
>> will trigger a dump to the console as well
>>
>>> So, should the second assignment within the 'if' statement print out DT?
>> I don't think it should. Note that if the := isn't the last clause in
>> the expression block, nothing is printed, eg. this will be silent:
>>
>>      if (nrow(DT) > 0) {
>>        DT[, d := 7]
>>        x <- 1
>>      }
>>
>>>   To
>>> get rid of this effect in my scripts (which potentially could result in
>>> printing out tens-of-thousands of rows of data into a log file...),
>> That wouldn't happen, data.table "dumps" are always trimmed if they
>> are too long (this is configured by the 'datatable.print.nrows' and
>> 'datatable.print.topn' options).
>>
>> By default, if the data.table is > 100 rows, you will only print the
>> top 5 and bottom 5 rows.
>>
>> In fact, as a workaround for you, if you set:
>>
>>      options(datatable.print.nrows=0)
>>
>> Your "problem" will now go away, meaning:
>>
>>      if (nrow(DT) > 0) DT[, d := 7]
>>
>> will be silent
>>
>> But so will all of your data.table "console dumps". Which is to say,
>> just typing `DT` would not print anything to the console. You'd now
>> have to explicitly set the 'nrows' option in a call to `print` to see
>> your data.table, eg: `print(DT, nrows=100)` so you could explore the
>> data.table on the console.
>>
>> There are people who say you should never dump a data.table or
>> data.frame to the console, but rather look at str(dt) ... not sure
>> that I agree with that, but that is another thing to consider if you
>> hammer datatable.print.nrows to 0.
>>
>> HTH,
>> -steve