[datatable-help] internal FALSE/TRUE value has been modified

Matt Dowle mdowle at mdowle.plus.com
Fri Jun 6 02:40:52 CEST 2014


Now fixed in v1.9.3:

o  The warning "internal TRUE value has been modified" with recently
   released R 3.1 when grouping a table containing a logical column and
   where all groups are just 1 row is now fixed and tests added. Thanks
   to James Sams for the reproducible example. The warning is issued by
   R and we have asked if it can be upgraded to error.
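
For anyone wanting to check their own installation, here is a minimal sketch of the failing shape: a table with a logical column where every group is exactly 1 row. The table `dt` and its column names are made up for illustration and this is not the test that was added to the package; it only assumes data.table is installed.

```r
library(data.table)

# Hypothetical small table: every (id, ver) group has exactly one row and
# there is a logical column, the shape that triggered the warning.
dt <- data.table(id   = 1:6,
                 ver  = rep(c(1, 2), times = 3),
                 flag = rep(c(TRUE, FALSE), each = 3))

# Confirm that the largest group is a single row, as in the report.
max_group_size <- dt[, .N, by = list(id, ver)][, max(N)]

# Promote any warning from the grouping to an error so that a modified
# TRUE/FALSE value cannot go unnoticed.
res <- tryCatch(
  dt[, list(flag), keyby = list(id, ver)],
  warning = function(w) stop("grouping warned: ", conditionMessage(w))
)

# Sanity check on R's logical constants after the grouping.
constants_ok <- identical(as.integer(TRUE), 1L) &&
                identical(as.integer(FALSE), 0L)
```

If the grouping raises an error here, or `constants_ok` is FALSE afterwards, restart R before doing anything else, since the session can no longer be trusted.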

Matt


On 01/05/14 16:29, Matt Dowle wrote:
>
> Reproduced, thanks for nice example. Not sure yet but what R 3.1 now 
> does is store length 1 logical vectors once only, globally, for 
> efficiency to avoid many new allocations for the common case of single 
> TRUE or FALSE values passed around at C or R level (a nice and welcome 
> change).  Since data.table modifies vectors by reference, a new kind of
> data.table bug as of R 3.1 is that modifying a length 1 logical vector
> in place can modify R's internal value of TRUE or FALSE itself.
> Clearly a serious bug. The test suite
> immediately broke the day after the R-devel change was made (good) and 
> was one reason data.table was in error state in CRAN checks for quite 
> a while before R 3.1 shipped.  It was typically tests of 1-row
> data.tables including a logical column and modifying that logical
> column that broke. We fixed that and put in checks to detect and warn
> if R's internal value has been modified, just in case.  Those
> changes were in v1.9.2 on CRAN.  I think I wasn't 100% confident in
> the detection test (false positives) so made it a warning instead of 
> an error.  Now that R 3.1 is out and we haven't had any false 
> positives, it should be an error.
>
> The feature of this upc_table is that all the groups are size 1 :
>
> > upc_table[, .N, by=list(upc, upc_ver_uc)][,max(N)]
> [1] 1
>
> If we change the example so that one group has more than 1 row, it 
> works ok :
>
> > upc_table = data.table(upc=c(1:99998,1,1), upc_ver_uc=rep(c(1,2), 
> times=50000), is_PL=rep(c(T, F, F, T), each=25000), 
> product_module_code=rep(1:4, times=25000), ignore.column=2:100001)
> > upc_table[, .N, by=list(upc, upc_ver_uc)][,max(N)]
> [1] 2
> > upc = upc_table[, list(is_PL, product_module_code), keyby=list(upc, 
> upc_ver_uc)]
>
> So it seems the problem is in the single allocation of working memory
> for the largest group when that group is just 1 row and contains a
> logical column.  Odd, I would have sworn we caught that! Will fix.
>
> R-devel are planning to do more of this small-object-sharing for 
> common single integer values e.g. 0-10,  so we'll need to add more 
> tests accordingly.
>
> Thanks,
> Matt
>
>
>
> On 01/05/14 05:40, James Sams wrote:
>> I don't really know what this error message means. A quick example to 
>> show what I'm seeing:
>>
>> > library(data.table)
>> data.table 1.9.3  For help type: help("data.table")
>> > upc_table = data.table(upc=1:100000, upc_ver_uc=rep(c(1,2), 
>> times=50000), is_PL=rep(c(T, F, F, T), each=25000), 
>> product_module_code=rep(1:4, times=25000), ignore.column=2:100001)
>> > upc = upc_table[, list(is_PL, product_module_code), keyby=list(upc, 
>> upc_ver_uc)]
>> Warning message:
>> In `[.data.table`(upc_table, , list(is_PL, product_module_code), :
>>   internal TRUE value has been modified
>>
>> When I continue using R, I eventually start getting more errors, such 
>> as:
>>
>> Error in gettext(domain, unlist(args)) : invalid 'string' value
>> Error during wrapup: invalid 'string' value
>>
>> and then terminal input/output becomes corrupted. I only start 
>> getting these error messages once I start using data.table; but the 
>> messages don't necessarily occur only with data.table functions.
>>
>> I don't know if the last statement above is executing correctly or 
>> not. I'm rather confused as to what is going on. I was using a 
>> somewhat stale (maybe a couple of weeks old) svn version of 
>> data.table; but I see the same behavior with the latest data.table 
>> (r1263). I'm using CRAN's R 3.1 package for Ubuntu on 13.10 and 14.04.
>>
>>
>>
>> > sessionInfo()
>> R version 3.1.0 (2014-04-10)
>> Platform: x86_64-pc-linux-gnu (64-bit)
>>
>> locale:
>>  [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C
>>      LC_TIME=en_US.UTF-8        LC_COLLATE=en_US.UTF-8
>>      LC_MONETARY=en_US.UTF-8    LC_MESSAGES=en_US.UTF-8
>>  [7] LC_PAPER=en_US.UTF-8       LC_NAME=C
>>      LC_ADDRESS=C               LC_TELEPHONE=C
>>      LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C
>>
>> attached base packages:
>> [1] stats     graphics  grDevices utils     datasets  methods base
>>
>> other attached packages:
>> [1] data.table_1.9.3
>>
>> loaded via a namespace (and not attached):
>> [1] plyr_1.8.1    Rcpp_0.11.1   reshape2_1.4  stringr_0.6.2
>>
>