[Rcpp-devel] Possible regression in R-3.2.3 or Rcpp 0.12.3
Kevin Ushey
kevinushey at gmail.com
Sat Jan 30 22:58:19 CET 2016
And I now have a MRE:
#include <Rcpp.h>
using namespace Rcpp;
// [[Rcpp::export]]
IntegerVector ouch(IntegerVector x, IntegerVector y) {
IntegerVector result;
result = x[y - 1];
return result;
}
/*** R
x <- 1:1E6
y <- 1:1E6
gctorture(TRUE)
x <- ouch(x, y)
gctorture(FALSE)
*/
This fails pretty reliably for me. Perhaps we can add this to our unit tests.
On Sat, Jan 30, 2016 at 1:44 PM, Kevin Ushey <kevinushey at gmail.com> wrote:
> I think I know what's going on now. Effectively, we're 'evaluating'
> the sugar proxy vector, getting a pointer to a (temporary,
> unprotected) R data structure, and then storing that pointer. Later,
> when we attempt to allocate the output vector (with `no_init`) the GC
> runs and cleans up our pointer.
>
> See:
>
> https://github.com/RcppCore/Rcpp/blob/master/inst/include/Rcpp/vector/Subsetter.h#L147
> https://github.com/RcppCore/Rcpp/blob/master/inst/include/Rcpp/vector/Subsetter.h#L197
>
> The 'best' fix here is to only re-use the indices for an actual
> IntegerVector, and generate the indices from sugar proxies, but I'm
> not exactly sure how to best handle that. It's also possible that if
> we just allocated the output vector earlier (ie, before getting a
> pointer to our subset proxy), that we'd avoid the R gc cleaning up our
> vector.
>
> FWIW, we should be able to reproduce this reliably when running under
> `gctorture()`...
>
> Kevin
>
> On Sat, Jan 30, 2016 at 7:07 AM, Qiang Kou <qkou at umail.iu.edu> wrote:
>> Another thing confused me is the segfault only happens when the
>> IntegerVector is significantly long.
>>
>> Best,
>>
>> KK
>>
>> On Sat, Jan 30, 2016 at 2:57 AM, Kevin Ushey <kevinushey at gmail.com> wrote:
>>>
>>> Hmm, I have one thought: we try to re-use the indices from an
>>> IntegerVector, but the type here is actually a sugar type:
>>>
>>> Rcpp::SubsetProxy<13, Rcpp::PreserveStorage, 13, true,
>>> Rcpp::sugar::Minus_Vector_Primitive<13, true, Rcpp::Vector<13,
>>> Rcpp::PreserveStorage> > >::get_vec (this=<optimized out>)
>>>
>>> Ie, we access the indices with `INTEGER(x)`, but perhaps those indices
>>> have not actually been properly materialized when we attempt to
>>> perform the subset?
>>>
>>> But then, a simple test case with `x[y - 1]` with `x` and `y` both
>>> being integer vectors seems to work just fine. So I am a bit confused.
>>>
>>> On Fri, Jan 29, 2016 at 3:16 PM, Dirk Eddelbuettel <edd at debian.org> wrote:
>>> >
>>> > On 29 January 2016 at 17:55, Qiang Kou wrote:
>>> > | Hi, Paul, can you try my fork of Rcpp? You can install it by the line
>>> > below:
>>> > |
>>> > | devtools::install_github("thirdwing/Rcpp", ref = "subsetter")
>>> > |
>>> > | This fixed the segfault on my Ubuntu machine.
>>> >
>>> > Yay. Nice work!
>>> >
>>> > | The difference can be found from [1].
>>> >
>>> > Nice and concise.
>>> >
>>> > | In subsetter, if an IntegerVector passed in, we will try to reuse it.
>>> > This led
>>> > | to a segfault in this case, which I don't know why.
>>> > |
>>> > | Dirk and Kevin, do you have any thoughts on it?
>>> >
>>> > Not really, but happy to give this the full reverse-dependency check
>>> > treatment so that we can merge it.
>>> >
>>> > Dirk
>>> >
>>> >
>>> > | Best wishes,
>>> > |
>>> > | KK
>>> > |
>>> > | [1] https://github.com/thirdwing/Rcpp/commit/
>>> > | 216c5220bcb84778a656b3496d0f1803b973ef61
>>> > |
>>> > |
>>> > | On Fri, Jan 29, 2016 at 3:00 PM, Qiang Kou <qkou at umail.iu.edu> wrote:
>>> > |
>>> > |
>>> > | Hi, Kevin, I was also trying to track this down yesterday.
>>> > |
>>> > | From the debugging info below, indices_n is not equal to length of
>>> > indices,
>>> > | which I don't quite understand.
>>> > |
>>> > | Program received signal SIGSEGV, Segmentation fault.
>>> > |
>>> > | 0x00007ffff2ed5c4e in Rcpp::SubsetProxy<13, Rcpp::PreserveStorage,
>>> > 13,
>>> > | true, Rcpp::sugar::Minus_Vector_Primitive<13, true,
>>> > Rcpp::Vector<13,
>>> > | Rcpp::PreserveStorage> > >::get_vec
>>> > (this=this at entry=0x7fffffff79a0)
>>> > |
>>> > | at /usr/local/lib/R/site-library/Rcpp/include/Rcpp/vector/
>>> > | Subsetter.h:200
>>> > |
>>> > | 199 output[i] = lhs[ indices[i] ];
>>> > |
>>> > | (gdb) p i
>>> > |
>>> > | $1 = 33622
>>> > |
>>> > | (gdb) p indices[i]
>>> > |
>>> > | Cannot access memory at address 0x34c6e000
>>> > |
>>> > | (gdb) p indices_n
>>> > |
>>> > | $2 = 9594546
>>> > |
>>> > |
>>> > | On Fri, Jan 29, 2016 at 2:29 PM, Dirk Eddelbuettel
>>> > <edd at debian.org> wrote:
>>> > |
>>> > |
>>> > | On 29 January 2016 at 11:27, Kevin Ushey wrote:
>>> > | | When I add some debug printing to the associated
>>> > subscripting line
>>> > | | (https://github.com/awalker89/openxlsx/blob/
>>> > |
>>> > b92bb3acdd6ea759be928c298c6faeef2f26fa3e/src/cppFunctions.cpp#L2608),
>>> > | | I see:
>>> > | |
>>> > | | colNumbers.size(): 98,03,150
>>> > | | charCols.size(): 95,94,546
>>> > | |
>>> > | | It looks to me like the package is erroneously attempting to
>>> > subset
>>> > | | vectors of different sizes, causing out-of-bounds reads.
>>> > |
>>> > | Nice work.
>>> > |
>>> > | | Unfortunately, Rcpp is not detecting or warning about
>>> > this...
>>> > | |
>>> > | | Either way, I believe this is a bug in the openxlsx package,
>>> > but Rcpp
>>> > | | should be checking / reporting this.
>>> > |
>>> > | With (Rcpp)Armadillo you do have an option of turning this
>>> > on/off. With
>>> > | Rcpp
>>> > | alone not quite.
>>> > |
>>> > | Dirk
>>> > |
>>> > | --
>>> > | http://dirk.eddelbuettel.com | @eddelbuettel | edd at debian.org
>>> > | _______________________________________________
>>> > | Rcpp-devel mailing list
>>> > | Rcpp-devel at lists.r-forge.r-project.org
>>> > |
>>> > https://lists.r-forge.r-project.org/cgi-bin/mailman/listinfo/rcpp-devel
>>> > |
>>> > |
>>> > |
>>> > |
>>> > | --
>>> > | Qiang Kou
>>> > | qkou at umail.iu.edu
>>> > | School of Informatics and Computing, Indiana University
>>> > |
>>> > |
>>> > |
>>> > |
>>> > |
>>> > | --
>>> > | Qiang Kou
>>> > | qkou at umail.iu.edu
>>> > | School of Informatics and Computing, Indiana University
>>> > |
>>> >
>>> > --
>>> > http://dirk.eddelbuettel.com | @eddelbuettel | edd at debian.org
>>
>>
>>
>>
>> --
>> Qiang Kou
>> qkou at umail.iu.edu
>> School of Informatics and Computing, Indiana University
>>
More information about the Rcpp-devel
mailing list