[Rcpp-devel] Possible regression in R-3.2.3 or Rcpp 0.12.3

Dirk Eddelbuettel edd at debian.org
Sun Jan 31 00:02:02 CET 2016


On 30 January 2016 at 13:58, Kevin Ushey wrote:
| And I now have a MRE:
| 
| #include <Rcpp.h>
| using namespace Rcpp;
| 
| // [[Rcpp::export]]
| IntegerVector ouch(IntegerVector x, IntegerVector y) {
|   IntegerVector result;
|   result = x[y - 1];
|   return result;
| }
| 
| /*** R
| x <- 1:1E6
| y <- 1:1E6
| gctorture(TRUE)
| x <- ouch(x, y)
| gctorture(FALSE)
| */
| 
| This fails pretty reliably for me. Perhaps we can add this to our unit tests.

Sweet.

We generally do not run gctorture() in unit tests though.  We could -- maybe
with a new environment variable to turn it on (somewhat selectively) ?

Dirk
 
| On Sat, Jan 30, 2016 at 1:44 PM, Kevin Ushey <kevinushey at gmail.com> wrote:
| > I think I know what's going on now. Effectively, we're 'evaluating'
| > the sugar proxy vector, getting a pointer to a (temporary,
| > unprotected) R data structure, and then storing that pointer. Later,
| > when we attempt to allocate the output vector (with `no_init`) the GC
| > runs and cleans up our pointer.
| >
| > See:
| >
| >     https://github.com/RcppCore/Rcpp/blob/master/inst/include/Rcpp/vector/Subsetter.h#L147
| >     https://github.com/RcppCore/Rcpp/blob/master/inst/include/Rcpp/vector/Subsetter.h#L197
| >
| > The 'best' fix here is to only re-use the indices for an actual
| > IntegerVector, and generate the indices from sugar proxies, but I'm
| > not exactly sure how to best handle that. It's also possible that if
| > we just allocated the output vector earlier (ie, before getting a
| > pointer to our subset proxy), that we'd avoid the R gc cleaning up our
| > vector.
| >
| > FWIW, we should be able to reproduce this reliably when running under
| > `gctorture()`...
| >
| > Kevin
| >
| > On Sat, Jan 30, 2016 at 7:07 AM, Qiang Kou <qkou at umail.iu.edu> wrote:
| >> Another thing confused me is the segfault only happens when the
| >> IntegerVector is significantly long.
| >>
| >> Best,
| >>
| >> KK
| >>
| >> On Sat, Jan 30, 2016 at 2:57 AM, Kevin Ushey <kevinushey at gmail.com> wrote:
| >>>
| >>> Hmm, I have one thought: we try to re-use the indices from an
| >>> IntegerVector, but the type here is actually a sugar type:
| >>>
| >>>     Rcpp::SubsetProxy<13, Rcpp::PreserveStorage, 13, true,
| >>>     Rcpp::sugar::Minus_Vector_Primitive<13, true, Rcpp::Vector<13,
| >>>     Rcpp::PreserveStorage> > >::get_vec (this=<optimized out>)
| >>>
| >>> Ie, we access the indices with `INTEGER(x)`, but perhaps those indices
| >>> have not actually been properly materialized when we attempt to
| >>> perform the subset?
| >>>
| >>> But then, a simple test case with `x[y - 1]` with `x` and `y` both
| >>> being integer vectors seems to work just fine. So I am a bit confused.
| >>>
| >>> On Fri, Jan 29, 2016 at 3:16 PM, Dirk Eddelbuettel <edd at debian.org> wrote:
| >>> >
| >>> > On 29 January 2016 at 17:55, Qiang Kou wrote:
| >>> > | Hi, Paul, can you try my fork of Rcpp? You can install it by the line
| >>> > below:
| >>> > |
| >>> > | devtools::install_github("thirdwing/Rcpp", ref = "subsetter")
| >>> > |
| >>> > | This fixed the segfault on my Ubuntu machine.
| >>> >
| >>> > Yay. Nice work!
| >>> >
| >>> > | The difference can be found from [1].
| >>> >
| >>> > Nice and concise.
| >>> >
| >>> > | In subsetter, if an IntegerVector passed in, we will try to reuse it.
| >>> > This led
| >>> > | to a segfault in this case, which I don't know why.
| >>> > |
| >>> > | Dirk and Kevin, do you have any thoughts on it?
| >>> >
| >>> > Not really, but happy to give this the full reverse-dependency check
| >>> > treatment so that we can merge it.
| >>> >
| >>> > Dirk
| >>> >
| >>> >
| >>> > | Best wishes,
| >>> > |
| >>> > | KK
| >>> > |
| >>> > | [1] https://github.com/thirdwing/Rcpp/commit/
| >>> > | 216c5220bcb84778a656b3496d0f1803b973ef61
| >>> > |
| >>> > |
| >>> > | On Fri, Jan 29, 2016 at 3:00 PM, Qiang Kou <qkou at umail.iu.edu> wrote:
| >>> > |
| >>> > |
| >>> > |     Hi, Kevin, I was also trying to track this down yesterday.
| >>> > |
| >>> > |     From the debugging info below, indices_n is not equal to length of
| >>> > indices,
| >>> > |     which I don't quite understand.
| >>> > |
| >>> > |     Program received signal SIGSEGV, Segmentation fault.
| >>> > |
| >>> > |     0x00007ffff2ed5c4e in Rcpp::SubsetProxy<13, Rcpp::PreserveStorage,
| >>> > 13,
| >>> > |     true, Rcpp::sugar::Minus_Vector_Primitive<13, true,
| >>> > Rcpp::Vector<13,
| >>> > |     Rcpp::PreserveStorage> > >::get_vec
| >>> > (this=this at entry=0x7fffffff79a0)
| >>> > |
| >>> > |         at /usr/local/lib/R/site-library/Rcpp/include/Rcpp/vector/
| >>> > |     Subsetter.h:200
| >>> > |
| >>> > |     199             output[i] = lhs[ indices[i] ];
| >>> > |
| >>> > |     (gdb) p i
| >>> > |
| >>> > |     $1 = 33622
| >>> > |
| >>> > |     (gdb) p indices[i]
| >>> > |
| >>> > |     Cannot access memory at address 0x34c6e000
| >>> > |
| >>> > |     (gdb) p indices_n
| >>> > |
| >>> > |     $2 = 9594546
| >>> > |
| >>> > |
| >>> > |     On Fri, Jan 29, 2016 at 2:29 PM, Dirk Eddelbuettel
| >>> > <edd at debian.org> wrote:
| >>> > |
| >>> > |
| >>> > |         On 29 January 2016 at 11:27, Kevin Ushey wrote:
| >>> > |         | When I add some debug printing to the associated
| >>> > subscripting line
| >>> > |         | (https://github.com/awalker89/openxlsx/blob/
| >>> > |
| >>> > b92bb3acdd6ea759be928c298c6faeef2f26fa3e/src/cppFunctions.cpp#L2608),
| >>> > |         | I see:
| >>> > |         |
| >>> > |         |    colNumbers.size(): 98,03,150
| >>> > |         |    charCols.size(): 95,94,546
| >>> > |         |
| >>> > |         | It looks to me like the package is erroneously attempting to
| >>> > subset
| >>> > |         | vectors of different sizes, causing out-of-bounds reads.
| >>> > |
| >>> > |         Nice work.
| >>> > |
| >>> > |         | Unfortunately, Rcpp is not detecting or warning about
| >>> > this...
| >>> > |         |
| >>> > |         | Either way, I believe this is a bug in the openxlsx package,
| >>> > but Rcpp
| >>> > |         | should be checking / reporting this.
| >>> > |
| >>> > |         With (Rcpp)Armadillo you do have an option of turning this
| >>> > on/off. With
| >>> > |         Rcpp
| >>> > |         alone not quite.
| >>> > |
| >>> > |         Dirk
| >>> > |
| >>> > |         --
| >>> > |         http://dirk.eddelbuettel.com | @eddelbuettel | edd at debian.org
| >>> > |         _______________________________________________
| >>> > |         Rcpp-devel mailing list
| >>> > |         Rcpp-devel at lists.r-forge.r-project.org
| >>> > |
| >>> > https://lists.r-forge.r-project.org/cgi-bin/mailman/listinfo/rcpp-devel
| >>> > |
| >>> > |
| >>> > |
| >>> > |
| >>> > |     --
| >>> > |     Qiang Kou
| >>> > |     qkou at umail.iu.edu
| >>> > |     School of Informatics and Computing, Indiana University
| >>> > |
| >>> > |
| >>> > |
| >>> > |
| >>> > |
| >>> > | --
| >>> > | Qiang Kou
| >>> > | qkou at umail.iu.edu
| >>> > | School of Informatics and Computing, Indiana University
| >>> > |
| >>> >
| >>> > --
| >>> > http://dirk.eddelbuettel.com | @eddelbuettel | edd at debian.org
| >>
| >>
| >>
| >>
| >> --
| >> Qiang Kou
| >> qkou at umail.iu.edu
| >> School of Informatics and Computing, Indiana University
| >>

-- 
http://dirk.eddelbuettel.com | @eddelbuettel | edd at debian.org


More information about the Rcpp-devel mailing list