[Rcpp-devel] Subsetter uses int for indexing (among other issues)?

William Nolan will at landale.net
Wed Nov 7 07:45:53 CET 2018


Hi all,

Longtime user and lurker here.

I got "index error" thrown by Rcpp when trying to subset a matrix with
width * height == 647764 * 3776, which is greater than INT_MAX:

Allocating 647764 x 3776 matrix...
Catchpoint 1 (exception thrown), 0x00007ffff4baa920 in __cxa_throw () from
/usr/lib64/libstdc++.so.6
#1  0x0000000000473a05 in Rcpp::stop<>(char const*) (fmt=0x7fffe7bc3545
"index error")
    at
/home/nolanw/R/x86_64-redhat-linux-gnu-library/3.5/Rcpp/include/Rcpp/exceptions/cpp11/exceptions.h:52
52          throw Rcpp::exception( tfm::format(fmt,
std::forward<Args>(args)... ).c_str() );
(gdb) down
#0  0x00007ffff4baa920 in __cxa_throw () from /usr/lib64/libstdc++.so.6
(gdb) up
#1  0x0000000000473a05 in Rcpp::stop<>(char const*) (fmt=0x7fffe7bc3545
"index error")
    at
/home/nolanw/R/x86_64-redhat-linux-gnu-library/3.5/Rcpp/include/Rcpp/exceptions/cpp11/exceptions.h:52
52          throw Rcpp::exception( tfm::format(fmt,
std::forward<Args>(args)... ).c_str() );
(gdb) up
#2  0x00007fffe7bb75a6 in Rcpp::SubsetProxy<14, Rcpp::PreserveStorage, 13,
true, Rcpp::Vector<13, Rcpp::PreserveStorage> >::check_indices
(this=0x7fffffffc450, x=0x13f9920,
    n=3751, size=-1849010432) at
/home/nolanw/R/x86_64-redhat-linux-gnu-library/3.5/Rcpp/include/Rcpp/vector/:138
138                     stop("index error");
(gdb) up
#3  0x00007fffe7bb6a3c in Rcpp::SubsetProxy<14, Rcpp::PreserveStorage, 13,
true, Rcpp::Vector<13, Rcpp::PreserveStorage> >::get_indices
(this=0x7fffffffc450, t=...)
    at
/home/nolanw/R/x86_64-redhat-linux-gnu-library/3.5/Rcpp/include/Rcpp/vector/Subsetter.h:149
149             check_indices(ptr, rhs_n, lhs_n);
(gdb) l
144         #endif
145
146         void get_indices( traits::identity< traits::int2type<INTSXP> >
t ) {
147             indices.reserve(rhs_n);
148             int* ptr = INTEGER(rhs);
149             check_indices(ptr, rhs_n, lhs_n);
150             for (int i=0; i < rhs_n; ++i) {
151                 indices.push_back( rhs[i] );
152             }
153             indices_n = rhs_n;

As we can see from the stack trace and below, lhs.size() is negative when
cast to int:

(gdb) p (int)(lhs.size())
$12 = -1849010432
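
To make the arithmetic explicit, here is a small standalone sketch (no Rcpp needed) that reproduces the number gdb printed. The function names are made up for illustration, and the final conversion to a 32-bit int assumes the usual two's-complement behaviour:

```cpp
#include <cassert>
#include <climits>
#include <cstdint>

// Element count of an nrow x ncol matrix, computed in 64 bits.
std::int64_t element_count(std::int64_t nrow, std::int64_t ncol) {
    assert(nrow >= 0 && ncol >= 0);
    return nrow * ncol;
}

// What a 32-bit int ends up holding when the count is narrowed to it.
// Going through uint32_t makes the reduction modulo 2^32 well defined;
// the last step to int32_t wraps on two's-complement platforms
// (guaranteed since C++20, implementation-defined before that).
std::int32_t narrowed_to_int(std::int64_t n) {
    return static_cast<std::int32_t>(static_cast<std::uint32_t>(n));
}
```

With the dimensions above, element_count(647764, 3776) is 2445956864, which exceeds INT_MAX (2147483647), and narrowed_to_int of that value is -1849010432, the same size that shows up in frame #2.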

This all comes from assigning (via operator[]) a subset of one matrix to a
subset of another matrix:

(static_cast<NumericVector&>(mat))[lhsI] =
(static_cast<NumericVector&>(signals))[rhsI];

(lhsI and rhsI are IntegerVectors)

Now, setting aside whether I *should* be doing that -- what I *do* see in
Subsetter.h (including what I understand to be the most recent version,
1.0.0 from GitHub) is the use of int for indices all over the place in this
file, including in the member variable:

std::vector<int> indices;

Is there any reason why the indices that Subsetter uses internally
shouldn't be size_t or an equivalently capable type like R_xlen_t?
For example, Subsetter's check_indices function takes ints as arguments,
while Vector's size method returns R_xlen_t.
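
To illustrate the difference, here is a rough standalone sketch of a bounds check of that shape with int parameters next to the same check with wide parameters. This is not the Rcpp source: both functions and the xlen_t alias are hypothetical, and xlen_t is only a stand-in for R_xlen_t so that the snippet compiles without R headers.

```cpp
#include <cassert>
#include <cstdint>

// Stand-in for R_xlen_t (assumption: a 64-bit signed type, as on
// 64-bit builds of R). Not the real typedef.
using xlen_t = std::int64_t;

// Hypothetical check with int parameters, like the int n / int size
// visible in the stack trace. A size that has wrapped negative makes
// every non-negative index look out of range.
bool indices_ok_int(const int* idx, int n, int size) {
    for (int i = 0; i < n; ++i)
        if (idx[i] < 0 || idx[i] >= size) return false;
    return true;
}

// Same check with wide parameters: the size arrives without narrowing,
// and each int index is promoted to 64 bits for the comparison.
bool indices_ok_wide(const int* idx, xlen_t n, xlen_t size) {
    assert(n >= 0);
    for (xlen_t i = 0; i < n; ++i)
        if (idx[i] < 0 || idx[i] >= size) return false;
    return true;
}
```

Given indices {0, 5, 3750}, the wide version accepts them against a size of 2445956864, while the int version rejects them once the size has been narrowed to -1849010432.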

I'll change my code to manually copy elements via operator() using the
row/column arguments for now.  Seems like Subsetter is maybe not quite
ready for prime time.
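
For the record, the workaround I have in mind looks roughly like the sketch below. It is written against a hypothetical column-major ColMajor struct rather than Rcpp::NumericMatrix so that it stands alone; offset and copy_rows are illustrative names too. The point is that the element offset is computed in 64 bits from the row and column.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Linear offset of (row, col) in a column-major matrix with nrow rows,
// computed in 64 bits so it cannot wrap a 32-bit int.
std::int64_t offset(std::int64_t nrow, std::int64_t row, std::int64_t col) {
    return row + nrow * col;
}

// Hypothetical column-major matrix, used only for this sketch.
struct ColMajor {
    std::int64_t nrow, ncol;
    std::vector<double> data;
    ColMajor(std::int64_t r, std::int64_t c)
        : nrow(r), ncol(c), data(static_cast<std::size_t>(r * c), 0.0) {}
    double& operator()(std::int64_t row, std::int64_t col) {
        assert(row >= 0 && row < nrow && col >= 0 && col < ncol);
        return data[static_cast<std::size_t>(offset(nrow, row, col))];
    }
};

// Copy the listed rows of column src_col in src into column dst_col of
// dst, one element at a time, instead of assigning through a subset.
void copy_rows(ColMajor& dst, std::int64_t dst_col,
               ColMajor& src, std::int64_t src_col,
               const std::vector<int>& rows) {
    for (int r : rows) dst(r, dst_col) = src(r, src_col);
}
```

The row and column indices themselves still fit comfortably in int; only their combination into a linear offset needs the wider type. For the last element of the 647764 x 3776 matrix, offset(647764, 647763, 3775) is 2445956863.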