[Rcpp-devel] Zero length vectors in R

Simon Urbanek simon.urbanek at R-project.org
Wed Jun 10 12:39:27 CEST 2020


Toby,

Rcpp simply calls allocVector(), so the regular R rules apply. R's SEXP can hold vectors of up to length 1 inside without additional allocations*. From a memory-management perspective, therefore, writes to the first element of a 0-length vector are not invalid. The valgrind instrumentation of R doesn't guard against that case: it doesn't mark those 8 bytes as NOACCESS; it only marks additionally allocated memory accordingly (not relevant in this case).

Cheers,
Simon

* - see also R-ints 1.1.4 for details on allocator classes
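
Since valgrind will not flag this access, one option is to check the length before indexing, as suggested elsewhere in this thread. Below is a minimal sketch of that idea, not anything Rcpp provides: checked_at and zero_length_read_is_rejected are hypothetical names introduced for illustration. The helper only assumes a container with size() and operator[]. It is shown with std::vector so that it compiles without R; using it with Rcpp::IntegerVector is an assumption I have not tested here.

```cpp
#include <cstddef>
#include <stdexcept>
#include <string>
#include <type_traits>
#include <vector>

// Hypothetical helper (not part of Rcpp): read v[i] only after checking
// that i is within the reported length; otherwise throw.
template <typename Vec>
auto checked_at(const Vec& v, std::size_t i)
    -> typename std::decay<decltype(v[i])>::type {
    const std::size_t n = static_cast<std::size_t>(v.size());
    if (i >= n) {
        throw std::out_of_range("index " + std::to_string(i) +
                                " out of range for length " +
                                std::to_string(n));
    }
    return v[i];
}

// Hypothetical demo: returns true if reading element 0 of a zero-length
// vector is rejected with std::out_of_range instead of silently succeeding.
bool zero_length_read_is_rejected() {
    const std::vector<int> empty;
    try {
        (void)checked_at(empty, 0);
    } catch (const std::out_of_range&) {
        return true;
    }
    return false;
}
```

With a check like this, the out-of-range read becomes an ordinary exception that a fuzz harness can observe, independent of how the underlying memory was allocated.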


> On Jun 10, 2020, at 4:45 PM, Toby Hocking <tdhock5 at gmail.com> wrote:
> 
> Hi Leonardo, thanks for the help.
> For context, we are trying to fuzz test Rcpp packages, so we are
> throwing random sized vectors at Rcpp functions which have been
> written by others. We want to be able to detect when these Rcpp
> functions have memory read/write issues (e.g. reading the first
> element of a zero-length vector).
> I have been doing some tests
> https://github.com/akhikolla/RcppDeepState/issues/4 and I have
> observed that valgrind notices an invalid read if I do
> pointer_to_array_of_size_zero_from_new_or_malloc[0] but it does not
> give me any invalid read for integer_vector_of_size_zero[0]. For
> concreteness here are the functions I used for testing,
> 
> // [[Rcpp::export]]
> int read_vector(int i){
>  Rcpp::IntegerVector x(0);
>  return x[i];
> }
> 
> // [[Rcpp::export]]
> int read_new(int i){
>  int* ptr = new int[0];
>  int x = ptr[i];
>  delete[] ptr;
>  return x;
> }
> 
> So in other words doing read_new(0) in R results in a valgrind Invalid
> read message, but doing read_vector(0) does not. Is that normal? Are
> there any other tools we could use to detect the
> read-past-the-end-of-array problem that happens when we do
> read_vector(0) ? Or is this an inherent issue in the way that R
> allocates vectors?
> 
> BTW there seems to be some relevant discussion in
> https://cran.r-project.org/doc/manuals/r-release/R-exts.html#Using-valgrind
> "Next is a description of the memory that was accessed. It is inside a
> block allocated by malloc, called from GetNewPage, that is, in the
> internal R heap. Since this memory all belongs to R, valgrind would
> not (and did not) detect the problem in an uninstrumented build of R."
> I'm using R-4.0.0 compiled from source
> --with-valgrind-instrumentation=2. Are Rcpp IntegerVectors allocated
> using malloc/GetNewPage as in this example, or is the memory
> allocation different?
> 
> Thanks for any help / ideas / clarification you can provide
> Toby
> 
> 
> On Mon, Jun 8, 2020 at 11:53 PM Leonardo Silvestri <lsilvestri at ztsdb.org> wrote:
>> 
>> Yes, there is an underlying pointer. You can simply test
>> 'max_segments.size()' to see if you are allowed to dereference that
>> first element or not.
>> 
>> If you're interested in the exact R representation of vectors at
>> C-level, read R Internals
>> (https://cran.r-project.org/doc/manuals/r-release/R-ints.pdf).
>> 
>> If you are interested in the Rcpp wrappers around these vectors, look at
>> the Rcpp header file 'Vector.h' in the Rcpp package
>> (inst/include/Rcpp/vector/Vector.h).
>> 
>> 
>> On 6/9/20 1:04 AM, Akhila Chowdary Kolla wrote:
>>> Hello everyone,
>>> 
>>> I am trying to test the package binsegRcpp. When I pass a max_segments
>>> vector of length zero, accessing its zeroth element does not cause a
>>> segfault. Instead it gives a garbage value or an error such as
>>> "negative length vectors not allowed".
>>> Rcpp::List rcpp_binseg_normal
>>>      (const Rcpp::NumericVector data_vec,
>>>      const Rcpp::IntegerVector max_segments) {
>>>      int kmax = max_segments[0];
>>> 
>>>  }
>>> Link to the package:
>>> https://github.com/tdhock/binsegRcpp/blob/master/src/rcpp_interface.cpp
>>> 
>>> How is a zero-length vector represented in R? Does it still store a
>>> pointer to some location?
>>> If so, is there a way to get this segfault detected? I tried using
>>> rhub::check, but it doesn't give any error. I tried using sanitizers as
>>> well, still no luck.
>>> Can someone please suggest a way to detect this in R?
>>> 
>>> Thanks
>>> 
>>> 
>>> 
>>> 
>>> _______________________________________________
>>> Rcpp-devel mailing list
>>> Rcpp-devel at lists.r-forge.r-project.org
>>> https://lists.r-forge.r-project.org/cgi-bin/mailman/listinfo/rcpp-devel
>>> 
>> 


