[Rcpp-devel] Logic error causes R memory to be corrupt
Anton Bossenbroek
anton.bossenbroek at me.com
Thu Oct 6 17:31:28 CEST 2016
Sorry, an error slipped into the R code while I was iterating on my test and code. The R code should be:
require(Rcpp)
sourceCpp(file="~/tmp/example.cpp")
add_children <- function(number_of_children = 2) {
  p <- 0
  st <- initialize_storage()
  for (i in 1 : number_of_children) {
    st <- add_element(st, i)
  }
  return(st)
}
n <- 10000
a <- add_children(number_of_children = n)
res <- sapply(get_nodes(a), function(x) x[["key"]])
all(res == 1 : n)
Since I am aware that the garbage collector may be trying to free up memory, I added `PROTECT(key_);` in the body of the constructor of Element and `UNPROTECT(result.size());` just before the return in get_nodes(). However, when I do that I get the following warnings:
Warning: stack imbalance in '.Call', 19 then -9981
Warning: stack imbalance in '<-', 2 then -9998
etc.
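
The numbers in these warnings are consistent with `UNPROTECT(result.size())` popping 10000 entries that were not on the protection stack during that call: 19 - 10000 = -9981 and 2 - 10000 = -9998. A minimal sketch of that bookkeeping follows. It is a toy model in plain C++, not R's implementation: `ToyProtectStack`, `check_balance` and `model_get_nodes` are hypothetical names made up for illustration, and the only assumption is that the depth at entry is the first number each warning reports.

```cpp
#include <cassert>
#include <string>

// Toy model only: tracks the depth of a protection stack as a signed
// counter. This is NOT R's implementation; all names are hypothetical.
struct ToyProtectStack {
    long depth;

    // Models PROTECT(x): one more entry on the stack.
    void protect() { ++depth; }

    // Models UNPROTECT(n): n entries are popped, whether or not they exist.
    void unprotect(long n) {
        assert(n >= 0);
        depth -= n;
    }
};

// Compare the depth before and after a call and format a message shaped
// like the warnings quoted above. Returns an empty string when balanced.
std::string check_balance(const std::string& name, long before, long after) {
    if (before == after) return "";
    return "Warning: stack imbalance in '" + name + "', " +
           std::to_string(before) + " then " + std::to_string(after);
}

// Models one call to get_nodes(): nothing is protected inside the call,
// but n entries are popped just before it returns.
std::string model_get_nodes(ToyProtectStack& stack, long n,
                            const std::string& caller) {
    const long before = stack.depth;
    stack.unprotect(n);
    return check_balance(caller, before, stack.depth);
}
```

Starting the model at depth 19 with n = 10000 reproduces the first warning, and starting at depth 2 reproduces the second. This suggests that `PROTECT` and `UNPROTECT` have to be paired within a single call, so an object that must stay alive across calls probably needs a different mechanism, for example `R_PreserveObject()` and `R_ReleaseObject()` from R's C API.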
> On 6 Oct 2016, at 10:52, Anton Bossenbroek <anton.bossenbroek at me.com> wrote:
>
> Hi Everyone,
>
> I want to add a large number of objects in C++ that are managed by `shared_ptr` in a `vector`. However, when I increase the number of objects that I allocate, the data in R becomes inconsistent.
>
> I will first show the test script and then the C++ file that causes the error. The expected results are shown at the bottom.
>
> # test script
> The test script adds an arbitrary number of objects to a vector.
>
> require(Rcpp)
> sourceCpp(file="~/tmp/example.cpp")
>
> add_children <- function(number_of_children = 2) {
>   p <- 0
>   st <- initialize_storage(p, p)
>   for (i in 1 : number_of_children) {
>     st <- add_node(st, i, i)
>   }
>   return(st)
> }
>
> # example.cpp
> The vector is stored in an object that is managed by R, but the elements in the vector are managed by `shared_ptr` and created with `make_shared`.
>
> // [[Rcpp::plugins(cpp11)]]
> #include <RcppCommon.h>
>
> #include <memory>
> #include <vector>
>
> using namespace std;
>
> struct Element : public std::enable_shared_from_this< Element > {
>   SEXP key_;
>
>   /* Simple constructor that assigns the key. */
>   Element(SEXP key) : key_(key) {}
>
>   /* Convert the object to an R object. */
>   operator SEXP() const;
> };
>
> typedef shared_ptr<Element> element_sp;
> typedef vector<element_sp> element_sp_vec;
> typedef shared_ptr<element_sp_vec> element_sp_vec_sp;
>
> struct Storage {
>   /* Internal storage of nodes. */
>   element_sp_vec nodes_;
>
>   /* Empty constructor. */
>   Storage() {}
>
>   /* Add a node to the storage with its key set to key. */
>   void add_element(SEXP key) {
>     /* Since Element objects are managed by shared_ptr we create a new
>      * instance with make_shared. */
>     element_sp e = make_shared<Element>(key);
>     /* Add the node to the internal storage. */
>     nodes_.push_back(e);
>   }
>
>   element_sp_vec_sp get_nodes() {
>     /* Create a shared pointer that will hold all the results. Although we
>      * could do this more simply, it mimics the logic I implemented in my
>      * real program. There I need to swap elements in the list after the
>      * copy of the vector. */
>     element_sp_vec_sp res(new element_sp_vec());
>     /* Copy the data in the nodes vector to the result vector. */
>     *res = nodes_;
>     return res;
>   }
> };
>
> #include <Rcpp.h>
>
> using namespace Rcpp;
>
> /* Convert the Element object to a list with key set to its internal member. */
> Element::operator SEXP() const
> {
>   List serial;
>
>   serial["key"] = key_;
>
>   return serial;
> }
>
> typedef XPtr<Storage> st_xptr;
>
> // [[Rcpp::export]]
> SEXP
> initialize_storage()
> {
>   /* Create a new storage managed by R. */
>   Storage* st = new Storage();
>   st_xptr p(st, true);
>
>   return p;
> }
>
> // [[Rcpp::export]]
> SEXP
> add_element(SEXP st_sexp, SEXP key)
> {
>   st_xptr st(st_sexp);
>   /* Add a new element to the internal storage. */
>   st->add_element(key);
>
>   return st;
> }
>
> // [[Rcpp::export]]
> List
> get_nodes(SEXP st_sexp)
> {
>   st_xptr st(st_sexp);
>   /* Retrieve the elements in the internal storage. */
>   element_sp_vec_sp c_res = st->get_nodes();
>
>   /* Allocate a List to store all our results. */
>   List result(c_res->size());
>   int i = 0;
>   /* Iterate through the results and store the result in our list. */
>   for (auto it : *c_res) {
>     result[i] = wrap(*it);
>     ++i;
>   }
>
>   return result;
> }
>
> Below are a few test cases of the script, with the behavior that I observe on Mac OS Sierra with clang.
>
> ## n = 10
>
> Everything works fine
>
> n <- 10
> a <- add_children(number_of_children = n)
> res <- sapply(get_nodes(a), function(x) x[["key"]])
> all(res == 0 : n)
> # [1] TRUE
>
> ## n = 100
>
> Everything works fine
>
> n <- 100
> a <- add_children(number_of_children = n)
> res <- sapply(get_nodes(a), function(x) x[["key"]])
> all(res == 0 : n)
> # [1] TRUE
>
> ## n = 10000
>
> Something goes wrong.
>
> n <- 10000
> a <- add_children(number_of_children = n)
>
> res <- sapply(get_nodes(a), function(x) x[["key"]])
> all(res == 0 : n)
> # [1] FALSE
> # There were 50 or more warnings (use warnings() to see the first 50)
>
> Some further research shows that the warnings are:
>
> warnings()
> # Warning messages:
> # 1: NAs introduced by coercion
> # 2: NAs introduced by coercion
> # 3: NAs introduced by coercion
> # 4: NAs introduced by coercion
> ### etc.
>
> A closer inspection of the contents of `res` shows that it contains a non-numeric value:
>
> res[1000]
> # [[1]]
> # [1] "data"
>
> which is surprising to me since the script only added numeric `SEXP` values to the `vector`. My expected output for this value of `n` would be the same as the cases above.
>
> ## gctorture
> I reran the `n=10000` example with `gctorture(TRUE)`. I did not receive any warnings, but the data is still corrupt. Two random elements in the `res` list:
>
> # [[998]]
> # <CHARSXP: "\"key\"">
> #
> # [[999]]
> # [1] "srcref"
>
> # Replication
> I replicated these results on Mac OS X Sierra as well as in a Docker image based on rocker.
>
> sessionInfo()
> # R version 3.3.1 (2016-06-21)
> # Platform: x86_64-apple-darwin15.5.0 (64-bit)
> # Running under: OS X 10.12 (Sierra)
> #
> # locale:
> # [1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8
> #
> # attached base packages:
> # [1] stats graphics grDevices utils datasets methods base
> #
> # other attached packages:
> # [1] Rcpp_0.12.7 setwidth_1.0-4 colorout_1.1-2
> #
> # loaded via a namespace (and not attached):
> # [1] tools_3.3.1
>
> ### uname
>
> uname -prsv
> Darwin 16.0.0 Darwin Kernel Version 16.0.0: Mon Aug 29 17:56:20 PDT 2016; root:xnu-3789.1.32~3/RELEASE_X86_64 i386
>
> ### clang
>
> clang -v
> Apple LLVM version 8.0.0 (clang-800.0.38)
> Target: x86_64-apple-darwin16.0.0
> Thread model: posix
> InstalledDir: /Applications/Xcode.app/Contents/Developer/Toolchains/XcodeDefault.xctoolchain/usr/bin
>
> Any advice on what may be the problem here?
> _______________________________________________
> Rcpp-devel mailing list
> Rcpp-devel at lists.r-forge.r-project.org
> https://lists.r-forge.r-project.org/cgi-bin/mailman/listinfo/rcpp-devel