[Rcpp-devel] Largest size of a NumericMatrix, segfaults and error messages

Ramon Diaz-Uriarte rdiaz02 at gmail.com
Mon Apr 1 14:48:42 CEST 2013


Dear All,

I am confused about creating Rcpp NumericMatrix objects with more cells
than .Machine$integer.max. The code below illustrates some of the points
(probably in too much detail ;-). These are the things that puzzle me:

1. For some numbers of rows and columns, creating the matrix is not
allowed and fails with the message "negative length vectors are not
allowed", but for other values the creation of the matrix proceeds without
(apparent) trouble, even when the total size is >> 2^31 - 1.

1.a. Is this intended? 

1.b. I understand the error message comes from R (not Rcpp), so I take it
this is not something that can be made easier to understand?
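
One possible way a single overflow could produce both behaviours is
sketched below in plain C++. It assumes (I have not checked this in the R
or Rcpp sources) that the total length nrow * ncol ends up in a 32-bit
signed int that wraps around. wrapped_length is a hypothetical helper of
mine, not an R or Rcpp function.

```cpp
#include <cstdint>
#include <limits>

// Hypothetical helper (not part of R or Rcpp): the value a 32-bit signed int
// would hold if nrow * ncol wrapped around in two's complement.
// Both arguments are assumed to be non-negative.
constexpr std::int64_t wrapped_length(std::int64_t nrow, std::int64_t ncol) {
    const std::int64_t two32 = std::int64_t(1) << 32;
    std::int64_t r = (nrow * ncol) % two32;              // now in [0, 2^32)
    if (r > std::numeric_limits<std::int32_t>::max()) {
        r -= two32;                                      // upper half maps to negatives
    }
    return r;
}

// The three cases from the R script in this message:
//   wrapped_length(4294967, 500)   no wrap, the product still fits in an int
//   wrapped_length(4294967, 501)   wraps to a negative value
//   wrapped_length(500000, 9000)   wraps past zero to a small positive value
```

Under that assumption, 4294967 x 501 wraps to a negative length, which
would match the error message, while 500000 x 9000 wraps past zero to a
positive length much smaller than the one requested, so the creation would
appear to succeed.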


2. The part I find confusing is that the same problem (number of cells >
2^31 - 1) is sometimes caught at object creation, but sometimes manifests
itself much later (either in the C++ code or later in R).

I was expecting (maybe the problem is my expectations) an error early on,
when creating the matrix; if the creation proceeds without trouble, I was
not expecting a segfault (as I think all cells are initialized to zero).

Is the recommended procedure to check that the product of the dimensions
is < 2^31 - 1 before creation? (But then, will this change in R 3.0 on
64-bit systems?)
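
The check I have in mind could look like the sketch below. The name
dims_fit_in_int is hypothetical (it is not an Rcpp function), and the
limit is hard-coded to the largest 32-bit int, which might not be the
right threshold under R 3.0.

```cpp
#include <cstdint>
#include <limits>

// Hypothetical guard (my own name, not an Rcpp function): true when
// nrow * ncol can be represented as a non-negative 32-bit signed int.
// The product is formed in 64 bits so that the test itself cannot overflow.
constexpr bool dims_fit_in_int(int nrow, int ncol) {
    return nrow >= 0 && ncol >= 0 &&
           static_cast<std::int64_t>(nrow) * ncol <=
               std::numeric_limits<std::int32_t>::max();
}
```

In f1 this would be called before constructing outM, raising an R error
(for instance with Rcpp::stop) when it returns false.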


Best,

R.



// Beginning of file max-size.cpp

#include <Rcpp.h>

using namespace Rcpp;


// [[Rcpp::export]]
NumericMatrix f1(IntegerVector nr, IntegerVector nc,
                 IntegerVector sf = 0) {
  int nrow = as<int>(nr);
  int ncol = as<int>(nc);
  int segf = as<int>(sf);
  
  NumericMatrix outM(nrow, ncol);
  std::cout << " After creating outM" << std::endl;
  outM(nrow - 1, 0) = 1;
  std::cout << " After assigning to last row, first column" 
            << std::endl;

  std::cout << " Some other value: 1, 0:   " 
	    << outM(1, 0) << std::endl;

  if( (nrow > 1) && (ncol > 3) )
    std::cout << " Some other value: nrow - 1, ncol - 3:   " 
	      << outM(nrow - 1, ncol - 3) << std::endl;

  outM(nrow - 1, ncol - 1) = 1;
  std::cout << " After assigning something to last cell" 
            << std::endl;

  std::cout << " Try to return the last assignment: " 
	    << outM(nrow - 1, ncol - 1) << std::endl;

  if((nrow >= 500000) && segf) {
    std::cout << "\n Assign a few around/beyond 2^31 - 1. Should segfault\n";
    for(int i = 4290; i < 4300; ++i) {
      std::cout << "    i = " << i << std::endl;
      outM(nrow - 1, i) = 0;
    }
  }

  return wrap(outM);
}

// End of file max-size.cpp
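
For the loop at the end of f1, the plain C++ sketch below computes where
cell (i, j) would land if the column-major offset i + nrow * j were also
kept in a wrapping 32-bit signed int. That is only an assumption on my
part about the indexing, not something I have verified in Rcpp, and
wrapped_offset is a hypothetical helper.

```cpp
#include <cstdint>
#include <limits>

// Hypothetical helper: offset of cell (i, j) in column-major storage with
// nrow rows, reduced the way a wrapping 32-bit signed int would reduce it.
// Assumes non-negative arguments; it says nothing about what Rcpp really does.
constexpr std::int64_t wrapped_offset(std::int64_t nrow,
                                      std::int64_t i, std::int64_t j) {
    const std::int64_t two32 = std::int64_t(1) << 32;
    std::int64_t r = (i + nrow * j) % two32;             // now in [0, 2^32)
    if (r > std::numeric_limits<std::int32_t>::max()) {
        r -= two32;                                      // upper half maps to negatives
    }
    return r;
}

// Last row (i = 499999) of the 500000 x 9000 case:
//   wrapped_offset(500000, 499999, 4293)   still a valid positive int
//   wrapped_offset(500000, 499999, 4294)   negative after the wrap
//   wrapped_offset(500000, 499999, 8999)   positive again after a full wrap
```

If the assumption holds, the columns visited by the loop (4290 to 4299)
straddle the point where the offset turns negative, whereas the offset of
the very last cell wraps all the way round to a positive value, which
might be why assigning to it does not crash.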





################################################
library(Rcpp)
sourceCpp("max-size.cpp", verbose = TRUE)

(tmp <- f1(4, 5))


4294967 * 500 > .Machine$integer.max
tmp <- f1(4294967, 500)
object.size(tmp)/(4294967 * 500) ## ~ 8

4294967 * 501 > .Machine$integer.max
tmp <- f1(4294967, 501) ## negative length vectors 

500000 * 9000 > .Machine$integer.max
tmp <- f1(500000, 9000) ## sometimes segfaults
tmp[500000, 9000]
object.size(tmp) ## things are missing 
prod(dim(tmp)) > .Machine$integer.max

## using either of these usually leads to segfault

for(i in (4290:4300)) print(tmp[500000, i]) 

f1(500000, 9000, 1)

#####################################################


-- 
Ramon Diaz-Uriarte
Department of Biochemistry, Lab B-25
Facultad de Medicina 
Universidad Autónoma de Madrid 
Arzobispo Morcillo, 4
28029 Madrid
Spain

Phone: +34-91-497-2412

Email: rdiaz02 at gmail.com
       ramon.diaz at iib.uam.es

http://ligarto.org/rdiaz



