[Rcpp-devel] Use size_t rather than int for vector sizes and indexing?

Dirk Eddelbuettel edd at debian.org
Wed Apr 13 20:38:06 CEST 2011


Hi Chuck,

Thanks for posting here.

On 13 April 2011 at 12:22, chuck at chuckcode.com wrote:
| Thanks Jay for the lightning fast insightful response! Didn't realize that
| this limitation was intrinsic to R itself. I'll have to go look at
| bigmemory and some of the other projects.

Yes, Rcpp uses 'proxy model' classes around the underlying SEXP objects. It
really is just R underneath, and we inherit all its advantages and oddities,
such as int32 indexing for vectors.
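
To make that limit concrete, here is a minimal plain-C++ sketch (no R or Rcpp
headers involved; the helper names are made up for illustration) of where a
32-bit signed length runs out: 2^31 - 1 elements can still be counted, 2^31
cannot.

```cpp
#include <cstdint>
#include <limits>

// Hypothetical helper: the largest element count a 32-bit signed int can hold.
constexpr std::uint64_t max_int32_length() {
    return static_cast<std::uint64_t>(std::numeric_limits<std::int32_t>::max());
}

// Hypothetical helper: true if a vector of n elements can still be sized
// and indexed with a 32-bit signed int.
constexpr bool fits_int32_length(std::uint64_t n) {
    return n <= max_int32_length();
}

// Compile-time check of the boundary: 2^31 - 1 fits, 2^31 does not.
static_assert(fits_int32_length((std::uint64_t(1) << 31) - 1), "2^31 - 1 fits");
static_assert(!fits_int32_length(std::uint64_t(1) << 31), "2^31 does not fit");
```

A size_t on a typical 64-bit build is 8 bytes wide, so the C++ side could count
further, but that does not help as long as the R vector underneath is
int-indexed.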

The bigmemory package can be a way out, and as it interfaces R using external
pointers, you should also be able to hook into this via the Rcpp::XPtr class.
That would make for a great new demo or example.  
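
Not that demo, just a rough plain-C++ picture of the external-pointer idea,
under the assumption that the data lives outside R's own vectors and R only
holds an opaque address. Nothing here is the bigmemory or Rcpp::XPtr API;
BigBuffer and the big_* functions are hypothetical names.

```cpp
#include <cstddef>
#include <memory>

// Hypothetical stand-in for storage kept outside R's own vectors.
struct BigBuffer {
    std::size_t n;                   // element count, not limited to int
    std::unique_ptr<double[]> data;  // owned, zero-initialised storage
    explicit BigBuffer(std::size_t len) : n(len), data(new double[len]()) {}
};

// The R side would only ever see an opaque address like this one.
void* big_create(std::size_t len) { return new BigBuffer(len); }

// All sizes and indices stay size_t on the C++ side.
std::size_t big_size(void* handle) {
    return static_cast<BigBuffer*>(handle)->n;
}

void big_set(void* handle, std::size_t i, double x) {
    static_cast<BigBuffer*>(handle)->data[i] = x;
}

double big_get(void* handle, std::size_t i) {
    return static_cast<BigBuffer*>(handle)->data[i];
}

// Release the storage; with a real external pointer this would be the
// job of a finalizer.
void big_destroy(void* handle) { delete static_cast<BigBuffer*>(handle); }
```

The point of the design is that the length and index never pass through an
R vector, so the int limit does not apply to them.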

Feel like cooking one up while you work through this?  ;-)

Cheers, Dirk

| 
| Thanks,
| -Chuck
| 
| On Wed, 13 Apr 2011 14:11:25 -0400, Jay Emerson <jayemerson at gmail.com>
| wrote:
| > Chuck,
| > 
| > Internally, R is using 4-byte integers for indexing, and the length of
| > a vector is thus constrained to 2-billion-ish elements.  Ways around
| > this include the packages ff and bigmemory, for example, or relying on
| > database-like queries.  However, the resulting objects can't be used
| > with standard R functions (with some special exceptions).
| > 
| > Jay
| > 
| > 
| > On Wed, Apr 13, 2011 at 2:05 PM,  <chuck at chuckcode.com> wrote:
| >>
| >> Hi All,
| >>
| >> Thanks to Dirk Eddelbuettel and the other contributors for such a
| >> wonderful package. Rcpp really transforms the way that I've been able to
| >> incorporate C++ code into R. Makes it possible to speed up the critical
| >> computations while keeping all the great flexibility and features of R.
| >>
| >> I've run into a problem lately with particularly large vectors in Rcpp. I
| >> seem to be overflowing when my vectors get larger than 2^31 elements on a
| >> 64-bit system. It looks from the code of both the classic (included below)
| >> and the newer versions as though this is due to using ints rather than
| >> something like size_t (http://en.wikipedia.org/wiki/Size_t) as the type for
| >> size and indexing into the vector. It looks like RcppResultSet's add()
| >> functions may also be using ints.
| >>
| >> Curious to know if there is a particular reason for using ints rather than
| >> something like size_t, and if the project managers would be open to changing
| >> to accommodate larger vectors.
| >>
| >> Thanks,
| >> -Chuck Sugnet
| >>
| >> template <typename T>
| >> class RcppVector {
| >> public:
| >>        typedef T* iterator ;
| >>        typedef const T* const_iterator ;
| >>
| >>    RcppVector(SEXP vec);
| >>    RcppVector(int len);
| >>    int size() const;
| >>    T& operator()(int i) const;
| >>    T *cVector() const;
| >>    std::vector<T> stlVector() const;
| >>
| >>    inline const_iterator begin() const { return v ; }
| >>    inline const_iterator end() const { return v + len ; }
| >>
| >>    inline iterator begin(){ return v ; }
| >>    inline iterator end(){ return v + len ; }
| >>
| >> private:
| >>    int len;
| >>    T *v;
| >> };
| >> _______________________________________________
| >> Rcpp-devel mailing list
| >> Rcpp-devel at lists.r-forge.r-project.org
| >> https://lists.r-forge.r-project.org/cgi-bin/mailman/listinfo/rcpp-devel
| >>

-- 
Dirk Eddelbuettel | edd at debian.org | http://dirk.eddelbuettel.com

