[Rcpp-devel] ColDatum constructors memory safety

Alistair Gee alistair.gee at gmail.com
Wed Mar 17 00:59:26 CET 2010


Here are patches to RcppFrame.h:

--- RcppFrame.h~        2010-02-16 10:54:01.000000000 -0500
+++ RcppFrame.h 2010-03-16 19:48:29.847994574 -0400
@@ -30,7 +30,8 @@
 enum ColType {                 // Supported data frame column types.
     COLTYPE_DOUBLE, COLTYPE_INT, COLTYPE_STRING,
     COLTYPE_FACTOR, COLTYPE_LOGICAL,
-    COLTYPE_DATE, COLTYPE_DATETIME
+    COLTYPE_DATE, COLTYPE_DATETIME,
+    COLTYPE_UNKNOWN = -1
 };
 class ColDatum;                        // forward declaration, see below


and RcppFrame.cpp:


--- RcppFrame.cpp~      2010-03-12 15:11:45.000000000 -0500
+++ RcppFrame.cpp       2010-03-16 19:48:30.517076368 -0400
@@ -22,7 +22,7 @@

 #include <RcppFrame.h>

-ColDatum::ColDatum() : level(0) { }
+ColDatum::ColDatum() : type(COLTYPE_UNKNOWN), level(0) { }

 ColDatum::ColDatum(const ColDatum& datum) {
     // Need deep copy so contruction/destruction is synchronized.




On Sat, Mar 13, 2010 at 12:03 PM, Dirk Eddelbuettel <edd at debian.org> wrote:
>
> Hi Alistair,
>
> Thanks for sharing the analysis.
>
> On 12 March 2010 at 15:15, Alistair Gee wrote:
> | The ColDatum default constructor does not initialize any fields except
> | for "level". But the ColDatum *copy* constructor expects the "type"
> | field to be properly initialized, b/c if type is COLTYPE_FACTOR, the
> | copy constructor proceeds to allocate memory using the numLevels field
> | (which is unreliable b/c numLevels was never initialized). See
> | comments in source below:
> |
> | ColDatum::ColDatum() : level(0) { };
> |
> | ColDatum::ColDatum(const ColDatum& datum) {
> |   // Need deep copy so contruction/destruction is synchronized.
> |   s = datum.s;
> |   x = datum.x;
> |   i = datum.i;
> |   type = datum.type;                // XXX datum.type was never initialized.
> |   level = datum.level;
> |   numLevels = datum.numLevels; // XXX datum.numLevels was never initialzed.
> |   d = datum.d;
> |   if (type == COLTYPE_FACTOR) {
> |       levelNames = new std::string[numLevels];   // XXX numLevels was copied
> |
> | // from datum.numLevels which
> |
> | // was never initialized
> |       for (int j = 0; j < numLevels; j++)
> |           levelNames[j] = datum.levelNames[j];     // XXX datum.levelNames is
> |
> | // an uninitialized pointer.
> |   }
> | }
> |
> | This is a problem if one tries to create a vector of ColDatum, e.g.
> |
> | vector<ColDatum> colDatumVector(10);
> |
> | Here, the default constructor is used to first construct a ColDatum
> | object (which will not have "type" and "levelNames" properly
> | initialized). Then the copy constructor called by the vector
> | constructor to create 10 copies.
>
> ColDatum is part of what we now call the 'classic' API. It is not getting
> much attention or love as we focus on the 'new' API. That said, the new API
> still lacks coverage for data.frame (and Date or Datetime classes).
>
> If you can think of a fix, we'd give it a good luck.
>
> Else, if you had a patch to clarify the header or code file and maybe just
> warn of the issue, that would help too.
>
> Cheers, Dirk
>
> --
>  Registration is open for the 2nd International conference R / Finance 2010
>  See http://www.RinFinance.com for details, and see you in Chicago in April!
>


More information about the Rcpp-devel mailing list