[Rcpp-devel] ColDatum constructors memory safety

Dirk Eddelbuettel edd at debian.org
Fri Mar 19 14:24:30 CET 2010


On 18 March 2010 at 21:04, Alistair Gee wrote:
| I'm not sure that you can verify the fix with an empirical test, short
| of using something like valgrind b/c the bug would occur only due to
| chance:
| 
| Here is a simpler scenario:
| 
|   ColDatum c1;       // via the default constructor
|   ColDatum c2(c1);  // via the copy constructor
| 
| When c1 is constructed via the default constructor (and without my
| patch), only the level field is initialized. The other fields, such as
| type, numLevels, and levelNames are not initialized, so they can be
| garbage values. So, if by chance, c1.type == COLTYPE_FACTOR, then when
| c2 is constructed by the copy constructor, the code to copy the
| levelNames is executed, which proceeds to copy the data
| c1.levelNames[0] .. c1.levelNames[c1.numLevels - 1]. But c1.levelNames
| wasn't initialized, so it could refer to random memory. Moreover,
| c1.numLevels wasn't initialized either, so it could be anything as
| well. Thus, the code could be accessing invalid memory.
| 
| Here is the code to the copy constructor:
| 
| ColDatum::ColDatum(const ColDatum& datum) {
|     // Need deep copy so contruction/destruction is synchronized.
|     s = datum.s;
|     x = datum.x;
|     i = datum.i;
|     type = datum.type;
|     level = datum.level;
|     numLevels = datum.numLevels;
|     d = datum.d;
|     if (type == COLTYPE_FACTOR) {
| 	levelNames = new std::string[numLevels];
| 	for (int j = 0; j < numLevels; j++)
| 	    levelNames[j] = datum.levelNames[j];
|     }
| }
| 
| Note also the destructor. When c1's destructor is invoked, and c1.type
| == COLTYPE_FACTOR, the code will attempt to delete [] levelNames but
| that field was never initialized, so the code would attempt to
| incorrectly deallocate memory:
| 
| ColDatum::~ColDatum() {
|     if (type == COLTYPE_FACTOR) {
| 	// For this to work we need a deep copy when type == COLTYPE_FACTOR.
| 	// See the copy constructor below. It is wasteful to have
| 	// evey factor cell own a separate copy of levelNames, but we leave
| 	// the task of factoring it out (using reference counts) for
| 	// later.
| 	delete [] levelNames;
|     }
| }
| 
| The patch fixes both problems by initializing c1.type to
| COLTYPE_UNKNOWN (which I added to the enumeration).

All good points. I'll commit the patch, and expand the unit test a little
more to add a type or two more among the ones that ColDatum provides.

Dirk

-- 
  Registration is open for the 2nd International conference R / Finance 2010
  See http://www.RinFinance.com for details, and see you in Chicago in April!


More information about the Rcpp-devel mailing list