[Rcpp-devel] Report of CRAN compilation problem and solution with architecture x86_32

Juan Domingo Esteve Juan.Domingo at uv.es
Sat Nov 26 21:10:30 CET 2022

Keywords: runtime error, package check, 32 bit architectures, large files

This is the second of two reports on CRAN check problems that I found in my package and
that affect only some particular architectures (in this case, x86_32).

Problem description:

  When compiling a package with C++ source code using Rcpp in a Linux system,
  kernel 5.19.16-100, distribution Fedora 35, the generated package passed
  R CMD check --as-cran test, giving no compilation warnings and no execution errors.

  Nevertheless, the runtime tests on the CRAN server produced an error exclusively
  on the x86_32 architecture (found mostly in old PCs).

  Let's suppose you have stored a variable of unsigned long long type at the end
  of a binary file. You think you can read it with:

  unsigned long long endofbindata;
  std::string fname="yourfilename";

  std::ifstream f(fname.c_str());
  f.seekg(-sizeof(unsigned long long),std::ios::end);
  f.read((char *)&endofbindata,sizeof(unsigned long long));

  and indeed you can, but ONLY on 64-bit architectures. The function seekg does not
  work as expected on 32-bit architectures, since the first parameter (offset)
  is of type streamoff, which does not seem to be defined identically by g++ for 32-bit and
  64-bit architectures. On 32-bit systems this provokes overflow/underflow and absurd results
  at run time EVEN IF THE FILE is smaller than 2^32 bytes (at compile time, even on
  a 32-bit computer, no error or warning is raised, so you don't notice the problem).
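
  A possible contributing factor, offered here as an assumption rather than a confirmed
  diagnosis: sizeof yields an unsigned std::size_t, so -sizeof(unsigned long long) wraps
  around to a large positive number before it is converted to the offset type. If size_t
  is 64 bits wide, the conversion to a 64-bit signed offset happens to give -8 back; if
  size_t is 32 bits wide and the offset type is wider, the value stays positive
  (4294967288) and the seek lands far beyond the end of the file. A minimal sketch that
  converts to std::streamoff before negating (the function name ReadTrailingValue is
  hypothetical, not taken from the package):

```cpp
#include <fstream>
#include <ios>
#include <stdexcept>
#include <string>

// Hypothetical helper: reads the unsigned long long stored in the last
// sizeof(unsigned long long) bytes of the file.
unsigned long long ReadTrailingValue(const std::string& fname)
{
    std::ifstream f(fname.c_str(), std::ios::binary);
    if (!f.is_open())
        throw std::runtime_error("Cannot open file " + fname);

    // Convert to the signed offset type first and negate afterwards, so the
    // offset is negative regardless of the width of size_t.
    const std::streamoff back =
        static_cast<std::streamoff>(sizeof(unsigned long long));
    f.seekg(-back, std::ios::end);

    unsigned long long value = 0;
    f.read(reinterpret_cast<char*>(&value), sizeof(value));
    if (!f)
        throw std::runtime_error("Cannot read the trailing value of file " + fname);
    return value;
}
```

  Whether this alone removes the failure on the CRAN x86_32 machine has not been verified.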

My solution has been:

  Make a more portable function to get the size of a file using the stat system call, like:

  #include <sys/stat.h>
  #include <string>
  #include <Rcpp.h>
  unsigned long long GetFileSize(std::string fname)
  {
         struct stat stat_buf;
         int rc = stat(fname.c_str(), &stat_buf);
         if (rc != 0) {
          std::string err="Cannot obtain information (with stat system call) of file "+fname+"\n";
          err += "This is probably because you are running this in a 32-bit architecture and the file is bigger than 2 GB.\n";
          err += "Unfortunately, we have not found yet a solution for that and, if you need to manage so big files,\n";
          err += "probably you should consider using a 64-bit architecture.\n";
          // NOTE: defining _FILE_OFFSET_BITS=64 (which sets glibc's internal __USE_FILE_OFFSET64) might solve this but it might provoke other problems...
          // The post does not show how err is reported; Rcpp::stop is assumed here.
          Rcpp::stop(err);
         }
         return ((unsigned long long)stat_buf.st_size);
  }

  According to the stat manual, stat fails with the EOVERFLOW error when:

         pathname or fd refers to a file whose size, inode number, or number of blocks cannot be represented in, respectively, the types
         off_t, ino_t, or blkcnt_t. This error can occur when, for example, an application compiled on a 32-bit platform without
         -D_FILE_OFFSET_BITS=64 calls stat() on a file whose size exceeds (1<<31)-1 bytes.
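
  The manual excerpt ties the error to the width of off_t. As a small sketch (the function
  name HasLargeFileSupport is hypothetical), one can check at compile time whether off_t is
  wide enough in the current build, for instance to see whether compiling with
  -D_FILE_OFFSET_BITS=64 has taken effect:

```cpp
#include <sys/types.h>   // off_t
#include <climits>       // CHAR_BIT

// True when off_t has at least 64 bits, i.e. when file sizes above
// (1<<31)-1 bytes can be represented in this build.
constexpr bool HasLargeFileSupport()
{
    return sizeof(off_t) * CHAR_BIT >= 64;
}
```

  This only reports the situation; it does not address the side effects that changing the
  flag might have elsewhere in a package.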

  Once the file size has been obtained this way (if the call has succeeded), use it to seek to that position (or to that position minus an offset) with an f.seekg call.
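
  A minimal sketch of that step, under some assumptions: FileSizeViaStat is a stand-in for
  the size function that throws a C++ exception instead of reporting through R, and
  ReadLastValueViaStat is a hypothetical name. The seek uses an absolute, non-negative
  position counted from the beginning of the file:

```cpp
#include <sys/stat.h>
#include <fstream>
#include <ios>
#include <stdexcept>
#include <string>

// Stand-in for the size function: throws instead of reporting through R.
unsigned long long FileSizeViaStat(const std::string& fname)
{
    struct stat stat_buf;
    if (stat(fname.c_str(), &stat_buf) != 0)
        throw std::runtime_error("stat failed for file " + fname);
    return static_cast<unsigned long long>(stat_buf.st_size);
}

// Hypothetical helper: seeks to (size - sizeof(value)) from the beginning
// of the file and reads the trailing unsigned long long stored there.
unsigned long long ReadLastValueViaStat(const std::string& fname)
{
    const unsigned long long size = FileSizeViaStat(fname);
    if (size < sizeof(unsigned long long))
        throw std::runtime_error("File too small: " + fname);

    std::ifstream f(fname.c_str(), std::ios::binary);
    if (!f.is_open())
        throw std::runtime_error("Cannot open file " + fname);

    // The position is computed in unsigned arithmetic and is never negative.
    f.seekg(static_cast<std::streamoff>(size - sizeof(unsigned long long)),
            std::ios::beg);

    unsigned long long value = 0;
    f.read(reinterpret_cast<char*>(&value), sizeof(value));
    if (!f)
        throw std::runtime_error("Cannot read the trailing value of file " + fname);
    return value;
}
```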

  As you see, I have not found a real solution, but at least this warns the user about the problem of using large files
  in 32-bit architectures.

  This should now be infrequent in practice, since fewer 32-bit computers remain in use every day,
  but since CRAN still checks on them I have preferred to document it, in case anyone else may
  benefit from the information.


Juan Domingo Esteve
Dept. of Informatics, School of Engineering
University of Valencia
Avda. de la Universidad, s/n.
         46100-Burjasot (Valencia)

Telephone:      +34-963543572
Fax:            +34-963543550
email:  Juan.Domingo at uv.es
