[Rcpp-devel] Passing large data frame

Romain Francois romain at r-enthusiasts.com
Mon Jun 14 12:36:18 CEST 2010


Hi,

On 14/06/10 05:38, R_help Help wrote:
>
> Hi,
>
> I have a question about passing a large data frame into Rcpp. If we
> consider the following function
>
> foo(SEXP myframe) {
>
>      RcppFrame&fr_ref = (RcppFrame&) myframe;
> }
>
> This somehow seems to work without calling a constructor, which would
> copy the large data frame into an RcppFrame object.

This code is simply wrong. It only appears to work because you are getting
lucky with the internal representation of RcppFrame.

Consider:

require( Rcpp )
require( inline )

inc <- '
class Foo{
public:
	Foo( SEXP x) : y(5), xx(x) {
		Rprintf( "hello" ) ;
	}
	Foo( ) : y(6), xx(R_NilValue) {
		Rprintf( "hello from default" );
	}

	inline SEXP gety(){
		return IntegerVector::create( y ) ;
	}

private:
	int y  ;
	SEXP xx ;

} ;
'
code <- '
	Foo& foo = (Foo&) x ;
	return foo.gety() ;
'

df <- data.frame( x = 1:5, y = 1:5 )
fx <- cxxfunction( signature( x = "data.frame" ), code, include = inc, 
plugin = "Rcpp" )

I get a garbage value, and neither constructor runs (no "hello" is printed):

 > fx( df )
[1] 35966160
 > fx( df )
[1] 35966160


Had you used the C++ cast "static_cast" instead of a C-style cast, the
compiler would have reported the error:

file10d63af1.cpp: In function ‘SEXPREC* file10d63af1(SEXPREC*)’:
file10d63af1.cpp:49: error: invalid static_cast from type ‘SEXPREC*’ to 
type ‘Foo&’
make: *** [file10d63af1.o] Error 1

ERROR(s) during compilation: source code errors or compiler 
configuration errors!
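
To see the difference outside of R, here is a minimal sketch in plain C++. FakeSexp is a hypothetical stand-in for SEXP (so that no R headers are needed), and can_static_cast and y_via_constructor are helper names made up for this illustration. It checks at compile time that static_cast refuses to treat the pointer as a Foo, and that going through the constructor is the conversion that actually initialises y.

```cpp
#include <cassert>
#include <type_traits>
#include <utility>

// Hypothetical stand-in for R's opaque SEXP pointer (assumption: no R headers here).
struct FakeSexpRec;
using FakeSexp = FakeSexpRec*;

class Foo {
public:
    Foo(FakeSexp x) : y(5), xx(x) {}
    int gety() const { return y; }
private:
    int y;
    FakeSexp xx;
};

// Primary template: assume static_cast<To>(From) is not well-formed.
template <typename From, typename To, typename = void>
struct can_static_cast : std::false_type {};

// Chosen only when the static_cast expression compiles.
template <typename From, typename To>
struct can_static_cast<From, To,
    decltype(void(static_cast<To>(std::declval<From>())))> : std::true_type {};

// The compiler will not reinterpret a pointer as a Foo object...
static_assert(!can_static_cast<FakeSexp, Foo&>::value,
              "a pointer cannot be static_cast to Foo&");
// ...but it will build a Foo from the pointer through the constructor.
static_assert(std::is_constructible<Foo, FakeSexp>::value,
              "Foo can be constructed from the pointer");

// Constructs a real Foo, so y holds the value set by the constructor.
int y_via_constructor(FakeSexp x) {
    Foo foo(x);
    return foo.gety();
}
```

With this sketch, y_via_constructor( nullptr ) returns 5, the value set by the constructor, rather than whatever bytes happen to sit behind the pointer.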


> However, you can
> see that the code is not safe.

It is more than "not safe", it is just plain wrong.

> there's no guarantee that myframe is a
> data frame. This is my first question, is there any way to check type
> of the input SEXP? Or is there any better way to do this?

RcppFrame belongs to what we call the "classic" API, which is indeed
largely inefficient because it copies data all the time.

The new API, and in particular the class Rcpp::DataFrame, is much more
efficient. For example, the constructor

Rcpp::DataFrame( SEXP )

will not make a copy of the SEXP you pass in.
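
The difference between copying on construction and wrapping an existing object can be sketched without R. Everything below is a hypothetical stand-in (Column, CopyingFrame, WrappingFrame and wrapper_shares_storage are not Rcpp or R names), and it only illustrates the general idea, not how Rcpp is implemented.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for data owned elsewhere; not an R or Rcpp type.
using Column = std::vector<int>;

// Copies its input on construction, so the cost grows with the data size.
class CopyingFrame {
public:
    explicit CopyingFrame(const Column& src) : data_(src) {}   // deep copy
    int at(std::size_t i) const { return data_[i]; }
private:
    Column data_;
};

// Keeps only a pointer to the caller's data, so construction is cheap.
class WrappingFrame {
public:
    explicit WrappingFrame(Column* src) : data_(src) {}        // no copy
    int at(std::size_t i) const { return (*data_)[i]; }
    bool wraps(const Column& c) const { return data_ == &c; }
private:
    Column* data_;
};

// Returns true when the wrapper sees a later change to the source
// while the copy still holds the old value.
bool wrapper_shares_storage() {
    Column col = {1, 2, 3, 4, 5};
    CopyingFrame copy(col);
    WrappingFrame view(&col);
    col[0] = 42;   // modify the original after both objects exist
    return view.at(0) == 42 && copy.at(0) == 1 && view.wraps(col);
}
```

For instance, assert( wrapper_shares_storage() ) passes: the wrapper refers to the same storage as the original, while the copy is detached from it.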

You can find example code for Rcpp::DataFrame in the unit tests:

 > system.file( "unitTests", "runit.DataFrame.R", package = "Rcpp" )


> Secondly, I'm wondering why the POSIXct column in my data frame
> appears as double when I pass a data frame as an argument into a
> function or when I read it out from the global environment map. Is there
> any way to ensure it appears as RcppDatetime? Thank you.
>
> Robert

Someone else will pick this up.


-- 
Romain Francois
Professional R Enthusiast
+33(0) 6 28 91 30 30
http://romainfrancois.blog.free.fr
|- http://bit.ly/98Uf7u : Rcpp 0.8.1
|- http://bit.ly/c6YnCi : graph gallery collage
`- http://bit.ly/bZ7ltC : inline 0.3.5


