[Rcpp-devel] Passing large data frame
Romain Francois
romain at r-enthusiasts.com
Mon Jun 14 12:36:18 CEST 2010
Hi,
On 14/06/10 05:38, R_help Help wrote:
>
> Hi,
>
> I have a question about passing a large data frame into Rcpp. If we
> consider the following function
>
> foo(SEXP myframe) {
>
> RcppFrame&fr_ref = (RcppFrame&) myframe;
> }
>
> This somehow seems to work without calling a constructor, which would
> otherwise copy the large data frame into an RcppFrame object.
This code is very wrong; you are just getting lucky with the internal
representation of RcppFrame.
Consider:
require( Rcpp )
require( inline )

inc <- '
class Foo{
public:
    Foo( SEXP x) : y(5), xx(x) {
        Rprintf( "hello" ) ;
    }
    Foo( ) : y(6), xx(R_NilValue) {
        Rprintf( "hello from default" );
    }
    inline SEXP gety(){
        return IntegerVector::create( y ) ;
    }
private:
    int y ;
    SEXP xx ;
} ;
'

code <- '
    Foo& foo = (Foo&) x ;
    return foo.gety() ;
'

df <- data.frame( x = 1:5, y = 1:5 )
fx <- cxxfunction( signature( x = "data.frame" ), code, include = inc,
    plugin = "Rcpp" )
I get:
> fx( df )
[1] 35966160
> fx( df )
[1] 35966160
With the C++ cast static_cast instead, the compiler reports the error:
file10d63af1.cpp: In function ‘SEXPREC* file10d63af1(SEXPREC*)’:
file10d63af1.cpp:49: error: invalid static_cast from type ‘SEXPREC*’ to
type ‘Foo&’
make: *** [file10d63af1.o] Error 1
ERROR(s) during compilation: source code errors or compiler
configuration errors!
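To see the difference in isolation, here is a minimal standalone sketch that uses neither R nor Rcpp. Node and Handle are hypothetical stand-ins for SEXPREC and SEXP, and can_static_cast and gety_checked are helper names made up for this illustration. The C-style cast compiles only because it falls back to reinterpret_cast, so the bytes of the pointer variable itself get read as if they were a Foo; neither constructor runs. static_cast refuses that conversion, while constructing a Foo from the handle is well-formed and stores only the pointer.

```cpp
#include <cassert>
#include <type_traits>
#include <utility>

// Hypothetical stand-in for R's SEXPREC / SEXP: a node and a pointer to it.
struct Node { int payload; };
typedef Node* Handle;

// Same shape as the Foo class above: it *holds* a handle, it is not one.
class Foo {
public:
    explicit Foo(Handle x) : y(5), xx(x) {}
    int gety() const { return y; }
    Handle get() const { return xx; }
private:
    int y;
    Handle xx;
};

// Compile-time check: is static_cast<To>(expression of type From) well-formed?
template <typename To, typename From, typename = void>
struct can_static_cast : std::false_type {};

template <typename To, typename From>
struct can_static_cast<To, From,
    decltype(void(static_cast<To>(std::declval<From>())))>
    : std::true_type {};

// The conversion the compiler rejected above: a handle is not a Foo&.
static_assert(!can_static_cast<Foo&, Handle&>::value,
              "a Handle lvalue cannot be static_cast to Foo&");
// Constructing a Foo from the handle is fine.
static_assert(can_static_cast<Foo, Handle&>::value,
              "a Foo can be constructed from a Handle");

// Correct route: run the constructor. Only the pointer is stored,
// so the data behind the handle is not copied.
inline int gety_checked(Handle x) {
    assert(x != nullptr);
    Foo foo(x);
    return foo.gety();
}
```

With a local `Node n = {42};`, calling `gety_checked(&n)` returns 5, the value set by the constructor, and `Foo(&n).get()` is the same pointer that was passed in.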
> However, you can
> see that the code is not safe.
It is more than "not safe", it is just plain wrong.
> there's no guarantee that myframe is a
> data frame. This is my first question: is there any way to check the type
> of the input SEXP? Or is there a better way to do this?
RcppFrame is a class from what we call the "classic" API, which is indeed
largely inefficient because it copies data all the time.
The new API, and in particular the class Rcpp::DataFrame, is much more
efficient. For example, the constructor
Rcpp::DataFrame( SEXP )
does not copy the SEXP you pass in.
You can find example code for Rcpp::DataFrame in the unit tests:
> system.file( "unitTests", "runit.DataFrame.R", package = "Rcpp" )
> Secondly, I'm wondering why the POSIXct column in my data frame
> appears as double when I pass a data frame as an argument into a
> function or when I read it out from global environment map? Is there
> any way to ensure it appears as RcppDatetime? Thank you.
>
> Robert
Someone else will pick this up.
--
Romain Francois
Professional R Enthusiast
+33(0) 6 28 91 30 30
http://romainfrancois.blog.free.fr
|- http://bit.ly/98Uf7u : Rcpp 0.8.1
|- http://bit.ly/c6YnCi : graph gallery collage
`- http://bit.ly/bZ7ltC : inline 0.3.5