[Rcpp-devel] overloaded methods in modules

Thu Nov 18 22:25:18 CET 2010

Hello,

Up to now, a class exposed by a module could only have one method of a 
given name. I've now lifted that restriction by applying the same sort 
of trickery as in this thread: 
http://article.gmane.org/gmane.comp.lang.r.rcpp/929

Consider this simple C++ class :

class Randomizer {
public:

     Randomizer(){}

	NumericVector get( int n ){
		RNGScope scope ;
		return runif( n, 0.0, 1.0 );
	}

	NumericVector get( int n, double min ){
		RNGScope scope ;
		return runif( n, min, 1.0 );
	}

	NumericVector get( int n, double min, double max ){
		RNGScope scope ;
		return runif( n, min, max );
	}

} ;

We'd like to be able to use it from R like we would in C++:

r <- new( Randomizer )
r$get(5)
r$get(5, .5)
r$get(5, 0, 10)

It is a bit more work than with the constructor because the expression 
&Randomizer::get is ambiguous on its own: we need to help the compiler 
pick one of the three overloads of "get".

This is one way to do it:

RCPP_MODULE(mod){

     // helping the compiler disambiguate things
     NumericVector (Randomizer::*get_1)(int) = &Randomizer::get ;
     NumericVector (Randomizer::*get_2)(int,double) = &Randomizer::get ;
     NumericVector (Randomizer::*get_3)(int,double,double) = &Randomizer::get ;

	class_<Randomizer>( "Randomizer" )

	    .default_constructor()

		.method( "get" , get_1 )
		.method( "get" , get_2 )
		.method( "get" , get_3 )
		;

}

Another way is :

RCPP_MODULE(mod){

	class_<Randomizer>( "Randomizer" )

	    .default_constructor()

		.method( "get" , ( NumericVector (Randomizer::*)(int) )( &Randomizer::get ) )
		.method( "get" , ( NumericVector (Randomizer::*)(int,double) )( &Randomizer::get ) )
		.method( "get" , ( NumericVector (Randomizer::*)(int,double,double) )( &Randomizer::get ) )
		;

}

We can probably get smarter about this. If someone has an idea, please 
come forward.
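
One possible direction, sketched here in plain C++ with no Rcpp dependency 
so it can be compiled on its own: a small helper that fixes the argument 
list and lets the compiler deduce the class and return type. The names 
Filler, overload and demo are hypothetical and not part of Rcpp, and the 
helper assumes a compiler with C++0x/C++11 variadic templates.

```cpp
#include <cassert>
#include <vector>

// Hypothetical stand-in for Randomizer: same overload shape, but
// deterministic and free of any Rcpp types.
class Filler {
public:
    std::vector<double> get(int n) { return std::vector<double>(n, 0.0); }
    std::vector<double> get(int n, double value) {
        return std::vector<double>(n, value);
    }
};

// Hypothetical helper: Args... pins down the parameter list, so only one
// overload can match and R and C are deduced from it.
template <typename... Args>
struct overload {
    template <typename R, typename C>
    static auto of(R (C::*pm)(Args...)) -> decltype(pm) { return pm; }
};

// Usage: both spellings select a specific overload of Filler::get.
void demo() {
    Filler f;
    // explicit cast, the named-cast equivalent of the C-style cast above
    auto one = static_cast<std::vector<double> (Filler::*)(int)>(&Filler::get);
    // helper: only the argument types are spelled out
    auto two = overload<int, double>::of(&Filler::get);
    assert((f.*one)(3).size() == 3);
    assert((f.*two)(2, 1.5)[1] == 1.5);
}
```

With something like this, the argument types would be the only thing to 
write out by hand; whether it fits into .method as is has not been tried.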

The examples above are for the most trivial form of overloading, where 
the decision is only based on the number of arguments. But we can also 
dispatch based on the arguments themselves. This is the same idea as for 
the constructor and is achieved by passing an extra argument to .method 
that controls whether a given method is valid for the supplied arguments.

I'm running out of good examples, but here is one that shows dispatch 
based on the arguments:

The C++ class:

class Randomizer {
public:

     Randomizer(){}

	NumericVector get( int n ){
		RNGScope scope ;
		return runif( n, 0.0, 1.0 );
	}

	List get( IntegerVector n ){
		RNGScope scope ;
		int size = n.size() ;
		List res( size) ;
		for( int i=0; i<size; i++){
		    res[i] = runif(n[i] , 0.0, 1.0 ) ;
		}
		return res ;
	}

} ;

From R, we will call this as such:

r$get(5L)
r$get(c(5,10))

and we want to use the first get when the argument is an integer vector 
of length one, the other one otherwise.

For this we need this little function (it needs to have this exact 
signature) :

bool get_int_valid(SEXP* args, int nargs){
     if( nargs != 1 ) return false ;
     if( TYPEOF(args[0]) != INTSXP ) return false ;
     return ( LENGTH(args[0]) == 1 ) ;
}

whose job is to decide if the first version is ok. If not, the second is 
tried, and so on. Methods are stored in a std::vector, so they are 
guaranteed to be scanned in the order in which they are declared (the 
order of the .method calls).
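
To make the ordering rule concrete, here is a minimal standalone sketch of 
"first valid candidate wins". It is not the Rcpp implementation: Arg, 
Candidate, dispatch and the other names are hypothetical, a plain 
std::vector<int> stands in for a SEXP, and a method registered without a 
validator is assumed to accept anything.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical stand-in for an R argument (the real code receives SEXP).
typedef std::vector<int> Arg;
typedef bool (*ValidFn)(const Arg* args, int nargs);
typedef std::string (*Handler)(const Arg* args, int nargs);

// One registered overload: a validator and the code to run.
struct Candidate { ValidFn valid; Handler run; };

// Same shape as get_int_valid: exactly one argument, of length one.
bool scalar_valid(const Arg* args, int nargs) {
    return nargs == 1 && args[0].size() == 1;
}
// Assumption of this sketch: no validator means "always valid".
bool always_valid(const Arg*, int) { return true; }

std::string run_scalar(const Arg*, int) { return "scalar"; }
std::string run_vector(const Arg*, int) { return "vector"; }

// Scan the candidates in registration order; the first valid one is used.
std::string dispatch(const std::vector<Candidate>& table,
                     const Arg* args, int nargs) {
    for (std::size_t i = 0; i < table.size(); ++i) {
        if (table[i].valid(args, nargs)) return table[i].run(args, nargs);
    }
    return "no match";
}

// Registration order matters: the restrictive candidate goes first.
std::vector<Candidate> make_table() {
    std::vector<Candidate> table;
    Candidate scalar = { &scalar_valid, &run_scalar };
    Candidate fallback = { &always_valid, &run_vector };
    table.push_back(scalar);
    table.push_back(fallback);
    return table;
}
```

In this sketch, registering the catch-all candidate first would shadow the 
scalar one, which is why the most specific version should be declared first.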

So, we would then expose like this:

RCPP_MODULE(mod){

	class_<Randomizer>( "Randomizer" )

	    .default_constructor()

		.method( "get" , ( NumericVector (Randomizer::*)(int) )( &Randomizer::get ) , &get_int_valid )
		.method( "get" , ( List (Randomizer::*)(IntegerVector) )( &Randomizer::get ) )
		;

}

As I said, it is more work, but most often only the first form of 
dispatch (on the number of arguments) is needed.

This is a first shot at it, so it might not be as good as it can be, and 
it deserves testing.

Romain

-- 
Romain Francois
Professional R Enthusiast
+33(0) 6 28 91 30 30
http://romainfrancois.blog.free.fr
|- http://bit.ly/9VOd3l : ZAT! 2010
|- http://bit.ly/c6DzuX : Impressionnism with R
`- http://bit.ly/czHPM7 : Rcpp Google tech talk on youtube