[Rcpp-devel] Rcpp version of %in%
Romain Francois
romain at r-enthusiasts.com
Thu Nov 15 17:52:10 CET 2012
Hello,
I've committed an Rcpp version of %in%.
For example:
require(Rcpp)
require(microbenchmark)
sourceCpp( code = '
#include <Rcpp.h>
using namespace Rcpp ;
// [[Rcpp::export]]
LogicalVector in_( CharacterVector x, CharacterVector table){
return in( x, table ) ;
}
' )
`%in++%` <- in_
> c("a", "ad") %in++% letters
[1] TRUE FALSE
In terms of performance:
> xx <- sample( sample(letters, 15 ), 1000000, replace = TRUE )
> microbenchmark(
+ xx %in% letters,
+ xx %in++% letters,
+ in_( xx, letters )
+ )
Unit: milliseconds
               expr      min       lq   median       uq      max
1  in_(xx, letters) 12.79488 12.85228 12.88214 15.33067 44.65161
2   xx %in% letters 31.96431 34.43951 34.90381 35.37460 65.68226
3 xx %in++% letters 12.81114 12.86457 12.91557 15.06667 16.20493
The tool here is unordered_set: we don't care where the data is in the
table, we just want to know whether it is there.
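To illustrate the idea outside of Rcpp, here is a minimal sketch in plain C++. It is not the code committed to Rcpp; the function name in_set and the use of std::vector<std::string> in place of CharacterVector are assumptions made for the example. The table is loaded into a std::unordered_set once, and each element of x is then tested with an average constant-time lookup.

```cpp
#include <string>
#include <unordered_set>
#include <vector>

// Hypothetical helper (not Rcpp's implementation): for each element of x,
// report whether it occurs anywhere in table.
std::vector<bool> in_set(const std::vector<std::string>& x,
                         const std::vector<std::string>& table) {
    // Build the hash set once; positions in table are deliberately discarded.
    const std::unordered_set<std::string> lookup(table.begin(), table.end());

    std::vector<bool> result;
    result.reserve(x.size());
    for (const std::string& value : x) {
        // For a set, count() returns 0 or 1, so it acts as a membership test.
        result.push_back(lookup.count(value) > 0);
    }
    return result;
}
```

Called with x = {"a", "ad"} and a table holding the lowercase letters, this should give true for "a" and false for "ad", mirroring the %in++% example above. Because only membership matters, an unordered set is enough; a map from value to position would only be needed for something like match().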
It might be interesting at some point to check alternatives to the standard
hashing functions, e.g. play with sparsehash:
http://code.google.com/p/sparsehash/
Romain
--
Romain Francois
Professional R Enthusiast
+33(0) 6 28 91 30 30
R Graph Gallery: http://gallery.r-enthusiasts.com
`- http://bit.ly/SweN1Z : SuperStorm Sandy
blog: http://romainfrancois.blog.free.fr
|- http://bit.ly/RE6sYH : OOP with Rcpp modules
`- http://bit.ly/Thw7IK : Rcpp modules more flexible