[Rcpp-devel] Rcpp "version" of R's match function
Romain Francois
romain at r-enthusiasts.com
Fri Nov 16 22:51:27 CET 2012
Le 16/11/12 18:03, Hadley Wickham a écrit :
>> Ouch. Simon wins:
>>
>> expr min lq median uq max
>> 1 fmatch(xx, letters) 59.4727 60.29989 74.18049 77.94288 112.2938
>> 2 match__(xx, letters) 137.3878 137.77486 138.33766 152.14018 193.3748
>> 3 match_(xx, letters) 147.7115 148.36442 149.20221 162.82343 171.3181
>> 4 match(xx, letters) 288.4345 293.10380 294.58833 296.26125 333.1210
>
> But only because he builds up the hash table once - the others have to
> do it every time.
>
> Hadley
I don't think that's it. here, we have to create a table for 26 entries.
this takes no time at all:
sourceCpp( code = '
#include <Rcpp.h>
using namespace Rcpp ;
// [[Rcpp::export]]
void match_create_hash( const CharacterVector& x, const CharacterVector&
table){
typedef std::tr1::unordered_map<SEXP,int> MAP ;
typedef MAP::value_type VALUE ;
// populate the hash
MAP hash ;
int n = table.size() ;
SEXP* ptr = get_string_ptr(table) ;
for( int i=0 ; i<n; i++){
hash.insert( std::make_pair<SEXP,int>(ptr[i], i + 1) ) ;
}
}
' )
> xx <- sample( letters, 10000000, replace = TRUE )
> system.time( match_create_hash( xx, letters ) )
utilisateur système écoulé
0.000 0.000 0.001
Also, in this example, there is no difference between the first and
second time fmatch is used:
> system.time( fmatch( xx, letters ) )
utilisateur système écoulé
0.060 0.000 0.061
> system.time( fmatch( xx, letters ) )
utilisateur système écoulé
0.059 0.000 0.060
> system.time( fmatch( xx, letters ) )
utilisateur système écoulé
0.060 0.000 0.059
...
This is about looking up in the hash table. We need to do it 10 million
times in this example, so I guess his hashing function performs better.
...
Not giving up
--
Romain Francois
Professional R Enthusiast
+33(0) 6 28 91 30 30
R Graph Gallery: http://gallery.r-enthusiasts.com
`- http://bit.ly/SweN1Z : SuperStorm Sandy
blog: http://romainfrancois.blog.free.fr
|- http://bit.ly/RE6sYH : OOP with Rcpp modules
`- http://bit.ly/Thw7IK : Rcpp modules more flexible
More information about the Rcpp-devel
mailing list