If I remember correctly, environments use hash tables to lookup names. <br><br>In pure R to check if an element L[["key"]] exists I would usually use <br>> "key" %in% names(L)<br>which seems to be reasonably fast. `%in%` works in vectors, too<br>


> keys %in% L<br>returning exactly what Dirks suggestion was supposed to do  (you missed an "if", or alternatively an "or")<br><br>`%in%`

 uses .Internal(match(...)), which computes a temporary hash table as 

far as I can see in the source (src/main/unique.c). I am not sure about 

the implementation of the Hash. In particular I don't know if some kind 

of caching is done, because the lookup is much faster than Dirks 

function (see below) and computing a hash table for one lookup seems 

inefficient. I think it would be good to use the internal `match` 

function of R to do what you want. In Rinternals.h match is exposed as <br>  SEXP Rf_match(SEXP itable, SEXP ix, int nmatch)<br>where nmatch is the value to be returned if there is no match. <br><br>

Below are the results of my little benchmark (see below). First parameter

 of `lookup` is the number elements in L, second is the length of keys, 

third one specifies the probability that a key is in present in the list.<br>

<br>

Regards,<br>

Jonas<br>

<br>##Benchmark results<br>> lookup(10000, 1000, 0.1)<br>                test elapsed relative<br>2         f(L, keys)   0.641   160.25<br>3         g(L, keys)   0.005     1.25<br>1 keys %in% names(L)   0.004     1.00<br>


> lookup(10000, 1000, 1.0)<br>                test elapsed relative<br>2         f(L, keys)   0.643   160.75<br>3         g(L, keys)   0.004     1.00<br>1 keys %in% names(L)   0.004     1.00<br>> lookup(100000, 100, 0.1)<br>


                test elapsed relative<br>2         f(L, keys)   0.656      164<br>3         g(L, keys)   0.004        1<br>1 keys %in% names(L)   0.004        1<br>> lookup(100000, 100, 1.0)<br>                test elapsed relative<br>


2         f(L, keys)   0.654    163.5<br>3         g(L, keys)   0.004      1.0<br>1 keys %in% names(L)   0.004      1.0<br>> lookup(100000, 1, 0.1)<br>                test elapsed relative<br>2         f(L, keys)   0.647   161.75<br>


3         g(L, keys)   0.004     1.00<br>1 keys %in% names(L)   0.005     1.25<br>> lookup(100000, 1, 1.0)<br>                test elapsed relative<br>2         f(L, keys)   0.647   161.75<br>3         g(L, keys)   0.004     1.00<br>


1 keys %in% names(L)   0.005     1.25<br><br>

<br>## Code for benchmarks<br>library(inline)<br>library(rbenchmark)<br>f <- cxxfunction(signature(ls="list", ts="character"), plugin="Rcpp", body='<br>    Rcpp::List lst(ls);<br>    Rcpp::CharacterVector nam = lst.names();<br>


    std::vector<std::string> nm = Rcpp::as< std::vector< std::string > >(nam);<br>    int m = nam.size();<br><br>    Rcpp::CharacterVector tok(ts);<br>    std::vector<std::string> tk = Rcpp::as< std::vector< std::string > >(tok);<br>


    int n = tok.size();<br><br>    Rcpp::LogicalVector   log(n);<br><br>    for (int i=0; i<n; i++) {  // look at all tokens<br>       log[i] = false;                   // assume false, but compare to all names<br>       for (int j=0; j<m; j++) {<br>


                    if(tk[i] == nm[j])<br>              log[i] = 1;     // and note equality if we find it<br>       }<br>    }<br>    return log;<br> ')<br><br>g <- cxxfunction(signature(ls="list", ts="character"), plugin="Rcpp", body='<br>


    Rcpp::List lst(ls);<br>    Rcpp::CharacterVector nam = lst.names();<br><br>        Rcpp::IntegerVector ind(Rf_match(nam,ts,0));<br>    Rcpp::LogicalVector   log(ind);<br>        return log;<br> ')<br><br><br>lookup<-function(nList, nKeys, P) {<br>


    nList<-10000<br>    nKeys<-1000<br>    P<-0.1<br>    keys<-as.character(floor(runif(nKeys, 1, nList/P+1)))<br>    L<-as.list(1:nList)<br>    names(L)<-as.character(1:nList)<br>    benchmark(keys %in% names(L), f(L, keys), g(L, keys), replications=10, columns=c("test", "elapsed", "relative"))<br>


}<br><br>lookup(10000, 1000, 0.1)<br>lookup(10000, 1000, 1.0)<br>lookup(100000, 100, 0.1)<br>lookup(100000, 100, 1.0)<br>lookup(100000, 1, 0.1)<br>lookup(100000, 1, 1.0)<br><br><br><div class="gmail_quote">On Wed, Mar 28, 2012 at 3:34 PM, Dirk Eddelbuettel <span dir="ltr"><<a href="mailto:edd@debian.org">edd@debian.org</a>></span> wrote:<br>


<blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><div class="im"><br>

On 28 March 2012 at 15:17, Ulrich Bodenhofer wrote:<br>

| Thanks for your swift reply, Dirk! Wow, to be frank, that is not what I<br>

| was expecting. In the meantime, I read Section 5.9.6 of "Writing R<br>

| extensions" and I was stunned to see a solution there that is similar to<br>

| the one you propose. I do not know R internals very well, but I cannot<br>

| believe that accesses by names are implemented by tediously searching<br>

| through all names sequentially. There must be some hash table behind,<br>

| right? Otherwise, list accesses would be terribly slow. If all my<br>

<br>

</div>It's a trade-off: Inserting into a hash table has cost, you would enforce<br>

that on every (!!) vector use, whether names are looked up or not.<br>

<br>

Also, vectors are a natural representation as R's SEXP are just linear<br>

collections.<br>

<div class="im"><br>

| assumptions are right, I wonder why the R API does not make this<br>

| mechanism available.<br>

|<br>

| May I add one more question: where can I find the definition of the List<br>

| class and the implementation of its methods in the Rcpp package source<br>

| code? I was looking for this, but got lost. Are you using the method<br>

<br>

</div>The only answer I can give is to ... keep looking.  It's messy, because there<br>

is so much code---but that is because Rcpp does so much stuff.<br>

<br>

But some header files are more important than others -- and the Vector.h file<br>

as well as the Vector/ subdirectory are important.<br>

<br>

As mentioned, Rcpp::List() is a vector too.<br>

<div class="im"><br>

| described in Section 5.9.6 of "Writing R extensions" (which I doubt) or<br>

| something else. I would actually be quite curious to learn more about<br>

| how lists are implemented in Rcpp.<br>

<br>

</div>They are just vectors, hence the linear traversal.<br>

<span class="HOEnZb"><font color="#888888"><br>

Dirk<br>

</font></span><div class="HOEnZb"><div class="h5"><br>

| Thanks and best regards,<br>

| Ulrich<br>

|<br>

|<br>

| If so, I wonder why this mechanism has not been<br>

|<br>

| On 03/28/2012 02:56 PM, Dirk Eddelbuettel wrote:<br>

| > On 28 March 2012 at 13:56, Ulrich Bodenhofer wrote:<br>

| > | My question is the following: is there any way of checking in whether a<br>

| > | component of an Rcpp list (or vector) with a given name exists in this list. If<br>

| > | I simply try accessing a non-existing component, I get an "index out of bounds"<br>

| > | error. Trying to catch a possible exception did not work either. I also browsed<br>

| > | the Rcpp package source code, but unfortunately I got lost. Sorry if this has<br>

| > | been addressed on this list before. At least I googled in many different ways,<br>

| > | but could not find anything. Any ideas? Any help is gratefully appreciated!<br>

| ><br>

| > Good question, and I have the suspicion that we answered that years ago on<br>

| > the list before.<br>

| ><br>

| > Here is a super-pedestrian version. It takes a list, extracts its names() --<br>

| > and should probably test whether names is empty ? -- and then compares these<br>

| > against a vector of tokens, returning a bool for each token:<br>

| ><br>

| > R><br>

| > R>  suppressMessages(library(inline))<br>

| > R><br>

| > R>  f<- cxxfunction(signature(ls="list", ts="character"), plugin="Rcpp", body='<br>

| > +    Rcpp::List lst(ls);<br>

| > +    Rcpp::CharacterVector nam = lst.names();<br>

| > +    std::vector<std::string>  nm = Rcpp::as<  std::vector<  std::string>  >(nam);<br>

| > +    int m = nam.size();<br>

| > +<br>

| > +    Rcpp::CharacterVector tok(ts);<br>

| > +    std::vector<std::string>  tk = Rcpp::as<  std::vector<  std::string>  >(tok);<br>

| > +    int n = tok.size();<br>

| > +<br>

| > +    Rcpp::LogicalVector   log(n);<br>

| > +<br>

| > +    for (int i=0; i<n; i++) {  // look at all tokens<br>

| > +       log[i] = false;                   // assume false, but compare to all names<br>

| > +       for (int j=0; j<m; j++) {<br>

| > +          log[i] = (tk[i] == nm[j]);     // and note equality if we find it<br>

| > +       }<br>

| > +    }<br>

| > +    return log;<br>

| > + ')<br>

| > R>  ll<- list(aa=1,b="b")       ## list with names 'aa' and 'b'<br>

| > R>  f(ll, c("aa", "b", "c"))         ## does it contain 'a' or 'b' or 'c' ?<br>

| > [1] FALSE  TRUE FALSE<br>

| > R>  f(ll, "c")                       ## works with scalars too<br>

| > [1] FALSE<br>

| > R><br>

| ><br>

| > I am sure there is a much cleverer solution one could write...<br>

| ><br>

| > Dirk<br>

| ><br>

|<br>

<br>

</div></div><div class="im HOEnZb">--<br>

R/Finance 2012 Conference on May 11 and 12, 2012 at UIC in Chicago, IL<br>

See agenda, registration details and more at <a href="http://www.RinFinance.com" target="_blank">http://www.RinFinance.com</a><br>

</div><div class="HOEnZb"><div class="h5">_______________________________________________<br>

Rcpp-devel mailing list<br>

<a href="mailto:Rcpp-devel@lists.r-forge.r-project.org">Rcpp-devel@lists.r-forge.r-project.org</a><br>

<a href="https://lists.r-forge.r-project.org/cgi-bin/mailman/listinfo/rcpp-devel" target="_blank">https://lists.r-forge.r-project.org/cgi-bin/mailman/listinfo/rcpp-devel</a><br>

</div></div></blockquote></div><br>