[Rcpp-devel] Using data-frames as sets of rows (e.g. using R data-frames as lookup tables in C++)

Romain Francois romain at r-enthusiasts.com
Fri Sep 6 16:36:58 CEST 2013


Hello, 

I have not looked at this in detail, but you might want to have a look at RcppExtras on GitHub. 

It is something Hadley and I are working on for manipulating data frames. 

We have, e.g., an implementation of unique that is much faster than R's silly built-in version, which pastes the columns of each row together into one string and then calls unique on those strings. 
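
To make the idea concrete, here is a minimal sketch in plain C++ (no Rcpp) of finding the unique rows of column-wise data by comparing rows column by column, with no string pasting. It is only an illustration under stated assumptions, not the RcppExtras implementation: the names Columns, RowLess and unique_rows are made up for this example, all columns are assumed to be numeric, and no NaN values are assumed.

```cpp
#include <cassert>
#include <cstddef>
#include <set>
#include <vector>

// Column-wise storage, as in a data frame: one vector per column.
// (Hypothetical type for this sketch; all columns are assumed numeric.)
typedef std::vector<std::vector<double> > Columns;

// Orders row indices by comparing the rows they refer to, column by column.
// Assumes no NaN values, which would break the strict weak ordering.
struct RowLess {
    const Columns& cols;
    explicit RowLess(const Columns& c) : cols(c) {}
    bool operator()(std::size_t a, std::size_t b) const {
        for (std::size_t j = 0; j < cols.size(); ++j) {
            if (cols[j][a] < cols[j][b]) return true;
            if (cols[j][b] < cols[j][a]) return false;
        }
        return false;  // the two rows are identical
    }
};

// Returns the indices of the first occurrence of each distinct row.
std::vector<std::size_t> unique_rows(const Columns& cols) {
    std::vector<std::size_t> keep;
    if (cols.empty()) return keep;
    const std::size_t n = cols[0].size();
    for (std::size_t j = 0; j < cols.size(); ++j) {
        assert(cols[j].size() == n);  // every column must have n rows
    }
    RowLess less(cols);
    std::set<std::size_t, RowLess> seen(less);
    for (std::size_t i = 0; i < n; ++i) {
        // insert() reports whether an equal row was already present
        if (seen.insert(i).second) keep.push_back(i);
    }
    return keep;
}
```

Only row indices are stored in the set, so no row is copied or converted to a string; a hash-based container would be another option.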

The goal of RcppExtras is to expose C++ algorithms that facilitate working with data frames. 

The readme.md file on GitHub has some usage examples. 

Romain

On 6 Sept 2013, at 16:20, Mark Clements <mark.clements at ki.se> wrote:

> By my understanding, Rcpp is better suited to working with data-frames as columns rather than working with data-frames as a set of rows. However, occasionally it may be useful to work with the set of rows. How have others considered this use case?
>  
> [As a motivating example based on simulations in C++, we want to pass data-frames from R for use as look-up tables in C++ (cf. passing transformed data back to R). The STL container std::map is well suited to this task, particularly as it provides ordered keys.
>  
> In the following code, we define a template class Table1D which reads in a data-frame and defines the map key with the first column and the map value with the second column. The only sophistication here is (i) using std::greater as a comparison function and (ii) using Rcpp traits. Note that calling the function from R is only shown for demonstration.
>  
> require(inline)
> lookup <- rcpp(signature(df="data.frame",x="numeric"),
>             body="
>   Table1D<double,double> table = Table1D<double,double>(df);
>   return wrap(table(as<double>(x)));
> ",
>             includes="
> #include <map>
> #include <functional>
> template <class Index, class Outcome>
> class Table1D {
> public:
>   std::map<Index,Outcome,std::greater<Index> > data;
>   Table1D(DataFrame df, int iIndex = 0, int iOutcome = 1) {
>     Vector<Rcpp::traits::r_sexptype_traits<Index>::rtype> df0 = df[iIndex];
>     Vector<Rcpp::traits::r_sexptype_traits<Outcome>::rtype> df1 = df[iOutcome];
>     for (size_t i=0; i<df0.size(); i++) {
>       data[df0[i]] = df1[i];
>     }
>   }
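>   virtual Outcome lookup(Index index) {
>     // with std::greater, lower_bound finds the largest key <= index
>     typename std::map<Index,Outcome,std::greater<Index> >::iterator it = data.lower_bound(index);
>     if (it == data.end()) Rcpp::stop(\"index is below the smallest key\");
>     return it->second;
>   }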
>   virtual Outcome operator()(Index index) {
>     return lookup(index);
>   }
> };
> ")
> lookup(data.frame(as.numeric(1:1000000),10.0*as.numeric(1:1000000)), 12345.5)
>  
> See also https://github.com/mclements/microsimulation/blob/master/src/rcpp_table.h for an extension to higher dimensions.]
>  
> Sincerely, Mark.
> _______________________________________________
> Rcpp-devel mailing list
> Rcpp-devel at lists.r-forge.r-project.org
> https://lists.r-forge.r-project.org/cgi-bin/mailman/listinfo/rcpp-devel
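
For readers who want to try the lookup logic of the quoted Table1D without R, the following is a minimal standalone sketch in plain C++. It keeps the two ingredients described in the quoted message (a std::map ordered with std::greater, queried with lower_bound) but takes std::vector inputs instead of a DataFrame. The class name StepTable and the choice to throw when the query lies below the smallest key are assumptions made for this example, not part of the original code.

```cpp
#include <cstddef>
#include <functional>
#include <map>
#include <stdexcept>
#include <vector>

// Step-function lookup table: lookup(x) returns the value stored at the
// largest key that is <= x. (Hypothetical name; a sketch, not the original.)
template <class Index, class Outcome>
class StepTable {
    typedef std::map<Index, Outcome, std::greater<Index> > Map;
    Map data_;

public:
    StepTable(const std::vector<Index>& keys,
              const std::vector<Outcome>& values) {
        if (keys.size() != values.size())
            throw std::invalid_argument("keys and values differ in length");
        for (std::size_t i = 0; i < keys.size(); ++i)
            data_[keys[i]] = values[i];
    }

    Outcome lookup(Index x) const {
        // Keys are held in descending order, so lower_bound returns the
        // first key that is not greater than x: the largest key <= x.
        typename Map::const_iterator it = data_.lower_bound(x);
        if (it == data_.end())  // x is smaller than every key
            throw std::out_of_range("x is below the smallest key");
        return it->second;
    }
};
```

With keys 1, 2, 3 mapped to 10, 20, 30, lookup(2.5) gives 20, and any query at or above 3 gives 30. Checking against end() before dereferencing is the one design point worth keeping in any variant, since a query below the smallest key has no matching row.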