[Rcpp-devel] Accessing data frame information within Rcpp (question from Stack Overflow)

Dirk Eddelbuettel edd at debian.org
Mon Jun 11 03:36:34 CEST 2012


On 9 June 2012 at 17:28, T P wrote:
| 
| I'm reposting the following question relating to using data frames in Rcpp - I
| originally put it up on StackOverflow but Dirk directed me to post it here
| instead. I'm interested in whether there's a resolution to this issue, and if
| not, whether there are future plans to resolve it.
| 
| This is my first post on here, so go easy - I'm hoping my query will get a
| better response than on SO!
| 
| 
| In the R / Rcpp code shown below (a toy example), I pass the
| data frame mydf to the Rcpp code (and pick it up as DF), and then count the
| number of age values that exceed 21, and the number of name values that equal
| "Bob" or "Eve". The two answers (4 and 2) are returned as a list, as shown at
| the end of the code. All hopefully self-explanatory.
| 
| Here's my question: Rcpp clearly understands DF["name"] and DF["age"] as being
| the columns name and age in DF - that's great. Given that this notation is
| meaningful, what notation can we use to refer directly to the individual
| elements in DF, so that we don't need to generate intermediate vectors (i.e.
| the std::vectors name and age in the code below)? The reason I ask is that in
| practice the input data frame(s) may well have a much, much greater number of
| columns, and it feels unwieldy to have to map each one individually to a vector
| given that the information is clearly already contained within the DF object.
| If we had to do this to use the columns of a data frame in R, there'd be a
| riot!
| 
| I imagine an answer to this question will be valuable to all those who use Rcpp
| for complex tasks where data frames need passing (which is presumably
| everything beyond a certain level of complexity), so I thought I'd map things
| out in detail. Hope the question is clear, and many thanks in advance for your
| help. :)

1) Please post with full names, preferably also with some affiliation. "T P"
does not really qualify.

2) You asked several related questions on StackOverflow; I fear you may
simply not understand C++ well enough to appreciate why what you ask for is
both difficult and not necessarily useful to a C++ programmer.  Rcpp
is /not/ a simple 'R to C++' translation tool. Rather, it is a device to enable
interoperability.  But when you're in a C++ context ... you are bound by C++
rules.  Hence a data.frame is treated as a collection of vectors, each with
its own type, and so on.
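
To make the "collection of vectors" point concrete, here is a minimal sketch in plain C++17, deliberately without Rcpp. The names Column, Frame, column(), make_example_frame(), count_older_than() and count_named() are all hypothetical and exist only for this illustration; this is not how Rcpp::DataFrame is implemented. What it tries to show is that the column type has to be spelled out at the point of access, because the compiler cannot infer it from a column name that is only known at run time.

```cpp
#include <cstddef>
#include <map>
#include <string>
#include <variant>
#include <vector>

// Hypothetical stand-in for a data frame: every column is one typed vector.
using Column = std::variant<std::vector<double>, std::vector<std::string>>;
using Frame  = std::map<std::string, Column>;

// Fetch a column as a concrete type T. The caller must name T at compile
// time. std::get throws std::bad_variant_access if the stored type differs,
// and Frame::at throws std::out_of_range if the column does not exist.
template <typename T>
const std::vector<T>& column(const Frame& df, const std::string& name) {
    return std::get<std::vector<T>>(df.at(name));
}

// Same toy data as mydf in the quoted R code.
Frame make_example_frame() {
    Frame df;
    df["name"] = std::vector<std::string>{"Amy", "Bob", "Cal", "Dan",
                                          "Eve", "Fay", "Gus"};
    df["age"]  = std::vector<double>{24, 17, 31, 28, 19, 20, 25};
    return df;
}

// Count the rows whose "age" value exceeds the threshold.
std::size_t count_older_than(const Frame& df, double threshold) {
    std::size_t n = 0;
    for (double a : column<double>(df, "age")) {
        if (a > threshold) ++n;
    }
    return n;
}

// Count the rows whose "name" value equals either of the two given names.
std::size_t count_named(const Frame& df, const std::string& first,
                        const std::string& second) {
    std::size_t n = 0;
    for (const std::string& s : column<std::string>(df, "name")) {
        if (s == first || s == second) ++n;
    }
    return n;
}
```

The column() helper returns a reference, so no copy is made, but the type still has to be stated for each column that is used. In Rcpp the analogous step is pulling a column out of the DataFrame into a typed vector before looping over it.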

Sorry, no silver bullet.

Dirk

| 
| 
| library(inline)
| 
| mydf = data.frame(name=c("Amy","Bob","Cal","Dan","Eve","Fay","Gus"),
|                   age=c(24,17,31,28,19,20,25), stringsAsFactors=FALSE)
| 
| 
| testfunc1 = cxxfunction(
|     signature(DFin = "data.frame"),
|     plugin = "Rcpp",
|     body = '
|         Rcpp::DataFrame DF(DFin);
|         std::vector<std::string> name =
|                          Rcpp::as< std::vector<std::string> >(DF["name"]);
|         std::vector<int> age =
|                          Rcpp::as< std::vector<int> >(DF["age"]);
|         int n = name.size();
|         int counter1 = 0;
|         int counter2 = 0;
|         for (int i = 0; i < n; i++) {
|             if (age[i] > 21) {
|                 counter1++;
|             }
|             if ((name[i] == "Bob") || (name[i] == "Eve")) {
|                 counter2++;
|             }
|         }
|         return(Rcpp::List::create( _["counter1"] = counter1,
|                                    _["counter2"] = counter2 ));
|         ')
| 
| out = testfunc1(mydf)
| print(out)
| 
| The output in out is of course:
| 
| $counter1
| [1] 4
| 
| $counter2
| [1] 2
| 
| 
-- 
Dirk Eddelbuettel | edd at debian.org | http://dirk.eddelbuettel.com  

