[Rcpp-devel] Joining each row of CharacterMatrix to return a CharacterVector?
Dirk Eddelbuettel
edd at debian.org
Tue Dec 11 00:09:05 CET 2012
Hi Pete,
On 11 December 2012 at 09:43, hickey at wehi.EDU.AU wrote:
 I preface this by stating that I'm very much a Rcpp beginner who is comfortable
 in R but I've never before used C++. I'm working through the Rcpp documentation
 but haven't been able to answer my question.

 I've written an Rcpp (v0.10.1) function f that takes as input a CharacterMatrix
 X. X has 20 million rows and 100 columns. For each row of X the function alters
 certain entries of that row according to rules governed by some other input
 variables. f returns the updated version of X. This function works as I'd like
 it to:
 # a toy example with nrow = 2, ncol = 2
 > X <- matrix('A', ncol = 2, nrow = 2)
 > X
      [,1] [,2]
 [1,] "A"  "A"
 [2,] "A"  "A"
 > X <- f(X, other_input_variables)
 > X
      [,1] [,2]
 [1,] "Z"  "A"
 [2,] "z"  "A"

 However, instead of f returning a CharacterMatrix as it currently does, I'd
 like to return a CharacterVector Y, where each element of Y is a "collapsed"
 row of the updated X.

 I can achieve the desired result in R by using:
 Y <- apply(X=X, MARGIN = 1, FUN = function(x){paste0(x, collapse = '')})
 > Y
 [1] "ZA" "zA"

 but I wondered whether this "joining" is likely to be more efficiently
 performed within my function f? If so, how do I join the 100 individual
 character entries of a row of the CharacterMatrix X into a single string that
 will then comprise an element of the returned CharacterVector Y?
Ah, the joy of working with character strings/vectors/pointers :)
You certainly can. And there will be a lot of old, bad, ... tutorials out
there. I can't right now think of a good tutorial to point you to, other
than the perennial "C++ Annotations" by Brokken, which is at the same time
good, current, up-to-date and free (!!), so maybe you should continue with
the little 2 x 2 and 3 x 3 examples:
i) loop over a row, first init the target string to be ""
ii) assign each element of the matrix to a string
iii) append, which can be as easy as using the + for two strings
iv) accumulate the result strings in a vector of strings
That should work, does not require pointers, free, malloc, ... You can
optimize later.
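
Steps i) to iv) above could be sketched roughly as follows. This is only a minimal illustration in plain standard C++, not the poster's function f: the helper name collapse_rows and the use of nested std::vector<std::string> as a stand-in for the matrix are assumptions made so the example compiles without R. Inside an Rcpp function the same double loop would presumably run over the rows and columns of the CharacterMatrix, with each element converted to a std::string (for example via Rcpp::as<std::string>) and the results stored in a CharacterVector.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical helper (name and container types are illustrative only):
// collapse every row of a matrix of strings into one string per row.
std::vector<std::string>
collapse_rows(const std::vector<std::vector<std::string>>& X) {
    std::vector<std::string> Y;       // iv) the result strings are collected here
    Y.reserve(X.size());              // one result per row

    for (std::size_t i = 0; i < X.size(); ++i) {
        std::string row = "";         // i) init the target string to ""
        for (std::size_t j = 0; j < X[i].size(); ++j) {
            std::string cell = X[i][j];   // ii) assign the element to a string
            row += cell;                  // iii) append (row = row + cell also works)
        }
        Y.push_back(row);             // iv) accumulate the finished row
    }
    return Y;
}
```

With the toy 2 x 2 matrix from the question, a call such as collapse_rows({{"Z", "A"}, {"z", "A"}}) yields the two strings "ZA" and "zA". No pointers, malloc or free are involved; std::string manages its own memory.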
Hope this helps, Dirk

Dirk Eddelbuettel  edd at debian.org  http://dirk.eddelbuettel.com