[Rcpp-devel] Joining each row of CharacterMatrix to return a CharacterVector?

hickey at wehi.EDU.AU hickey at wehi.EDU.AU
Tue Dec 11 07:21:22 CET 2012


Thanks very much, Dirk and Steve. 

Always slightly fear-inducing when someone starts their reply with "Ah, the joy of working with X" :) I'll have a go at implementing your suggestion on my two examples, Dirk. 

I think learning more about Rcpp will become my Christmas-holiday project. It's already saved me buckets of computational time in this past week and that's without even really knowing what I'm doing :)

Pete

On 11/12/2012, at 10:09 AM, Dirk Eddelbuettel wrote:

> 
> Hi Pete,
> 
> On 11 December 2012 at 09:43, hickey at wehi.EDU.AU wrote:
> | I preface this by stating that I'm very much an Rcpp beginner who is comfortable
> | in R but I've never before used C++. I'm working through the Rcpp documentation
> | but haven't been able to answer my question.
> | 
> | I've written an Rcpp (v0.10.1) function f that takes as input a CharacterMatrix
> | X. X has 20 million rows and 100 columns. For each row of X the function alters
> | certain entries of that row according to rules governed by some other input
> | variables. f returns the updated version of X. This function works as I'd like
> | it to: 
> | # a toy example with nrow = 2, ncol = 2
> | > X <- matrix('A', ncol = 2, nrow = 2)
> | > X
> |      [,1] [,2]
> | [1,] "A"  "A" 
> | [2,] "A"  "A" 
> | > X <- f(X, other_input_variables)
> | > X
> |      [,1] [,2]
> | [1,] "Z"  "A" 
> | [2,] "z"  "A" 
> | 
> | However, instead of f returning a CharacterMatrix as it currently does, I'd
> | like to return a CharacterVector Y, where each element of Y is a "collapsed"
> | row of the updated X.
> | 
> | I can achieve the desired result in R by using: 
> | Y <- apply(X=X, MARGIN = 1, FUN = function(x){paste0(x, collapse = '')}) 
> | > Y
> | [1] "ZA" "zA"
> | 
> | but I wondered whether this "joining" is likely to be more efficiently
> | performed within my function f? If so, how do I join the 100 individual
> | character entries of a row of the CharacterMatrix X into a single string that
> | will then comprise an element of the returned CharacterVector Y?
> 
> Ah, the joy of working with character strings/vectors/pointers :)  
> 
> You certainly can. And there will be a lot of old, bad, ... tutorials out
> there.  I can't right now think of a good tutorial to point you to -- other
> than the perennial "C++ Annotations" by Brokken which is at the same time
> good, current, up-to-date and free (!!) -- so maybe you should continue with
> the little 2 x 2 and 3 x 3 examples:
> 
> i)   loop over a row, first init the target string to be ""
> ii)  assign each element of the matrix to a string
> iii) append, which can be as easy as using the   +   for two strings
> iv)  accumulate the result strings in a vector of strings
> 
> That should work, does not require pointers, free, malloc, ...  You can
> optimize later.
> 
> Hope this helps,  Dirk
> 
> -- 
> Dirk Eddelbuettel | edd at debian.org | http://dirk.eddelbuettel.com  
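
Steps i) to iv) from Dirk's reply can be sketched in plain standard C++ as below. This is only an illustration under assumptions, not code from the thread: the helper name collapse_rows and the use of a vector of rows in place of a CharacterMatrix are hypothetical, chosen so the example compiles without R. Inside an Rcpp function the cells would instead be read from the matrix (as far as I know, Rcpp::as<std::string>(X(i, j)) does that conversion) and the resulting std::vector<std::string> could be handed back to R with Rcpp::wrap.

```cpp
#include <string>
#include <vector>

// Hypothetical helper: join the cells of each row into one string.
// 'rows' stands in for the CharacterMatrix X, one inner vector per row.
std::vector<std::string> collapse_rows(
        const std::vector<std::vector<std::string>>& rows) {
    std::vector<std::string> out;
    out.reserve(rows.size());             // one result string per row

    for (const auto& row : rows) {
        std::string joined;               // i)   target string starts as ""
        for (const std::string& cell : row) {   // ii) each element as a string
            joined += cell;               // iii) append to the target
        }
        out.push_back(joined);            // iv)  accumulate the row results
    }
    return out;
}
```

Applied to the toy 2 x 2 matrix from the question, with rows ("Z", "A") and ("z", "A"), this gives the same "ZA" and "zA" as the apply/paste0 call in R.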

--------------------------------
Peter Hickey,
PhD Student/Research Assistant,
Bioinformatics Division,
Walter and Eliza Hall Institute of Medical Research,
1G Royal Parade, Parkville, Vic 3052, Australia.
Ph: +613 9345 2324

hickey at wehi.edu.au
http://www.wehi.edu.au

