[Rcpp-devel] Joining each row of CharacterMatrix to return a CharacterVector?

Romain Francois romain at r-enthusiasts.com
Tue Dec 11 08:45:31 CET 2012


Hello,

We don't know your function f, so this is hard to say. Anyway, the code 
below implements something similar to apply(., 1, paste0, collapse = "") 
in Rcpp (current devel version):

#include <Rcpp.h>
using namespace Rcpp ;

// Join the entries of each row of m into a single string.
// [[Rcpp::export]]
CharacterVector pasteColumns(CharacterMatrix m){
     String buffer ;
     int nc = m.ncol(), nr = m.nrow() ;
     CharacterVector out(nr) ;   // one joined string per row
     for( int i=0; i<nr; i++){
         CharacterMatrix::Row row = m(i,_) ;   // view of row i
         buffer = "" ;   // reset the buffer for each row
         for( int j=0; j<nc; j++){
             buffer += row[j] ;   // append the entry in column j
         }
         out[i] = buffer ;
     }
     return out ;
}
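
For readers who want to see the row-joining logic on its own, here is a minimal sketch in standard C++ without Rcpp. It is not the function above: joinRows is a hypothetical helper, and it assumes the matrix cells are held in a std::vector<std::string> in column-major order (the layout R uses for matrices), so cell (i, j) sits at offset i + j * nrow. It also reserves each output string up front, which is an optional refinement and not something the Rcpp version does.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical helper (not part of Rcpp): joins the cells of each row of a
// matrix stored column-major, i.e. cell (i, j) is at index i + j * nrow.
std::vector<std::string> joinRows(const std::vector<std::string>& cells,
                                  std::size_t nrow, std::size_t ncol) {
    // Precondition: the flat vector holds exactly nrow * ncol cells.
    assert(cells.size() == nrow * ncol);

    std::vector<std::string> out(nrow);
    for (std::size_t i = 0; i < nrow; ++i) {
        // First pass: total length of row i, so the string is allocated once.
        std::size_t len = 0;
        for (std::size_t j = 0; j < ncol; ++j) {
            len += cells[i + j * nrow].size();
        }
        out[i].reserve(len);

        // Second pass: append the cells of row i in column order.
        for (std::size_t j = 0; j < ncol; ++j) {
            out[i] += cells[i + j * nrow];
        }
    }
    return out;
}
```

With the 2 x 2 toy matrix from the question stored column-major as {"Z", "z", "A", "A"}, joinRows(cells, 2, 2) gives the two strings "ZA" and "zA".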

With this, I get these timings:

     nc <- 100; nr <- 2e4
     M <- matrix( sample(letters, nc*nr, replace = TRUE) , ncol = nc )

     require(microbenchmark)
     microbenchmark(
         pasteColumns(M),
         apply(M, 1, paste0)
         )
     Unit: milliseconds
                      expr       min        lq    median        uq      max
     1 apply(M, 1, paste0) 451.39975 484.41435 495.92757 501.58728 714.1418
     2     pasteColumns(M)  67.91322  68.29269  70.34704  77.09383 145.9161



On 10/12/12 23:43, hickey at wehi.EDU.AU wrote:
> I preface this by stating that I'm very much an Rcpp beginner who is
> comfortable in R but I've never before used C++. I'm working through the
> Rcpp documentation but haven't been able to answer my question.
>
> I've written an Rcpp (v0.10.1) function f that takes as input a
> CharacterMatrix X. X has 20 million rows and 100 columns. For each row
> of X the function alters certain entries of that row according to rules
> governed by some other input variables. f returns the updated version of
> X. This function works as I'd like it to:
> # a toy example with nrow = 2, ncol = 2
>  > X <- matrix('A', ncol = 2, nrow = 2)
>  > X
>       [,1] [,2]
> [1,] "A"  "A"
> [2,] "A"  "A"
>  > X <- f(X, other_input_variables)
>  > X
>       [,1] [,2]
> [1,] "Z"  "A"
> [2,] "z"  "A"
>
> However, instead of f returning a CharacterMatrix as it currently does,
> I'd like to return a CharacterVector Y, where each element of Y is a
> "collapsed" row of the updated X.
>
> I can achieve the desired result in R by using:
> Y <- apply(X=X, MARGIN = 1, FUN = function(x){paste0(x, collapse = '')})
>  > Y
> [1] "ZA" "zA"
>
> but I wondered whether this "joining" is likely to be more efficiently
> performed within my function f? If so, how do I join the 100 individual
> character entries of a row of the CharacterMatrix X into a single string
> that will then comprise an element of the returned CharacterVector Y?
>
> Many thanks,
> Pete
> --------------------------------
> Peter Hickey,
> PhD Student/Research Assistant,
> Bioinformatics Division,
> Walter and Eliza Hall Institute of Medical Research,
> 1G Royal Parade, Parkville, Vic 3052, Australia.
> Ph: +613 9345 2324
>
> hickey at wehi.edu.au <mailto:hickey at wehi.edu.au>
> http://www.wehi.edu.au


-- 
Romain Francois
Professional R Enthusiast
+33(0) 6 28 91 30 30

R Graph Gallery: http://gallery.r-enthusiasts.com

blog:            http://romainfrancois.blog.free.fr
|- http://bit.ly/RE6sYH : OOP with Rcpp modules
`- http://bit.ly/Thw7IK : Rcpp modules more flexible
