[Rcpp-devel] Joining each row of CharacterMatrix to return a CharacterVector?

Romain Francois romain at r-enthusiasts.com
Tue Dec 11 09:37:37 CET 2012


Or (from svn rev 4144), you can use the collapse function:

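#include <Rcpp.h>
using namespace Rcpp ;

// [[Rcpp::export]]
CharacterVector pasteColumns2(CharacterMatrix m){
     int nr = m.nrow() ;
     CharacterVector out(nr) ;
     // collapse() joins the elements of row i into a single string
     for( int i=0; i<nr; i++)
         out[i] = collapse( m(i,_) ) ;
     return out ;
}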
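
For readers who want to see what the row-wise join amounts to without Rcpp, here is a minimal standard-library sketch. It is not Rcpp code and not the implementation of collapse: joinRows is a hypothetical helper, and the matrix is assumed to be passed as a flat vector of strings in column-major order (the layout R uses for matrices), so element (i, j) sits at index i + j * nrow.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical helper (not part of Rcpp): joins the cells of each row of a
// character matrix into one string. Assumes `cells` holds the matrix in
// column-major order, so element (i, j) is cells[i + j * nrow].
std::vector<std::string> joinRows(const std::vector<std::string>& cells,
                                  std::size_t nrow, std::size_t ncol) {
    assert(cells.size() == nrow * ncol);
    std::vector<std::string> out(nrow);
    for (std::size_t i = 0; i < nrow; ++i) {
        std::string& buffer = out[i];
        // First pass: compute the final length so the buffer is sized once.
        std::size_t len = 0;
        for (std::size_t j = 0; j < ncol; ++j)
            len += cells[i + j * nrow].size();
        buffer.reserve(len);
        // Second pass: append the cells of row i, left to right.
        for (std::size_t j = 0; j < ncol; ++j)
            buffer += cells[i + j * nrow];
    }
    return out;
}
```

With the updated toy matrix from the original question stored as {"Z", "z", "A", "A"}, joinRows(cells, 2, 2) gives "ZA" and "zA", matching the Y shown in the question.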

Romain

On 11/12/12 08:45, Romain Francois wrote:
>
> Hello,
>
> We don't know your function f, so this is hard to say. Anyway, the code
> below implements something similar to apply(., 1, paste0, collapse = "")
> in Rcpp (current devel version):
>
> #include <Rcpp.h>
> using namespace Rcpp ;
>
> // [[Rcpp::export]]
> CharacterVector pasteColumns(CharacterMatrix m){
>      String buffer ;
>      int nc = m.ncol(), nr = m.nrow() ;
>      CharacterVector out(nr) ;
>      for( int i=0; i<nr; i++){
>          CharacterMatrix::Row row = m(i,_) ;
>          buffer = "" ;
>          for( int j=0; j<nc; j++){
>              buffer += row[j] ;
>          }
>          out[i] = buffer ;
>      }
>      return out ;
> }
>
> With this, I get these timings:
>
>      nc <- 100; nr <- 2e4
>      M <- matrix( sample(letters, nc*nr, replace = TRUE) , ncol = nc )
>
>      require(microbenchmark)
>      microbenchmark(
>          pasteColumns(M),
>          apply(M, 1, paste0)
>          )
>      Unit: milliseconds
>                       expr       min        lq    median        uq      max
>      1 apply(M, 1, paste0) 451.39975 484.41435 495.92757 501.58728 714.1418
>      2     pasteColumns(M)  67.91322  68.29269  70.34704  77.09383 145.9161
>
>
>
> On 10/12/12 23:43, hickey at wehi.EDU.AU wrote:
>> I preface this by stating that I'm very much an Rcpp beginner: I'm
>> comfortable in R but have never used C++ before. I'm working through the
>> Rcpp documentation but haven't been able to answer my question.
>>
>> I've written an Rcpp (v0.10.1) function f that takes as input a
>> CharacterMatrix X. X has 20 million rows and 100 columns. For each row
>> of X the function alters certain entries of that row according to rules
>> governed by some other input variables. f returns the updated version of
>> X. This function works as I'd like it to:
>> # a toy example with nrow = 2, ncol = 2
>>  > X <- matrix('A', ncol = 2, nrow = 2)
>>  > X
>>       [,1] [,2]
>> [1,] "A"  "A"
>> [2,] "A"  "A"
>>  > X <- f(X, other_input_variables)
>>  > X
>>       [,1] [,2]
>> [1,] "Z"  "A"
>> [2,] "z"  "A"
>>
>> However, instead of f returning a CharacterMatrix as it currently does,
>> I'd like to return a CharacterVector Y, where each element of Y is a
>> "collapsed" row of the updated X.
>>
>> I can achieve the desired result in R by using:
>> Y <- apply(X=X, MARGIN = 1, FUN = function(x){paste0(x, collapse = '')})
>>  > Y
>> [1] "ZA" "zA"
>>
>> but I wondered whether this "joining" is likely to be more efficiently
>> performed within my function f? If so, how do I join the 100 individual
>> character entries of a row of the CharacterMatrix X into a single string
>> that will then comprise an element of the returned CharacterVector Y?
>>
>> Many thanks,
>> Pete
>> --------------------------------
>> Peter Hickey,
>> PhD Student/Research Assistant,
>> Bioinformatics Division,
>> Walter and Eliza Hall Institute of Medical Research,
>> 1G Royal Parade, Parkville, Vic 3052, Australia.
>> Ph: +613 9345 2324
>>
>> hickey at wehi.edu.au
>> http://www.wehi.edu.au
>
>


-- 
Romain Francois
Professional R Enthusiast
+33(0) 6 28 91 30 30

R Graph Gallery: http://gallery.r-enthusiasts.com

blog:            http://romainfrancois.blog.free.fr
|- http://bit.ly/RE6sYH : OOP with Rcpp modules
`- http://bit.ly/Thw7IK : Rcpp modules more flexible


