[Rcpp-devel] wordcloud

Dirk Eddelbuettel edd at debian.org
Sat Jul 23 18:37:48 CEST 2011


On 23 July 2011 at 09:02, ian.fellows at stat.ucla.edu wrote:
| Hi all,
| 
| I've just released an R package to CRAN that creates pretty looking word
| clouds. I think it makes a good minimal example of how to prototype an
| algorithm in R, and then bring the performance bottleneck down to c++ to
| improve speed.

Sweet!  I am still watching the whole onslaught of new or updated packages
unfold so I haven't had a chance to even check if there were new Rcpp-using
packages.  So welcome to the club :)
 
| An example:
| 
| >install.packages("wordcloud",repos="http://cran.r-project.org",type="source")
| >library(tm)
| >data(crude)
| >crude <- tm_map(crude, removePunctuation)
| >crude <- tm_map(crude, function(x)removeWords(x,stopwords()))
| >tdm <- TermDocumentMatrix(crude)
| >m <- as.matrix(tdm)
| >v <- sort(rowSums(m),decreasing=TRUE)
| >d <- data.frame(word = names(v),freq=v)
| >library(wordcloud)
| Loading required package: Rcpp
| >#using c++ to help layout the words
| >system.time(wordcloud(d$word,d$freq,scale=c(8,.1),min.freq=0))
|   user  system elapsed
|  9.979   0.049   9.878
| >#using R code to do the same layout
| >system.time(wordcloud(d$word,d$freq,scale=c(8,.1),min.freq=0,use.r.layout=T))
|   user  system elapsed
| 151.919   0.716 146.737

Ok, I'll be lazy now as I could just look at the code, but what type of
layout operation did you move to C++? Is it a type of sorting / arranging /
classifying / ... ?  Does it rely on other libraries or did you solve it with
homegrown C++?  How many lines?
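
To make the question concrete, here is a minimal and purely hypothetical
sketch of the kind of inner loop that tends to be slow in interpreted R and
cheap in C++: an axis-aligned bounding-box overlap test of a candidate word
against the words already placed.  None of it is taken from the wordcloud
sources; the names Box, overlaps and overlaps_any are made up for
illustration, and the real layout code may work quite differently.

```cpp
#include <vector>

// Hypothetical axis-aligned box: lower-left corner plus width and height.
struct Box {
    double x, y, w, h;
};

// True if the two boxes share interior area (touching edges do not count).
bool overlaps(const Box& a, const Box& b) {
    return a.x < b.x + b.w && b.x < a.x + a.w &&
           a.y < b.y + b.h && b.y < a.y + a.h;
}

// True if the candidate box collides with any box that is already placed.
// A layout loop would call this once per trial position, which is why it
// is a plausible candidate for a hot spot.
bool overlaps_any(const Box& candidate, const std::vector<Box>& placed) {
    for (const Box& p : placed) {
        if (overlaps(candidate, p)) return true;
    }
    return false;
}
```

A function like this has no dependencies beyond the standard library, so it
could be prototyped and tested on its own before being exposed to R.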

And lastly ... given that you also know Java so well: what works well / better
with Rcpp for you?

Cheers, Dirk

-- 
Gauss once played himself in a zero-sum game and won $50.
                      -- #11 at http://www.gaussfacts.com

