[Rcpp-devel] Problem with memory usage when using Rcpp(Parallel) in a package

Thanh Le Hoang thanh_le_hoang at web.de
Tue Oct 10 22:48:05 CEST 2017


Hello, you can find a text copy of the previous emails below.
I have already found a solution for my problem, but thanks for your reply.

Thanh

> Sent: Tuesday, 10 October 2017 at 13:21
> From: "Dirk Eddelbuettel" <edd at debian.org>
> To: "Thanh Le Hoang" <thanh_le_hoang at web.de>
> Cc: Rcpp-devel at lists.r-forge.r-project.org
> Subject: Re: [Rcpp-devel] Problem with memory usage when using Rcpp(Parallel) in a package
>
> 
> On 10 October 2017 at 12:40, Thanh Le Hoang wrote:
> | [DELETED ATTACHMENT <no suggested filename>, HTML]
> 
> Can you please try again in text mode?
> 
> Dirk
> 
> -- 
> http://dirk.eddelbuettel.com | @eddelbuettel | edd at debian.org
> 


> Replying to my own email since I just found the solution. I somehow screwed up the 
> Makevars/Makevars.win files, so I deleted them and created new files into which I 
> copied the Makevars lines from the RcppParallel webpage exactly. I also had to add
> #' @importFrom RcppParallel RcppParallelLibs
> to my package so that there were no errors with the NAMESPACE file when running 
> roxygen2.

> > Sent: Monday, 9 October 2017 at 00:22
> > From: "Thanh Le Hoang" <thanh_le_ho... at web.de>
> > To: rcpp-devel at lists.r-forge.r-project.org
> > Subject: [Rcpp-devel] Problem with memory usage when using Rcpp(Parallel) in a package
> > Hello,
> >  
> > I'm writing my first package for a machine learning algorithm called self-organizing 
> > map where I use compiled code (with Rcpp) and parallelization (RcppParallel).
> > My computer uses Windows 10 (64 bit, 8 GB RAM) and I currently have a problem 
> > with the memory usage (shown in the Windows task manager) which keeps going up the 
> > longer the algorithm runs. The usage doesn't increase immediately, but only after a couple
> > of seconds, and I only noticed it when I tried larger data sets. The memory is 
> > only freed by terminating/restarting the R session.
> >  
> > What is somewhat strange is that the memory usage is not attributed to RStudio or 
> > the R session (i.e. the memory usage in the task manager does not go up for 
> > the respective processes). According to RAMMap (which gives more information 
> > about memory usage on Windows) the used memory belongs to the "nonpaged pool". 
> > The RStudio profiler and lineprof did not seem to detect the memory leak (if 
> > I read the output correctly). So far I have rewritten parts of the C++ code to 
> > use references and pre-allocated memory, but it did not help.
> >  
> > The main function in the package calls several smaller functions written in 
> > C++ and it seems that all of those functions play a role here, but I have found 
> > a function where this problem occurs consistently. It calculates the (squared) 
> > Euclidean norm for each row of a given matrix (in parallel) with a boolean 
> > vector (oldColumns) specifying which columns should be used/ignored during this 
> > calculation:
> > 
> > https://pastebin.com/qgyzx0M7
> > 
> > When I pasted this code into a new project, I noticed that the problem 
> > only happens when I build (with devtools::build()) and install a package 
> > containing this function, regardless of whether I build a source package or 
> > a binary package. When I just sourceCpp a file with this function, no 
> > memory problems occur. So could this have anything to do with how I build packages? 
> > Until now I have followed the "R packages" book written by Hadley Wickham for this.
> > 
> > Here is some R code which generates some test data and calls the function.
> > 
> > https://pastebin.com/c0RaeW9K
> > 
> > Every time I run this code (which takes a couple of minutes), the memory usage 
> > goes up by 4% to 6%, which makes my package unusable for larger sets of data.
> > I have been stuck on this problem for a week now and any help would be 
> > appreciated.
> > 
> > Thank you,
> > Thanh
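
The pastebin code is not reproduced in this archive, so the following is only a minimal sketch of the kind of function described in the original message, not the poster's actual code. It computes the squared Euclidean norm of each row of a matrix, skipping the columns that a boolean vector marks as unused. The name squaredRowNorms and all parameter names are hypothetical. To stay self-contained it uses only the C++ standard library instead of Rcpp/RcppParallel types, and it assumes the matrix is stored column-major, as R matrices are. The begin/end row range mirrors the shape of the operator()(std::size_t begin, std::size_t end) method of an RcppParallel Worker, so each thread would process its own block of rows.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical sketch: squared Euclidean norm of rows [begin, end) of a
// column-major nrow x ncol matrix, using only columns where useColumn[j] is true.
// Results are written into out[i] for each processed row i.
void squaredRowNorms(const std::vector<double>& mat,
                     std::size_t nrow, std::size_t ncol,
                     const std::vector<bool>& useColumn,
                     std::size_t begin, std::size_t end,
                     std::vector<double>& out) {
    // Basic shape checks; all sizes must agree before any indexing.
    assert(mat.size() == nrow * ncol);
    assert(useColumn.size() == ncol);
    assert(out.size() == nrow);
    assert(begin <= end && end <= nrow);

    for (std::size_t i = begin; i < end; ++i) {
        double sum = 0.0;
        for (std::size_t j = 0; j < ncol; ++j) {
            if (!useColumn[j]) continue;        // column is ignored
            const double v = mat[i + j * nrow]; // column-major element (i, j)
            sum += v * v;
        }
        out[i] = sum; // each row writes only its own slot, so row blocks are independent
    }
}
```

Calling it with begin = 0 and end = nrow processes the whole matrix serially. Because the output is pre-allocated and each row writes to a distinct element, disjoint row ranges can be handed to separate threads without locking.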

