[inlinedocs] slow package.skeleton.dx(), probably because of extract.file.parse()

Julien Moeys julien.moeys at mark.slu.se
Fri Nov 12 18:14:35 CET 2010


Hi Toby & inlinedocs team

The parser extra.code.docs is unfortunately needed in my package (the doc is not extracted without it!)

And unfortunately I can't put the package code on R-Forge (not that it is strategic, but it is not public)

I tried to look a bit further into what could be wrong:
- Rprof() shows that textConnection takes most of the time (I guess it is used by parse or something like that)
- memoryRprof() does not show any overuse of memory
- I tested old versions of inlinedocs, and they are slow too (down to 1.2). So I was wrong to think that some recent changes made it slower (recompiled on R 2.12.0).
- I tried to speed up the code (lapply). Almost no change.
- When I remove some functions, it does not get noticeably faster (I removed functions added lately to my package, plus S3 and S4 methods)
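
For anyone who wants to reproduce this kind of profile, here is a minimal sketch using Rprof() and summaryRprof(). The workload() function is a hypothetical stand-in that only opens and reads text connections; in practice it would be replaced by the real package.skeleton.dx() call on the package directory.

```r
# Hypothetical stand-in workload: repeatedly open, read and close a
# text connection for about `seconds` seconds, so the profiler has
# something to sample. Replace with the real package.skeleton.dx() call.
workload <- function(seconds = 0.5) {
  lines <- rep("f <- function(x) x + 1", 50)
  start <- proc.time()[["elapsed"]]
  while (proc.time()[["elapsed"]] - start < seconds) {
    con <- textConnection(lines)
    readLines(con)
    close(con)
  }
  invisible(NULL)
}

prof_file <- tempfile(fileext = ".out")
Rprof(prof_file, interval = 0.01)  # start the sampling profiler
workload()
Rprof(NULL)                        # stop profiling

prof <- summaryRprof(prof_file)
print(head(prof$by.self))          # functions ranked by self time
```

The $by.self table is the one to look at when a single low-level function is suspected, since it counts only the time spent in that function itself.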

So I finally re-installed R 2.11.1 and recompiled the latest inlinedocs (on a lighter version of my package)
===> and it is faster! (~70 seconds instead of ~270 seconds)

I wonder if it could be that textConnection is slower on R 2.12.0?
I found this issue, related to textConnection in R 2.11 and 2.12 devel:
https://bugs.r-project.org/bugzilla3/show_bug.cgi?id=14286
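
A rough way to check this hypothesis independently of inlinedocs is to time textConnection on its own and run the same script under both R versions. This is only a sketch: the helper time_text_connection() and the input vector src are made up for illustration, and the numbers will depend on the machine.

```r
# Time n open/read/close cycles of textConnection on a character vector
# and return the mean elapsed seconds per cycle.
time_text_connection <- function(lines, n = 200) {
  elapsed <- system.time(
    for (i in seq_len(n)) {
      con <- textConnection(lines)
      readLines(con)
      close(con)
    }
  )[["elapsed"]]
  elapsed / n
}

# Hypothetical input: 1000 lines of R-like source text.
src <- rep("x <- c(1, 2, 3)  # some comment", 1000)
per_cycle <- time_text_connection(src)
cat(R.version.string, ":", format(per_cycle, digits = 3), "s per cycle\n")
```

If the per-cycle time differs clearly between the two installations, that would point to textConnection itself rather than to anything in inlinedocs.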


Finally, I tested package.skeleton.dx on the latest R-Forge commit of the code:

- On R 2.12.0 it takes 3.38 seconds, and the Rprof report shows textConnection at the top:
$by.self
                 self.time self.pct total.time total.pct
textConnection        2.52    74.56       2.52     74.56
grep                  0.18     5.33       0.18      5.33
gsub                  0.10     2.96       0.10      2.96
cat                   0.08     2.37       0.10      2.96

- On R 2.11.1 it takes 1.54 seconds (less than half the time), and the Rprof report is:
$by.self
                          self.time self.pct total.time total.pct
textConnection                 0.66     42.9       0.66      42.9
grep                           0.18     11.7       0.18      11.7
file                           0.10      6.5       0.10       6.5
gsub                           0.10      6.5       0.10       6.5

So it is much the same problem (except that in my case 270 seconds is enough time to get a coffee every time I compile the package :o)
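
A quick bit of arithmetic on the two Rprof summaries above makes the point clearer. The figures below are copied from those tables; nothing else is assumed.

```r
# Seconds, copied from the two Rprof summaries above.
total   <- c("R 2.12.0" = 3.38, "R 2.11.1" = 1.54)  # total run time
textcon <- c("R 2.12.0" = 2.52, "R 2.11.1" = 0.66)  # self time in textConnection

# Time spent everywhere except textConnection.
other <- round(total - textcon, 2)
print(other)

# Extra time on R 2.12.0: overall, and in textConnection alone.
extra_total   <- round(total[["R 2.12.0"]] - total[["R 2.11.1"]], 2)
extra_textcon <- round(textcon[["R 2.12.0"]] - textcon[["R 2.11.1"]], 2)
print(c(extra_total = extra_total, extra_textcon = extra_textcon))
```

The time outside textConnection is nearly identical on both versions (0.86 s versus 0.88 s), while textConnection alone is 1.86 s slower, which accounts for essentially all of the 1.84 s difference in total run time.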

Well, that is all for the moment.
If that gives you any ideas about the possible cause of the slowdown...

All the best

Julien
PS: Again, this is not critical. Inlinedocs is still saving me much more time than this strange slowdown costs.
And now I know I can run inlinedocs from R 2.11.1.



-----Original Message-----
From: Toby Dylan Hocking [mailto:Toby.Hocking at inria.fr] 
Sent: 12 November 2010 10:22
To: Julien Moeys
Cc: inlinedocs-support at r-forge.wu-wien.ac.at
Subject: Re: [inlinedocs] slow package.skeleton.dx(), probably because of extract.file.parse()

An idea: do you need the extra.code.docs Parser Function for that
package? If not, you can disable it by putting something like this in
pkgdir/R/.inlinedocs.R

parsers <- default.parsers[names(default.parsers)!="extra.code.docs"]
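
To illustrate how the name filter in the line above behaves, here is a small sketch on a stand-in list. The real default.parsers comes from inlinedocs; the entries parser.a and parser.b below are hypothetical placeholders, not actual inlinedocs parser names.

```r
# Stand-in for inlinedocs' default.parsers: a named list of functions.
# Names other than "extra.code.docs" are hypothetical placeholders.
default.parsers <- list(
  parser.a        = function(...) list(),
  extra.code.docs = function(...) list(),
  parser.b        = function(...) list()
)

# Keep every parser except the one named "extra.code.docs".
parsers <- default.parsers[names(default.parsers) != "extra.code.docs"]
print(names(parsers))
```

The order of the remaining parsers is preserved, so only the named parser is dropped.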

If you do need that Parser Function, then the only solution is to
speed up the code. So that we can test different implementations and
possible speedups, could you upload your package code to
inlinedocs/pkg/inlinedocs/etc/ ?

Also another idea is to use SVN to go back to a version that is
sufficiently fast, then try to figure out what has happened since then
that makes it slow.

In any case if you end up getting an implementation that speeds up the
package (and it still passes all the tests when you run R CMD check),
please commit to SVN.

