[Rcpp-devel] Question on lme4 book

Kasper Daniel Hansen kasperdanielhansen at gmail.com
Tue Dec 10 16:00:53 CET 2013


This is indeed a great loss for the R community.  I am (luckily) releasing
my work through Bioconductor which seems to be much less strict than CRAN
these days.  Perhaps this statement from Doug can influence the CRAN
policies a bit.

It seems to me that the CRAN people don't always understand the problems
with including external code in your package, such as Eigen.  This can be
extremely painful as I know from experience as well (I have 2 packages on
Bioc using external code), since you depend very much on the code quality
from outside people and you have little specific knowledge of this
codebase.  In general, I caution against including external code in a
package for this reason, but there are cases where the external code is so
valuable that it is worthwhile to provide access to it through an R package.
Unfortunately, those cases also tend to be the cases where the external
code is very complicated.  I am happy to release through Bioc where I can
have a talk with the repository people about these issues.

I maintain Rgraphviz and I am also getting some grief from CRAN about
Solaris (since several packages on CRAN depend on Rgraphviz).  Luckily,
Rgraphviz is being pulled into CRAN from Bioconductor so the interchange is
more gentle.  I am perhaps willing to spend a tiny amount of time on
Solaris but I am hampered by not having access to a system.  It is
basically impossible to address these issues without a system.  It would be
great if someone made a virtual machine with R and everything that I (and
others) could download and test with.

On one hand it is a good thing that a public repository has some standards
(as both Bioc and CRAN have), and I believe that some standards very much
improve the quality of the code, and as an author it is good to spend some
time addressing them.  But it is also very easy to institute overly
draconian policies that drive productive people away, and that is extremely
unfortunate.  It seems that CRAN is getting close to this point.
Hopefully, Bioconductor is still on the right side of quality vs.
annoyance.  It is very hard to have fixed rules on this issue, since the
specific case (codebase, developer, "usefulness") can be so varied.

Best,
Kasper




On Mon, Dec 9, 2013 at 12:56 PM, Douglas Bates <bates at stat.wisc.edu> wrote:

> Yesterday Taylor Russ asked
>
> What is the proper citation for the lme4 package and Bates' book?
>
>  Also, can lme4 datasets (e.g., Pastes, ScotsSec, InstEval etc.) be
>  used for illustration in publications?  Can the authors grant
>  permission or is the permission from the source needed?
>
>  Many thanks for the package and the book.  When can I hold a
> non-digital copy in my hands?
>
> I inadvertently deleted the message and so must respond without
> maintaining the thread.
>
> The data sets can be used in other publications.  At least my
> understanding is that the data themselves cannot be copyright (despite the
> "Microsoft Patents 1's, 0's" headline in The Onion many years ago - for
> those of you who don't know that The Onion is a satirical newspaper, that
> didn't really occur).  It is only the representation of the data, such as a
> table in a copyright publication, that can be copyright.  I suppose I
> should provide the usual caveat, "I am (thankfully) not a lawyer".
>
> The other lme4 authors may be able to respond to the question of citing
> the lme4 package.  I regret to say that I don't know of a good way of
> citing the book and that there won't be non-digital copies.
>
> Partly this can be attributed to my personality - I'm good at starting
> projects but not so good at finishing them.  However, finishing the book
> would involve spending time maintaining and developing the lme4 package for
> CRAN and I have completely lost my enthusiasm for doing so.
>
> As many of you know, I am doing most of my work in the Julia language
> (www.julialang.org) now.  R is wonderful and I enjoyed most of my time
> working on R and R packages but there are inherent limitations to R,
> particularly when trying to achieve good performance on fitting complex
> models to large data sets, that make this difficult.  It would be
> attractive to have a "pure R" implementation of mixed-models but I don't
> see a way of making it run quickly and without using a lot of memory.  In
> Julia I can build a package that achieves good performance without the need
> to interface to code written in C, C++ or Fortran - in the sense that my
> package doesn't require compilation of code outside of that provided by
> the language itself.
>
> It is not surprising that the design of R is starting to show its age.
> Although R has only been around for 15-18 years, its syntax and much of
> the semantics are based on the design of "S3" which is 25-30 years old.
>
> R packages can include code to be compiled along with the interface code
> and there are many wonderful tools to facilitate this - such as the Rcpp
> package, the devtools package and RStudio support for these packages.  I
> used these in the compiled code underlying lme4_1.0.
>
> But even though Dirk would describe the use of Rcpp as "seamless", in my
> experience it is not, especially if you wish to have your package available
> on CRAN.
>
> Maintaining an Rcpp-based package on CRAN these days is a case of "no good
> deed shall go unpunished" and "the flogging will continue until morale
> improves".  I am the maintainer of the RcppEigen package which apparently
> also makes me the maintainer of an Eigen port to Solaris.  When compilers
> on Solaris report errors in code from Eigen I am supposed to fix them.
> This is difficult in that I don't have access to any computers running
> Solaris, which is a proprietary operating system as far as I can tell, and
> Eigen is a complex code base using what is called "template
> meta-programming" in C++.  Making modifications to such code can be
> difficult.  I can't claim to fully understand all the details in Eigen and
> in Rcpp.  I am a user of these code bases, not a developer. The Eigen
> authors themselves don't test their code under Solaris because they don't
> have access to Solaris systems either and they don't regard Solaris as an
> important platform for numerical computing.  The CRAN maintainers feel
> differently, which puts me in a box.
>
> There are days when I am tempted to say, "okay, if RcppEigen is not
> suitable for CRAN then remove it" which would result in removal of all the
> packages that depend on it, including lme4.  That may seem childish of me
> but I really don't know what else to do.
>
> So I have reached the point of saying "goodbye" to R, Rcpp and lme4 and
> switching all of my development effort to Julia.  I'm sorry but others are
> going to need to determine how to maintain lme4 to the satisfaction of the
> CRAN maintainers or whether there should be an alternative distribution
> mechanism for R packages.
>
>
>
> _______________________________________________
> Rcpp-devel mailing list
> Rcpp-devel at lists.r-forge.r-project.org
> https://lists.r-forge.r-project.org/cgi-bin/mailman/listinfo/rcpp-devel
>