[Rcpp-devel] RcppArmadillo inv() depends on Lapack, zgetri_ not available on CRAN / R-forge?

Douglas Bates bates at stat.wisc.edu
Tue Jun 7 22:03:04 CEST 2011


On Tue, Jun 7, 2011 at 2:34 PM, baptiste auguie
<baptiste.auguie at googlemail.com> wrote:
> Hi Doug,
>
> On 8 June 2011 03:43, Douglas Bates <bates at stat.wisc.edu> wrote:
>>
>> On Jun 6, 2011 4:17 AM, "baptiste auguie" <baptiste.auguie at googlemail.com>
>> wrote:
>>>
>>> Thank you for the explanations below.
>>>
>>> On 5 June 2011 10:40, Dirk Eddelbuettel <edd at debian.org> wrote:
>>> >
>>> > On 5 June 2011 at 10:12, baptiste auguie wrote:
>>> > | Hi Dirk and all,
>>> > |
>>> > | On 4 June 2011 12:04, Dirk Eddelbuettel <edd at debian.org> wrote:
>>> > | >
>>> > | > Baptiste,
>>> > | >
>>> > | > On 4 June 2011 at 11:45, baptiste auguie wrote:
>>> > | > | Dear list,
>>> > | > |
>>> > | > | My package cda, which I was hoping to release on CRAN, fails to
>>> > | > | compile on R-forge with error,
>>> > | > |
>>> > | > | ** testing if installed package can be loaded
>>> > | > | Error in dyn.load(file, DLLpath = DLLpath, ...) :
>>> > | > |   unable to load shared object
>>> > '/tmp/RtmpbztUMm/Rinst1829c04c/cda/libs/cda.so':
>>> > | > |   /tmp/RtmpbztUMm/Rinst1829c04c/cda/libs/cda.so: undefined symbol:
>>> > zgetri_
>>> > | > |
>>> > | > | It builds fine on my local machines (Mac OS 10.5, 10.6).
>>> > | > |
>>> > | > | >From an older discussion on this list <
>>> > | > |
>>> > http://www.mail-archive.com/rcpp-devel@lists.r-forge.r-project.org/msg00678.html>
>>> > | > | the issue seems to be that Armadillo's inv() relies on a function
>>> > that
>>> > | > | is not provided by R, only by LAPACK. I have just replaced inv()
>>> > by
>>> > | > | pinv() and solve() in my code; merely to see what happens, but
>>> > chances
>>> > | > | are they also require a full LAPACK.
>>> > |
>>> > | Indeed, the error on R-forge is now with zgels_, required to solve
>>> > | linear systems. It seems one cannot solve Armadillo linear systems
>>> > | without LAPACK in the current situation.
>>> >
>>> > Yes. Doug, Romain and myself should address that, or at least make it
>>> > clear
>>> > what feature of the full Armadillo are lacking in RcppArmadillo.
>>> >
>>> > | > Sometime relatively early in the RcppArmadillo development process,
>>> > Doug
>>> > | > convinced Romain and myself to go for a pure template solution with
>>> > Armadillo
>>> > | > as all / most things found during the configure (or in this case,
>>> > cmake)
>>> > | > stage can be assumed 'found' given that we have around us by design.
>>> >  So no
>>> > | > testing, no local library and full reliance and what R gives us.
>>> > | >
>>> > | > That was a brilliant idea, and has freed us from having to rely on
>>> > building
>>> > | > and shipping a library, having to tell users how to set PKG_LIBS etc
>>> > pp and I
>>> > | > firmly believe that this helped tremendously in getting
>>> > RcppArmadillo more
>>> > | > widely used. So I would not want to revert this.
>>> > |
>>> > | It sounds like a good choice, I agree. Yet solving a linear system is
>>> > | quite a crucial task in linear algebra; it would be nice if we could
>>> > | come up with a tutorial recipe for dummies like me.
>>> > |
>>> > | >
>>> > | > In any event, it seems that you need more LAPACK than R has for you.
>>> >  That is
>>> > | > likely to be a dicey situation as you per se do not know whether R
>>> > was built
>>> > | > and linked with its own (subset) copy of LAPACK, or whether it uses
>>> > system
>>> > | > LAPACK libraries (as e.g. the Debian / Ubuntu systems do).  So you
>>> > may be in
>>> > | > a spot bother and I not sure what I can recommend --- other than
>>> > trying your
>>> > | > luck at some short configure snippets that will run at package build
>>> > time to
>>> > | > determine whether the system you want to build cda on it 'rich'
>>> > enough to
>>> > | > support it.  I can help you off list with some configure snippets as
>>> > some of
>>> > | > my packages have configure code; adding a test for zgetri should be
>>> > feasible.
>>> > | >
>>> > | > | Does anybody have any experience
>>> > | > | dealing with such issues w.r.t releasing a package on R-forge /
>>> > CRAN?
>>> > | > | Is there any chance they would consider installing LAPACK on those
>>> > | > | servers, or is there no point in asking such things?
>>> > | >
>>> > | > I don't think it is a matter of fixing the R-Forge server. I think
>>> > it is a
>>> > | > matter of making your package installable on the largest number of
>>> > user
>>> > | > systems.  Also try win-builder.r-project.org to see how it fares on
>>> > that
>>> > | > platform.
>>>
>>> Unsurprisingly, it fails, with the same complaint as R-forge.
>>>
>>> > |
>>> > | I was under the impression that R-forge or CRAN, if it had LAPACK
>>> > | installed, could produce binaries for the relevant platforms, and
>>> > | users would not have to build the package themselves and would not be
>>> > | required of having LAPACK on their machine. That's probably a
>>> > | misconception, isn't it?
>>> >
>>> > If and only statically linked binaries or libraries where produced,
>>> > which is
>>> > generally not the case. Many OSs (Linux incl) ship source only and
>>> > otherwise
>>> > link dynamically, others (Windoze) use dynamic linking and OS X is for
>>> > all I
>>> > know somewhere in the middle (as you can get prebuild packages with
>>> > dynamic
>>> > linking or build from source).
>>>
>>> I see; so basically the user will always need to have a full LAPACK
>>> installed. Here's one question then, if R-core didn't consider
>>> necessary to include those particular functions from LAPACK,
>>> presumably that means that R defines its own routines to solve linear
>>> systems and invert matrices. Is there any possibility to use those
>>> routines with Armadillo?
>>
>> One important point has been missed in this discussion, which is that the
>> particular Lapack subroutine that is not found in the subset of Lapack
>> shipped with R is for general, complex-valued matrices (just google the name
>> zgetri).  The way that Armadillo is structured it is either all-in or
>> all-out with respect to complex-valued matrices  If you allow for complex
>> matrices then you must have all the supporting subroutines for whatever you
>> are calling.  If you call inv() you need to allow for all the [sdcz]getri
>> routines to be available.
>
> The matrix I need to invert is definitely always complex; in fact,
> convenient complex algebra is the main attraction of Armadillo for me.

Argh.  Of course, I was being foolish.  Because Armadillo is a
header-only library it does not access any Lapack subroutine until the
call is instantiated.  I still haven't quite gotten used to thinking
only having the headers.

>>
>> So one possibility is to check the Armadillo sources to see if you can
>> suppress the use of complex-valued matrices when compiling your package.  A
>> quick glance indicates that this might now be easy.
>>
>> Otherwise, remember the first rule of numerical linear algebra which is, "If
>> your algorithm involves computing the inverse of a matrix you should rethink
>> the algorithm because there is a better way of doing it".  So why do you
>> need inv()?
>> If the answer is to solve a linear system of equations then you
>> use solve(), you do not use inv().  If, for some truly unusual reason you
>> actually need the inverse (and, remember in 99.99% of the cases where people
>> think they need the inverse, they don't) then you use a combination of
>> solve() and eye().
>
> I tried this, and in fact I do use solve() elsewhere in my code, which
> calls for another LAPACK routine (zgels.f) not present in R. The
> reason I still need inv() is that I have to solve about 300 times a
> linear system AX=B with the same matrix A but varying B. A quick
> timing last week revealed that using solve() 300 times was typically
> slower by a factor of two  in my use case than using inv() once (or
> pinv() for that matter, it makes not appreciable difference). I'm
> happy to be shown otherwise though.

I forgot that Armadillo doesn't provide a convenient way of using the
LU decomposition (that is one of the things that I find frustrating
about Armadillo).  Did you try a single solve call in which the
right-hand side is an identity matrix?  On looking at the Armadillo
sources it seems that it should call zgesv which is included in R's
subset of the Lapack routines.


> One option that I'd like to consider is whether the appropriate LAPACK
> routines could be wrapped and shipped in a separate package
> (discontinued rblas could provide a good starting point). Sadly, I
> know nothing about static/dynamic libraries and all this business..
>
> Thanks,
>
> baptiste
>
>>
>>> > | Sorry for being dense, I don't know anything about linking R to
>>
>>> > | external dependencies.
>>> >
>>> > It can be done; there are many examples -- for example every package
>>> > using
>>> > the GSL.
>>>
>>> I just checked how RcppGSL does it, and well, this configure magic is
>>> way above my head.
>>>
>>> Best,
>>>
>>> baptiste
>>> _______________________________________________
>>> Rcpp-devel mailing list
>>> Rcpp-devel at lists.r-forge.r-project.org
>>> https://lists.r-forge.r-project.org/cgi-bin/mailman/listinfo/rcpp-devel
>>
>


More information about the Rcpp-devel mailing list