[Rcpp-devel] RcppArmadillo inv() depends on Lapack, zgetri_ not available on CRAN / R-forge?

baptiste auguie baptiste.auguie at googlemail.com
Tue Jun 7 22:47:12 CEST 2011


On 8 June 2011 08:03, Douglas Bates <bates at stat.wisc.edu> wrote:
> On Tue, Jun 7, 2011 at 2:34 PM, baptiste auguie
> <baptiste.auguie at googlemail.com> wrote:
>> Hi Doug,
>>
>> On 8 June 2011 03:43, Douglas Bates <bates at stat.wisc.edu> wrote:
>>>
>>> On Jun 6, 2011 4:17 AM, "baptiste auguie" <baptiste.auguie at googlemail.com>
>>> wrote:
>>>>
>>>> Thank you for the explanations below.
>>>>
>>>> On 5 June 2011 10:40, Dirk Eddelbuettel <edd at debian.org> wrote:
>>>> >
>>>> > On 5 June 2011 at 10:12, baptiste auguie wrote:
>>>> > | Hi Dirk and all,
>>>> > |
>>>> > | On 4 June 2011 12:04, Dirk Eddelbuettel <edd at debian.org> wrote:
>>>> > | >
>>>> > | > Baptiste,
>>>> > | >
>>>> > | > On 4 June 2011 at 11:45, baptiste auguie wrote:
>>>> > | > | Dear list,
>>>> > | > |
>>>> > | > | My package cda, which I was hoping to release on CRAN, fails to
>>>> > | > | compile on R-forge with error,
>>>> > | > |
>>>> > | > | ** testing if installed package can be loaded
>>>> > | > | Error in dyn.load(file, DLLpath = DLLpath, ...) :
>>>> > | > |   unable to load shared object
>>>> > '/tmp/RtmpbztUMm/Rinst1829c04c/cda/libs/cda.so':
>>>> > | > |   /tmp/RtmpbztUMm/Rinst1829c04c/cda/libs/cda.so: undefined symbol:
>>>> > zgetri_
>>>> > | > |
>>>> > | > | It builds fine on my local machines (Mac OS 10.5, 10.6).
>>>> > | > |
>>>> > | > | >From an older discussion on this list <
>>>> > | > |
>>>> > http://www.mail-archive.com/rcpp-devel@lists.r-forge.r-project.org/msg00678.html>
>>>> > | > | the issue seems to be that Armadillo's inv() relies on a function
>>>> > that
>>>> > | > | is not provided by R, only by LAPACK. I have just replaced inv()
>>>> > by
>>>> > | > | pinv() and solve() in my code; merely to see what happens, but
>>>> > chances
>>>> > | > | are they also require a full LAPACK.
>>>> > |
>>>> > | Indeed, the error on R-forge is now with zgels_, required to solve
>>>> > | linear systems. It seems one cannot solve Armadillo linear systems
>>>> > | without LAPACK in the current situation.
>>>> >
>>>> > Yes. Doug, Romain and myself should address that, or at least make it
>>>> > clear
>>>> > what feature of the full Armadillo are lacking in RcppArmadillo.
>>>> >
>>>> > | > Sometime relatively early in the RcppArmadillo development process,
>>>> > Doug
>>>> > | > convinced Romain and myself to go for a pure template solution with
>>>> > Armadillo
>>>> > | > as all / most things found during the configure (or in this case,
>>>> > cmake)
>>>> > | > stage can be assumed 'found' given that we have around us by design.
>>>> >  So no
>>>> > | > testing, no local library and full reliance and what R gives us.
>>>> > | >
>>>> > | > That was a brilliant idea, and has freed us from having to rely on
>>>> > building
>>>> > | > and shipping a library, having to tell users how to set PKG_LIBS etc
>>>> > pp and I
>>>> > | > firmly believe that this helped tremendously in getting
>>>> > RcppArmadillo more
>>>> > | > widely used. So I would not want to revert this.
>>>> > |
>>>> > | It sounds like a good choice, I agree. Yet solving a linear system is
>>>> > | quite a crucial task in linear algebra; it would be nice if we could
>>>> > | come up with a tutorial recipe for dummies like me.
>>>> > |
>>>> > | >
>>>> > | > In any event, it seems that you need more LAPACK than R has for you.
>>>> >  That is
>>>> > | > likely to be a dicey situation as you per se do not know whether R
>>>> > was built
>>>> > | > and linked with its own (subset) copy of LAPACK, or whether it uses
>>>> > system
>>>> > | > LAPACK libraries (as e.g. the Debian / Ubuntu systems do).  So you
>>>> > may be in
>>>> > | > a spot bother and I not sure what I can recommend --- other than
>>>> > trying your
>>>> > | > luck at some short configure snippets that will run at package build
>>>> > time to
>>>> > | > determine whether the system you want to build cda on it 'rich'
>>>> > enough to
>>>> > | > support it.  I can help you off list with some configure snippets as
>>>> > some of
>>>> > | > my packages have configure code; adding a test for zgetri should be
>>>> > feasible.
>>>> > | >
>>>> > | > | Does anybody have any experience
>>>> > | > | dealing with such issues w.r.t releasing a package on R-forge /
>>>> > CRAN?
>>>> > | > | Is there any chance they would consider installing LAPACK on those
>>>> > | > | servers, or is there no point in asking such things?
>>>> > | >
>>>> > | > I don't think it is a matter of fixing the R-Forge server. I think
>>>> > it is a
>>>> > | > matter of making your package installable on the largest number of
>>>> > user
>>>> > | > systems.  Also try win-builder.r-project.org to see how it fares on
>>>> > that
>>>> > | > platform.
>>>>
>>>> Unsurprisingly, it fails, with the same complaint as R-forge.
>>>>
>>>> > |
>>>> > | I was under the impression that R-forge or CRAN, if it had LAPACK
>>>> > | installed, could produce binaries for the relevant platforms, and
>>>> > | users would not have to build the package themselves and would not be
>>>> > | required of having LAPACK on their machine. That's probably a
>>>> > | misconception, isn't it?
>>>> >
>>>> > If and only statically linked binaries or libraries where produced,
>>>> > which is
>>>> > generally not the case. Many OSs (Linux incl) ship source only and
>>>> > otherwise
>>>> > link dynamically, others (Windoze) use dynamic linking and OS X is for
>>>> > all I
>>>> > know somewhere in the middle (as you can get prebuild packages with
>>>> > dynamic
>>>> > linking or build from source).
>>>>
>>>> I see; so basically the user will always need to have a full LAPACK
>>>> installed. Here's one question then, if R-core didn't consider
>>>> necessary to include those particular functions from LAPACK,
>>>> presumably that means that R defines its own routines to solve linear
>>>> systems and invert matrices. Is there any possibility to use those
>>>> routines with Armadillo?
>>>
>>> One important point has been missed in this discussion, which is that the
>>> particular Lapack subroutine that is not found in the subset of Lapack
>>> shipped with R is for general, complex-valued matrices (just google the name
>>> zgetri).  The way that Armadillo is structured it is either all-in or
>>> all-out with respect to complex-valued matrices  If you allow for complex
>>> matrices then you must have all the supporting subroutines for whatever you
>>> are calling.  If you call inv() you need to allow for all the [sdcz]getri
>>> routines to be available.
>>
>> The matrix I need to invert is definitely always complex; in fact,
>> convenient complex algebra is the main attraction of Armadillo for me.
>
> Argh.  Of course, I was being foolish.  Because Armadillo is a
> header-only library it does not access any Lapack subroutine until the
> call is instantiated.  I still haven't quite gotten used to thinking
> only having the headers.
>
>>>
>>> So one possibility is to check the Armadillo sources to see if you can
>>> suppress the use of complex-valued matrices when compiling your package.  A
>>> quick glance indicates that this might now be easy.
>>>
>>> Otherwise, remember the first rule of numerical linear algebra which is, "If
>>> your algorithm involves computing the inverse of a matrix you should rethink
>>> the algorithm because there is a better way of doing it".  So why do you
>>> need inv()?
>>> If the answer is to solve a linear system of equations then you
>>> use solve(), you do not use inv().  If, for some truly unusual reason you
>>> actually need the inverse (and, remember in 99.99% of the cases where people
>>> think they need the inverse, they don't) then you use a combination of
>>> solve() and eye().
>>
>> I tried this, and in fact I do use solve() elsewhere in my code, which
>> calls for another LAPACK routine (zgels.f) not present in R. The
>> reason I still need inv() is that I have to solve about 300 times a
>> linear system AX=B with the same matrix A but varying B. A quick
>> timing last week revealed that using solve() 300 times was typically
>> slower by a factor of two  in my use case than using inv() once (or
>> pinv() for that matter, it makes not appreciable difference). I'm
>> happy to be shown otherwise though.
>
> I forgot that Armadillo doesn't provide a convenient way of using the
> LU decomposition (that is one of the things that I find frustrating
> about Armadillo).  Did you try a single solve call in which the
> right-hand side is an identity matrix?  On looking at the Armadillo
> sources it seems that it should call zgesv which is included in R's
> subset of the Lapack routines.

It seems curious, looking at the bestiary of LAPACK functions, that so
many of them seem to perform a similar task. I wonder what are the
practical downsides of using a code that solves AX=I rather than one
inverting A. Also, when I replaced inv() by solve() and pinv(),
R-forge still failed to build complaining that zgesv_ was not present.
I assumed it was used by solve(), but perhaps it was pinv(). I'll give
it a try anyway, but I'd rather hope we can figure out a less ad hoc
solution.

Thanks,

baptiste

>
>
>> One option that I'd like to consider is whether the appropriate LAPACK
>> routines could be wrapped and shipped in a separate package
>> (discontinued rblas could provide a good starting point). Sadly, I
>> know nothing about static/dynamic libraries and all this business..
>>
>> Thanks,
>>
>> baptiste
>>
>>>
>>>> > | Sorry for being dense, I don't know anything about linking R to
>>>
>>>> > | external dependencies.
>>>> >
>>>> > It can be done; there are many examples -- for example every package
>>>> > using
>>>> > the GSL.
>>>>
>>>> I just checked how RcppGSL does it, and well, this configure magic is
>>>> way above my head.
>>>>
>>>> Best,
>>>>
>>>> baptiste
>>>> _______________________________________________
>>>> Rcpp-devel mailing list
>>>> Rcpp-devel at lists.r-forge.r-project.org
>>>> https://lists.r-forge.r-project.org/cgi-bin/mailman/listinfo/rcpp-devel
>>>
>>
>


More information about the Rcpp-devel mailing list