[Rcpp-devel] RcppArmadillo inv() depends on Lapack, zgetri_ not available on CRAN / R-forge?

Douglas Bates bates at stat.wisc.edu
Tue Jun 7 23:04:30 CEST 2011


On Tue, Jun 7, 2011 at 3:47 PM, baptiste auguie
<baptiste.auguie at googlemail.com> wrote:
> On 8 June 2011 08:03, Douglas Bates <bates at stat.wisc.edu> wrote:
>> On Tue, Jun 7, 2011 at 2:34 PM, baptiste auguie
>> <baptiste.auguie at googlemail.com> wrote:
>>> Hi Doug,
>>>
>>> On 8 June 2011 03:43, Douglas Bates <bates at stat.wisc.edu> wrote:
>>>>
>>>> On Jun 6, 2011 4:17 AM, "baptiste auguie" <baptiste.auguie at googlemail.com>
>>>> wrote:
>>>>>
>>>>> Thank you for the explanations below.
>>>>>
>>>>> On 5 June 2011 10:40, Dirk Eddelbuettel <edd at debian.org> wrote:
>>>>> >
>>>>> > On 5 June 2011 at 10:12, baptiste auguie wrote:
>>>>> > | Hi Dirk and all,
>>>>> > |
>>>>> > | On 4 June 2011 12:04, Dirk Eddelbuettel <edd at debian.org> wrote:
>>>>> > | >
>>>>> > | > Baptiste,
>>>>> > | >
>>>>> > | > On 4 June 2011 at 11:45, baptiste auguie wrote:
>>>>> > | > | Dear list,
>>>>> > | > |
>>>>> > | > | My package cda, which I was hoping to release on CRAN, fails to
>>>>> > | > | compile on R-forge with error,
>>>>> > | > |
>>>>> > | > | ** testing if installed package can be loaded
>>>>> > | > | Error in dyn.load(file, DLLpath = DLLpath, ...) :
>>>>> > | > |   unable to load shared object
>>>>> > '/tmp/RtmpbztUMm/Rinst1829c04c/cda/libs/cda.so':
>>>>> > | > |   /tmp/RtmpbztUMm/Rinst1829c04c/cda/libs/cda.so: undefined symbol:
>>>>> > zgetri_
>>>>> > | > |
>>>>> > | > | It builds fine on my local machines (Mac OS 10.5, 10.6).
>>>>> > | > |
>>>>> > | > | >From an older discussion on this list <
>>>>> > | > |
>>>>> > http://www.mail-archive.com/rcpp-devel@lists.r-forge.r-project.org/msg00678.html>
>>>>> > | > | the issue seems to be that Armadillo's inv() relies on a function
>>>>> > that
>>>>> > | > | is not provided by R, only by LAPACK. I have just replaced inv()
>>>>> > by
>>>>> > | > | pinv() and solve() in my code; merely to see what happens, but
>>>>> > chances
>>>>> > | > | are they also require a full LAPACK.
>>>>> > |
>>>>> > | Indeed, the error on R-forge is now with zgels_, required to solve
>>>>> > | linear systems. It seems one cannot solve Armadillo linear systems
>>>>> > | without LAPACK in the current situation.
>>>>> >
>>>>> > Yes. Doug, Romain and myself should address that, or at least make it
>>>>> > clear
>>>>> > what feature of the full Armadillo are lacking in RcppArmadillo.
>>>>> >
>>>>> > | > Sometime relatively early in the RcppArmadillo development process,
>>>>> > Doug
>>>>> > | > convinced Romain and myself to go for a pure template solution with
>>>>> > Armadillo
>>>>> > | > as all / most things found during the configure (or in this case,
>>>>> > cmake)
>>>>> > | > stage can be assumed 'found' given that we have around us by design.
>>>>> >  So no
>>>>> > | > testing, no local library and full reliance and what R gives us.
>>>>> > | >
>>>>> > | > That was a brilliant idea, and has freed us from having to rely on
>>>>> > building
>>>>> > | > and shipping a library, having to tell users how to set PKG_LIBS etc
>>>>> > pp and I
>>>>> > | > firmly believe that this helped tremendously in getting
>>>>> > RcppArmadillo more
>>>>> > | > widely used. So I would not want to revert this.
>>>>> > |
>>>>> > | It sounds like a good choice, I agree. Yet solving a linear system is
>>>>> > | quite a crucial task in linear algebra; it would be nice if we could
>>>>> > | come up with a tutorial recipe for dummies like me.
>>>>> > |
>>>>> > | >
>>>>> > | > In any event, it seems that you need more LAPACK than R has for you.
>>>>> >  That is
>>>>> > | > likely to be a dicey situation as you per se do not know whether R
>>>>> > was built
>>>>> > | > and linked with its own (subset) copy of LAPACK, or whether it uses
>>>>> > system
>>>>> > | > LAPACK libraries (as e.g. the Debian / Ubuntu systems do).  So you
>>>>> > may be in
>>>>> > | > a spot bother and I not sure what I can recommend --- other than
>>>>> > trying your
>>>>> > | > luck at some short configure snippets that will run at package build
>>>>> > time to
>>>>> > | > determine whether the system you want to build cda on it 'rich'
>>>>> > enough to
>>>>> > | > support it.  I can help you off list with some configure snippets as
>>>>> > some of
>>>>> > | > my packages have configure code; adding a test for zgetri should be
>>>>> > feasible.
>>>>> > | >
>>>>> > | > | Does anybody have any experience
>>>>> > | > | dealing with such issues w.r.t releasing a package on R-forge /
>>>>> > CRAN?
>>>>> > | > | Is there any chance they would consider installing LAPACK on those
>>>>> > | > | servers, or is there no point in asking such things?
>>>>> > | >
>>>>> > | > I don't think it is a matter of fixing the R-Forge server. I think
>>>>> > it is a
>>>>> > | > matter of making your package installable on the largest number of
>>>>> > user
>>>>> > | > systems.  Also try win-builder.r-project.org to see how it fares on
>>>>> > that
>>>>> > | > platform.
>>>>>
>>>>> Unsurprisingly, it fails, with the same complaint as R-forge.
>>>>>
>>>>> > |
>>>>> > | I was under the impression that R-forge or CRAN, if it had LAPACK
>>>>> > | installed, could produce binaries for the relevant platforms, and
>>>>> > | users would not have to build the package themselves and would not be
>>>>> > | required of having LAPACK on their machine. That's probably a
>>>>> > | misconception, isn't it?
>>>>> >
>>>>> > If and only statically linked binaries or libraries where produced,
>>>>> > which is
>>>>> > generally not the case. Many OSs (Linux incl) ship source only and
>>>>> > otherwise
>>>>> > link dynamically, others (Windoze) use dynamic linking and OS X is for
>>>>> > all I
>>>>> > know somewhere in the middle (as you can get prebuild packages with
>>>>> > dynamic
>>>>> > linking or build from source).
>>>>>
>>>>> I see; so basically the user will always need to have a full LAPACK
>>>>> installed. Here's one question then, if R-core didn't consider
>>>>> necessary to include those particular functions from LAPACK,
>>>>> presumably that means that R defines its own routines to solve linear
>>>>> systems and invert matrices. Is there any possibility to use those
>>>>> routines with Armadillo?
>>>>
>>>> One important point has been missed in this discussion, which is that the
>>>> particular Lapack subroutine that is not found in the subset of Lapack
>>>> shipped with R is for general, complex-valued matrices (just google the name
>>>> zgetri).  The way that Armadillo is structured it is either all-in or
>>>> all-out with respect to complex-valued matrices  If you allow for complex
>>>> matrices then you must have all the supporting subroutines for whatever you
>>>> are calling.  If you call inv() you need to allow for all the [sdcz]getri
>>>> routines to be available.
>>>
>>> The matrix I need to invert is definitely always complex; in fact,
>>> convenient complex algebra is the main attraction of Armadillo for me.
>>
>> Argh.  Of course, I was being foolish.  Because Armadillo is a
>> header-only library it does not access any Lapack subroutine until the
>> call is instantiated.  I still haven't quite gotten used to thinking
>> only having the headers.
>>
>>>>
>>>> So one possibility is to check the Armadillo sources to see if you can
>>>> suppress the use of complex-valued matrices when compiling your package.  A
>>>> quick glance indicates that this might now be easy.
>>>>
>>>> Otherwise, remember the first rule of numerical linear algebra which is, "If
>>>> your algorithm involves computing the inverse of a matrix you should rethink
>>>> the algorithm because there is a better way of doing it".  So why do you
>>>> need inv()?
>>>> If the answer is to solve a linear system of equations then you
>>>> use solve(), you do not use inv().  If, for some truly unusual reason you
>>>> actually need the inverse (and, remember in 99.99% of the cases where people
>>>> think they need the inverse, they don't) then you use a combination of
>>>> solve() and eye().
>>>
>>> I tried this, and in fact I do use solve() elsewhere in my code, which
>>> calls for another LAPACK routine (zgels.f) not present in R. The
>>> reason I still need inv() is that I have to solve about 300 times a
>>> linear system AX=B with the same matrix A but varying B. A quick
>>> timing last week revealed that using solve() 300 times was typically
>>> slower by a factor of two  in my use case than using inv() once (or
>>> pinv() for that matter, it makes not appreciable difference). I'm
>>> happy to be shown otherwise though.
>>
>> I forgot that Armadillo doesn't provide a convenient way of using the
>> LU decomposition (that is one of the things that I find frustrating
>> about Armadillo).  Did you try a single solve call in which the
>> right-hand side is an identity matrix?  On looking at the Armadillo
>> sources it seems that it should call zgesv which is included in R's
>> subset of the Lapack routines.
>
> It seems curious, looking at the bestiary of LAPACK functions, that so
> many of them seem to perform a similar task. I wonder what are the
> practical downsides of using a code that solves AX=I rather than one
> inverting A. Also, when I replaced inv() by solve() and pinv(),
> R-forge still failed to build complaining that zgesv_ was not present.
> I assumed it was used by solve(), but perhaps it was pinv(). I'll give
> it a try anyway, but I'd rather hope we can figure out a less ad hoc
> solution.

Once upon a time we actually counted the number of floating point
operations that were needed to perform a particular calculation, which
is the reason for all the special-case code in Lapack.  The difference
between using the LU decomposition of a matrix to solve a system in
which the right hand side is an identity, versus a special-purpose
piece of code like zgetri is that zgetri can take advantage of the
fact that the identity matrix is diagonal, thereby saving a relatively
small number of operations.

I'm not sure why there should be a complaint about zgesv_ not being
available.  It's there in the R sources (src/modules/lapack/cmplx.f
starting at line 3944).  That routine is trivial because it just
checks its arguments then calls zgetrf and zgetrs.
>
> Thanks,
>
> baptiste
>
>>
>>
>>> One option that I'd like to consider is whether the appropriate LAPACK
>>> routines could be wrapped and shipped in a separate package
>>> (discontinued rblas could provide a good starting point). Sadly, I
>>> know nothing about static/dynamic libraries and all this business..
>>>
>>> Thanks,
>>>
>>> baptiste
>>>
>>>>
>>>>> > | Sorry for being dense, I don't know anything about linking R to
>>>>
>>>>> > | external dependencies.
>>>>> >
>>>>> > It can be done; there are many examples -- for example every package
>>>>> > using
>>>>> > the GSL.
>>>>>
>>>>> I just checked how RcppGSL does it, and well, this configure magic is
>>>>> way above my head.
>>>>>
>>>>> Best,
>>>>>
>>>>> baptiste
>>>>> _______________________________________________
>>>>> Rcpp-devel mailing list
>>>>> Rcpp-devel at lists.r-forge.r-project.org
>>>>> https://lists.r-forge.r-project.org/cgi-bin/mailman/listinfo/rcpp-devel
>>>>
>>>
>>
>


More information about the Rcpp-devel mailing list