[Rcpp-devel] Detecting Singular Matrices in RcppArmadillo

Mon Jan 31 02:13:15 CET 2022

Yes, quoting the paper on which ‘solve’ is based on (
http://arma.sourceforge.net/armadillo_solver_2020.pdf):

“The SVD-based solver uses the xGELSD set of functions, which find a
minimum-norm solution to a linear least squares problem.”

On Mon, Jan 31, 2022 at 12:12 AM Alex Ilich <alexilich93 at gmail.com> wrote:

> Thanks for the advice Douglas. Just using B = solve(X,Y) works and it
> outputs a warning that an approximate solution is attempted when the matrix
> is nearly singular. To suppress the warning I've just included #define
> ARMA_WARN_LEVEL 1 at the top of my cpp code. Also, no longer getting some
> odd outliers that I was previously! For documentation purposes, would it
> still be accurate to say "parameters are fit using ordinary least squares"?
>
> Thanks,
> Alex
>
> On Sat, Jan 29, 2022 at 12:05 PM Douglas Bates <dmbates at gmail.com> wrote:
>
>> If you are concerned about singularity it would be better to use a QR or
>> singular value decomposition of X because the condition number of X'X is
>> the square of the condition number of X. Just the act of forming X'X from X
>> makes the conditioning much worse.
>>
>> I believe that arma::solve, as used by
>> https://github.com/RcppCore/RcppArmadillo/blob/master/src/fastLm.cpp,
>> uses a QR decomposition.
>>
>> In general it is much more stable to work with a decomposition of X than
>> to form X'X and work with it.
>>
>>  Also, even if the formula for a calculation involves the inverse of a
>> matrix, it is rarely a good idea to evaluate the full inverse.  Most of the
>> time, evaluating the inverse is much more work than is required to solve a
>> single linear system of equations.  It is like saying that evaluating a
>> scalar quotient like a/b requires that you first evaluate the inverse of b,
>> 1/b, then form the product a * (1/b).  In fact, it is worse because
>> evaluating the inverse of an n by n matrix A takes roughly the same amount
>> of work as solving n systems of equations of the form Ax = b.
>>
>> On Fri, Jan 28, 2022 at 6:56 PM <alexilich93 at gmail.com> wrote:
>>
>>> Thanks! I'll use if(rcf <= std::numeric_limits<double>::epsilon())
>>> instead of rcf==0.
>>>
>>> On Fri, Jan 28, 2022 at 7:32 PM Dirk Eddelbuettel <edd at debian.org>
>>> wrote:
>>>
>>>>
>>>> On 28 January 2022 at 13:25, Alex Ilich wrote:
>>>> | Thank you Neal and Dirk. Based on these comments I think using
>>>> arma::rcond
>>>>
>>>> Ah, yes, of course. Should have remembered arma::rcond.
>>>>
>>>> | to get the reciprocal of the condition factor, and checking if that
>>>> is zero
>>>> | rather than the determinant may be the route to go. At least for this
>>>> | example that works.
>>>> |
>>>> | library(Rcpp)
>>>> |
>>>> | cppFunction(depends="RcppArmadillo", "NumericVector C_OLS(arma::mat X,
>>>> | arma::mat Y){
>>>> |   arma::mat Xt = trans(X);
>>>> |   arma::mat XtX = Xt * X;
>>>> |   double rcf = rcond(XtX);
>>>> |   Rcout << rcf; //print reciprocal of condition factor
>>>> |   if(rcf==0){
>>>>
>>>> Re-read Neal's first email.  You probably do not want zero (apart from
>>>> all
>>>> the R FAQ 7.31 reasons to not compare a double with equality :wink: )
>>>>
>>>> Dirk
>>>>
>>>> |     NumericVector B2(X.n_cols, NA_REAL);
>>>> |     return B2;
>>>> |   } else{
>>>> |     arma::mat XtX_inv= inv(XtX);
>>>> |     NumericVector B = Rcpp::as<Rcpp::NumericVector>(wrap(XtX_inv *
>>>> (Xt *
>>>> | Y)));
>>>> |     return B;
>>>> |   }}")
>>>> |
>>>> | X<- matrix(data=c(-250, -250, -250, -250, -250, -250, 250, 200, 150,
>>>> |                   100, 50, 0, 1, 1, 1, 1, 1, 1), ncol=3, nrow=6)
>>>> |
>>>> |
>>>> | X2<- X
>>>> |
>>>> | X2[1,2]<- X[1,2] + 5.684342e-14 #Add a slight numeric precision issue
>>>> |
>>>> | Y<- matrix(1:6, ncol=1)
>>>> |
>>>> | C_OLS(X,Y)
>>>> | # 0[1] NA NA NA
>>>> |
>>>> | C_OLS(X2,Y)
>>>> | # 0[1] NA NA NA
>>>> |
>>>> |
>>>> |
>>>> | On Fri, Jan 28, 2022 at 11:53 AM Alex Ilich <alexilich93 at gmail.com>
>>>> wrote:
>>>> |
>>>> | > Hi, I have some code I've written to calculate regression
>>>> parameters using
>>>> | > ordinarily least squares using RcppArmadillo. This is meant to be
>>>> performed
>>>> | > iteratively over many small subsets (sliding window over spatial
>>>> raster
>>>> | > data) so I'd like to automatically detect when the regression can't
>>>> be
>>>> | > performed and just return a vector of NA's. Currently I check to
>>>> see if the
>>>> | > determinant is 0 in order to see if the XtX matrix can be inverted
>>>> and it
>>>> | > seems to work in the vast majority of cases but I did run into a
>>>> case where
>>>> | > a small amount of precision error was introduced and for some
>>>> reason this
>>>> | > causes the determinant to be non-zero in C++ however, the matrix
>>>> still is
>>>> | > considered singular (uninvertible) leading to an error. I was
>>>> wondering if
>>>> | > anyone had any suggestions? Maybe there's a minimum value non-zero
>>>> value
>>>> | > where it's considered essentially singular that I could use instead
>>>> of 0,
>>>> | > or maybe there's a way to return a vector of NAs if an error is
>>>> detected
>>>> | > rather than stopping execution and printing an error message. I've
>>>> also
>>>> | > seen that pinv can be used, but from some tests I ran it's
>>>> substantially
>>>> | > slower, so I'd like to avoid using that. I've included some example
>>>> code
>>>> | > below. Any suggestions would be greatly appreciated.
>>>> | >
>>>> | > Thanks,
>>>> | > Alex
>>>> | >
>>>> | > library(Rcpp)
>>>> | > cppFunction(depends="RcppArmadillo", "NumericVector C_OLS(arma::mat
>>>> X,
>>>> | > arma::mat Y){
>>>> | >   arma::mat Xt = trans(X);
>>>> | >   arma::mat XtX = Xt * X;
>>>> | >   double d = det(XtX);
>>>> | >   Rcout << d; //print determinant
>>>> | >   if(d==0){
>>>> | >     NumericVector B2(X.n_cols, NA_REAL);
>>>> | >     return B2;
>>>> | >   } else{
>>>> | >     arma::mat XtX_inv= inv(XtX);
>>>> | >     NumericVector B = Rcpp::as<Rcpp::NumericVector>(wrap(XtX_inv *
>>>> (Xt *
>>>> | > Y)));
>>>> | >     return B;
>>>> | >   }}")
>>>> | >
>>>> | > X<- matrix(data=c(-250, -250, -250, -250, -250, -250, 250, 200, 150,
>>>> | >                   100, 50, 0, 1, 1, 1, 1, 1, 1), ncol=3, nrow=6)
>>>> | >
>>>> | >
>>>> | > X2<- X
>>>> | >
>>>> | > X2[1,2]<- X[1,2] + 5.684342e-14 #Add a slight numeric precision
>>>> issue
>>>> | >
>>>> | > Y<- matrix(1:6, ncol=1)
>>>> | >
>>>> | > #Determinant of XtX in R is 0 for both
>>>> | > det(t(X) %*% X) ==0 #TRUE
>>>> | > det(t(X2) %*% X2) ==0 #TRUE
>>>> | >
>>>> | > C_OLS(X,Y) #Determinant is 0 in C++ and correctly returns a vector
>>>> of NA's
>>>> | > #0[1] NA NA NA
>>>> | >
>>>> | > C_OLS(X2,Y) #Determinant is not 0 in C++ so the if statement is not
>>>> | > triggered but the matrix is still uninvertible
>>>> | > # 4.57764e-05
>>>> | > # error: inv(): matrix is singular
>>>> | > # Error in C_OLS(X2, Y) : inv(): matrix is singular
>>>> | >
>>>> | _______________________________________________
>>>> | Rcpp-devel mailing list
>>>> | Rcpp-devel at lists.r-forge.r-project.org
>>>> |
>>>> https://lists.r-forge.r-project.org/cgi-bin/mailman/listinfo/rcpp-devel
>>>> --
>>>> https://dirk.eddelbuettel.com | @eddelbuettel | edd at debian.org
>>>>
>>> _______________________________________________
>>> Rcpp-devel mailing list
>>> Rcpp-devel at lists.r-forge.r-project.org
>>> https://lists.r-forge.r-project.org/cgi-bin/mailman/listinfo/rcpp-devel
>>
>> _______________________________________________
> Rcpp-devel mailing list
> Rcpp-devel at lists.r-forge.r-project.org
> https://lists.r-forge.r-project.org/cgi-bin/mailman/listinfo/rcpp-devel

-- 
Zé Vinícius
https://mirca.github.io
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.r-forge.r-project.org/pipermail/rcpp-devel/attachments/20220131/f2635253/attachment-0001.html>