[GenABEL-dev] ProbABEL, chi^2, Wald and log-likelihood

L.C. Karssen lennart at karssen.org
Sun Jul 14 22:00:38 CEST 2013


Thanks for the explanation Yurii.

On 12-07-13 01:41, Yurii Aulchenko wrote:
> In principle, the score, Wald, and likelihood ratio (LRT) tests should
> give similar answers in non-extreme cases. The LRT is theoretically the
> best of the three (if the underlying model assumptions, e.g. normality,
> hold). The score and Wald tests are approximations to the LRT, derived at
> the null and at the alternative, respectively. They actually ARE derived
> from quadratic approximations of the likelihood function at these
> points :)

Interesting! I didn't know that.
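
For a concrete feel, here is a minimal Python sketch of the three
statistics for a toy problem with closed-form answers: testing a binomial
proportion against a null value p0. This is not ProbABEL code; the function
name and the numbers are made up for illustration. The score statistic uses
the variance at the null, the Wald statistic uses the variance at the
estimate, and the LRT compares the two log-likelihoods directly.

```python
import math

def binomial_tests(k, n, p0):
    """Return (score, wald, lrt) chi^2 statistics (1 df) for H0: p = p0.

    Toy example with hypothetical inputs: k successes out of n trials.
    Requires 0 < k < n so that all logarithms and variances are defined.
    """
    p_hat = k / n  # maximum likelihood estimate of p
    diff2 = (p_hat - p0) ** 2
    # Score test: curvature of the log-likelihood evaluated at the null.
    score = diff2 / (p0 * (1 - p0) / n)
    # Wald test: curvature evaluated at the estimate (the alternative).
    wald = diff2 / (p_hat * (1 - p_hat) / n)
    # LRT: twice the difference in log-likelihood between estimate and null.
    lrt = 2 * (k * math.log(p_hat / p0)
               + (n - k) * math.log((1 - p_hat) / (1 - p0)))
    return score, wald, lrt

score, wald, lrt = binomial_tests(k=520, n=1000, p0=0.5)
print(f"score={score:.4f}  Wald={wald:.4f}  LRT={lrt:.4f}")
```

With a large sample and an estimate close to the null, the three values
agree closely. They drift apart for small n or for estimates far from p0
(try k=9, n=10).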

> 
> As for the practical advantages/disadvantages of these, maybe someone
> else could comment. I remember there are good and bad sides to both...
> 
> Re: Wald on 2 df: you cannot simply add the Wald statistics from the
> individual beta/se pairs; you need to take the covariance into account.

I see. I guess adding them is only allowed when the two estimates are
independent (hence zero covariance). Right?
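
To make this concrete, a small Python sketch with made-up numbers (not
ProbABEL output): the 2 df Wald statistic is the quadratic form b' V^-1 b,
where V is the covariance matrix of the two estimates. The naive sum of the
two squared z-scores equals it only when the off-diagonal covariance is
zero. The function names and all values below are hypothetical. Since both
betas in the 2 df model are contrasts against the same reference genotype,
I'd expect their covariance to be nonzero in general.

```python
def wald_2df(b1, b2, var1, var2, cov12):
    """Joint Wald chi^2 (2 df): b' V^-1 b for a 2x2 covariance matrix V."""
    det = var1 * var2 - cov12 ** 2  # determinant of V, must be positive
    if det <= 0:
        raise ValueError("covariance matrix is not positive definite")
    return (b1 ** 2 * var2 - 2 * b1 * b2 * cov12 + b2 ** 2 * var1) / det

def naive_wald_sum(b1, b2, var1, var2):
    """Sum of the two 1 df Wald statistics, ignoring the covariance."""
    return b1 ** 2 / var1 + b2 ** 2 / var2

# Hypothetical estimates and (co)variances, for illustration only.
b_het, b_hom = 0.10, 0.25
var_het, var_hom = 0.20 ** 2, 0.30 ** 2
cov = 0.03

joint = wald_2df(b_het, b_hom, var_het, var_hom, cov)
naive = naive_wald_sum(b_het, b_hom, var_het, var_hom)
uncorrelated = wald_2df(b_het, b_hom, var_het, var_hom, 0.0)
print(f"joint={joint:.4f}  naive sum={naive:.4f}  joint with cov=0: {uncorrelated:.4f}")
```

The naive sum and the joint statistic coincide only in the zero-covariance
case; with a nonzero covariance the naive sum can be off in either
direction.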

> For a full treatment of the
> problem, see
> 
> http://www.math.chalmers.se/~wermuth/pdfs/86-95/CoxWer90_An_approximation_to_ML.pdf
> 

Thanks. Not an easy piece to read...

> For a simple variant, I think our ProbABEL paper does give some details
> on score/Wald. 
> 
> Would it be a good idea to put this topic to our "Journal club"? Topics
> like this are of general interest, irrespective of GenABEL.
> 

Good idea. I'll see if I can find the time to start the discussion there.


Best,

Lennart.


> best,
> Yurii
> 
> On Thu, Jul 11, 2013 at 11:56 PM, L.C. Karssen <lennart at karssen.org
> <mailto:lennart at karssen.org>> wrote:
> 
>     Dear all,
> 
>     For the upcoming release of ProbABEL I've run into the following. In the
>     past (~ v0.1-3) the output of ProbABEL had chi^2 values when doing Cox
>     regression. These were based on the likelihood ratio test:
>      2 * (loglik - loglik_null) ~ chi_1^2
>     However, at some point, when missing data became allowed in
>     ProbABEL, we ran into the problem that the null model had to be
>     recalculated for cases with missing genotype data. To do that 'simply'
>     for each SNP would be time consuming, so the chi^2 values were removed
>     from the output and replaced by the loglik values for the full model.
>     (At least, that's how I guess it went).
> 
>     Now, I would like to get them back. This can be done in two ways:
>     1) calculate chi^2 as described above, with some smart way of only
>     recalculating the null model when a missing value occurs (this shouldn't
>     be often with today's imputed data).
>     2) simply calculate the chi^2 value through the Wald test. We have betas
>     and se_betas, so that is easy.
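
Both options are only a few lines each once the ingredients are available.
A rough Python sketch, assuming the log-likelihoods, beta and se_beta come
from an already fitted model (the function names are mine, not ProbABEL's).
The p-value helper uses the closed forms of the chi^2 upper-tail
probability for 1 and 2 df. Note that the LRT only makes sense when the
full and null models were fitted on the same individuals, which is exactly
why the null model needs recalculating when genotypes are missing.

```python
import math

def lrt_chi2(loglik, loglik_null):
    """Likelihood ratio statistic: 2 * (loglik - loglik_null)."""
    return 2.0 * (loglik - loglik_null)

def wald_chi2_1df(beta, se_beta):
    """1 df Wald statistic: (beta / se_beta)^2."""
    return (beta / se_beta) ** 2

def chi2_pvalue(stat, df):
    """Upper-tail chi^2 probability, closed forms for df = 1 and df = 2 only."""
    if stat < 0:
        raise ValueError("chi^2 statistic must be non-negative")
    if df == 1:
        return math.erfc(math.sqrt(stat / 2.0))
    if df == 2:
        return math.exp(-stat / 2.0)
    raise ValueError("only df = 1 or df = 2 are handled in this sketch")

# Hypothetical values, for illustration only.
print(chi2_pvalue(lrt_chi2(-100.0, -101.5), df=1))
print(chi2_pvalue(wald_chi2_1df(0.2, 0.1), df=1))
```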
> 
>     Many of you have more knowledge about statistics than I do, so,
>     statistically, are these methods equivalent? Or is one better (more
>     precise/unbiased) than the other?
> 
> 
>     Another question:
>     While testing the Wald-type implementation I ran into the following:
>     I would assume that for the 2df models (where we get beta_SNP_A1A2 and
>     beta_SNP_A1A1) the final chi^2 value would be the sum of the individual
>     Wald statistics, which would be distributed as chi_2^2 (so 2 df). Is
>     that correct? I ask this because if I compare them with the chi^2 values
>     from the LRT I get different values. In the example data set I get:
>     name      chi^2_Wald        chi^2_LRT
>     rs7247199 0.880949           0.452465
>     rs8102643 0.0116651          0.512709   <- here we have a missing value!
>     rs8102615 1.51434            0.754701
>     rs8105536 2.56337            1.33223
>     rs2312724 0.492364           0.256649
> 
>     When running the additive model I do get (almost) the same results:
>     name       chi^2_Wald        chi^2_LRT
>     rs7247199  0.0101558          0.01012
>     rs8102643  0.353168           0.492147  <- here we have a missing value!
>     rs8102615  0.0181841          0.0180033
>     rs8105536  0.00222781         0.00222216
>     rs2312724  0.0412005          0.0401556
> 
>     Shouldn't the chi^2 values be equal in both cases? FYI: the LRT chi^2
>     values are the same as those obtained with ProbABEL v0.1-3.
> 
> 
>     Any suggestions?
>     Thanks,
> 
>     Lennart.
> 
>     --
>     -----------------------------------------------------------------
>     L.C. Karssen
>     Utrecht
>     The Netherlands
> 
>     lennart at karssen.org <mailto:lennart at karssen.org>
>     http://blog.karssen.org
> 
>     Please do not send me Word or PowerPoint files!
>     See http://www.gnu.org/philosophy/no-word-attachments.nl.html
>     ------------------------------------------------------------------
> 
> -- 
> -----------------------------------------------------
> Yurii S. Aulchenko
> 
> [ LinkedIn <http://nl.linkedin.com/in/yuriiaulchenko> ] [ Twitter
> <http://twitter.com/YuriiAulchenko> ] [ Blog
> <http://yurii-aulchenko.blogspot.nl/> ]

-- 
-----------------------------------------------------------------
L.C. Karssen
Utrecht
The Netherlands

lennart at karssen.org
http://blog.karssen.org

Please do not send me Word or PowerPoint files!
See http://www.gnu.org/philosophy/no-word-attachments.nl.html
------------------------------------------------------------------
