[GenABEL-dev] faster polygenic

Thu Mar 3 10:11:52 CET 2011

>>> That's interesting, I guess the extra variance inflation is probably due
>>> to uncertainty in K which we don't model. I've wondered about extending
>>> the method to work with an uncertain K, I'm not sure how feasible
>>> statistically or computationally that would be.
>>
>> This is a very interesting suggestion -- I never quite thought about what
>> the effects of uncertainty in K would be on the test statistics. My
>> feeling is that uncertainty in K is likely to translate into
>> uncertainty in Lambda (so the SD of Lambda would be bigger), but not into
>> bias -- and that is what we see; but I may well be wrong about that.
>>
>
> You might be right; I've not thought a lot about it, but my initial thinking
> was that there might be some bias. My line of thought was: if there is
> maximal uncertainty in K, then K adds no useful information, so you'd expect
> to estimate heritability=0. But then your statistics are just uncorrected
> chisq, so you have a standard GC situation, in which case lambda>1. However,
> if K is accurate, you'd hope that the statistic is correct and therefore
> lambda=1.

I see your point. This is an interesting 'extreme' scenario to
consider. It would be really interesting to try that: say, use fewer
and fewer markers to estimate K and see what happens. It is rather
trivial to set up, and may give us important clues about how many
markers should be used for K.
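
A minimal sketch of how K could be estimated from a chosen subset of
markers, so that the subset can be shrunk step by step. It assumes
genotypes coded as 0/1/2 allele counts and uses one common
allele-frequency-weighted kinship estimator; whether this matches the
estimator used for the results discussed in this thread is an assumption,
and the function name `genomic_kinship` is hypothetical.

```python
def genomic_kinship(genotypes, marker_idx):
    """Estimate a kinship matrix K from a subset of markers.

    genotypes  : list of per-individual lists of 0/1/2 allele counts
    marker_idx : indices of the markers to use (the subset under study)

    Assumed estimator (kinship scale, diagonal about 0.5 without inbreeding):
        K[i][j] = mean over markers of (g_i - 2p)(g_j - 2p) / (4 p (1 - p))
    """
    n = len(genotypes)
    K = [[0.0] * n for _ in range(n)]
    used = 0
    for k in marker_idx:
        col = [g[k] for g in genotypes]
        p = sum(col) / (2.0 * n)          # allele frequency at marker k
        if p <= 0.0 or p >= 1.0:
            continue                      # monomorphic marker: no information
        denom = 4.0 * p * (1.0 - p)
        centred = [x - 2.0 * p for x in col]
        for i in range(n):
            for j in range(i, n):
                K[i][j] += centred[i] * centred[j] / denom
        used += 1
    if used == 0:
        raise ValueError("no polymorphic markers in the chosen subset")
    for i in range(n):
        for j in range(i, n):
            K[i][j] /= used               # average over the markers used
            K[j][i] = K[i][j]             # fill the lower triangle
    return K
```

Calling this with `random.sample(range(n_markers), m)` for decreasing `m`,
refitting the polygenic model with each K, and recording Lambda each time
would give the curve we are after.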

I was working under the assumption that even in population-based
studies you can correct for inflation (Lambda>1) by using K. This is
actually another interesting question to test using real data :) But
you are right, you probably need a very good K to do that.

> But maybe this is not quite what you mean by bias?
>

By 'bias' I indeed mean deviation of Lambda from 1.
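
To make that concrete, here is a minimal sketch of a median-based genomic
control estimate of Lambda from 1-df chi-square test statistics. The
median-based estimator is assumed here for illustration (other estimators
exist), and the names `gc_lambda` and `CHISQ1_MEDIAN` are hypothetical.

```python
from statistics import NormalDist, median

# Median of a chi-square distribution with 1 df (about 0.4549).
# It equals the squared 0.75 quantile of the standard normal.
CHISQ1_MEDIAN = NormalDist().inv_cdf(0.75) ** 2


def gc_lambda(chisq_stats):
    """Median-based inflation factor: observed median / expected median.

    A value close to 1 means no inflation; 'bias' in the sense above is
    the deviation of this value from 1.
    """
    stats = list(chisq_stats)
    if not stats:
        raise ValueError("need at least one test statistic")
    return median(stats) / CHISQ1_MEDIAN
```

Under the null with a well-specified K, the returned value should scatter
around 1; a systematic shift across replicates would be the bias in
question, while a wider scatter would only be a larger SD.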

>> I think we can get a better view of the problem and answer to
>> William's hypothesis when we get polygenic-FMM working as an entity,
>
> OK, I'll see what I can do. Is the best approach to have the user create a
> Polygenic-FMM R object and then let the user apply an R function to it when
> they want to do the GWAS?
>
> GWAS-FMM(Polygenic-FMM-Object, GWAS-SNP-Data)
>
> In this case we will need to call two separate C functions from R: one to
> set up and fit the polygenic-FMM object, and then another to run the GWAS.
>
>

Absolutely, this is what I was thinking of. Polygenic-FMM-Object should
comply with the 'polygenic' format -- meaning it should have a few slots
named in a particular way, but not necessarily all of them, and it can
contain extra slots with info not in standard polygenic (e.g. I remember
you derive something big during null-herit estimation, which is not in
standard polygenic). Then we could think of coming up with a
'consolidated' polygenic-class at the end, or decide this is not worth
the effort :)

all the best,
Yurii