[adegenet-forum] Fwd: Significance of allelic contribution to discriminant functions
neagef at gmail.com
Mon Oct 6 19:09:32 CEST 2014
I took a closer look into the object created by the snpzip tool, and I
found the contributions for all the different axes.
I didn't notice them before, as I was only looking at the plot.
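For anyone following along, here is a rough sketch of looking at every axis rather than just the first. In a fitted DAPC object the contributions are in the var.contr component, with one row per marker and one column per discriminant function. To keep the sketch runnable in base R, it uses a simulated matrix in place of that component; `my_dapc` and `top_markers` are hypothetical names, not part of adegenet.

```r
# Simulated stand-in for my_dapc$var.contr (my_dapc is a hypothetical name):
# rows are markers, columns are discriminant functions.
set.seed(42)
raw <- matrix(rexp(100 * 3), nrow = 100,
              dimnames = list(paste0("snp", 1:100), paste0("LD", 1:3)))
contrib <- sweep(raw, 2, colSums(raw), "/")  # each column is rescaled to sum to 1

# Hypothetical helper: for every axis, return the n markers with the largest
# contributions and the share of that axis they account for together.
top_markers <- function(contrib, n = 5) {
  lapply(as.data.frame(contrib), function(x) {
    ord <- order(x, decreasing = TRUE)[seq_len(n)]
    list(markers = rownames(contrib)[ord], share = sum(x[ord]))
  })
}

res <- top_markers(contrib)
str(res)
```

With real data, the simulated matrix would be replaced by the contribution matrix of the fitted object, and any selection rule could be applied column by column in the same way.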
2014-10-06 12:30 GMT-03:00 Andrea Garavito <neagef at gmail.com>:
> Hello Caitlin,
> I was taking a look at the adegenet forum and I found this previous answer
> about a statistical threshold for marker contributions.
> Originally I was planning to retain, for each of my discriminant
> functions, around 0.3% of the markers with the highest contributions by
> establishing a threshold of 3-sigma. I'm not sure if these data are
> distributed normally, but as I have almost 5000 markers I was assuming so.
> Then I saw your post about the snpzip analysis and decided to give it a try.
> I tested the function with all the methods available, and I think I'll use
> the "median" method, as with the others I'm getting too many markers retained
> (and only one with the "single" method).
> I see that the snpzip function runs the analysis for the first discriminant
> function, but is there a way to make it also for the other discriminant
> functions found with DAPC?
> Thanks for your answer
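A quick way to check whether a 3-sigma cutoff really keeps about 0.3% of the markers is to compare it with a rank-based cutoff on the same values. The sketch below is base R on simulated data. It assumes, purely for illustration, that the underlying loadings are normal and that the contributions are the squared loadings rescaled to sum to one, so the contributions are non-negative and right-skewed rather than normal. None of the numbers come from the data discussed in this thread.

```r
# Simulated contributions for 5000 markers on one discriminant axis.
# Assumption for illustration only: normal loadings, squared and rescaled.
set.seed(7)
loadings <- rnorm(5000)
contrib  <- loadings^2 / sum(loadings^2)

# Cutoff 1: mean + 3 standard deviations (the "3-sigma" rule).
cut_sigma <- mean(contrib) + 3 * sd(contrib)

# Cutoff 2: the empirical 99.7% quantile, i.e. the top 0.3% by rank.
cut_quantile <- quantile(contrib, 0.997)

# Fraction of markers retained under each rule.
kept <- c(sigma    = mean(contrib > cut_sigma),
          quantile = mean(contrib > cut_quantile))
print(kept)
```

Under these assumptions the 3-sigma rule retains noticeably more than 0.3% of the markers, because the distribution has a long right tail. Having many markers does not by itself make the contributions normally distributed, so a histogram of the real values is worth a look before relying on the rule.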
> 2014-08-26 12:58 GMT-03:00 Caitlin Collins <caitiecollins at gmail.com>:
>> Yeah, it's new!
>> I might as well note, in case you decide only to try a subset of the
>> methods available:
>> - Ward's method is most likely to select a very large number of variables
>> to get the most complete picture
>> - Single linkage hierarchical clustering will probably select the fewest
>> - Centroid clustering will probably select a useful middle-ground.
>> You can always check to see what proportion of the variance is contained
>> in the subset of variables retained, or you could even try running a DAPC/
>> PCA with just those variables to compare the discriminatory power of the
>> entire set with that of the subset selected.
>> Good luck.
>> On Tue, Aug 26, 2014 at 4:31 PM, Charlie Waters <cwaters8 at uw.edu> wrote:
>>> Thanks Caitlin! I've never come across the snpzip function so I'll give
>>> those clustering methods a try.
>>> On Tue, Aug 26, 2014 at 3:49 AM, Caitlin Collins <
>>> caitiecollins at gmail.com> wrote:
>>>> Hi Charlie,
>>>> Good question. Technically, there is no one "correct" statistical
>>>> solution to your problem. But, there *are* a number of ways of
>>>> approaching the problem with more statistical rigour than simply using an
>>>> arbitrary threshold as you have done.
>>>> Have you taken a look at the snpzip function in the adegenet package? If
>>>> not, just type "?snpzip" into R with the adegenet package loaded. With this
>>>> function, you can apply one of seven different hierarchical clustering
>>>> formulas to the allelic contributions generated by dapc. Essentially, each
>>>> hierarchical clustering method uses a unique approach to determine where
>>>> the threshold should be drawn. I should note, however, that this
>>>> descriptive approach will not have an associated p-value. You may want to
>>>> try out a few different methods before deciding which variables you want
>>>> to consider "most significant".
>>>> I hope that helps!
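The idea of letting a hierarchical clustering draw the threshold can be sketched in base R: cluster the contributions for one axis, cut the tree into two groups, and keep the group with the larger mean contribution. This only illustrates the approach under that assumption and is not the code snpzip runs internally. The `contrib` vector is simulated and `select_by_clustering` is a hypothetical helper. Recent versions of R name Ward's method "ward.D" in hclust, so that name is used here.

```r
# Simulated stand-in for one column of a DAPC contribution matrix:
# 200 background markers plus 3 markers with clearly larger contributions.
set.seed(1)
contrib <- c(runif(200, 0, 0.001), 0.05, 0.06, 0.07)
contrib <- contrib / sum(contrib)            # rescale so the axis sums to 1
names(contrib) <- paste0("snp", seq_along(contrib))

# Hypothetical helper: cluster the contributions, split the tree in two,
# and return the markers in the group with the larger mean contribution.
select_by_clustering <- function(x, method = "median") {
  tree   <- hclust(dist(x), method = method)
  groups <- cutree(tree, k = 2)
  top    <- which.max(tapply(x, groups, mean))
  names(x)[groups == top]
}

# Compare how many markers each linkage method retains.
methods <- c("complete", "single", "average", "centroid",
             "mcquitty", "median", "ward.D")
n_retained <- sapply(methods,
                     function(m) length(select_by_clustering(contrib, m)))
print(n_retained)

# Share of the axis carried by the markers one method retains.
kept <- select_by_clustering(contrib, "single")
print(sum(contrib[kept]))
```

On real contributions the gap between background and outlying markers is usually less clean than in this toy vector, which is why the methods can disagree about how many markers to keep.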
>>> Charlie Waters
>>> Box 355020
>>> School of Aquatic and Fishery Sciences
>>> University of Washington
>>> Seattle, WA 98105