<div dir="ltr">Hello again Andrea, <br><br>Glad you found what you were looking for! <br><br>Incidentally, and in case anyone else on the forum is looking to visualise the variable contributions to discriminant axes beyond the first, here is some code to do so for a toy example. (The last chunk is the relevant bit for creating loading plots.)<br><br><font color="#38761d"># load adegenet (assumed to be installed already)</font><br><font color="#0000ff">library(adegenet)</font><br><br><font color="#38761d"># make a simulated dataset with 5 "groups"</font><br><div><div><font color="#0000ff">simpop <- glSim(200, 1000, 40, k=5, sort.pop=TRUE)</font></div><div><font color="#0000ff">snps <- as.matrix(simpop)</font></div><div><font color="#0000ff">phen <- simpop@other$ancestral.pops</font></div><div><font color="#0000ff"><br></font></div><div><font color="#38761d"># for fun / as a check, quickly visualise the clusters</font></div><div><font color="#0000ff">dapc1 <- dapc(snps, phen, n.pca=50, n.da=4)</font></div><div><font color="#0000ff">scatter(dapc1)</font></div><div><font color="#0000ff"><br></font></div><div><span style="color:rgb(56,118,29)"># create an object called foo that contains the results of running snpzip on your dataset</span><font color="#0000ff"><br></font></div><div><font color="#0000ff">foo <- snpzip(snps, phen, xval.plot=TRUE, loading.plot=TRUE, method="centroid")</font></div><div><font color="#0000ff"><br></font></div><div><span style="color:rgb(56,118,29)"># isolate the DAPC component of the snpzip results, calling it "dapc1"</span><font color="#0000ff"><br></font></div><div><font color="#0000ff">dapc1 <- foo$DAPC</font></div><div><span style="color:rgb(56,118,29)"># specify that you want to run the following lines for all DA (i.e. 
from DA=1 to DA=(k-1), where k is the number of groups in your dataset)</span><br></div><div><font color="#0000ff">DA <- seq_len(dapc1$n.da)</font></div><div><font color="#0000ff">par(ask=TRUE)</font></div><div><span style="color:rgb(56,118,29)"># generate separate loading plots for each DA</span><font color="#0000ff"><br></font></div><div><font color="#0000ff">for(i in DA){</font></div><div><font color="#0000ff"> title <- paste("Loading Plot for DA", i, sep=" ")</font></div><div><span style="color:rgb(56,118,29)"> # variables selected by snpzip for DA i; the threshold sits just below the smallest of their contributions</span></div><div><font color="#0000ff"> maximus <- foo$FS[[i]][[2]] </font></div><div><font color="#0000ff"> cutoff <- min(dapc1$var.contr[maximus, i]) - 0.000001</font></div><div><font color="#0000ff"> loadingplot(dapc1$var.contr[, i], threshold=cutoff, main=title)</font></div><div><font color="#0000ff">}</font></div></div><div><br><br></div><div>Hope that helps!<br>And thanks for your input: I'll try to implement the above code within snpzip so that loading plots are generated for all DAs automatically in the next release of adegenet. <br><br>Cheers, <br>Caitlin. </div></div><div class="gmail_extra"><br><div class="gmail_quote">On Mon, Oct 6, 2014 at 6:09 PM, Andrea Garavito <span dir="ltr"><<a href="mailto:neagef@gmail.com" target="_blank">neagef@gmail.com</a>></span> wrote:<br><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><div dir="ltr"><div><div><div>Hello again!<br></div><br>I took a closer look at the object created by the snpzip tool, and I found the contributions for all the different axes. 
<br>I didn't notice them before, as I was only looking at the plot obtained.<br><br></div>Thanks anyway!<br></div>Andrea<br><div><div><br></div></div></div><div class="gmail_extra"><br><div class="gmail_quote">2014-10-06 12:30 GMT-03:00 Andrea Garavito <span dir="ltr"><<a href="mailto:neagef@gmail.com" target="_blank">neagef@gmail.com</a>></span>:<div><div class="h5"><br><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><div dir="ltr"><div><div><div><div><div>Hello Caitlin,<br></div>I was taking a look at the adegenet forum and I found this previous answer about a statistical threshold for marker contributions.<br><br></div>Originally I was planning to retain, for each of my discriminant functions, roughly the 0.3% of markers with the highest contributions by setting a 3-sigma threshold. I'm not sure whether these data are normally distributed, but as I have almost 5000 markers I was assuming so. Then I saw your post about the snpzip analysis and decided to give it a try.<br></div><div>I tested the function with all the methods available, and I think I'll use the "median" method, as with the others I'm getting too many markers retained (and only one with the "single" method).<br></div>I see that snpzip runs the analysis for the first discriminant function, but is there a way to run it for the other discriminant functions found with DAPC as well?<br><br></div>Thanks for your answer<br></div>Andrea<br><div><div><br></div></div></div><div class="gmail_extra"><br><div class="gmail_quote">2014-08-26 12:58 GMT-03:00 Caitlin Collins <span dir="ltr"><<a href="mailto:caitiecollins@gmail.com" target="_blank">caitiecollins@gmail.com</a>></span>:<br><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><div><div><div dir="ltr">Yeah, it's new! 
<div><br>I might as well note, in case you decide only to try a subset of the methods available:<div>- Ward's method is most likely to select a very large number of variables to get the most complete picture</div>
<div>- Single linkage hierarchical clustering will probably select the fewest</div><div>- Centroid clustering will probably select a useful middle-ground.</div><div><br></div><div>You can always check to see what proportion of the variance is contained in the subset of variables retained, or you could even try running a DAPC/ PCA with just those variables to compare the discriminatory power of the entire set with that of the subset selected. <br>
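<br>The check described above could be sketched roughly as follows. This is only an illustration on simulated data, not a recommended workflow: the "selected" variables here are simply the 20 largest contributors to DA 1 (an assumption standing in for whatever snpzip retains), and the names keep, prop_contr and success are hypothetical.<br>
<pre>
```r
library(adegenet)

set.seed(1)
# Toy dataset with 3 groups (simulated, for illustration only)
simpop <- glSim(100, 500, 20, k = 3, sort.pop = TRUE)
snps <- as.matrix(simpop)
phen <- simpop@other$ancestral.pops

# DAPC on the full set of variables
full <- dapc(snps, phen, n.pca = 30, n.da = 2)

# Stand-in for a snpzip selection (assumption): the 20 variables
# with the largest contributions to the first discriminant axis
keep <- order(full$var.contr[, 1], decreasing = TRUE)[1:20]

# Share of the total contribution to DA 1 carried by the subset
prop_contr <- sum(full$var.contr[keep, 1])

# DAPC restricted to the retained variables
sub <- dapc(snps[, keep], phen, n.pca = 10, n.da = 2)

# Proportion of individuals reassigned to their original group
success <- c(
  full   = mean(as.character(full$assign) == as.character(full$grp)),
  subset = mean(as.character(sub$assign) == as.character(sub$grp))
)

print(prop_contr)
print(success)
```
</pre>
If the subset reassigns individuals nearly as well as the full set, that suggests the retained variables carry most of the discriminating signal. Reassignment on the training data is optimistic, so cross-validation would be a fairer comparison.<br>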
<br>Good luck. <br><br>Cheers, <br>Caitlin. </div></div></div><div class="gmail_extra"><br><br><div class="gmail_quote">On Tue, Aug 26, 2014 at 4:31 PM, Charlie Waters <span dir="ltr"><<a href="mailto:cwaters8@uw.edu" target="_blank">cwaters8@uw.edu</a>></span> wrote:<br>
<blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><div dir="ltr">Thanks Caitlin! I've never come across the snpzip function so I'll give those clustering methods a try. <div>
<br></div><div>Thanks,</div><div>Charlie</div></div><div class="gmail_extra"><div><div><div><div><br><br><div class="gmail_quote">
On Tue, Aug 26, 2014 at 3:49 AM, Caitlin Collins <span dir="ltr"><<a href="mailto:caitiecollins@gmail.com" target="_blank">caitiecollins@gmail.com</a>></span> wrote:<br><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">
<div dir="ltr">Hi Charlie, <br><br>Good question. Technically, there is no one "correct" statistical solution to your problem. But, there <i>are </i>a number of ways of approaching the problem with more statistical rigour than simply using an arbitrary threshold as you have done. <br>
<br>Have you taken a look at the snpzip function in the adegenet package? If not, just type "?snpzip" into R with the adegenet package loaded. With this function, you can apply one of seven different hierarchical clustering methods to the allelic contributions generated by dapc. Essentially, each hierarchical clustering method uses its own approach to determine where the threshold should be drawn. I should note, however, that this descriptive approach will not have an associated p-value. You may want to try out a few different methods before deciding which variables you want to consider "most significant". <div>
<br></div><div>I hope that helps! <br><br>Best, <span><font color="#888888"><br>Caitlin</font></span></div></div>
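<div>To make the idea of a clustering-derived threshold concrete, here is a minimal sketch in base R. It is not snpzip's actual implementation: the contribution vector is simulated, and the names contr, methods and n_selected are hypothetical. Each linkage method splits the contributions into two clusters, and the cluster holding the largest contribution is treated as "selected".</div>
<pre>
```r
set.seed(42)

# Simulated contributions (made-up data, not DAPC output):
# 480 "background" variables plus 20 with clearly larger values
contr <- c(runif(480, 0, 0.001), runif(20, 0.004, 0.01))
contr <- contr / sum(contr)  # rescale so the contributions sum to 1

# A few hclust linkage methods; "ward.D" is the current hclust
# name for the method older R versions called "ward"
methods <- c("single", "centroid", "median", "ward.D")

n_selected <- sapply(methods, function(m) {
  hc <- hclust(dist(contr), method = m)  # cluster the 1-D contributions
  grp <- cutree(hc, k = 2)               # split into two clusters
  top <- grp[which.max(contr)]           # cluster with the largest value
  sum(grp == top)                        # size of the "selected" cluster
})

print(n_selected)
```
</pre>
<div>The number of variables retained can differ from one linkage method to the next, which is why it is worth trying a few before settling on one.</div>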
</blockquote></div><br><br clear="all"><div><br></div></div></div></div></div><span><font color="#888888"><span><font color="#888888">-- <br><div dir="ltr">Charlie Waters<div>Box 355020<br><div>School of Aquatic and Fishery Sciences</div><div>University of Washington</div>
<div>Seattle, WA 98105</div>
<div><br></div></div></div>
</font></span></font></span></div>
</blockquote></div><br></div>
<br></div></div>_______________________________________________<br>
adegenet-forum mailing list<br>
<a href="mailto:adegenet-forum@lists.r-forge.r-project.org" target="_blank">adegenet-forum@lists.r-forge.r-project.org</a><br>
<a href="https://lists.r-forge.r-project.org/cgi-bin/mailman/listinfo/adegenet-forum" target="_blank">https://lists.r-forge.r-project.org/cgi-bin/mailman/listinfo/adegenet-forum</a><br></blockquote></div><br></div>
</blockquote></div></div></div><br></div>
</blockquote></div><br></div>