<div dir="ltr"><div class="gmail_default">



















<p class="MsoNormal" style="color:rgb(0,0,255);font-size:11pt;line-height:normal;margin:0cm 0cm 8pt;font-family:Calibri,sans-serif">Hello,</p><p class="MsoNormal" style="color:rgb(0,0,255);font-size:11pt;line-height:normal;margin:0cm 0cm 8pt;font-family:Calibri,sans-serif">When I try to find the number of
clusters, I get different results depending on how many PCs I retain, as
expected.</p>

<p class="MsoNormal" style="color:rgb(0,0,255);font-size:11pt;line-height:normal;margin:0cm 0cm 8pt;font-family:Calibri,sans-serif">As background, I have samples
from 180 individuals across 11 sites, and I am trying to find the
best-supported structure.</p>

<p class="MsoNormal" style="color:rgb(0,0,255);font-size:11pt;line-height:normal;margin:0cm 0cm 8pt;font-family:Calibri,sans-serif">The tutorial says that
when you run find.clusters there is no reason to keep a small number of
principal components. When I run cross-validation (xval) with n.pca.max = 60 (that is, n/3),
it fairly consistently indicates that the best number of PCs to retain is 50.</p><div><img src="cid:ii_jhw59j0m0_163bc0a44a93dc26" width="238" height="212"><br></div><br>

<p class="MsoNormal" style="color:rgb(0,0,255);font-size:11pt;line-height:normal;margin:0cm 0cm 8pt;font-family:Calibri,sans-serif"><span style="font-size:11pt">When I run find.clusters using 50
PCs, I get anywhere between 7 and 9 clusters, and the solutions mostly tell the same story for
the data. However, when I run find.clusters with over 100 PCs, I consistently get
k = 4 or 5, and the plot is much cleaner. In addition, my variance-explained
plots do not really reach an asymptote, for either
find.clusters or dapc.</span></p>

<p class="MsoNormal" style="color:rgb(0,0,255);font-size:11pt;line-height:normal;margin:0cm 0cm 8pt;font-family:Calibri,sans-serif">Both of the variance-explained
plots look like this:</p>

<div><img src="cid:ii_jhw5a9ul1_163bc0acc7ec88ae" width="239" height="212"><br></div>

<p class="MsoNormal" style="color:rgb(0,0,255);font-size:11pt;line-height:normal;margin:0cm 0cm 8pt;font-family:Calibri,sans-serif">Using the scaled dataset</p><p class="MsoNormal" style="line-height:normal;margin:0cm 0cm 8pt"><font color="#0000ff" face="Calibri, sans-serif"><span style="font-size:14.6667px">mat &lt;- scaleGen(Stickle8c10NoOdds, NA.method="mean")</span></font><br></p><p class="MsoNormal" style="color:rgb(0,0,255);font-size:11pt;line-height:normal;margin:0cm 0cm 8pt;font-family:Calibri,sans-serif">and retaining 120 PCs, I get:</p>

<div style="color:rgb(0,0,255);font-size:large"><img src="cid:ii_jhw5avww2_163bc0b3d4df2540" width="239" height="212"><br></div><br>

<p class="MsoNormal" style="color:rgb(0,0,255);font-size:11pt;line-height:normal;margin:0cm 0cm 8pt;font-family:Calibri,sans-serif">If I run with 90 PCs, the number of clusters increases to 5:</p><div><img src="cid:ii_jhw5e9je4_163bc0da5cbb7d20" width="411" height="366"><br></div><br><p class="MsoNormal" style="color:rgb(0,0,255);font-size:11pt;line-height:normal;margin:0cm 0cm 8pt;font-family:Calibri,sans-serif">However, if I run find.clusters
and choose 50 PCs, I get:</p>

<div style="color:rgb(0,0,255);font-size:large"><img src="cid:ii_jhw5b8k03_163bc0b7c9cb3023" width="239" height="212"><br></div><br>

<p class="MsoNormal" style="color:rgb(0,0,255);font-size:11pt;margin:0cm 0cm 0.0001pt;line-height:11.25pt;background:white;word-break:break-all;font-family:Calibri,sans-serif"><span style="font-size:10pt;font-family:'Lucida Console';color:blue">&gt; head(NumClust$Kstat, 11)</span></p>

<pre style="font-size:10pt;font-family:'Lucida Console';color:black;background:white;margin:0cm 0cm 8pt">     K=1      K=2      K=3      K=4      K=5      K=6      K=7      K=8      K=9     K=10     K=11
1492.620 1472.790 1455.980 1448.216 1443.735 1442.909 1440.166 1440.344 1440.867 1441.979 1443.101</pre>

<p class="MsoNormal" style="color:rgb(0,0,255);font-size:11pt;line-height:normal;margin:0cm 0cm 8pt;font-family:Calibri,sans-serif">Are the xval results
(i.e., 50 PCs in my case) meant to be used only at the dapc1 &lt;- dapc(mat,
NumClust$grp) stage, and not when running find.clusters? And do my variance-explained plots concern you at all,
given that they don’t reach an asymptote?</p><p class="MsoNormal" style="color:rgb(0,0,255);font-size:11pt;line-height:normal;margin:0cm 0cm 8pt;font-family:Calibri,sans-serif">I've attached a Word document with the same text and plots as in this message, in case they don't show up on your screen.</p><p class="MsoNormal" style="color:rgb(0,0,255);font-size:11pt;line-height:normal;margin:0cm 0cm 8pt;font-family:Calibri,sans-serif">Thank you for your time,</p><p class="MsoNormal" style="color:rgb(0,0,255);font-size:11pt;line-height:normal;margin:0cm 0cm 8pt;font-family:Calibri,sans-serif">Ella</p></div>-- <br><div class="gmail_signature"><div dir="ltr"><div><div dir="ltr"><div><div dir="ltr"><div>Ella Bowles, PhD<br>Postdoctoral Researcher<br>Department of Biology<br>Concordia University<br><br>Website: <a href="https://ellabowlesphd.wordpress.com/" target="_blank">https://ellabowlesphd.wordpress.com/</a><br>Email: <a href="mailto:bowlese@gmail.com" target="_blank">bowlese@gmail.com</a></div></div></div></div></div></div></div>
</div>