From karl.fetter at gmail.com Sat Oct 12 23:16:21 2013
From: karl.fetter at gmail.com (Karl Fetter)
Date: Sat, 12 Oct 2013 17:16:21 -0400
Subject: [adegenet-forum] Selecting K in DAPC
Message-ID:

Hello,

I'm a new user of DAPC and of adegenet in general. I just went through the DAPC vignette using my own data instead of the data provided. Unless I missed something, it appears that DAPC does not itself select the most likely value of K. The choice seems to be left to the user: even after optimizing the number of retained PCs with a-score optimization, the entire analysis depends on the value of K chosen in find.clusters. Am I missing something?

On a related note, I'm using several different methods to select K: structure, structurama, Fst clusters, and my own hypothesis about the number of groups. For DAPC, I chose K=9 because that is where the "elbow" falls in the BIC vs. K plot. All my other clustering methods suggest K=4 or K=5. When I run DAPC with K=9 and make a scatter plot, I see 3 clusters that are widely and obviously separated from each other. Inferring K from this plot makes more sense to me than continuing with the analyses outlined in the DAPC vignette. Would it be appropriate to infer K from this plot, and then run a DAPC with K=3 that is subsequently visualized with compoplot?

I don't think I fully understand the rationale of DAPC. Is it a method for selecting K when you do not have, or prefer not to use, any a priori information about your groups? Or is it meant to be used only when you are willing to supply a priori information?

Thanks for your ideas and help,
Karl Fetter
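
For reference, the workflow described above can be sketched in R roughly as follows. This is only a minimal sketch under stated assumptions, not the vignette's exact code: it uses the simulated `dapcIllus` data shipped with adegenet in place of the data discussed above, and the values chosen for `n.pca`, `max.n.clust`, and the random seed are illustrative. The a-score step is left out.

```r
library(adegenet)

set.seed(1)                 # k-means uses random starts; seed is an arbitrary choice
data(dapcIllus)
x <- dapcIllus$a            # simulated genind object (stand-in for real data)

# Step 1: run k-means for K = 1..20 without the interactive prompts
# and keep the BIC value obtained for each K.
bic.run <- find.clusters(x, n.pca = 40, max.n.clust = 20,
                         choose.n.clust = FALSE)
plot(bic.run$Kstat, type = "b", xlab = "K", ylab = "BIC")  # look for the elbow

# Step 2: K is supplied by the user; here it is fixed at 3 by hand.
grp3 <- find.clusters(x, n.pca = 40, n.clust = 3,
                      choose.n.clust = FALSE)

# Step 3: DAPC on the k-means groups, then the two plots mentioned above.
dapc3 <- dapc(x, grp3$grp, n.pca = 40, n.da = 2)
scatter(dapc3)              # individuals on the discriminant axes
compoplot(dapc3)            # posterior membership probabilities
```

In this sketch the value passed as `n.clust` is an input to the analysis, and the BIC curve from the first step only informs that choice, which is the point the question turns on.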