[adegenet-forum] DaPC vs. BAPS results question

Felipe Hernández fhernandeu at uc.cl
Tue Dec 6 17:59:05 CET 2016


Ok, thanks! So the K with the lowest BIC value from the k-means is not
necessarily the most likely number of clusters? Ultimately, can K=5 be
considered the most probable number of genetic clusters in my dataset, or
should I consider other factors too? I will try your suggestions and see
what I get. Thanks!
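
A minimal sketch of how the BIC values can be compared across K, rather than
reading off the minimum alone. The numbers are copied from the $Kstat output
quoted further down in this thread; the variable names (bic, delta) are only
illustrative.

```r
# BIC values copied from the $Kstat output quoted in this thread (K = 1..40)
bic <- c(1494.756, 1481.467, 1473.864, 1472.002, 1470.633, 1472.970, 1470.754, 1472.011,
         1471.813, 1473.632, 1473.924, 1476.759, 1476.699, 1475.433, 1479.546, 1481.119,
         1481.292, 1485.865, 1488.130, 1488.356, 1493.552, 1494.979, 1501.182, 1499.258,
         1500.146, 1504.113, 1511.598, 1511.550, 1513.889, 1516.275, 1522.144, 1524.733,
         1528.089, 1530.409, 1535.778, 1538.049, 1541.269, 1546.197, 1547.656, 1552.127)
names(bic) <- paste0("K=", seq_along(bic))

# K with the lowest BIC (matches the $stat element, K=5)
best <- which.min(bic)

# Difference between each BIC and the minimum: small values mean
# that K is almost as good as the "best" one by this criterion
delta <- bic - min(bic)

# The ten K values closest to the minimum
round(sort(delta)[1:10], 3)
```

K=5 has the lowest BIC, but K=7 is only about 0.12 higher, so the minimum by
itself does not single out one K.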

Best,

2016-12-06 11:13 GMT-05:00 Thibaut Jombart <thibautjombart at gmail.com>:

> Hello,
>
> the results will be a bit more stable if you increase the number of
> starting points for the k-means (see arg. n.start).
>
> It should not really impact the outcome though: here, any K from 2 to 12
> is an equally good solution, at least as judged by the BIC.
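
The n.start suggestion above could be tried along these lines. This is only a
sketch: microbov is an example dataset shipped with adegenet and stands in for
the hog data, and the values of n.pca, max.n.clust and n.start are arbitrary
assumptions, not recommendations. run_bic is a hypothetical helper.

```r
library(adegenet)

# Example microsatellite dataset shipped with adegenet (stand-in for the real data)
data(microbov)

set.seed(1)

# Hypothetical helper: run find.clusters non-interactively and return the BIC per K
run_bic <- function(n_start) {
  res <- find.clusters(microbov,
                       n.pca = 100,             # assumed; choose to suit your data
                       max.n.clust = 20,        # assumed upper bound for K
                       n.start = n_start,       # number of k-means starting points
                       choose.n.clust = FALSE,  # no interactive prompt
                       criterion = "min")       # pick the K with the lowest BIC
  res$Kstat
}

bic_few  <- run_bic(10)
bic_many <- run_bic(100)

# Side-by-side BIC curves; the larger n.start should vary less between reruns
round(cbind(few = bic_few, many = bic_many), 1)
```

Rerunning each setting a few times without fixing the seed shows how much the
BIC curve moves from run to run.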
>
> Cheers
> Thibaut
>
>
> --
> Dr Thibaut Jombart
> Lecturer, Department of Infectious Disease Epidemiology, Imperial College
> London
> Head of RECON: repidemicsconsortium.org
> sites.google.com/site/thibautjombart/
> github.com/thibautjombart
> Twitter: @TeebzR <http://twitter.com/TeebzR>
>
> On 6 December 2016 at 15:17, Felipe Hernández <fhernandeu at uc.cl> wrote:
>
>> Thanks Thibaut,
>>
>> Here are the image and the values for each estimated K. Any advice is
>> more than welcome, thanks!
>>
>> Best,
>> Felipe
>>
>> > grp
>> $Kstat
>>      K=1      K=2      K=3      K=4      K=5      K=6      K=7      K=8
>> 1494.756 1481.467 1473.864 1472.002 1470.633 1472.970 1470.754 1472.011
>>      K=9     K=10     K=11     K=12     K=13     K=14     K=15     K=16
>> 1471.813 1473.632 1473.924 1476.759 1476.699 1475.433 1479.546 1481.119
>>     K=17     K=18     K=19     K=20     K=21     K=22     K=23     K=24
>> 1481.292 1485.865 1488.130 1488.356 1493.552 1494.979 1501.182 1499.258
>>     K=25     K=26     K=27     K=28     K=29     K=30     K=31     K=32
>> 1500.146 1504.113 1511.598 1511.550 1513.889 1516.275 1522.144 1524.733
>>     K=33     K=34     K=35     K=36     K=37     K=38     K=39     K=40
>> 1528.089 1530.409 1535.778 1538.049 1541.269 1546.197 1547.656 1552.127
>>
>> $stat
>>      K=5
>> 1470.633
>>
>>
>>
>> 2016-12-05 10:10 GMT-05:00 Thibaut Jombart <thibautjombart at gmail.com>:
>>
>>> Dear Felipe,
>>>
>>> this is always a hard question, as different methods essentially do
>>> different things. The K-means in find.clusters optimizes the variance
>>> between groups, while BAPS maximizes a likelihood function under a
>>> given population genetics model. So it may be the case that you have
>>> ~17 demes roughly at HWE, but that only 4-5 groups are optimum in
>>> terms of clearly delineated groups. And this is assuming both methods
>>> are 'right'. They may be prone to all sorts of biases. Namely, largely
>>> different group variances for the K-means, and deviations from the
>>> original model in BAPS.
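
To illustrate what "optimizes the variance between groups" means for K-means,
here is a small base-R sketch on simulated data (not genetic data; the three
groups and their means are made up for illustration). For a fixed K, the total
sum of squares is split into a between-group and a within-group part, so
minimising the within-group part is the same as maximising the between-group
part.

```r
set.seed(42)

# Simulated data: three groups of 50 points in two dimensions (illustrative only)
x <- rbind(matrix(rnorm(100, mean = 0), ncol = 2),
           matrix(rnorm(100, mean = 4), ncol = 2),
           matrix(rnorm(100, mean = 8), ncol = 2))

# K-means with several random starts, keeping the best solution
km <- kmeans(x, centers = 3, nstart = 25)

# Decomposition of the total sum of squares
c(total = km$totss, between = km$betweenss, within = km$tot.withinss)

# The two parts add up to the total (up to floating-point error)
all.equal(km$totss, km$betweenss + km$tot.withinss)
```

Nothing in this criterion refers to Hardy-Weinberg equilibrium or any other
population genetics model, which is one reason its answer can differ from the
one given by BAPS.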
>>>
>>> Feel free to post the image (or a link to it) of the BIC for
>>> find.clusters if you want a 2-cents advice on the number of K to look
>>> at.
>>>
>>> Best
>>> Thibaut
>>>
>>>
>>> On 5 December 2016 at 14:29, Felipe Hernández <fhernandeu at uc.cl> wrote:
>>> > Good morning,
>>> >
>>> > I wonder if you could guide me with this question (which is surely
>>> > pretty basic). After running a DAPC analysis using adegenet, I usually
>>> > get K between 4 and 5 for my dataset (480 hogs, 59 microsats, 39
>>> > sampling sites). The maximum number of clusters tried was 40.
>>> > Afterwards, I tried to estimate the number of clusters (spatial
>>> > clustering by individuals) using another program (BAPS 6.0), but I got
>>> > a considerably higher estimate (K=17), after testing different maximum
>>> > values of K (i.e., K=5 through K=20). Any clue about the reason for
>>> > this? Could it be related to the maximum number of clusters tested, or
>>> > to linkage disequilibrium between some loci? Sorry if the question is
>>> > really basic, but I would appreciate any advice.
>>> >
>>> > Regards,
>>> > Felipe
>>> >
>>> > --
>>> > Felipe Hernández
>>> > Médico Veterinario (DVM), MSc.
>>> > PhD. Candidate
>>> > Interdisciplinary Ecology Program
>>> > School of Natural Resources and Environment
>>> > Wildlife Ecology and Conservation Department
>>> > University of Florida
>>> >
>>> > _______________________________________________
>>> > adegenet-forum mailing list
>>> > adegenet-forum at lists.r-forge.r-project.org
>>> > https://lists.r-forge.r-project.org/cgi-bin/mailman/listinfo/adegenet-forum
>>>
>>
>>
>>
>
>


-- 
Felipe Hernández
Médico Veterinario (DVM), MSc.
PhD. Candidate
Interdisciplinary Ecology Program
School of Natural Resources and Environment
Wildlife Ecology and Conservation Department
University of Florida