[adegenet-forum] Very different number of clusters in different datasets.
Peri Bolton
peri.bolton at students.mq.edu.au
Fri Oct 30 12:40:55 CET 2015
Dear adegenet developers and users,
I have a microsatellite dataset of 50 individuals from 5 sampling
locations, and a SNP dataset of 3839 loci with roughly the same number
of individuals.
I simply want to find out whether there is any population structure in
my species. However, the two datasets give different answers, and some
of the results look strange.
Microsatellite dataset:
Fst, a Mantel test for IBD, and STRUCTURE all find zero evidence of
structure...
find.clusters says k=4 or 5.
I then run optim.a.score and xvalDapc to find the best number of PCs to
retain for a dapc, and I get nice groups in the final answer, with
apparently good assignment power back to the original groups.
However, my alpha scores for that dapc run are as follows:
        1         2         3         4
0.4905714 0.5570149 0.7075510 0.5962500
Further, when I visualise this as a compoplot there is no evidence that
these clusters represent any kind of geographic structure in the data:
the groups are dispersed seemingly at random among my individuals.
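For anyone wanting to follow the steps, the microsatellite workflow described above corresponds roughly to the sketch below. It is not my actual script: it uses the nancycats example data shipped with adegenet as a stand-in, and the object names (msat, clus, dapc0, dapc1) and settings such as n.pca = 50 and n.rep = 10 are placeholders chosen only so that the example runs without interactive prompts.

```r
library(adegenet)

# Stand-in data shipped with adegenet; NOT the dataset discussed here.
data(nancycats)
msat <- nancycats
set.seed(1)

# 1. Infer groups: k-means on 50 retained PCs, BIC computed for K = 1..10.
#    choose.n.clust = FALSE lets find.clusters pick K without prompting.
clus <- find.clusters(msat, n.pca = 50, max.n.clust = 10,
                      choose.n.clust = FALSE)
grp <- clus$grp

# 2. DAPC retaining many PCs, used only as input for a-score optimisation.
dapc0 <- dapc(msat, pop = grp, n.pca = 50, n.da = nlevels(grp) - 1)
best_a <- optim.a.score(dapc0, n.sim = 10, plot = FALSE)$best

# 3. Cross-validation on an allele table with NAs replaced by the mean.
mat <- tab(msat, NA.method = "mean")
xval <- xvalDapc(mat, grp, n.pca.max = 50, result = "groupMean",
                 n.rep = 10, xval.plot = FALSE)
xval$`Number of PCs Achieving Lowest MSE`

# 4. Final DAPC with the a-score optimum, per-group a-scores, and the
#    membership-probability plot.
dapc1 <- dapc(msat, pop = grp, n.pca = best_a, n.da = nlevels(grp) - 1)
a.score(dapc1, n.sim = 10)$pop.score
compoplot(dapc1)
```

The two criteria in steps 2 and 3 need not agree on the number of PCs; the sketch uses the a-score optimum for the final run purely for illustration.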
I have read in other forum topics that if there is enough space in the
data, find.clusters will find an optimal clustering solution whether or
not it is biologically realistic. I have also read that find.clusters
shouldn't return k=1 as the optimal solution, because a single cluster
is meant to be a nonsensical clustering solution. Indeed this makes
sense, because when I use sampling locality as a prior in dapc it all
comes out as one big cluster.
HOWEVER, when I run my SNP dataset things get really strange.
I ran essentially all the same procedures and I've come up against a
number of hurdles:
1. I can't get the xvalDapc to work on a genlight object. I keep getting
an error:
    Error in as.data.frame.default(x[[i]], optional = TRUE) :
      cannot coerce class "structure("SNPbin", package = "adegenet")" to a
      data.frame
    In addition: Warning message:
    In min(dim(x)) : no non-missing arguments to min; returning Inf
Presumably this is because genlight objects do not store the genetic
data in the same way as genind objects do. Is there a workaround for
using this function with genlight data?
So far I have got xvalDapc to work on my genind objects, although I do
get a bunch of warning messages such as
    49: In if (result == "overall") { ... :
      the condition has length > 1 and only the first element will be used
but it does at least seem to produce output.
2. When I run find.clusters, my cumulative variance plot is nearly
linear, as is my BIC vs. K plot, with the optimal solution being the
supposedly nonsensical k=1 (see the attached pdf of the output). Is
there something weird with my data? Or is that the genuine signal
coming through? When I use other clustering methods such as
fastSTRUCTURE and MDS I don't get any indication of structure either.
HOWEVER, I don't know how to reconcile the two clustering solutions from
the two nuclear data sources.
3. When I run an a.score analysis the curve is basically a flat line,
and although it finds an "optimal" number of PCs to retain, that optimum
doesn't seem very reliable to me (see also attached).
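To make hurdles 1 and 2 concrete, here is a minimal sketch on simulated data (not my real SNPs). It assumes that xvalDapc accepts a plain numeric matrix, so the genotypes are pulled out of the genlight object with as.matrix() and missing values are replaced by per-locus means before cross-validation; whether this is the recommended route for genlight data is exactly what I am unsure about. Passing a single value for result may also avoid the "condition has length > 1" warning, but that is a guess on my part. The names counts, snp, grp, mat and clus, the group labels, and all settings are hypothetical.

```r
library(adegenet)
set.seed(1)

# Simulated stand-in: 50 individuals x 200 SNPs coded as counts of the
# second allele (0/1/2), with 100 genotypes set to missing.
counts <- matrix(rbinom(50 * 200, size = 2, prob = 0.3), nrow = 50)
counts[sample(length(counts), 100)] <- NA
snp <- new("genlight", gen = counts, ploidy = 2, parallel = FALSE)

# Hypothetical prior groups: 5 sampling sites of 10 individuals each.
grp <- factor(rep(paste0("site", 1:5), each = 10))

# Hurdle 1: convert to an individuals x SNPs matrix, then replace each NA
# by the mean allele count of its locus (a simple imputation choice).
mat <- as.matrix(snp)
miss <- is.na(mat)
mat[miss] <- colMeans(mat, na.rm = TRUE)[col(mat)[miss]]

xval <- xvalDapc(mat, grp, n.pca.max = 20, result = "groupMean",
                 n.rep = 10, xval.plot = FALSE)
xval$`Number of PCs Achieving Lowest MSE`

# Hurdle 2: run find.clusters without prompts and read the BIC values
# directly instead of judging the plot by eye.
clus <- find.clusters(snp, n.pca = 40, max.n.clust = 10,
                      choose.n.clust = FALSE)
clus$Kstat             # BIC for K = 1..10
which.min(clus$Kstat)  # K with the lowest BIC
```

Because the simulated genotypes are drawn independently with no groups, the output can be compared against what a real dataset without structure might look like.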
I am aware that there are a few problems here, but hopefully the
itemisation and the context of my questions will help any kind-hearted
people out there who are willing to assist.
Sincerely,
Peri
--
*Peri Bolton*
PhD Candidate, Griffith Lab <http://bio.mq.edu.au/avianbehaviouralecology/>
Department of Biological Sciences
Macquarie University, NSW 2109, Australia
-------------- next part --------------
A non-text attachment was scrubbed...
Name: findclusters.pdf
Type: application/pdf
Size: 7623 bytes
Desc: not available
URL: <http://lists.r-forge.r-project.org/pipermail/adegenet-forum/attachments/20151030/0243b459/attachment-0001.pdf>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: ascoreoptimisation.jpeg
Type: image/jpeg
Size: 65733 bytes
Desc: not available
URL: <http://lists.r-forge.r-project.org/pipermail/adegenet-forum/attachments/20151030/0243b459/attachment-0001.jpeg>