[adegenet-forum] Detecting Genetically Unique Individuals in a Well Mixed Population
Valeria Montano
mirainoshojo at gmail.com
Tue May 7 10:18:40 CEST 2013
Hi Nate,
I don't know how much I can help you, but if you want you can send me the
plot off-list.
All the best
Valeria
On 2 May 2013 22:40, Nathan Truelove <nathan.truelove at manchester.ac.uk> wrote:
> Dear Valeria,
>
> Thank you for your advice and encouragement. I definitely need it when
> working with flat populations! The overall aim of my research is to see if
> ocean currents are shaping spatial patterns of genetic variation in spiny
> lobster species. For example, we would like to investigate if individuals
> are more genetically different to their neighbors in advective
> oceanographic regions and more genetically similar to their neighbors in
> retentive regions. I was also interested in trying to figure out if any
> individuals that are genetically different from their neighbors happen to
> be immigrants from a distant population. After reading Thibaut's response
> to my first email, it looks like I was heading down the wrong path by trying
> to force my data into 2 dimensions and basing my identification of
> 'outliers' on distance from the centroid. Some new techniques are
> definitely needed to search for the presence of any genetically unique
> individuals within my data.
>
> I really like your ideas of creating a tree using nj in ape and also
> using dist.gene in ape to calculate the mean number of pairwise distances
> of every individual from all others. I took your advice and ran your
> analyses. The tree in ape had 3 nodes. A long branch in the second node
> contained 22 individuals and stood out from all other branches. I then
> calculated the mean of the distances of every individual from all others
> and it came out to be 41. I sorted out all individuals that were
> 'arbitrarily' 2X above this threshold and almost all of these individuals
> also belonged to the branch that stood out in the second node of the tree.
>
> It would be great to get your opinion on these results. Perhaps it would
> be best if I sent you an image of the tree since it's a little tricky to
> describe it properly. Just let me know what you prefer. Your advice has
> been really helpful.
>
> Best Wishes,
>
> Nate
>
>
> On Apr 30, 2013, at 6:16 PM, Valeria Montano wrote:
>
> Hi Nate,
>
> I think Thibaut's answer is already more than appropriate and actually
> points out the main question among your questions. As I understand, your
> population is not really easy to deal with since you have this high genetic
> homogeneity, which does not leave much room for imagination (a bit
> frustrating, I believe). Focusing on outliers can be an option, but it
> really depends on your scientific aim. If I were you, I would try a
> statistic estimating individual genetic distances (for instance the mean
> number of pairwise distances using dist.gene in the ape package), calculate
> the mean of the distances of every individual from all the others, and then
> put a threshold to define 'outliers'. Does that make sense? A wee bit arbitrary
> maybe...moreover, in this case you would have 'outliers' compared to the
> general population, and I am not sure it would help...
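The distance-and-threshold idea above can be sketched in a few lines. This is a plain-Python illustration, not a call into ape: it assumes the distance between two individuals is the number of loci at which their genotypes differ (my understanding of the "pairwise" method of dist.gene, so treat that as an assumption). The function names, the toy genotypes, and the cutoff factor are all hypothetical.

```python
from statistics import mean


def pairwise_differences(a, b):
    """Count the loci at which two multilocus genotypes differ."""
    return sum(1 for locus_a, locus_b in zip(a, b) if locus_a != locus_b)


def mean_distance_to_others(genotypes):
    """Map each individual to its mean distance from all other individuals."""
    result = {}
    for name, genotype in genotypes.items():
        distances = [
            pairwise_differences(genotype, other)
            for other_name, other in genotypes.items()
            if other_name != name
        ]
        result[name] = mean(distances)
    return result


def flag_outliers(genotypes, factor=2.0):
    """Return individuals whose mean distance exceeds factor x the overall mean.

    The factor is an arbitrary choice, as noted in the thread.
    """
    means = mean_distance_to_others(genotypes)
    cutoff = factor * mean(list(means.values()))
    return sorted(name for name, d in means.items() if d > cutoff)


# Hypothetical toy data: 4 loci, each genotype written as a sorted allele pair.
toy = {
    "ind1": [(1, 2), (3, 3), (5, 6), (7, 7)],
    "ind2": [(1, 2), (3, 3), (5, 6), (7, 8)],
    "ind3": [(1, 2), (3, 4), (5, 6), (7, 7)],
    "ind4": [(1, 2), (3, 3), (5, 6), (7, 7)],
    "ind5": [(2, 2), (4, 4), (6, 6), (8, 8)],
}

print(mean_distance_to_others(toy))
print(flag_outliers(toy, factor=1.5))
```

In a sample this small the outlier itself inflates the overall mean, which is why the toy call uses a factor of 1.5 rather than 2. The same effect applies, more weakly, to any cutoff defined relative to the sample mean.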
>
> On the other hand, to understand whether outliers are immigrants from
> distant pops, you could build a network or use any phylogenetic
> reconstruction and see if outliers appear to be long but derived branches
> within their geographic neighbours or if they are more basal. This is the
> only tool that comes to my mind.
>
> Anyway good luck with it, flat populations are upsetting.
>
> While we're at it, happy Labor Day everybody! (or happy transition from
> Spring to Summer - just in case you follow the Celtic tradition)
>
> Valeria
>
> On 30 April 2013 12:14, Jombart, Thibaut <t.jombart at imperial.ac.uk> wrote:
>
>> Dear Nate,
>>
>> the problem here is that it is not clear what is meant by 'outliers'. If
>> we're talking about a few migrants from another population, then they
>> should fall in a small cluster of their own (e.g. using find.clusters). If
>> the definition is spatial, then 'outliers' may be individuals that are
>> genetically distinct from their neighbours (without having to be migrants
>> from another population). Or, 'outliers' can be individuals with
>> rare/original alleles (without having to be any of the above). Or
>> 'outliers' can be whatever does not fall within the inertia ellipse, and in
>> this case you will always have 'outliers' with the default parameters of
>> s.class.
>>
>> All of these definitions of 'outliers' would require different techniques
>> to pin them down. I would really avoid anything based on the distance from
>> the centroid. This implies that the cloud of points of the population is
>> well represented in only 2D and more importantly is spherical, which is
>> very unlikely. Detection based on inertia ellipses (not intertia - inertia
>> is the squared length of a vector, which in PCA is the variance of the
>> corresponding scores) is bound to fail too. There the assumption is that the
>> cloud of points of the population is bivariate normal, which again is
>> unlikely. But if it is the case, the default inertia ellipse in s.class
>> contains 2/3 of the points. It would be far-fetched to call the remaining
>> third 'outliers'. One can change this parameter, but again, that means
>> arbitrarily deciding on a fixed number of outliers.
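The "2/3 of the points" figure can be checked with a short calculation. For a bivariate normal cloud, the squared Mahalanobis distance from the centroid follows a chi-square distribution with 2 degrees of freedom, so an ellipse of Mahalanobis radius c contains a fraction 1 - exp(-c^2 / 2) of the points. The sketch below assumes that the ellipse size coefficient of s.class acts as that radius and that its default is 1.5; both are my reading rather than something stated in the thread, and the function names are hypothetical.

```python
import math


def ellipse_coverage(c):
    """Expected fraction of a bivariate normal cloud inside radius c."""
    return 1.0 - math.exp(-c * c / 2.0)


def radius_for_coverage(p):
    """Mahalanobis radius of the ellipse expected to contain a fraction p."""
    return math.sqrt(-2.0 * math.log(1.0 - p))


# Assumed default size coefficient of 1.5: roughly two thirds of the points.
print(round(ellipse_coverage(1.5), 3))  # about 0.675

# Radius needed for an ellipse expected to hold 95% of the points.
print(round(radius_for_coverage(0.95), 3))
```

Either way, the number of points left outside is fixed by the chosen coefficient, not by the data, which is the arbitrariness described above.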
>>
>> But again, the problem here as I understand it is not technical (for now)
>> - what is meant by 'outliers' needs to be clarified first.
>>
>> All the best
>>
>> Thibaut
>>
>> ________________________________________
>> From: adegenet-forum-bounces at lists.r-forge.r-project.org [
>> adegenet-forum-bounces at lists.r-forge.r-project.org] on behalf of Nathan
>> Truelove [nathan.truelove at manchester.ac.uk]
>> Sent: 23 April 2013 13:46
>> To: adegenet-forum at lists.r-forge.r-project.org
>> Subject: [adegenet-forum] Detecting Genetically Unique Individuals in a
>> Well Mixed Population
>>
>> Dear Thibaut and Adegenet Users,
>>
>> I would like to begin by thanking Thibaut and everyone else who created
>> Adegenet, it has to be the most useful data analysis tool that I have used
>> for my PhD research.
>>
>> I am a PhD student working on the population genetics of Caribbean spiny
>> lobster using 16 microsatellite markers. The species has a huge potential
>> for migration since it can spend up to a year floating/swimming in ocean
>> currents before settling in shallow coastal habitat. Adults can also
>> migrate 10s to 100s of km. It's no big surprise that I am finding very
>> little differentiation in PCA, PCoA, and DAPC analyses. The trend that
>> comes out in all these analyses is that ~80% of individuals from all
>> sampling sites fall within the interia ellipse (s.class) or the contour
>> polygon (s.chull). Several of the individuals outside the interia ellipse
>> (or polygons) are located quite far away from the "core" of individuals
>> within the ellipse. These outlier individuals are not associated with any
>> particular site, however on the spatial level, there appear to be more
>> outliers in southern sites than in northern sites. I've been trying a
>> variety of techniques to try and figure out the ecological
>> importance of these outlier individuals. For example, a recent paper by
>> Elphie et al. entitled "Detecting immigrants in a highly genetically
>> homogeneous spiny lobster population (Palinurus elephas) in the northwest
>> Mediterranean Sea" explores a similar issue in a different species of
>> lobster. In this paper the authors use non-metric multidimensional scaling
>> to separate out the genetic distances of their individuals in multivariate
>> space. They then classified all individuals within a 50% radius of the
>> barycentre as the "reference population" and all individuals outside the
>> 50% radius as an "assignment population". They then used Geneclass2 to run
>> assignment tests and any individuals that had a p-value < 0.05 are
>> considered "genetically different". The authors argue that the most likely
>> explanation for the genetic differences is that the genetically unique
>> individuals detected in Geneclass are migrants from populations that have
>> genetically diverged. I imagine there are several
>> other ecological or selective processes that could also lead to
>> genetically unique individuals, so calling them migrants is up for debate.
>>
>> For my data I ran a similar analysis in Adegenet using the functions
>> s.class and s.chull along with dudi.pca to select the reference and
>> assignment populations for Geneclass2. I compared these results to a similar
>> analysis using non-metric multidimensional scaling in the Vegan package.
>> The Adegenet PCA analyses contained about twice as many individuals in the
>> reference population as the nMDS technique, yet the overall trend of
>> Geneclass finding more unique individuals in the south than the north was
>> consistent among all techniques. Also, most of the distant outliers in PCA
>> analysis in Adegenet were also significantly different in the Geneclass
>> analysis.
>>
>> It would be excellent to get your opinions on this technique and discuss
>> potential options for improving it:
>>
>> 1) Would it be possible to get additional information using Adegenet on
>> how different the outliers in PCA are from the "core" of individuals inside
>> the inertia ellipse? It would be nice to run the entire analysis in
>> Adegenet and not have to use Geneclass2 at all.
>>
>> 2) Is there a simple way to identify each individual within an inertia
>> ellipse? I have been using the function identify to select the individuals
>> that are located within the ellipse, yet it is rather clunky since you have
>> to click on every point.
>>
>> 3) Any additional advice concerning how to detect genetic outliers in
>> homogeneous populations using Adegenet would be greatly appreciated.
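For question 2, clicking on points can in principle be replaced by a direct membership test on the PCA scores. The sketch below is plain Python, not ade4 or adegenet code. It assumes the ellipse is defined by the group mean and covariance matrix (computed with an n denominator) and that the size coefficient c is a Mahalanobis radius, with 1.5 assumed as the default. The function name and the toy scores are hypothetical.

```python
def outside_ellipse(scores, c=1.5):
    """Return the names of individuals whose 2D scores fall outside the ellipse.

    scores maps an individual's name to its (x, y) coordinates on two axes.
    """
    points = list(scores.values())
    n = len(points)
    mx = sum(x for x, _ in points) / n
    my = sum(y for _, y in points) / n
    # Covariance matrix entries (n denominator, an assumption).
    sxx = sum((x - mx) ** 2 for x, _ in points) / n
    syy = sum((y - my) ** 2 for _, y in points) / n
    sxy = sum((x - mx) * (y - my) for x, y in points) / n
    det = sxx * syy - sxy ** 2
    if det <= 0:
        raise ValueError("degenerate cloud: covariance matrix is singular")
    outside = []
    for name, (x, y) in scores.items():
        dx, dy = x - mx, y - my
        # Squared Mahalanobis distance via the inverse of the 2x2 covariance.
        d2 = (syy * dx * dx - 2.0 * sxy * dx * dy + sxx * dy * dy) / det
        if d2 > c * c:
            outside.append(name)
    return sorted(outside)


# Hypothetical scores on the first two principal axes.
toy_scores = {
    "a": (1.0, 1.0),
    "b": (1.0, -1.0),
    "c": (-1.0, 1.0),
    "d": (-1.0, -1.0),
    "e": (0.0, 0.0),
    "f": (8.0, 0.0),
}

print(outside_ellipse(toy_scores))
```

This only automates the selection step; it says nothing about whether the individuals outside the ellipse are meaningful outliers.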
>>
>> Thank you very much for your time.
>>
>> Best Wishes,
>>
>> Nate
>>