[adegenet-forum] Request an example of genetic distance among two individuals
Jombart, Thibaut
t.jombart at imperial.ac.uk
Sun Nov 17 16:45:51 CET 2013
Hi there,
I'm not sure which tree you are referring to.
Cheers
Thibaut
________________________________________
From: Fernando Cruz [fernando.cruz at ebd.csic.es]
Sent: 17 November 2013 15:41
To: Jombart, Thibaut; adegenet-forum at lists.r-forge.r-project.org
Subject: Re: [adegenet-forum] Request an example of genetic distance among two individuals
Thanks Thibaut,
This clarifies. In both the euclidean and the Hamming distances, the
distance between a pair of individuals depends on the number of
"unshared alleles".
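
As a minimal sketch of that point (an illustration only, not adegenet's internal code), the two 5-SNP profiles from the original question can be compared directly in base R. The variable names hamming, euclid and euclid_rel are made up here, and each SNP is assumed to be coded as a 0/1/2 count of one allele in a diploid individual:

```r
# Two individuals from the original question, coded as counts (0, 1 or 2)
# of one allele at each of 5 diploid SNPs.
x <- c(0, 0, 1, 2, 2)
y <- c(0, 2, 2, 1, 0)

# Allele-count differences summed over loci: the number of unshared
# alleles (6 out of 10, i.e. the D = 6 mentioned in the thread).
hamming <- sum(abs(x - y))

# Usual Euclidean distance on the raw allele counts.
euclid <- sqrt(sum((x - y)^2))

# Same distance after rescaling counts to relative frequencies
# (dividing by the ploidy, assumed to be 2 here).
euclid_rel <- sqrt(sum((x / 2 - y / 2)^2))

print(c(hamming = hamming, euclid = euclid, euclid_rel = euclid_rel))
```

Both measures are built from the same per-locus differences x - y; the Hamming distance adds their absolute values, while the Euclidean distance adds their squares and takes the square root.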
By the way, is it then the standardized distance that is plotted in the NJ
tree, rather than the Saitou & Nei (1987) one used by the APE library?
Cheers,
Fernando
On 11/17/13 4:23 PM, Jombart, Thibaut wrote:
> Just realized a typo:
>
> sqrt(\sum_i (x_i - y_i)^2
>
> should read
>
> sqrt{ \sum_i (x_i - y_i)^2 }
>
> Cheers
> Thibaut
> ________________________________________
> From: adegenet-forum-bounces at lists.r-forge.r-project.org [adegenet-forum-bounces at lists.r-forge.r-project.org] on behalf of Jombart, Thibaut [t.jombart at imperial.ac.uk]
> Sent: 17 November 2013 15:07
> To: Fernando Cruz; adegenet-forum at lists.r-forge.r-project.org
> Subject: Re: [adegenet-forum] Request an example of genetic distance among two individuals
>
> Hello there,
>
> there are many different distances that can be computed between allelic profiles, but at the individual level there are somewhat fewer options.
>
> One is the Hamming distance, which you mention here (D=6), and which you can deduce from 'propShared'.
>
> The usual Euclidean distance is different, though. Between two vectors of allelic profiles x=[x_i] and y=[y_i], the Euclidean distance is given by (using LaTeX notation):
>
> D(x,y) = || x - y || = sqrt{ (x-y)^T (x-y)} = sqrt(\sum_i (x_i - y_i)^2
>
> Using your example:
>> x <- c(0,0,1,2,2)
>> y <- c(0,2,2,1,0)
>> sqrt(sum((x-y)^2))
> [1] 3.162278
>> dist(rbind.data.frame(x,y))
> 1
> 2 3.162278
>
>
> Note that in adegenet, data in genind objects are standardized to relative frequencies, so that the distance would be different:
>> x.rel <- x/2
>> y.rel <- y/2
>> dist(rbind.data.frame(x.rel,y.rel))
> 1
> 2 1.581139
>
> That is, the distance between the raw allele count profiles divided by the ploidy.
>
> As a last note, there is a particular case for haploid data, where the Hamming distance equals the squared Euclidean distance (it follows that a PCA on the covariance matrix is also the best reduced-space representation of Hamming distances).
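
A small sketch of the haploid case described above, assuming each locus is coded as 0/1 for absence or presence of an allele. The profiles a and b are hypothetical values made up for illustration, not data from the thread:

```r
# Hypothetical haploid profiles: presence (1) or absence (0) of an allele
# at each of 6 loci. These values are invented for illustration.
a <- c(0, 1, 1, 0, 1, 0)
b <- c(1, 1, 0, 0, 0, 0)

# Hamming distance: number of loci at which the two profiles differ.
hamming_hap <- sum(a != b)

# Squared Euclidean distance between the same profiles.
sq_euclid_hap <- sum((a - b)^2)

print(c(hamming = hamming_hap, squared_euclidean = sq_euclid_hap))
```

With 0/1 coding every per-locus difference is 0, 1 or -1, so its square equals its absolute value and the two sums coincide.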
>
> Cheers
>
> Thibaut
>
>
> --
> ######################################
> Dr Thibaut JOMBART
> MRC Centre for Outbreak Analysis and Modelling
> Department of Infectious Disease Epidemiology
> Imperial College - School of Public Health
> St Mary’s Campus
> Norfolk Place
> London W2 1PG
> United Kingdom
> Tel. : 0044 (0)20 7594 3658
> t.jombart at imperial.ac.uk
> http://sites.google.com/site/thibautjombart/
> http://adegenet.r-forge.r-project.org/
> ________________________________________
> From: adegenet-forum-bounces at lists.r-forge.r-project.org [adegenet-forum-bounces at lists.r-forge.r-project.org] on behalf of Fernando Cruz [fernando.cruz at ebd.csic.es]
> Sent: 15 November 2013 18:53
> To: adegenet-forum at lists.r-forge.r-project.org
> Subject: [adegenet-forum] Request an example of genetic distance among two individuals
>
> Hi Thibaut,
>
> I performed a NJ Tree using 1M SNPs with 10 samples, following the
> instructions in the documentation. However, I would like to know exactly
> how the genetic distance among individuals is calculated. Is it based on the
> number of shared alleles?
>
> Could you provide a simple example? For instance, for these two individuals
> using 5 SNPs:
> Ind1 00122
> Ind2 02210
>
> Using the binary information, they share 2+0+1+1+0= 4 alleles out of 10
>
> Thanks in advance,
> Fernando Cruz
>
>
> --
> ****************************************
> Dr. Fernando Cruz
> Estación Biológica de Doñana (EBD-CSIC)
> Avd. Americo Vespucio s/n
> 41092-Seville (Spain)
> Tel. +34 954466700/Ext. 1079
> Fax: +34 95 4621125
> Room: 0/12
>
> e-mail:fernando.cruz at ebd.csic.es
> Website:http://openwetware.org/wiki/User:Fernando_Cruz
> Web EcoGenes EU-FP7:http://www.ebd.csic.es/ecogenes/news.html
> ****************************************
>
> _______________________________________________
> adegenet-forum mailing list
> adegenet-forum at lists.r-forge.r-project.org
> https://lists.r-forge.r-project.org/cgi-bin/mailman/listinfo/adegenet-forum