[adegenet-forum] Request an example of genetic distance among two individuals

Fernando Cruz fernando.cruz at ebd.csic.es
Sun Nov 17 17:03:21 CET 2013


Hi Thibaut,

I meant the NJ tree from the ape package. What I basically did was:

library(adegenet)
library(ape)

mygenlight <- read.snp("/Users/Nando/Documents/mydata.snp", chunk=2)

# split the genlight object into 100 blocks of ~10000 SNPs each
x <- seploc(mygenlight, n.block=100)

# dist is used within an lapply loop to compute pairwise distances
# between individuals for each block
lD <- lapply(x, function(e) dist(as.matrix(e)))
class(lD[[1]])

# The general distance matrix is obtained by summing these:
D <- Reduce("+", lD)
plot(nj(D), type="fan")
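
One caveat on the summing step, as a minimal sketch with made-up data: adding per-block Euclidean distances is not the same as the Euclidean distance over all SNPs at once, because the square root is taken inside each block. Summing the squared block distances and taking the square root afterwards recovers the full-data distance. The toy matrix `geno` and the two-block split are assumptions for illustration only; they roughly mimic what seploc() does and are not the real data.

```r
# Toy genotype matrix (hypothetical): 3 individuals x 6 SNPs, allele counts 0/1/2
geno <- matrix(c(0, 0, 1, 2, 2, 1,
                 0, 2, 2, 1, 0, 1,
                 1, 1, 0, 2, 1, 0),
               nrow = 3, byrow = TRUE,
               dimnames = list(c("ind1", "ind2", "ind3"), NULL))

# Split the SNP columns into two contiguous blocks
blocks <- list(geno[, 1:3], geno[, 4:6])

# (a) Sum of per-block Euclidean distances, as in Reduce("+", lD)
D_sum <- Reduce("+", lapply(blocks, dist))

# (b) Sum of squared per-block distances, then square root
D_euc <- sqrt(Reduce("+", lapply(blocks, function(b) dist(b)^2)))

# (c) Euclidean distance computed on all SNPs at once
D_all <- dist(geno)

print(D_sum)
print(D_euc)
print(D_all)
```

Here (b) and (c) agree, while (a) is never smaller than (c). Either matrix can be passed to nj(); whether the difference changes the tree depends on the data.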

Cheers,
Fernando

On 11/17/13 4:45 PM, Jombart, Thibaut wrote:
> Hi there,
>
> I'm not sure which tree you are referring to.
>
> Cheers
> Thibaut
> ________________________________________
> From: Fernando Cruz [fernando.cruz at ebd.csic.es]
> Sent: 17 November 2013 15:41
> To: Jombart, Thibaut; adegenet-forum at lists.r-forge.r-project.org
> Subject: Re: [adegenet-forum] Request an example of genetic distance among two  individuals
>
> Thanks Thibaut,
>
> This clarifies things. In both the Euclidean and the Hamming distances, the
> distance between a pair of individuals depends on the number of
> "unshared alleles".
> By the way, this means the standardized distance is what gets plotted in the
> NJ tree, rather than the Saitou & Nei (1987) one used by the ape library, right?
>
> Cheers,
> Fernando
>
> On 11/17/13 4:23 PM, Jombart, Thibaut wrote:
>> Just realized a typo:
>>
>>    sqrt(\sum_i (x_i - y_i)^2
>>
>> should read
>>
>>    sqrt{ \sum_i (x_i - y_i)^2 }
>>
>> Cheers
>> Thibaut
>> ________________________________________
>> From:adegenet-forum-bounces at lists.r-forge.r-project.org  [adegenet-forum-bounces at lists.r-forge.r-project.org] on behalf of Jombart, Thibaut [t.jombart at imperial.ac.uk]
>> Sent: 17 November 2013 15:07
>> To: Fernando Cruz;adegenet-forum at lists.r-forge.r-project.org
>> Subject: Re: [adegenet-forum] Request an example of genetic distance among two  individuals
>>
>> Hello there,
>>
>> there are many different distances that can be computed between allelic profiles, but at the individual level there are somewhat fewer options.
>>
>> One is the Hamming distance, which you mention here (D=6), and which you can deduce from 'propShared'.
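
As a small sketch of that deduction in base R (not the adegenet implementation): for biallelic SNPs coded as allele counts in a diploid, two individuals share 2 - |x_i - y_i| alleles at locus i, so the number of unshared alleles is the sum of |x_i - y_i|. The variable names and `ploidy <- 2` are assumptions based on the example in the original question.

```r
# Genotypes from the original question, coded as allele counts (0, 1 or 2)
x <- c(0, 0, 1, 2, 2)   # Ind1
y <- c(0, 2, 2, 1, 0)   # Ind2
ploidy <- 2             # assumed: diploid, biallelic SNPs

shared  <- sum(ploidy - abs(x - y))   # alleles shared, summed over loci
total   <- ploidy * length(x)         # alleles compared per individual
hamming <- total - shared             # unshared alleles
prop    <- shared / total             # proportion of shared alleles

print(c(shared = shared, total = total, hamming = hamming, prop = prop))
```

With these two profiles, 4 of the 10 alleles are shared, which gives the D=6 mentioned above.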
>>
>> The usual Euclidean distance is different though. Between two vectors of allelic profiles x=[x_i] and y=[y_i], the Euclidean distance is given by (in LaTeX notation):
>>
>> D(x,y) = || x - y || = sqrt{ (x-y)^T (x-y)} = sqrt(\sum_i (x_i - y_i)^2
>>
>> Using your example:
>>> x <- c(0,0,1,2,2)
>>> y <- c(0,2,2,1,0)
>>> sqrt(sum((x-y)^2))
>> [1] 3.162278
>>> dist(rbind.data.frame(x,y))
>>            1
>> 2 3.162278
>>
>>
>> Note that in adegenet, data in genind objects are standardized to relative frequencies, so that the distance would be different:
>>> x.rel <- x/2
>>> y.rel <- y/2
>>> dist(rbind.data.frame(x.rel,y.rel))
>>            1
>> 2 1.581139
>>
>> That is, the distance between the raw allele count profiles divided by the ploidy.
>>
>> As a last note, there is a particular case for haploid data, where the Hamming distance equals the squared Euclidean distance (it follows that a PCA on the covariance matrix is also the best reduced-space representation of Hamming distances).
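
A quick check of the haploid case, as a sketch with invented 0/1 profiles: each locus contributes either 0 or 1 to |x_i - y_i|, and squaring leaves 0 and 1 unchanged, so the count of mismatching loci equals the sum of squared differences. The vectors below are hypothetical and only serve to illustrate the identity.

```r
# Hypothetical haploid profiles: one allele per locus, coded 0/1
x <- c(0, 1, 1, 0, 1, 0, 0, 1)
y <- c(1, 1, 0, 0, 1, 1, 0, 0)

hamming   <- sum(x != y)                       # number of mismatching loci
sq_euclid <- sum((x - y)^2)                    # squared Euclidean distance
via_dist  <- as.numeric(dist(rbind(x, y)))^2   # same quantity, through dist()

print(c(hamming = hamming, sq_euclid = sq_euclid, via_dist = via_dist))
```

The identity breaks down once genotypes can take the value 2, since a difference of 2 counts twice in the Hamming distance but four times in the squared Euclidean distance.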
>>
>> Cheers
>>
>> Thibaut
>>
>>
>> --
>> ######################################
>> Dr Thibaut JOMBART
>> MRC Centre for Outbreak Analysis and Modelling
>> Department of Infectious Disease Epidemiology
>> Imperial College - School of Public Health
>> St Mary’s Campus
>> Norfolk Place
>> London W2 1PG
>> United Kingdom
>> Tel. : 0044 (0)20 7594 3658
>> t.jombart at imperial.ac.uk
>> http://sites.google.com/site/thibautjombart/
>> http://adegenet.r-forge.r-project.org/
>> ________________________________________
>> From:adegenet-forum-bounces at lists.r-forge.r-project.org  [adegenet-forum-bounces at lists.r-forge.r-project.org] on behalf of Fernando Cruz [fernando.cruz at ebd.csic.es]
>> Sent: 15 November 2013 18:53
>> To:adegenet-forum at lists.r-forge.r-project.org
>> Subject: [adegenet-forum] Request an example of genetic distance among two      individuals
>>
>> Hi Thibaut,
>>
>> I built an NJ tree using 1M SNPs with 10 samples, following the
>> instructions in the documentation. However, I would like to know exactly
>> how the genetic distance among individuals is calculated. Is it based on the
>> number of shared alleles?
>>
>> Could you provide a simple example? For instance, for these two individuals
>> using 5 SNPs:
>> Ind1 00122
>> Ind2 02210
>>
>> Using the binary information, they share 2+0+1+1+0= 4 alleles out of 10
>>
>> Thanks in advance,
>> Fernando Cruz
>>
>>
>> --
>> ****************************************
>> Dr. Fernando Cruz
>> Estación Biológica de Doñana (EBD-CSIC)
>> Avd. Americo Vespucio s/n
>> 41092-Seville (Spain)
>> Tel. +34 954466700/Ext. 1079
>> Fax: +34 95 4621125
>> Room: 0/12
>>
>> e-mail:fernando.cruz at ebd.csic.es
>> Website:http://openwetware.org/wiki/User:Fernando_Cruz
>> Web EcoGenes EU-FP7:http://www.ebd.csic.es/ecogenes/news.html
>> ****************************************
>>
>> _______________________________________________
>> adegenet-forum mailing list
>> adegenet-forum at lists.r-forge.r-project.org
>> https://lists.r-forge.r-project.org/cgi-bin/mailman/listinfo/adegenet-forum
>
>




