[adegenet-forum] Request an example of genetic distance among two individuals

Fernando Cruz fernando.cruz at ebd.csic.es
Sun Nov 17 17:13:29 CET 2013


Well, there's a typo in the code below, sorry: "k31_13c_lp23" is the same object as "mygenlight".
Thanks,
Fernando

On 11/17/13 5:03 PM, Fernando Cruz wrote:
> Hi Thibaut,
>
> The nj tree of APE. What I basically did was:
>
> library(adegenet)
> library(ape)
>
> mygenlight <- read.snp("/Users/Nando/Documents/mydata.snp", chunk=2)
>
> x <- seploc(k31_13c_lp23, n.block=100) # ~10000 SNPs each
>
> # dist is used within an lapply loop to compute pairwise distances
> # between individuals for each block
> lD <- lapply(x, function(e) dist(as.matrix(e)))
> class(lD[[1]])
>
> # The general distance matrix is obtained by summing these:
> D <- Reduce("+", lD)
> plot(nj(D), type="fan")
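A side note on the summation step: Euclidean distances are not additive across SNP blocks, but their squares are. The following is only a minimal base R sketch on an invented toy matrix; the names geno, blocks, d_sum, d_sq and d_full are hypothetical and do not come from the thread. It compares the plain sum of per-block distances with the distance computed on all SNPs at once:

```r
# Toy data: 4 individuals x 6 SNPs coded as allele counts (0/1/2).
# The matrix and the two-block split are invented for illustration.
geno <- rbind(
  ind1 = c(0, 0, 1, 2, 2, 1),
  ind2 = c(0, 2, 2, 1, 0, 1),
  ind3 = c(1, 1, 0, 2, 2, 0),
  ind4 = c(2, 0, 1, 0, 1, 2)
)
blocks <- list(geno[, 1:3], geno[, 4:6])

# Sum of the per-block Euclidean distances (the Reduce("+", ...) pattern)
d_sum <- Reduce("+", lapply(blocks, dist))

# Square root of the summed squared per-block distances
d_sq <- sqrt(Reduce("+", lapply(blocks, function(b) dist(b)^2)))

# Euclidean distance computed on all SNPs at once
d_full <- dist(geno)

# Squared distances add up across blocks, so d_sq matches d_full
print(all.equal(as.vector(d_sq), as.vector(d_full)))

# The plain sum of block distances is never smaller than d_full
print(all(as.vector(d_sum) >= as.vector(d_full) - 1e-12))
```

Whether the difference matters for the resulting tree depends on the data; the sketch only shows that the two quantities are not the same in general.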
>
> Cheers,
> Fernando
>
> On 11/17/13 4:45 PM, Jombart, Thibaut wrote:
>> Hi there,
>>
>> I'm not sure which tree you are referring to.
>>
>> Cheers
>> Thibaut
>> ________________________________________
>> From: Fernando Cruz [fernando.cruz at ebd.csic.es]
>> Sent: 17 November 2013 15:41
>> To: Jombart, Thibaut; adegenet-forum at lists.r-forge.r-project.org
>> Subject: Re: [adegenet-forum] Request an example of genetic distance 
>> among two  individuals
>>
>> Thanks Thibaut,
>>
>> This clarifies things. In both the Euclidean and the Hamming distances, the
>> distance between a pair of individuals depends on the number of
>> "unshared alleles".
>> By the way, the standardized distance is then what is plotted in the NJ tree,
>> rather than the one from Saitou & Nei (1987) used by the APE library, right?
>>
>> Cheers,
>> Fernando
>>
>> On 11/17/13 4:23 PM, Jombart, Thibaut wrote:
>>> Just realized a typo:
>>>
>>>    sqrt(\sum_i (x_i - y_i)^2
>>>
>>> should read
>>>
>>>    sqrt{ \sum_i (x_i - y_i)^2 }
>>>
>>> Cheers
>>> Thibaut
>>> ________________________________________
>>> From:adegenet-forum-bounces at lists.r-forge.r-project.org 
>>> [adegenet-forum-bounces at lists.r-forge.r-project.org] on behalf of 
>>> Jombart, Thibaut [t.jombart at imperial.ac.uk]
>>> Sent: 17 November 2013 15:07
>>> To: Fernando Cruz;adegenet-forum at lists.r-forge.r-project.org
>>> Subject: Re: [adegenet-forum] Request an example of genetic distance 
>>> among two  individuals
>>>
>>> Hello there,
>>>
>>> There are many different distances that can be computed between 
>>> allelic profiles, but at the individual level there are somewhat 
>>> fewer options.
>>>
>>> One is the Hamming distance, which you mention here (D=6), and which 
>>> you can deduce from 'propShared'.
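The D=6 figure can be reproduced by hand from the two allele-count profiles in the original question. This is a minimal base R sketch, assuming that the number of shared alleles at a diploid SNP is the ploidy minus the absolute difference in allele counts; it does not call adegenet's propShared, and the names ploidy, shared, hamming and prop_shared are hypothetical:

```r
# Allele counts for the two individuals from the original question
x <- c(0, 0, 1, 2, 2)
y <- c(0, 2, 2, 1, 0)
ploidy <- 2

# Shared alleles per SNP, assuming shared = ploidy - |x_i - y_i|
shared <- ploidy - abs(x - y)
print(sum(shared))          # 4 shared alleles out of 10

# Unshared alleles, i.e. the allele-level Hamming distance
hamming <- sum(abs(x - y))
print(hamming)              # 6

# Proportion of shared alleles, and the distance recovered from it
prop_shared <- sum(shared) / (ploidy * length(x))
print((1 - prop_shared) * ploidy * length(x))
```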
>>>
>>> The usual Euclidean distance is different though. Between two 
>>> vectors of allelic profiles x=[x_i] and y=[y_i], the Euclidean 
>>> distance is given by (using LaTeX notation):
>>>
>>> D(x,y) = || x - y || = sqrt{ (x-y)^T (x-y)} = sqrt(\sum_i (x_i - y_i)^2
>>>
>>> Using your example:
>>>> x <- c(0,0,1,2,2)
>>>> y <- c(0,2,2,1,0)
>>>> sqrt(sum((x-y)^2))
>>> [1] 3.162278
>>>> dist(rbind.data.frame(x,y))
>>>            1
>>> 2 3.162278
>>>
>>>
>>> Note that in adegenet, data in genind objects are standardized to 
>>> relative frequencies, so that the distance would be different:
>>>> x.rel <- x/2
>>>> y.rel <- y/2
>>>> dist(rbind.data.frame(x.rel,y.rel))
>>>            1
>>> 2 1.581139
>>>
>>> That is, the distance between the raw allele count profiles divided 
>>> by the ploidy.
>>>
>>> As a last note, there is a particular case for haploid data, where 
>>> the Hamming distance equals the squared Euclidean distance (it 
>>> follows that a PCA on the covariance matrix is also the best 
>>> reduced-space representation of Hamming distances).
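The haploid special case is easy to check numerically. A minimal base R sketch with two invented 0/1 profiles (h1, h2 and the result names are hypothetical): with one allele per SNP, every mismatch contributes exactly 1 to both the Hamming count and the squared difference.

```r
# Two invented haploid profiles coded 0/1 (one allele per SNP)
h1 <- c(0, 1, 1, 0, 1, 0)
h2 <- c(1, 1, 0, 0, 1, 1)

hamming_hap <- sum(h1 != h2)       # number of mismatching SNPs
sq_euclid   <- sum((h1 - h2)^2)    # squared Euclidean distance

print(hamming_hap == sq_euclid)
```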
>>>
>>> Cheers
>>>
>>> Thibaut
>>>
>>>
>>> -- 
>>> ######################################
>>> Dr Thibaut JOMBART
>>> MRC Centre for Outbreak Analysis and Modelling
>>> Department of Infectious Disease Epidemiology
>>> Imperial College - School of Public Health
>>> St Mary’s Campus
>>> Norfolk Place
>>> London W2 1PG
>>> United Kingdom
>>> Tel. : 0044 (0)20 7594 3658
>>> t.jombart at imperial.ac.uk
>>> http://sites.google.com/site/thibautjombart/
>>> http://adegenet.r-forge.r-project.org/
>>> ________________________________________
>>> From:adegenet-forum-bounces at lists.r-forge.r-project.org 
>>> [adegenet-forum-bounces at lists.r-forge.r-project.org] on behalf of 
>>> Fernando Cruz [fernando.cruz at ebd.csic.es]
>>> Sent: 15 November 2013 18:53
>>> To:adegenet-forum at lists.r-forge.r-project.org
>>> Subject: [adegenet-forum] Request an example of genetic distance 
>>> among two      individuals
>>>
>>> Hi Thibaut,
>>>
>>> I built an NJ tree using 1M SNPs from 10 samples, following the
>>> instructions in the documentation. However, I would like to know exactly
>>> how the genetic distance among individuals is calculated. Is it based on
>>> the number of shared alleles?
>>>
>>> Could you provide a simple example? For instance, for these two
>>> individuals using 5 SNPs:
>>> Ind1 00122
>>> Ind2 02210
>>>
>>> Using the binary information, they share 2+0+1+1+0= 4 alleles out of 10
>>>
>>> Thanks in advance,
>>> Fernando Cruz
>>>
>>>
>>> -- 
>>> ****************************************
>>> Dr. Fernando Cruz
>>> Estación Biológica de Doñana (EBD-CSIC)
>>> Avd. Americo Vespucio s/n
>>> 41092-Seville (Spain)
>>> Tel. +34 954466700/Ext. 1079
>>> Fax: +34 95 4621125
>>> Room: 0/12
>>>
>>> e-mail:fernando.cruz at ebd.csic.es
>>> Website:http://openwetware.org/wiki/User:Fernando_Cruz
>>> Web EcoGenes EU-FP7:http://www.ebd.csic.es/ecogenes/news.html
>>> ****************************************
>>>
>>> _______________________________________________
>>> adegenet-forum mailing list
>>> adegenet-forum at lists.r-forge.r-project.org
>>> https://lists.r-forge.r-project.org/cgi-bin/mailman/listinfo/adegenet-forum 
>>>
>>
>
>


-- 
****************************************
Dr. Fernando Cruz
Estación Biológica de Doñana (EBD-CSIC)
Avd. Americo Vespucio s/n
41092-Seville (Spain)
Tel. +34 954466700/Ext. 1079
Fax: +34 95 4621125
Room: 0/12

e-mail: fernando.cruz at ebd.csic.es
Website: http://openwetware.org/wiki/User:Fernando_Cruz
Web EcoGenes EU-FP7: http://www.ebd.csic.es/ecogenes/news.html
****************************************


