[adegenet-forum] Request an example of genetic distance among two individuals

Jombart, Thibaut t.jombart at imperial.ac.uk
Sun Nov 17 16:23:52 CET 2013


Just realized a typo:

 sqrt(\sum_i (x_i - y_i)^2

should read

 sqrt{ \sum_i (x_i - y_i)^2 }

Cheers
Thibaut
________________________________________
From: adegenet-forum-bounces at lists.r-forge.r-project.org [adegenet-forum-bounces at lists.r-forge.r-project.org] on behalf of Jombart, Thibaut [t.jombart at imperial.ac.uk]
Sent: 17 November 2013 15:07
To: Fernando Cruz; adegenet-forum at lists.r-forge.r-project.org
Subject: Re: [adegenet-forum] Request an example of genetic distance among two individuals

Hello there,

There are many different distances that can be computed between allelic profiles, but at the individual level there are somewhat fewer options.

One is the Hamming distance, which you mention here (D=6), and which you can deduce from 'propShared'.
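As a minimal sketch of how the counts in the question arise, assume the genotypes are coded as the number of copies of the second allele (0, 1 or 2) at each biallelic SNP of a diploid individual. The names 'shared' and 'hamming' are only illustrative, and this uses base R rather than 'propShared' itself:

```r
# Genotypes from the question, coded as counts of the second allele
# (assumption: diploid individuals, biallelic SNPs)
x <- c(0, 0, 1, 2, 2)
y <- c(0, 2, 2, 1, 0)
ploidy <- 2

# Under this coding, alleles shared at each locus = ploidy - |x_i - y_i|
shared <- ploidy - abs(x - y)
shared                      # 2 0 1 1 0
sum(shared)                 # 4 shared alleles out of 10

# Number of alleles that differ, i.e. the D = 6 mentioned above
hamming <- sum(abs(x - y))
hamming                     # 6
```

With 10 alleles in total and a shared proportion of 4/10, the same value follows as 10 * (1 - 0.4) = 6.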

The usual Euclidean distance is different, though. Between two vectors of allelic profiles x=[x_i] and y=[y_i], the Euclidean distance is given by (using LaTeX notation):

D(x,y) = || x - y || = sqrt{ (x-y)^T (x-y)} = sqrt(\sum_i (x_i - y_i)^2

Using your example:
> x <- c(0,0,1,2,2)
> y <- c(0,2,2,1,0)
> sqrt(sum((x-y)^2))
[1] 3.162278
> dist(rbind.data.frame(x,y))
         1
2 3.162278


Note that in adegenet, data in genind objects are standardized to relative frequencies, so that the distance would be different:
> x.rel <- x/2
> y.rel <- y/2
> dist(rbind.data.frame(x.rel,y.rel))
         1
2 1.581139

That is, it equals the distance between the raw allele-count profiles divided by the ploidy.
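The scaling can be checked directly in base R. This is only a sketch on the same two profiles, with 'ploidy', 'd.raw' and 'd.rel' as illustrative names:

```r
# Same profiles as above, diploid data assumed
x <- c(0, 0, 1, 2, 2)
y <- c(0, 2, 2, 1, 0)
ploidy <- 2

# Euclidean distance on raw allele counts
d.raw <- sqrt(sum((x - y)^2))

# Euclidean distance on relative frequencies (counts divided by ploidy)
d.rel <- sqrt(sum((x / ploidy - y / ploidy)^2))

# Dividing every coordinate by the ploidy divides the distance by the ploidy
all.equal(d.rel, d.raw / ploidy)   # TRUE
```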

As a last note, there is a particular case for haploid data, where the Hamming distance equals the squared Euclidean distance (it follows that a PCA on the covariance matrix is also the best reduced-space representation of Hamming distances).
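To illustrate the haploid case, here is a small sketch with two made-up 0/1 profiles ('a' and 'b' are hypothetical data, not taken from the question). For 0/1 values, (a_i - b_i)^2 is 1 where the profiles differ and 0 elsewhere, so the sum of squares counts the mismatches:

```r
# Hypothetical haploid profiles: 0/1 allele coding at 6 loci
a <- c(0, 1, 1, 0, 1, 0)
b <- c(1, 1, 0, 0, 1, 1)

# Hamming distance: number of loci at which the profiles differ
hamming <- sum(a != b)

# Squared Euclidean distance between the same profiles
sq.euclid <- sum((a - b)^2)

hamming == sq.euclid   # TRUE (both equal 3)
```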

Cheers

Thibaut


--
######################################
Dr Thibaut JOMBART
MRC Centre for Outbreak Analysis and Modelling
Department of Infectious Disease Epidemiology
Imperial College - School of Public Health
St Mary’s Campus
Norfolk Place
London W2 1PG
United Kingdom
Tel. : 0044 (0)20 7594 3658
t.jombart at imperial.ac.uk
http://sites.google.com/site/thibautjombart/
http://adegenet.r-forge.r-project.org/
________________________________________
From: adegenet-forum-bounces at lists.r-forge.r-project.org [adegenet-forum-bounces at lists.r-forge.r-project.org] on behalf of Fernando Cruz [fernando.cruz at ebd.csic.es]
Sent: 15 November 2013 18:53
To: adegenet-forum at lists.r-forge.r-project.org
Subject: [adegenet-forum] Request an example of genetic distance among two individuals

Hi Thibaut,

I built an NJ tree using 1M SNPs from 10 samples, following the
instructions in the documentation. However, I would like to know exactly
how the genetic distance among individuals is calculated. Is it based on the
number of shared alleles?

Could you provide a simple example? For instance, for these two individuals
using 5 SNPs:
Ind1 00122
Ind2 02210

Using the binary information, they share 2+0+1+1+0 = 4 alleles out of 10.

Thanks in advance,
Fernando Cruz


--
****************************************
Dr. Fernando Cruz
Estación Biológica de Doñana (EBD-CSIC)
Avd. Americo Vespucio s/n
41092-Seville (Spain)
Tel. +34 954466700/Ext. 1079
Fax: +34 95 4621125
Room: 0/12

e-mail: fernando.cruz at ebd.csic.es
Website: http://openwetware.org/wiki/User:Fernando_Cruz
Web EcoGenes EU-FP7: http://www.ebd.csic.es/ecogenes/news.html
****************************************

_______________________________________________
adegenet-forum mailing list
adegenet-forum at lists.r-forge.r-project.org
https://lists.r-forge.r-project.org/cgi-bin/mailman/listinfo/adegenet-forum