[adegenet-forum] Principal coordinate analysis on triangular matrix

Jombart, Thibaut t.jombart at imperial.ac.uk
Fri Jun 18 19:15:09 CEST 2010

Dear Emma, 

that's fairly doable in R with ade4, but not documented. I assume you have already read an Fst matrix into R and called it 'mat'. If reading the data is the issue, please post a toy example of your dataset so that I can figure out how to read it. Principal Coordinate Analysis (PCoA) is implemented in ade4 by the function dudi.pco (see ?dudi.pco).
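
For what it's worth, here is a minimal sketch of one way to get a lower-triangular table into R using base functions only. The layout (whitespace-separated, diagonal included, no labels), the values and the population names are all assumptions made for illustration; GENEPOP and Arlequin outputs will probably need some trimming before they look like this.

```r
# Hypothetical lower-triangular Fst table, diagonal included (made-up values).
txt <- "
0
0.05 0
0.12 0.08 0
0.20 0.15 0.03 0
"
n <- 4  # number of populations (assumed known)

# fill = TRUE pads the short rows with NA, giving an n x n table
tri <- read.table(text = txt, header = FALSE, fill = TRUE,
                  col.names = paste0("V", seq_len(n)))
mat <- as.matrix(tri)
dimnames(mat) <- list(paste0("pop", seq_len(n)), paste0("pop", seq_len(n)))

# as.dist() only reads the lower triangle, so the NAs above the diagonal are ignored
D <- as.dist(mat)
D
```

With a real file, the 'text = txt' argument would be replaced by the file name.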

## make a fake distance matrix:
> mat=as.matrix(dist(cbind(rnorm(5), rnorm(5))))
> mat
          1         2        3         4         5
1 0.0000000 1.2790368 2.498724 0.4224921 0.2215370
2 1.2790368 0.0000000 1.219687 0.8565447 1.0574997
3 2.4987236 1.2196868 0.000000 2.0762315 2.2771865
4 0.4224921 0.8565447 2.076232 0.0000000 0.2009550
5 0.2215370 1.0574997 2.277187 0.2009550 0.0000000

> library(ade4)
> D <- as.dist(mat) # convert distance matrix to a dist object
> D
          1         2         3         4
2 1.2790368                              
3 2.4987236 1.2196868                    
4 0.4224921 0.8565447 2.0762315          
5 0.2215370 1.0574997 2.2771865 0.2009550

> is.euclid(D) # check that the distance is Euclidean
[1] TRUE

> pco1 <- dudi.pco(D) # here you have to select the number of retained components
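
In a script, the interactive prompt can be skipped with the 'scannf' and 'nf' arguments of dudi.pco. A small sketch on the same kind of toy data (the seed and the choice of two axes are arbitrary):

```r
library(ade4)

set.seed(1)                              # arbitrary seed, toy data only
D <- dist(cbind(rnorm(5), rnorm(5)))     # fake Euclidean distances

# scannf = FALSE: do not display the screeplot and ask for input
# nf = 2: retain two axes
pco1 <- dudi.pco(D, scannf = FALSE, nf = 2)
pco1
```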

The object pco1 contains your PCoA. Outputs are a bit cryptic for people unfamiliar with the duality diagram (that is, normal people), but what you need is likely:
- pco1$eig: the eigenvalues of the analysis
- pco1$li: the principal coordinates of the analysis (row scores on the retained axes)
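
As an illustration of how these two slots are typically used, the sketch below turns the eigenvalues into percentages of total inertia and prints the coordinates. The toy data and the seed are arbitrary.

```r
library(ade4)

set.seed(1)                              # toy data, as in the example above
D <- dist(cbind(rnorm(5), rnorm(5)))
pco1 <- dudi.pco(D, scannf = FALSE, nf = 2)

# share of the total inertia carried by each axis
prop <- pco1$eig / sum(pco1$eig)
round(100 * prop, 1)

# coordinates of the 5 objects on the retained axes (one row per object)
pco1$li
```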

The instruction 'scatter(pco1)' will give a basic display of the first factorial plane. See ?scatter.pco, ?s.label, ?add.scatter.eig to adapt the graphics to your needs.
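
A possible starting point for a customised plot, using s.label for the points and add.scatter.eig for the eigenvalue inset. The output file name, label size and inset position are arbitrary choices; writing to a PDF is only there so that the sketch also runs non-interactively.

```r
library(ade4)

set.seed(1)                              # toy data only
D <- dist(cbind(rnorm(5), rnorm(5)))
pco1 <- dudi.pco(D, scannf = FALSE, nf = 2)

out <- file.path(tempdir(), "pco1.pdf")  # hypothetical output file
pdf(out)
# labelled points on the first factorial plane
s.label(pco1$li, clabel = 1.2, sub = "PCoA, axes 1-2")
# barplot of eigenvalues as an inset, axes 1 and 2 highlighted
add.scatter.eig(pco1$eig, nf = 2, xax = 1, yax = 2, posi = "topright")
dev.off()
```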

Another thing to pay attention to is ensuring that you are indeed analysing a Euclidean distance. If is.euclid(...) is FALSE, then you can make your distance Euclidean using the functions 'cailliez' or 'lingoes' (see the corresponding manpages).
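
To make this concrete, here is a small sketch with a made-up matrix that violates the triangle inequality, so that it cannot be Euclidean. The values and labels are invented for the example.

```r
library(ade4)

# toy distances with d(a,c) > d(a,b) + d(b,c)
mat <- matrix(c(0, 1, 3,
                1, 0, 1,
                3, 1, 0), nrow = 3,
              dimnames = list(c("a", "b", "c"), c("a", "b", "c")))
D <- as.dist(mat)
is.euclid(D)           # FALSE for this matrix

# cailliez() adds a constant to the non-zero distances to make them Euclidean
D.cor <- cailliez(D)
is.euclid(D.cor)
```

The corrected object D.cor is still of class dist and can be passed to dudi.pco as before.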

Lastly, if you need to repeat all this for several matrices, you may want to make a function out of your script and iterate over different input files (a for loop or a lapply would do the job).
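
Something along these lines might do. The helper name 'run_pco' and the toy list are hypothetical; with real data the list would be built by reading each input file into a square matrix first.

```r
library(ade4)

# Hypothetical helper: PCoA on one square distance matrix
run_pco <- function(mat, nf = 2) {
  D <- as.dist(mat)
  if (!is.euclid(D)) D <- cailliez(D)    # correct only when needed
  dudi.pco(D, scannf = FALSE, nf = nf)
}

# Toy input: a named list of fake distance matrices standing in for Fst tables
set.seed(1)
mats <- list(
  genepop  = as.matrix(dist(cbind(rnorm(5), rnorm(5)))),
  arlequin = as.matrix(dist(cbind(rnorm(6), rnorm(6))))
)

res <- lapply(mats, run_pco)             # one PCoA per matrix
sapply(res, function(x) x$eig[1:2])      # first two eigenvalues of each analysis
```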

Best regards,


From: adegenet-forum-bounces at lists.r-forge.r-project.org [adegenet-forum-bounces at lists.r-forge.r-project.org] On Behalf Of Emma Carroll [ecar026 at aucklanduni.ac.nz]
Sent: 18 June 2010 17:28
To: adegenet-forum at lists.r-forge.r-project.org
Subject: [adegenet-forum] Principal coordinate analysis on triangular matrix

I have triangular matrices of FST values that I've calculated in
GENEPOP v4 and Arlequin that I would like to perform separate
principal coordinate analyses on. I can't find any information on
whether adegenet or ade4 performs these or PCAs on triangular
matrices. GenAlEx performs a tri-distance matrix PCA (by converting
the distance matrix into a covariance matrix) but as I can't find much
documentation on the underlying methodology I wanted to confirm the
results. Any help would be appreciated,
kind regards

Emma Carroll
PhD Candidate
Molecular Ecology and Evolution
School of Biological Sciences
University of Auckland