# [adegenet-forum] Principal coordinate analysis on triangular matrix

Jombart, Thibaut t.jombart at imperial.ac.uk
Fri Jun 18 19:15:09 CEST 2010

```Dear Emma,

that's fairly doable in R with ade4, but not documented. I assume you've read an Fst matrix into R and called it 'mat'. If reading the data in is the problem, please post a toy example of your dataset so that I can figure out how to read it. Principal Coordinate Analysis (PCoA) is implemented in ade4 by the function dudi.pco (see ?dudi.pco).
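
## Side note, in case reading the matrix is the sticking point: a minimal
## sketch, assuming the lower triangle is available row by row with the
## diagonal left out. The values and population names are made up.
vals <- c(0.12,
          0.30, 0.25,
          0.08, 0.15, 0.28)      # made-up Fst values for rows 2, 3 and 4
n <- 4                           # number of populations
mat <- matrix(0, n, n)
mat[upper.tri(mat)] <- vals      # column-wise fill of the upper triangle
mat <- mat + t(mat)              # mirror it to obtain a symmetric matrix
dimnames(mat) <- list(paste0("pop", 1:n), paste0("pop", 1:n))  # hypothetical labels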

## load ade4 (needed for is.euclid and dudi.pco) and make a fake distance matrix:
> library(ade4)
> mat <- as.matrix(dist(cbind(rnorm(5), rnorm(5))))
> mat
          1         2        3         4         5
1 0.0000000 1.2790368 2.498724 0.4224921 0.2215370
2 1.2790368 0.0000000 1.219687 0.8565447 1.0574997
3 2.4987236 1.2196868 0.000000 2.0762315 2.2771865
4 0.4224921 0.8565447 2.076232 0.0000000 0.2009550
5 0.2215370 1.0574997 2.277187 0.2009550 0.0000000

> D <- as.dist(mat) # convert distance matrix to a dist object
> D
          1         2         3         4
2 1.2790368
3 2.4987236 1.2196868
4 0.4224921 0.8565447 2.0762315
5 0.2215370 1.0574997 2.2771865 0.2009550

> is.euclid(D) # check that the distance is Euclidean
[1] TRUE

> pco1 <- dudi.pco(D) # here you have to select the number of retained components
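
## To skip the interactive prompt (useful in scripts), a possible variant
## is to fix the number of retained axes up front. nf = 2 is an arbitrary
## choice here; D is the dist object created above.
pco1 <- dudi.pco(D, scannf = FALSE, nf = 2)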
######

The object pco1 contains your PCoA. Outputs are a bit cryptic for people unfamiliar with the duality diagram - that is, normal people -, but what you need is likely:
- pco1$eig: the eigenvalues of the analysis
- pco1$li: the principal coordinates (row scores) of the analysis
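
## A quick way to look at these two components, continuing from the pco1
## object above (a sketch, not the only way to summarise a PCoA):
pco1$eig                                   # raw eigenvalues
round(100 * pco1$eig / sum(pco1$eig), 1)   # percentage of inertia per axis
head(pco1$li)                              # coordinates on the retained axes
barplot(pco1$eig, main = "PCoA eigenvalues")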

The instruction 'scatter(pco1)' will give a basic display of the first factorial plane. See ?scatter.pco, ?s.label, ?add.scatter.eig to adapt the graphics to your needs.
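
## One possible customised display, assuming pco1 was built with at least
## two retained axes. The legend position is an arbitrary choice.
s.label(pco1$li, xax = 1, yax = 2)         # labelled points on axes 1 and 2
add.scatter.eig(pco1$eig, nf = 2, xax = 1, yax = 2, posi = "bottomright")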

One last thing to pay attention to is ensuring that you are indeed analysing a Euclidean distance. If is.euclid(...) is FALSE, then you can make your distance Euclidean using 'cailliez' or 'lingoes' (see the corresponding manpages).
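
## A small sketch of that check-and-correct step. The toy matrix is made
## up and deliberately breaks the triangle inequality (3 > 1 + 1), so it
## cannot be Euclidean.
library(ade4)
toy <- matrix(c(0, 1, 3,
                1, 0, 1,
                3, 1, 0), nrow = 3, byrow = TRUE)
D2 <- as.dist(toy)
if (!is.euclid(D2)) {
  D2 <- cailliez(D2)   # adds a constant to the distances; lingoes(D2) is the alternative
}
is.euclid(D2)          # re-check after the correction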

Lastly, if you need to repeat all this for several matrices, you may want to make a function out of your script and iterate over the different input files (a for loop or lapply would do the job).
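
## A minimal sketch of such a wrapper. run_pcoa and mats are hypothetical
## names, and the matrices are simulated here in place of real Fst files.
library(ade4)
run_pcoa <- function(mat, nf = 2) {
  D <- as.dist(mat)
  if (!is.euclid(D)) D <- cailliez(D)     # correct non-Euclidean distances
  dudi.pco(D, scannf = FALSE, nf = nf)    # non-interactive PCoA
}
fake_mat <- function() as.matrix(dist(cbind(rnorm(5), rnorm(5))))
mats <- list(first = fake_mat(), second = fake_mat())
res <- lapply(mats, run_pcoa)             # one PCoA object per matrix
res$first$eig                             # eigenvalues for the first matrix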

Best regards,

Thibaut.

________________________________________
From: adegenet-forum-bounces at lists.r-forge.r-project.org [adegenet-forum-bounces at lists.r-forge.r-project.org] On Behalf Of Emma Carroll [ecar026 at aucklanduni.ac.nz]
Sent: 18 June 2010 17:28
Subject: [adegenet-forum] Principal coordinate analysis on triangular matrix

Hi,
I have triangular matrices of FST values that I've calculated in
GENEPOP v4 and Arlequin that I would like to perform separate
principal coordinate analyses on. I can't find any information on
matrices. GenAlEx performs a tri-distance matrix PCA (by converting
the distance matrix into a covariate matrix)  but as I can't find much
documentation on the underlying methodology I wanted to confirm the
results. Any help would be appreciated,
kind regards
Emma

--
Emma Carroll
PhD Candidate
Molecular Ecology and Evolution
School of Biological Sciences
University of Auckland
64 9 3737 599 x 88483

School of Biological Sciences
Room 260, Level 2
Thomas Building
3a Symonds Street
Auckland Central 1010
New Zealand.

University of Auckland
School of Biological Sciences
Private Bag 92019
Auckland Mail Centre
Auckland 1142
New Zealand