[adegenet-forum] PCA using R

Jombart, Thibaut t.jombart at imperial.ac.uk
Wed Mar 24 11:50:21 CET 2010


Dear Arun,

welcome to the exciting world of R (although some would argue about the 'exciting').

> Dear Thibaut
> As you know I am new to R. I may need your help and guidance in between.
> While reading the R manual, I prepared an example CSV file similar to my data. I used the following script to do a principal component analysis using "prcomp".
>
> R code starts
> > him <- read.csv("him.csv" , header=TRUE)
> > him
> > row.names(him) <- c("Chn","Dom","Tib","Bod","Gad","Raj","Mal","Gdb","Guj","GdR","Dgr","Pur","Zan","BrB","BrI","Bal","DgB","DaM","GaB","GjM","Chm")
> > row.names(him)
> > prcomp(him , scale=TRUE)
> > plot(prcomp(him))
> > summary(prcomp(him, scale=TRUE))
> > biplot(prcomp(him , scale=TRUE))
> > plot(him)
> End of R code
>
> As a result I got a PCA plot (attachment 1), a bit similar to the PCA done using SPSS.
>
> NOW I want to know :
>  1) whether I am travelling in the right direction in R.

The point is, it is very difficult to tell whether you are heading in the right direction if we don't know where you want to go. We do not know what your data are, or what question you are asking. All I can tell is that your command lines seem to be valid.

>  2) if so how to remove the "RED ARROWS" in the Biplot which start from the center and diverge to all directions.

If you are interested in analysing genetic data in R with multivariate analyses, I recommend that you first go through the tutorials available from the adegenet website:
http://adegenet.r-forge.r-project.org/
in the section 'Documents'.

There you will find the kind of figures you want to produce.
See in particular dudi.pca to perform a PCA, and s.label to plot the principal components of your analysis, both in the ade4 package.
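
As a rough illustration of those two functions, here is a minimal sketch. It assumes the ade4 package is installed, and it uses simulated numbers as a stand-in for your real table (the 21 row labels are taken from your script; the 5 variables and their values are made up). Plotting only the row scores gives a map of the individuals without the variable arrows that biplot() draws.

```r
library(ade4)

# Simulated stand-in for the real data (an assumption for illustration only):
# 21 populations described by 5 numeric variables.
set.seed(1)
him <- as.data.frame(matrix(rnorm(21 * 5), nrow = 21))
row.names(him) <- c("Chn","Dom","Tib","Bod","Gad","Raj","Mal","Gdb","Guj","GdR","Dgr",
                    "Pur","Zan","BrB","BrI","Bal","DgB","DaM","GaB","GjM","Chm")

# Centred and scaled PCA, keeping 2 axes.
# scannf = FALSE skips the interactive prompt asking how many axes to keep.
pca <- dudi.pca(him, center = TRUE, scale = TRUE, scannf = FALSE, nf = 2)

# pca$li holds the coordinates of the rows (individuals) on the kept axes.
# s.label plots them with their row names; no variable arrows are drawn.
s.label(pca$li)
```

The variable loadings are still available in pca$co if you want to plot them separately.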

>  3) How to include a symbol in the Biplot and color it.
> Compared to other software, I think R is simple and suitable for my data. I know it will be quite difficult in the beginning but I want to learn it. Please help.

R requires a bit of investment at the beginning, but this will pay off thousands of times over if you need to do any substantial statistical analyses.
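
On symbols and colours (your question 3), one possible approach in base R is to plot the prcomp scores yourself rather than calling biplot(). The sketch below is only an illustration: the data are simulated, and the grouping factor 'grp' is hypothetical, standing in for whatever grouping you have for your 21 populations.

```r
# Simulated stand-in for the real data (an assumption for illustration only).
set.seed(1)
him <- as.data.frame(matrix(rnorm(21 * 5), nrow = 21))

# Hypothetical grouping of the 21 rows into three groups of 7.
grp <- factor(rep(c("A", "B", "C"), each = 7))

# Scaled PCA; the scores of the individuals are stored in pca$x.
pca <- prcomp(him, scale. = TRUE)
scores <- pca$x[, 1:2]

# One colour and one plotting symbol per group, looked up by group code.
group_cols <- c("red", "blue", "darkgreen")
group_pchs <- c(16, 17, 15)
cols <- group_cols[grp]
pchs <- group_pchs[grp]

# Scatterplot of the first two components, with labels and a legend.
plot(scores, col = cols, pch = pchs, xlab = "PC1", ylab = "PC2")
text(scores, labels = row.names(him), pos = 3, cex = 0.7)
legend("topright", legend = levels(grp), col = group_cols, pch = group_pchs)
```

See ?points for the list of available pch symbols and ?colors for the named colours.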

Best regards and good luck.

Thibaut.



