[adegenet-forum] PCA with tetraploid data

AVIK RAY avik.ray.kol at gmail.com
Tue May 17 11:41:54 CEST 2011


Hi,
That's fine, but if I have 204 individuals, 7 loci and 4 alleles per locus, 
do I have to enter all of that information into adegenet by hand using 
dat=data.frame(locus1=c............)? That's quite a job! Or is there 
another way?
AVIK


On 5/10/2011 2:46 PM, Jombart, Thibaut wrote:
> Hello,
>
> df2genind does that annoying job for you. All you need is to read your 
> data into R as a data.frame with one column for each locus, each 
> genotype being a string of alleles separated by a given character 
> (here "/"). For instance:
> ####
> > dat = data.frame(loc1=c("80/80/78/60","60/60/60/60","78/80/80/82"), 
> loc2=c("50/55/60/75","50/50/50/50","55/55/55/55"))
> > dat
>          loc1        loc2
> 1 80/80/78/60 50/55/60/75
> 2 60/60/60/60 50/50/50/50
> 3 78/80/80/82 55/55/55/55
>
> > x=df2genind(dat, sep="/", ploidy=4)
> > x
>
>    #####################
>    ### Genind object ###
>    #####################
> - genotypes of individuals -
>
> S4 class:  genind
> @call: df2genind(X = dat, sep = "/", ploidy = 4)
>
> @tab:  3 x 8 matrix of genotypes
>
> @ind.names: vector of  3 individual names
> @loc.names: vector of  2 locus names
> @loc.nall: number of alleles per locus
> @loc.fac: locus factor for the  8 columns of @tab
> @all.names: list of  2 components yielding allele names for each locus
> @ploidy:  4
> @type:  codom
>
> Optionnal contents:
> @pop:  - empty -
> @pop.names:  - empty -
>
> @other: - empty -
>
> > truenames(x)
>   loc1.60 loc1.78 loc1.80 loc1.82 loc2.50 loc2.55 loc2.60 loc2.75
> 1    0.25    0.25     0.5    0.00    0.25    0.25    0.25    0.25
> 2    1.00    0.00     0.0    0.00    1.00    0.00    0.00    0.00
> 3    0.00    0.25     0.5    0.25    0.00    1.00    0.00    0.00
> ####
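
For a larger dataset, the same data.frame can come from a file instead of being typed by hand. The following is only a minimal sketch, assuming the genotypes are stored as a tab-separated text file with one row per individual and one column per locus. The file layout, the individual labels (A1, A2, A3) and the temporary file standing in for the real one are all made up for illustration.

```r
# Hypothetical input: tab-separated, one row per individual, one column per
# locus, alleles within a genotype separated by "/". A temporary file stands
# in for the real data file here.
path <- tempfile(fileext = ".txt")
writeLines(c(
  "ind\tloc1\tloc2",
  "A1\t80/80/78/60\t50/55/60/75",
  "A2\t60/60/60/60\t50/50/50/50",
  "A3\t78/80/80/82\t55/55/55/55"
), path)

# Read genotypes as character strings; the first column supplies row names.
dat <- read.table(path, header = TRUE, sep = "\t", row.names = 1,
                  colClasses = "character")

# Convert to a genind object only if adegenet is installed.
if (requireNamespace("adegenet", quietly = TRUE)) {
  x <- adegenet::df2genind(dat, sep = "/", ploidy = 4)
}
```

With a real file, only the path and possibly the separator would change; the df2genind call is the same as in the example above.
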
>
> You can then perform a PCA on truenames(x) or, better, on a 
> centred/scaled version of this matrix obtained with scaleGen(x).
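
The PCA step mentioned here could look roughly as follows. This is a sketch under assumptions, not necessarily what was intended in the message: it uses dudi.pca from the ade4 package for the PCA itself, and the argument names are those of current ade4/adegenet releases, which may differ in older versions. Centring and scaling are switched off in dudi.pca because scaleGen has already done both.

```r
# Same toy data as in the example above (3 tetraploid individuals, 2 loci).
dat <- data.frame(
  loc1 = c("80/80/78/60", "60/60/60/60", "78/80/80/82"),
  loc2 = c("50/55/60/75", "50/50/50/50", "55/55/55/55"),
  stringsAsFactors = FALSE
)

# Run only if both packages are installed.
have_pkgs <- requireNamespace("adegenet", quietly = TRUE) &&
  requireNamespace("ade4", quietly = TRUE)

if (have_pkgs) {
  x <- adegenet::df2genind(dat, sep = "/", ploidy = 4)
  X <- adegenet::scaleGen(x)   # centred and scaled allele frequencies
  # X is already centred/scaled, so dudi.pca is told not to do it again.
  pca <- ade4::dudi.pca(X, center = FALSE, scale = FALSE,
                        scannf = FALSE, nf = 2)
  print(pca$li)                # coordinates of individuals on the PCA axes
}
```

Here nf = 2 keeps two axes, which is all that three individuals can support; with 204 individuals a larger value, chosen from the eigenvalues in pca$eig, may be more appropriate.
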
>
> Best
>
> Thibaut
>
> -- 
> ######################################
> Dr Thibaut JOMBART
> MRC Centre for Outbreak Analysis and Modelling
> Department of Infectious Disease Epidemiology
> Imperial College - Faculty of Medicine
> St Mary’s Campus
> Norfolk Place
> London W2 1PG
> United Kingdom
> Tel. : 0044 (0)20 7594 3658
> t.jombart at imperial.ac.uk
> http://sites.google.com/site/thibautjombart/
> http://adegenet.r-forge.r-project.org/
> ------------------------------------------------------------------------
> *From:* adegenet-forum-bounces at r-forge.wu-wien.ac.at 
> [adegenet-forum-bounces at r-forge.wu-wien.ac.at] on behalf of AVIK RAY 
> [avik.ray.kol at gmail.com]
> *Sent:* 09 May 2011 20:00
> *To:* adegenet-forum at r-forge.wu-wien.ac.at
> *Subject:* [adegenet-forum] PCA with tetraploid data
>
> Dear Dr Jombart
> I want to do PCA and other analyses in adegenet. However, my data are a 
> tetraploid dataset (204 individuals, 7 microsatellite loci), so they 
> cannot be read using read.structure, as you mentioned in your earlier 
> mail to Sarah Castillo (19/10/2010, RE: Looking for help with a PCA 
> using adegenet in R).
>
> What I have understood so far from the code is that, instead of coding 
> each individual for each locus as in read.structure (diploid data), the 
> idea is to get the frequency of each allele (whether present or absent) 
> and then code each individual's genotype accordingly. However, I did 
> not understand the last part of the code, e.g.
>
> ………….
>
> $pop
>
> [1] ON ON ON ON ON ON ON ON
>
> Levels: ON
>
> > genind2df(x, sep="/")
>
> pop gen
>
> ……………
>
> Moreover, it seems extremely cumbersome for large datasets like mine 
> (204 individuals, 7 microsatellite loci). Can you give any suggestions?
>
> Thanks
>
> best regards
>
> AVIK
>
>  --
>
> AVIK RAY
> Visiting Fellow
> National Center for Biological Sciences
> Tata Institute of Fundamental Research
> GKVK Campus
> Bellary Road
> Bangalore-560065
> India
> Ph 91-80-23666340
> Fax 91-80-2363 6662

