[adegenet-forum] Using mtDNA haplogroup data to perform DAPC and sPCA

Jombart, Thibaut t.jombart at imperial.ac.uk
Mon May 26 07:58:37 CEST 2014


Hi Guillermo, 

I am not sure you should treat the data as a single locus. But if you do, creating the genind object should be straightforward: just use df2genind with ploidy=1. As for the genpop, use the usual genind2genpop converter.
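
To make the single-locus layout concrete, here is a minimal sketch (in Python, purely for illustration) that expands per-population haplogroup counts into one row per individual, with a single column holding the haplogroup label plus a population label. The population names, haplogroup labels, counts and the function name expand_counts are all made up; the resulting table would still have to be read into R and passed to df2genind with ploidy=1.

```python
# Hypothetical haplogroup counts per population (made-up numbers).
counts = {
    "popA": {"H": 3, "U": 1},
    "popB": {"H": 1, "J": 2},
}


def expand_counts(counts):
    """Return one (individual, population, haplogroup) row per individual."""
    rows = []
    for pop, hap_counts in counts.items():
        i = 0  # running index of individuals within this population
        for hap, n in hap_counts.items():
            for _ in range(n):
                i += 1
                rows.append((f"{pop}_{i}", pop, hap))
    return rows


rows = expand_counts(counts)

# Print as comma-separated text: individual, population, haplogroup.
print("ind,pop,haplogroup")
for row in rows:
    print(",".join(row))
```

Each individual carries exactly one allele at the single locus, which is what a haploid (ploidy=1) coding amounts to.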

Now, if you have mtDNA sequences, it is probably best to keep SNP data, as you will still have much more information there. Otherwise, the only information input to the analyses is whether individuals have identical haplotypes, or not (and not the distance between haplotypes), which is not great.
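
A small sketch of what is lost, using made-up toy sequences (the haplotype names, sequences and function names below are hypothetical): with haplotype labels only, any two different haplotypes are equally distant, whereas the sequences tell you how many sites differ.

```python
# Toy, made-up haplotype sequences (for illustration only).
haplotypes = {
    "H1": "ACGTACGT",
    "H2": "ACGTACGA",
    "H3": "TCGAACTA",
}


def identity_distance(label_a, label_b):
    """0 if the two haplotype labels are identical, 1 otherwise."""
    return 0 if label_a == label_b else 1


def hamming_distance(seq_a, seq_b):
    """Number of sites at which two aligned sequences differ."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    return sum(a != b for a, b in zip(seq_a, seq_b))


for other in ("H2", "H3"):
    print(
        "H1 vs", other,
        "| identity:", identity_distance("H1", other),
        "| sites differing:", hamming_distance(haplotypes["H1"], haplotypes[other]),
    )
```

Both comparisons give the same identity distance, although one pair differs at more sites than the other.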

Cheers
Thibaut

________________________________________
From: adegenet-forum-bounces at lists.r-forge.r-project.org [adegenet-forum-bounces at lists.r-forge.r-project.org] on behalf of Guillermo Reales Monteagudo [greales at ucm.es]
Sent: 21 May 2014 09:29
To: adegenet-forum at lists.r-forge.r-project.org
Subject: [adegenet-forum] Using mtDNA haplogroup data to perform DAPC and sPCA

Hello,

I've just started to learn about the package and tried to use it on my own data, but I'm still having some issues which I suspect might be related to data input. I'm writing to check whether I'm heading in the right direction.

My initial data are haplogroup frequencies in several (63) populations (total n = 944), so to construct the genind object I decided to treat these (9) haplogroups as haploid alleles of a single marker (i.e. 1 locus, 9 alleles in this case). First I converted my frequencies to absolute counts in each population, and then I wrote a function to build a matrix of individual genotypes that fits the genind structure - a 944-row matrix of ones and zeroes giving each individual's haplogroup.
I also included a factor assigning each individual to its population, and an xy list with UTM coordinates for each individual.
After constructing my genind and genpop objects I started performing some DAPC and sPCA analyses, but I still don't know whether I can trust my results, as I haven't found any example using this type of input data to compare against.
Any suggestion for improvement?
Thank you all in advance.

Guillermo

