[adegenet-forum] Reading in data from Stacks output files
Harriet Hunt
hvh22 at cam.ac.uk
Thu Nov 16 18:21:09 CET 2017
Hi Thibault et al,
I am trying to read in a SNP data set output by Julian Catchen's
Stacks program for downstream multivariate analyses (PCAs, genetic
distance measures, etc.). I have tried converting both the Structure
file and the VCF file to genind objects, but the two do not give the
same result: there are 2136 alleles (1068 loci, diploid) in the genind
converted from the Structure file but 3592 alleles in the genind
converted from the VCF file.
Some of this is done using the package snpStats rather than adegenet, but
maybe someone can answer the question anyway? I would like to know
whether there is an error in my code that produces these conflicting
results, or whether it is just the way the data are coded in VCF.
My code is:
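library("adegenet")           # read.structure(), df2genind()
library("VariantAnnotation")  # readVcf(), genotypeToSnpMatrix()
library("snpStats")           # SnpMatrix class used for the genotypes

# genind from the Stacks Structure export
matrix98str <- read.structure("98percent.str", n.ind=371, n.loc=1068,
    onerowperind=FALSE, col.lab=1, col.pop=2, row.marknames=1, NA.char=0)

# genind from the Stacks VCF export, going through a SnpMatrix
vcf <- readVcf("98percent.vcf")
matrix98vcf <- genotypeToSnpMatrix(vcf)
matrix98vcfSNPs <- df2genind(matrix98vcf$genotypes, ploidy=2, sep="/",
    ind.names=rownames(matrix98vcf$genotypes),
    loc.names=colnames(matrix98vcf$genotypes), NA.char=NA)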
I am then comparing the two genind objects, matrix98str and
matrix98vcfSNPs.
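
For reference, here is a minimal sketch of the kind of per-locus breakdown
that could show where two allele totals diverge. It uses base R on made-up
toy genotypes, not the Stacks data; toy_str, toy_vcf and count_alleles() are
hypothetical names introduced only for illustration. On real genind objects,
the adegenet accessors nLoc() and nAll() report the number of loci and the
number of alleles per locus.

```r
# Toy diploid genotypes (made up for illustration, not the Stacks data).
# Rows are individuals, columns are loci, alleles are separated by "/".
toy_str <- data.frame(
  loc1 = c("1/1", "1/2", "2/2"),
  loc2 = c("3/3", "3/4", "4/4"),
  row.names = c("ind1", "ind2", "ind3"),
  stringsAsFactors = FALSE
)

# Second toy set: the same loci plus one extra locus.
toy_vcf <- toy_str
toy_vcf$loc3 <- c("1/1", "1/3", "3/3")

# Hypothetical helper: number of distinct alleles observed at each locus.
count_alleles <- function(geno, sep = "/") {
  vapply(geno, function(locus) {
    alleles <- unlist(strsplit(locus[!is.na(locus)], sep, fixed = TRUE))
    length(unique(alleles))
  }, integer(1))
}

alleles_str <- count_alleles(toy_str)
alleles_vcf <- count_alleles(toy_vcf)

# Total alleles in each set, and loci present in one set but not the other.
total_str <- sum(alleles_str)
total_vcf <- sum(alleles_vcf)
extra_loci <- setdiff(names(alleles_vcf), names(alleles_str))
```

In this toy case the totals differ only because the second set carries an
extra locus; whether the same holds for the two files above is something the
per-locus counts would have to show.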
Thanks for any help! Harriet
--
Dr Harriet Hunt
Research Associate
McDonald Institute for Archaeological Research
University of Cambridge
Downing Street
Cambridge CB2 3ER
UK
Tel: +44 (0)1223 339330
e-mail: hvh22 at cam.ac.uk