[adegenet-forum] Quality control

Gregory Neils Puncher gregoryneil.puncher2 at unibo.it
Mon Dec 10 11:28:41 CET 2012


Hello Thibaut and the adegenet community,
I am working through my first batch of SNPs and need to do some serious quality control. The dataset is quite bulky: over 500 individuals and 34,000 SNPs.

My query: First, I need to develop a script that will remove individuals from my dataset that have calls at fewer than 70% of the loci.
Second, I need to remove all loci where more than 30% of the genotypes are NaN or no calls.

I can't figure out how to target the "NaN" values or the "0000" genotype for removal. For the sake of documentation, I'd also like to know how many loci and individuals were removed under the above criteria. Obviously I can't inspect a summary of each of the 34,000 SNPs one by one, so is there a way to produce a summary of these filtering steps?
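
To make the two criteria concrete, here is a minimal sketch of the filtering I have in mind, written in plain Python rather than R/adegenet. It assumes the genotypes sit in a table of strings (rows = individuals, columns = loci) with "NaN" and "0000" as the only missing codes. The function names `missing_rate` and `filter_calls` and the toy data are hypothetical, made up for illustration, and nothing here uses an adegenet function.

```python
# Illustrative sketch only: plain Python, not adegenet.
# Assumption: "NaN" and "0000" are the only missing-genotype codes.
MISSING = {"NaN", "0000"}


def missing_rate(genotypes):
    """Return the fraction of genotypes in a sequence that are missing."""
    if not genotypes:
        return 0.0
    return sum(1 for g in genotypes if g in MISSING) / len(genotypes)


def filter_calls(table, individuals, loci, max_missing=0.30):
    """Drop individuals, then loci, whose missing rate exceeds max_missing.

    A call rate below 70% is the same as a missing rate above 30%, so one
    threshold covers both criteria. Individuals are filtered first, and the
    per-locus rates are computed on the remaining individuals only.
    """
    keep_rows = [i for i, row in enumerate(table)
                 if missing_rate(row) <= max_missing]
    kept = [table[i] for i in keep_rows]

    keep_cols = [j for j in range(len(loci))
                 if missing_rate([row[j] for row in kept]) <= max_missing]

    return {
        "table": [[row[j] for j in keep_cols] for row in kept],
        "individuals": [individuals[i] for i in keep_rows],
        "loci": [loci[j] for j in keep_cols],
        "removed_individuals": [name for i, name in enumerate(individuals)
                                if i not in keep_rows],
        "removed_loci": [name for j, name in enumerate(loci)
                         if j not in keep_cols],
    }


# Toy data (made up): 4 individuals x 5 loci.
toy_individuals = ["ind1", "ind2", "ind3", "ind4"]
toy_loci = ["L1", "L2", "L3", "L4", "L5"]
toy_table = [
    ["0101", "0102", "0202", "0101", "0000"],
    ["0000", "NaN",  "0202", "0000", "0000"],
    ["0101", "0102", "0202", "0102", "0000"],
    ["0102", "0101", "0101", "NaN",  "0101"],
]

result = filter_calls(toy_table, toy_individuals, toy_loci)
print("individuals removed:", len(result["removed_individuals"]),
      result["removed_individuals"])
print("loci removed:", len(result["removed_loci"]), result["removed_loci"])
```

The two printed lines are the kind of summary I am after: a count and a list of names for what was dropped, rather than a per-SNP report. Note that the order matters in this sketch, since removing poorly called individuals first changes the per-locus missing rates.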

Thanks in advance.

Greg Puncher, PhD Student
Molecular Genetics for Environmental & Fishery Resources Laboratory (GenMAP)
University of Bologna
Via S. Alberto 163, 48123 Ravenna (Italy)
Ph: 39(0)544/937311  Fax: 39(0)544937411

