[adegenet-forum] SNP positional information

Jombart, Thibaut t.jombart at imperial.ac.uk
Wed Apr 14 18:32:55 CEST 2010

Dear Joanne, 

I still have to learn how to use pegas, so I can only reply to you about adegenet's uses.

You should be able to convert your data.frame into a genind object using df2genind. See ?df2genind, or section 3.1 of the adegenet tutorial, which you can find on the website (http://adegenet.r-forge.r-project.org/, section Documents), or using the R command (while connected to the web):

Then, you can store the position of the SNPs as the names of the loci; if 'foo' is your genind object containing SNPs:
foo$loc.names <- as.character(myPositions)

where 'myPositions' is a vector of numbers indicating the position of your SNPs, from the first to the last column of your original data.frame.
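
As a minimal sketch of the two steps above (conversion with df2genind, then storing positions as locus names): the toy data.frame 'snps', the population labels and the positions in 'myPositions' are made up for illustration and are not from the original data. The sketch uses the locNames() accessor, which is an assumption about the installed version: recent adegenet releases provide it, and it may be needed there in place of assigning to foo$loc.names directly.

```r
library(adegenet)

# Hypothetical toy data: 4 individuals (rows), 3 SNPs (columns),
# genotypes coded as "A/A", as described in the question.
snps <- data.frame(
  snp1 = c("A/A", "A/G", "G/G", "A/G"),
  snp2 = c("C/C", "C/T", "C/C", "T/T"),
  snp3 = c("G/G", "G/G", "G/T", "T/T"),
  stringsAsFactors = FALSE
)
rownames(snps) <- paste0("ind", 1:4)

# Hypothetical population membership, one label per individual.
pops <- factor(c("pop1", "pop1", "pop2", "pop2"))

# Convert to a genind object; sep = "/" tells df2genind how alleles are separated.
foo <- df2genind(snps, sep = "/", ploidy = 2, pop = pops)

# Hypothetical SNP positions, in the same order as the columns of 'snps'.
myPositions <- c(1021, 1345, 2210)

# Store the positions as the names of the loci.
locNames(foo) <- as.character(myPositions)

locNames(foo)
```

The positions are stored as character strings, so as.numeric(locNames(foo)) recovers them as numbers when needed.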

Regarding Tajima's D, it is not implemented in adegenet, but there's a 'tajima.test' function in pegas according to:
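
For orientation only, here is a minimal sketch of how tajima.test is called. It relies on the 'woodmouse' example alignment shipped with ape, not on the SNP data discussed here. As far as I can tell from the pegas documentation, tajima.test expects aligned DNA sequences as a "DNAbin" object, so a data.frame of SNP genotypes would first have to be turned into sequences; that step is not shown.

```r
library(ape)
library(pegas)

# Example alignment of mitochondrial sequences distributed with ape (class "DNAbin").
data(woodmouse)

# Tajima's D with its p-values under the normal and beta approximations.
res <- tajima.test(woodmouse)

res$D
res$Pval.normal
res$Pval.beta
```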

Do not hesitate to ask if you have further questions, or if some of this is cryptic.

Best regards

From: adegenet-forum-bounces at lists.r-forge.r-project.org [adegenet-forum-bounces at lists.r-forge.r-project.org] On Behalf Of Joanne Berghout, Miss [joanne.berghout at mail.mcgill.ca]
Sent: 14 April 2010 00:22
To: adegenet-forum at lists.r-forge.r-project.org
Subject: [adegenet-forum] SNP positional information

Hello adegenet users,

Is there a way that I can assign genetic positions to my SNPs?

I've just downloaded the ape, adegenet and pegas packages in R and I'm trying to analyze a data set of 400 individuals (13 populations) for 78 SNPs.

The data was collected by sequencing a single gene across each of the exons and looking for polymorphic sites both manually (chromatogram inspection) and using PhredPhrap.  I've been able to upload the data as a data frame (using read.loci) where each row is an individual and each column is a SNP (alleles coded as A/A).  I would like to look at the haplotype differences in the different populations, as well as do some neutrality tests (specifically Tajima's D) and a few other standard descriptive statistics.

I am trying to avoid creating FASTA files of the whole sequence for each of my individuals as I have multiple non-overlapping amplicons for each individual which (unless you also have a suggestion here) means that I would have to manually combine thousands of sequence reads (and edit for accuracy as PhredPhrap miscalled or just missed on quite a number of SNPs).


PS. I'm a pretty new R user, though I have a fair bit of experience with the R/qtl package as all my work so far has been on very straightforward inbred mouse crosses.  So, please, I would appreciate a little simplicity and explanation...