[adegenet-forum] randomize pop labels in a genind object for randomization experiment.

Jombart, Thibaut t.jombart at imperial.ac.uk
Thu Sep 18 11:41:12 CEST 2014


Hi there, 

no need to recode everything: what you describe is cross-validation, and it is implemented in adegenet. See ?xvalDapc
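
For readers who want to see the call, a minimal sketch of xvalDapc follows. It is an illustration only, not taken from this thread: it assumes adegenet >= 2.0 (for tab()), uses the bundled sim2pop data set as a stand-in for the real data, and the values of n.pca.max, training.set and n.rep are arbitrary choices.

```r
library(adegenet)

## Stand-in data (assumption): sim2pop ships with adegenet and has 2 populations
data(sim2pop)
x   <- sim2pop
mat <- tab(x, NA.method = "mean")  # allele table, NAs replaced by the mean
grp <- pop(x)                      # a priori population labels

set.seed(1)
## Train DAPC on 90% of individuals and predict the remaining 10%,
## repeated n.rep times for each number of retained PCs
xval <- xvalDapc(mat, grp,
                 n.pca.max    = 20,
                 training.set = 0.9,
                 result       = "groupMean",
                 n.rep        = 30,
                 xval.plot    = FALSE)

names(xval)  # summaries of assignment success per number of PCs
xval$DAPC    # DAPC refitted with the selected number of PCs
```

The number of retained PCs is chosen by how well held-out individuals are reassigned to their groups, which guards against overfitting when there are many more loci than individuals.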

Cheers

Thibaut


________________________________________
From: Caroline Judy [caroline.duffie at gmail.com]
Sent: 16 September 2014 21:45
To: Jombart, Thibaut
Cc: Vikram Chhatre; adegenet-forum at lists.r-forge.r-project.org
Subject: [adegenet-forum] randomize pop labels in a genind object for randomization experiment.

Hi Thibaut, Vikram, and others:

I'd like to try a randomization experiment to further explore my RADseq data using DAPC.

Data structure:
40 individuals in 2 (a priori) populations
6451 SNP loci

My data are for two very closely related "species" which show little to no divergence at traditional markers. I performed a DAPC using a priori pop definitions (set as species). The function can discriminate my species, but the allelic contributions are very low (the highest few are around 0.0015).

I am interested in trying a randomization experiment in which I shuffle the population labels 100 times and then perform DAPC on each of these. Ultimately the goal is to compare allelic loadings for the discriminant function generated using true labels vs. randomized labels.

I am fairly new to R. A colleague suggested the general format to create a loop, but could anyone offer a solution that could be implemented with a genind object? Otherwise, I think it would be too labor intensive - I would have to create 100 different structure input files to be converted to genind objects.

nrep <- 100
results <- list()  # or vector/matrix, depending on the case
for (i in 1:nrep)
{
  ## 'labels' stands for the vector of population labels
  rand.labels <- sample(labels)
  ## do some analyses and assign relevant results to results[[i]]
}
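
A sketch of how such a loop might look when applied directly to a genind object, so that no extra input files are needed. This is an assumption-laden illustration: sim2pop (bundled with adegenet) stands in for the real data, n.pca and n.da are fixed arbitrarily so that dapc() does not prompt, and the largest allele contribution on the first discriminant function is just one possible summary to compare.

```r
library(adegenet)

data(sim2pop)            # stand-in for the real genind object (assumption)
x <- sim2pop

n.pca <- 10              # arbitrary; fixed so dapc() runs non-interactively
n.da  <- 1               # two groups give a single discriminant function

## DAPC with the true labels
obs     <- dapc(x, pop(x), n.pca = n.pca, n.da = n.da, var.contrib = TRUE)
obs.max <- max(obs$var.contr[, 1])

## DAPC with shuffled labels; the genotypes themselves are left untouched
set.seed(1)
nrep     <- 100
perm.max <- numeric(nrep)
for (i in seq_len(nrep)) {
  rand.pop    <- sample(pop(x))   # permute the population labels
  perm        <- dapc(x, rand.pop, n.pca = n.pca, n.da = n.da,
                      var.contrib = TRUE)
  perm.max[i] <- max(perm$var.contr[, 1])
}

## Share of permutations whose top contribution reaches the observed one
mean(perm.max >= obs.max)
```

Passing the shuffled factor through the pop argument of dapc() leaves the genind object unchanged, so the same object serves every replicate.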

Thanks,
Caroline


On Sun, Sep 14, 2014 at 3:45 PM, Jombart, Thibaut <t.jombart at imperial.ac.uk> wrote:

Yes, you need to use:
?genind2hierfstat
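
A possible way to put this together for per-locus pairwise Fst is sketched below. It is not from the thread: it assumes a recent hierfstat, uses the nancycats data set from adegenet as a stand-in, and swaps pp.fst for hierfstat's wc(), whose per.loc element holds one Weir and Cockerham Fst value per locus. Depending on the versions installed, genind2hierfstat is provided by adegenet or by hierfstat; loading both covers either case.

```r
library(adegenet)
library(hierfstat)

data(nancycats)                      # stand-in genind object (assumption)
hf <- genind2hierfstat(nancycats)    # data frame: column 1 = pop, others = loci
hf[, 1] <- as.integer(hf[, 1])       # numeric population codes

pops  <- sort(unique(hf[, 1]))
pairs <- t(combn(pops, 2))           # one row per pair of populations
loci  <- names(hf)[-1]

## For each pair, keep only its individuals and take the per-locus Fst
fst <- t(apply(pairs, 1, function(p) {
  sub <- hf[hf[, 1] %in% p, ]
  wc(sub, diploid = TRUE)$per.loc$FST
}))
dimnames(fst) <- list(paste(pairs[, 1], pairs[, 2], sep = "-"), loci)

dim(fst)                             # pairs in rows, loci in columns
```

Each call to wc() handles all loci for one pair at once, so the number of calls grows with the number of population pairs rather than with the number of loci.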

Cheers
Thibaut

________________________________________
From: Vikram Chhatre [crypticlineage at gmail.com]
Sent: 13 September 2014 21:48
To: adegenet-forum at lists.r-forge.r-project.org; Jombart, Thibaut
Subject: Re: [adegenet-forum] Per locus pairwise Fst

Thank you for all the replies.  I have been looking at the pp.fst() function in the hierfstat package.  Does the list returned by seploc need to be converted into something that hierfstat understands first?  The following doesn't seem to work:

# Use seploc to separate loci:
gen100_seploc <- seploc(gen100_genind, truenames=TRUE, res.type='genind')

# Load Hierfstat
library(hierfstat)

# Calculate pairwise Fst:
gen100_perLocusPWFst <- lapply(gen100_seploc, pp.fst, diploid=TRUE)

Error in unique.default(Pop) : unique() applies only to vectors

On Sat, Sep 13, 2014 at 2:20 PM, Jombart, Thibaut <t.jombart at imperial.ac.uk> wrote:

Hi there,

yes, this function is not optimized for large datasets. You can use the same approach but using functions from the hierfstat package.

Cheers
Thibaut
________________________________________
From: adegenet-forum-bounces at lists.r-forge.r-project.org [adegenet-forum-bounces at lists.r-forge.r-project.org] on behalf of Vikram Chhatre [crypticlineage at gmail.com]
Sent: 12 September 2014 18:31
To: adegenet-forum at lists.r-forge.r-project.org
Subject: Re: [adegenet-forum] Per locus pairwise Fst

I am revisiting this topic due to some technical problems.

The task at hand is to estimate pairwise Fst matrices for each locus separately.

# Genind object is stored in:
gen100_genind

# Use seploc to separate loci:
gen100_seploc <- seploc(gen100_genind, truenames=TRUE, res.type='genind')

# Calculate pairwise Fst:
gen100_perLocusPWFst <- lapply(gen100_seploc, pairwise.fst, res.type='dist', truenames=TRUE)

For a data set consisting of 30 populations, 20 individuals each, 1000 loci and 2 alleles per locus (1.2 million data points), it takes up to 6 hours to estimate the pairwise Fst matrix with this method.

Is there any way to speed this up?  Should I look into any other packages?

Many thanks for your time and help.
Vikram




On Mon, Jul 14, 2014 at 9:16 AM, Vikram Chhatre <crypticlineage at gmail.com> wrote:
Perfect!  Thank you for both solutions.

V


On Mon, Jul 14, 2014 at 9:13 AM, Jombart, Thibaut <t.jombart at imperial.ac.uk> wrote:

Hi there,

you can use seploc to separate loci, and lapply over the resulting list using your preferred Fst function.

Cheers
Thibaut
________________________________________
From: adegenet-forum-bounces at lists.r-forge.r-project.org [adegenet-forum-bounces at lists.r-forge.r-project.org] on behalf of Vikram Chhatre [crypticlineage at gmail.com]
Sent: 14 July 2014 14:01
To: adegenet-forum at lists.r-forge.r-project.org
Subject: [adegenet-forum] Per locus pairwise Fst

Good morning.

I would like to estimate per-locus pairwise Fst for populations, but it appears that adegenet only estimates this over all loci (i.e. a single matrix).  What I would like is one matrix per locus.  Has anyone modified the functions, or does anyone know of alternative programs that can do this?

Thanks
Vikram




_______________________________________________
adegenet-forum mailing list
adegenet-forum at lists.r-forge.r-project.org
https://lists.r-forge.r-project.org/cgi-bin/mailman/listinfo/adegenet-forum

