[adegenet-forum] randomize pop labels in a genind object for randomization experiment.

Caroline Judy caroline.duffie at gmail.com
Thu Sep 18 16:20:47 CEST 2014


I've been working through the tutorial again, and I now see (and
understand) the randomization step that is part of cross validation. I'm so
glad Adegenet has this formalized test. Thanks so much.

C

On Thu, Sep 18, 2014 at 5:41 AM, Jombart, Thibaut <t.jombart at imperial.ac.uk>
wrote:

>
> Hi there,
>
> no need to recode everything: what you describe is cross-validation, and
> it is implemented in adegenet. See ?xvalDapc
>
> Cheers
>
> Thibaut
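
For reference, a minimal sketch of what a cross-validation call might look like. It assumes adegenet >= 2.0 (for tab()) and uses the sim2pop example data bundled with the package, not the RADseq data discussed in this thread. The n.pca.max and n.rep values are arbitrary choices for illustration.

```r
library(adegenet)

# Example data shipped with adegenet: two simulated populations
data(sim2pop)

# Allele table with any missing values replaced by the allele mean
x <- tab(sim2pop, NA.method = "mean")
grp <- pop(sim2pop)

set.seed(1)  # training sets are drawn at random

# Cross-validate DAPC over a range of retained PCs (plot suppressed)
xval <- xvalDapc(x, grp, n.pca.max = 50, n.rep = 10, xval.plot = FALSE)

# Components of the result, and the DAPC refitted with the selected PCs
names(xval)
xval$DAPC
```

The returned list reports assignment success for each number of retained PCs, so it may be a reasonable first stop before writing a custom loop.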
>
>
> ________________________________________
> From: Caroline Judy [caroline.duffie at gmail.com]
> Sent: 16 September 2014 21:45
> To: Jombart, Thibaut
> Cc: Vikram Chhatre; adegenet-forum at lists.r-forge.r-project.org
> Subject: [adegenet-forum] randomize pop labels in a genind object for
> randomization experiment.
>
> Hi Thibaut, Vikram, and others:
>
> I'd like to try a randomization experiment to further explore my RADseq
> data using DAPC.
>
> Data structure:
> 40 individuals in 2 (a priori) populations
> 6451 SNP loci
>
> My data are for two very closely related "species" which show little to no
> divergence at traditional markers. I performed a DAPC using a priori pop
> definitions (set as species). The function can discriminate my species, but
> the allelic contributions are very low (the highest few are around 0.0015).
>
> I am interested in trying a randomization experiment in which I shuffle
> the population labels 100 times and then perform DAPC on each of these.
> Ultimately the goal is to compare allelic loadings for the discriminant
> function generated using true labels vs. randomized labels.
>
> I am fairly new to R. A colleague suggested the general format to create a
> loop, but could anyone offer a solution that could be implemented with a
> genind object? Otherwise, I think it would be too labor intensive - I would
> have to create 100 different structure input files to be converted to
> genind objects.
>
> nrep <- 100
> results <- vector("list", nrep)  # or a vector/matrix, depending on the case
> for (i in seq_len(nrep))
> {
>   # 'labels' stands for the vector of population labels to shuffle
>   rand.labels <- sample(labels)
>   ## do some analyses and assign relevant results to results[[i]]
> }
>
> Thanks,
> Caroline
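
A minimal sketch of how the label shuffling described above could be done directly on a genind object, without writing new input files. It assumes the sim2pop example data from adegenet as a stand-in for the real data; the choice of n_pca, and the use of the largest allele contribution as the summary statistic, are illustrative assumptions rather than a recommended design.

```r
library(adegenet)

data(sim2pop)          # stand-in genind object with two populations
x <- sim2pop

n_pca <- 20            # assumed number of retained PCs (arbitrary here)
n_da  <- 1             # two groups give a single discriminant function

# DAPC with the true labels; var.contr holds the allele contributions
obs <- dapc(x, pop(x), n.pca = n_pca, n.da = n_da)
obs_max <- max(obs$var.contr[, 1])

set.seed(1)
nrep <- 100
null_max <- numeric(nrep)

for (i in seq_len(nrep)) {
  shuffled <- sample(pop(x))   # permute labels, group sizes unchanged
  perm <- dapc(x, shuffled, n.pca = n_pca, n.da = n_da)
  null_max[i] <- max(perm$var.contr[, 1])
}

# Share of permutations whose largest contribution reaches the observed one
p_like <- mean(null_max >= obs_max)
p_like
```

Passing the shuffled factor as the second argument leaves the genind object itself untouched, so the same object can be reused for every replicate.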
>
>
> On Sun, Sep 14, 2014 at 3:45 PM, Jombart, Thibaut <t.jombart at imperial.ac.uk>
> wrote:
>
> Yes, you need to use:
> ?genind2hierfstat
>
> Cheers
> Thibaut
>
> ________________________________________
> From: Vikram Chhatre [crypticlineage at gmail.com]
> Sent: 13 September 2014 21:48
> To: adegenet-forum at lists.r-forge.r-project.org; Jombart, Thibaut
> Subject: Re: [adegenet-forum] Per locus pairwise Fst
>
> Thank you for all the replies.  I have been looking at the pp.fst()
> function in the Hierfstat package.  Does the post-seploc data frame need to
> be converted into something that Hierfstat understands first?  The
> following doesn't seem to work:
>
> # Use seploc to separate loci:
> gen100_seploc <- seploc(gen100_genind, truenames=TRUE,
>                         res.type='genind')
>
> # Load Hierfstat
> library(hierfstat)
>
> # Calculate pairwise Fst:
> gen100_perLocusPWFst <- lapply(gen100_seploc, pp.fst, diploid=TRUE)
>
> Error in unique.default(Pop) : unique() applies only to vectors
>
> On Sat, Sep 13, 2014 at 2:20 PM, Jombart, Thibaut <t.jombart at imperial.ac.uk>
> wrote:
>
> Hi there,
>
> yes, this function is not optimized for large datasets. You can use the
> same approach but using functions from the hierfstat package.
>
> Cheers
> Thibaut
> ________________________________________
> From: adegenet-forum-bounces at lists.r-forge.r-project.org
> [adegenet-forum-bounces at lists.r-forge.r-project.org] on behalf of Vikram
> Chhatre [crypticlineage at gmail.com]
> Sent: 12 September 2014 18:31
> To: adegenet-forum at lists.r-forge.r-project.org
> Subject: Re: [adegenet-forum] Per locus pairwise Fst
>
> I am revisiting this topic due to some technical problems.
>
> The task at hand is to estimate pairwise Fst matrices for each locus
> separately.
>
> # Genind object is stored in:
> gen100_genind
>
> # Use seploc to separate loci:
> gen100_seploc <- seploc(gen100_genind, truenames=TRUE,
>                         res.type='genind')
>
> # Calculate pairwise Fst:
> gen100_perLocusPWFst <- lapply(gen100_seploc, pairwise.fst,
>                                res.type='matrix', truenames=TRUE)
>
> For a data set consisting of 30 populations, 20 individuals each, 1000
> loci and 2 alleles per locus (1.2 million data points), it takes up to 6
> hours to estimate the pairwise Fst matrix with this method.
>
> Is there any way to speed this up?  Should I look into any other packages?
>
> Many thanks for your time and help.
> Vikram
>
>
>
>
> On Mon, Jul 14, 2014 at 9:16 AM, Vikram Chhatre <crypticlineage at gmail.com>
> wrote:
> Perfect!  Thank you for both solutions.
>
> V
>
>
> On Mon, Jul 14, 2014 at 9:13 AM, Jombart, Thibaut <t.jombart at imperial.ac.uk>
> wrote:
>
> Hi there,
>
> you can use seploc to separate loci, and lapply over the resulting list
> using your preferred fst function.
>
> Cheers
> Thibaut
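
A minimal sketch of the split-and-apply pattern described above, assuming adegenet >= 2.0 and the bundled nancycats data. Here nAll() is only a stand-in for the per-locus function; an Fst function that accepts a single-locus genind object (or a converted data frame) would take its place.

```r
library(adegenet)

data(nancycats)              # example genind object with 9 loci

# One genind object per locus, returned as a named list
loci <- seploc(nancycats)

# Apply a function to each locus; nAll() is a placeholder statistic
per_locus <- lapply(loci, nAll)

names(per_locus)             # list elements are named after the loci
```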
> ________________________________________
> From: adegenet-forum-bounces at lists.r-forge.r-project.org
> [adegenet-forum-bounces at lists.r-forge.r-project.org] on behalf of
> Vikram Chhatre [crypticlineage at gmail.com]
> Sent: 14 July 2014 14:01
> To: adegenet-forum at lists.r-forge.r-project.org
> Subject: [adegenet-forum] Per locus pairwise Fst
>
> Good morning.
>
> I would like to estimate per locus pairwise Fst for populations, but it
> appears that Adegenet only estimates this over all loci (i.e. single
> matrix).  What I would like is one matrix per locus.  Has anyone modified
> the functions or know of alternative programs that can do this?
>
> Thanks
> Vikram
>
>
>
>
> _______________________________________________
> adegenet-forum mailing list
> adegenet-forum at lists.r-forge.r-project.org
> https://lists.r-forge.r-project.org/cgi-bin/mailman/listinfo/adegenet-forum
>