From takele_taye at yahoo.com Fri Feb 8 13:49:02 2013
From: takele_taye at yahoo.com (takele taye)
Date: Fri, 8 Feb 2013 04:49:02 -0800 (PST)
Subject: [adegenet-forum] assigning pop in genind object
Message-ID: <1360327742.57742.YahooMailClassic@web163602.mail.gq1.yahoo.com>
Dears
I converted my hj.str data into a genind object using the read.structure function.
hj <- read.structure(file="hj.str", n.ind=747, n.loc=168344, onerowperind=TRUE, col.lab=1, col.pop=2, NA.char=-9, ask=FALSE)
My data contains two populations (represented by 1 & 2 in the pop column of the hj.str data); the data was not sorted by population. How can I define the population of each individual in the genind object hj@pop?
I tried this one
pop_1 <- hj@pop==1
pop_2 <- hj@pop==2
However, this has clumped the entire population just into two dots instead of representing each individual in the PCA plot.
Any help is appreciated
Takele
From gabriel.terraz at univ-lyon1.fr Fri Feb 8 14:01:06 2013
From: gabriel.terraz at univ-lyon1.fr (Gabriel Terraz)
Date: Fri, 08 Feb 2013 14:01:06 +0100
Subject: [adegenet-forum] error "Miss-formed strings in replacement"
In-Reply-To: <2CB2DA8E426F3541AB1907F98ABA657057A28FCC@icexch-m1.ic.ac.uk>
References: <2CB2DA8E426F3541AB1907F98ABA657057A28FCC@icexch-m1.ic.ac.uk>
Message-ID: <5114F712.9000109@univ-lyon1.fr>
Hello,
I am encountering a problem with the fasta2genlight function:
Here is the error it gives me:
Error in `alleles<-`(`*tmp*`, value = list()) :
Miss-formed strings in replacement (must be e.g. 'c/g')
It seems that the error is due to the absence of SNPs in the data file.
Has anyone already encountered this error?
Thanks for your help,
Gabriel
From t.jombart at imperial.ac.uk Fri Feb 8 14:58:43 2013
From: t.jombart at imperial.ac.uk (Jombart, Thibaut)
Date: Fri, 8 Feb 2013 13:58:43 +0000
Subject: [adegenet-forum] assigning pop in genind object
In-Reply-To: <1360327742.57742.YahooMailClassic@web163602.mail.gq1.yahoo.com>
References: <1360327742.57742.YahooMailClassic@web163602.mail.gq1.yahoo.com>
Message-ID: <2CB2DA8E426F3541AB1907F98ABA657057A29E7F@icexch-m1.ic.ac.uk>
Hello,
please have a look at the documentation, especially the vignette on basics (vignette("adegenet-basics")).
You want to use :
pop(hj) <- ...
Cheers
Thibaut
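A minimal sketch of the pop(hj) <- ... replacement, using toy data rather than the poster's hj.str file. The objects geno and x and the population labels are made up for illustration, and the adegenet package is assumed to be installed. Note that an expression such as hj@pop == 1 only returns a logical vector; it does not define populations.

```r
library(adegenet)

# Toy genotypes: 4 diploid individuals, 2 loci, alleles separated by "/".
geno <- data.frame(
  loc1 = c("1/1", "1/2", "2/2", "1/2"),
  loc2 = c("1/2", "2/2", "1/1", "1/1"),
  stringsAsFactors = FALSE
)
rownames(geno) <- paste0("ind", 1:4)

x <- df2genind(geno, sep = "/", ploidy = 2)

# One label per individual, in the same order as the individuals;
# the data do not need to be sorted by population.
pop(x) <- factor(c(1, 2, 2, 1))

table(pop(x))        # number of individuals per population
which(pop(x) == 1)   # indices of the individuals in population 1
```

The accessor pop(x) can then be used wherever a grouping factor is needed, for instance to colour individuals in a PCA plot.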
From t.jombart at imperial.ac.uk Fri Feb 8 15:00:27 2013
From: t.jombart at imperial.ac.uk (Jombart, Thibaut)
Date: Fri, 8 Feb 2013 14:00:27 +0000
Subject: [adegenet-forum] error "Miss-formed strings in replacement"
In-Reply-To: <5114F712.9000109@univ-lyon1.fr>
References: <2CB2DA8E426F3541AB1907F98ABA657057A28FCC@icexch-m1.ic.ac.uk>,
<5114F712.9000109@univ-lyon1.fr>
Message-ID: <2CB2DA8E426F3541AB1907F98ABA657057A29E92@icexch-m1.ic.ac.uk>
Hello,
can you post a (small) toy dataset to reproduce the error?
Cheers
Thibaut
_______________________________________________
adegenet-forum mailing list
adegenet-forum at lists.r-forge.r-project.org
https://lists.r-forge.r-project.org/cgi-bin/mailman/listinfo/adegenet-forum
From gabriel.terraz at univ-lyon1.fr Fri Feb 8 15:51:13 2013
From: gabriel.terraz at univ-lyon1.fr (Gabriel Terraz)
Date: Fri, 08 Feb 2013 15:51:13 +0100
Subject: [adegenet-forum] error "Miss-formed strings in replacement"
In-Reply-To: <2CB2DA8E426F3541AB1907F98ABA657057A29E92@icexch-m1.ic.ac.uk>
References: <2CB2DA8E426F3541AB1907F98ABA657057A28FCC@icexch-m1.ic.ac.uk>,
<5114F712.9000109@univ-lyon1.fr>
<2CB2DA8E426F3541AB1907F98ABA657057A29E92@icexch-m1.ic.ac.uk>
Message-ID: <511510E1.1030206@univ-lyon1.fr>
Here is a dataset (attached file):
(My whole dataset consists of numerous files, each containing a small dataset like this one)
>ana
CAGGTGACG-CAATTTTACTGTAATTTGTTTGGCCGCACGTAC---TTGGAGGCCT-GACATGGGGCAATGTCAGCTCGTTTGTGCATGCTCAG-------
>ere
CAGGTGACG-CAATTTTACTGTAATTTGTTTGGCCGCACGTAC---TTGGAGGCCT-GACATGGGGCAATGTCAGCTCGTTTGTGCATGCTCAG-------
>sec
CAGGTGACG-CAATTTTACTGTAATTTGTTTGGCCGCACGTAC---TTGGAGGCCT-GACATGGGGCAATGTCAGCTCGTTTGTGCATGCTCAG-------
>vil
CAGGTGACG-CAATTTTACTGTAATTTGTTTGGCCGCACGTAC---TTGGAGGCCT-GACATGGGGCAATGTCAGCTCGTTTGTGCATGCTCAG-------
Thanks a lot
Gabriel Terraz -Doctorant-
Tel: +33(0)4 72 43 29 08
Laboratoire de Biométrie et Biologie Evolutive, UMR CNRS 5558
Batiment Mendel
Université Claude Bernard - Lyon 1
43, Bd du 11 novembre 1918
69622 Villeurbanne
From t.jombart at imperial.ac.uk Fri Feb 8 18:55:03 2013
From: t.jombart at imperial.ac.uk (Jombart, Thibaut)
Date: Fri, 8 Feb 2013 17:55:03 +0000
Subject: [adegenet-forum] error "Miss-formed strings in replacement"
In-Reply-To: <511510E1.1030206@univ-lyon1.fr>
References: <2CB2DA8E426F3541AB1907F98ABA657057A28FCC@icexch-m1.ic.ac.uk>,
<5114F712.9000109@univ-lyon1.fr>
<2CB2DA8E426F3541AB1907F98ABA657057A29E92@icexch-m1.ic.ac.uk>,
<511510E1.1030206@univ-lyon1.fr>
Message-ID: <2CB2DA8E426F3541AB1907F98ABA657057A29EF2@icexch-m1.ic.ac.uk>
Hello,
yes indeed, this is a bug, the function does not expect entirely non-typed loci.
If RAM is not a constraint (i.e. your dataset is small), you don't have to use genlight. You can use the DNAbin format instead; to read the data in:
dna <- fasta2DNAbin("sequ.fa")
Cheers
Thibaut
From t.jombart at imperial.ac.uk Fri Feb 8 19:02:11 2013
From: t.jombart at imperial.ac.uk (Jombart, Thibaut)
Date: Fri, 8 Feb 2013 18:02:11 +0000
Subject: [adegenet-forum] error "Miss-formed strings in replacement"
In-Reply-To: <511510E1.1030206@univ-lyon1.fr>
References: <2CB2DA8E426F3541AB1907F98ABA657057A28FCC@icexch-m1.ic.ac.uk>,
<5114F712.9000109@univ-lyon1.fr>
<2CB2DA8E426F3541AB1907F98ABA657057A29E92@icexch-m1.ic.ac.uk>,
<511510E1.1030206@univ-lyon1.fr>
Message-ID: <2CB2DA8E426F3541AB1907F98ABA657057A29F02@icexch-m1.ic.ac.uk>
My bad, this is not a bug as such.
genlight is meant to store SNPs. All your sequences are identical.
Cheers
Thibaut
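A quick way to check a file before converting it to genlight is to count the polymorphic alignment positions. The sketch below uses base R only; the helper count_polymorphic_sites and the two toy alignments are hypothetical and are not part of adegenet. Gaps and missing characters are ignored when counting alleles.

```r
# Count alignment positions showing more than one allele.
# `seqs` is a character vector of aligned sequences of equal length.
count_polymorphic_sites <- function(seqs, ignore = c("-", "n", "?")) {
  stopifnot(length(unique(nchar(seqs))) == 1)  # aligned sequences only
  # One row per sequence, one column per alignment position.
  aln <- do.call(rbind, strsplit(tolower(seqs), ""))
  n_alleles <- apply(aln, 2, function(site) {
    length(setdiff(unique(site), ignore))
  })
  sum(n_alleles > 1)
}

identical_aln <- c(ana = "CAGG-TGA", ere = "CAGG-TGA", sec = "CAGG-TGA")
variable_aln  <- c(ana = "CAGG-TGA", ere = "CAGC-TGA", sec = "CTGG-TGA")

count_polymorphic_sites(identical_aln)  # identical sequences: no SNP to store
count_polymorphic_sites(variable_aln)   # positions 2 and 4 differ
```

Files for which the count is zero contain no SNPs, so there is nothing for a genlight object to hold.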
From t.jombart at imperial.ac.uk Fri Feb 8 19:12:31 2013
From: t.jombart at imperial.ac.uk (Jombart, Thibaut)
Date: Fri, 8 Feb 2013 18:12:31 +0000
Subject: [adegenet-forum] error "Miss-formed strings in replacement"
In-Reply-To: <2CB2DA8E426F3541AB1907F98ABA657057A29F02@icexch-m1.ic.ac.uk>
References: <2CB2DA8E426F3541AB1907F98ABA657057A28FCC@icexch-m1.ic.ac.uk>,
<5114F712.9000109@univ-lyon1.fr>
<2CB2DA8E426F3541AB1907F98ABA657057A29E92@icexch-m1.ic.ac.uk>,
<511510E1.1030206@univ-lyon1.fr>,
<2CB2DA8E426F3541AB1907F98ABA657057A29F02@icexch-m1.ic.ac.uk>
Message-ID: <2CB2DA8E426F3541AB1907F98ABA657057A29F18@icexch-m1.ic.ac.uk>
This issue is now fixed in the development version. The patch is available at the address below:
https://r-forge.r-project.org/scm/viewvc.php/*checkout*/pkg/R/import.R?root=adegenet
Cheers
Thibaut
From mirainoshojo at gmail.com Sat Feb 9 19:35:28 2013
From: mirainoshojo at gmail.com (Valeria Montano)
Date: Sat, 9 Feb 2013 19:35:28 +0100
Subject: [adegenet-forum] Interpreting results of sPCA
In-Reply-To:
References:
Message-ID:
Hi Kelvin,
sorry for a reaction about as prompt as that of a stone in a drunken
stupor.
I guess you have probably moved a bit further on the interpretation of your
results by now. Anyhow, I can try to tell you something
useful (? - who knows)
> Here are my questions:
> 1. For the sPCA based on spatial (not depth) coordinates, the barplot of
> eigenvalues shows the typical pattern of PC3 > PC4 > PC5, but if you look
> at the screeplot (the graph of PC score variance vs spatial
> autocorrelation), PC5 accounts for a larger amount of variance than PC3 and
> 4. This seems contradictory to me. Does anyone have an explanation?
>
The eigenvalues in the sPCA have two components, the variance and the
spatial autocorrelation. If you type summary(yourspca)[[3]] you will see
the list of variance and Moran's I values for each of the eigenvalues. PC5 may
be correlated with genetic variables that have higher variance than the ones
contributing to PC3 and PC4 but that are not spatially ordered.
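To make the two components concrete, here is a small base-R sketch. It assumes the usual sPCA definition in which each eigenvalue is the product of the variance of the scores and their Moran's I, computed with row-standardised neighbour weights. The six sites, the two score vectors and the helper decompose are invented for illustration and are not taken from the coral data.

```r
# Six sites on a line; each site's neighbours are the adjacent sites.
n <- 6
W <- matrix(0, n, n)
for (i in seq_len(n - 1)) {
  W[i, i + 1] <- 1
  W[i + 1, i] <- 1
}
L <- W / rowSums(W)  # row-standardised weights: each row sums to 1

# Variance (1/n version), Moran's I and their product for a score vector.
decompose <- function(x, L) {
  x <- x - mean(x)
  variance <- sum(x^2) / length(x)
  moran <- sum(x * drop(L %*% x)) / sum(x^2)
  c(variance = variance, moran = moran, product = variance * moran)
}

smooth <- decompose(c(-3, -2, -1, 1, 2, 3), L)  # clinal, low variance
patchy <- decompose(c(4, 4, -4, -4, -4, 4), L)  # high variance, weak structure

rbind(smooth, patchy)
```

Here the patchy scores have the larger variance but the smaller product, which is the situation described for PC5 relative to PC3 and PC4.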
> 2. Next, to do more exploratory analyses, I wanted to see how robust these
> results were for different distance limits (d2) in constructing the
> connection network. I noticed that when I pick an arbitrary number, like
> d2=12 for the sPCA using spatial (not depth) coordinates, the spatial
> patchiness disappears and instead there now appears to be a cline. Because
> sPCA decomposes both genetic and spatial variance, is it possible for the
> spatial variance to swamp out the genetic variance, particularly if you
> define a connection network too arbitrarily? In other words, by defining
> d2=12, does the sPCA miss the finer scale spatial patchiness that was found
> when I defined my connection network with a more "sensible" d2?
>
In my personal experience with sPCA, if the spatial patterns are strong
they usually do not change substantially whatever graph is used; in
the worst case just a few points look a bit different. In your specific
case, choosing neighbours on the basis of the
positive spatial autocorrelation sounds a bit circular: you
might be forcing the method to highlight a pattern of positive spatial
autocorrelation that may not be driving the genetic distribution of your
sample. I would rather go for inverse distances, which are usually more
accurate. Btw, did you run the global and local tests? If they change from
significant to non-significant when you change the neighbouring method, I would not
think that there is a significant spatial pattern.
> 3. Clearly depth and space are autocorrelated with each other. Based on
> the partial mantel tests, both are significantly, but only weakly
> correlated with genetic relatedness. Are there any general guidelines for
> interpreting low Mantel r values? As I understand it, Mantel r is not the
> same as a correlation r, because Mantel tests are based on distances and
> not raw data. I've seen other studies commenting on how small Mantel r's
> are often reported, but so far, I have not come across any studies that
> report values as small as mine.
>
I've never seen such small Mantel test values either... In this case, when I
first read about this issue of 'controlling' depth for space I had two
different thoughts about it:
1. If you think about spatial proximity, being at a greater or lesser depth does
not mean being more or less close, clearly. Considering your result of a
spatial gradient from greater to lesser depth, this is likely highlighting an
adaptive pattern to depth, but maybe this is exactly the reason why you ran
the method on depth only.
2. If I wanted to see the effect of space and depth, I would probably use
the depth in combination with a simplified linear distance scheme (like
points on a line or a circle reproducing the spatial shape of the coral
reef) and build the spatial connection with it. In this case you would
analyse together the role of spatial distances (in 3-D) and the potential
role of adaptation, which is already disentangled in the spatial analysis
based on depth only.
End. Just to let you know I hate you a bit because you work in Hawaii.
Ciao
Valeria
On 30 January 2013 21:10, Kelvin Gorospe wrote:
> Hello all,
>
> I'd like to ask some input on interpreting some results. I have
> microsatellite genotypes, depth, and spatial coordinates for 2352 corals
> from a single coral reef. I ran partial mantel tests looking at the
> relationship between genetic relatedness and space (controlling for depth)
> as well as the relationship between genetic relatedness and depth
> (controlling for space) and found highly significant p values (p=0.001) but
> very small Mantel r values (0.008 for space and 0.01 for depth). So there
> is a small, but still significant relationship between genetics and space
> as well as genetics and depth on a very small scale (the reef covered an
> area of only about 1300m^2 with depths of between 1 and 4m).
>
> Next, I wanted to visualize these structures using sPCA. So first I
> constructed two connection networks: both neighbor by distance connections,
> but one based on depth measurements (0,z) and one based on spatial
> coordinates (x,y). The distance limit (d2) for each network was based on
> inspecting correlograms for genetics vs. depth and genetics vs. space and
> using the extent of positive autocorrelation as the upper limit (d2) for
> defining neighbors in each of the connection networks. After performing
> sPCA I then plot the PCs using the spatial (x,y) coordinates to visualize
> the spatial arrangement of genetic relatedness. The sPCA based on spatial
> coordinates show a patchy reef, groups of similar PC scores clumping
> together throughout the reef. The sPCA based on depth coordinates,
> however, show a depth cline, with corals in the center of the reef (the
> shallow part) having distinct PC scores from corals on the outer slopes of
> the reef (the deeper part).
>
> Here are my questions:
> 1. For the sPCA based on spatial (not depth) coordinates, the barplot of
> eigenvalues shows the typical pattern of PC3 > PC4 > PC5, but if you look
> at the screeplot (the graph of PC score variance vs spatial
> autocorrelation), PC5 accounts for a larger amount of variance than PC3 and
> 4. This seems contradictory to me. Does anyone have an explanation?
>
> 2. Next, to do more exploratory analyses, I wanted to see how robust these
> results were for different distance limits (d2) in constructing the
> connection network. I noticed that when I pick an arbitrary number, like
> d2=12 for the sPCA using spatial (not depth) coordinates, the spatial
> patchiness disappears and instead there now appears to be a cline. Because
> sPCA decomposes both genetic and spatial variance, is it possible for the
> spatial variance to swamp out the genetic variance, particularly if you
> define a connection network too arbitrarily? In other words, by defining
> d2=12, does the sPCA miss the finer scale spatial patchiness that was found
> when I defined my connection network with a more "sensible" d2?
>
> 3. Clearly depth and space are autocorrelated with each other. Based on
> the partial mantel tests, both are significantly, but only weakly
> correlated with genetic relatedness. Are there any general guidelines for
> interpreting low Mantel r values? As I understand it, Mantel r is not the
> same as a correlation r, because Mantel tests are based on distances and
> not raw data. I've seen other studies commenting on how small Mantel r's
> are often reported, but so far, I have not come across any studies that
> report values as small as mine.
>
> I've also tried to attach some graphs to this email, but I'm not sure if
> the list serve allows attachments. But hopefully my descriptions of my
> results were still good enough to get some feedback. Any input would be
> greatly appreciated! Thanks everyone!
>
>
From stefanomontanari at gmail.com Mon Feb 11 22:58:08 2013
From: stefanomontanari at gmail.com (Stefano Montanari)
Date: Tue, 12 Feb 2013 07:58:08 +1000
Subject: [adegenet-forum] compoplot, STRUCTURE,
and the analysis of a hybrid zone
Message-ID:
Dear Dr. Jombart,
I hope this email finds you well. We have exchanged thoughts before, and I
wish to thank you for having gotten back to me in the past.
I have been going through your latest vignette about dapc in adegenet (Nov
2012). I have used dapc on a butterflyfish hybrid zone in the past
(Montanari et al 2012, Ecology and Evolution), and now I am going through a
second dataset, and would like to compare the 2. Hence, I have a couple of
questions for you:
- am I correct in thinking that I want the same level of stability between
the 2 analyses if I am to compare the results? (eg, in both have retained
PCs = N/3)
- in your tutorial you mention that dapc$posterior used to construct
compoplot are not the same as structure admixture coefficients. Could you
point me in a direction that would allow me to understand how they are not?
I have run the results through structure and the hybrids show up nicely as
50/50 clustered with parent 1 and 2 (k=2). adegenet also reckons that k=2
should be the best, but the compoplot shows no membership misassignment
(even if the # of PCs is conservative). Do you have any suggestions as to
why?
Hoping to have been clear enough and not to have bored you senseless, I
look forward to hearing back from you.
Best regards,
Stef
--------------------------
Stefano R. Montanari
PhD Candidate
James Cook University
School of Marine and Tropical Biology
ATSIP (Building 145 James Cook Drive)
4811 Townsville QLD
stefanomontanari at gmail.com
Work: +61 7 4781 5441
Mob: +61 404 736 509
From t.jombart at imperial.ac.uk Tue Feb 12 16:12:40 2013
From: t.jombart at imperial.ac.uk (Jombart, Thibaut)
Date: Tue, 12 Feb 2013 15:12:40 +0000
Subject: [adegenet-forum] compoplot, STRUCTURE,
and the analysis of a hybrid zone
In-Reply-To:
References:
Message-ID: <2CB2DA8E426F3541AB1907F98ABA657057A2A2E9@icexch-m1.ic.ac.uk>
Hi Stefano,
thanks for reposting on the forum. It gives me the chance to clarify an important point.
For the first point, there is no linear relationship between the 'stability' of DAPC results and the number of PCs retained in the PCA step. 'xxx' PCs can represent 2% of the variance in one analysis and 60% in another. If the two data tables have fairly comparable dimensions, it would be best to retain roughly the same proportion of variance. If their dimensions are very different, then the same number of PCs makes sense.
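The point that a given number of PCs can stand for very different proportions of variance can be checked with a few lines of base R. This is only a sketch on simulated data: the two random matrices, the 80% target and the helper n_pcs_for_variance are assumptions made for illustration, and prcomp stands in for the PCA step of DAPC.

```r
# Smallest number of PCs whose cumulative variance reaches `target`.
n_pcs_for_variance <- function(x, target = 0.8) {
  pca <- prcomp(x, center = TRUE, scale. = FALSE)
  cum_var <- cumsum(pca$sdev^2) / sum(pca$sdev^2)
  which(cum_var >= target)[1]
}

# Proportion of the total variance carried by the first `k` PCs.
variance_of_first <- function(x, k) {
  pca <- prcomp(x, center = TRUE, scale. = FALSE)
  sum(pca$sdev[seq_len(k)]^2) / sum(pca$sdev^2)
}

set.seed(1)
small <- matrix(rnorm(30 * 10), nrow = 30)    # 30 individuals, 10 variables
large <- matrix(rnorm(100 * 60), nrow = 100)  # 100 individuals, 60 variables

# The same 5 PCs carry different shares of the variance in the two tables.
variance_of_first(small, 5)
variance_of_first(large, 5)

# PCs needed to reach the same proportion of variance in each table.
n_small <- n_pcs_for_variance(small)
n_large <- n_pcs_for_variance(large)
c(small = n_small, large = n_large)
```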
STRUCTURE and similar approaches have a model which partitions genotypes into groups. It is basically a mixture distribution problem with a multinomial distribution for each locus and group. So the 'admixture' coefficient has a straightforward biological interpretation.
In DAPC, the assignment of individuals to groups using the discriminant functions is based on a geometric criterion. In other words, "tell me where you are in the discriminant space, I will tell you the probability that you belong to groups xxx, yyy and zzz". This is of course dependent on the discriminant space. The more dimensions retained in the PCA step, the easier it is to find a space providing perfect discrimination. The obtained group membership probabilities can reflect admixture, but they do not represent the proportion of the genome assigned to a given group. In your case, if you use a smaller space, you may start seeing less clear-cut group definition. optim.a.score may help in selecting the number of PCs.
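The geometric nature of these membership probabilities can be illustrated with a one-dimensional toy model in base R. This is not DAPC itself: it assumes two Gaussian groups with equal priors and a common standard deviation on a single discriminant axis, and the function posterior_group1 and the chosen separations are hypothetical.

```r
# Posterior probability of group 1 for a point x, assuming two Gaussian
# groups with equal prior weights and a common standard deviation.
posterior_group1 <- function(x, mu1, mu2, sigma = 1) {
  d1 <- dnorm(x, mean = mu1, sd = sigma)
  d2 <- dnorm(x, mean = mu2, sd = sigma)
  d1 / (d1 + d2)
}

# An individual sitting 45% of the way from group 1 (centred at 0) to
# group 2 (centred at `separation`), i.e. only slightly closer to group 1.
separations <- c(2, 5, 10)
post <- sapply(separations, function(s) {
  posterior_group1(x = 0.45 * s, mu1 = 0, mu2 = s)
})
round(post, 3)
```

The relative position of the individual never changes, yet its membership probability moves from close to 0.5 towards 1 as the groups become better separated. A near-intermediate individual can therefore look unambiguously assigned in a space that discriminates the groups very well.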
Cheers
Thibaut.
From stefanomontanari at gmail.com Wed Feb 13 02:36:37 2013
From: stefanomontanari at gmail.com (Stefano Montanari)
Date: Wed, 13 Feb 2013 11:36:37 +1000
Subject: [adegenet-forum] compoplot, STRUCTURE,
and the analysis of a hybrid zone
In-Reply-To: <2CB2DA8E426F3541AB1907F98ABA657057A2A2E9@icexch-m1.ic.ac.uk>
References:
<2CB2DA8E426F3541AB1907F98ABA657057A2A2E9@icexch-m1.ic.ac.uk>
Message-ID:
Hi Thibaut,
thank you for your prompt reply, it was very clear. Just a quick question
about optim.a.score: I had used it before, and this morning I tried again
just to make sure I remembered the results correctly. For one dataset
(N=109, 12 loci) it finds that 17 PCs is the best; for the other (N=83, 20
loci), retaining only 1 PC (not possible since PCs >= 2) gives the highest
a-score. This worries me. Do you think these data should not be used for
DAPC?
Cheers
Stef
--------------------------
Stefano R. Montanari
PhD Candidate
James Cook University
School of Marine and Tropical Biology
ATSIP (Building 145 James Cook Drive)
4811 Townsville QLD
stefanomontanari at gmail.com
Work: +61 7 4781 5441
Mob: +61 404 736 509
On 13 February 2013 01:12, Jombart, Thibaut wrote:
> Hi Stefano,
>
> thanks for reposting on the forum. It gives me the chance to clarify an
> important point.
>
> For the first point, there is not a linear relationship between
> 'stability' of DAPC results and the number of PCs retained in the PCA step.
> 'xxx' PCs can represent 2% of the variance in one analysis and 60% in
> another. If the two data tables have fairly comparable dimensions, it would
> be best to retain roughly the same proportion of variance. If their
> dimensions are very different, then the same number of PCs makes sense.
>
> STRUCTURE or similar approaches have a model which partitions genotypes
> into groups. It is basically a mixture distribution problem with a
> multinomial distribution for each locus and group. So the 'admixture'
> coefficient has a straightforward biological interpretation.
>
> In DAPC, assignment of individuals to groups using the discriminant
> functions is based on a geometric criterion. In other words, "tell me where
> you are in the discriminant space, I will tell you the probability that you
> belong to groups xxx, yyy and zzz". This is of course dependent on the
> discriminant space. The more dimensions retained in the PCA step, the
> easier it is to find a space providing perfect discrimination. The
> obtained group membership probabilities can reflect admixture, but they do
> not represent the proportion of the genome assigned to a given group. In
> your case, if you use a smaller space, you may start seeing less clear-cut
> group definition. optim.a.score may help in selecting the number of PCs.
>
> Cheers
>
> Thibaut.
>
>
> ________________________________________
> From: adegenet-forum-bounces at lists.r-forge.r-project.org [
> adegenet-forum-bounces at lists.r-forge.r-project.org] on behalf of Stefano
> Montanari [stefanomontanari at gmail.com]
> Sent: 11 February 2013 21:58
> To: adegenet-forum at lists.r-forge.r-project.org
> Subject: [adegenet-forum] compoplot, STRUCTURE, and the analysis of a
> hybrid zone
>
> Dear Dr. Jombart,
>
> I hope this email finds you well. We have exchanged thoughts before, and I
> wish to thank you for having gotten back to me in the past.
>
> I have been going through your latest vignette about dapc in adegenet (Nov
> 2012). I have used dapc on a butterflyfish hybrid zone in the past
> (Montanari et al 2012, Ecology and Evolution), and now I am going through a
> second dataset, and would like to compare the 2. Hence, I have a couple of
> questions for you:
>
> - am I correct in thinking that I want the same level of stability between
> the 2 analyses if I am to compare the results? (e.g., retaining N/3 PCs in
> both)
>
> - in your tutorial you mention that the dapc$posterior values used to
> construct compoplot are not the same as STRUCTURE admixture coefficients.
> Could you point me in a direction that would allow me to understand how
> they differ? I have run the data through STRUCTURE and the hybrids show up
> nicely as 50/50 clustered with parents 1 and 2 (K=2). adegenet also reckons
> that K=2 should be the best, but the compoplot shows no membership
> misassignment (even if the # of PCs is conservative). Do you have any
> suggestions as to why?
>
> Hoping to have been clear enough and not to have bored you senseless, I
> look forward to hearing back from you.
>
> Best regards,
>
> Stef
>
> --------------------------
> Stefano R. Montanari
> PhD Candidate
> James Cook University
> School of Marine and Tropical Biology
> ATSIP (Building 145 James Cook Drive)
> 4811 Townsville QLD
> stefanomontanari at gmail.com
> Work: +61 7 4781 5441
> Mob: +61 404 736 509
>
From sibelletorres at gmail.com Wed Feb 13 15:37:34 2013
From: sibelletorres at gmail.com (Sibelle Vilaça)
Date: Wed, 13 Feb 2013 15:37:34 +0100
Subject: [adegenet-forum] 3D sPCA and the use of temporal data
Message-ID:
Hi all,
I'm working with a domestic species and I have been trying to integrate
spatio-temporal data with haplotype frequencies. I've been working with
some ancient DNA data associated with spatial locations. I have a small
fragment of the mitochondrial DNA for several hundred individuals, each
associated with a geographical coordinate (x,y) and a 14C date. I can
see that there is a strong geographical correlation (geographically close
samples show related DNA haplotypes in similar frequencies), but I can also
see that this distribution is strongly correlated with sampling time.
Because I'm working with a domestic species, this strong influence of time
has to do with Neolithic migrations: the change in haplotype frequencies
is directly correlated with human migrations and their time of arrival in
different European locations.
Following the list, I saw it was possible to perform a 3D sPCA using depth
data. So I was wondering whether, instead of depth as the Z coordinate, it
would be possible to integrate temporal data into an sPCA, using something
like (x, y, time)? I have a radiocarbon date for each sample, but I was
thinking of using time frames (or categories) related to the different
human cultures (Neolithic, Mesolithic, etc.), since the change in haplotype
frequencies is directly related to the changes in human culture.
Is it possible?
Sibelle
From t.jombart at imperial.ac.uk Thu Feb 14 10:37:38 2013
From: t.jombart at imperial.ac.uk (Jombart, Thibaut)
Date: Thu, 14 Feb 2013 09:37:38 +0000
Subject: [adegenet-forum] compoplot, STRUCTURE,
and the analysis of a hybrid zone
In-Reply-To:
References:
<2CB2DA8E426F3541AB1907F98ABA657057A2A2E9@icexch-m1.ic.ac.uk>,
Message-ID: <2CB2DA8E426F3541AB1907F98ABA657057A2A4DE@icexch-m1.ic.ac.uk>
Hello,
Why 'not possible since PC >= 2'? You can choose to retain only one PC if you wish.
This suggests that the first PC of the second analysis already contains all the between-group discrimination.
Cheers
Thibaut
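
To make the dependence on the retained space concrete, here is a minimal sketch. It is not adegenet code: since DAPC is a PCA followed by a discriminant analysis, the effect of the number of retained PCs is mimicked with prcomp() and MASS::lda() on simulated data. The data, the object names and the helper own_group_posterior() are all made up for the illustration.

```r
library(MASS)  # recommended package shipped with R; lda() stands in for the DA step

set.seed(1)
n <- 40
# Hypothetical data: two weakly differentiated groups, 30 noisy variables
grp <- factor(rep(c("A", "B"), each = n))
X <- matrix(rnorm(2 * n * 30), ncol = 30)
X[grp == "B", 1:3] <- X[grp == "B", 1:3] + 1

# PCA step
pca <- prcomp(X, center = TRUE, scale. = FALSE)

# Mean posterior probability that individuals receive for their own group,
# when the discriminant analysis is run on the first n_pc principal components
own_group_posterior <- function(n_pc) {
  scores <- pca$x[, seq_len(n_pc), drop = FALSE]
  fit <- lda(scores, grouping = grp)
  post <- predict(fit, scores)$posterior
  mean(post[cbind(seq_along(grp), as.integer(grp))])
}

# Retaining a single PC is allowed; more PCs give a larger discriminant space
res <- sapply(c(1, 5, 15, 30), own_group_posterior)
print(round(res, 2))
```

With groups this weakly differentiated, the membership probabilities usually become more clear-cut as PCs are added, even though the data have not changed. That is the over-fitting the a-score is meant to penalise.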
From t.jombart at imperial.ac.uk Thu Feb 14 11:06:31 2013
From: t.jombart at imperial.ac.uk (Jombart, Thibaut)
Date: Thu, 14 Feb 2013 10:06:31 +0000
Subject: [adegenet-forum] 3D sPCA and the use of temporal data
In-Reply-To:
References:
Message-ID: <2CB2DA8E426F3541AB1907F98ABA657057A2A522@icexch-m1.ic.ac.uk>
Dear Sibelle,
yes, I think it makes sense, although you'll probably have a harder time interpreting and plotting the results.
But basically, all you need to do is formulate the spatio-temporal proximities in a proximity matrix (terms >= 0, diagonal = 0). Another, more complicated approach would be to discretize the temporal data and have, say, 'T' time steps with one proximity matrix each. You'll then have to coordinate 'T' sPCAs, which is doable using K-table approaches such as multiple co-inertia or STATIS (all in ade4), but substantially more of a pain.
Cheers
Thibaut
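
One possible way of building such a matrix in base R is sketched below. The sample table, the weight w_time and the exp(-distance) kernel are arbitrary choices made for the illustration, not a recommendation; any transformation giving non-negative terms and a zero diagonal would fit the description above.

```r
# Hypothetical samples: coordinates (x, y) and age in years BP
samples <- data.frame(
  x   = c(2.1, 2.4, 10.3, 10.9, 25.0),
  y   = c(41.0, 41.3, 45.2, 45.0, 52.1),
  age = c(7500, 7400, 6200, 6000, 4500)
)

# Pairwise distances in space and in time, each rescaled to [0, 1]
space_d <- as.matrix(dist(samples[, c("x", "y")]))
space_d <- space_d / max(space_d)
time_d <- as.matrix(dist(samples$age))
time_d <- time_d / max(time_d)

# Combine both; w_time sets how much time counts relative to space (assumption)
w_time <- 1
st_dist <- sqrt(space_d^2 + (w_time * time_d)^2)

# Distances -> proximities: terms >= 0, diagonal = 0
prox <- exp(-st_dist)
diag(prox) <- 0
print(round(prox, 2))
```

As far as I recall, spca() can take a matrix like this through its matWeight argument instead of a connection network, but please check ?spca for your version of adegenet.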
From bssc08 at bangor.ac.uk Wed Feb 20 20:40:13 2013
From: bssc08 at bangor.ac.uk (Niklas Tysklind)
Date: Wed, 20 Feb 2013 19:40:13 -0000
Subject: [adegenet-forum] Combining genetic and phenotypic data?
Message-ID: <00b901ce0fa2$1a8f2370$4fad6a50$@bangor.ac.uk>
Dear Thibaut and the rest of the Adegenet users,
Well done on all those DAPC assignment and structure functions. Me and my
data are loving them!
I am currently using the assignment functions based on a set of
microsatellite data AND (following your suggestion at the top of the
vignette) on a set of environment-driven quantitative phenotype data. It's
working quite well with both datasets, albeit showing rather different
structures at different spatial scales. And here's the thing: would it be
possible to analyse both datasets merged together? I believe this would
allow maximum assignment power, as the structures complement each other
(rather than mimic each other).
Many thanks in advance for any help and congratulations on an amazing
package!
Regards,
Niklas
Dr Niklas Tysklind
Postdoctoral Research Officer
Celtic Sea Trout Project
Environment Centre for Wales
School of Biological Sciences
College of Natural Sciences
Bangor University,
Bangor, LL57 2UW
UK
Phone: +44 1248 382139
Email: ntysklind at bangor.ac.uk
From t.jombart at imperial.ac.uk Fri Feb 22 16:50:19 2013
From: t.jombart at imperial.ac.uk (Jombart, Thibaut)
Date: Fri, 22 Feb 2013 15:50:19 +0000
Subject: [adegenet-forum] Combining genetic and phenotypic data?
In-Reply-To: <00b901ce0fa2$1a8f2370$4fad6a50$@bangor.ac.uk>
References: <00b901ce0fa2$1a8f2370$4fad6a50$@bangor.ac.uk>
Message-ID: <2CB2DA8E426F3541AB1907F98ABA657057A34F76@icexch-m1.ic.ac.uk>
Hello,
the quick and dirty way to do this would be to take the transformed data, normalize the two tables to the same inertia (the sum of squared values of the entries in each table), bind them into a single table, and run DAPC on this. It is not very elegant, but may do the trick if you want a quick and simple answer.
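
A minimal base-R sketch of this normalization, assuming both tables have the same individuals in the same row order. The two simulated tables and the helper to_unit_inertia() are made up for the example. Note that the table is divided by the square root of its inertia, so that the resulting inertia is exactly 1.

```r
set.seed(42)
n <- 20
# Hypothetical stand-ins for the two transformed tables (same individuals in rows)
genet <- matrix(rnorm(n * 8), nrow = n)           # e.g. centred allele frequencies
pheno <- matrix(rnorm(n * 3, sd = 10), nrow = n)  # quantitative traits, larger scale

# Centre the columns, then rescale so that the sum of squared entries equals 1
to_unit_inertia <- function(tab) {
  tab <- scale(tab, center = TRUE, scale = FALSE)
  tab / sqrt(sum(tab^2))
}

# Bind the two normalized tables into a single table
combined <- cbind(to_unit_inertia(genet), to_unit_inertia(pheno))

# Each block now contributes the same inertia to the combined table
print(c(genet = sum(combined[, 1:8]^2), pheno = sum(combined[, 9:11]^2)))
```

The combined table could then be passed to dapc() together with the group factor.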
There is quite a bit of literature on coupling data. I think I mention a few approaches in a very quick overview in my review paper (http://www.ncbi.nlm.nih.gov/pubmed/19156164). Coinertia analysis would be an option (function "coinertia" in the package "ade4"), but it won't allow you to couple two DAPCs (only, say, two PCAs). Such an implementation would be possible, but would probably demand quite a bit of work (essentially, a new paper, and a slightly boring one to write too!).
One option in between a clean, elegant solution and something manageable without too much pain is: look for combinations of the Discriminant Factors which are most alike between the two datasets. The procedure would be:
1) Make a DAPC for each table; keep all axes
2) Get the DAPC coordinates of the two tables, and standardize them to the same (say, 1) inertia. That is, divide each table by the square root of the sum of all its squared entries.
3) Use these new matrices as inputs of coinertia
Does this make sense?
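
Steps 2) and 3) can be sketched in base R as follows. This is only a stand-in for ade4's coinertia(): with uniform row weights, the co-inertia axes are the singular vectors of the cross-product of the two centred tables. The simulated matrices play the role of the DAPC coordinates (the ind.coord element of a dapc object), and all names are hypothetical.

```r
set.seed(7)
n <- 30
# Hypothetical stand-ins for the DAPC coordinates of the two tables;
# one axis of each table carries a shared signal
shared <- rnorm(n)
coord_genet <- cbind(shared + rnorm(n, sd = 0.3), rnorm(n))
coord_pheno <- cbind(rnorm(n), shared + rnorm(n, sd = 0.3), rnorm(n))

# Step 2: centre each table and scale it to unit inertia
unit_inertia <- function(tab) {
  tab <- scale(tab, center = TRUE, scale = FALSE)
  tab / sqrt(sum(tab^2))
}
A <- unit_inertia(coord_genet)
B <- unit_inertia(coord_pheno)

# Step 3: co-inertia axes from the SVD of the cross-product matrix t(A) %*% B
cross <- crossprod(A, B)
sv <- svd(cross)

# RV coefficient: overall strength of the co-structure, between 0 and 1
rv <- sum(cross^2) / sqrt(sum(crossprod(A)^2) * sum(crossprod(B)^2))
print(rv)
print(sv$d^2 / sum(sv$d^2))  # share of the co-inertia carried by each axis
```

In practice coinertia() in ade4 does this on two dudi objects and adds the plotting and testing tools.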
Cheers
Thibaut