[adegenet-forum] propShared problems
Timothy Frasier
timothy.frasier at SMU.CA
Mon Aug 16 15:53:11 CEST 2010
Dear Thibaut,
The .RData file is attached. The command line used for creating the genind file is:
> ds1 <- read.structure("porpoise.str", n.ind=35, n.loc=9, onerowperind=TRUE, col.lab=1, col.pop=0, col.others=0, row.marknames=0, missing=NA, ask=TRUE, quiet=FALSE)
Then,
> save(ds1, file="porpgenind.RData")
Again, the genind object seems to represent the data correctly as a diploid data set, but that information seems to be lost when I use the propShared function.
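
As a cross-check that does not depend on adegenet, here is a minimal base-R sketch of the two ways the comparison can come out, using the genotypes of Nph0003 and Nph0004 quoted further down in this thread. The helpers prop_match and lump are hypothetical names introduced for illustration, not adegenet functions, and the position-wise comparison assumes the two alleles of each locus are listed in the same (sorted) order for both individuals. It is not meant to reproduce propShared internally, only to show that comparing alleles one by one gives 0.75, while comparing the lumped two-allele strings gives 0.5.

```r
# Alleles for two individuals at 9 loci, copied from the tables quoted in this thread.
# Each locus contributes two consecutive entries; NA marks the untyped locus.
nph0003 <- c("119", "121", "191", "191", "203", "203", "136", "144", "097", "099",
             "159", "169", "132", "132", NA, NA, "146", "150")
nph0004 <- c("119", "121", "191", "191", "203", "203", "136", "136", "097", "097",
             "159", "171", "132", "132", "158", "160", "146", "152")

# Hypothetical helper: position-wise share of matching entries, skipping missing values.
prop_match <- function(a, b) mean(a == b, na.rm = TRUE)

# Hypothetical helper: paste the two alleles of each locus into a single string,
# keeping NA where either allele is missing.
lump <- function(alleles) {
  m <- matrix(alleles, nrow = 2)  # column j holds the two alleles of locus j
  ifelse(is.na(m[1, ]) | is.na(m[2, ]), NA, paste0(m[1, ], m[2, ]))
}

split_value  <- prop_match(nph0003, nph0004)              # alleles compared one by one
lumped_value <- prop_match(lump(nph0003), lump(nph0004))  # whole genotypes compared

print(c(split = split_value, lumped = lumped_value))
# split = 12/16 = 0.75, lumped = 4/8 = 0.5
```

The 0.75 matches the value Thibaut reported for this pair, and the 0.5 matches the value I get, which is consistent with the alleles being lumped per locus somewhere along the way.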
Thanks again!
-Tim
-------------- next part --------------
A non-text attachment was scrubbed...
Name: porpgenind.RData
Type: application/octet-stream
Size: 1977 bytes
Desc: not available
URL: <http://lists.r-forge.r-project.org/pipermail/adegenet-forum/attachments/20100816/deb0b48b/attachment.obj>
-------------- next part --------------
On 2010-08-15, at 9:01 AM, Jombart, Thibaut wrote:
> Dear Tim,
>
> this looks like a possible issue in your genind object. genind2df may not reveal the issue - for instance the degree of ploidy may be wrong. Could you please post your genind object (as .RData, saved using 'save') along with the command lines that created it?
>
> Best regards
>
> Thibaut.
>
> ________________________________________
> From: adegenet-forum-bounces at lists.r-forge.r-project.org [adegenet-forum-bounces at lists.r-forge.r-project.org] On Behalf Of Timothy Frasier [timothy.frasier at SMU.CA]
> Sent: 13 August 2010 13:40
> To: adegenet-forum at lists.r-forge.r-project.org
> Subject: Re: [adegenet-forum] propShared problems
>
> Dear Thibaut,
>
> Thanks for looking into this, and I apologize for not being clearer in my original posting. The results you obtained are correct. The first individual (Nph0001) is missing from what you posted, but I assume that is something minor. However, I do not get the results that you did. I get:
>
>         Nph0001 Nph0003 Nph0004 Nph0005
> Nph0001       1     0.0     0.0       0
> Nph0003       0     1.0     0.5       0
> Nph0004       0     0.5     1.0       0
> Nph0005       0     0.0     0.0       1
>
> Upon further inspection, it looks like it (the propShared function) is not detecting the delimiter between my two alleles at each locus, and instead is lumping them together for the comparisons. What's tricky is that it detects them properly in the genind class...
>
> x$all.names
> $L1
>     1     2     3
> "119" "121" "123"
>
> $L2
>     1     2
> "191" "197"
>
> $L3
>     1     2     3     4     5     6
> "195" "197" "199" "201" "203" "205"
>
> ...but when I export the data, the two alleles at each locus are lumped together:
>
> tab <- genind2df(x)
> tab
>             L1     L2     L3     L4     L5     L6     L7     L8     L9
> Nph0001 119123 191197 199201 138140 099099 173175 136136 168170 144144
> Nph0003 119121 191191 203203 136144 097099 159169 132132   <NA> 146150
> Nph0004 119121 191191 203203 136136 097097 159171 132132 158160 146152
>
> Also, the results from propShared(x) that I get are consistent with the alleles being lumped together for each locus.
>
> So, it looks like the issue I am having is that the alleles appear to be delimited properly when read into the genind class, but then that delimitation seems to disappear in the propShared function.
>
> Since you did not get that same problem, I assume that it has something to do with how I am reading the file in (although the genind class seems correct). I will keep looking into it.
>
> Thanks again for your help.
>
> Sincerely,
> Tim Frasier
>
>
>
> On 2010-08-13, at 8:40 AM, Jombart, Thibaut wrote:
>
> Dear Tim,
>
> Issues are always possible, but it would be nice if you could point out a particular case where calculations are 'clearly wrong'. For now it is not clear to me that the function is wrong. It works with the example dataset, and it works with the first rows of your dataset too:
>
> tab <- genind2df(x)
> tab
>         X119 X123 X191 X197 X199 X201 X138 X140 X099 X099.1 X173 X175 X136
> Nph0003  119  121  191  191  203  203  136  144  097    099  159  169  132
> Nph0004  119  121  191  191  203  203  136  136  097    097  159  171  132
> Nph0005  119  119  191  191  203  203  132  138  097    099  159  161  132
> Nph0007 <NA> <NA>  191  191  201  205  136  136  097    099  159  159  132
>         X136.1 X168 X170 X144 X144.1
> Nph0003    132 <NA> <NA>  146    150
> Nph0004    132  158  160  146    152
> Nph0005    132  160  160  146    148
> Nph0007    134  158  160  148    150
>
> propShared(x)
>           Nph0003   Nph0004   Nph0005   Nph0007
> Nph0003 1.0000000 0.7500000 0.6875000 0.5714286
> Nph0004 0.7500000 1.0000000 0.6111111 0.5625000
> Nph0005 0.6875000 0.6111111 1.0000000 0.4375000
> Nph0007 0.5714286 0.5625000 0.4375000 1.0000000
>
> mean(tab[1,]==tab[2,], na.rm=TRUE) # OK
> [1] 0.75
> mean(tab[1,]==tab[3,], na.rm=TRUE) # OK
> [1] 0.6875
> mean(tab[1,]==tab[4,], na.rm=TRUE) # OK
> [1] 0.5714286
> mean(tab[2,]==tab[3,], na.rm=TRUE) # OK
> [1] 0.6111111
> mean(tab[2,]==tab[4,], na.rm=TRUE) # OK
> [1] 0.5625
> mean(tab[3,]==tab[4,], na.rm=TRUE) #OK
> [1] 0.4375
>
> All these results are right. Note that the proportion of shared alleles is computed over sampled (non-missing) loci only; for instance, the proportion of shared alleles between individuals 1 and 2 is 12/16, not 12/18, because Nph0003 has missing data at one locus.
>
> Can you please provide a sample of code illustrating an issue in the calculations?
>
> Best
>
> Thibaut.
> ________________________________________
> From: adegenet-forum-bounces at lists.r-forge.r-project.org [adegenet-forum-bounces at lists.r-forge.r-project.org] On Behalf Of Timothy Frasier [timothy.frasier at SMU.CA]
> Sent: 12 August 2010 15:15
> To: adegenet-forum at lists.r-forge.r-project.org
> Subject: [adegenet-forum] propShared problems
>
> Hi,
>
> I am trying to use the propShared function, and realized that there is an issue with the calculation: the function is clearly not calculating the proportion of shared alleles properly. I browsed through both the R and C code to try to find the issue, but nothing immediately jumped out at me. I will keep looking, but thought that you could likely find and fix it faster than I could.
>
> I've attached a small example data set that I am working with (35 individuals typed at 9 microsatellite loci). Its size is good because it is easy to calculate the proportion of shared alleles manually and then compare that to the results obtained from the propShared function.
>
> Thanks for your help!
>
> Sincerely,
> Tim Frasier
>
>
> ~~~~~~~~~~~~~~~~~~~~~~~~~~~
> Timothy R. Frasier
> Department of Biology
> Saint Mary's University
> 923 Robie Street
> Halifax, NS B3H 3C3
> Canada
> E-mail: timothy.frasier at smu.ca
> Tel: (902) 491-6382
> ~~~~~~~~~~~~~~~~~~~~~~~~~~~