[adegenet-forum] propShared problems

Timothy Frasier timothy.frasier at SMU.CA
Mon Aug 16 15:53:11 CEST 2010


Dear Thibaut,

The .RData file is attached.  The command used to create the genind object is:

> ds1 <- read.structure("porpoise.str", n.ind=35, n.loc=9, onerowperind=T, col.lab=1, col.pop=0, col.others=0, row.marknames=0, missing=NA, ask=T, quiet=F)

Then,

> save(ds1, file="porpgenind.RData")

Again, the genind object seems to hold the data correctly as a diploid data set, but that seems to be lost when I use the propShared function.

Thanks again!

-Tim

-------------- next part --------------
A non-text attachment was scrubbed...
Name: porpgenind.RData
Type: application/octet-stream
Size: 1977 bytes
Desc: not available
URL: <http://lists.r-forge.r-project.org/pipermail/adegenet-forum/attachments/20100816/deb0b48b/attachment.obj>
-------------- next part --------------



On 2010-08-15, at 9:01 AM, Jombart, Thibaut wrote:

> Dear Tim, 
> 
> This looks like a possible issue in your genind object. genind2df may not reveal the issue; for instance, the degree of ploidy may be wrong. Could you please post your genind object (as .RData, saved using 'save') along with the command lines that created it?
> 
> Best regards
> 
> Thibaut.
> 
> ________________________________________
> From: adegenet-forum-bounces at lists.r-forge.r-project.org [adegenet-forum-bounces at lists.r-forge.r-project.org] On Behalf Of Timothy Frasier [timothy.frasier at SMU.CA]
> Sent: 13 August 2010 13:40
> To: adegenet-forum at lists.r-forge.r-project.org
> Subject: Re: [adegenet-forum] propShared problems
> 
> Dear Thibaut,
> 
> Thanks for looking into this, and I apologize for not being clearer in my original posting.  The results you obtained are correct, although the first individual (Nph0001) is missing from what you posted; I assume that is something minor.  However, I do not get the results that you did.  I get:
> 
>         Nph0001 Nph0003 Nph0004 Nph0005
> Nph0001       1     0.0     0.0       0
> Nph0003       0     1.0     0.5       0
> Nph0004       0     0.5     1.0       0
> Nph0005       0     0.0     0.0       1
> 
> Upon further inspection, it looks like it (the propShared function) is not detecting the delimiter between my two alleles at each locus, and instead is lumping them together for the comparisons. What's tricky is that it detects them properly in the genind class...
> 
> x$all.names
> $L1
>     1     2     3
> "119" "121" "123"
> 
> $L2
>     1     2
> "191" "197"
> 
> $L3
>     1     2     3     4     5     6
> "195" "197" "199" "201" "203" "205"
> 
> ...but when I export the data, the two alleles at each locus are lumped together:
> 
> tab <- genind2df(x)
> tab
>             L1     L2     L3     L4     L5     L6     L7     L8     L9
> Nph0001 119123 191197 199201 138140 099099 173175 136136 168170 144144
> Nph0003 119121 191191 203203 136144 097099 159169 132132   <NA> 146150
> Nph0004 119121 191191 203203 136136 097097 159171 132132 158160 146152
> 
> Also, the results from propShared(x) that I get are consistent with the alleles being lumped together for each locus.
> 
> So, it looks like the issue I am having is that the alleles appear to be delimited properly when read into the genind class, but then that delimitation seems to disappear in the propShared function.
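
A minimal base-R sketch of the two readings of the genotypes quoted above. It is not the adegenet implementation: it assumes each genotype is a 6-character string holding two 3-character alleles (as in the genind2df output), it compares allele copies position by position in the same way as the `mean(tab[i,] == tab[j,], na.rm=TRUE)` checks elsewhere in this thread, and `lumped`, `split_alleles`, `prop_lumped` and `prop_split` are illustrative names only.

```r
# Genotypes for two individuals, copied from the genind2df(x) output above.
# Each string holds two 3-character alleles; NA marks a missing locus.
lumped <- rbind(
  Nph0003 = c("119121", "191191", "203203", "136144", "097099",
              "159169", "132132", NA,       "146150"),
  Nph0004 = c("119121", "191191", "203203", "136136", "097097",
              "159171", "132132", "158160", "146152")
)

# Hypothetical helper: split every genotype into its two alleles,
# giving one column per allele copy (two columns per locus).
split_alleles <- function(g) {
  out <- matrix(NA_character_, nrow(g), 2 * ncol(g),
                dimnames = list(rownames(g), NULL))
  out[, seq(1, 2 * ncol(g), by = 2)] <- substr(g, 1, 3)  # first allele
  out[, seq(2, 2 * ncol(g), by = 2)] <- substr(g, 4, 6)  # second allele
  out
}

alleles <- split_alleles(lumped)

# Whole genotypes compared as single units: 4 matches over 8 typed loci.
prop_lumped <- mean(lumped[1, ] == lumped[2, ], na.rm = TRUE)

# Allele copies compared one by one: 12 matches over 16 typed positions.
prop_split <- mean(alleles[1, ] == alleles[2, ], na.rm = TRUE)

print(c(lumped = prop_lumped, split = prop_split))
```

Under these assumptions the lumped comparison gives 0.5 and the allele-wise comparison gives 0.75 for Nph0003 versus Nph0004, which are the two values reported for that pair in this thread.
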
> 
> Since you did not get that same problem, I assume that it has something to do with how I am reading the file in (although the genind class seems correct).  I will keep looking into it.
> 
> Thanks again for your help.
> 
> Sincerely,
> Tim Frasier
> 
> 
> 
> On 2010-08-13, at 8:40 AM, Jombart, Thibaut wrote:
> 
> Dear Tim,
> 
> Issues are always possible, but it would be nice if you could point out a particular case where calculations are 'clearly wrong'. For now it is not clear to me that the function is wrong. It works with the example dataset, and it works with the first rows of your dataset too:
> 
> tab <- genind2df(x)
> tab
>         X119 X123 X191 X197 X199 X201 X138 X140 X099 X099.1 X173 X175 X136
> Nph0003  119  121  191  191  203  203  136  144  097    099  159  169  132
> Nph0004  119  121  191  191  203  203  136  136  097    097  159  171  132
> Nph0005  119  119  191  191  203  203  132  138  097    099  159  161  132
> Nph0007 <NA> <NA>  191  191  201  205  136  136  097    099  159  159  132
>         X136.1 X168 X170 X144 X144.1
> Nph0003    132 <NA> <NA>  146    150
> Nph0004    132  158  160  146    152
> Nph0005    132  160  160  146    148
> Nph0007    134  158  160  148    150
> 
> propShared(x)
>           Nph0003   Nph0004   Nph0005   Nph0007
> Nph0003 1.0000000 0.7500000 0.6875000 0.5714286
> Nph0004 0.7500000 1.0000000 0.6111111 0.5625000
> Nph0005 0.6875000 0.6111111 1.0000000 0.4375000
> Nph0007 0.5714286 0.5625000 0.4375000 1.0000000
> 
> mean(tab[1,]==tab[2,], na.rm=TRUE)  # OK
> [1] 0.75
> mean(tab[1,]==tab[3,], na.rm=TRUE) # OK
> [1] 0.6875
> mean(tab[1,]==tab[4,], na.rm=TRUE) # OK
> [1] 0.5714286
> mean(tab[2,]==tab[3,], na.rm=TRUE) # OK
> [1] 0.6111111
> mean(tab[2,]==tab[4,], na.rm=TRUE) # OK
> [1] 0.5625
> mean(tab[3,]==tab[4,], na.rm=TRUE) # OK
> [1] 0.4375
> 
> All these results are right. Obviously, the proportion of shared alleles is computed only over loci typed in both individuals; for instance, the proportion of shared alleles between individuals 1 and 2 equals 12/16, not 12/18.
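
The 12/16 versus 12/18 point can be checked by hand. The base-R sketch below is only an illustration, not adegenet code: it copies the allele columns for Nph0003 and Nph0004 from the table above into two vectors (`a` and `b` are illustrative names) and counts matches and typed positions separately.

```r
# Allele copies for Nph0003 (a) and Nph0004 (b), copied from the table above.
# NA marks the two allele copies of the locus that is missing in Nph0003.
a <- c("119", "121", "191", "191", "203", "203", "136", "144", "097",
       "099", "159", "169", "132", "132", NA,    NA,    "146", "150")
b <- c("119", "121", "191", "191", "203", "203", "136", "136", "097",
       "097", "159", "171", "132", "132", "158", "160", "146", "152")

same <- a == b                       # NA wherever either allele is missing

n_shared   <- sum(same, na.rm = TRUE)  # matching positions
n_compared <- sum(!is.na(same))        # positions typed in both individuals
n_total    <- length(same)             # all positions, missing ones included

# Proportion over typed positions only, as mean(..., na.rm = TRUE) computes it
prop_typed <- n_shared / n_compared

# What the value would be if missing positions stayed in the denominator
prop_all <- n_shared / n_total

print(c(shared = n_shared, compared = n_compared, total = n_total))
print(c(typed_only = prop_typed, all_positions = prop_all))
```

Here 12 positions match, 16 are typed in both individuals and 18 exist in total, so the first proportion is 0.75 and the second would be about 0.667.
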
> 
> Can you please provide a sample of code illustrating an issue in the calculations?
> 
> Best
> 
> Thibaut.
> ________________________________________
> From: adegenet-forum-bounces at lists.r-forge.r-project.org [adegenet-forum-bounces at lists.r-forge.r-project.org] On Behalf Of Timothy Frasier [timothy.frasier at SMU.CA]
> Sent: 12 August 2010 15:15
> To: adegenet-forum at lists.r-forge.r-project.org
> Subject: [adegenet-forum] propShared problems
> 
> Hi,
> 
> I am trying to use the propShared function, and realized that there is an issue with the calculation: the function is clearly not calculating the proportion of shared alleles properly.  I browsed through both the R and C code to try to find the issue, but nothing immediately jumped out at me.  I will keep looking, but thought that you could likely find and fix it faster than I could.
> 
> I've attached a small example data set that I am working with (35 individuals typed at 9 microsatellite loci).  Its size is good because it is easy to calculate the proportion of shared alleles manually and then compare that to the results obtained from the propShared function.
> 
> Thanks for your help!
> 
> Sincerely,
> Tim Frasier
> 
> 
> ~~~~~~~~~~~~~~~~~~~~~~~~~~~
> Timothy R. Frasier
> Department of Biology
> Saint Mary's University
> 923 Robie Street
> Halifax, NS B3H 3C3
> Canada
> E-mail: timothy.frasier at smu.ca
> Tel: (902) 491-6382
> ~~~~~~~~~~~~~~~~~~~~~~~~~~~

~~~~~~~~~~~~~~~~~~~~~~~~~~~
Timothy R. Frasier
Department of Biology
Saint Mary's University
923 Robie Street
Halifax, NS B3H 3C3
Canada
E-mail: timothy.frasier at smu.ca
Tel: (902) 491-6382
~~~~~~~~~~~~~~~~~~~~~~~~~~~
