<br>Hello List, <br><br>I am trying to calculate individual genetic distances with the function `adegenet::propShared`. I found that the results from `propShared` did not correlate very well with other genetic distances such as Rousset's a. Based on these results, I wrote my own function to calculate the proportion of shared alleles (see below). Results from this function do not correspond to the results from `propShared`, but correlate better with Rousset's a. Am I missing something about the methodology of `propShared`?<br>
<br>Thanks a lot for hints, <br><br>Best<br>Johannes<br><br># --------------------------------------------------------------------------------------------------------------------------- #<br># Example<br><br># Load packages<br>
library(adegenet)<br>library(stringr)<br>library(reshape2)<br><br># Own function to calculate the proportion of shared alleles<br># Following the instructions here: <a href="http://helix.biology.mcmaster.ca/brent/node8.html">http://helix.biology.mcmaster.ca/brent/node8.html</a><br>
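psa <- function(s) {<br> # s - each row is a genotype, first column is id<br><br> # all pairwise combinations of row indices<br> idx <- combn(seq_len(nrow(s)), 2)<br><br> # calculate distances: 1 - proportion of allele columns with identical values<br> # (alleles are compared column by column, i.e. position by position)<br> d <- 1 - apply(idx, 2, function(x) sum(apply(s[x, -1], 2, diff) == 0)) / ncol(s[, -1])<br>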
<br> # create object of class "dist", labelled with the ids from the first column<br> nams <- as.character(s[, 1])<br> dd <- structure(d, Size = length(nams), Labels = nams, Diag = FALSE, Upper = FALSE, method = "user", class = "dist")<br>
return(dd)<br>}<br><br><br><br># Example dataset<br>dat <- data.frame(id=as.character(1:4), x=0, y=0, <br> l_1=c(108, 116, 122, 119), <br> l_2=c(110, 114, 152, 122), <br> m_1=c(110, 90, 122, 69), <br> m_2=c(110, 111, 128, 128))<br>
<br><br># Create genind <br>coords <- dat[, 2:3]<br>datMelted <- melt(dat[, c(1, 4:ncol(dat))], id.vars = "id")<br> <br>names(datMelted) <- c("ind", "loci", "allele")<br><br># remove the _1 and _2 suffixes, since both columns belong to the same locus<br>
datMelted$loci <- str_sub(as.character(datMelted$loci), end=-3)<br><br># Creating a genind object for the adegenet package<br># (both alleles of a locus are pasted together as "allele1/allele2")<br>datCasted <- dcast(datMelted, ind ~ loci, paste, collapse="/")<br>gi <- df2genind(datCasted[, 2:ncol(datCasted)], sep="/", missing="NA", ind.names=datCasted[, 1])<br>
<br><br># calculate proportion of shared alleles<br>propShared(gi)<br>as.matrix(psa(dat[, -c(2,3)]))<br><br># It is my understanding that propShared returns the proportion of shared alleles itself rather than 1 - that proportion, which would explain a systematic difference<br>
# However, ind 3 and 4 differ by one of four alleles (= 0.25), but propShared suggests 0.5<br><br>
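One possible source of the discrepancy, offered as an assumption rather than a confirmed description of `propShared`: `psa` compares alleles column by column, so 122/152 and 119/122 count as sharing nothing at locus l, whereas a comparison that ignores allele order within a locus would count the shared 122. The base-R sketch below uses two hypothetical helpers, `shared_at_locus` and `psa_unordered` (neither is part of adegenet), to compute that order-independent proportion for the example data.

```r
# Example genotypes from the post: two loci (l, m), two alleles each
dat <- data.frame(id  = as.character(1:4),
                  l_1 = c(108, 116, 122, 119),
                  l_2 = c(110, 114, 152, 122),
                  m_1 = c(110, 90, 122, 69),
                  m_2 = c(110, 111, 128, 128))

# Hypothetical helper: number of alleles shared at one locus, ignoring order.
# a and b hold the two alleles of each individual; homozygotes are handled
# by counting each allele and taking the smaller count per allele.
shared_at_locus <- function(a, b) {
  alleles <- union(a, b)
  count_a <- tabulate(match(a, alleles), nbins = length(alleles))
  count_b <- tabulate(match(b, alleles), nbins = length(alleles))
  sum(pmin(count_a, count_b))
}

# Hypothetical helper: proportion of shared alleles between rows i and j,
# assuming diploid data with columns named <locus>_1 and <locus>_2
psa_unordered <- function(s, i, j, loci = c("l", "m")) {
  shared <- sapply(loci, function(loc) {
    cols <- paste0(loc, c("_1", "_2"))
    shared_at_locus(unlist(s[i, cols]), unlist(s[j, cols]))
  })
  sum(shared) / (2 * length(loci))
}

psa_unordered(dat, 3, 4)
```

For individuals 3 and 4 this gives 0.5 (one shared allele at each of the two loci), the same value `propShared` reports, while the column-wise comparison in `psa` finds only one matching column out of four.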