<HTML><HEAD></HEAD>
<BODY dir=ltr>
<DIV dir=ltr>
<DIV style="FONT-SIZE: 12pt; FONT-FAMILY: 'Calibri'; COLOR: #000000">
<DIV>Hi!</DIV>
<DIV> </DIV>
<DIV>I have been running pairwise DAPCs on a data set containing more than
62,000 SNPs in 4 populations.</DIV>
<DIV> </DIV>
<DIV>Because of a problem with my PLINK input, my SNPs were initially in the
wrong order. When I noticed the error, I repeated the DAPC analysis with a
correctly ordered input file, primarily to get the correct SNP IDs for the allele
loadings. In doing so, I noticed differences between the two analyses:</DIV>
<DIV>Individual coordinates (dapc$ind.scores) and DAPC eigenvalues (dapc$eig)
differed, and the allele loadings (dapc$var.contr) were also slightly different. The
output of find.clusters was identical for both input files.</DIV>
<DIV> </DIV>
<DIV>My code:</DIV>
<DIV>> data1 <- read.PLINK("file.raw", map.file = NULL, quiet = FALSE,
chunkSize = 1000, parallel = FALSE)</DIV>
<DIV>> clust1 <- find.clusters(data1, stat="BIC", choose.n.clust=TRUE,
max.n.clust=10, n.iter=1e5, n.start=10, pca.select="percVar",
perc.pca=100, glPca=NULL, parallel=FALSE)</DIV>
<DIV>> dapc1 <- dapc(data1, pop = clust1$grp, pca.select = "percVar",
perc.pca = 100, parallel=FALSE)</DIV>
<DIV>> loadings1 <- dapc1$var.contr</DIV>
<DIV>> ld1 <- loadings1[order(loadings1[,1], decreasing = TRUE),]</DIV>
<DIV>> write.table(ld1, file = "PW1_loadings_rc.txt")</DIV>
<DIV> </DIV>
<DIV>DAPC Output 1 (unsorted SNPs): </DIV>
<DIV>$eig (eigenvalues): 8.495e+32</DIV>
<DIV>$ind.scores</DIV>
<DIV> LD1</DIV>
<DIV>Cluster1 -8.413869e+15 #(same value for
all 6 individuals in this cluster)</DIV>
<DIV>Cluster2 8.413869e+15 #(same value for
all 6 individuals in this cluster)</DIV>
<DIV>$var.contr (in decreasing order)</DIV>
<DIV>[line 72] "60732" 0.000171552889525985</DIV>
<DIV>[line 73] "2993" 0.000154678359525634</DIV>
<DIV>[line 74] "976" 0.000152483049214648</DIV>
<DIV> </DIV>
<DIV>DAPC Output 2 (correctly sorted SNPs):</DIV>
<DIV>$eig (eigenvalues): 4.459e+33</DIV>
<DIV>$ind.scores</DIV>
<DIV> LD1</DIV>
<DIV>Cluster1 -1.927727e+16 #(same value for
all 6 individuals in this cluster)</DIV>
<DIV>Cluster2 1.927727e+16 #(same value for
all 6 individuals in this cluster)</DIV>
<DIV>$var.contr (in decreasing order) </DIV>
<DIV>[line 72] "60819" 0.000171552889525985</DIV>
<DIV>[line 73] "42995" 0.000154678359525636</DIV>
<DIV>[line 74] "697" 0.000152483049214647</DIV>
<DIV>(Of course the SNP IDs don’t match, but the values on corresponding lines
of the two outputs should be identical.)</DIV>
<DIV> </DIV>
<DIV>I used exactly the same code, and the only difference between the analyses
was the order of the SNPs. I realize that the difference in allele loadings (my
main interest) is marginal, but I was surprised to find any differences at all.
Is that normal? Shouldn’t these results be independent of the order of the
SNPs? Could it be due to rounding errors?</DIV>
<DIV> </DIV>
<DIV> </DIV>
<DIV>Thank you for your time!</DIV>
<DIV>Cheers,</DIV>
<DIV>Maike</DIV>
<DIV> </DIV></DIV></DIV></BODY></HTML>