<html><body><div style="font-family: trebuchet ms,sans-serif; font-size: 12pt; color: #000000"><div>Hi Elizabeth,</div><div><br data-mce-bogus="1"></div><div>It appears that the code misbehaves when locus names are numeric. This has happened before in another function. Until we fix this, you can rename your loci so that each name starts with a letter.<br></div><div><br></div><div>Here is the excerpt from the genind object indicating that these two samples have alleles 33 (column X1401_25.33):</div><div><br data-mce-bogus="1"></div><div><span style="font-family: 'courier new', courier, monaco, monospace, sans-serif;" data-mce-style="font-family: 'courier new', courier, monaco, monospace, sans-serif;"> X1401_25.13 X1401_25.33 X1403_13.11 X1403_13.13 X1403_13.33 X1404_17.13 X1404_17.33 X1404_17.11</span><br><span style="font-family: 'courier new', courier, monaco, monospace, sans-serif;" data-mce-style="font-family: 'courier new', courier, monaco, monospace, sans-serif;">C_KH1059 0 1 1 0 0 0 1 0</span><br><span style="font-family: 'courier new', courier, monaco, monospace, sans-serif;" data-mce-style="font-family: 'courier new', courier, monaco, monospace, sans-serif;">M_KH1834 0 1 1 0 0 1 0 0</span></div><div><br data-mce-bogus="1"></div><div><br data-mce-bogus="1"></div><div>Cheers,</div><div>Roman</div><div><br data-mce-bogus="1"></div><div><br data-mce-bogus="1"></div><div data-marker="__SIG_PRE__">----<br>In god we trust, all others bring data.</div><br><hr id="zwchr" data-marker="__DIVIDER__"><div data-marker="__HEADERS__"><b>From: </b>"Biz Sheedy" &lt;biz.sheedy@gmail.com&gt;<br><b>To: </b>"Roman Luštrik" &lt;roman.lustrik@biolitika.si&gt;<br><b>Cc: </b>adegenet-forum@lists.r-forge.r-project.org<br><b>Sent: </b>Monday, November 28, 2016 11:00:53 AM<br><b>Subject: </b>Re: [adegenet-forum] Discrepancy in NA counts<br></div><br><div data-marker="__QUOTED_TEXT__"><div dir="ltr"><div><div><div><div><div>Thanks for looking into this.<br><br></div>Something that I 
did differently from the code you provided was that I only answered the prompts for the read.structure function. This meant I did not use sep="\t", and the number of alleles was 62 instead of 72, which I think should be comparable to the Excel count. Following the code you provided, 'is.na' finds 23 NAs (instead of 20 NAs at 62 alleles and 16 zeroes in Excel). <br><br>Your explanation makes sense to me for the additional three NAs in adegenet, but I still don't understand how in locus 1401_25 the data for two
individuals (C_KH1059 and M_KH1834) changed from being homozygous for "3" to
being "NA"?<br><br></div></div>I would really appreciate any further help on this.<br><br></div>Thanks again,<br></div>Elizabeth<br><div><div><div><br></div></div></div><div class="gmail_extra"><br><div class="gmail_quote">On 28 November 2016 at 18:03, Roman Luštrik <span dir="ltr">&lt;<a href="mailto:roman.lustrik@biolitika.si" target="_blank">roman.lustrik@biolitika.si</a>&gt;</span> wrote:<br><blockquote class="gmail_quote" style="margin: 0 0 0 .8ex; border-left: 1px #ccc solid; padding-left: 1ex;" data-mce-style="margin: 0 0 0 .8ex; border-left: 1px #ccc solid; padding-left: 1ex;"><div><div style="font-family: trebuchet ms,sans-serif; font-size: 12pt; color: #000000;" data-mce-style="font-family: trebuchet ms,sans-serif; font-size: 12pt; color: #000000;"><div>Hi,</div><br><div>I think the problem is that adegenet, for consistency, marks every allele column of a locus as NA when the genotype is missing, so the NA count depends on how many alleles that locus has. Take for example C_KH1238 (bottom row in the example pasted below).</div><div>In the raw file, it has missing values for locus 1378_53, but this locus has three alleles, ergo 3 NAs and not 2. 
Can't go through all the NAs right now, but I think there's a pretty good chance this is what is causing the discrepancy between what you see in "excel" and in adegenet.</div><br><div><span style="font-family: 'courier new',courier,monaco,monospace,sans-serif;" data-mce-style="font-family: 'courier new',courier,monaco,monospace,sans-serif;"> 1369_41.11 1372_14.22 1372_14.24 1373_9.44 1373_9.24 1377_42.44 1377_42.24 1378_53.22 1378_53.24 1378_53.44 1379_10.33 1379_10.13 1382_37.33</span><br>...<br><span style="font-family: 'courier new',courier,monaco,monospace,sans-serif;" data-mce-style="font-family: 'courier new',courier,monaco,monospace,sans-serif;">C_KH1238 0 1 0 1 0 1 0 <b>NA NA NA</b> 1 0 1 # notice 3 NAs for all available alleles for 1378_53, not just two (as expected for diploid)</span></div><br><br><div>Here is the code I used to explore this:</div><br><div><span style="font-family: 'courier new',courier,monaco,monospace,sans-serif;" data-mce-style="font-family: 'courier new',courier,monaco,monospace,sans-serif;">library(adegenet)</span><br><br><span style="font-family: 'courier new',courier,monaco,monospace,sans-serif;" data-mce-style="font-family: 'courier new',courier,monaco,monospace,sans-serif;">xy <- read.table("Sub_batch_1.stru", header = TRUE, sep = "\t")</span><br><span style="font-family: 'courier new',courier,monaco,monospace,sans-serif;" data-mce-style="font-family: 'courier new',courier,monaco,monospace,sans-serif;">xy <- xy[, c(-1, -2)]</span><br><span style="font-family: 'courier new',courier,monaco,monospace,sans-serif;" data-mce-style="font-family: 'courier new',courier,monaco,monospace,sans-serif;">table(as.matrix(xy))</span><br><br><span style="font-family: 'courier new',courier,monaco,monospace,sans-serif;" data-mce-style="font-family: 'courier new',courier,monaco,monospace,sans-serif;"># 0 1 2 3 4 </span><br><span style="font-family: 'courier new',courier,monaco,monospace,sans-serif;" data-mce-style="font-family: 'courier 
new',courier,monaco,monospace,sans-serif;"># 16 467 618 760 867</span><br><br><br><span style="font-family: 'courier new',courier,monaco,monospace,sans-serif;" data-mce-style="font-family: 'courier new',courier,monaco,monospace,sans-serif;">xy <- read.structure("Sub_batch_1.stru", NA.char="0",</span><br><span style="font-family: 'courier new',courier,monaco,monospace,sans-serif;" data-mce-style="font-family: 'courier new',courier,monaco,monospace,sans-serif;"> n.ind = 44, n.loc = 31, onerowperind = FALSE,</span><br><span style="font-family: 'courier new',courier,monaco,monospace,sans-serif;" data-mce-style="font-family: 'courier new',courier,monaco,monospace,sans-serif;"> col.lab = 1, col.pop = 2, row.marknames = 1,</span><br><span style="font-family: 'courier new',courier,monaco,monospace,sans-serif;" data-mce-style="font-family: 'courier new',courier,monaco,monospace,sans-serif;"> sep = "\t", col.others = 0)</span><br><br><span style="font-family: 'courier new',courier,monaco,monospace,sans-serif;" data-mce-style="font-family: 'courier new',courier,monaco,monospace,sans-serif;">xy <- tab(xy)</span><br><span style="font-family: 'courier new',courier,monaco,monospace,sans-serif;" data-mce-style="font-family: 'courier new',courier,monaco,monospace,sans-serif;">xy[grepl("C_KH1238", rownames(xy)), grepl("1378_53", colnames(xy))]</span><br></div><span class=""><br><div>Cheers,</div><div>Roman</div><br><hr id="m_-2885502866782724293zwchr"></span><div><b>From: </b>"Biz Sheedy" &lt;<a href="mailto:biz.sheedy@gmail.com" target="_blank">biz.sheedy@gmail.com</a>&gt;<br><b>To: </b>"Roman Luštrik" &lt;<a href="mailto:roman.lustrik@biolitika.si" target="_blank">roman.lustrik@biolitika.si</a>&gt;<br><b>Sent: </b>Monday, November 28, 2016 9:11:39 AM<br><b>Subject: </b>Re: [adegenet-forum] Discrepancy in NA counts<br></div><div><div class="h5"><br><div><div dir="ltr"><div>My apologies. 
This is my first time posting to a forum, so I am a little unsure of things. I have attached a subset of the data, which includes the locus that I saw had problems. <br><br>In this case there are 31 loci with 16 zeroes counted (Excel), and 20 NAs counted (adegenet). The additional NAs occur in locus 1401_25.<br><br></div><div>Thanks so much,<br></div><div>Elizabeth<br></div><div class="gmail_extra"><br><div class="gmail_quote">On 28 November 2016 at 16:31, Roman Luštrik <span dir="ltr">&lt;<a href="mailto:roman.lustrik@biolitika.si" target="_blank">roman.lustrik@biolitika.si</a>&gt;</span> wrote:<br><blockquote class="gmail_quote" style="margin: 0 0 0 .8ex; border-left: 1px #ccc solid; padding-left: 1ex;" data-mce-style="margin: 0 0 0 .8ex; border-left: 1px #ccc solid; padding-left: 1ex;"><div><div style="font-family: trebuchet ms,sans-serif; font-size: 12pt; color: #000000;" data-mce-style="font-family: trebuchet ms,sans-serif; font-size: 12pt; color: #000000;"><div>Hi,</div><br><div>Can you share a subset of the dataset? It's hard to pinpoint where things might be going wrong without some data in hand.</div><br><div>Cheers,</div><div>Roman</div><br><hr id="m_-2885502866782724293m_6274320645841491781zwchr"><div><b>From: </b>"Biz Sheedy" &lt;<a href="mailto:biz.sheedy@gmail.com" target="_blank">biz.sheedy@gmail.com</a>&gt;<br><b>To: </b><a href="mailto:adegenet-forum@lists.r-forge.r-project.org" target="_blank">adegenet-forum@lists.r-forge.r-project.org</a><br><b>Sent: </b>Friday, November 25, 2016 10:44:16 AM<br><b>Subject: </b>[adegenet-forum] Discrepancy in NA counts<br></div><br><div><div><div class="m_-2885502866782724293h5"><div dir="ltr"><div><div><div><div><div>Dear All,<br><br></div>I am trying to read SNP data from Stacks into adegenet. I have tried read.structure and read.genepop, but both give the same NA counts, which are higher than expected. 
Using read.table on the structure-formatted file (with "ind" and "pop" inserted into the first two columns of row one) gave the expected number of missing values. <br><br>I looked at a single population subset (both the original and the converted data) in Excel and found a locus where, in the original data, all nine individuals were "3", but in the converted data one individual was "NA". The loci before and after this one both matched the original data.<br><br>I am not sure what I have missed for this to happen; my R skills are beginner at best. Any help with reading the data in correctly would be greatly appreciated!<br></div><div><br>Thank you,<br></div><div>Elizabeth<br></div><div><br><br></div>R version 3.3.2<br>adegenet version 2.0.1<br><br></div><div>Data: 44 individuals, diploid, 4279 loci.<br></div><br>all <- read.structure("all_batch_1.stru", NA.char="0")<br><br>Total cells in Excel: 376552<br></div><div>After read.structure/genepop: 44*8558=376552<br></div><br><div>0s in Excel: 3952<br></div><div>0s after read.table; length(which(X==0)): 3952<br></div><div>NA after read.structure/genepop; sum(is.na(all$tab)): 4008<br></div><div>Difference: 56<br></div><br><div>Subset Chichi<br></div><div>Total cells: 77022<br></div><div>After read.structure/genepop: 9*8558=77022<br></div><br><div>0s in Excel: 742<br></div>NA after read.structure/genepop; sum(is.na(chi$tab)): 756<br></div>Difference: 14<br><div><div><div><span style="color: #009900; font-weight: bold;" data-mce-style="color: #009900; font-weight: bold;"></span><br><br><br><span style="color: #009900; font-weight: bold;" data-mce-style="color: #009900; font-weight: bold;"></span><div><div><div><div><div>-- <br><div class="m_-2885502866782724293m_6274320645841491781gmail_signature"><div dir="ltr"><div>4-1-1 Amakubo<br></div><div>Department of Botany<br></div><div>National Museum of Nature and Science<br></div><div>Tsukuba, Ibaraki 
305-0005<br></div><div>Japan<br><br></div><div><a href="mailto:biz.sheedy@gmail.com" target="_blank">biz.sheedy@gmail.com</a><br></div></div></div>
</div></div></div></div></div></div></div></div></div>
<br></div></div>_______________________________________________<br>adegenet-forum mailing list<br><a href="mailto:adegenet-forum@lists.r-forge.r-project.org" target="_blank">adegenet-forum@lists.r-forge.r-project.org</a><br><a href="https://lists.r-forge.r-project.org/cgi-bin/mailman/listinfo/adegenet-forum" target="_blank">https://lists.r-forge.r-project.org/cgi-bin/mailman/listinfo/adegenet-forum</a><br></div></div></div></blockquote></div>
</div></div><br></div></div></div></div></div></blockquote></div>
</div></div><br></div></div></body></html>