<div dir="ltr">Awesome. Thanks for the report, and many thanks to Roman for solving the issue! <div><br></div><div>Best</div><div><br>Thibaut</div></div><div class="gmail_extra"><br clear="all"><div><div class="gmail_signature" data-smartmail="gmail_signature"><div dir="ltr"><div><div dir="ltr"><div><div dir="ltr"><div><div dir="ltr"><div><div dir="ltr"><div dir="ltr"><div dir="ltr"><div dir="ltr"><div dir="ltr"><div dir="ltr"><div dir="ltr"><div><div><div><br>--<br>Dr Thibaut Jombart<br>Lecturer, Department of Infectious Disease Epidemiology, <span style="font-size:12.8px">Imperial College London</span></div></div><div><span style="font-size:12.8px">Head of RECON: </span><span style="font-size:12.8px"><a href="http://repidemicsconsortium.org" target="_blank">repidemicsconsortium.org</a></span><br></div></div><div><a href="http://sites.google.com/site/thibautjombart/" style="font-size:12.8px" target="_blank">sites.google.com/site/thibautjombart/</a><br></div><div><a href="http://github.com/thibautjombart" target="_blank">github.com/thibautjombart</a></div>Twitter: <a href="http://twitter.com/TeebzR" target="_blank">@TeebzR</a><br></div></div></div></div></div></div></div></div></div></div></div></div></div></div></div></div></div>

<br><div class="gmail_quote">On 7 December 2016 at 01:00, Biz Sheedy <span dir="ltr"><<a href="mailto:biz.sheedy@gmail.com" target="_blank">biz.sheedy@gmail.com</a>></span> wrote:<br><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><div dir="ltr"><div><div><div>Hi Thibaut and Roman,<br><br></div>Yes, the fix has solved the issue for me. Thanks so much to you both!<br><br>(Sorry for the delayed response, I couldn't get the devel version at work.)<br><br></div>Cheers,<br></div>Elizabeth<br><div><div><div><div><br></div></div></div></div></div><div class="gmail_extra"><div><div class="h5"><br><div class="gmail_quote">On 5 December 2016 at 21:10, Thibaut Jombart <span dir="ltr"><<a href="mailto:thibautjombart@gmail.com" target="_blank">thibautjombart@gmail.com</a>></span> wrote:<br><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">Hello,<br>

<br>

Roman has fixed this bug in the current devel version of adegenet. For guidelines on installing it, see:<br>

<a href="https://github.com/thibautjombart/adegenet" rel="noreferrer" target="_blank">https://github.com/thibautjombart/adegenet</a><br>

<br>

Can you confirm it solves your issue?<br>
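<br>
A minimal sketch of installing the devel version from the repository linked above, assuming the devtools package and network access are available. The helper name install_adegenet_devel is only illustrative; the repository README remains the authoritative guide.<br>
<pre>
```r
# Hypothetical helper: installs the development version of adegenet from GitHub.
# Assumes network access; installs devtools first if it is missing.
install_adegenet_devel <- function() {
  if (!requireNamespace("devtools", quietly = TRUE)) {
    install.packages("devtools")
  }
  devtools::install_github("thibautjombart/adegenet")
}

# install_adegenet_devel()      # uncomment to install
# packageVersion("adegenet")    # then check which version is installed
```
</pre>
Restarting R after installing is a sensible precaution, so that the new version is the one that gets loaded.<br>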

<br>

Best<br>

Thibaut<br>

<br>


<div class="m_-6756316008542754742HOEnZb"><div class="m_-6756316008542754742h5"><br>

<br>

On 28 November 2016 at 12:40, Roman Luštrik &lt;<a href="mailto:roman.lustrik@biolitika.si" target="_blank">roman.lustrik@biolitika.si</a>&gt; wrote:<br>

> Hi Elizabeth,<br>

><br>

> it looks like the code mishandles locus names that are numeric. This has<br>

> happened before in another function. Until we fix this, you can rename<br>

> your loci so that their names start with a letter.<br>
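><br>
> One way to apply this workaround is sketched below in base R. The helper<br>
> prefix_numeric_loci and the prefix "L" are hypothetical choices, and the<br>
> locus names are the ones from the excerpt in this thread.<br>
<pre>
```r
# Hypothetical helper: prefix locus names that start with a digit so that
# every name begins with a letter. Other names are left unchanged.
prefix_numeric_loci <- function(loc_names, prefix = "L") {
  starts_with_digit <- grepl("^[0-9]", loc_names)
  loc_names[starts_with_digit] <- paste0(prefix, loc_names[starts_with_digit])
  loc_names
}

loci <- c("1401_25", "1403_13", "1404_17")
renamed <- prefix_numeric_loci(loci)
print(renamed)
```
</pre>
> Since the problem shows up when the file is read, the renamed vector would<br>
> need to go into the marker-name row of the input file before importing.<br>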

><br>

> Here is the excerpt from the genind object indicating that these two samples<br>

> have alleles 33:<br>

><br>

<pre>
>          X1401_25.13 X1401_25.33 X1403_13.11 X1403_13.13 X1403_13.33 X1404_17.13 X1404_17.33 X1404_17.11
> C_KH1059           0           1           1           0           0           0           1           0
> M_KH1834           0           1           1           0           0           1           0           0
</pre>

><br>

><br>

> Cheers,<br>

> Roman<br>

><br>

><br>

> ----<br>

> In god we trust, all others bring data.<br>

><br>

> ________________________________<br>

> From: "Biz Sheedy" &lt;<a href="mailto:biz.sheedy@gmail.com" target="_blank">biz.sheedy@gmail.com</a>&gt;<br>

> To: "Roman Luštrik" &lt;<a href="mailto:roman.lustrik@biolitika.si" target="_blank">roman.lustrik@biolitika.si</a>&gt;<br>

> Cc: <a href="mailto:adegenet-forum@lists.r-forge.r-project.org" target="_blank">adegenet-forum@lists.r-forge.r-project.org</a><br>

> Sent: Monday, November 28, 2016 11:00:53 AM<br>

><br>

> Subject: Re: [adegenet-forum] Discrepancy in NA counts<br>

><br>

> Thanks for looking into this.<br>

><br>

> One thing I did differently from the code you provided was that I only<br>

> answered the prompts of the read.structure function. This meant I did not<br>

> use sep="\t", and the number of alleles was 62 instead of 72, which I think<br>

> should be comparable to the Excel count. Following the code you provided,<br>

> is.na finds 23 NAs (instead of 20 NAs at 62 alleles, and 16 zeroes in<br>

> Excel).<br>

><br>

> Your explanation makes sense to me for the additional three NAs in adegenet,<br>

> but I still don't understand how in locus 1401_25 the data for two<br>

> individuals (C_KH1059 and M_KH1834) changed from being homozygous for "3" to<br>

> being "NA"?<br>

><br>

> I would really appreciate any further help on this.<br>

><br>

> Thanks again,<br>

> Elizabeth<br>

><br>

><br>

> On 28 November 2016 at 18:03, Roman Luštrik &lt;<a href="mailto:roman.lustrik@biolitika.si" target="_blank">roman.lustrik@biolitika.si</a>&gt; wrote:<br>

>><br>

>> Hi,<br>

>><br>

>> I think the problem is that adegenet, for consistency, adds NAs to<br>

>> accommodate the extra alleles present for a particular locus. Take for<br>

>> example C_KH1238 (bottom row in the example pasted below).<br>

>> In the raw file, it has missing values for locus 1378_53, but this locus has<br>

>> three alleles, ergo 3 NAs and not 2. I can't go through all the NAs right now,<br>

>> but I think there's a pretty good chance this is what is causing the<br>

>> discrepancy between what you see in Excel and in adegenet.<br>

>><br>

<pre>
>>          1369_41.11 1372_14.22 1372_14.24 1373_9.44 1373_9.24 1377_42.44 1377_42.24 1378_53.22 1378_53.24 1378_53.44 1379_10.33 1379_10.13 1382_37.33
>> ...
>> C_KH1238          0          1          0         1         0          1          0         NA         NA         NA          1          0          1
</pre>
>> # Notice the 3 NAs, one for each allele available at 1378_53, not just two (as expected for a diploid).<br>
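>><br>
>> The effect can be reproduced on a toy table. This is a sketch in base R only:<br>
>> the individuals (ind_A, ind_B), loci (L1, L2) and values are made up, and the<br>
>> table merely mimics the locus.allele column layout shown in this thread.<br>
<pre>
```r
# Toy allele-count table: one column per locus.allele, one row per individual.
# ind_A has a missing genotype at L2, a locus with three alleles.
tab <- matrix(
  c(1, 0, NA, NA, NA,
    0, 1,  1,  0,  0),
  nrow = 2, byrow = TRUE,
  dimnames = list(
    c("ind_A", "ind_B"),
    c("L1.2", "L1.4", "L2.2", "L2.4", "L2.5")
  )
)

# Locus of each column: drop the ".allele" suffix.
locus <- sub("\\.[^.]*$", "", colnames(tab))

# A genotype counts as missing when all allele columns of its locus are NA.
missing_genotype <- sapply(unique(locus), function(l) {
  rowSums(is.na(tab[, locus == l, drop = FALSE])) == sum(locus == l)
})

na_cells          <- sum(is.na(tab))        # NA cells in the allele table
missing_genotypes <- sum(missing_genotype)  # missing genotypes (locus level)
zeros_in_raw_file <- 2 * missing_genotypes  # diploid: two 0s per missing genotype
```
</pre>
>> Counting missing genotypes per locus, rather than NA cells, gives a number<br>
>> that can be compared directly with the zeroes in the raw file.<br>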

>><br>

>><br>

>> Here is the code I used to explore this:<br>

>><br>

>> library(adegenet)<br>

>><br>

>> xy <- read.table("Sub_batch_1.stru", header = TRUE, sep = "\t")<br>

>> xy <- xy[, c(-1, -2)]<br>

>> table(as.matrix(xy))<br>

>><br>

>> # 0 1 2 3 4<br>

>> # 16 467 618 760 867<br>

>><br>

>><br>

>> xy <- read.structure("Sub_batch_1.stru", NA.char="0",<br>

>> n.ind = 44, n.loc = 31, onerowperind = FALSE,<br>

>> col.lab = 1, col.pop = 2, row.marknames = 1,<br>

>> sep = "\t", col.others = 0)<br>

>><br>

>> xy <- tab(xy)<br>

>> xy[grepl("C_KH1238", rownames(xy)), grepl("1378_53", colnames(xy))]<br>

>><br>

>> Cheers,<br>

>> Roman<br>

>><br>


>><br>

>> ________________________________<br>

>> From: "Biz Sheedy" &lt;<a href="mailto:biz.sheedy@gmail.com" target="_blank">biz.sheedy@gmail.com</a>&gt;<br>

>> To: "Roman Luštrik" &lt;<a href="mailto:roman.lustrik@biolitika.si" target="_blank">roman.lustrik@biolitika.si</a>&gt;<br>

>> Sent: Monday, November 28, 2016 9:11:39 AM<br>

>> Subject: Re: [adegenet-forum] Discrepancy in NA counts<br>

>><br>

>> My apologies. First time posting to a forum so I am a little unsure of<br>

>> things. I have attached a subset of the data, which includes the locus that<br>

>> I saw had problems.<br>

>><br>

>> In this case there are 31 loci with 16 zeroes counted (excel), and 20 NAs<br>

>> counted (adegenet). The additional NAs occur in locus 1401_25.<br>

>><br>

>> Thanks so much,<br>

>> Elizabeth<br>

>><br>

>> On 28 November 2016 at 16:31, Roman Luštrik &lt;<a href="mailto:roman.lustrik@biolitika.si" target="_blank">roman.lustrik@biolitika.si</a>&gt; wrote:<br>

>>><br>

>>> Hi,<br>

>>><br>

>>> can you share a (subset) of the dataset? It's hard to pinpoint where<br>

>>> things might be going wrong without some data in hand.<br>

>>><br>

>>> Cheers,<br>

>>> Roman<br>

>>><br>


>>><br>

>>> ________________________________<br>

>>> From: "Biz Sheedy" &lt;<a href="mailto:biz.sheedy@gmail.com" target="_blank">biz.sheedy@gmail.com</a>&gt;<br>

>>> To: <a href="mailto:adegenet-forum@lists.r-forge.r-project.org" target="_blank">adegenet-forum@lists.r-forge.r-project.org</a><br>

>>> Sent: Friday, November 25, 2016 10:44:16 AM<br>

>>> Subject: [adegenet-forum] Discrepancy in NA counts<br>

>>><br>

>>> Dear All,<br>

>>><br>

>>> I am trying to read SNP data from Stacks into adegenet. I have tried<br>

>>> read.structure and read.genepop but they both give (the same) NA counts that<br>

>>> are higher than expected. Using read.table on the structure-formatted file<br>

>>> (with "ind" and "pop" inserted into the first two columns of row one) gave<br>

>>> the expected number of missing data.<br>

>>><br>

>>> I looked at a single population subset (both the original and the<br>

>>> converted data) in excel and found a locus where in the original data, all<br>

>>> nine individuals were "3", but in the converted data one individual was<br>

>>> "NA". The loci before and after this one both matched/were correct.<br>

>>><br>

>>> I am not sure what I have missed for this to happen; my R skills are<br>

>>> beginner at best. Any help with reading the data in correctly would be<br>

>>> greatly appreciated!<br>

>>><br>

>>> Thank you,<br>

>>> Elizabeth<br>

>>><br>

>>><br>

>>> R version 3.3.2<br>

>>> adegenet version 2.0.1<br>

>>><br>

>>> Data: 44 individuals, diploid, 4279 loci.<br>

>>><br>

>>> all <- read.structure("all_batch_1.stru", NA.char = "0")<br>

>>><br>

>>> Total cells in excel: 376552<br>

>>> After read.structure/genepop: 44*8558=376552<br>

>>><br>

>>> 0s in excel: 3952<br>

>>> 0s after read.table; length(which(X==0)): 3952<br>

>>> NA after read.structure/genepop; sum(is.na(all$tab)): 4008<br>

>>> Difference: 56<br>

>>><br>

>>> Subset Chichi<br>

>>> Total cells: 77022<br>

>>> After read.structure/genepop: 9*8558=77022<br>

>>><br>

>>> 0s in excel: 742<br>

>>> NA after read.structure/genepop; sum(is.na(chi$tab)): 756<br>

>>> Difference: 14<br>
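>>><br>
>>> The figures above can be re-derived with a few lines of R. This sketch only<br>
>>> restates the reported numbers; reading 8558 as 2 x 4279 (two allele calls per<br>
>>> locus) is an assumption, and the variable names are illustrative.<br>
<pre>
```r
# Figures copied from the report above.
n_ind  <- 44      # individuals in the full dataset
n_loci <- 4279    # loci
ploidy <- 2       # diploid: assumed source of the 8558 columns

cells_all       <- n_ind * n_loci * ploidy  # total cells, full dataset
extra_na_all    <- 4008 - 3952              # NAs beyond the raw zeroes

cells_chichi    <- 9 * n_loci * ploidy      # total cells, Chichi subset
extra_na_chichi <- 756 - 742                # NAs beyond the raw zeroes

print(c(cells_all, extra_na_all, cells_chichi, extra_na_chichi))
```
</pre>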

>>><br>

>>><br>

>>><br>

>>> --<br>

>>> 4-1-1 Amakubo<br>

>>> Department of Botany<br>

>>> National Museum of Nature and Science<br>

>>> Tsukuba, Ibaraki 305-0005<br>

>>> Japan<br>

>>><br>

>>> <a href="mailto:biz.sheedy@gmail.com" target="_blank">biz.sheedy@gmail.com</a><br>

>>><br>

>>> ______________________________<wbr>_________________<br>

>>> adegenet-forum mailing list<br>

>>> <a href="mailto:adegenet-forum@lists.r-forge.r-project.org" target="_blank">adegenet-forum@lists.r-forge.r<wbr>-project.org</a><br>

>>><br>

>>> <a href="https://lists.r-forge.r-project.org/cgi-bin/mailman/listinfo/adegenet-forum" rel="noreferrer" target="_blank">https://lists.r-forge.r-projec<wbr>t.org/cgi-bin/mailman/listinfo<wbr>/adegenet-forum</a><br>


</div></div></blockquote></div><br><br clear="all"><br></div></div><span class="HOEnZb"><font color="#888888">-- <br></font></span><div class="m_-6756316008542754742gmail_signature" data-smartmail="gmail_signature"><div dir="ltr"><span class="HOEnZb"><font color="#888888"><div>Dr Elizabeth Sheedy<br></div>JSPS Postdoctoral fellow</font></span><span class=""><br><div><br></div><div>4-1-1 Amakubo<br></div><div>Department of Botany<br></div><div>National Museum of Nature and Science<br></div><div>Tsukuba, Ibaraki 305-0005<br></div><div>Japan<br><br></div><div><a href="mailto:biz.sheedy@gmail.com" target="_blank">biz.sheedy@gmail.com</a><br></div></span></div></div>

</div>

</blockquote></div><br></div>