<div dir="ltr">Hi,<div><br></div><div>I tried several different columns for the population (col.pop). The Readme file lists the columns below, but none of them works. I am not an expert, so I am not sure whether the problem lies with the file or with the function.</div><div><pre style="color:rgb(0,0,0);word-wrap:break-word;white-space:pre-wrap">Columns for individual data (HGDP/India/Africa individuals):

1. HGDP ID number or HapMap NA number

2. numeric code for population

3. name of population

4. country of origin</pre><pre style="color:rgb(0,0,0);word-wrap:break-word;white-space:pre-wrap"><br></pre></div></div><div class="gmail_extra"><br><div class="gmail_quote">On 11 October 2017 at 11:53, Thibaut Jombart <span dir="ltr"><<a href="mailto:thibautjombart@gmail.com" target="_blank">thibautjombart@gmail.com</a>></span> wrote:<br><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">Hi there,<br>

<br>

Reading population info hasn't been a problem before (I think) in<br>

read.structure. I would double-check which column it is, though I<br>

assume you have. If you think there is a problem with the function,<br>

please post an issue on GitHub with a reproducible example and we'll<br>

try to sort it out.<br>

<span class="im HOEnZb"><br>

Best<br>

Thibaut<br>

<br>

--<br>

Dr Thibaut Jombart<br>

Lecturer, Department of Infectious Disease Epidemiology, Imperial College London<br>

Head of RECON: <a href="http://repidemicsconsortium.org" rel="noreferrer" target="_blank">repidemicsconsortium.org</a><br>

WHO Consultant - outbreak analysis<br>

<a href="http://sites.google.com/site/thibautjombart/" rel="noreferrer" target="_blank">sites.google.com/site/<wbr>thibautjombart/</a><br>

Twitter: @TeebzR<br>

<a href="tel:%2B44%280%2920%207594%203658" value="+442075943658">+44(0)20 7594 3658</a><br>

<br>

<br>

</span><div class="HOEnZb"><div class="h5">On 6 October 2017 at 12:08, Davide Piffer <<a href="mailto:pifferdavide@gmail.com">pifferdavide@gmail.com</a>> wrote:<br>

> OK, I think I have found the file I need here:<br>

> <a href="https://rosenberglab.stanford.edu/data/huangEtAl2011/HuangEtAl_2011-GenetEpi.zip" rel="noreferrer" target="_blank">https://rosenberglab.stanford.<wbr>edu/data/huangEtAl2011/<wbr>HuangEtAl_2011-GenetEpi.zip</a><br>

> However, it's in .str format. Following the instructions in the manual, I<br>

> tried to assign correct labels based on the Readme file<br>

> (<a href="https://rosenberglab.stanford.edu/data/huangEtAl2011/huangEtAl2011snpdata_readme" rel="noreferrer" target="_blank">https://rosenberglab.<wbr>stanford.edu/data/<wbr>huangEtAl2011/<wbr>huangEtAl2011snpdata_readme</a>)<br>

><br>

> Mydata = read.structure("unphased_HGDP+India+Africa_2810SNPs-regions1to36.stru",<br>

> onerowperind = FALSE, col.lab = 8, col.pop = 2, row.marknames = 1, n.ind = 1107,<br>

> n.loc = 2810, ask = FALSE) # convert into genind<br>

> Mydata_pop = genind2genpop(Mydata) # convert into genpop<br>

><br>

> However, I get a file with only 1 population.<br>

><br>

> head(Mydata_pop)<br>

> /// GENPOP OBJECT /////////<br>

><br>

>  // 1 population; 2,810 loci; 7,217 alleles; size: 1.5 Mb<br>

><br>

>  // Basic content<br>

>    @tab:  1 x 7217 matrix of allele counts<br>

>    @loc.n.all: number of alleles per locus (range: 2-4)<br>

>    @loc.fac: locus factor for the 7217 columns of @tab<br>

>    @all.names: list of allele names for each locus<br>

>    @ploidy: ploidy of each individual  (range: 2-2)<br>

>    @type:  codom<br>

>    @call: .local(x = x, i = i, j = j, drop = dro<br>

><br>

> This is obviously wrong since there are 50+ populations.<br>

><br>

> I tried changing col.pop from 2 to 3 but got the same output.<br>
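One thing that may be worth ruling out, offered as a guess rather than a diagnosis: the @call line in the output above is a subsetting call, so the summary printed by head(Mydata_pop) may describe a subset of the object rather than the whole genpop. The sketch below checks the population information directly instead. The toy genotypes and population labels are invented stand-ins for the object returned by read.structure, so this is untested against the actual file.

```r
library(adegenet)

# Toy stand-in for Mydata (all values invented); in practice this would be
# the genind object returned by read.structure().
toy_geno <- data.frame(
  rs1 = c("A/A", "A/G", "G/G", "A/G"),
  rs2 = c("C/T", "C/C", "T/T", "C/T"),
  row.names = c("ind1", "ind2", "ind3", "ind4"),
  stringsAsFactors = FALSE
)
Mydata <- df2genind(toy_geno, sep = "/", ploidy = 2,
                    pop = c("popA", "popA", "popB", "popB"))

# Check the population factor on the genind object first.
print(nPop(Mydata))        # number of populations
print(table(pop(Mydata)))  # individuals per population

# Then print the genpop object itself rather than head() of it.
Mydata_pop <- genind2genpop(Mydata, quiet = TRUE)
print(Mydata_pop)
print(popNames(Mydata_pop))
```

If nPop(Mydata) already reports a single population, the problem would lie in how the file was read; if it reports the expected number, the genpop conversion is probably fine and only the printing step is misleading.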

><br>

> Am I missing something?<br>

><br>

><br>

> All the best,<br>

> Davide<br>

><br>

><br>

><br>

> On 6 October 2017 at 11:35, Thibaut Jombart <<a href="mailto:thibautjombart@gmail.com">thibautjombart@gmail.com</a>><br>

> wrote:<br>

>><br>

>> Hi again,<br>

>><br>

>> OK I think I got it. So:<br>

>> - I can't remember how I built the eHGDP dataset, but it's an easy task<br>

>> - I don't know if the data you're looking for is publicly available<br>

>> - assuming you find it, there are two ways to get a genpop object:<br>

>> #1: from individual data with pop info: read data in (read.csv /<br>

>> read.table), use df2genind (be patient there, that'll take a while),<br>

>> then genind2genpop<br>

>><br>

>> #2: from population data (allele counts): read data in (read.csv /<br>

>> read.table), use the genpop() constructor to make the data a genpop<br>

>> object; I think this is documented in the basics tutorial, but<br>

>> definitely also in ?genpop<br>
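The two routes listed above could be sketched in R as follows. This is a minimal illustration on made-up toy data, not a description of how eHGDP was built: the genotype table, the locus names (rs1, rs2), the individual labels and the population labels are all assumptions. Route 1 goes from individual genotypes through df2genind and genind2genpop; route 2 passes a matrix of allele counts to the genpop() constructor.

```r
library(adegenet)

# Toy individual-level data (all values invented for illustration):
# 4 diploid individuals, 2 SNP loci, alleles separated by "/".
geno <- data.frame(
  rs1 = c("A/A", "A/G", "G/G", "A/G"),
  rs2 = c("C/T", "C/C", "T/T", "C/T"),
  row.names = c("ind1", "ind2", "ind3", "ind4"),
  stringsAsFactors = FALSE
)
pops <- c("popA", "popA", "popB", "popB")

# Route 1: individual genotypes -> genind -> genpop
gi <- df2genind(geno, sep = "/", ploidy = 2, pop = pops)
gp <- genind2genpop(gi, quiet = TRUE)

# Route 2: allele counts per population -> genpop
# Columns must be named "locus.allele"; here the counts are taken from
# route 1, but they could equally come from read.csv / read.table.
counts <- tab(gp)
gp2 <- genpop(counts)

# Allele frequencies per population
freqs <- makefreq(gp, quiet = TRUE)
print(counts)
print(freqs)
```

With real data the df2genind step is the slow one, as noted above, so it may be worth saving the resulting genind object before converting it.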

>><br>

>> HTH<br>

>> Best<br>

>> Thibaut<br>

>><br>


>><br>

>><br>

>> On 6 October 2017 at 10:24, Davide Piffer <<a href="mailto:pifferdavide@gmail.com">pifferdavide@gmail.com</a>> wrote:<br>

>> > Dear Thibaut,<br>

>> ><br>

>> > thanks for answering my question. I will try to reformulate my question<br>

>> > differently, stating the assumptions:<br>

>> > 1)  I assume that the eHGDP object was made into a genpop object from<br>

>> > some<br>

>> > raw .txt file, like the HGDP file I linked to in the previous email.<br>

>> > 2) I need an object that looks exactly like the eHGDP object, but with<br>

>> > SNPs<br>

>> > instead of microsatellite alleles.<br>

>> > 3) Since it's gonna be a rather complex task, I asked if any of you<br>

>> > knows if<br>

>> > someone has already done this job before and published it (e.g. as<br>

>> > supplementary file).<br>

>> > 4) Otherwise, I would like to know how to produce such a file myself,<br>

>> > starting from a version of the HGDP file with population information. If<br>

>> > this was done for microsatellites, surely it can be done for the SNPs as<br>

>> > well? I assume they rely on the same raw HGDP file.<br>

>> ><br>

>> > Many thanks!<br>

>> ><br>

>> > Davide<br>

>> ><br>

>> > On 6 October 2017 at 10:56, Thibaut Jombart <<a href="mailto:thibautjombart@gmail.com">thibautjombart@gmail.com</a>><br>

>> > wrote:<br>

>> >><br>

>> >> Hi Davide,<br>

>> >><br>

>> >> I am not entirely sure what you need, so sorry if I miss the point.<br>

>> >> adegenet cannot make up for absent population information, but you can<br>

>> >> try to identify clusters of course, e.g. using find.clusters.<br>

>> >><br>

>> >> eHGDP is not a file (at least not in the sense you probably mean), but<br>

>> >> a genind object. If the question is how you can get a file looking<br>

>> >> like the one you link into a genind object, you probably want to use<br>

>> >> something like read.csv and then df2genind. Imports should be detailed<br>

>> >> in the basics tutorial:<br>

>> >> <a href="https://github.com/thibautjombart/adegenet/wiki/Tutorials" rel="noreferrer" target="_blank">https://github.com/<wbr>thibautjombart/adegenet/wiki/<wbr>Tutorials</a><br>

>> >><br>

>> >> Best<br>

>> >> Thibaut<br>

>> >><br>


>> >><br>

>> >><br>

>> >> On 4 October 2017 at 14:08, Davide Piffer <<a href="mailto:pifferdavide@gmail.com">pifferdavide@gmail.com</a>><br>

>> >> wrote:<br>

>> >> > Hello,<br>

>> >> ><br>

>> >> > I am new to Adegenet. I would like to retrieve population frequencies<br>

>> >> > of<br>

>> >> > SNPs (using rsID) from the HGDP file "HGDP_FinalReport_Forward.txt" :<br>

>> >> > <a href="http://www.hagsc.org/hgdp/files.html" rel="noreferrer" target="_blank">http://www.hagsc.org/hgdp/<wbr>files.html</a><br>

>> >> ><br>

>> >> > However, the file lacks population information. It contains SNPs x<br>

>> >> > individuals.<br>

>> >> > I need a file structured like the eHGDP file provided with the package<br>

>> >> > (but with SNPs instead of microsatellite data) that can be easily<br>

>> >> > converted into a genpop object, from which I can compute the frequencies via<br>

>> >> > makefreq.<br>

>> >> > Do you know if there is any such file downloadable on the internet?<br>

>> >> > I guess there must be a way to produce such a file using adegenet<br>

>> >> > starting<br>

>> >> > from raw data, but my knowledge of this package is not advanced<br>

>> >> > enough<br>

>> >> > yet.<br>

>> >> ><br>

>> >> > Best wishes,<br>

>> >> ><br>

>> >> > Davide<br>

>> >> ><br>

>> >> > ______________________________<wbr>_________________<br>

>> >> > adegenet-forum mailing list<br>

>> >> > <a href="mailto:adegenet-forum@lists.r-forge.r-project.org">adegenet-forum@lists.r-forge.<wbr>r-project.org</a><br>

>> >> ><br>

>> >> ><br>

>> >> > <a href="https://lists.r-forge.r-project.org/cgi-bin/mailman/listinfo/adegenet-forum" rel="noreferrer" target="_blank">https://lists.r-forge.r-<wbr>project.org/cgi-bin/mailman/<wbr>listinfo/adegenet-forum</a><br>

>> ><br>

>> ><br>

><br>

><br>

</div></div></blockquote></div><br></div>
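One way to double-check which column holds the population labels, as suggested in the quoted reply, is to read the leading columns of the file as plain text and count the distinct values in each. The sketch below uses a small invented file written to a temporary path. The layout (a first row of marker names, then two rows per individual with metadata columns followed by genotypes) and all values are assumptions for illustration, not the contents of the Huang et al. file.

```r
# Write a tiny stand-in file (invented content) to a temporary path.
toy <- c(
  "rs1 rs2",
  "ID001 1 PopOne CountryX A C",
  "ID001 1 PopOne CountryX G C",
  "ID002 1 PopOne CountryX A T",
  "ID002 1 PopOne CountryX A T",
  "ID003 2 PopTwo CountryY G T",
  "ID003 2 PopTwo CountryY G C"
)
path <- tempfile(fileext = ".stru")
writeLines(toy, path)

# Skip the marker-name row and read everything as character.
raw <- read.table(path, skip = 1, header = FALSE,
                  colClasses = "character", stringsAsFactors = FALSE)

# Number of distinct values per column: an ID column varies per individual,
# while a population column has as many values as there are populations.
n_distinct <- sapply(raw, function(x) length(unique(x)))
print(n_distinct)

# Tabulate a candidate population column (column 2 in this toy file).
print(table(raw[[2]]))
```

On the real file, the column whose number of distinct values matches the expected number of populations would be the natural candidate for col.pop.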