<div dir="ltr">Thanks, this is pretty clear. I hope I will find the dataset I need! Please let me know if you come across it!<div><br></div><div>Davide</div></div><div class="gmail_extra"><br><div class="gmail_quote">On 6 October 2017 at 11:35, Thibaut Jombart <span dir="ltr">&lt;<a href="mailto:thibautjombart@gmail.com" target="_blank">thibautjombart@gmail.com</a>&gt;</span> wrote:<br><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">Hi again,<br>
<br>
OK I think I got it. So:<br>
- I can't remember how I built the eHGDP dataset, but it's an easy task<br>
- I don't know if the data you're looking for is publicly available<br>
- assuming you find it, there are two ways to get a genpop object:<br>
#1: from individual data with pop info: read data in (read.csv /<br>
read.table), use df2genind (be patient there, that'll take a while),<br>
then genind2genpop<br>
<br>
#2: from population data (allele counts): read data in (read.csv /<br>
read.table), use the genpop() constructor to make the data a genpop<br>
object; I think this is documented in the basics tutorial, but<br>
definitely also in ?genpop<br>
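<br>
A minimal sketch of route #1, for illustration only: the toy genotypes, the rsIDs and the population labels below are made up, and real HGDP data would first need to be reshaped into one row per individual, with a population label for each.<br>
<br>
```r
library(adegenet)

# Hypothetical toy data: 4 diploid individuals, 2 SNPs, alleles separated by "/"
geno <- data.frame(
  rs0001 = c("A/A", "A/G", "G/G", "A/G"),
  rs0002 = c("C/C", "C/T", "C/T", "T/T"),
  row.names = c("ind1", "ind2", "ind3", "ind4"),
  stringsAsFactors = FALSE
)

# Hypothetical population labels, one per individual (same order as the rows)
pop <- factor(c("popA", "popA", "popB", "popB"))

# Individual genotypes -> genind object
x <- df2genind(geno, sep = "/", ploidy = 2, pop = pop)

# genind -> genpop (allele counts per population)
gp <- genind2genpop(x, quiet = TRUE)

# Allele frequencies per population
freq <- makefreq(gp, quiet = TRUE)
print(freq)
```
<br>
The same calls should apply to a full SNP table, only more slowly. Route #2 would skip df2genind and pass a populations x alleles matrix of counts to genpop() instead.<br>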
<br>
HTH<br>
<span class="im HOEnZb">Best<br>
Thibaut<br>
<br>
--<br>
Dr Thibaut Jombart<br>
Lecturer, Department of Infectious Disease Epidemiology, Imperial College London<br>
Head of RECON: <a href="http://repidemicsconsortium.org" rel="noreferrer" target="_blank">repidemicsconsortium.org</a><br>
WHO Consultant - outbreak analysis<br>
<a href="http://sites.google.com/site/thibautjombart/" rel="noreferrer" target="_blank">sites.google.com/site/<wbr>thibautjombart/</a><br>
Twitter: @TeebzR<br>
<a href="tel:%2B44%280%2920%207594%203658" value="+442075943658">+44(0)20 7594 3658</a><br>
<br>
<br>
</span><div class="HOEnZb"><div class="h5">On 6 October 2017 at 10:24, Davide Piffer &lt;<a href="mailto:pifferdavide@gmail.com">pifferdavide@gmail.com</a>&gt; wrote:<br>
> Dear Thibaut,<br>
><br>
> thanks for answering my question. I will try to reformulate it,<br>
> stating my assumptions:<br>
> 1) I assume that the eHGDP object was made into a genpop object from some<br>
> raw .txt file, like the HGDP file I linked to in the previous email.<br>
> 2) I need an object that looks exactly like the eHGDP object, but with SNPs<br>
> instead of microsatellite alleles.<br>
> 3) Since it is going to be a rather complex task, I asked whether any of you<br>
> know if someone has already done this job and published it (e.g. as a<br>
> supplementary file).<br>
> 4) Otherwise, I would like to know how to produce such a file myself,<br>
> starting from a version of the HGDP file with population information. If<br>
> this was done for microsatellites, surely it can be done for the SNPs as<br>
> well? I assume they rely on the same raw HGDP file.<br>
><br>
> Many thanks!<br>
><br>
> Davide<br>
><br>
> On 6 October 2017 at 10:56, Thibaut Jombart &lt;<a href="mailto:thibautjombart@gmail.com">thibautjombart@gmail.com</a>&gt;<br>
> wrote:<br>
>><br>
>> Hi Davide,<br>
>><br>
>> I am not entirely sure what you need, so sorry if I miss the point.<br>
>> adegenet cannot make up for absent population information, but you can<br>
>> try to identify clusters of course, e.g. using find.clusters.<br>
>><br>
>> eHGDP is not a file (at least not in the sense you probably mean), but<br>
>> a genind object. If the question is how you can get a file looking<br>
>> like the one you link into a genind object, you probably want to use<br>
>> something like read.csv and then df2genind. Imports should be detailed<br>
>> in the basics tutorial:<br>
>> <a href="https://github.com/thibautjombart/adegenet/wiki/Tutorials" rel="noreferrer" target="_blank">https://github.com/<wbr>thibautjombart/adegenet/wiki/<wbr>Tutorials</a><br>
>><br>
>> Best<br>
>> Thibaut<br>
>><br>
>><br>
>> On 4 October 2017 at 14:08, Davide Piffer &lt;<a href="mailto:pifferdavide@gmail.com">pifferdavide@gmail.com</a>&gt; wrote:<br>
>> > Hello,<br>
>> ><br>
>> > I am new to adegenet. I would like to retrieve population frequencies of<br>
>> > SNPs (using rsID) from the HGDP file "HGDP_FinalReport_Forward.txt":<br>
>> > <a href="http://www.hagsc.org/hgdp/files.html" rel="noreferrer" target="_blank">http://www.hagsc.org/hgdp/<wbr>files.html</a><br>
>> ><br>
>> > However, the file lacks population information. It contains SNPs x<br>
>> > individuals.<br>
>> > I need a file structured like the eHGDP file provided with the package<br>
>> > (except with SNPs rather than microsatellite data), which can be easily<br>
>> > converted into a genpop file so that the frequencies can then be computed<br>
>> > via makefreq.<br>
>> > Do you know if any such file can be downloaded from the internet?<br>
>> > I guess there must be a way to produce such a file with adegenet, starting<br>
>> > from raw data, but my knowledge of this package is not advanced enough yet.<br>
>> ><br>
>> > Best wishes,<br>
>> ><br>
>> > Davide<br>
>> ><br>
>> > ______________________________<wbr>_________________<br>
>> > adegenet-forum mailing list<br>
>> > <a href="mailto:adegenet-forum@lists.r-forge.r-project.org">adegenet-forum@lists.r-forge.<wbr>r-project.org</a><br>
>> ><br>
>> > <a href="https://lists.r-forge.r-project.org/cgi-bin/mailman/listinfo/adegenet-forum" rel="noreferrer" target="_blank">https://lists.r-forge.r-<wbr>project.org/cgi-bin/mailman/<wbr>listinfo/adegenet-forum</a><br>
><br>
><br>
</div></div></blockquote></div><br></div>