<html dir="ltr">
<head>
<meta http-equiv="Content-Type" content="text/html; charset=Windows-1252">
<style id="owaParaStyle" type="text/css">P {margin-top:0;margin-bottom:0;}</style>
</head>
<body ocsi="0" fpstyle="1" bgcolor="#ffffff">
<div style="direction: ltr;font-family: Tahoma;color: #000000;font-size: 10pt;">Hello,
<br>
<br>
df2genind does that annoying job for you. All you need to do is read your data into R as a data.frame with one column per locus, each genotype being a string of alleles separated by a given character (here "/"). For instance:<br>
####<br>
<span style="font-family: Courier New;">> dat = data.frame(loc1=c("80/80/78/60","60/60/60/60","78/80/80/82"), loc2=c("50/55/60/75","50/50/50/50","55/55/55/55"))</span><br style="font-family: Courier New;">
<span style="font-family: Courier New;">> dat</span><br style="font-family: Courier New;">
<span style="font-family: Courier New;"> loc1 loc2</span><br style="font-family: Courier New;">
<span style="font-family: Courier New;">1 80/80/78/60 50/55/60/75</span><br style="font-family: Courier New;">
<span style="font-family: Courier New;">2 60/60/60/60 50/50/50/50</span><br style="font-family: Courier New;">
<span style="font-family: Courier New;">3 78/80/80/82 55/55/55/55</span><br style="font-family: Courier New;">
<br style="font-family: Courier New;">
<span style="font-family: Courier New;">> x=df2genind(dat, sep="/", ploidy=4)</span><br>
> x<br>
<br>
#####################<br>
### Genind object ### <br>
#####################<br>
- genotypes of individuals - <br>
<br>
S4 class: genind<br>
@call: df2genind(X = dat, sep = "/", ploidy = 4)<br>
<br>
@tab: 3 x 8 matrix of genotypes<br>
<br>
@ind.names: vector of 3 individual names<br>
@loc.names: vector of 2 locus names<br>
@loc.nall: number of alleles per locus<br>
@loc.fac: locus factor for the 8 columns of @tab<br>
@all.names: list of 2 components yielding allele names for each locus<br>
@ploidy: 4<br>
@type: codom<br>
<br>
Optionnal contents: <br>
@pop: - empty -<br>
@pop.names: - empty -<br>
<br>
@other: - empty -<br>
<br style="font-family: Courier New;">
<span style="font-family: Courier New;">> truenames(x)</span><br style="font-family: Courier New;">
<span style="font-family: Courier New;"> loc1.60 loc1.78 loc1.80 loc1.82 loc2.50 loc2.55 loc2.60 loc2.75</span><br style="font-family: Courier New;">
<span style="font-family: Courier New;">1 0.25 0.25 0.5 0.00 0.25 0.25 0.25 0.25</span><br style="font-family: Courier New;">
<span style="font-family: Courier New;">2 1.00 0.00 0.0 0.00 1.00 0.00 0.00 0.00</span><br style="font-family: Courier New;">
<span style="font-family: Courier New;">3 0.00 0.25 0.5 0.25 0.00 1.00 0.00 0.00</span><br>
####<br>
<br>
You can then perform a PCA on truenames(x) or, better, on a centred/scaled version of this matrix obtained with scaleGen(x).<br>
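<br>
As a minimal sketch of that last step, assuming the ade4 package is available for dudi.pca (the settings scale=FALSE and nf=2 and the object names X and pca1 are only illustrative choices, not a recommendation for your data):<br>
####<br>
<span style="font-family: Courier New;">> library(ade4)</span><br style="font-family: Courier New;">
<span style="font-family: Courier New;">> library(adegenet)</span><br style="font-family: Courier New;">
<span style="font-family: Courier New;">> # centre the allele frequencies; no scaling here (illustrative choice)</span><br style="font-family: Courier New;">
<span style="font-family: Courier New;">> X = scaleGen(x, center=TRUE, scale=FALSE)</span><br style="font-family: Courier New;">
<span style="font-family: Courier New;">> # PCA on the already-centred matrix; scannf=FALSE skips the interactive screeplot, nf=2 keeps two axes</span><br style="font-family: Courier New;">
<span style="font-family: Courier New;">> pca1 = dudi.pca(X, center=FALSE, scale=FALSE, scannf=FALSE, nf=2)</span><br style="font-family: Courier New;">
<span style="font-family: Courier New;">> # coordinates of the individuals on the retained axes</span><br style="font-family: Courier New;">
<span style="font-family: Courier New;">> pca1$li</span><br>
####<br>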
<br>
Best<br>
<br>
Thibaut<br>
<div><br>
<div class="BodyFragment"><font size="2">
<div class="PlainText">-- <br>
######################################<br>
Dr Thibaut JOMBART<br>
MRC Centre for Outbreak Analysis and Modelling<br>
Department of Infectious Disease Epidemiology<br>
Imperial College - Faculty of Medicine<br>
St Marys Campus<br>
Norfolk Place<br>
London W2 1PG<br>
United Kingdom<br>
Tel. : 0044 (0)20 7594 3658<br>
t.jombart@imperial.ac.uk<br>
http://sites.google.com/site/thibautjombart/<br>
http://adegenet.r-forge.r-project.org/<br>
</div>
</font></div>
</div>
<div style="font-family: Times New Roman; color: rgb(0, 0, 0); font-size: 16px;">
<hr tabindex="-1">
<div style="direction: ltr;" id="divRpF608378"><font color="#000000" face="Tahoma" size="2"><b>From:</b> adegenet-forum-bounces@r-forge.wu-wien.ac.at [adegenet-forum-bounces@r-forge.wu-wien.ac.at] on behalf of AVIK RAY [avik.ray.kol@gmail.com]<br>
<b>Sent:</b> 09 May 2011 20:00<br>
<b>To:</b> adegenet-forum@r-forge.wu-wien.ac.at<br>
<b>Subject:</b> [adegenet-forum] PCA with tetraploid data<br>
</font><br>
</div>
<div></div>
<div>Dear Dr Jombart<br>
I want to do PCA and other analyses in adegenet. However, my data are a tetraploid dataset (204 individuals, 7 microsatellite loci), so they cannot be read using read.structure, as you mentioned in your earlier mails to Sarah Castillo (19/10/2010, RE: Looking for help with a PCA using adegenet in R).
<p class="MsoNormal">As far as I understand from the code, instead of coding each individual for each locus as in read.structure (diploid data), the idea is to get the allele frequency for each allele (whether present or absent) and then code each individual's genotype accordingly. However, I did not understand the last part of the code, e.g.</p>
<p class="MsoNormal" style="margin-bottom: 0.0001pt; line-height: normal;">
.</p>
<p class="MsoNormal" style="margin-bottom: 0.0001pt; line-height: normal;">$pop</p>
<p class="MsoNormal" style="margin-bottom: 0.0001pt; line-height: normal;">[1] ON ON ON ON ON ON ON ON</p>
<p class="MsoNormal">Levels: ON</p>
<p class="MsoNormal" style="margin-bottom: 0.0001pt; line-height: normal;">> <i>genind2df(x, sep="/")</i></p>
<p class="MsoNormal">pop gen</p>
<p class="MsoNormal">
</p>
<p class="MsoNormal">Moreover, it seems extremely cumbersome for large datasets like mine (204 individuals, 7 microsatellite loci). Can you give any suggestions?</p>
<p class="MsoNormal">Thanks</p>
<p class="MsoNormal">best regards</p>
<p class="MsoNormal">AVIK <br>
</p>
<p class="MsoNormal"> -- <br>
</p>
<pre class="moz-signature" cols="72">AVIK RAY
Visiting Fellow
National Center for Biological Sciences
Tata Institute of Fundamental Research
GKVK Campus
Bellary Road
Bangalore-560065
India
Ph 91-80-23666340
Fax 91-80-2363 6662
</pre>
</div>
</div>
</div>
</body>
</html>