<html dir="ltr">
<head>
<meta http-equiv="Content-Type" content="text/html; charset=iso-8859-1">
<style id="owaParaStyle" type="text/css">P {margin-top:0;margin-bottom:0;}</style>
</head>
<body ocsi="0" fpstyle="1">
<div style="direction: ltr;font-family: Tahoma;color: #000000;font-size: 10pt;">Dear Ella,
<br>
<br>
There is no one-size-fits-all answer to this question, but some general ideas may be useful.<br>
<br>
Missing data should ideally be i) not too numerous and ii) randomly distributed in the dataset. In a situation like yours, individuals are more precious than markers, so I would discard loci with a majority of NAs, and briefly check the structure of the remaining
missing entries.<br>
<br>
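As a rough illustration of that first step, here is a minimal sketch in plain Python (not adegenet code) that computes the proportion of NAs per locus and drops the loci where more than half of the calls are missing. The toy matrix, the function names and the 0.5 threshold are assumptions made for the example only.<br>
<pre>
```python
# Toy genotype matrix: rows = individuals, columns = loci.
# Entries are allele counts (0, 1, 2); None marks a missing call (NA).
genotypes = [
    [0, 1, None, 2],
    [1, None, None, 2],
    [2, 1, None, None],
    [0, 0, 1, 2],
]


def na_fraction_per_locus(matrix):
    """Return the proportion of missing calls in each column (locus)."""
    n_ind = len(matrix)
    n_loci = len(matrix[0])
    return [
        sum(row[j] is None for row in matrix) / n_ind
        for j in range(n_loci)
    ]


def drop_loci_with_majority_na(matrix, max_na=0.5):
    """Keep only the loci whose NA proportion does not exceed max_na."""
    fractions = na_fraction_per_locus(matrix)
    kept = [j for j, f in enumerate(fractions) if max_na >= f]
    filtered = [[row[j] for j in kept] for row in matrix]
    return filtered, kept


filtered, kept_loci = drop_loci_with_majority_na(genotypes)
print(na_fraction_per_locus(genotypes))
print(kept_loci)
```
</pre>
On this toy matrix only the third locus (75% NAs) is dropped; the right threshold for real data remains a judgment call.<br>
<br>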
NAs are basically replaced by the mean allele frequency. This means individuals with NAs will tend to be placed closer to the origin. Also, individuals with similar patterns of NAs will be seen as more similar than they probably are in reality.
<br>
<br>
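To make the "closer to the origin" effect concrete, here is a small sketch in plain Python. It is only an illustration of mean replacement followed by centring, not what adegenet runs internally, and the toy data and function names are made up for the example.<br>
<pre>
```python
def impute_and_center(matrix):
    """Replace each None by its locus mean, then subtract that mean."""
    n_loci = len(matrix[0])
    centered = [[0.0] * n_loci for _ in matrix]
    for j in range(n_loci):
        # The locus mean is computed from the observed calls only.
        observed = [row[j] for row in matrix if row[j] is not None]
        mean_j = sum(observed) / len(observed)
        for i, row in enumerate(matrix):
            value = mean_j if row[j] is None else row[j]
            centered[i][j] = value - mean_j
    return centered


def squared_norm(vector):
    """Squared Euclidean distance from the origin."""
    return sum(v * v for v in vector)


# Individuals 3 and 4 share the same genotype, but individual 4
# has lost its calls at the last two loci.
toy = [
    [0, 0, 0, 0],
    [0, 0, 0, 0],
    [2, 2, 2, 2],
    [2, 2, None, None],
]

centered = impute_and_center(toy)
print(squared_norm(centered[2]), squared_norm(centered[3]))
```
</pre>
The individual with two NAs gets zeros at those loci after centring, so its squared distance from the origin is smaller than that of its complete twin.<br>
<br>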
If you really have a big missing value problem, and a lot of NAs you cannot discard, one possibility would be to build a matrix of 1s and 0s where '1' indicates an NA, and do the PCA of this. If you obtain a structure, then this is a sign of a problem: your NAs are not
randomly distributed.<br>
<br>
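A minimal sketch of this check, again in plain Python rather than adegenet: it builds the 0/1 missingness matrix, centres it, and extracts the first principal axis by power iteration. The toy data, the function names and the choice of power iteration are assumptions for illustration; any PCA routine could be used on the indicator matrix instead.<br>
<pre>
```python
import math

# Rows = individuals, columns = loci; None marks an NA.
toy = [
    [None, None, 1, 0],
    [None, None, 2, 1],
    [None, None, 0, 2],
    [0, 1, 1, 0],
    [2, 0, 2, 1],
    [1, 2, 0, None],
]


def missingness_matrix(matrix):
    """1 where the call is missing (None), 0 where it was observed."""
    return [[1 if value is None else 0 for value in row] for row in matrix]


def center_columns(matrix):
    """Subtract each column mean so that the PCA is centred."""
    n_rows = len(matrix)
    means = [sum(column) / n_rows for column in zip(*matrix)]
    return [[value - m for value, m in zip(row, means)] for row in matrix]


def first_pc(centered, n_iter=200):
    """Scores on the first axis and its share of the total variance."""
    n_cols = len(centered[0])
    total = sum(value * value for row in centered for value in row)
    if total == 0.0:
        # No variation at all in the indicator matrix.
        return [0.0] * len(centered), 0.0
    weights = [1.0 + j for j in range(n_cols)]  # deterministic start
    for _ in range(n_iter):
        # One power-iteration step: weights becomes X^T X weights, rescaled.
        scores = [sum(x * w for x, w in zip(row, weights)) for row in centered]
        updated = [
            sum(s * row[j] for s, row in zip(scores, centered))
            for j in range(n_cols)
        ]
        norm = math.sqrt(sum(u * u for u in updated))
        if norm == 0.0:
            return [0.0] * len(centered), 0.0
        weights = [u / norm for u in updated]
    scores = [sum(x * w for x, w in zip(row, weights)) for row in centered]
    explained = sum(s * s for s in scores) / total
    return scores, explained


na_indicator = missingness_matrix(toy)
scores, explained = first_pc(center_columns(na_indicator))
print(scores, explained)
```
</pre>
Here the first three individuals lack calls at the same two loci, so the first axis separates them from the others and carries most of the variance of the indicator matrix. That is the kind of structure that would signal non-random NAs.<br>
<br>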
Hope this helps.<br>
<br>
Cheers<br>
Thibaut<br>
<div><br>
<div style="font-family:Tahoma; font-size:13px">
<div class="BodyFragment"><font size="2"><span style="font-size:10pt">
<div class="PlainText"> <br>
==============================<br>
Dr Thibaut Jombart<br>
MRC Centre for Outbreak Analysis and Modelling<br>
Department of Infectious Disease Epidemiology<br>
Imperial College - School of Public Health<br>
Norfolk Place, London W2 1PG, UK<br>
Tel. : 0044 (0)20 7594 3658<br>
http://sites.google.com/site/thibautjombart/<br>
http://sites.google.com/site/therepiproject/<br>
http://adegenet.r-forge.r-project.org/<br>
Twitter: @thibautjombart<br>
<br>
<br>
</div>
</span></font></div>
</div>
</div>
<div style="font-family: Times New Roman; color: #000000; font-size: 16px">
<hr tabindex="-1">
<div style="direction: ltr;" id="divRpF536382"><font face="Tahoma" color="#000000" size="2"><b>From:</b> adegenet-forum-bounces@lists.r-forge.r-project.org [adegenet-forum-bounces@lists.r-forge.r-project.org] on behalf of Ella Bowles [ebowles@ucalgary.ca]<br>
<b>Sent:</b> 22 September 2015 20:16<br>
<b>To:</b> adegenet-forum@lists.r-forge.r-project.org<br>
<b>Subject:</b> [adegenet-forum] how do I know if missing data is affecting PCA or DAPC results<br>
</font><br>
</div>
<div></div>
<div>
<div dir="ltr">
<div class="gmail_default" style="font-size:large; color:#0000ff">Hello,</div>
<div class="gmail_default" style="font-size:large; color:#0000ff"><br>
</div>
<div class="gmail_default" style="font-size:large; color:#0000ff">I'm attempting to do a PCA and a DAPC on genomic data: 186 individuals spread over 11 putative populations, with just over 4000 loci. I have converted the data to a genlight object.
I know that I have some missing data (markers are present in at least 65% of individuals), and the adegenet manual specifies that missing data could bias results. How do I know if I have too much missing data? Or should I just get rid of all the loci that
have missing values before doing the analysis?</div>
<div class="gmail_default" style="font-size:large; color:#0000ff"><br>
</div>
<div class="gmail_default" style="font-size:large; color:#0000ff">With thanks,</div>
<div class="gmail_default" style="font-size:large; color:#0000ff">Ella </div>
<div><br>
</div>
-- <br>
<div class="gmail_signature">
<div dir="ltr">
<div>Ella Bowles<br>
PhD Candidate </div>
<div>Biological Sciences</div>
<div>University of Calgary<br>
<br>
e-mail: <a href="mailto:ebowles@ucalgary.ca" target="_blank">ebowles@ucalgary.ca</a>,
<a href="mailto:bowlese@gmail.com" target="_blank">bowlese@gmail.com</a></div>
<div>website: <a href="http://ellabowlesphd.wordpress.com/" rel="nofollow me" style="color:rgb(59,89,152); font-family:'lucida grande',tahoma,verdana,arial,sans-serif; font-size:11.199999809265137px; line-height:17px" target="_blank">http://ellabowlesphd.wordpress.com/</a></div>
</div>
</div>
</div>
</div>
</div>
</div>
</body>
</html>