<html dir="ltr">
<head>
<meta http-equiv="Content-Type" content="text/html; charset=iso-8859-1">
<style>
<!--
.hmmessage p
{margin:0px;
padding:0px}
body.hmmessage
{font-size:12pt;
font-family:Calibri}
-->
</style><style id="owaParaStyle" type="text/css">P {margin-top:0;margin-bottom:0;}</style>
</head>
<body ocsi="0" fpstyle="1" class="hmmessage">
<div style="direction: ltr;font-family: Tahoma;color: #000000;font-size: 10pt;">Dear Anne,
<br>
<br>
This relates to:<br>
<a href="https://github.com/thibautjombart/adegenet/issues/95" target="_blank">https://github.com/thibautjombart/adegenet/issues/95</a><br>
<br>
There is no simple solution to this problem, but I can see two options:<br>
#1 PCA then multispati<br>
Run a PCA on your data first, then use multispati, ade4's equivalent of the sPCA, to get a spatial analysis. The resulting object won't have exactly the same structure as an sPCA object, but most outputs will be there.<br>
<br>
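A minimal sketch of option #1, under a few assumptions: it uses the simulated spcaIllus data shipped with adegenet purely for illustration (not your SNPs), a Delaunay triangulation (type=1) as the connection network, and nfposi=2 as an arbitrary number of retained axes. multispati is the ade4 function mentioned above; in more recent installations it may need to be loaded from the adespatial package instead.<br>
<pre style="font-family: Courier New;">
library(adegenet)   # adegenet depends on ade4, so dudi.pca is available

## illustrative data only: a simulated genind object with coordinates in @other$xy
data(spcaIllus)
x &lt;- spcaIllus$dat2A

## ordinary PCA on allele frequencies, missing data replaced by the mean
pca1 &lt;- dudi.pca(tab(x, freq=TRUE, NA.method="mean"),
                 scannf=FALSE, scale=FALSE, nf=3)

## spatial weights built from the coordinates (assumed choice: type=1, Delaunay)
cn &lt;- chooseCN(other(x)$xy, type=1, ask=FALSE, plot.nb=FALSE,
               result.type="listw")

## spatial analysis of the PCA; nfposi/nfnega are arbitrary here
## (with a recent ade4, multispati may have to come from adespatial)
ms1 &lt;- multispati(pca1, cn, scannf=FALSE, nfposi=2, nfnega=0)
summary(ms1)
</pre>
The scores in ms1$li play the role of the sPCA scores, but note that the input here is the PCA object, so a large genind object never has to be passed to spca.<br>
<br>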
#2 PCA, retain most contributing loci, then sPCA<br>
You can do a PCA on your whole dataset and then compute the average contribution (squared loadings) of your loci over the first xxx axes. Then you can keep the xxx loci which are most informative. I think I'd slightly prefer this option, as you would still be working with alleles in your sPCA rather than with synthetic variables, but it is up to you.<br>
<br>
Here's an example to get a new dataset with the 25% most contributing loci:<br>
<span style="font-family: Courier New;">> library(adegenet)</span><br style="font-family: Courier New;">
<span style="font-family: Courier New;">> data(H3N2)</span><br style="font-family: Courier New;">
<br style="font-family: Courier New;">
<span style="font-family: Courier New;">## make your PCA</span><br style="font-family: Courier New;">
<span style="font-family: Courier New;">> pca1 <- dudi.pca(tab(H3N2, freq=TRUE,NA.method="mean"),scannf=FALSE,scale=FALSE, nf=3)</span><br style="font-family: Courier New;">
<br style="font-family: Courier New;">
<span style="font-family: Courier New;">## use loadingplot to ID most contributing loci</span><br style="font-family: Courier New;">
<span style="font-family: Courier New;">> toKeep <- loadingplot(pca1$c1^2, byfac=TRUE,fac=locFac(H3N2))$var.idx</span><br style="font-family: Courier New;">
<span style="font-family: Courier New;">> toKeep</span><br style="font-family: Courier New;">
<span style="font-family: Courier New;"> 45 60 73 148 168 171 225 247 317 351 391 396 435 463 464 468 476 483 490 566 577 578 582 600 604 673 676 679 763 807 963
</span><br style="font-family: Courier New;">
<span style="font-family: Courier New;"> 5 7 9 16 19 20 22 25 31 35 40 41 51 53 54 55 57 58 59 68 70 71 72 77 79 91 93 94 98 100 116
</span><br style="font-family: Courier New;">
<span style="font-family: Courier New;">> new.x <- H3N2[loc=toKeep]</span><br style="font-family: Courier New;">
<span style="font-family: Courier New;">> new.x</span><br style="font-family: Courier New;">
<span style="font-family: Courier New;">/// GENIND OBJECT /////////</span><br style="font-family: Courier New;">
<br style="font-family: Courier New;">
<span style="font-family: Courier New;"> // 1,903 individuals; 31 loci; 78 alleles; size: 1.4 Mb</span><br style="font-family: Courier New;">
<br style="font-family: Courier New;">
<span style="font-family: Courier New;"> // Basic content</span><br style="font-family: Courier New;">
<span style="font-family: Courier New;"> @tab: 1903 x 78 matrix of allele counts</span><br style="font-family: Courier New;">
<span style="font-family: Courier New;"> @loc.n.all: number of alleles per locus (range: 2-4)</span><br style="font-family: Courier New;">
<span style="font-family: Courier New;"> @loc.fac: locus factor for the 78 columns of @tab</span><br style="font-family: Courier New;">
<span style="font-family: Courier New;"> @all.names: list of allele names for each locus</span><br style="font-family: Courier New;">
<span style="font-family: Courier New;"> @ploidy: ploidy of each individual (range: 1-1)</span><br style="font-family: Courier New;">
<span style="font-family: Courier New;"> @type: codom</span><br style="font-family: Courier New;">
<span style="font-family: Courier New;"> @call: .local(x = x, i = i, j = j, loc = ..1, drop = drop)</span><br style="font-family: Courier New;">
<br style="font-family: Courier New;">
<span style="font-family: Courier New;"> // Optional content</span><br style="font-family: Courier New;">
<span style="font-family: Courier New;"> @other: a list containing: x xy epid
</span><br style="font-family: Courier New;">
<br style="font-family: Courier New;">
<span style="font-family: Courier New;">> locNames(new.x)</span><br style="font-family: Courier New;">
<span style="font-family: Courier New;"> [1] "45" "60" "73" "148" "168" "171" "225" "247" "317" "351" "391" "396" "435" "463" "464" "468" "476" "483" "490" "566" "577"</span><br style="font-family: Courier New;">
<span style="font-family: Courier New;">[22] "578" "582" "600" "604" "673" "676" "679" "763" "807" "963"</span><br style="font-family: Courier New;">
<br>
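If you want the contributions averaged over all retained axes rather than read from a single one (as far as I can tell from its documentation, loadingplot uses only the column given by its axis argument, 1 by default, when it is given a matrix), a hedged variant is sketched below. It reuses pca1 and H3N2 from the example above; contrib, locContrib, toKeep2 and new.x2 are names made up for this sketch, and it assumes that the levels of locFac(H3N2) are the locus names. The selection may differ slightly from the one obtained above.<br>
<pre style="font-family: Courier New;">
## average squared loading of each allele over the retained axes (nf=3 above)
contrib &lt;- rowMeans(as.matrix(pca1$c1)^2)

## sum the allele contributions within each locus
locContrib &lt;- tapply(contrib, locFac(H3N2), sum)

## keep the loci in the top 25% (assumed cut-off, same spirit as the example above)
toKeep2 &lt;- names(locContrib)[locContrib &gt;= quantile(locContrib, 0.75)]

## subset the genind object by locus name
new.x2 &lt;- H3N2[loc=toKeep2]
new.x2
</pre>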
<br>
Cheers<br>
Thibaut<br>
<br>
<div style="font-family: Times New Roman; color: #000000; font-size: 16px">
<hr tabindex="-1">
<div style="direction: ltr;" id="divRpF9264"><font color="#000000" face="Tahoma" size="2"><b>From:</b> adegenet-forum-bounces@lists.r-forge.r-project.org [adegenet-forum-bounces@lists.r-forge.r-project.org] on behalf of anne DaSilva [anne_dasilva@hotmail.com]<br>
<b>Sent:</b> 16 October 2015 18:02<br>
<b>To:</b> adegenet-forum@lists.r-forge.r-project.org<br>
<b>Subject:</b> [adegenet-forum] spatial pca with large SNP data<br>
</font><br>
</div>
<div></div>
<div>
<div dir="ltr">Dear all,<br>
I would like to conduct a spatial PCA with 24,000 SNPs (after pruning). If I understand correctly, sPCA is not possible with a genlight object, so I use PLINK to recode my data in STRUCTURE format and then work on a genind object in R, but the analysis ends, because of the size of the genind object, I imagine.<br>
Is there a solution (split the data? but how?)?<br>
I am losing all my hair over this problem (perhaps pathetically simple...) and I would be really grateful if someone could help me escape from my ignorance.<br>
Kind regards<br>
Anne<br>
<br>
<br>
<br>
Anne Blondeau Da Silva<br>
Unité de Génétique Moléculaire Animale<br>
UMR 1061 INRA-Université de Limoges<br>
Faculté des Sciences et Techniques<br>
123 Avenue Albert Thomas<br>
87060 LIMOGES Cedex<br>
Tél. 05 55 45 76 75<br>
Fax. 05 55 45 76 53<br>
<div>
<div style="font-family:trebuchet ms,sans-serif; font-size:12pt; color:#000000">
<div style="color:#000; font-weight:normal; font-style:normal; text-decoration:none; font-family:Helvetica,Arial,sans-serif; font-size:12pt">
_______________________________________________<br>
adegenet-forum mailing list<br>
adegenet-forum@lists.r-forge.r-project.org<br>
https://lists.r-forge.r-project.org/cgi-bin/mailman/listinfo/adegenet-forum</div>
<div><br>
</div>
</div>
</div>
</div>
</div>
</div>
</div>
</body>
</html>