[adegenet-forum] Guidance on sPCA

Fri Mar 18 12:54:11 CET 2022

Hello adegenet forum users,

I am looking for some guidance regarding sPCA analyses. I am currently working with a butterfly microsatellite data set (379 genotypes, 15 localities), and I have used the steps described in the sPCA tutorial as a reference. However, I still have some doubts about the way I am performing the analyses.

a) Regarding the type of connection network: types 1 to 4 (Delaunay triangulation, Gabriel graph, relative neighbours, minimum spanning tree) work only for data with no duplicate locations. As the coordinates I have are specific to the 15 sampled localities and not to the 379 individual samples, I added some jitter to avoid those duplicates. However, these first four types of connection network still produce the same error:

== PROBLEM DETECTED ==

Duplicate locations detected

Please choose another graph (5-7) or add random noise to locations (see ?jitter).

I believe this could be because the coordinates are still very close together. Is this correct?
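
For reference, here is a minimal base R sketch of the kind of jitter-and-check step described above. The coordinates, the jitter amount and the object names (loc_xy, pop, xy, xy_jit) are made up for illustration and are not my real data. It only shows how one might test whether exact duplicates remain after jittering, since duplicated() flags identical rows and not merely close ones.

```r
# Hypothetical example data: 3 localities, several individuals per locality
set.seed(1)
loc_xy <- data.frame(x = c(-3.70, -3.65, -3.60),
                     y = c(40.40, 40.45, 40.50))
pop <- rep(1:3, times = c(4, 3, 5))  # locality index of each individual

# One row per individual: each individual inherits its locality's coordinates
xy <- as.matrix(loc_xy[pop, ])
rownames(xy) <- NULL
any(duplicated(xy))      # TRUE: individuals from one locality share a row

# Add small uniform noise to each axis (the amount is an arbitrary assumption)
xy_jit <- cbind(x = jitter(xy[, "x"], amount = 0.001),
                y = jitter(xy[, "y"], amount = 0.001))

# Exact duplicates remaining after jittering, if any
any(duplicated(xy_jit))
```

If the last check returns FALSE for the matrix that is actually passed on to the connection network step, exact duplicates would not seem to be what triggers the message.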

b) The next step after choosing the connection network is to select the number of eigenvalues to retain (both positive and negative). Is there any specific criterion I should follow to decide on the number of retained eigenvalues?

c) Finally, on plotting the results: I have performed exploratory analyses with my data, and I already know there is one locality that seems to be highly monomorphic and is always recovered as a separate group, clearly differentiated from the rest. This can be seen in the top right plot produced by plot.spca. However, when I try to elaborate on that plot using the method described in the tutorial (interpolating principal components), I can't seem to find the right way of depicting that same result. In the example provided in the tutorial (pages 18-22), there is a clear structure in two groups (east and west), and that same structure can be observed both in the plot obtained with plot.spca and in the one using the interpolated principal components. In my case, that graphic display seems to be more or less informative depending on the axis I choose to represent, and again I am not sure which criterion I should follow to choose it. I would appreciate any guidance on how to proceed.

I hope I managed to explain my questions clearly and I would like to thank everyone in advance.

All the best,
Laura