[adegenet-forum] DAPC

Kirsty Medcalf kirsty.m.medcalf at gmail.com
Wed Sep 23 06:45:15 CEST 2015


Dear Forum,

This is my first post, so I would like to thank you for your patience. The
multivariate data that I am using contain one categorical grouping factor
with two levels (V4 and G8) in the column Family (the response variable)
and 12 accompanying predictor variables. The data frame is called
LDA.scores and can be found at the bottom of my Stack Overflow question
(link below), which also shows my attempted step-by-step logic and figures.

http://stackoverflow.com/questions/32704902/discriminant-analysis-of-principal-components-and-how-to-graphically-show-the-di

I have been attempting to graphically show the distance of the data points
to their multivariate centroid using DAPC and the function `scatter' in
the `adegenet' package in R. After splitting the data by the two levels of
Family into two separate data frames (code below), I attempted to produce
these scatterplots. I understand that this package is intended for the
analysis of genetic markers; however, I am also under the impression that
other types of multivariate data can be analysed with it. I tried to
manipulate the data, but to no avail.

Code used to produce the figure:
#Split the data frame into just V4 and G8

Just.V4 <- LDA.scores[LDA.scores$Family == "V4", ]
Just.G8 <- LDA.scores[LDA.scores$Family == "G8", ]

#Attempt to produce a scatterplot for the categorical factor V4
library(adegenet)
x<-Just.V4[2:13]

#Find the clusters

grp<-find.clusters(x, max.n.clust=12, na.action="omit")

The next step is to perform the discriminant analysis of principal
components:

 dapc1<-dapc(x, grp$grp)
 scatter(dapc1)

I have tried many different combinations of code, and here are some of the
error and warning messages I received:

Error in dapc.data.frame(x, grp1$grp1) : Inconsistent length for grp
Warning in find.clusters.data.frame(as.data.frame(x), ...) :
NAs introduced by coercion
Error in if (n.pca >= N) warning("number of retained PCs of PCA is
greater than N") :
missing value where TRUE/FALSE needed
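
A possible first check, offered as a sketch under assumptions rather than a
confirmed diagnosis: the coercion warning suggests that at least one of the
predictor columns is not stored as numeric, and the "Inconsistent length"
error suggests that the grouping vector and the data do not have the same
number of rows (for instance, if rows with missing values are dropped in
one step but not the other, or if the grouping object has no element of the
name used). The data frame LDA.example and its columns p1 and p2 below are
hypothetical stand-ins with made-up values, not the real LDA.scores.

```r
# Hypothetical stand-in for LDA.scores: a Family column plus two predictors.
# All values are invented purely for illustration.
LDA.example <- data.frame(
  Family = c("V4", "V4", "V4", "V4", "G8", "G8"),
  p1 = c("1.2", "0.8", "n/a", "1.5", "2.1", "1.9"),  # stored as text by mistake
  p2 = c(3.4, NA, 2.9, 3.1, 4.0, 3.8),
  stringsAsFactors = FALSE
)

# Keep only the V4 rows and the predictor columns
just.v4 <- LDA.example[LDA.example$Family == "V4", ]
x <- just.v4[, c("p1", "p2")]

# Which predictor columns are not numeric?
non.numeric <- names(x)[!sapply(x, is.numeric)]

# Convert those columns to numeric. Entries that are not numbers become NA;
# as.numeric() would normally warn "NAs introduced by coercion" here, and the
# warning is suppressed only to keep the demonstration quiet.
x[non.numeric] <- lapply(x[non.numeric],
                         function(col) suppressWarnings(as.numeric(col)))

# Drop incomplete rows once, up front, so that every later step sees the
# same number of rows.
x.clean <- x[complete.cases(x), , drop = FALSE]

print(non.numeric)
print(dim(x.clean))
```

If the real data behave similarly, the cleaned x.clean (rather than the
original x) would be the object passed to both find.clusters and dapc, so
that the grouping vector and the data have the same length.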


If anyone has a solution for producing two figures, one for each level of
Family, that illustrate the distance of the clustered data points (12
parameters measured) to their multivariate centroid, I would be very
grateful. I have followed lots of tutorials, searched online and read
papers, and I still do not understand these error and warning messages.

Thank you if anyone can help.

Best wishes,
Kaikash