[adegenet-forum] DAPC

Thu May 8 16:47:33 CEST 2014

Hello, 

This is already documented. Please see ?scatter.dapc and the DAPC tutorial, section 3.3 "Customising DAPC scatterplots".

Also, from ?scatter.dapc, see the arguments:

     grp: a factor defining group membership for the individuals. The
          scatterplot is optimal only for the default group, i.e. the
          one used in the DAPC analysis.

     col: a suitable color to be used for groups. The specified vector
          should match the number of groups, not the number of
          individuals.

     pch: a ‘numeric’ indicating the type of point to be used to
          indicate the prior group of individuals (see ‘points’
          documentation for more details); one value is expected for
          each group; recycled if necessary.

This answers your question: by default, the groups are those used in the DAPC analysis (from k-means or otherwise).
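
To illustrate the arguments quoted above, here is a minimal sketch. It assumes the 'dapcIllus' example data shipped with adegenet, and the values of n.pca, n.clust and n.da are arbitrary choices made only to avoid the interactive prompts, not recommendations. The names 'myCol', 'myPch', 'popCol' and 'popPch' are made up for the example. The second call passes the original populations through 'grp', which may be one way to show population membership with symbols:

```r
library(adegenet)

## example data from adegenet; replace 'x' with your own genind object
data(dapcIllus)
x <- dapcIllus$a

## k-means clustering, then DAPC on the inferred clusters
## (n.pca, n.clust and n.da are arbitrary here)
set.seed(1)
grp <- find.clusters(x, n.pca = 100, n.clust = 6)
dapc1 <- dapc(x, grp$grp, n.pca = 40, n.da = 5)

## one colour and one symbol per group, not per individual
k <- nlevels(dapc1$grp)
myCol <- rainbow(k)
myPch <- seq(15, length.out = k)
scatter(dapc1, col = myCol, pch = myPch)

## same coordinates, but colours and symbols follow the original populations
## (per ?scatter.dapc, the plot is optimal only for the default group)
myPop <- pop(x)
popCol <- rainbow(nlevels(myPop))
popPch <- seq(15, length.out = nlevels(myPop))
scatter(dapc1, grp = myPop, col = popCol, pch = popPch)
```

For a quick numerical check of how clusters and populations correspond, table(pop(x), dapc1$grp) gives the same information as the plot.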

Classification discrepancies between k-means and DAPC are unlikely. The scatterplot allows quite a few customizations, but highlighting individual points is not one of them. For this, you will need to identify the points you want to highlight and then add them with e.g. "points" or "text". If you are looking for misclassified individuals, use "predict", which has a method for dapc objects. If 'dapc1' is your dapc object, then:
##
predict(dapc1)$assign
##

will give you the group prediction, so that

##
which(predict(dapc1)$assign != group)
##

will tell you which individuals are assigned to a different cluster than in 'group' (where 'group' is the reference clustering you want to use).
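
Putting the two steps together, a rough sketch of the highlighting could look like the following. It again assumes the 'dapcIllus' example data and arbitrary n.pca, n.clust and n.da values. It uses the prior k-means groups stored in dapc1$grp as 'group', and it assumes that the scatterplot is drawn in the coordinates of dapc1$ind.coord (first two discriminant functions), so that "points" lands in the right place. The inset scree plot is switched off to keep the plotting region untouched. 'assigned' and 'mis' are names made up for the example:

```r
library(adegenet)

## example data and a non-interactive DAPC (arbitrary parameter values)
data(dapcIllus)
x <- dapcIllus$a
set.seed(1)
grp <- find.clusters(x, n.pca = 100, n.clust = 6)
dapc1 <- dapc(x, grp$grp, n.pca = 40, n.da = 5)

## reference clustering: here the prior (k-means) groups used in the DAPC
group <- dapc1$grp

## posterior assignment; compare as character to avoid factor-level issues
assigned <- predict(dapc1)$assign
mis <- which(as.character(assigned) != as.character(group))

## draw the scatterplot, then circle the discrepant individuals (if any)
scatter(dapc1, scree.da = FALSE)
if (length(mis) > 0) {
  points(dapc1$ind.coord[mis, 1], dapc1$ind.coord[mis, 2],
         pch = 1, cex = 2.5, lwd = 2, col = "black")
}
```

If 'mis' is empty, nothing is added to the plot, which is what you would expect when k-means and DAPC agree for every individual.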

Cheers
Thibaut

________________________________________
From: adegenet-forum-bounces at lists.r-forge.r-project.org [adegenet-forum-bounces at lists.r-forge.r-project.org] on behalf of Markus Ruhsam [M.Ruhsam at rbge.ac.uk]
Sent: 08 May 2014 12:10
To: adegenet-forum at lists.r-forge.r-project.org
Subject: [adegenet-forum] DAPC

Hello,

It’s the first time I am using DAPC to analyse my data, and I have the following question. Is it correct that in the plot generated by the scatter function, samples are coloured according to their k-means group membership and not according to their DAPC membership probability? If this is right, is there a way to highlight individuals in the plot where there is a discrepancy between their prior (k-means) and posterior (DAPC membership probability) grouping?

I was also wondering whether it is possible to assign different symbols to individuals in the scatter plot according to their population or species (as defined in the input file). This seems to be possible, as I found this figure (see below) in a recent publication (Wofford AM, Finch K, Bigott A, Willyard A (2014) A Set of Plastid Loci for Use in Multiplex Fragment Length Genotyping for Intraspecific Variation in Pinus (Pinaceae). Applications in Plant Sciences).

[Attached image: image005.jpg]

This would be very useful because at the moment I can see that there are 3 groups in my plot (see below), but I don’t know whether these groups more or less correspond to my population definitions. For example, are all the individuals in group 3 from one population, or are there also samples from other populations? I am aware that I can check this with the table.value command, but it would be a lot easier to see this information at a glance in the plot.

[Attached image: image006.jpg]

Thank you

Markus


-------------- next part --------------
A non-text attachment was scrubbed...
Name: image005.jpg
Type: image/jpeg
Size: 57429 bytes
Desc: image005.jpg
URL: <http://lists.r-forge.r-project.org/pipermail/adegenet-forum/attachments/20140508/44db4718/attachment-0002.jpg>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: image006.jpg
Type: image/jpeg
Size: 29487 bytes
Desc: image006.jpg
URL: <http://lists.r-forge.r-project.org/pipermail/adegenet-forum/attachments/20140508/44db4718/attachment-0003.jpg>