[adegenet-forum] Kmeans and DAPC on poolSeq data

Thibaut Jombart thibautjombart at gmail.com
Fri Feb 2 18:25:45 CET 2018


Hi again,

Such a plot typically indicates no clustering. Just to confirm: are we
talking about 7 rows and 100,000 columns?

If so, your pools are technically your statistical individuals, and the
method explores clustering solutions with 1-6 clusters for 7 individuals,
which won't go far: there are not enough individuals to really detect
clustering.
Apologies if I misunderstood.
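
To illustrate the point, here is a minimal Python sketch (not adegenet code,
and not the original poster's data). It assumes 7 simulated pools with
unstructured random allele frequencies at a hypothetical 200 SNPs, and a
BIC-style score of the form n*log(WSS/n) + K*log(n), which is an assumption
about the kind of criterion being plotted. Because 7 points allow only 877
partitions, it enumerates all of them to get the exact k-means optimum for
each K instead of running an iterative k-means.

```python
import math
import random

random.seed(1)  # fixed seed so the run is reproducible

N_POOLS = 7   # statistical individuals (from the thread)
N_SNPS = 200  # hypothetical; far fewer than 100,000, to keep the run fast

# Simulated, unstructured allele frequencies: one row per pool.
freqs = [[random.random() for _ in range(N_SNPS)] for _ in range(N_POOLS)]


def partitions(n):
    """Yield every set partition of n items as a list of cluster labels."""
    def rec(labels, n_used):
        if len(labels) == n:
            yield list(labels)
            return
        # The next item joins an existing cluster or opens one new cluster.
        for lab in range(n_used + 1):
            labels.append(lab)
            yield from rec(labels, max(n_used, lab + 1))
            labels.pop()
    yield from rec([], 0)


def wss(data, labels):
    """Within-cluster sum of squares for a given assignment of rows."""
    total = 0.0
    for lab in set(labels):
        members = [row for row, l in zip(data, labels) if l == lab]
        for col in zip(*members):
            mean = sum(col) / len(col)
            total += sum((v - mean) ** 2 for v in col)
    return total


# Exact minimum WSS for K = 1..6 (K = 7 gives WSS = 0, so it is skipped).
n_partitions = 0
best_wss = {}
for labels in partitions(N_POOLS):
    n_partitions += 1
    k = max(labels) + 1
    if k == N_POOLS:
        continue
    score = wss(freqs, labels)
    if k not in best_wss or score < best_wss[k]:
        best_wss[k] = score

# Assumed BIC-style score: n*log(WSS/n) + K*log(n).
bic = {k: N_POOLS * math.log(w / N_POOLS) + k * math.log(N_POOLS)
       for k, w in best_wss.items()}

for k in sorted(best_wss):
    print(k, round(best_wss[k], 2), round(bic[k], 2))
```

With so few points, the optimal within-cluster sum of squares can only
shrink as K grows toward n, whatever the data, so the shape of the score
curve says little about real structure.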

Best
Thibaut


--
Dr Thibaut Jombart
Lecturer, Department of Infectious Disease Epidemiology, Imperial College
London
Head of RECON: repidemicsconsortium.org
WHO Consultant - outbreak analysis
https://thibautjombart.netlify.com
Twitter: @TeebzR
+44(0)20 7594 3658

On 2 February 2018 at 09:07, Benjamin Dauphin <benjamin.dauphin at wsl.ch>
wrote:

> Hi Mark,
>
> Thanks for your response. I’ve run find.clusters() with the matrix of allele
> frequencies as input, and then run DAPC, still using the matrix (not a
> genind or genlight object), by assigning the groups generated with k-means
> (grp$grp). It works, but I get a strange “inverted parabolic curve” for the
> k-means analysis.
> Is this a common pattern for pool-seq data?
>
> Thanks,
> Ben
>
>
>
>
> > On 1 Feb 2018, at 18:01, Mark Coulson <Mark.Coulson.ic at uhi.ac.uk> wrote:
> >
> > Hi Ben,
> >
> > I have used allelotype data, with the input as a matrix of the frequency
> of the A allele in each group, to run DAPC and it worked well. However, my
> groups were already defined. Could the same type of input not be used with
> find.clusters?
> >
> > Mark
> >
> >
> > -----Original Message-----
> > From: adegenet-forum-bounces at lists.r-forge.r-project.org [mailto:
> adegenet-forum-bounces at lists.r-forge.r-project.org] On Behalf Of Benjamin
> Dauphin
> > Sent: 31 January 2018 09:18
> > To: adegenet-forum at lists.r-forge.r-project.org
> > Subject: [adegenet-forum] Kmeans and DAPC on poolSeq data
> >
> > Dear all,
> >
> > I am new to pool sequencing data and I simply wonder whether I can
> use k-means (find.clusters) and DAPC to investigate population structure from
> pool-seq data (allele frequencies). How can find.clusters deal with allele
> frequencies?
> >
> > Dataset: 7 pools and 100’000 SNPs
> >
> > Any comment or help would be much appreciated.
> > Best regards
> > Ben
> >
> >
> > _______________________________________________
> > adegenet-forum mailing list
> > adegenet-forum at lists.r-forge.r-project.org
> > https://lists.r-forge.r-project.org/cgi-bin/mailman/
> listinfo/adegenet-forum
> > Inverness College UHI, a partner in the University of the Highlands and
> Islands www.inverness.uhi.ac.uk Board of Management of Inverness College
> (known as Inverness College UHI), Scottish Charity No SC021197.
>