[adegenet-forum] Kmeans and DAPC on poolSeq data

Thibaut Jombart thibautjombart at gmail.com
Mon Feb 5 12:32:59 CET 2018


Hi Ben

while I'm not aware of hard rules for numbers of individuals needed to
detect a specific number of clusters, and I appreciate it will depend on
how clear-cut differences are, I don't think it is realistic to look for 4
clusters amongst 7 observations. Even 2 clusters will already be a stretch,
unless differences are really very obvious.
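
To make the point about sample size concrete, here is a minimal sketch in
plain Python (standard library only; this is not adegenet code). It assumes
7 simulated pools with 20 unstructured allele frequencies each. Because 7
observations allow only 877 distinct partitions, it minimises the k-means
criterion (within-cluster sum of squares, WSS) exhaustively instead of
running an iterative k-means. The function names and the BIC-like score
n*log(WSS/n) + k*log(n) are illustrative assumptions, not necessarily what
find.clusters computes.

```python
import math
import random


def partitions(n):
    """Yield every way to split n items into clusters (as label tuples)."""
    def grow(labels, n_used):
        if len(labels) == n:
            yield tuple(labels)
            return
        # An item may join an existing cluster or open exactly one new one.
        for lab in range(n_used + 1):
            labels.append(lab)
            yield from grow(labels, max(n_used, lab + 1))
            labels.pop()
    yield from grow([], 0)


def wss(data, labels):
    """Within-cluster sum of squares for one assignment of rows to clusters."""
    total = 0.0
    for lab in set(labels):
        members = [row for row, l in zip(data, labels) if l == lab]
        for col in zip(*members):
            mean = sum(col) / len(col)
            total += sum((v - mean) ** 2 for v in col)
    return total


def best_wss_by_k(data):
    """Smallest WSS achievable for each number of clusters k."""
    best = {}
    for labels in partitions(len(data)):
        k = max(labels) + 1
        w = wss(data, labels)
        if k not in best or w < best[k]:
            best[k] = w
    return best


# Hypothetical data: 7 pools (rows) x 20 SNP frequencies, no real structure.
random.seed(1)
pools = [[random.random() for _ in range(20)] for _ in range(7)]

n = len(pools)
best = best_wss_by_k(pools)

# Assumed BIC-like score; k = n is skipped because its WSS is exactly zero.
bic = {k: n * math.log(best[k] / n) + k * math.log(n) for k in range(1, n)}

for k in range(1, n):
    print(f"k={k}  WSS={best[k]:.3f}  BIC-like={bic[k]:.3f}")
```

The best WSS can only go down as k increases, and with 7 observations each
value of k is backed by very few points, so any criterion built on it rests
on very little information.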

Cheers
Thibaut




--
Dr Thibaut Jombart
Lecturer, Department of Infectious Disease Epidemiology, Imperial College
London
Head of RECON: repidemicsconsortium.org
WHO Consultant - outbreak analysis
https://thibautjombart.netlify.com
Twitter: @TeebzR
+44(0)20 7594 3658

On 2 February 2018 at 21:01, DAUPHIN Benjamin <benjamin.dauphin at unine.ch>
wrote:

> Thanks Thibaut.
> Yes, I have 7 pools (= 7 rows, i.e. 7 individuals in the analysis), and I
> expect two clusters representing two already characterized lineages. I
> found 4 likely clusters with HCPC, but I want to double-check this with
> k-means if possible.
> Best
> Ben
> ________________________________________
> From: adegenet-forum-bounces at lists.r-forge.r-project.org [
> adegenet-forum-bounces at lists.r-forge.r-project.org] on behalf of Thibaut
> Jombart [thibautjombart at gmail.com]
> Sent: 02 February 2018 18:25
> To: Benjamin Dauphin
> Cc: adegenet-forum at lists.r-forge.r-project.org
> Subject: Re: [adegenet-forum] Kmeans and DAPC on poolSeq data
>
> Hi again,
>
> Such a plot typically indicates no clustering. Just to confirm: are we
> talking about 7 rows and 100,000 columns?
>
> If so, your pools are technically your statistical individuals, and the
> method explores clustering solutions with 1-6 clusters for 7 individuals,
> which won't go far: there are not enough individuals to detect clustering.
> Apologies if I misunderstood.
>
> Best
> Thibaut
>
>
>
> On 2 February 2018 at 09:07, Benjamin Dauphin <benjamin.dauphin at wsl.ch>
> wrote:
> Hi Mark,
>
> Thanks for your response. I've run find.clusters() with the matrix of
> allele frequencies as input, and then run the DAPC, still on the matrix
> (not a genind or genlight object), assigning the groups generated by
> k-means (grp$grp). It works, but I get a strange "inverted parabolic curve"
> for the k-means analysis.
> Is this a common picture for pool-seq data?
>
> Thanks,
> Ben
>
>
>
>
> > On 1 Feb 2018, at 18:01, Mark Coulson <Mark.Coulson.ic at uhi.ac.uk>
> > wrote:
> >
> > Hi Ben,
> >
> > I have used allelotype data, with a matrix of the frequency of the A
> > allele in each group as input, to run DAPC, and it worked well. My
> > groups were already defined, however; could the same type of input not
> > be used with find.clusters?
> >
> > Mark
> >
> >
> > -----Original Message-----
> > From: adegenet-forum-bounces at lists.r-forge.r-project.org On Behalf Of
> > Benjamin Dauphin
> > Sent: 31 January 2018 09:18
> > To: adegenet-forum at lists.r-forge.r-project.org
> > Subject: [adegenet-forum] Kmeans and DAPC on poolSeq data
> >
> > Dear all,
> >
> > I am new to pool sequencing data and simply wonder whether I can use
> > k-means (find.clusters) and DAPC to investigate population structure
> > from pool-seq data (allele frequencies). How does find.clusters deal
> > with allele frequencies?
> >
> > Dataset: 7 pools and 100’000 SNPs
> >
> > Any comment or help would be much appreciated.
> > Best regards
> > Ben
> >
> >
> > _______________________________________________
> > adegenet-forum mailing list
> > adegenet-forum at lists.r-forge.r-project.org
> > https://lists.r-forge.r-project.org/cgi-bin/mailman/listinfo/adegenet-forum
>
>