From biabiany.stella at yahoo.fr Wed May 17 09:46:15 2017
From: biabiany.stella at yahoo.fr (Biabiany Stella)
Date: Wed, 17 May 2017 07:46:15 +0000 (UTC)
Subject: [adegenet-forum] Matrix structure
References: <2056175821.1888264.1495007175462.ref@mail.yahoo.com>
Message-ID: <2056175821.1888264.1495007175462@mail.yahoo.com>

Hi!

I am using the adegenet package (the latest version). I have 300 individuals and 275K SNPs, so I use a genlight object. I did everything in the tutorial for large SNP data (eigenvalues and PCA), but I don't know how to obtain a structure matrix to do my GWAS with mlmm.r or maybe GAPIT. In fact, I don't know whether it is possible to obtain a structure matrix from a genlight object at all.

----
BIABIANY Stella
INRA-Angers, 42 Rue Georges Morel, 49070 Beaucouzé.
Agrocampus-ouest, Université d'Angers (http://www.agrocampus-ouest.fr, http://www.univ-angers.fr)
Phone : 0033 6 52 49 79 75
----

From Mark.Coulson.ic at uhi.ac.uk Wed May 17 17:46:53 2017
From: Mark.Coulson.ic at uhi.ac.uk (Mark Coulson)
Date: Wed, 17 May 2017 15:46:53 +0000
Subject: [adegenet-forum] allele frequency matrix
Message-ID:

Hi

I have allele frequency data for pools of individuals (no individual genotype data) for >500,000 SNPs. I know I can do a dapc on allele frequencies directly, but given this many SNPs should I be using a 'genlight' object, or is this only for individual genotypes?

Thanks,

Mark

Inverness College UHI, a partner in the University of the Highlands and Islands www.inverness.uhi.ac.uk
Board of Management of Inverness College (known as Inverness College UHI), Scottish Charity No SC021197.
From thibautjombart at gmail.com Wed May 17 19:36:03 2017
From: thibautjombart at gmail.com (Thibaut Jombart)
Date: Wed, 17 May 2017 18:36:03 +0100
Subject: [adegenet-forum] allele frequency matrix
In-Reply-To:
References:
Message-ID:

Hi Mark,

I am afraid genlights are indeed meant for individual genotypes, although, given that they allow varying ploidy, you might be able to use gene pools - maybe worth a try.

Best
Thibaut

--
Dr Thibaut Jombart
Lecturer, Department of Infectious Disease Epidemiology, Imperial College London
Head of RECON: repidemicsconsortium.org
sites.google.com/site/thibautjombart/
github.com/thibautjombart
Twitter: @TeebzR
+44(0)20 7594 3658

On 17 May 2017 at 16:46, Mark Coulson wrote:
> Hi
>
> I have allele frequency data for pools of individuals (no individual
> genotype data) for >500,000 SNPs. I know I can do a dapc on allele
> frequencies directly but given this many SNPs should I be using a 'genlight'
> object or is this only for individual genotypes?
>
> Thanks,
>
> Mark

From Mark.Coulson.ic at uhi.ac.uk Fri May 19 12:27:02 2017
From: Mark.Coulson.ic at uhi.ac.uk (Mark Coulson)
Date: Fri, 19 May 2017 10:27:02 +0000
Subject: [adegenet-forum] DAPC on allele frequency matrix
Message-ID:

Hello,

I'm using DAPC to try to discriminate between two groups. However, the data are not individual genotypes, but rather the result of genotyping pools of samples. There are 20 individual pools in each of the two groups. So basically I am providing the analysis with the frequency of the A allele (all SNPs are dimorphic) for each pool.
There are ~600,000 SNPs in the dataset. I ran the xvalDapc function and it identified 20 PCs as the optimum. However, when I run the DAPC with those 20 PCs, I get the following warning:

Warning message:
In dapc.data.frame(as.data.frame(x), ...) :
  number of retained PCs of PCA may be too large (> N /3)
  results may be unstable

What does this mean in terms of my discrimination, which is pretty good between the two groups? In other analyses, such as ranking SNPs according to FST, outlier analyses, etc., the separation is pretty good but not as clear as with DAPC overall. Therefore I am not sure whether (1) DAPC is genuinely doing a better job at separating the groups, or (2) there is still over-fitting of the data with DAPC given the large number of variables, and I am simply finding a solution (which may not be real).

Any thoughts would be helpful.

Mark

From mbzandi at znu.ac.ir Mon May 22 22:33:54 2017
From: mbzandi at znu.ac.ir (mbzandi at znu.ac.ir)
Date: Mon, 22 May 2017 20:33:54 +0000
Subject: [adegenet-forum] How we can convert the genotyping result of SSR marker to binary cod
In-Reply-To:
References:
Message-ID: <2232ec8c1cc48406947362b83a9a7a91@mail.znu.ac.ir>

Dear all,

I am a new member of this forum. I'm trying to use adegenet to define genetic clusters and genetic diversity between two groups. As mentioned in the user guide, SSR data are converted to binary code, as in "data(microbov)".
I have raw genotyping results in Genepop format, as copied below:

============================
pop
77568 , 0308 0506 0205 0404 0405 0105 0103 0405 0406 0103 0505 0506 0306
77571 , 0108 0104 0207 0607 0505 0202 0505 0101 0104 0104 0606 0404 0108
77575 , 0103 0102 0608 0104 0210 0203 0202 0103 0406 0404 0202 0406 0104
77579 , 0303 0102 0407 0406 0405 0203 0305 0107 0406 0103 0505 0405 0304
77580 , 0104 0105 0105 0406 0305 0304 0305 0203 0202 0303 0606 0405 0304
....
=====================================

My question is: how can I convert the genotyping results of SSR markers to binary code in adegenet? Is it possible with other packages or other software? Has anyone come across this and been able to overcome it?

Best regards,
Mohammad

==================================
Mohammad Bagher Zandi B.M.
Genetics and Animal Breeding (PhD)
Associate Professor
Faculty of Agriculture
Department of Animal Science
University of Zanjan, Zanjan, Iran

From roman.lustrik at biolitika.si Tue May 23 09:12:26 2017
From: roman.lustrik at biolitika.si (Roman Luštrik)
Date: Tue, 23 May 2017 09:12:26 +0200 (CEST)
Subject: [adegenet-forum] How we can convert the genotyping result of SSR marker to binary cod
In-Reply-To: <2232ec8c1cc48406947362b83a9a7a91@mail.znu.ac.ir>
References: <2232ec8c1cc48406947362b83a9a7a91@mail.znu.ac.ir>
Message-ID: <1482297301.160598.1495523546966.JavaMail.zimbra@biolitika.si>

The function `read.genepop` does that for you. You didn't provide a fully working genepop example, but here's a demonstration using the nancycats dataset:

> library(adegenet)
> xy <- system.file("files/nancycats.gen", package = "adegenet")
> system(sprintf("cat %s", xy))
Genotypes of cats from 17 colonies of Nancy (France)
fca8
fca23
fca43
fca45
fca77
fca78
fca90
fca96
fca37
Pop
1, 0000 0409 0404 0103 0909 0306 0909 0808 1010
1, 0000 0909 0407 0305 0909 0306 0209 0808 1010
1, 0913 0409 0505 0101 0709 0303 0808 0808 1111
1, 0809 0505 0405 0105 0606 0306 0909 0104 1010
...
> xy <- read.genepop(xy)
...
> tab(xy)[1:10, 1:10]
    fca8.09 fca8.13 fca8.08 fca8.10 fca8.12 fca8.06 fca8.07
001      NA      NA      NA      NA      NA      NA      NA
002      NA      NA      NA      NA      NA      NA      NA
003       1       1       0       0       0       0       0
004       1       0       1       0       0       0       0
005       1       0       1       0       0       0       0
006       1       1       0       0       0       0       0
007       2       0       0       0       0       0       0
008       1       1       0       0       0       0       0
009       0       1       0       1       0       0       0
010       2       0       0       0       0       0       0

Is this what you're after?

Cheers,
Roman

----
In god we trust, all others bring data.
> Request IJZ at https://kurc.biolitika.si

From 17869067 at students.latrobe.edu.au Thu May 25 01:57:36 2017
From: 17869067 at students.latrobe.edu.au (LAURA NICOLE WOODINGS)
Date: Wed, 24 May 2017 23:57:36 +0000
Subject: [adegenet-forum] Negative BIC ok for find.clusters?
Message-ID:

Hi all,

I am using the find.clusters function in adegenet to identify an appropriate number of clusters (k) for my outlier SNPs. It is a smallish dataset with only 8 outlier SNPs and 184 individuals. The plot of BIC versus number of clusters that I received is a bit strange. At k=1 BIC is slightly above 0, but for k=2 BIC falls below 0, and for all higher values of k BIC gets progressively more negative until it plateaus at k=15.

I was just wondering whether a negative BIC is OK for deciding how many clusters the data have? I have not had a negative BIC plot returned with other datasets before; however, those had many more loci in them.

Cheers,
Laura

PhD Candidate
Department of Ecology, Environment and Evolution | La Trobe University | Bundoora 3086 | Australia
m: +61 408 642 006 | e: 17869067 at students.latrobe.edu.au
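
On the negative BIC values described above: a minimal base-R sketch, assuming the criterion has the usual k-means form n*log(WSS/n) + k*log(n). That form is an assumption here, not something confirmed from the find.clusters source. Under it, the sign carries no meaning of its own: the value falls below zero once the within-cluster sum of squares per individual is far enough under 1 to outweigh k*log(n), which becomes more likely when only a handful of loci contribute variance. The data are simulated and every variable name is illustrative.

```r
# Base-R sketch: BIC-like criterion n*log(WSS/n) + k*log(n) for k-means solutions.
# ASSUMPTION: this formula stands in for the criterion plotted by find.clusters.
# The genotypes are simulated; only n and p are taken from the message above.
set.seed(1)
n <- 184   # individuals
p <- 8     # SNPs
# Hypothetical genotypes expressed as allele frequencies 0, 0.5 or 1
x <- matrix(sample(c(0, 0.5, 1), n * p, replace = TRUE), nrow = n)

k_values <- 1:10
wss <- sapply(k_values, function(k) {
  if (k == 1) {
    sum(scale(x, scale = FALSE)^2)   # total sum of squares around the column means
  } else {
    kmeans(x, centers = k, nstart = 10, iter.max = 50)$tot.withinss
  }
})
bic <- n * log(wss / n) + k_values * log(n)

# One row per k: sum of squares per individual and the resulting criterion
print(round(cbind(k = k_values, wss_per_ind = wss / n, bic = bic), 2))
```

Since only the change in the criterion across k is compared, a curve that sits below zero should be readable in the same way as a positive one.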
From Mark.Coulson.ic at uhi.ac.uk Wed May 31 13:59:12 2017
From: Mark.Coulson.ic at uhi.ac.uk (Mark Coulson)
Date: Wed, 31 May 2017 11:59:12 +0000
Subject: [adegenet-forum] xvalDapc
Message-ID:

Hi,

I'm running the xvalDapc function for a dataset and wondering how to modify the number of PCA axes retained. When I run the following

  xval1 <- xvalDapc(FD_t, group, n.pca.max=40, result="groupMean", center=TRUE, scale=FALSE, xval.plot=TRUE)

I get results back at 5, 10, 15, 20, 25, 30, 35.

However, when I run (on the same dataset)

  xval1a <- xvalDapc(FD_t, group, n.pca.max=40, result="groupMean", training.set=0.7, center=TRUE, scale=FALSE, xval.plot=TRUE)

I get results back at 13 different PCA axes levels, roughly by increments of 2.

Also, I am looking to specify the increments, so I tried something like the following:

  xval2 <- xvalDapc(FD_t, group, n.pca.max=40, result="groupMean", training.set=0.7, center=TRUE, scale=FALSE, n.pca=seq(5, by=5, to=40), xval.plot=TRUE)

but I don't get these exact increments. So what determines the scale of the x-axis?

Thanks,

Mark

From coulsonmw at gmail.com Wed May 17 17:48:07 2017
From: coulsonmw at gmail.com (Mark Coulson)
Date: Wed, 17 May 2017 15:48:07 -0000
Subject: [adegenet-forum] dapc on allele frequencies
Message-ID:

Hi

I have allele frequency data for pools of individuals (no individual genotype data) for >500,000 SNPs. I know I can do a dapc on allele frequencies directly but given this many SNPs should I be using a 'genlight' object or is this only for individual genotypes?

Thanks,
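
On the xvalDapc question above (results at 5, 10, ..., 35 in the first call, but at 13 levels in steps of about 2 with training.set=0.7): both grids are what base R's pretty() returns if the upper bound is capped at the training-set size. The sketch below is only a guess that reproduces the reported numbers. The cap, the filter, the 0.9 default training fraction and the total of 40 pools (taken from the earlier DAPC message) are all assumptions, not taken from the xvalDapc source, and candidate_grid() is a hypothetical helper.

```r
# Base-R sketch: candidate numbers of PCs from pretty(), filtered by training-set size.
# ASSUMPTION: this only mimics the grids reported in the message above; it is not
# the xvalDapc implementation. candidate_grid() is a hypothetical helper.
n_pools <- 40   # assumed: 20 pools in each of two groups

candidate_grid <- function(n_pca_max, training_set) {
  n_training <- round(n_pools * training_set)   # pools left in the training set
  upper <- min(n_pca_max, n_training)           # hypothetical cap on the grid
  grid <- pretty(1:upper, 10)                   # "nice" breakpoints, about 10 of them
  grid[grid > 0 & grid < upper]                 # drop 0 and anything at or above the cap
}

grid_default <- candidate_grid(40, 0.9)   # assumed default training fraction
grid_70 <- candidate_grid(40, 0.7)
print(grid_default)
print(grid_70)
print(length(grid_70))
```

Under the same assumption, a user-supplied n.pca such as seq(5, by=5, to=40) would be trimmed to the values below the training-set size, which might explain why the requested increments are not all returned.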
From wxx10 at psu.edu Tue May 30 21:06:32 2017
From: wxx10 at psu.edu (Weiya Xue)
Date: Tue, 30 May 2017 19:06:32 -0000
Subject: [adegenet-forum] SNP data
Message-ID: <13d20f53cc324ca99abc89d6633c8bac@PSU.EDU>

Hi,

I want to use adegenet for SNP data analysis in polyploids. How should I prepare the data? Does anyone have the syntax of the input file?

Thanks,
Weiya Xue