From leweleit at uni-bielefeld.de Fri Jun 1 14:06:20 2018
From: leweleit at uni-bielefeld.de (Lucienne Eweleit)
Date: Fri, 01 Jun 2018 14:06:20 +0200
Subject: [adegenet-forum] adegenet-creating input file from csv
Message-ID: <72d0ab6115c6c6.5b1152dc@uni-bielefeld.de>

Hi all,

I have searched the forum for a while and googled for a couple of hours, but it seems I am either unable to follow the instructions or they do not fit my problem.

I have an Excel file with several columns: "ID", "pop", "Longitude", "Latitude", "Locus_1", "Locus_2", etc. These columns are filled in for 277 individuals (rows).

Now I would like to perform the spatial analysis described in the adegenet 2.0.0 tutorial. Reading the file with read.csv and then converting it to a genind object with df2genind works... partly: neither the populations nor the coordinates are recognised.

Can anyone please help me and explain how to transform my file so that all the information is handled correctly?

Thanks a lot!!

Cheers,
Lucienne

From bowlese at gmail.com Fri Jun 1 17:56:12 2018
From: bowlese at gmail.com (Ella Bowles)
Date: Fri, 1 Jun 2018 11:56:12 -0400
Subject: [adegenet-forum] are xval results only relevant for the dapc part of dapc analysis and not find.cluster
Message-ID:

Hello,

When trying to find the number of clusters, as is known, I get different results when I retain different numbers of PCs.

As background, I have samples from 180 individuals over 11 different sites, and am trying to find the best structure.

The tutorial says that when you run find.clusters there is no reason for keeping a small number of principal components. When I run with n.pca.max = 60 (so, n/3), using xval I get pretty consistently that a good number of PCs to retain is 50.

When I run find.clusters using 50 PCs I get anywhere between 7 and 9 clusters, mostly telling the same story for the data.
However, when I run find.clusters with over 100 PCs I consistently get k = 4 or 5, and the plot is much cleaner. In addition, when I look at my variance-explained plots, they don't really reach an asymptote, either for find.clusters or for dapc.

Both of the variance-explained plots look like

[image: image.png]

Using the scaled dataset

mat <- scaleGen(Stickle8c10NoOdds, NA.method="mean")

I use 120 PCs, and get

[image: image.png]

If I run with 90 PCs the number of clusters bumps up to 5,

[image: image.png]

However, if I run find.clusters and choose 50 PCs, I get

[image: image.png]

> head(NumClust$Kstat, 11)
     K=1      K=2      K=3      K=4      K=5      K=6      K=7      K=8      K=9     K=10     K=11
1492.620 1472.790 1455.980 1448.216 1443.735 1442.909 1440.166 1440.344 1440.867 1441.979 1443.101

Are the xval procedure results (i.e., 50 PCs in my case) meant to be used only at the

dapc1 <- dapc(mat, NumClust$grp)

stage? And do my variance-explained plots concern you at all, given that they don't reach an asymptote?

I've attached a Word document with all the same text and plots as in this message, in case they don't show up on your screen.

Thank you for your time,
Ella

--
Ella Bowles, PhD
Postdoctoral Researcher
Department of Biology
Concordia University

Website: https://ellabowlesphd.wordpress.com/
Email: bowlese at gmail.com

From bowlese at gmail.com Tue Jun 5 16:37:49 2018
From: bowlese at gmail.com (Ella Bowles)
Date: Tue, 5 Jun 2018 10:37:49 -0400
Subject: [adegenet-forum] are xval results only relevant for the dapc part of dapc analysis and not find.cluster
In-Reply-To:
References:
Message-ID:

Hello,

I'm writing to follow up on my message from last week. Although my email looks a bit long, I don't think my two questions (which are at the end) should take very long to answer. I have re-attached the Word document with the questions and plots as well.

When trying to find the number of clusters, as is known, I get different results when I retain different numbers of PCs.

As background, I have samples from 180 individuals over 11 different sites, and am trying to find the best structure.

The tutorial says that when you run find.clusters there is no reason for keeping a small number of principal components. When I run with n.pca.max = 60 (so, n/3), using xval I get pretty consistently that a good number of PCs to retain is 50.

When I run find.clusters using 50 PCs I get anywhere between 7 and 9 clusters, mostly telling the same story for the data. However, when I run find.clusters with over 100 PCs I consistently get k = 4 or 5, and the plot is much cleaner. In addition, when I look at my variance-explained plots, they don't really reach an asymptote, either for find.clusters or for dapc.

Both of the variance-explained plots look like

[image: image.png]

Using the scaled dataset

mat <- scaleGen(Stickle8c10NoOdds, NA.method="mean")

I use 120 PCs, and get

[image: image.png]

Running with 90 PCs

[image: image.png]

However, if I run find.clusters and choose 50 PCs, I get

[image: image.png]

> head(NumClust$Kstat, 11)
     K=1      K=2      K=3      K=4      K=5      K=6      K=7      K=8      K=9     K=10     K=11
1492.620 1472.790 1455.980 1448.216 1443.735 1442.909 1440.166 1440.344 1440.867 1441.979 1443.101

Are the xval procedure results (i.e., 50 PCs in my case) meant to be used only at the

dapc1 <- dapc(mat, NumClust$grp)

stage? And do my variance-explained plots concern you at all, given that they don't reach an asymptote?

With many thanks for your time.

Ella

From arsalane at protonmail.com Sat Jun 23 11:22:04 2018
From: arsalane at protonmail.com (Arsalan Emami-Khoyi)
Date: Sat, 23 Jun 2018 05:22:04 -0400
Subject: [adegenet-forum] choose.k recommendations
Message-ID:

Dear Thibaut, Zhian et al.,

Thank you for maintaining and updating the forum. I am wondering whether you have any recommendations regarding the use of the choose.k function in snapclust. I see the description "Do not use. We work on that stuff. Contact us if interested." on rdrr.io.

Many thanks in advance.

Regards,

Arsalan Emami-Khoyi
Postdoctoral Research Fellow in Wildlife Genomics
University of Johannesburg_Center for Ecological Genomics and Wildlife Conservation
Auckland Park 2006
South Africa
Email : Arsalane at uj.ac.za
Phone :+27 (0)11 559 3373
Cellphone:+27 79 88 14 628
Website :https://sites.google.com/site/drpeterteske/postdocs
[EGWC-LOGO (1).png]

From bowlese at gmail.com Wed Jun 13 20:24:19 2018
From: bowlese at gmail.com (Ella Bowles)
Date: Wed, 13 Jun 2018 14:24:19 -0400
Subject: [adegenet-forum] are xval results only relevant for the dapc part of dapc analysis and not find.cluster
In-Reply-To:
References:
Message-ID:

Hello,

I am writing to follow up on my post to the forum from a couple of weeks ago, and on my follow-up to that. If the text below does not show up, click on the hidden text, since I just replied to my last email to generate this post.

When trying to find the number of clusters, as is known, I get different results when I retain different numbers of PCs.

As background, I have samples from 180 individuals over 11 different sites, and am trying to find the best structure.

The tutorial says that when you run find.clusters there is no reason for keeping a small number of principal components. When I run with n.pca.max = 60 (so, n/3), using xval I get pretty consistently that a good number of PCs to retain is 50.

When I run find.clusters using 50 PCs I get anywhere between 7 and 9 clusters, mostly telling the same story for the data. However, when I run find.clusters with over 100 PCs I consistently get k = 4 or 5, and the plot is much cleaner.
In addition, when I look at my variance-explained plots, they don't really reach an asymptote, either for find.clusters or for dapc.

Both of the variance-explained plots look like

[image: image.png]

Using the scaled dataset

mat <- scaleGen(Stickle8c10NoOdds, NA.method="mean")

I use 120 PCs, and get

[image: image.png]

If I run with 90 PCs the number of clusters bumps up to 5,

[image: image.png]

However, if I run find.clusters and choose 50 PCs, I get

[image: image.png]

> head(NumClust$Kstat, 11)
     K=1      K=2      K=3      K=4      K=5      K=6      K=7      K=8      K=9     K=10     K=11
1492.620 1472.790 1455.980 1448.216 1443.735 1442.909 1440.166 1440.344 1440.867 1441.979 1443.101

Are the xval procedure results (i.e., 50 PCs in my case) meant to be used only at the

dapc1 <- dapc(mat, NumClust$grp)

stage? And do my variance-explained plots concern you at all, given that they don't reach an asymptote?

I've attached a Word document with all the same text and plots as in this message, in case they don't show up on your screen.

Thank you for your time,
Ella

--
Ella Bowles, PhD
Postdoctoral Researcher
Department of Biology
Concordia University

Website: https://ellabowlesphd.wordpress.com/
Email: bowlese at gmail.com

From marcelolaia at gmail.com Sun Jun 24 03:31:14 2018
From: marcelolaia at gmail.com (Marcelo Laia)
Date: Sat, 23 Jun 2018 22:31:14 -0300
Subject: [adegenet-forum] Fst(as.loci(pinus.genind)) return NaN for all F statistics
Message-ID: <20180624013114.GA30623@localhost>

Dear all,

I am trying to use adegenet for the first time.

library("ape")
library("pegas")
library("seqinr")
library("ggplot2")
library("adegenet")

pinus.data <- read.table("Analises_pinus_individual.csv", header=TRUE, sep="\t", row.names = 1)
head(pinus.data)

> head(pinus.data)
   PTTX3081 RIPT0031 PTTX3011 PTTX2037 PTTX311 Populacao
1    186243   235284   131155   190190  191191    POP001
3    177235   235235   131155   190190  191195    POP003
15   177239   235235   131155   190190  191191    POP015
16   177254   235235   131155   190190  191195    POP016
17   177235   235235       NA   190190  191195    POP017
19   177235   235235   131155   190190  191195    POP019

pinus.genind <- df2genind(pinus.data[, 1:5], ploidy=2, ncode=3, NA.char = "NA", pop = pinus.data$Populacao, type = "codom")
pinus.genind
tab(pinus.genind)
popNames(pinus.genind)
indNames(pinus.genind)
locNames(pinus.genind)

> is.genind(pinus.genind)
[1] TRUE

> Fst(as.loci(pinus.genind))
         Fit Fst Fis
PTTX3081 NaN NaN NaN
RIPT0031 NaN NaN NaN
PTTX3011 NaN NaN NaN
PTTX2037 NaN NaN NaN
PTTX311  NaN NaN NaN

The column "Populacao" in the file Analises_pinus_individual.csv is used for the pop slot. I consider each individual to be from a different population.

Could you please help me?

Thank you very much!

--
Laia, ML

From thibautjombart at gmail.com Mon Jun 25 12:25:16 2018
From: thibautjombart at gmail.com (Thibaut Jombart)
Date: Mon, 25 Jun 2018 11:25:16 +0100
Subject: [adegenet-forum] choose.k recommendations
In-Reply-To:
References:
Message-ID:

Hello,

You are using an outdated development version of the package. The documentation should be more explicit once you update the package to the current CRAN version. Also check the vignette:

https://github.com/thibautjombart/adegenet/wiki/Tutorials

Cheers,
Thibaut

--
Dr Thibaut Jombart
Lecturer, Department of Infectious Disease Epidemiology, Imperial College London
Head of RECON: repidemicsconsortium.org
WHO Consultant - outbreak analysis
https://thibautjombart.netlify.com
Twitter: @TeebzR
+44(0)20 7594 3658

From thibautjombart at gmail.com Mon Jun 25 12:30:41 2018
From: thibautjombart at gmail.com (Thibaut Jombart)
Date: Mon, 25 Jun 2018 11:30:41 +0100
Subject: [adegenet-forum] Fst(as.loci(pinus.genind)) return NaN for all F statistics
In-Reply-To: <20180624013114.GA30623@localhost>
References: <20180624013114.GA30623@localhost>
Message-ID:

Hi there,

It is a bit hard to figure things out, as we don't see the outputs of several commands and there is no reproducible example, but your import of the data looks good to me. It could be too much missing data in one of your groups, or a fixed locus. It may be worth checking allele counts per population:

tab(genind2genpop(pinus.genind))

Best,
Thibaut

--
Dr Thibaut Jombart
Lecturer, Department of Infectious Disease Epidemiology, Imperial College London
Head of RECON: repidemicsconsortium.org
WHO Consultant - outbreak analysis
https://thibautjombart.netlify.com
Twitter: @TeebzR
+44(0)20 7594 3658
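
The per-population check suggested in the reply can be tried on a toy data set. What follows is a minimal sketch, not the poster's data: the data frame `toy`, the locus names `locA` and `locB`, and the labels in `pop_labels` are hypothetical, and only the six-digit genotype format mirrors the original post. It counts individuals per population and tabulates allele counts per population. Since the original post assigns every individual to its own population, group size is one thing worth inspecting; that single-individual groups contribute to the NaN values is an assumption here, not something confirmed in the thread.

```r
# Minimal sketch, assuming the adegenet package is installed.
# `toy` and `pop_labels` are hypothetical stand-ins for the real data.
library(adegenet)

# Two diploid loci, alleles coded with three digits each (ncode = 3).
toy <- data.frame(
  locA = c("186243", "177235", "177239", "177254"),
  locB = c("235284", "235235", "235235", "235235"),
  stringsAsFactors = FALSE
)
rownames(toy) <- paste0("ind", 1:4)

# One population label per individual, as described in the original post.
pop_labels <- paste0("POP", 1:4)

x <- df2genind(toy, ploidy = 2, ncode = 3, pop = pop_labels, type = "codom")

# Number of individuals per population: every group has size 1 here.
ind_per_pop <- table(pop(x))
print(ind_per_pop)

# Allele counts per population, as suggested in the reply.
allele_counts <- tab(genind2genpop(x, quiet = TRUE))
print(allele_counts)
```

If `ind_per_pop` shows one individual in every group, regrouping individuals into populations that each contain several samples before computing F-statistics may be worth trying.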