From Marta.Piotrowska at sruc.ac.uk Mon Aug 1 15:18:40 2016
From: Marta.Piotrowska at sruc.ac.uk (Marta Piotrowska)
Date: Mon, 1 Aug 2016 13:18:40 +0000
Subject: [adegenet-forum] find.clusters missing values
Message-ID:

Dear Members,

I am trying to use the find.clusters function in the adegenet package for my
microsatellite data analysis, but I can't find any information on whether it
accepts missing values. I tried NAs, but it does not seem to accept them. I
know that one workaround is to remove all the samples with missing values, but
I wondered if there is an alternative that lets me keep the missing values?

I will be grateful for your help.

Regards,
Marta

Marta Piotrowska, PhD.
Postdoctoral Researcher
Crop and Soil Systems Group
SRUC
West Mains Road
Edinburgh EH9 3JG
Phone: 01315354294
Marta.Piotrowska at sruc.ac.uk

Please don't print this e-mail unless you really need to.

This e-mail message is confidential to the intended recipient at the email
address to which it has been addressed. If the message has been received by
you in error, it may not be disclosed to or used by anyone other than the
intended addressee, nor may it be copied in any way. If it is not intended for
you please inform us, immediately, then delete it from your system. If the
content is not about the business of the organisation then the message is not
from us nor is it sanctioned by us. Anything in this e-mail or its attachments
which does not relate to SRUC's or SAC Commercial Limited's official business
is neither given nor endorsed by SRUC or SAC Commercial Limited.

SRUC
A Charitable company limited by guarantee, Scottish Charity Number: SC003712.
Registered in Scotland, Company Number: SC103046 - Registered Office: Peter
Wilson Building, King's Buildings, West Mains Road, Edinburgh EH9 3JG
SAC Commercial Limited, an SRUC company
Registered in Scotland, Company Number: SC148684 - Registered Office: Peter
Wilson Building, King's Buildings, West Mains Road, Edinburgh EH9 3JG

From thibautjombart at gmail.com Tue Aug 2 16:26:14 2016
From: thibautjombart at gmail.com (Thibaut Jombart)
Date: Tue, 2 Aug 2016 15:26:14 +0100
Subject: [adegenet-forum] find.clusters missing values
In-Reply-To:
References:
Message-ID:

Hi Marta,

the function find.clusters has methods for matrix and data.frame objects. The
easiest way to proceed is to use 'tab' to extract allele frequencies from your
genind object, replacing missing values, and then to run find.clusters on the
result. For instance:

> data(microbov)
> summary(microbov)

// Number of individuals: 704
// Group sizes: 50 50 51 30 50 50 47 61 31 55 50 50 49 30 50
// Number of alleles per locus: 9 7 12 5 11 9 7 12 13 9 13 16 14 14 14 10 10 19 11 13 17 12 16 13 12 15 8 22 21 9
// Number of alleles per group: 251 235 143 179 194 212 146 196 176 200 213 186 191 168 188
// Percentage of missing data: 2.32 %
// Observed heterozygosity: 0.55 0.54 0.69 0.45 0.64 0.6 0.29 0.59 0.68 0.58 0.66 0.71 0.6 0.71 0.8 0.64 0.45 0.64 0.65 0.63 0.66 0.66 0.59 0.74 0.68 0.77 0.62 0.69 0.68 0.44
// Expected heterozygosity: 0.71 0.6 0.78 0.54 0.79 0.76 0.49 0.69 0.83 0.77 0.77 0.82 0.75 0.76 0.89 0.75 0.63 0.77 0.75 0.78 0.77 0.77 0.77 0.84 0.74 0.89 0.69 0.77 0.89 0.56

> x <- tab(microbov, freq=TRUE, NA.method="mean") # replace missing values here
> g <- find.clusters(x)
Choose the number PCs to retain (>=1):
....

Best
Thibaut

--
Dr Thibaut Jombart
Lecturer, Department of Infectious Disease Epidemiology
Imperial College London
https://sites.google.com/site/thibautjombart/
https://github.com/thibautjombart
Twitter: @TeebzR
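
A non-interactive variant of the find.clusters example in this thread can be
sketched as follows. It assumes adegenet 2.x is installed; the values given to
n.pca and n.clust are arbitrary placeholders chosen only so that the call does
not prompt, not recommendations for any real data set.

```r
library(adegenet)

data(microbov)

# Allele frequencies, with missing values replaced by the mean frequency
x <- tab(microbov, freq = TRUE, NA.method = "mean")

# After the replacement no NA should remain in the table
stopifnot(!anyNA(x))

# Supplying n.pca and n.clust skips the interactive prompts;
# both values are placeholders (assumptions), not tuned choices
g <- find.clusters(x, n.pca = 50, n.clust = 15)

# One inferred group per individual
table(g$grp)
```

Leaving n.pca and n.clust unset brings back the interactive prompts shown in
the original reply, which is usually the better route when exploring how many
clusters to keep.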
_______________________________________________
adegenet-forum mailing list
adegenet-forum at lists.r-forge.r-project.org
https://lists.r-forge.r-project.org/cgi-bin/mailman/listinfo/adegenet-forum

From Julien.Varaldi at univ-lyon1.fr Tue Aug 2 19:44:27 2016
From: Julien.Varaldi at univ-lyon1.fr (VARALDI JULIEN)
Date: Tue, 2 Aug 2016 17:44:27 +0000
Subject: [adegenet-forum] df2genind never stops
Message-ID:

Dear adegenet users,

I have two datasets that I would like to combine into a single one, ideally a
genlight one. The first dataset is a VCF file from the 1000 Genomes project. I
can read it using the package vcfR and then convert it to a genlight object.
This takes a while (a few minutes) but works fine:

vcf=read.vcfR(vcf_file)
my_genlight <- vcfR2genlight(x=vcf, n.cores = 8)

The other dataset is a data frame containing genotypes obtained from a
genome-wide SNP array. It contains the genotypes for 31 individuals at 868146
loci. The initial file is only 90 MB. I tried to use df2genind but without
success (I stopped it after 20 minutes or so... it was running without any
apparent error). Here is what I did:

> tab=read.table(my_data, head=T, sep=",")
> head(tab)
> loci=tab$rs_number
> tab=t(tab)
> tab=tab[-1,]
> colnames(tab)=loci

> tab[1:5, 1:4]
         rs10458597 rs9629043 rs11510103 rs12565286
Sample_4 "CC"       "CC"      "AA"       "CC"
Sample_5 "CC"       "NN"      "AA"       "CC"
Sample_6 "CC"       "CC"      "AA"       "CC"
Sample_7 "CC"       "CC"      "AA"       "CC"
Sample_8 "CC"       "CC"      "AA"       "CC"

> dim(tab)
[1]     31 868146

my_genind=df2genind(tab, ploidy=2, sep="", NA.char = "N")

This last command runs forever.

I would appreciate any suggestion. The next step is to combine the two
datasets, with the difficulty that one will be a genlight, the other a genind,
AND the 1000 Genomes dataset contains many more loci than the SNP dataset
(does repool deal with this situation?). I would also appreciate any input on
that.

I am running R 3.3.1 on Mac OS 10.11.4.

Thanks a lot,
cheers,
Julien

From thibautjombart at gmail.com Wed Aug 3 18:41:36 2016
From: thibautjombart at gmail.com (Thibaut Jombart)
Date: Wed, 3 Aug 2016 17:41:36 +0100
Subject: [adegenet-forum] df2genind never stops
In-Reply-To:
References:
Message-ID:

Hi Julien,

this may be pushing the limits of genind objects, as they really weren't
designed for more than a few hundred to a couple of thousand loci. As a sanity
check, I would still try converting a small subset to check all is fine, e.g.:

my_genind=df2genind(tab[,1:1000], ploidy=2, sep="", NA.char = "N")

If you wrap this within a 'system.time', you'll get an approximate idea of how
long the conversion of 1,000 loci takes; the extrapolation will give you a
lower bound for the actual time to expect for the entire dataset (the
algorithm does not scale linearly).

As for the further steps, this will not be straightforward. genlight and
genind objects cannot be combined as they are structurally very different: the
first codes SNPs as binary variables (where 0 and 1 have no specific meaning
other than differentiating 2 alleles), while the second stores data as allele
counts. As for repool, it does handle differences in alleles but loci have to
be the same. If you are to combine the two datasets, the best course of action
would be:

- combine them before (mapping everything against a reference?)
- combine them for the analysis, e.g. adding distances (possibly after some
  scaling), or using 2-table methods in the case of factorial analysis

Cheers
Thibaut
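
The system.time suggestion in this thread can be sketched as a small helper.
This is only an illustration: estimate_runtime, toy and toy_convert are
hypothetical names introduced here, and the toy converter merely stands in for
df2genind so that the sketch runs in base R without adegenet.

```r
# Time a conversion on the first n_sub columns (loci) of 'data' and
# extrapolate linearly to n_total loci. Since the conversion is said not to
# scale linearly, the extrapolated value is a lower bound at best.
estimate_runtime <- function(convert, data, n_sub, n_total = ncol(data)) {
  stopifnot(n_sub >= 1, n_sub <= ncol(data))
  subset_data <- data[, seq_len(n_sub), drop = FALSE]
  elapsed <- system.time(convert(subset_data))[["elapsed"]]
  c(subset_seconds = elapsed,
    lower_bound_seconds = elapsed * n_total / n_sub)
}

# Stand-in data: 31 individuals, 2000 loci, two-character genotypes ("NN" = missing)
set.seed(1)
toy <- matrix(sample(c("AA", "AC", "CC", "NN"), 31 * 2000, replace = TRUE),
              nrow = 31)

# Stand-in converter (an assumption, not df2genind): count alleles per locus
toy_convert <- function(m) {
  apply(m, 2, function(g) table(unlist(strsplit(g, ""))))
}

est <- estimate_runtime(toy_convert, toy, n_sub = 1000)
print(est)

# With the objects from the thread, the call would look like:
# estimate_runtime(function(m) df2genind(m, ploidy = 2, sep = "", NA.char = "N"),
#                  tab, n_sub = 1000)
```

Repeating the call with a second subset size (say n_sub = 2000) and comparing
the two timings gives a rough feel for how far from linear the scaling is.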
From Mark.Coulson.ic at uhi.ac.uk Fri Aug 5 10:15:35 2016
From: Mark.Coulson.ic at uhi.ac.uk (Mark Coulson)
Date: Fri, 5 Aug 2016 08:15:35 +0000
Subject: [adegenet-forum] redefining populations and excluding populations
Message-ID:

I have a genind object with an @pop factor; however, based on various cluster
analyses, I have redefined 4 groups (instead of the initial 12). I want to
assign this grouping factor to my genind object and then eliminate one of the
groups for subsequent analysis (presumably using the poppr package). How do I
go about this?

Inverness College UHI, a partner in the University of the Highlands and
Islands www.inverness.uhi.ac.uk Board of Management of Inverness College
(known as Inverness College UHI), Scottish Charity No SC021197.

From Mark.Coulson.ic at uhi.ac.uk Fri Aug 5 11:34:58 2016
From: Mark.Coulson.ic at uhi.ac.uk (Mark Coulson)
Date: Fri, 5 Aug 2016 09:34:58 +0000
Subject: [adegenet-forum] error in hclust
Message-ID:

Hello,

Yesterday I ran the following code and everything worked just fine. Today I
simply opened up my script and re-ran it and got the following:

laidon <- import2genind("laidon_project_data_no_sibs.str")
X <- tab(laidon, freq=TRUE, NA.method="mean")

D <- dist(X)
D <- as.matrix(D)

h1 <- hclust(D, method="complete")

Error in if (is.na(n) || n > 65536L) stop("size cannot be NA nor exceed 65536") :
  missing value where TRUE/FALSE needed

I am not sure what this error means, but I am more confused as to why it has
suddenly stopped working. Nothing else has changed.

Mark

From thibautjombart at gmail.com Fri Aug 5 11:40:36 2016
From: thibautjombart at gmail.com (Thibaut Jombart)
Date: Fri, 5 Aug 2016 10:40:36 +0100
Subject: [adegenet-forum] error in hclust
In-Reply-To:
References:
Message-ID:

Odd indeed. Are there any NAs in 'X'?

Cheers
Thibaut

From Mark.Coulson.ic at uhi.ac.uk Fri Aug 5 11:42:48 2016
From: Mark.Coulson.ic at uhi.ac.uk (Mark Coulson)
Date: Fri, 5 Aug 2016 09:42:48 +0000
Subject: [adegenet-forum] error in hclust
In-Reply-To:
References:
Message-ID:

There are, but this should be taken care of by NA.method, and again it worked
just fine yesterday!

Mark
From roman.lustrik at biolitika.si Fri Aug 5 11:52:19 2016
From: roman.lustrik at biolitika.si (Roman Luštrik)
Date: Fri, 5 Aug 2016 11:52:19 +0200 (CEST)
Subject: [adegenet-forum] error in hclust
In-Reply-To:
References:
Message-ID: <636729599.993068.1470390739346.JavaMail.zimbra@biolitika.si>

It's not entirely clear what you did yesterday. Can you provide a reproducible
example?

Cheers,
Roman

----
In god we trust, all others bring data.

From Mark.Coulson.ic at uhi.ac.uk Fri Aug 5 11:54:30 2016
From: Mark.Coulson.ic at uhi.ac.uk (Mark Coulson)
Date: Fri, 5 Aug 2016 09:54:30 +0000
Subject: [adegenet-forum] error in hclust
In-Reply-To: <636729599.993068.1470390739346.JavaMail.zimbra@biolitika.si>
References: <636729599.993068.1470390739346.JavaMail.zimbra@biolitika.si>
Message-ID:

I ran exactly that same script.

From roman.lustrik at biolitika.si Fri Aug 5 12:01:42 2016
From: roman.lustrik at biolitika.si (Roman Luštrik)
Date: Fri, 5 Aug 2016 12:01:42 +0200 (CEST)
Subject: [adegenet-forum] error in hclust
In-Reply-To:
References: <636729599.993068.1470390739346.JavaMail.zimbra@biolitika.si>
Message-ID: <1305499946.993121.1470391302050.JavaMail.zimbra@biolitika.si>

Hi,

it would be great if you could whip up a small reproducible example (you could
consider creating a gist at e.g. gist.github.com) which demonstrates this
problem. Without the data and code in hand, it's really all just speculation.

Cheers,
Roman

From Mark.Coulson.ic at uhi.ac.uk Fri Aug 5 12:35:50 2016
From: Mark.Coulson.ic at uhi.ac.uk (Mark Coulson)
Date: Fri, 5 Aug 2016 10:35:50 +0000
Subject: [adegenet-forum] error in hclust
In-Reply-To:
References:
Message-ID:

Ok,

So I got it working now by omitting the D <- as.matrix(D) step and running
hclust directly on the result of D <- dist(X).

M
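
The fix reported in this thread is consistent with how stats::hclust reads its
input: it expects an object of class "dist", so a square matrix has to be
converted back with as.dist() before clustering. A minimal base-R sketch, with
random numbers standing in for the allele-frequency table X (an assumption
made only to keep the example self-contained):

```r
set.seed(42)

# Stand-in for the allele-frequency table: 20 individuals, 5 columns
X <- matrix(runif(100), nrow = 20)

# dist() returns an object of class "dist", which hclust() accepts directly
D <- dist(X)
h1 <- hclust(D, method = "complete")

# as.matrix() gives a plain square matrix, handy for inspection or export
M <- as.matrix(D)

# Passing the matrix itself is expected to fail; the message is captured
# rather than asserted, since its wording may differ between R versions
msg <- tryCatch(hclust(M, method = "complete"),
                error = function(e) conditionMessage(e))
print(msg)

# Converting back to "dist" restores a valid input
h2 <- hclust(as.dist(M), method = "complete")

# Both routes describe the same tree
stopifnot(isTRUE(all.equal(h1$height, h2$height)))
```

Keeping the "dist" object and the matrix under different names, as D and M do
here, avoids overwriting the input that hclust needs.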
From thibautjombart at gmail.com Fri Aug 5 13:39:00 2016
From: thibautjombart at gmail.com (Thibaut Jombart)
Date: Fri, 5 Aug 2016 12:39:00 +0100
Subject: [adegenet-forum] error in hclust
In-Reply-To:
References:
Message-ID:

Hi there,

I tried offline - can't reproduce the issue, it works fine for me with:

> R.version
               _
platform       x86_64-pc-linux-gnu
arch           x86_64
os             linux-gnu
system         x86_64, linux-gnu
status
major          3
minor          3.1
year           2016
month          06
day            21
svn rev        70800
language       R
version.string R version 3.3.1 (2016-06-21)
nickname       Bug in Your Hair

> sessionInfo()
R version 3.3.1 (2016-06-21)
Platform: x86_64-pc-linux-gnu (64-bit)
Running under: Ubuntu 14.04.4 LTS

locale:
 [1] LC_CTYPE=en_GB.UTF-8       LC_NUMERIC=C
 [3] LC_TIME=en_GB.UTF-8        LC_COLLATE=en_GB.UTF-8
 [5] LC_MONETARY=en_GB.UTF-8    LC_MESSAGES=en_GB.UTF-8
 [7] LC_PAPER=en_GB.UTF-8       LC_NAME=C
 [9] LC_ADDRESS=C               LC_TELEPHONE=C
[11] LC_MEASUREMENT=en_GB.UTF-8 LC_IDENTIFICATION=C

attached base packages:
[1] stats graphics grDevices utils datasets methods base

other attached packages:
[1] adegenet_2.0.1  ade4_1.7-4      covr_2.2.0      testthat_1.0.2
[5] knitr_1.13      devtools_1.12.0

loaded via a namespace (and not attached):
 [1] Rcpp_0.12.6       spdep_0.6-6       plyr_1.8.4        LearnBayes_2.15
 [5] tools_3.3.1       boot_1.3-18       digest_0.6.10     memoise_1.0.0
 [9] tibble_1.1        gtable_0.2.0      nlme_3.1-128      lattice_0.20-33
[13] mgcv_1.8-13       Matrix_1.2-6      rex_1.1.1         igraph_1.0.1
[17] shiny_0.13.2.9004 DBI_0.4-1         parallel_3.3.1    coda_0.18-1
[21] cluster_2.0.4     withr_1.0.2       dplyr_0.5.0       stringr_1.0.0
[25] gtools_3.5.0      grid_3.3.1        R6_2.1.2          sp_1.2-3
[29] gdata_2.17.0      ggplot2_2.1.0     reshape2_1.4.1    seqinr_3.3-0
[33] deldir_0.1-12     magrittr_1.5      gmodels_2.16.2    splines_3.3.1
[37] scales_0.4.0      htmltools_0.3.5   MASS_7.3-45       assertthat_0.1
[41] permute_0.9-0     mime_0.5          ape_3.5           colorspace_1.2-6
[45] xtable_1.8-2      httpuv_1.3.3      stringi_1.1.1     lazyeval_0.2.0
[49] munsell_0.4.3     vegan_2.4-0       crayon_1.3.2

If the problem persists on the current version of R and adegenet, please post
an issue and we'll try to reproduce the problem.

Best
Thibaut

From thibautjombart at gmail.com Fri Aug 5 13:47:38 2016
From: thibautjombart at gmail.com (Thibaut Jombart)
Date: Fri, 5 Aug 2016 12:47:38 +0100
Subject: [adegenet-forum] redefining populations and excluding populations
In-Reply-To:
References:
Message-ID:

Hi there,

most handling is documented:
adegenetTutorial("basics")

section 5.1. See also the manpage for most accessors:
?nLoc

For instance, to exclude the Salers from microbov:
x <- microbov[i=!as.character(pop(microbov)) %in% c("Salers")]

Cheers
Thibaut

From zkamvar at gmail.com Fri Aug 5 19:04:16 2016
From: zkamvar at gmail.com (Zhian Kamvar)
Date: Fri, 5 Aug 2016 10:04:16 -0700
Subject: [adegenet-forum] redefining populations and excluding populations
In-Reply-To:
References:
Message-ID: <91F0FEA0-9966-4E99-9B6C-D6E0B40808A2@gmail.com>

For the populations, you can also use the pop argument in the brackets. The
previous example would look like this:

y <- microbov[pop = !popNames(microbov) %in% "Salers"]

You can also use strings or integers.

- Zhian
>>
>>
>> Inverness College UHI, a partner in the University of the Highlands and
>> Islands www.inverness.uhi.ac.uk Board of Management of Inverness College
>> (known as Inverness College UHI), Scottish Charity No SC021197.
>>
>> _______________________________________________
>> adegenet-forum mailing list
>> adegenet-forum at lists.r-forge.r-project.org
>> https://lists.r-forge.r-project.org/cgi-bin/mailman/listinfo/adegenet-forum
>>
> -------------- next part --------------
> An HTML attachment was scrubbed...
> URL:
>
> ------------------------------
>
> _______________________________________________
> adegenet-forum mailing list
> adegenet-forum at lists.r-forge.r-project.org
> https://lists.r-forge.r-project.org/cgi-bin/mailman/listinfo/adegenet-forum
>
> End of adegenet-forum Digest, Vol 96, Issue 7
> *********************************************

From jenifaus at gmail.com Tue Aug 16 07:51:08 2016
From: jenifaus at gmail.com (Jennifer Austin)
Date: Tue, 16 Aug 2016 10:21:08 +0430
Subject: [adegenet-forum] first step: creating "genind object" from mtDNA seq
Message-ID:

Hello all

I have a dataset of 200 individuals, and for each of them there is a 900 bp sequence of COI with 68 polymorphic sites. I want to use the find.clusters function in the adegenet package, and I would like to know how to prepare my dataset for the analysis. Is there an example file that shows how to prepare the data?

I would appreciate any help.

Regards,
Jenny
-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From thibautjombart at gmail.com Tue Aug 16 08:46:13 2016
From: thibautjombart at gmail.com (Thibaut Jombart)
Date: Tue, 16 Aug 2016 07:46:13 +0100
Subject: [adegenet-forum] first step: creating "genind object" from mtDNA seq
In-Reply-To:
References:
Message-ID:

Hi Jenny

You need to read your sequences into R and then convert them to a genind.
This is documented in several places but a starting point is the basics tutorial - see for instance fasta2DNAbin and DNAbin2genind.

Cheers
Thibaut

On 16 Aug 2016 06:51, "Jennifer Austin" wrote:

Hello all

I have a dataset of 200 individuals, and for each of them there is a 900 bp sequence of COI with 68 polymorphic sites. I want to use the find.clusters function in the adegenet package, and I would like to know how to prepare my dataset for the analysis. Is there an example file that shows how to prepare the data?

I would appreciate any help.

Regards,
Jenny

_______________________________________________
adegenet-forum mailing list
adegenet-forum at lists.r-forge.r-project.org
https://lists.r-forge.r-project.org/cgi-bin/mailman/listinfo/adegenet-forum
-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From csalgados at gmail.com Fri Aug 12 23:08:36 2016
From: csalgados at gmail.com (Catalina Salgado)
Date: Fri, 12 Aug 2016 21:08:36 -0000
Subject: [adegenet-forum] Help to get membership probabilities from DAPC analyses
Message-ID:

Hello all!

First, I apologize in advance if this question seems pretty basic. I have been using adegenet lately and I understand most of it. But I could not find a way to get the membership probabilities (the actual values) that were calculated when a DAPC analysis was run. I would like to have this file or information so I can compare it with the STRUCTURE membership probabilities.

Thank you very much in advance!

Best,

--
Catalina Salgado
USDA-ARS
Beltsville, MD
USA
-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From vanesse.labeyrie at cirad.fr Thu Aug 18 13:27:32 2016
From: vanesse.labeyrie at cirad.fr (vlabeyrie)
Date: Thu, 18 Aug 2016 11:27:32 -0000
Subject: [adegenet-forum] Problem DAPC scatterplot supplementary individuals
Message-ID: <57B59BA1.20505@cirad.fr>

Dear adegenet users,

I have a problem when plotting supplementary individuals on a DAPC scatterplot.
I defined two datasets: one on which to perform DAPC and another of supplementary individuals

x.sup_80 <- Ge_atp_gcp_80[c(1:nrow(Ge_atp@tab)),]  # supplementary individuals
x_80 <- Ge_atp_gcp_80[-c(1:nrow(Ge_atp@tab)),]  # individuals on which DAPC is performed

Then I performed DAPC on x_80, specifying a priori groups

dapc_GCP_14ssr_STRk5_b <- dapc(x_80, pop(x_80), n.pca=30, n.da=4)  # perform DAPC

I assigned supplementary individuals to DAPC groups

predict_atp_strk5 <- predict.dapc(dapc_GCP_14ssr_STRk5_b, newdata=x.sup_80)

The predicted group membership probabilities of the supplementary individuals based on the DAPC results are high, so I expected that supplementary individuals would be located within the DAPC groups ... But this is not the case: on the scatterplot, supplementary individuals mostly appear outside the groups to which they are assigned!

col <- c("#F8766D","#A3A500","#00BF7D","#00B0F6","#E76BF3")  # colors for African collection individuals
colb <- c("darkblue","dodgerblue","darkorange2","red","gold","grey")  # colors for supplementary Mount Kenya individuals (according to their STRUCTURE group)

# axes 1 and 2
col.points_80 <- transp(col[as.integer(pop(x_80))], .2)  # define the color of African individuals as transparent
scatter(dapc_GCP_14ssr_STRk5_b, col=col, bg="white", scree.da=0, pch="", cstar=0, clab=0, xlim=c(-10,10), legend=FALSE)
par(xpd=TRUE)
points(dapc_GCP_14ssr_STRk5_b$ind.coord[,1], dapc_GCP_14ssr_STRk5_b$ind.coord[,2], pch=20, col=col.points_80, cex=1)  # scatter DAPC groups / African GCP dataset
col.sup_80 <- colb[as.integer(pop(x.sup_80))]  # define supplementary individuals color
points(predict_atp_strk5$ind.scores[,1], predict_atp_strk5$ind.scores[,2], pch=8, col=transp(col.sup_80,.7), cex=1)  # plot supplementary individuals

With a previous version of adegenet this problem did not appear: supplementary individuals were located within the DAPC groups. The problem appeared when I ran my script with the new adegenet version ...

Does anyone have an idea of what the problem is?
I can provide the full script and data if needed.

Thank you for your help!

--
Vanesse Labeyrie
-------------- next part --------------
An HTML attachment was scrubbed...
URL:
-------------- next part --------------
A non-text attachment was scrubbed...
Name: gfhijbbh.png
Type: image/png
Size: 69217 bytes
Desc: not available
URL:
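
One way to narrow down a problem like the one described above is to re-project the individuals used to build the DAPC as if they were supplementary, and compare the result with the coordinates stored in the dapc object. The sketch below is only an illustration and is not the poster's script: it uses the microbov data shipped with adegenet, and the 20-per-breed split, the seed, the n.pca and n.da values, and the object names (dapc1, pred.train, pred.sup) are all arbitrary assumptions.

```r
library(adegenet)

data(microbov)   # example microsatellite data shipped with adegenet
set.seed(1)

# Hypothetical split: 20 individuals per breed for training, the rest supplementary
kept.id <- unlist(tapply(seq_len(nInd(microbov)), pop(microbov),
                         function(e) sample(e, 20, replace = FALSE)))
x     <- microbov[kept.id]
x.sup <- microbov[-kept.id]

# n.pca and n.da are arbitrary illustrative values
dapc1 <- dapc(x, pop(x), n.pca = 20, n.da = 4)

# Check 1: project the training individuals again and compare with the
# coordinates stored in the dapc object
pred.train <- predict.dapc(dapc1, newdata = x)
max(abs(pred.train$ind.scores - dapc1$ind.coord))

# Check 2: project the supplementary individuals and inspect their assignments
pred.sup <- predict.dapc(dapc1, newdata = x.sup)
table(assigned = pred.sup$assign, original = pop(x.sup))

# Membership probabilities (actual values) for supplementary and training individuals
head(round(pred.sup$posterior, 3))
head(round(dapc1$posterior, 3))
```

If the first check shows large differences, the mismatch probably comes from the projection step rather than from the plotting code. If the difference is close to zero, it may be worth comparing the supplementary points with the group centres in dapc1$grp.coord before suspecting the package.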