From neagef at gmail.com Mon Oct 6 17:30:14 2014 From: neagef at gmail.com (Andrea Garavito) Date: Mon, 6 Oct 2014 12:30:14 -0300 Subject: [adegenet-forum] Fwd: Significance of allelic contribution to discriminant functions In-Reply-To: References: Message-ID: Hello Caitlin, I was taking a look to the adegenet forum and I found this previous answer about a statistical threshold for marker contributions. Originally I was planing to retain for each one of my discriminant functions, around the 0.3% of markers with the highest contributions by establishing a threshold of 3-sigma. I'm not sure if these data are distributed normally, but as I have almost 5000 markers I was assuming so. Then I saw your post about the snpzip analysis and decided to give it a try. I tested the function with all the methods available, and I think I'll use the "median" method as with the others I'm getting to many markers retained (and only one with the "single" method). I see that the snpzip test make the analysis for the first discriminant function, but is there a way to make it also for the other discriminant functions found with DAPC? Thanks for your answer Andrea 2014-08-26 12:58 GMT-03:00 Caitlin Collins : > Yeah, it's new! > > I might as well note, in case you decide only to try a subset of the > methods available: > - Ward's method is most likely to select a very large number of variables > to get the most complete picture > - Single linkage hierarchical clustering will probably select the fewest > - Centroid clustering will probably select a useful middle-ground. > > You can always check to see what proportion of the variance is contained > in the subset of variables retained, or you could even try running a DAPC/ > PCA with just those variables to compare the discriminatory power of the > entire set with that of the subset selected. > > Good luck. > > Cheers, > Caitlin. > > > On Tue, Aug 26, 2014 at 4:31 PM, Charlie Waters wrote: > >> Thanks Caitlin! 
I've never come across the snpzip function so I'll give >> those clustering methods a try. >> >> Thanks, >> Charlie >> >> >> On Tue, Aug 26, 2014 at 3:49 AM, Caitlin Collins > > wrote: >> >>> Hi Charlie, >>> >>> Good question. Technically, there is no one "correct" statistical >>> solution to your problem. But, there *are *a number of ways of >>> approaching the problem with more statistical rigour than simply using an >>> arbitrary threshold as you have done. >>> >>> Have you taken a look at the snpzip function in the adegenet packge? If >>> not, just type "?snpzip" into R with the adegenet package loaded. With this >>> function, you can apply one of seven different hierarchical clustering >>> formulas to the allelic contributions generated by dapc. Essentially, each >>> hierarchical clustering method uses a unique approach to determine where >>> the threshold should be drawn. I should note, however, that this >>> descriptive approach will not have an associated p-value. You may want to >>> try out a few different methods before deciding which variables you want >>> to consider "most significant". >>> >>> I hope that helps! >>> >>> Best, >>> Caitlin >>> >> >> >> >> -- >> Charlie Waters >> Box 355020 >> School of Aquatic and Fishery Sciences >> University of Washington >> Seattle, WA 98105 >> >> > > _______________________________________________ > adegenet-forum mailing list > adegenet-forum at lists.r-forge.r-project.org > https://lists.r-forge.r-project.org/cgi-bin/mailman/listinfo/adegenet-forum > -------------- next part -------------- An HTML attachment was scrubbed... URL: From neagef at gmail.com Mon Oct 6 19:09:32 2014 From: neagef at gmail.com (Andrea Garavito) Date: Mon, 6 Oct 2014 14:09:32 -0300 Subject: [adegenet-forum] Fwd: Significance of allelic contribution to discriminant functions In-Reply-To: References: Message-ID: Hello again! 
I took a closer look into the object created by the snpzip tool, and I found the contributions for all the different axes. I didn't noticed them before as I was looking only at the plot obtained. Thanks anyway! Andrea 2014-10-06 12:30 GMT-03:00 Andrea Garavito : > Hello Caitlin, > I was taking a look to the adegenet forum and I found this previous answer > about a statistical threshold for marker contributions. > > Originally I was planing to retain for each one of my discriminant > functions, around the 0.3% of markers with the highest contributions by > establishing a threshold of 3-sigma. I'm not sure if these data are > distributed normally, but as I have almost 5000 markers I was assuming so. > Then I saw your post about the snpzip analysis and decided to give it a try. > I tested the function with all the methods available, and I think I'll use > the "median" method as with the others I'm getting to many markers retained > (and only one with the "single" method). > I see that the snpzip test make the analysis for the first discriminant > function, but is there a way to make it also for the other discriminant > functions found with DAPC? > > Thanks for your answer > Andrea > > > 2014-08-26 12:58 GMT-03:00 Caitlin Collins : > >> Yeah, it's new! >> >> I might as well note, in case you decide only to try a subset of the >> methods available: >> - Ward's method is most likely to select a very large number of variables >> to get the most complete picture >> - Single linkage hierarchical clustering will probably select the fewest >> - Centroid clustering will probably select a useful middle-ground. >> >> You can always check to see what proportion of the variance is contained >> in the subset of variables retained, or you could even try running a DAPC/ >> PCA with just those variables to compare the discriminatory power of the >> entire set with that of the subset selected. >> >> Good luck. >> >> Cheers, >> Caitlin. 
>> >> >> On Tue, Aug 26, 2014 at 4:31 PM, Charlie Waters wrote: >> >>> Thanks Caitlin! I've never come across the snpzip function so I'll give >>> those clustering methods a try. >>> >>> Thanks, >>> Charlie >>> >>> >>> On Tue, Aug 26, 2014 at 3:49 AM, Caitlin Collins < >>> caitiecollins at gmail.com> wrote: >>> >>>> Hi Charlie, >>>> >>>> Good question. Technically, there is no one "correct" statistical >>>> solution to your problem. But, there *are *a number of ways of >>>> approaching the problem with more statistical rigour than simply using an >>>> arbitrary threshold as you have done. >>>> >>>> Have you taken a look at the snpzip function in the adegenet packge? If >>>> not, just type "?snpzip" into R with the adegenet package loaded. With this >>>> function, you can apply one of seven different hierarchical clustering >>>> formulas to the allelic contributions generated by dapc. Essentially, each >>>> hierarchical clustering method uses a unique approach to determine where >>>> the threshold should be drawn. I should note, however, that this >>>> descriptive approach will not have an associated p-value. You may want to >>>> try out a few different methods before deciding which variables you want >>>> to consider "most significant". >>>> >>>> I hope that helps! >>>> >>>> Best, >>>> Caitlin >>>> >>> >>> >>> >>> -- >>> Charlie Waters >>> Box 355020 >>> School of Aquatic and Fishery Sciences >>> University of Washington >>> Seattle, WA 98105 >>> >>> >> >> _______________________________________________ >> adegenet-forum mailing list >> adegenet-forum at lists.r-forge.r-project.org >> >> https://lists.r-forge.r-project.org/cgi-bin/mailman/listinfo/adegenet-forum >> > > -------------- next part -------------- An HTML attachment was scrubbed... 
From gemm2470 at uni-landau.de Sun Oct 12 10:47:23 2014
From: gemm2470 at uni-landau.de (Isabelle Gemmer)
Date: Sun, 12 Oct 2014 10:47:23 +0200
Subject: [adegenet-forum] Isolation by distance (Mantel test)
Message-ID: <543A401B.4060605@uni-landau.de>

Hello,

I set up coordinates for the Mantel test.

> data1$other$xy<-dataxy
> Dgeo <- dist(data1$other$xy)
> ibd <- mantel.randtest(Dgen,Dgeo)

It worked well. But in reality, the examined organisms cannot swim through a lake; they can only migrate along the shoreline. Thus, I measured the distances and prepared my own matrix of geographic distances.

My question is: Can I also use my own measured geographic distances instead of coordinates?

Regards,
Isabelle

From vojta at trapa.cz Sun Oct 12 12:28:06 2014
From: vojta at trapa.cz (Vojtěch Zeisek)
Date: Sun, 12 Oct 2014 12:28:06 +0200
Subject: [adegenet-forum] Isolation by distance (Mantel test)
In-Reply-To: <543A401B.4060605@uni-landau.de>
References: <543A401B.4060605@uni-landau.de>
Message-ID: <19382021.B1bExHcYvj@veles.site>

Hello

On Sun 12 October 2014 10:47:23, Isabelle Gemmer wrote:
> Hello,
>
> I installed coordinates for the Mantel test.
>
> > data1$other$xy<-dataxy
> > Dgeo <- dist(data1$other$xy)
> > ibd <- mantel.randtest(Dgen,Dgeo)
>
> It worked well. But in reality, the examined organisms can not swim
> through a lake, they can only migrate along a shoreline. Thus, I
> measured the distances and provide an own matrix of geographic distances.
>
> My question is: Can I also install own measured geographic distances
> instead of coordinates?

Sure, just use the matrix of distances along the shoreline as Dgeo. You wouldn't use the dist function; instead, compute the distances in, for example, a GIS, import them into R and convert the matrix with as.dist(), since mantel.randtest expects objects of class dist.

> Regards,
> Isabelle

Sincerely,
Vojtěch

--
Vojtěch Zeisek
http://trapa.cz/en/

Department of Botany, Faculty of Science
Charles University in Prague
Benátská 2, Prague, 12801, CZ
http://botany.natur.cuni.cz/en/

Institute of Botany, Academy of Science
Zámek 1, Průhonice, 25243, CZ
http://www.ibot.cas.cz/en/

Czech Republic
-------------- next part --------------
A non-text attachment was scrubbed...
Name: signature.asc
Type: application/pgp-signature
Size: 490 bytes
Desc: This is a digitally signed message part.

From zuzmus at gmail.com Thu Oct 9 11:55:09 2014
From: zuzmus at gmail.com (zuzmus)
Date: Thu, 9 Oct 2014 11:55:09 +0200
Subject: [adegenet-forum] PCA sensitive to order of samples?
Message-ID:

Dear colleagues,

I would like to perform a PCA with the adegenet package and managed to go through the procedure to the end. The problem is that the results don't make sense and I see an obvious bias towards the order of the samples in the input matrix.

The matrix has 140 samples from 11 putative species and ca. 2800 SNPs coming from the RAD-seq method (only biallelic SNPs included; coded 0 - more frequent allele, 1 - heterozygote, 2 - rarer allele, NA - missing data). I used the following code:

> data <- read.table("/Users/zuzana/Matrix_for_adegenet_cutSNPsTo2484_NoHybrids.txt")
> x <- new("genlight", data)
> pca1 <- glPca(x)
> scatter(pca1, posi="bottomleft")

The results always show the first 5-7 individuals as strongly separated along PC1 and PC2, while the rest form one cluster. When I repeated the same analysis after removing the first few individuals from the matrix, the pattern stayed as it was - the new first individuals became separated.

[image: Inserted image 1]

I also tried to play with most of the options for the glPca command following the manual or help in R, but always got similar results...

Another issue is that I have quite a lot of missing data (10 - 35% per SNP, and ca. 10 - 50% per individual) in my matrix, but this was the trade-off of the experimental design ("sequence as much as possible as cheap as possible...").
But the first individuals in the list are quite well sequenced, so they are not the worst in terms of missing data...

I wonder if I missed some basics, if I did something wrong, or if it is possible that there really is a bias from the order of the samples in the matrix?

I would be very happy if somebody could help me to find out how to solve this issue. Thank you very much for any help and suggestions! :-)

With regards,
Zuzana

---
Zuzana Musilova, PhD.
Zoological Institute
University of Basel
Vesalgasse 1 | 4051 Basel
Switzerland | Europe
)><(((@>....<@)))><(
-------------- next part --------------
An HTML attachment was scrubbed...
-------------- next part --------------
A non-text attachment was scrubbed...
Name: Screen Shot 2014-10-09 at 11.22.14 AM.png
Type: image/png
Size: 27443 bytes
Desc: not available

From t.jombart at imperial.ac.uk Tue Oct 14 12:31:52 2014
From: t.jombart at imperial.ac.uk (Jombart, Thibaut)
Date: Tue, 14 Oct 2014 10:31:52 +0000
Subject: [adegenet-forum] PCA sensitive to order of samples?
In-Reply-To:
References:
Message-ID: <2CB2DA8E426F3541AB1907F98ABA6570A826EE08@icexch-m1.ic.ac.uk>

Hi there,

no, PCA is not sensitive to the ordering of samples.

Note: given the size of the dataset, it is probably easier to use the basic PCA procedure (dudi.pca). genlight objects are meant to be used whenever your computer could not otherwise store the data.

If your missing data are not randomly distributed, then many NAs are a problem: individuals with similar missing data will be seen as artificially similar, and SNPs with similar NAs will be seen as artificially correlated. It is safer to use less data, of better quality. In this case, you may want to remove SNPs with many NAs.
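A minimal sketch of that filtering step, assuming a plain 0/1/2 genotype matrix with individuals in rows and SNPs in columns. The toy data, the 20% cutoff and the mean imputation are illustrative assumptions rather than part of the original analysis; dudi.pca comes from the ade4 package and is only called if that package is installed:

```r
# Toy genotype matrix (simulated, for illustration only):
# 20 individuals x 50 SNPs coded 0/1/2, with 200 genotypes set to NA.
set.seed(1)
geno <- matrix(sample(0:2, 20 * 50, replace = TRUE), nrow = 20)
geno[sample(length(geno), 200)] <- NA

# Proportion of missing genotypes per SNP (column).
na_per_snp <- colMeans(is.na(geno))

# Hypothetical cutoff: keep SNPs with at most 20% missing genotypes.
max_na <- 0.20
keep <- na_per_snp <= max_na
geno_filt <- geno[, keep, drop = FALSE]

# Replace the remaining NAs by the mean of their SNP (one simple choice
# among several; the PCA call below cannot handle NAs).
col_means <- colMeans(geno_filt, na.rm = TRUE)
na_idx <- which(is.na(geno_filt), arr.ind = TRUE)
geno_filt[na_idx] <- col_means[na_idx[, "col"]]

# Basic PCA on the filtered matrix, only if ade4 is available.
if (requireNamespace("ade4", quietly = TRUE)) {
  pca_filt <- ade4::dudi.pca(as.data.frame(geno_filt), center = TRUE,
                             scale = FALSE, scannf = FALSE, nf = 2)
}
```

Comparing the PCA before and after filtering, and checking whether the outlying individuals share a pattern of missing genotypes, may help to tell a real signal from an artefact of the NAs.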
Cheers
Thibaut

________________________________
From: adegenet-forum-bounces at lists.r-forge.r-project.org [adegenet-forum-bounces at lists.r-forge.r-project.org] on behalf of zuzmus [zuzmus at gmail.com]
Sent: 09 October 2014 10:55
To: adegenet-forum at lists.r-forge.r-project.org
Subject: [adegenet-forum] PCA sensitive to order of samples?

Dear colleagues,

I would like to perform the PCA in adegenet package and managed to go through the procedure till the end. The problem is that the results don't make sense and I see an obvious bias towards the order of the samples in the input matrix.

The matrix has 140 samples from 11 putative species and cca 2800 SNPs coming from the RAD-seq method (only biallelic SNPs included; coded 0 - more frequent allele, 1 - heterozygote, 2 - rarer allele, NA - missing data). I used the following code:

> data <- read.table("/Users/zuzana/Matrix_for_adegenet_cutSNPsTo2484_NoHybrids.txt")
> x <- new("genlight", data)
> pca1 <- glPca(x)
> scatter(pca1, posi="bottomleft")

The results always show first 5-7 individuals as strongly separated along the PC1 and 2 and the rest forms one cluster. When I repeated the same analysis after removing the first few individual from the matrix, the pattern stayed as it was - the new first individuals became separated.

[Inserted image 1]

I also tried to play with most of the options for glPca command following the manual or help in R, but always got the similar results...

Another issue is that I have quite some missing data (10 - 35 % per SNP, and cca 10 - 50% per individual) in my matrix, but this was the trade off of the experiment design ("sequence as much as possible as cheap as possible..."). But the first individuals in the list are quite well sequenced, so they are not the worst in sense of missing data...

I wonder if I missed some basics, if I did something wrong or if it is possible that there really is a bias of the order of the samples in the matrix?
I would be very happy if somebody could help me to find out how to solve this issue. Thank you very much of any help and suggestion!:-)

With regards,
Zuzana

---
Zuzana Musilova, PhD.
Zoological Institute
University of Basel
Vesalgasse 1 | 4051 Basel
Switzerland | Europe
)><(((@>....<@)))><(
-------------- next part --------------
An HTML attachment was scrubbed...
-------------- next part --------------
A non-text attachment was scrubbed...
Name: Screen Shot 2014-10-09 at 11.22.14 AM.png
Type: image/png
Size: 27443 bytes
Desc: Screen Shot 2014-10-09 at 11.22.14 AM.png

From caitiecollins at gmail.com Thu Oct 16 13:38:42 2014
From: caitiecollins at gmail.com (Caitlin Collins)
Date: Thu, 16 Oct 2014 12:38:42 +0100
Subject: [adegenet-forum] Fwd: Significance of allelic contribution to discriminant functions
In-Reply-To:
References:
Message-ID:

Hello again Andrea,

Glad you found what you were looking for!

Incidentally, and in case anyone else on the forum is looking to visualise the variable contributions to discriminant axes > 1, here is some code to do so for a toy example. (The last chunk will be the relevant bit for creating loading plots):

library(adegenet)

# make a simulated dataset with 5 "groups"
simpop <- glSim(200, 1000, 40, k=5, sort.pop=TRUE)
snps <- as.matrix(simpop)
phen <- simpop@other$ancestral.pops

# for fun/ as a check, quickly visualise the clusters
dapc1 <- dapc(snps, phen, n.pca=50, n.da=4)
scatter(dapc1)

# create an object called foo that contains the results of running snpzip on your dataset
foo <- snpzip(snps, phen, xval.plot=TRUE, loading.plot=TRUE, method="centroid")

# isolate the DAPC component of the snpzip results, calling it "dapc1"
dapc1 <- foo$DAPC

# specify that you want to run the following lines for all DA
# (ie. from DA=1 to DA=(k-1), where k is the number of groups in your dataset)
DA <- c(1:dapc1$n.da)
par(ask=TRUE)

# generate separate loading plots for each DA
for(i in DA){
  title <- paste("Loading Plot for DA", i, sep=" ")
  # indices of the variables selected by snpzip for this DA
  maximus <- foo$FS[[i]][[2]]
  # set the threshold just below the smallest contribution among the selected variables
  cutoff <- abs(dapc1$var.contr[maximus,i][(which.min(dapc1$var.contr[maximus,i]))])-0.000001
  loadingplot(dapc1$var.contr[, i], threshold=cutoff, main=title)
}

Hope that helps! And thanks for your input: I'll try to implement the above code within snpzip to generate loading plots for all DA automatically in the next release of adegenet.

Cheers,
Caitlin.

On Mon, Oct 6, 2014 at 6:09 PM, Andrea Garavito wrote:

> Hello again!
>
> I took a closer look into the object created by the snpzip tool, and I
> found the contributions for all the different axes.
> I didn't noticed them before as I was looking only at the plot obtained.
>
> Thanks anyway!
> Andrea
>
>
> 2014-10-06 12:30 GMT-03:00 Andrea Garavito :
>
> Hello Caitlin,
>> I was taking a look to the adegenet forum and I found this previous
>> answer about a statistical threshold for marker contributions.
>>
>> Originally I was planing to retain for each one of my discriminant
>> functions, around the 0.3% of markers with the highest contributions by
>> establishing a threshold of 3-sigma. I'm not sure if these data are
>> distributed normally, but as I have almost 5000 markers I was assuming so.
>> Then I saw your post about the snpzip analysis and decided to give it a try.
>> I tested the function with all the methods available, and I think I'll
>> use the "median" method as with the others I'm getting to many markers
>> retained (and only one with the "single" method).
>> I see that the snpzip test make the analysis for the first discriminant
>> function, but is there a way to make it also for the other discriminant
>> functions found with DAPC?
>>
>> Thanks for your answer
>> Andrea
>>
>>
>> 2014-08-26 12:58 GMT-03:00 Caitlin Collins :
>>
>>> Yeah, it's new!
>>> >>> I might as well note, in case you decide only to try a subset of the >>> methods available: >>> - Ward's method is most likely to select a very large number of >>> variables to get the most complete picture >>> - Single linkage hierarchical clustering will probably select the fewest >>> - Centroid clustering will probably select a useful middle-ground. >>> >>> You can always check to see what proportion of the variance is contained >>> in the subset of variables retained, or you could even try running a DAPC/ >>> PCA with just those variables to compare the discriminatory power of the >>> entire set with that of the subset selected. >>> >>> Good luck. >>> >>> Cheers, >>> Caitlin. >>> >>> >>> On Tue, Aug 26, 2014 at 4:31 PM, Charlie Waters wrote: >>> >>>> Thanks Caitlin! I've never come across the snpzip function so I'll give >>>> those clustering methods a try. >>>> >>>> Thanks, >>>> Charlie >>>> >>>> >>>> On Tue, Aug 26, 2014 at 3:49 AM, Caitlin Collins < >>>> caitiecollins at gmail.com> wrote: >>>> >>>>> Hi Charlie, >>>>> >>>>> Good question. Technically, there is no one "correct" statistical >>>>> solution to your problem. But, there *are *a number of ways of >>>>> approaching the problem with more statistical rigour than simply using an >>>>> arbitrary threshold as you have done. >>>>> >>>>> Have you taken a look at the snpzip function in the adegenet packge? >>>>> If not, just type "?snpzip" into R with the adegenet package loaded. With >>>>> this function, you can apply one of seven different hierarchical clustering >>>>> formulas to the allelic contributions generated by dapc. Essentially, each >>>>> hierarchical clustering method uses a unique approach to determine where >>>>> the threshold should be drawn. I should note, however, that this >>>>> descriptive approach will not have an associated p-value. You may want to >>>>> try out a few different methods before deciding which variables you want >>>>> to consider "most significant". 
>>>>>
>>>>> I hope that helps!
>>>>>
>>>>> Best,
>>>>> Caitlin
>>>>>
>>>>
>>>> --
>>>> Charlie Waters
>>>> Box 355020
>>>> School of Aquatic and Fishery Sciences
>>>> University of Washington
>>>> Seattle, WA 98105
>>>>
-------------- next part --------------
An HTML attachment was scrubbed...

From caitiecollins at gmail.com Thu Oct 16 14:28:13 2014
From: caitiecollins at gmail.com (Caitlin Collins)
Date: Thu, 16 Oct 2014 13:28:13 +0100
Subject: [adegenet-forum] Fwd: Question about how to interpret Cross validation in my analysis. Thanks!
In-Reply-To:
References:
Message-ID:

---------- Forwarded message ----------
From: Caitlin Collins
Date: Thu, Oct 16, 2014 at 1:27 PM
Subject: Re: Question about how to interpret Cross validation in my analysis. Thanks!
To: Angela Merino
Cc: "Collins, Caitlin" , "Jombart, Thibaut"

Hi Angela,

Well, I have two pieces of good news for you, and one piece of mediocre news.

First, there's nothing to worry about with respect to the "NULL" that you are seeing. It just gets printed when xval.plot=TRUE as an artefact of one of the lines of the printing function. It has no meaning, and certainly does not imply that your model is not valid. (Given the stress that I now realise this glaring "NULL" may cause, I've changed the way the plots print now, so in the next release of adegenet this won't happen.)

Second, you are absolutely correct in your interpretation of the results of xvalDapc (which are stored in whatever object you assigned the results to, in your case, "xval").
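As a small sketch of how those stored results can be inspected: the list below is rebuilt by hand from the output quoted further down in this message, so it only mimics the structure of an xvalDapc result. The element names are taken from that quoted output; the object name xval_like is hypothetical.

```r
# Values copied from the xvalDapc output quoted in this thread.
# xval_like is a hand-built stand-in, not a real xvalDapc result.
n_pcs <- c(5, 10, 15, 20, 25, 30, 35, 40)
xval_like <- list(
  `Mean Successful Assignment by Number of PCs of PCA` =
    setNames(c(0.5871429, 0.6000000, 0.5819048, 0.6014286,
               0.6952381, 0.6747619, 0.6333333, 0.6109524), n_pcs),
  `Root Mean Squared Error by Number of PCs of PCA` =
    setNames(c(0.4301795, 0.4141872, 0.4389381, 0.4131429,
               0.3241735, 0.3531491, 0.3885084, 0.4145894), n_pcs)
)

success <- xval_like[["Mean Successful Assignment by Number of PCs of PCA"]]
rmse    <- xval_like[["Root Mean Squared Error by Number of PCs of PCA"]]

# Number of PCs with the highest mean success and with the lowest RMSE.
best_by_success <- names(success)[which.max(success)]
best_by_rmse    <- names(rmse)[which.min(rmse)]

# How much mean success changes across the tested numbers of PCs.
spread <- diff(range(success))

print(best_by_success)
print(best_by_rmse)
print(round(spread, 3))
```

With the quoted numbers, both criteria point to the same number of PCs, and the spread shows how flat or peaked the cross-validation curve is.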
This brings me to the mediocre news: given that your interpretation is correct, it seems that the best model you can achieve with DAPC, where n.pca=25, is only able to predict the group membership of validation set individuals in 63% of the cases, with a 32% root mean squared error. Arguably, this is not great.

Your final comment on the matter, though, is quite insightful. The fact that you can achieve the same modest level of success with 20-80 PCs indicates that the optimisation procedure has not been particularly successful. Ideally, one would like to see an arch, with a maximum success point somewhere in the middle. In your case, there is a bit of an arch, but it isn't particularly striking.

The only thing I might add to your interpretation of this result is that it's not so much that the model is poor because a similar level of success can be achieved with variable numbers of PCs. If mean success was virtually constant, but varying around 90%, the interpretation would not be that the model is poor, but rather that most levels of PC retention can compose a model that effectively discriminates between groups.

I hope this has helped answer some of your questions. If you have any more, please feel free to ask.

Best,
Caitlin.

On Mon, Oct 13, 2014 at 11:48 PM, Angela Merino <Angela.Merino at cawthron.org.nz> wrote:

> Hi Caitlin Collins and Thibaut Jombart,
>
> My name is Angela Parody-Merino and I am a PhD student at Massey
> University (New Zealand). I am studying the population genetic structure in
> a migratory bird (the New Zealand Godwit) with 23 microsatellites. Anyway,
> maybe this is a very simple question but I really want to understand and be
> sure about the meaning and interpretation of the output when doing
> cross-validation. I have been some days looking in the internet and reading
> explanations etc... without being able to really understand what's going on
> with my analysis. Could you help me please?
>
> This is the script of the analysis:
>
> > x <- ELpop
> > mat <- as.matrix(na.replace(x, method="mean"))
>
> Replaced 371 missing values
>
> > grp <- pop(x)
> > xval <- xvalDapc(mat, grp, n.pca.max = 40, training.set = 0.9,
> +                  result = "groupMean", center = TRUE, scale = FALSE,
> +                  n.pca = NULL, n.rep = 500, xval.plot = TRUE)
>
> NULL   >>> What does it mean this NULL? Does it mean that the model is not valid?
>
> $`Median and Confidence Interval for Random Chance`
>      2.5%       50%     97.5%
> 0.4294840 0.4928747 0.5962807
>
> $`Mean Successful Assignment by Number of PCs of PCA`
>         5        10        15        20        25        30        35        40
> 0.5871429 0.6000000 0.5819048 0.6014286 0.6952381 0.6747619 0.6333333 0.6109524
>
> $`Number of PCs Achieving Highest Mean Success`
> [1] "25"
>
> $`Root Mean Squared Error by Number of PCs of PCA`
>         5        10        15        20        25        30        35        40
> 0.4301795 0.4141872 0.4389381 0.4131429 0.3241735 0.3531491 0.3885084 0.4145894
>
> $`Number of PCs Achieving Lowest MSE`
> [1] "25"
>
> From the screenshot and the output results of the cross validation (in
> blue), I would say that my model (retaining 25PCs) can predict with a mean
> of 63% but it is not such a good model because most of the models that can
> be obtained by retaining 20, 40, 60, 80 PCs are quite the same successful.
> Is it my interpretation correct?
>
> Thanks in advance,
>
> Kind regards,
>
> Angela Parody-Merino
-------------- next part --------------
An HTML attachment was scrubbed...
-------------- next part --------------
A non-text attachment was scrubbed...
Name: image002.jpg
Type: image/jpeg
Size: 48953 bytes
Desc: not available

From caitiecollins at gmail.com Thu Oct 16 15:01:58 2014
From: caitiecollins at gmail.com (Caitlin Collins)
Date: Thu, 16 Oct 2014 14:01:58 +0100
Subject: [adegenet-forum] Trouble converting to genid object
In-Reply-To: <2CB2DA8E426F3541AB1907F98ABA6570A826B775@icexch-m1.ic.ac.uk>
References: <2CB2DA8E426F3541AB1907F98ABA6570A826B775@icexch-m1.ic.ac.uk>
Message-ID:

Hi,

Sorry for the delay.

I think the problem may be something simple to do with the format or the row and column names of the object test. When I tried the example with the data you sent, the first approach worked right away.

Could you try something, perhaps silly, just to rule this out as the cause:

# replace filename below with the path to your file, wherever it is on your computer
filename <- "C:/Cait/Work/adegenet forum Qs/test.txt"

# use read.table to read in the file anew to try to get it in the same format that I have it in
test <- read.table(filename)

# confirm that it looks to have the correct dimensions, names, contents
head(test)

# try creating a genind object out of test using the first approach you put forward
obj1 <- genind(test, ploidy=1, type="PA")

# confirm that a genind was created
obj1

# confirm that it looks the same as the original object when in matrix form
head(as.matrix(obj1))

Then please let me know if that works for you.
If not, could you paste back the results or errors you get from the above commands? Best of luck. Cheers, Caitlin. On Thu, Sep 25, 2014 at 11:04 AM, Jombart, Thibaut wrote: > > Hi there, > > it looks like a bug. I'll investigate and get back to you. > > Cheers > Thibaut > > ------------------------------ > *From:* adegenet-forum-bounces at lists.r-forge.r-project.org [ > adegenet-forum-bounces at lists.r-forge.r-project.org] on behalf of Jackie > Lighten [Jackie.Lighten at Dal.Ca] > *Sent:* 22 September 2014 12:59 > *To:* adegenet-forum at lists.r-forge.r-project.org > *Subject:* [adegenet-forum] Trouble converting to genid object > > Hi, > > I am having trouble converting a presence/absence genotype data frame to > a genid object > > Please see attached for test data file. > > Using > > obj2 <- genind(test, ploidy=1, type="PA") > > I get the error: > > Error in `colnames<-`(`*tmp*`, value = c("L1", "L2")) : > length of 'dimnames' [2] not equal to array extent > > > Using > > obj2 <- df2genind(test, ploidy=1, type="PA") > > I get the error: > > Error in `colnames<-`(`*tmp*`, value = "L1") : > length of 'dimnames' [2] not equal to array extent > In addition: Warning messages: > 1: In eval(expr, envir, enclos) : NAs introduced by coercion > 2: In df2genind(test, ploidy = 1, type = "PA") : > entirely non-type marker(s) deleted > > > Any help would be much appreciated > > Thanks, > > Jack > > _______________________________________________ > adegenet-forum mailing list > adegenet-forum at lists.r-forge.r-project.org > https://lists.r-forge.r-project.org/cgi-bin/mailman/listinfo/adegenet-forum > -------------- next part -------------- An HTML attachment was scrubbed... URL: From goatsrunfaster at gmail.com Tue Oct 21 17:28:20 2014 From: goatsrunfaster at gmail.com (Spencer Bruce) Date: Tue, 21 Oct 2014 11:28:20 -0400 Subject: [adegenet-forum] creating genetic data from scratch Message-ID: Hello All, Im looking to create some micro-satellite data for a simulation study. 
Is there a way to create a data set for a number of individuals, with a given number of loci, and neutral alleles at each loci? Im basically looking to simulate admixture between two pops but the number of individuals i have actual data for is only 20 or so from each, where I need to create a scenario with hundreds of individuals. any info would be greatly appreciated! Best, Spencer -- Spencer A Bruce 200 Washington St. Troy, NY 12180 518 225 0787 -------------- next part -------------- An HTML attachment was scrubbed... URL: From t.jombart at imperial.ac.uk Tue Oct 21 20:31:26 2014 From: t.jombart at imperial.ac.uk (Jombart, Thibaut) Date: Tue, 21 Oct 2014 18:31:26 +0000 Subject: [adegenet-forum] creating genetic data from scratch In-Reply-To: References: Message-ID: <2CB2DA8E426F3541AB1907F98ABA6570ABE699AF@icexch-m1.ic.ac.uk> Hello, Not in adegenet, but there are software around to do this - check out easypop for instance, which outputs files compatible with adegenet. Cheers Thibaut ________________________________ From: adegenet-forum-bounces at lists.r-forge.r-project.org [adegenet-forum-bounces at lists.r-forge.r-project.org] on behalf of Spencer Bruce [goatsrunfaster at gmail.com] Sent: 21 October 2014 16:28 To: adegenet-forum at lists.r-forge.r-project.org Subject: [adegenet-forum] creating genetic data from scratch Hello All, Im looking to create some micro-satellite data for a simulation study. Is there a way to create a data set for a number of individuals, with a given number of loci, and neutral alleles at each loci? Im basically looking to simulate admixture between two pops but the number of individuals i have actual data for is only 20 or so from each, where I need to create a scenario with hundreds of individuals. any info would be greatly appreciated! Best, Spencer -- Spencer A Bruce 200 Washington St. Troy, NY 12180 518 225 0787 -------------- next part -------------- An HTML attachment was scrubbed... 
URL:

From goatsrunfaster at gmail.com Thu Oct 23 16:29:10 2014
From: goatsrunfaster at gmail.com (Spencer Bruce)
Date: Thu, 23 Oct 2014 10:29:10 -0400
Subject: [adegenet-forum] repooling random rows from genind objects
Message-ID:

Hello All!

I have three separate populations as genind objects. What I would like to do is pull a certain number of random individuals from each, to form a new single genind population.

I would then like individuals from this new genind population to mate randomly, producing another genind object which would contain their offspring.

Below is the code I came up with (which does not work):

Year1 <- repool(F1[sample(nrow(F1), 500), ], pop1[sample(nrow(pop1), 750), ], pop2[sample(nrow(pop2), 750), ], n=2000)

Year2 <- hybridize(Year1[sample(nrow(Year1), 1000), ], Year1[sample(nrow(Year1), 1000), ], n=2000)

Any help would be greatly appreciated!

Best,
Spencer

--
Spencer A Bruce
200 Washington St.
Troy, NY 12180
518 225 0787
-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From t.jombart at imperial.ac.uk Thu Oct 23 16:50:24 2014
From: t.jombart at imperial.ac.uk (Jombart, Thibaut)
Date: Thu, 23 Oct 2014 14:50:24 +0000
Subject: [adegenet-forum] repooling random rows from genind objects
In-Reply-To:
References:
Message-ID: <2CB2DA8E426F3541AB1907F98ABA6570ABE6AE7F@icexch-m1.ic.ac.uk>

Hello,
hard to figure out what is wrong without the error message..
Cheers
Thibaut
-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From goatsrunfaster at gmail.com Thu Oct 23 16:52:59 2014
From: goatsrunfaster at gmail.com (Spencer Bruce)
Date: Thu, 23 Oct 2014 10:52:59 -0400
Subject: [adegenet-forum] repooling random rows from genind objects
In-Reply-To: <2CB2DA8E426F3541AB1907F98ABA6570ABE6AE7F@icexch-m1.ic.ac.uk>
References: <2CB2DA8E426F3541AB1907F98ABA6570ABE6AE7F@icexch-m1.ic.ac.uk>
Message-ID:

Error message:

Error in sample.int(length(x), size, replace, prob) :
  invalid first argument

--
Spencer A Bruce
200 Washington St.
Troy, NY 12180
518 225 0787
-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From t.jombart at imperial.ac.uk Thu Oct 23 16:54:53 2014
From: t.jombart at imperial.ac.uk (Jombart, Thibaut)
Date: Thu, 23 Oct 2014 14:54:53 +0000
Subject: [adegenet-forum] repooling random rows from genind objects
In-Reply-To:
References: <2CB2DA8E426F3541AB1907F98ABA6570ABE6AE7F@icexch-m1.ic.ac.uk>,
Message-ID: <2CB2DA8E426F3541AB1907F98ABA6570ABE6AE93@icexch-m1.ic.ac.uk>

What does nrow(F1) and other nrow(...)'s say?
-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From goatsrunfaster at gmail.com Thu Oct 23 17:03:54 2014
From: goatsrunfaster at gmail.com (Spencer Bruce)
Date: Thu, 23 Oct 2014 11:03:54 -0400
Subject: [adegenet-forum] repooling random rows from genind objects
In-Reply-To: <2CB2DA8E426F3541AB1907F98ABA6570ABE6AE93@icexch-m1.ic.ac.uk>
References: <2CB2DA8E426F3541AB1907F98ABA6570ABE6AE7F@icexch-m1.ic.ac.uk> <2CB2DA8E426F3541AB1907F98ABA6570ABE6AE93@icexch-m1.ic.ac.uk>
Message-ID:

They both say NULL, if I just type them into R.

Just to be clear, these genind objects contain microsat data for 11 loci for thousands of individuals.

I'm rather new to R, so I apologize if I'm missing something obvious here...

--
Spencer A Bruce
200 Washington St.
Troy, NY 12180
518 225 0787
-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From francesco.montinaro at gmail.com Thu Oct 23 17:19:19 2014
From: francesco.montinaro at gmail.com (Francesco Montinaro)
Date: Thu, 23 Oct 2014 16:19:19 +0100
Subject: [adegenet-forum] adegenet-forum Digest, Vol 74, Issue 9
In-Reply-To:
References:
Message-ID:

Hi,
I think that the problem is that since a genind object is a list, the nrow is NULL.

Probably you want to sample from object$tab instead.

Hope it helps.

Best

Francesco Montinaro
-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From hilpert at ipk-gatersleben.de Fri Oct 24 09:13:22 2014
From: hilpert at ipk-gatersleben.de (Stefanie Hilpert)
Date: Fri, 24 Oct 2014 07:13:22 +0000
Subject: [adegenet-forum] DAPC & Ploidylevel
Message-ID:

Dear everybody,

I am currently using the adegenet package to perform a structure analysis of my microsatellite dataset and compare it to the results of an analysis using STRUCTURE software. The organism I am working on is an apomictic plant and I am aware that STRUCTURE is probably not adequate because it assumes HWE and asexuality violates HWE. Nevertheless we use STRUCTURE analysis for apomicts, because in most of the cases the assigned number of groups correlates to biological traits. Knowing that there is a bias using STRUCTURE we decided to perform a DAPC additionally.

But now I ran into another problem using adegenet. I am working with a mixed ploidy system with ploidies ranging from 4 to 11. To implement the data into the adegenet package we coded all individuals as 11x because otherwise it was not possible to load the data. Now I am wondering how big the bias is if the calculation assumes that, for example, a tetraploid is now a hendecaploid, and if I could still trust the results.
I am asking because the results of the DAPC are completely different to the ones of STRUCTURE which puzzles me a bit because I somehow at least expected correlations (the number of optimal k is the same, but the assigned individuals to the clusters differ completely).

I would appreciate some help

Stefanie Hilpert

------------------------------------------------------------------------------
Stefanie Hilpert
-PhD Candidate-
Dept. of Cytogenetics and Genome Analysis
Leibniz Institute of Plant Genetics and Crop Plant Research (IPK)
Corrensstraße 3, D-06466 Gatersleben
Germany
+49 (0)39482 5673
IPK Graduate School
International Max-Planck Research School
-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From tingpu89 at gmail.com Mon Oct 20 19:18:56 2014
From: tingpu89 at gmail.com (Ting Pu)
Date: Mon, 20 Oct 2014 10:18:56 -0700
Subject: [adegenet-forum] $li score in sPCA
Message-ID:

Hi all,

I was just wondering in sPCA, after I have selected the first positive principal component (which represents global structures), how should I interpret the positiveness and negativeness of the $li (entity scores)? Does a high positive $li mean its spatial correlation is stronger than a negative li? Please correct me if I am wrong.

Thank you for your time,
Ting

From t.jombart at imperial.ac.uk Fri Oct 24 12:07:49 2014
From: t.jombart at imperial.ac.uk (Jombart, Thibaut)
Date: Fri, 24 Oct 2014 10:07:49 +0000
Subject: [adegenet-forum] $li score in sPCA
In-Reply-To:
References:
Message-ID: <2CB2DA8E426F3541AB1907F98ABA6570ABE6C2D3@icexch-m1.ic.ac.uk>

Hello,

as in any multivariate analysis, the sign of the PCs is arbitrary. Only the distance between individuals on this PC has a meaning, i.e. if you have (using integers to make things simpler):

A = -1
B = 1
C = 3
D = 5

Then the difference between A and C is the same as between B and D.
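To make the point about signs concrete, here is a small base-R sketch using the four scores from the example above (the object names `scores`, `flipped` and so on are only for illustration). Flipping the sign of the whole axis changes every score but leaves all pairwise distances untouched, which is why only the spacing between individuals carries meaning.

```r
# Scores of four individuals on one axis, as in the example above.
scores  <- c(A = -1, B = 1, C = 3, D = 5)

# An equally valid solution of the same analysis: the axis with its sign flipped.
flipped <- -scores

# Pairwise distances between individuals are identical for both orientations.
d_orig <- dist(scores)
d_flip <- dist(flipped)
same_distances <- isTRUE(all.equal(as.vector(d_orig), as.vector(d_flip)))

# The A-C gap equals the B-D gap, whatever the sign of the axis.
gap_AC <- abs(scores[["A"]] - scores[["C"]])
gap_BD <- abs(scores[["B"]] - scores[["D"]])

print(same_distances)
print(c(gap_AC = gap_AC, gap_BD = gap_BD))
```

So a large positive score is not "stronger" than a large negative one; what matters is how far apart two individuals sit on the axis.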
Cheers
Thibaut

From goatsrunfaster at gmail.com Fri Oct 24 16:29:58 2014
From: goatsrunfaster at gmail.com (Spencer Bruce)
Date: Fri, 24 Oct 2014 10:29:58 -0400
Subject: [adegenet-forum] adegenet-forum Digest, Vol 74, Issue 9
In-Reply-To:
References:
Message-ID:

Hello All,

Thanks for the tip Francesco! This almost works for me... When I enter the code below in an attempt to randomly sample 20 individuals from the genind object "tdhybrids", I get back the new genind object called Random. I then used genind2genotype to view the contents of "Random", but there are only 2 individuals (not 20).

The code I used is below:

Random <- tdhybrids[sample(tdhybrids$tab, 20), ]
obj <- genind2genotype(Random)

Am I missing something here? A big thank you to everyone in advance for putting up with my questions.

-Spencer

On Thu, Oct 23, 2014 at 11:19 AM, Francesco Montinaro <francesco.montinaro at gmail.com> wrote:

> Hi,
> I think that the problem is that since a genind object is a list, the nrow
> is NULL.
>
> Probably you want to sample from object$tab instead.
>
> Hope it helps.
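A note on the sampling step: `sample(tdhybrids$tab, 20)` draws 20 values from the cells of the allele table rather than 20 row numbers, which would probably explain why so few individuals come back. The sketch below shows the difference on a plain matrix standing in for the table (`tab`, `bad_idx` and `good_idx` are made-up names, and the data are random). The second part is an untested-against-this-thread sketch that assumes adegenet is installed and uses its bundled `microbov` data instead of the poster's objects; `nInd()` returns the number of individuals in a genind object, which is what `nrow()` was meant to supply.

```r
set.seed(42)

# Stand-in for a genind allele table: 50 individuals x 6 alleles,
# cells holding allele frequencies 0, 0.5 or 1 (hypothetical data).
tab <- matrix(sample(c(0, 0.5, 1), 50 * 6, replace = TRUE),
              nrow = 50,
              dimnames = list(paste0("ind", 1:50), paste0("L", 1:6)))

# Pitfall: this samples cell VALUES, so the "indices" are only 0, 0.5 or 1.
bad_idx <- sample(tab, 20)
bad <- tab[bad_idx, , drop = FALSE]   # 0 and 0.5 select nothing; 1 repeats row 1

# Intended behaviour: sample 20 distinct row NUMBERS.
good_idx <- sample(nrow(tab), 20)
good <- tab[good_idx, , drop = FALSE]

# The same idea on real genind objects (runs only if adegenet is available).
if (requireNamespace("adegenet", quietly = TRUE)) {
  library(adegenet)
  data(microbov)
  breeds <- seppop(microbov)          # one genind object per breed
  salers <- breeds$Salers
  zebu   <- breeds$Zebu

  # nInd() gives the number of individuals; use it where nrow() returned NULL.
  salers20 <- salers[sample(nInd(salers), 20), ]
  zebu20   <- zebu[sample(nInd(zebu), 20), ]

  pooled  <- repool(salers20, zebu20)              # merge the two subsamples
  hybrids <- hybridize(salers20, zebu20, n = 100)  # 100 simulated offspring
}
```

For the three-population case in this thread, the same pattern would apply to F1, pop1 and pop2. As far as I can tell, `repool()` takes only the genind objects themselves, so the `n=2000` argument would need to be dropped there, and `sample()` without `replace = TRUE` needs each population to hold at least as many individuals as are requested.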
>> Troy, NY 12180 >> 518 225 0787 >> -------------- next part -------------- >> An HTML attachment was scrubbed... >> URL: < >> http://lists.r-forge.r-project.org/pipermail/adegenet-forum/attachments/20141023/2dfd9408/attachment-0001.html >> > >> >> ------------------------------ >> >> Message: 5 >> Date: Thu, 23 Oct 2014 11:03:54 -0400 >> From: Spencer Bruce >> To: "Jombart, Thibaut" >> Cc: "adegenet-forum at lists.r-forge.r-project.org" >> >> Subject: Re: [adegenet-forum] repooling random rows from genind >> objects >> Message-ID: >> < >> CAGjKGeZiyS-27oF2CKb6JBPgS+VSyTA1-NS2Q8g2UC0JqOs4VA at mail.gmail.com> >> Content-Type: text/plain; charset="utf-8" >> >> they both say Null, if I just type them into R. >> >> Just to be clear these genind objects contains microsat data for 11 loci >> for thousands of individuals. >> >> I'm rather new to R, so I apologize if I'm missing something obvious >> here... >> >> On Thu, Oct 23, 2014 at 10:54 AM, Jombart, Thibaut < >> t.jombart at imperial.ac.uk >> > wrote: >> >> > >> > >> > What does nrow(F1) and other nrow(...)'s say? >> > >> > >> > >> > >> > ------------------------------ >> > *From:* Spencer Bruce [goatsrunfaster at gmail.com] >> > *Sent:* 23 October 2014 15:52 >> > *To:* Jombart, Thibaut >> > *Cc:* adegenet-forum at lists.r-forge.r-project.org >> > *Subject:* Re: [adegenet-forum] repooling random rows from genind >> objects >> > >> > Error message: >> > >> > Error in sample.int(length(x), size, replace, prob) : >> > invalid first argument >> > >> > On Thu, Oct 23, 2014 at 10:50 AM, Jombart, Thibaut < >> > t.jombart at imperial.ac.uk> wrote: >> > >> >> >> >> Hello, >> >> hard to figure out what is wrong without the error message.. 
>> >> Cheers >> >> Thibaut >> >> ------------------------------ >> >> *From:* adegenet-forum-bounces at lists.r-forge.r-project.org [ >> >> adegenet-forum-bounces at lists.r-forge.r-project.org] on behalf of >> Spencer >> >> Bruce [goatsrunfaster at gmail.com] >> >> *Sent:* 23 October 2014 15:29 >> >> *To:* adegenet-forum at lists.r-forge.r-project.org >> >> *Subject:* [adegenet-forum] repooling random rows from genind objects >> >> >> >> Hello All! >> >> >> >> I have three seperate populations as genind objects. What I would like >> >> to do is pull a certain number of random individuals from each, to >> form a >> >> new single genind population. >> >> >> >> I would then like individuals from this new genind population to mate >> >> randomly, producing another genind object which would contain their >> >> offspring. >> >> >> >> Below is the code I came up with (which does not work): >> >> >> >> Year1 <- repool(F1[sample(nrow(F1), 500), ], pop1[sample(nrow(pop1), >> >> 750), ], pop2[sample(nrow(pop2), 750), ], n=2000) >> >> >> >> Year2 <- hybridize(Year1[sample(nrow(Year1), 1000), ], >> >> Year1[sample(nrow(Year1), 1000), ], n=2000) >> >> >> >> >> >> any help would be greatly appreciated! >> >> >> >> Best, >> >> Spencer >> >> >> >> -- >> >> Spencer A Bruce >> >> 200 Washington St. >> >> Troy, NY 12180 >> >> 518 225 0787 >> >> >> > >> > >> > >> > -- >> > Spencer A Bruce >> > 200 Washington St. >> > Troy, NY 12180 >> > 518 225 0787 >> > >> >> >> >> -- >> Spencer A Bruce >> 200 Washington St. >> Troy, NY 12180 >> 518 225 0787 >> -------------- next part -------------- >> An HTML attachment was scrubbed... 
-- Spencer A Bruce 200 Washington St. Troy, NY 12180 518 225 0787

From caitiecollins at gmail.com Fri Oct 24 19:03:11 2014 From: caitiecollins at gmail.com (Caitlin Collins) Date: Fri, 24 Oct 2014 18:03:11 +0100 Subject: [adegenet-forum] Question about how to interpret Cross validation in my analysis. Thanks! In-Reply-To: References: Message-ID:

Hello again,

In response to your two questions:

*1)* The output element "mean and CI for random chance" provides the values that are used to draw the horizontal solid (mean) and dashed (CI) lines on the plot generated for cross-validation. In your case, the mean and CI for random chance was 49% (43%, 60%). The interpretation of this would be that if the highest success in outcome prediction that you were able to achieve with any model was between 43% and 60%, then you could be 95% confident that the ability of even the best model to assign individuals to the correct group does not differ significantly from the success rate you could achieve by assigning individuals to a group at random by, say, flipping a coin as a method of determining what group they belonged to. Ergo, you would not have succeeded in creating a useful model.
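For readers following along, the comparison against random chance described here can be sketched in a few lines of base R. The numbers are copied from the xvalDapc output quoted later in this thread; the object names (chance, mean_success and so on) are illustrative assumptions and are not adegenet objects.

```r
# Values copied from the xvalDapc output quoted later in this thread.
# All object names are illustrative; none of them come from adegenet.
chance <- c(lower = 0.4294840, median = 0.4928747, upper = 0.5962807)

mean_success <- c("5" = 0.5871429, "10" = 0.6000000, "15" = 0.5819048,
                  "20" = 0.6014286, "25" = 0.6952381, "30" = 0.6747619,
                  "35" = 0.6333333, "40" = 0.6109524)

# Number of retained PCs with the highest mean successful assignment
best_npca <- names(mean_success)[which.max(mean_success)]

# Which models exceed the upper bound of the random-chance interval?
beats_chance <- mean_success > chance["upper"]

# How far the best model sits above the chance median and the chance upper bound
gain_over_median <- max(mean_success) - chance["median"]
gain_over_upper  <- max(mean_success) - chance["upper"]

print(best_npca)
print(beats_chance)
print(round(c(gain_over_median, gain_over_upper), 3))
```

The two differences printed at the end are in percentage points of assignment success, which is the sense in which a model can be said to improve on a coin toss.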
However, your results indicate that with 25 PCs retained, your model had a success rate of 69.5%, so you *have* created a "useful" model. Even though it is not a particularly successful model, it still has a mean success rate that is 20% higher than the mean success for the coin toss approach, and 10% higher than the upper limit of the CI for random chance. So you can be 95% confident that the somewhat modest ability of your best model to discriminate between groups is not just happening by chance; the model is truly doing something useful.

------

*2)* While your interpretation is generally true, in that group membership is not well-predicted by any model, I think you have mis-read the results. The way they are laid out, at least in the text you copied into the e-mail, has skewed the values given for the means to the right of the number of PCs that they should be corresponding to... With 25 PCs, your optimal model is actually achieving a mean success of nearly 70%. Still not too good, but better than 63%. The MSE for 25 PCs is 32.4%, which is indeed quite high. However, the interpretation of this is not that you can only be "sure" of correctly predicting around 20% to the right pre-defined group. Rather, you can be "sure" of correctly predicting almost 70%! I think your confusion here may come from your interpretation of what the random chance values mean. Finding that the mean success for your best model is 20% above the mean success for random chance does not mean you can only be sure of 20% correct predictions. Rather, you could say that while you can in fact expect a 70% success rate (your highest mean success), your model is only providing an improvement of ~ 20% over the success rate you could have achieved by tossing a coin.

This changes the severity of your final conclusion. First, I should mention that it's not fair to say that "[your] set of microsatellites can't explain well [your] pre-defined groups".
Instead, it might be more accurate to say, "*With* the set of microsatellites available, you are unable to build a *model* with DAPC that explains well the variation between your pre-defined groups."

Finally, in light of the points above, while it is still true that the model does not explain the variation between groups particularly well, it does explain about 70% of that variation, so I wouldn't consider it to be "unsuccessful".

-----

Sorry for the long answer, but I hope it helps a bit at least! Please let me know if it doesn't though, or if you have any more questions.

All the best,
Caitlin.

On Thu, Oct 16, 2014 at 11:30 PM, Angela Merino <Angela.Merino at cawthron.org.nz> wrote:

> Thank you very much! It was really helpful! :)
>
> Then I understand that my model is not significantly the best model that could be found using my variables (in my case, microsatellites). If I use a model with n.pca=20 or =40 I get pretty much the same success of membership prediction (and with the same big root mean squared error).
>
> 1) My last question (I hope!) to understand the output of the *cross.validation* function is: what does the Median and Confidence Interval for Random Chance (below in yellow) mean? I think it means that with a confidence of 95% the value of successful assignment would be a value between 43% and 60%, which therefore means again that the optimization of my model was "not successful". (??)
>
> 2) About the global interpretation of these results, I would say that membership of my predefined groups is not well predicted by any model, as the mean successful assignment is not higher than 63% (maximum when n.pcs=25) and in addition the mean squared error is quite high (30-40%). I would be "sure" of predicting only around 20% to the right predefined group. In short, my set of microsatellites can't explain well my predefined groups.
>
> [image: cid:image002.jpg at 01CFE7A4.CCC02130]
>
> $`Median and Confidence Interval for Random Chance`
>      2.5%       50%     97.5%
> 0.4294840 0.4928747 0.5962807
>
> $`Mean Successful Assignment by Number of PCs of PCA`
>         5        10        15        20        25        30        35        40
> 0.5871429 0.6000000 0.5819048 0.6014286 0.6952381 0.6747619 0.6333333 0.6109524
>
> $`Number of PCs Achieving Highest Mean Success`
> [1] "25"
>
> $`Root Mean Squared Error by Number of PCs of PCA`
>         5        10        15        20        25        30        35        40
> 0.4301795 0.4141872 0.4389381 0.4131429 0.3241735 0.3531491 0.3885084 0.4145894
>
> $`Number of PCs Achieving Lowest MSE`
> [1] "25"
>
> Thanks in advance! I am learning a lot about R and adegenet package and I find really interesting to assess weak genetic population structure.
>
> Kind regards,
>
> Angela
>
> *From:* Caitlin Collins [mailto:caitiecollins at gmail.com]
> *Sent:* Friday, 17 October 2014 1:28 a.m.
> *To:* Angela Merino
> *Cc:* Collins, Caitlin; Jombart, Thibaut
> *Subject:* Re: Question about how to interpret Cross validation in my analysis. Thanks!
>
> Hi Angela,
>
> Well, I have two pieces of good news for you, and one piece of mediocre news.
>
> First, there's nothing to worry about with respect to the "NULL" that you are seeing. It just gets printed when xval.plot=TRUE as an artefact of one of the lines of the printing function. It has no meaning, and certainly does not imply that your model is not valid. (Given the stress that I now realise this glaring "NULL" may cause, I've changed the way the plots print now, so in the next release of adegenet this won't happen.)
>
> Second, you are absolutely correct in your interpretation of the results of xvalDapc (which are stored in whatever object you assigned the results to, in your case, "xval").
>
> This brings me to the mediocre news: given that your interpretation is correct, it seems that the best model you can achieve with DAPC, where n.pca=25, is only able to predict the group membership of validation set individuals in 63% of the cases, with a 32% root mean squared error. Arguably, this is not great. Your final comment on the matter, though, is quite insightful. The fact that you can achieve the same modest level of success with 20-80 PCs indicates that the optimisation procedure has not been particularly successful. Ideally, one would like to see an arch, with a maximum success point somewhere in the middle. In your case, there is a bit of an arch, but it isn't particularly striking.
>
> The only thing I might add to your interpretation of this result is that it's not so much that the model is poor because a similar level of success can be achieved with variable numbers of PCs. If mean success was virtually constant, but varying around 90%, the interpretation would not be that the model is poor, but rather that most levels of PC retention can compose a model that effectively discriminates between groups.
>
> I hope this has helped answer some of your questions. If you have any more, please feel free to ask.
>
> Best,
> Caitlin.
>
> On Mon, Oct 13, 2014 at 11:48 PM, Angela Merino <Angela.Merino at cawthron.org.nz> wrote:
>
> Hi Caitlin Collins and Thibaut Jombart,
>
> My name is Angela Parody-Merino and I am a PhD student at Massey University (New Zealand). I am studying the population genetic structure in a migratory bird (the New Zealand Godwit) with 23 microsatellites.
> Anyway, maybe this is a very simple question but I really want to understand and be sure about the meaning and interpretation of the output when doing cross-validation. I have spent some days looking on the internet and reading explanations etc. without being able to really understand what's going on with my analysis. Could you help me please? :)
>
> This is the script of the analysis:
>
> > x <- ELpop
> > mat <- as.matrix(na.replace(x, method="mean"))
>
> Replaced 371 missing values
>
> > grp <- pop(x)
> > xval <- xvalDapc(mat, grp, n.pca.max = 40, training.set = 0.9,
> + result = "groupMean", center = TRUE, scale = FALSE,
> + n.pca = NULL, n.rep = 500, xval.plot = TRUE)
>
> NULL *>>> What does this NULL mean? Does it mean that the model is not valid?*
>
> $`Median and Confidence Interval for Random Chance`
>      2.5%       50%     97.5%
> 0.4294840 0.4928747 0.5962807
>
> $`Mean Successful Assignment by Number of PCs of PCA`
>         5        10        15        20        25        30        35        40
> 0.5871429 0.6000000 0.5819048 0.6014286 0.6952381 0.6747619 0.6333333 0.6109524
>
> $`Number of PCs Achieving Highest Mean Success`
> [1] "25"
>
> $`Root Mean Squared Error by Number of PCs of PCA`
>         5        10        15        20        25        30        35        40
> 0.4301795 0.4141872 0.4389381 0.4131429 0.3241735 0.3531491 0.3885084 0.4145894
>
> $`Number of PCs Achieving Lowest MSE`
> [1] "25"
>
> *From the screenshot and the output results of the cross validation (in blue), I would say that my model (retaining 25 PCs) can predict with a mean of 63%, but it is not such a good model because most of the models that can be obtained by retaining 20, 40, 60, 80 PCs are about equally successful. Is my interpretation correct?*
>
> Thanks in advance,
>
> Kind regards,
>
> Angela Parody-Merino

-------------- next part --------------
A non-text attachment was scrubbed... Name: image001.jpg Type: image/jpeg Size: 48953 bytes Desc: not available
-------------- next part --------------
A non-text attachment was scrubbed... Name: image002.jpg Type: image/jpeg Size: 31124 bytes Desc: not available

From goatsrunfaster at gmail.com Mon Oct 27 14:57:21 2014 From: goatsrunfaster at gmail.com (Spencer Bruce) Date: Mon, 27 Oct 2014 09:57:21 -0400 Subject: [adegenet-forum] Hybridize Function / df2genind error message Message-ID:

Hello All,

After hybridizing two populations, I converted the genind file to a data frame to randomly extract individuals.
I then attempt to convert this data frame back into a genind file, but get the error message below:

> F1_G1 <- df2genind(randomF1)
Error in df2genind(randomF1) : 2 alleles cannot be coded by a total of 19 characters

I'm assuming this is because the "pop" column, instead of being coded by a number, contains the text generated by the hybridize function, "honnedaga-tdhybrids".

I tried to resolve this by using the following code, but ran into a second error message:

> randomF1$pop[randomF1$pop == "honnedaga-tdhybrids"] <- 1
Warning message:
In `[<-.factor`(`*tmp*`, randomF1$pop == "honnedaga-tdhybrids", : invalid factor level, NA generated

Any idea how I might be able to fix this? Thanks in advance!!!

-Spencer

-- Spencer A Bruce 200 Washington St. Troy, NY 12180 518 225 0787

From roberto at geodev.com.br Thu Oct 30 16:40:18 2014 From: roberto at geodev.com.br (Roberto Oliveira Santos) Date: Thu, 30 Oct 2014 15:40:18 +0000 Subject: [adegenet-forum] find.clusters without PCA Message-ID:

Dear all

Is it possible to run find.clusters without the PCA analysis? I am interested in the clustering procedure but would like to compare the results with and without PCA transformation.

Best wishes

Roberto

From f.calboli at imperial.ac.uk Thu Oct 30 16:56:34 2014 From: f.calboli at imperial.ac.uk (Federico Calboli) Date: Thu, 30 Oct 2014 15:56:34 +0000 Subject: [adegenet-forum] find.clusters without PCA In-Reply-To: References: Message-ID:

On 30 Oct 2014, at 15:40, Roberto Oliveira Santos wrote:

> Dear all
>
> Is it possible to run find.clusters without the PCA analysis?

I would not know whether find.clusters would like it, but in general you can surely find clusters without bothering with a PCA first: you have a formula, you input some data, you get your results.

It would also be completely and utterly idiotic.

You use a PCA before because of correlation between the data, and you transform the data with a PCA into a set of independent variables (and you also have an idea of what linear combinations explain little or nothing in the bargain). You use a PCA to get some signal out of the noise.

So, you can well not use a PCA and cluster. You will get some results, that might, or not, look like the results you get after a PCA decomposition. You will also have biased your clustering to an unknown amount, in a way that is not clear what might actually mean.

BW

F

> I have interested in the clustering procedure but would like to compare the results with and without PCA transformation.
>
> Best wishes
>
> Roberto

From roberto at geodev.com.br Thu Oct 30 17:02:57 2014 From: roberto at geodev.com.br (Roberto Oliveira Santos) Date: Thu, 30 Oct 2014 16:02:57 +0000 Subject: [adegenet-forum] find.clusters without PCA In-Reply-To: References: Message-ID:

Dear Federico

Many thanks. Very kind of you the "It would also be completely and utterly idiotic.".

Best wishes

Roberto

From f.calboli at imperial.ac.uk Thu Oct 30 17:16:33 2014 From: f.calboli at imperial.ac.uk (Federico Calboli) Date: Thu, 30 Oct 2014 16:16:33 +0000 Subject: [adegenet-forum] find.clusters without PCA In-Reply-To: References: Message-ID: <43B55DB4-31DF-4C47-A4E7-F10B05131A3A@imperial.ac.uk>

You're welcome. I would not be presenting the results to referees, PhD examiners or colleagues.

http://judgestarling.tumblr.com/post/79974811093/shaming-reputations-as-a-means-of-reducing-the

Happy reading!

F

From roberto at geodev.com.br Thu Oct 30 19:41:17 2014 From: roberto at geodev.com.br (Roberto Oliveira Santos) Date: Thu, 30 Oct 2014 18:41:17 +0000 Subject: [adegenet-forum] find.clusters without PCA In-Reply-To: <43B55DB4-31DF-4C47-A4E7-F10B05131A3A@imperial.ac.uk> References: <43B55DB4-31DF-4C47-A4E7-F10B05131A3A@imperial.ac.uk> Message-ID:

Hi Federico

"shaming reputations"? sorry..., pretty sure I don't have any reputation :-) If anyone asks a naive question, this should be the response? I disagree... anyway, thanks for the text. I'll keep it in mind.

Cheers,

Roberto
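For anyone wanting to make the with/without-PCA comparison raised in this thread, here is a minimal base-R sketch. It assumes, as the adegenet documentation describes, that find.clusters runs k-means on the retained principal components. The data are simulated and all object names are illustrative, not taken from the thread. The sketch only shows that a PCA keeping every component is a rotation of the centred data, so pairwise Euclidean distances (and hence a k-means partition started from the same individuals) are unchanged; differences arise only once components are dropped.

```r
# Simulated stand-in for a numeric data table (60 individuals, 8 variables).
# Nothing here comes from the thread; names are illustrative only.
set.seed(42)
x <- matrix(rnorm(60 * 8), nrow = 60)
x_centred <- scale(x, center = TRUE, scale = FALSE)

# PCA keeping every component: an orthogonal rotation of the centred data
pca <- prcomp(x, center = TRUE, scale. = FALSE)
scores_all <- pca$x

# Pairwise Euclidean distances before and after the rotation
d_raw <- dist(x_centred)
d_pca <- dist(scores_all)
max_diff <- max(abs(d_raw - d_pca))
print(max_diff)  # difference is at the level of floating-point error

# k-means started from the same three individuals in both spaces
start <- c(1, 20, 40)
km_raw <- kmeans(x_centred, centers = x_centred[start, ], iter.max = 50)
km_pca <- kmeans(scores_all, centers = scores_all[start, ], iter.max = 50)
same_partition <- all(km_raw$cluster == km_pca$cluster)
print(same_partition)
```

In practice this suggests that retaining all principal components is one way to approximate "no PCA" when comparing clustering results, though whether that comparison is informative for a given data set is a separate question.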
From andres.susrud at gmail.com Thu Oct 30 21:02:12 2014 From: andres.susrud at gmail.com (Andres Schjønhaug Susrud) Date: Thu, 30 Oct 2014 21:02:12 +0100 Subject: [adegenet-forum] problems adding predicted points to scatter plot Message-ID:

Dear list,

I'm having problems adding points to a dapc scatter plot.

grp = find.clusters(human_DR_bind_2[1:200,])
dapc1 <- dapc(human_DR_bind_2[1:200,],grp$grp)
pred.sup <- predict.dapc(dapc1, newdata=x.sup2)
names(pred.sup)
scatter(dapc1, cell=2.5, pch=1, cstar=0, axesel=FALSE, col=c(2,3,4))
par(xpd=T)
points(pred.sup$ind.scores[,1],pred.sup$ind.scores[,2],pch = 2,col = 6)

The problem is that the predicted points are "all" visible, but completely out of place. When plotting dapc1$ind.scores[,1], dapc1$ind.scores:

plot(dapc1$ind.scores[,1],dapc1$ind.scores)
points(pred.sup$ind.scores[,1],pred.sup$ind.scores[,2],pch = 2,col = 6)

the alignment seems fine.

Thanks for any help on this matter.

BR
Andres

From hilpert at ipk-gatersleben.de Wed Oct 29 11:24:26 2014 From: hilpert at ipk-gatersleben.de (Stefanie Hilpert) Date: Wed, 29 Oct 2014 10:24:26 +0000 Subject: [adegenet-forum] DAPC & Ploidylevel Message-ID:

I've already asked the question a week ago, but I'll just try again, so here we go:

Dear everybody,

I am currently using the adegenet package to perform a structure analysis of my microsatellite dataset and compare it to the results of an analysis using STRUCTURE software. The organism I am working on is an apomictic plant and I am aware that STRUCTURE is probably not adequate because it assumes HWE and asexuality violates HWE. Nevertheless we use STRUCTURE analysis for apomicts, because in most cases the assigned number of groups correlates with biological traits. Knowing that there is a bias using STRUCTURE, we decided to perform a DAPC additionally. But now I ran into another problem using adegenet.
I am working with a mixed ploidy system with ploidies ranging from 4 to 11. To load the data into the adegenet package we coded all individuals as 11x, because otherwise it was not possible to load the data. Now I am wondering how big the bias is if the calculation assumes that, for example, a tetraploid is now a hendecaploid, and whether I can still trust the results. I am asking because the results of the DAPC are completely different from those of STRUCTURE, which puzzles me a bit because I expected at least some correlation (the optimal number of clusters k is the same, but the assignment of individuals to the clusters differs completely).

I would appreciate some help

Stefanie Hilpert

------------------------------------------------------------------------------
Stefanie Hilpert -PhD Candidate- Dept. of Cytogenetics and Genome Analysis Leibniz Institute of Plant Genetics and Crop Plant Research (IPK) Corrensstraße 3, D-06466 Gatersleben Germany +49 (0)39482 5673 IPK Graduate School International Max-Planck Research School