From neagef at gmail.com Mon Oct 6 17:30:14 2014
From: neagef at gmail.com (Andrea Garavito)
Date: Mon, 6 Oct 2014 12:30:14 -0300
Subject: [adegenet-forum] Fwd: Significance of allelic contribution to
discriminant functions
In-Reply-To:
References:
Message-ID:
Hello Caitlin,
I was taking a look at the adegenet forum and found this previous answer
about a statistical threshold for marker contributions.
Originally I was planning to retain, for each of my discriminant
functions, roughly the 0.3% of markers with the highest contributions by
setting a 3-sigma threshold. I'm not sure whether these data are
normally distributed, but as I have almost 5000 markers I was assuming so.
Then I saw your post about the snpzip analysis and decided to give it a try.
I tested the function with all the available methods, and I think I'll use
the "median" method, as with the others I'm getting too many markers retained
(and only one with the "single" method).
I see that snpzip runs the analysis for the first discriminant
function, but is there a way to run it for the other discriminant
functions found with DAPC as well?
Thanks for your answer
Andrea
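For anyone comparing a 3-sigma rule with a clustering-based cut, here is a minimal base-R sketch of the two ideas. It is not snpzip's actual code: the toy contributions, the helper select_by_hclust() and the choice of cutting the tree into two groups are assumptions made purely for illustration.

```r
set.seed(1)

# Toy contributions: 95 background markers plus 5 markers with larger loadings
# (hypothetical values, not taken from any real DAPC).
contr <- c(runif(95, 0, 0.005), runif(5, 0.03, 0.06))
contr <- contr / sum(contr)
names(contr) <- paste0("snp", seq_along(contr))

# Hypothetical helper: split the contributions into two clusters and keep
# the cluster with the higher mean contribution.
select_by_hclust <- function(x, method) {
  d <- dist(x)
  # centroid linkage is meant to be used with squared distances
  if (method == "centroid") d <- d^2
  grp <- cutree(hclust(d, method = method), k = 2)
  top <- which.max(tapply(x, grp, mean))
  names(x)[grp == top]
}

# Number of markers retained under three of the linkage methods
methods <- c("ward.D", "single", "centroid")
n_sel <- sapply(methods, function(m) length(select_by_hclust(contr, m)))
print(n_sel)

# The 3-sigma rule for comparison: markers more than three standard
# deviations above the mean contribution.
three_sigma <- names(contr)[contr > mean(contr) + 3 * sd(contr)]
print(three_sigma)
```

The point of the comparison is only that different linkage rules can place the cut at different positions, which is why the number of retained markers varies between methods.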
2014-08-26 12:58 GMT-03:00 Caitlin Collins :
> Yeah, it's new!
>
> I might as well note, in case you decide only to try a subset of the
> methods available:
> - Ward's method is most likely to select a very large number of variables
> to get the most complete picture
> - Single linkage hierarchical clustering will probably select the fewest
> - Centroid clustering will probably select a useful middle-ground.
>
> You can always check to see what proportion of the variance is contained
> in the subset of variables retained, or you could even try running a DAPC/
> PCA with just those variables to compare the discriminatory power of the
> entire set with that of the subset selected.
>
> Good luck.
>
> Cheers,
> Caitlin.
>
>
> On Tue, Aug 26, 2014 at 4:31 PM, Charlie Waters wrote:
>
>> Thanks Caitlin! I've never come across the snpzip function so I'll give
>> those clustering methods a try.
>>
>> Thanks,
>> Charlie
>>
>>
>> On Tue, Aug 26, 2014 at 3:49 AM, Caitlin Collins wrote:
>>
>>> Hi Charlie,
>>>
>>> Good question. Technically, there is no one "correct" statistical
>>> solution to your problem. But, there *are* a number of ways of
>>> approaching the problem with more statistical rigour than simply using an
>>> arbitrary threshold as you have done.
>>>
>>> Have you taken a look at the snpzip function in the adegenet package? If
>>> not, just type "?snpzip" into R with the adegenet package loaded. With this
>>> function, you can apply one of seven different hierarchical clustering
>>> formulas to the allelic contributions generated by dapc. Essentially, each
>>> hierarchical clustering method uses a unique approach to determine where
>>> the threshold should be drawn. I should note, however, that this
>>> descriptive approach will not have an associated p-value. You may want to
>>> try out a few different methods before deciding which variables you want
>>> to consider "most significant".
>>>
>>> I hope that helps!
>>>
>>> Best,
>>> Caitlin
>>>
>>
>>
>>
>> --
>> Charlie Waters
>> Box 355020
>> School of Aquatic and Fishery Sciences
>> University of Washington
>> Seattle, WA 98105
>>
>>
>
> _______________________________________________
> adegenet-forum mailing list
> adegenet-forum at lists.r-forge.r-project.org
> https://lists.r-forge.r-project.org/cgi-bin/mailman/listinfo/adegenet-forum
>
From neagef at gmail.com Mon Oct 6 19:09:32 2014
From: neagef at gmail.com (Andrea Garavito)
Date: Mon, 6 Oct 2014 14:09:32 -0300
Subject: [adegenet-forum] Fwd: Significance of allelic contribution to
discriminant functions
In-Reply-To:
References:
Message-ID:
Hello again!
I took a closer look at the object created by the snpzip tool, and I
found the contributions for all the different axes.
I didn't notice them before as I was looking only at the plot obtained.
Thanks anyway!
Andrea
From gemm2470 at uni-landau.de Sun Oct 12 10:47:23 2014
From: gemm2470 at uni-landau.de (Isabelle Gemmer)
Date: Sun, 12 Oct 2014 10:47:23 +0200
Subject: [adegenet-forum] Isolation by distance (Mantel test)
Message-ID: <543A401B.4060605@uni-landau.de>
Hello,
I supplied coordinates for the Mantel test.
> data1$other$xy<-dataxy
> Dgeo <- dist(data1$other$xy)
> ibd <- mantel.randtest(Dgen,Dgeo)
It worked well. But in reality, the organisms I examined cannot swim
across a lake; they can only migrate along the shoreline. I therefore
measured the distances and built my own matrix of geographic distances.
My question is: can I also supply my own measured geographic distances
instead of coordinates?
Regards,
Isabelle
From vojta at trapa.cz Sun Oct 12 12:28:06 2014
From: vojta at trapa.cz (Vojtěch Zeisek)
Date: Sun, 12 Oct 2014 12:28:06 +0200
Subject: [adegenet-forum] Isolation by distance (Mantel test)
In-Reply-To: <543A401B.4060605@uni-landau.de>
References: <543A401B.4060605@uni-landau.de>
Message-ID: <19382021.B1bExHcYvj@veles.site>
Hello
On Sun, 12 October 2014 10:47:23, Isabelle Gemmer wrote:
> Hello,
>
> I installed coordinates for the Mantel test.
>
> > data1$other$xy<-dataxy
> > Dgeo <- dist(data1$other$xy)
> > ibd <- mantel.randtest(Dgen,Dgeo)
>
> It worked well. But in reality, the examined organisms can not swim
> through a lake, they can only migrate along a shoreline. Thus, I
> measured the distances and provide an own matrix of geographic distances.
>
> My question is: Can I also install own measured geographic distances
> instead of coordinates?
Sure: just use your matrix of distances along the shoreline as Dgeo. You wouldn't use the
dist function; instead, compute the distances elsewhere, for example in a GIS, and import them into R.
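A minimal sketch of that workflow, assuming the shoreline distances are available as a square, symmetric matrix with sites in the same order as in Dgen. The site names, the distance values and the toy genetic distances below are all made up for illustration; mantel.randtest() comes from ade4 and is only called if that package is installed.

```r
# Hypothetical shoreline distances (km) between four sampling sites,
# e.g. measured in a GIS and read into R with read.table() or read.csv().
sites <- c("A", "B", "C", "D")
shore <- matrix(c(0, 2, 5, 9,
                  2, 0, 3, 7,
                  5, 3, 0, 4,
                  9, 7, 4, 0),
                nrow = 4, byrow = TRUE, dimnames = list(sites, sites))

# The matrix should be symmetric with a zero diagonal before conversion.
stopifnot(isSymmetric(shore), all(diag(shore) == 0))

# as.dist() converts the square matrix into the "dist" class that
# mantel.randtest() expects for both of its arguments.
Dgeo <- as.dist(shore)

# Toy genetic distances for the same four sites, in the same order.
set.seed(1)
Dgen <- dist(matrix(rnorm(4 * 6), nrow = 4, dimnames = list(sites, NULL)))

# Run the Mantel test only if ade4 is available.
if (requireNamespace("ade4", quietly = TRUE)) {
  ibd <- ade4::mantel.randtest(Dgen, Dgeo, nrepet = 999)
  print(ibd)
}
```

The important detail is that both distance objects must list the sites in the same order; as.dist() does not reorder anything.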
> Regards,
> Isabelle
Sincerely,
Vojtěch
--
Vojtěch Zeisek
http://trapa.cz/en/
Department of Botany, Faculty of Science
Charles University in Prague
Benátská 2, Prague, 12801, CZ
http://botany.natur.cuni.cz/en/
Institute of Botany, Academy of Science
Zámek 1, Průhonice, 25243, CZ
http://www.ibot.cas.cz/en/
Czech Republic
From zuzmus at gmail.com Thu Oct 9 11:55:09 2014
From: zuzmus at gmail.com (zuzmus)
Date: Thu, 9 Oct 2014 11:55:09 +0200
Subject: [adegenet-forum] PCA sensitive to order of samples?
Message-ID:
Dear colleagues,
I would like to perform a PCA with the adegenet package and managed to go
through the procedure to the end. The problem is that the results don't
make sense, and I see an obvious bias related to the order of the samples in
the input matrix.
The matrix has 140 samples from 11 putative species and ca. 2800 SNPs
obtained with the RAD-seq method (only biallelic SNPs included; coded 0 -
more frequent allele, 1 - heterozygote, 2 - rarer allele, NA - missing
data).
I used the following code:
> data <- read.table("/Users/zuzana/Matrix_for_adegenet_cutSNPsTo2484_NoHybrids.txt")
> x <- new("genlight", data)
> pca1 <- glPca(x)
> scatter(pca1, posi="bottomleft")
The results always show the first 5-7 individuals as strongly separated along
PC1 and PC2, while the rest form one cluster. When I repeated the same
analysis after removing the first few individuals from the matrix, the
pattern stayed as it was: the new first individuals became separated.
[image: Embedded image 1]
I also tried most of the options of the glPca command, following
the manual and the help in R, but always got similar results...
Another issue is that I have quite a lot of missing data (10 - 35% per SNP,
and ca. 10 - 50% per individual) in my matrix, but this was the trade-off
of the experimental design ("sequence as much as possible as cheaply as
possible..."). But the first individuals in the list are quite well
sequenced, so they are not the worst in terms of missing data...
I wonder whether I missed some basics, did something wrong, or whether it is
possible that there really is a bias due to the order of the samples in the
matrix. I would be very happy if somebody could help me find out how to
solve this issue.
Thank you very much for any help and suggestions! :-)
With regards,
Zuzana
---
Zuzana Musilova, PhD.
Zoological Institute
University of Basel
Vesalgasse 1 | 4051 Basel
Switzerland | Europe
)><(((@>....<@)))><(
From t.jombart at imperial.ac.uk Tue Oct 14 12:31:52 2014
From: t.jombart at imperial.ac.uk (Jombart, Thibaut)
Date: Tue, 14 Oct 2014 10:31:52 +0000
Subject: [adegenet-forum] PCA sensitive to order of samples?
In-Reply-To:
References:
Message-ID: <2CB2DA8E426F3541AB1907F98ABA6570A826EE08@icexch-m1.ic.ac.uk>
Hi there,
No, PCA is not sensitive to the ordering of samples.
Note: given the size of the dataset, it is probably easier to use the basic PCA procedure (dudi.pca). genlight objects are meant to be used when your computer could not otherwise store the data.
If your missing data are not randomly distributed, then having many NAs is a problem: individuals with similar missing data will be seen as artificially similar, and SNPs with similar NAs will be seen as artificially correlated.
It is safer to use less data of better quality. In this case, you may want to remove SNPs with many NAs.
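As a rough sketch of what such filtering could look like in base R: the toy genotype matrix, the 20% threshold and the mean imputation below are illustrative assumptions, not a recommendation for any particular dataset. dudi.pca() comes from ade4 and is only called if that package is installed.

```r
# Toy genotype matrix: 20 individuals x 50 SNPs coded 0/1/2, with some NAs.
set.seed(42)
geno <- matrix(sample(0:2, 20 * 50, replace = TRUE), nrow = 20)
geno[sample(length(geno), 150)] <- NA

# Proportion of missing calls per SNP (columns) and per individual (rows).
na_snp <- colMeans(is.na(geno))
na_ind <- rowMeans(is.na(geno))

# Keep only SNPs at or below an arbitrary threshold of 20% missing data.
max_na <- 0.20
geno_ok <- geno[, na_snp <= max_na, drop = FALSE]

# Replace the remaining NAs by the SNP mean so that a standard PCA can run.
for (j in seq_len(ncol(geno_ok))) {
  miss <- is.na(geno_ok[, j])
  geno_ok[miss, j] <- mean(geno_ok[, j], na.rm = TRUE)
}

# Drop SNPs that are constant; they carry no information for the PCA.
geno_ok <- geno_ok[, apply(geno_ok, 2, var) > 0, drop = FALSE]

# Basic PCA with ade4, if available.
if (requireNamespace("ade4", quietly = TRUE)) {
  pca1 <- ade4::dudi.pca(as.data.frame(geno_ok), center = TRUE,
                         scale = FALSE, scannf = FALSE, nf = 2)
  print(head(pca1$li))
}
```

Plotting na_ind against the first principal component afterwards is a quick way to see whether missingness is still driving the pattern.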
Cheers
Thibaut
From caitiecollins at gmail.com Thu Oct 16 13:38:42 2014
From: caitiecollins at gmail.com (Caitlin Collins)
Date: Thu, 16 Oct 2014 12:38:42 +0100
Subject: [adegenet-forum] Fwd: Significance of allelic contribution to
discriminant functions
In-Reply-To:
References:
Message-ID:
Hello again Andrea,
Glad you found what you were looking for!
Incidentally, and in case anyone else on the forum is looking to visualise
the variable contributions to discriminant axes > 1, here is some code to
do so for a toy example. (The last chunk will be the relevant bit for
creating loading plots):
# load adegenet, which provides glSim, dapc, snpzip and loadingplot
library(adegenet)

# make a simulated dataset with 5 "groups"
simpop <- glSim(200, 1000, 40, k=5, sort.pop=TRUE)
snps <- as.matrix(simpop)
phen <- simpop@other$ancestral.pops

# for fun / as a check, quickly visualise the clusters
dapc1 <- dapc(snps, phen, n.pca=50, n.da=4)
scatter(dapc1)

# create an object called foo that contains the results of running snpzip on your dataset
foo <- snpzip(snps, phen, xval.plot=TRUE, loading.plot=TRUE, method="centroid")

# isolate the DAPC component of the snpzip results, calling it "dapc1"
dapc1 <- foo$DAPC

# specify that you want to run the following lines for all DA
# (i.e. from DA=1 to DA=(k-1), where k is the number of groups in your dataset)
DA <- c(1:dapc1$n.da)
par(ask=TRUE)

# generate separate loading plots for each DA
for(i in DA){
  title <- paste("Loading Plot for DA", i, sep=" ")
  # variables selected by snpzip for this DA
  maximus <- foo$FS[[i]][[2]]
  # threshold set just below the smallest contribution among the selected variables
  cutoff <- abs(dapc1$var.contr[maximus,i][(which.min(dapc1$var.contr[maximus,i]))])-0.000001
  loadingplot(dapc1$var.contr[, i], threshold=cutoff, main=title)
}
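To make the cutoff line in the loop easier to follow, here is the same idea in plain base R on a made-up vector of contributions. The vector contr and the index vector selected are hypothetical stand-ins for dapc1$var.contr[, i] and foo$FS[[i]][[2]]; nothing here calls adegenet.

```r
# Hypothetical contributions of 10 variables to one discriminant axis
contr <- (1:10)^2 / sum((1:10)^2)
names(contr) <- paste0("snp", 1:10)

# Hypothetical selection: the three variables with the largest contributions
selected <- order(contr, decreasing = TRUE)[1:3]

# Place the threshold just below the smallest selected contribution,
# so that exactly the selected variables lie above the line.
cutoff <- min(contr[selected]) - 0.000001

# Variables that a loading plot with this threshold would label
above <- which(contr > cutoff)
print(names(above))
```

Subtracting a tiny constant simply guarantees that the weakest selected variable still sits above the threshold rather than exactly on it.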
Hope that helps!
And thanks for your input: I'll try to implement the above code within
snpzip to generate loading plots for all DA automatically in the next release
of adegenet.
Cheers,
Caitlin.
From caitiecollins at gmail.com Thu Oct 16 14:28:13 2014
From: caitiecollins at gmail.com (Caitlin Collins)
Date: Thu, 16 Oct 2014 13:28:13 +0100
Subject: [adegenet-forum] Fwd: Question about how to interpret Cross
validation in my analysis. Thanks!
In-Reply-To:
References:
Message-ID:
---------- Forwarded message ----------
From: Caitlin Collins
Date: Thu, Oct 16, 2014 at 1:27 PM
Subject: Re: Question about how to interpret Cross validation in my
analysis. Thanks!
To: Angela Merino
Cc: "Collins, Caitlin" , "Jombart,
Thibaut"
Hi Angela,
Well, I have two pieces of good news for you, and one piece of mediocre
news.
First, there's nothing to worry about with respect to the "NULL" that you
are seeing. It just gets printed when xval.plot=TRUE as an artefact of one
of the lines of the printing function. It has no meaning, and certainly
does not imply that your model is not valid. (Given the stress that I now
realise this glaring "NULL" may cause, I've changed the way the plots print
now, so in the next release of adegenet this won't happen.)
Second, you are absolutely correct in your interpretation of the results of
xvalDapc (which are stored in whatever object you assigned the results to,
in your case, "xval").
This brings me to the mediocre news: given that your interpretation is
correct, it seems that the best model you can achieve with DAPC, where
n.pca=25, is only able to predict the group membership of validation set
individuals in 63% of the cases, with a 32% root mean squared error.
Arguably, this is not great. Your final comment on the matter, though, is
quite insightful. The fact that you can achieve the same modest level of
success with 20-80 PCs indicates that the optimisation procedure has not
been particularly successful. Ideally, one would like to see an arch, with
a maximum success point somewhere in the middle. In your case, there is a
bit of an arch, but it isn't particularly striking.
The only thing I might add to your interpretation of this result is that
it's not so much that the model is poor because a similar level of success
can be achieved with variable numbers of PCs. If mean success was virtually
constant, but varying around 90%, the interpretation would not be that the
model is poor, but rather that most levels of PC retention can compose a
model that effectively discriminates between groups.
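For readers who want to check this kind of result numerically rather than by eye, here is a small base-R sketch. The numbers are copied from the cross-validation output quoted further down in this message; the variable names are my own, and comparing against the upper bound of the random-chance interval is just one informal way of reading the table.

```r
# Values copied from the quoted xvalDapc output
n_pcs <- c(5, 10, 15, 20, 25, 30, 35, 40)
mean_success <- c(0.5871429, 0.6000000, 0.5819048, 0.6014286,
                  0.6952381, 0.6747619, 0.6333333, 0.6109524)
rmse <- c(0.4301795, 0.4141872, 0.4389381, 0.4131429,
          0.3241735, 0.3531491, 0.3885084, 0.4145894)
chance <- c(lower = 0.4294840, median = 0.4928747, upper = 0.5962807)

# Number of PCs with the highest mean success and with the lowest RMSE
best_success <- n_pcs[which.max(mean_success)]
best_rmse <- n_pcs[which.min(rmse)]

# How flat is the curve? Difference between the best and worst mean success.
spread <- max(mean_success) - min(mean_success)

# Which levels of PC retention beat the upper bound of the chance interval?
beats_chance <- n_pcs[mean_success > chance["upper"]]

print(c(best_success = best_success, best_rmse = best_rmse))
print(round(spread, 3))
print(beats_chance)
```

A small spread combined with values close to the chance interval is what makes a flat curve a concern; a small spread at a high success level would not be.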
I hope this has helped answer some of your questions. If you have any more,
please feel free to ask.
Best,
Caitlin.
On Mon, Oct 13, 2014 at 11:48 PM, Angela Merino <
Angela.Merino at cawthron.org.nz> wrote:
> Hi Caitlin Collins and Thibaut Jombart,
>
> My name is Angela Parody-Merino and I am a PhD student at Massey
> University (New Zealand). I am studying the population genetic structure of
> a migratory bird (the New Zealand Godwit) with 23 microsatellites. Anyway,
> maybe this is a very simple question, but I really want to understand and be
> sure about the meaning and interpretation of the output when doing
> cross-validation. I have spent some days searching the internet and reading
> explanations etc. without being able to really understand what's going on
> with my analysis. Could you help me please? :-)
>
> This is the script of the analysis:
>
> > x <- ELpop
> > mat <- as.matrix(na.replace(x, method="mean"))
>
> Replaced 371 missing values
>
> > grp <- pop(x)
> > xval <- xvalDapc(mat, grp, n.pca.max = 40, training.set = 0.9,
> +                  result = "groupMean", center = TRUE, scale = FALSE,
> +                  n.pca = NULL, n.rep = 500, xval.plot = TRUE)
>
> NULL    >>> What does this NULL mean? Does it mean that the model is not
> valid?
>
> $`Median and Confidence Interval for Random Chance`
>      2.5%       50%     97.5%
> 0.4294840 0.4928747 0.5962807
>
> $`Mean Successful Assignment by Number of PCs of PCA`
>         5        10        15        20        25        30        35        40
> 0.5871429 0.6000000 0.5819048 0.6014286 0.6952381 0.6747619 0.6333333 0.6109524
>
> $`Number of PCs Achieving Highest Mean Success`
> [1] "25"
>
> $`Root Mean Squared Error by Number of PCs of PCA`
>         5        10        15        20        25        30        35        40
> 0.4301795 0.4141872 0.4389381 0.4131429 0.3241735 0.3531491 0.3885084 0.4145894
>
> $`Number of PCs Achieving Lowest MSE`
> [1] "25"
>
> From the screenshot and the output of the cross-validation (in
> blue), I would say that my model (retaining 25 PCs) can predict with a mean
> of 63%, but it is not such a good model because most of the models that can
> be obtained by retaining 20, 40, 60, 80 PCs are about equally successful.
> Is my interpretation correct?
>
> Thanks in advance,
>
> Kind regards,
>
> Angela Parody-Merino
From caitiecollins at gmail.com Thu Oct 16 15:01:58 2014
From: caitiecollins at gmail.com (Caitlin Collins)
Date: Thu, 16 Oct 2014 14:01:58 +0100
Subject: [adegenet-forum] Trouble converting to genid object
In-Reply-To: <2CB2DA8E426F3541AB1907F98ABA6570A826B775@icexch-m1.ic.ac.uk>
References:
<2CB2DA8E426F3541AB1907F98ABA6570A826B775@icexch-m1.ic.ac.uk>
Message-ID:
Hi,
Sorry for the delay. I think the problem may be something simple to do with
the format or row and column names of the object test.
When I tried the example with the data you sent, the first approach worked
right away.
Could you try something, perhaps silly, for me, just to rule this out as the
solution:
# replace filename below with the path to your file, wherever it is on your computer
filename <- "C:/Cait/Work/adegenet forum Qs/test.txt"
# use read.table to read in the file anew, to try to get it in the same format that I have it in
test <- read.table(filename)
# confirm that it looks to have the correct dimensions, names, contents
head(test)
# try creating a genind object out of test using the first approach you put forward
obj1 <- genind(test, ploidy=1, type="PA")
# confirm that a genind was created
obj1
# confirm that it looks the same as the original object when in matrix form
head(as.matrix(obj1))
Then please let me know if that works for you. If not, could you paste back
the results or errors you get from the above commands?
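If it helps to narrow things down, the format of the table can also be checked in base R before any conversion is attempted. This is only a sketch: the small data frame below is a made-up stand-in for your file, and the three checks are my guess at what "the right format" means for presence/absence data, not a list of requirements documented for genind.

```r
# Hypothetical presence/absence table: 3 individuals x 4 markers
test <- data.frame(L1 = c(1, 0, 1), L2 = c(0, 0, 1),
                   L3 = c(1, 1, NA), L4 = c(0, 1, 0),
                   row.names = c("ind1", "ind2", "ind3"))

# Every column should be numeric; a stray header row or text column
# makes read.table() return character or factor columns instead.
all_numeric <- all(vapply(test, is.numeric, logical(1)))

# Presence/absence data should contain only 0, 1 or NA.
vals <- unlist(test, use.names = FALSE)
only_01 <- all(vals[!is.na(vals)] %in% c(0, 1))

# Row and column names should be unique.
names_ok <- !anyDuplicated(rownames(test)) && !anyDuplicated(colnames(test))

str(test)
print(c(all_numeric = all_numeric, only_01 = only_01, names_ok = names_ok))
```

If any of the three flags comes back FALSE on the real file, the arguments passed to read.table (header, row.names, sep) are the first thing to look at.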
Best of luck.
Cheers,
Caitlin.
On Thu, Sep 25, 2014 at 11:04 AM, Jombart, Thibaut wrote:
>
> Hi there,
>
> it looks like a bug. I'll investigate and get back to you.
>
> Cheers
> Thibaut
>
> ------------------------------
> *From:* adegenet-forum-bounces at lists.r-forge.r-project.org [
> adegenet-forum-bounces at lists.r-forge.r-project.org] on behalf of Jackie
> Lighten [Jackie.Lighten at Dal.Ca]
> *Sent:* 22 September 2014 12:59
> *To:* adegenet-forum at lists.r-forge.r-project.org
> *Subject:* [adegenet-forum] Trouble converting to genid object
>
> Hi,
>
> I am having trouble converting a presence/absence genotype data frame to
> a genind object
>
> Please see attached for test data file.
>
> Using
>
> obj2 <- genind(test, ploidy=1, type="PA")
>
> I get the error:
>
> Error in `colnames<-`(`*tmp*`, value = c("L1", "L2")) :
> length of 'dimnames' [2] not equal to array extent
>
>
> Using
>
> obj2 <- df2genind(test, ploidy=1, type="PA")
>
> I get the error:
>
> Error in `colnames<-`(`*tmp*`, value = "L1") :
> length of 'dimnames' [2] not equal to array extent
> In addition: Warning messages:
> 1: In eval(expr, envir, enclos) : NAs introduced by coercion
> 2: In df2genind(test, ploidy = 1, type = "PA") :
> entirely non-type marker(s) deleted
>
>
> Any help would be much appreciated
>
> Thanks,
>
> Jack
>
> _______________________________________________
> adegenet-forum mailing list
> adegenet-forum at lists.r-forge.r-project.org
> https://lists.r-forge.r-project.org/cgi-bin/mailman/listinfo/adegenet-forum
>
From goatsrunfaster at gmail.com Tue Oct 21 17:28:20 2014
From: goatsrunfaster at gmail.com (Spencer Bruce)
Date: Tue, 21 Oct 2014 11:28:20 -0400
Subject: [adegenet-forum] creating genetic data from scratch
Message-ID:
Hello All,
I'm looking to create some microsatellite data for a simulation study. Is
there a way to create a data set for a number of individuals, with a given
number of loci, and neutral alleles at each locus?
I'm basically looking to simulate admixture between two pops, but I only
have actual data for 20 or so individuals from each, whereas I
need to create a scenario with hundreds of individuals.
Any info would be greatly appreciated!
Best,
Spencer
--
Spencer A Bruce
200 Washington St.
Troy, NY 12180
518 225 0787
From t.jombart at imperial.ac.uk Tue Oct 21 20:31:26 2014
From: t.jombart at imperial.ac.uk (Jombart, Thibaut)
Date: Tue, 21 Oct 2014 18:31:26 +0000
Subject: [adegenet-forum] creating genetic data from scratch
In-Reply-To:
References:
Message-ID: <2CB2DA8E426F3541AB1907F98ABA6570ABE699AF@icexch-m1.ic.ac.uk>
Hello,
Not in adegenet, but there is software around to do this. Check out easypop, for instance, which outputs files compatible with adegenet.
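For very simple cases, genotypes can also be drawn directly in base R. The sketch below is neither easypop nor an adegenet feature, and it is much cruder than a proper simulator: the allele frequencies are invented, and it assumes Hardy-Weinberg proportions, unlinked neutral loci and no population history at all.

```r
set.seed(123)

# Hypothetical allele frequencies at three microsatellite loci
# (allele names are fragment sizes in bp; all values are made up).
freqs <- list(
  locA = c("180" = 0.5, "184" = 0.3, "188" = 0.2),
  locB = c("210" = 0.6, "214" = 0.4),
  locC = c("150" = 0.25, "152" = 0.25, "156" = 0.5)
)
n_ind <- 200

# For each locus, draw two alleles per individual independently and
# paste them into a diploid genotype such as "180/184".
sim <- as.data.frame(lapply(freqs, function(p) {
  a1 <- sample(names(p), n_ind, replace = TRUE, prob = p)
  a2 <- sample(names(p), n_ind, replace = TRUE, prob = p)
  paste(a1, a2, sep = "/")
}))
rownames(sim) <- sprintf("ind%03d", seq_len(n_ind))

head(sim)
```

In a real study the frequencies could be estimated from the 20 or so genotyped individuals per population, and a table in this layout can usually be converted with df2genind() using sep="/".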
Cheers
Thibaut
From goatsrunfaster at gmail.com Thu Oct 23 16:29:10 2014
From: goatsrunfaster at gmail.com (Spencer Bruce)
Date: Thu, 23 Oct 2014 10:29:10 -0400
Subject: [adegenet-forum] repooling random rows from genind objects
Message-ID:
Hello All!
I have three separate populations as genind objects. What I would like to
do is pull a certain number of random individuals from each, to form a new
single genind population.
I would then like individuals from this new genind population to mate
randomly, producing another genind object which would contain their
offspring.
Below is the code I came up with (which does not work):
Year1 <- repool(F1[sample(nrow(F1), 500), ], pop1[sample(nrow(pop1), 750),
], pop2[sample(nrow(pop2), 750), ], n=2000)
Year2 <- hybridize(Year1[sample(nrow(Year1), 1000), ],
Year1[sample(nrow(Year1), 1000), ], n=2000)
Any help would be greatly appreciated!
Best,
Spencer
From t.jombart at imperial.ac.uk Thu Oct 23 16:50:24 2014
From: t.jombart at imperial.ac.uk (Jombart, Thibaut)
Date: Thu, 23 Oct 2014 14:50:24 +0000
Subject: [adegenet-forum] repooling random rows from genind objects
In-Reply-To:
References:
Message-ID: <2CB2DA8E426F3541AB1907F98ABA6570ABE6AE7F@icexch-m1.ic.ac.uk>
Hello,
It is hard to figure out what is wrong without the error message.
Cheers
Thibaut
From goatsrunfaster at gmail.com Thu Oct 23 16:52:59 2014
From: goatsrunfaster at gmail.com (Spencer Bruce)
Date: Thu, 23 Oct 2014 10:52:59 -0400
Subject: [adegenet-forum] repooling random rows from genind objects
In-Reply-To: <2CB2DA8E426F3541AB1907F98ABA6570ABE6AE7F@icexch-m1.ic.ac.uk>
References:
<2CB2DA8E426F3541AB1907F98ABA6570ABE6AE7F@icexch-m1.ic.ac.uk>
Message-ID:
Error message:
Error in sample.int(length(x), size, replace, prob) :
invalid first argument
From t.jombart at imperial.ac.uk Thu Oct 23 16:54:53 2014
From: t.jombart at imperial.ac.uk (Jombart, Thibaut)
Date: Thu, 23 Oct 2014 14:54:53 +0000
Subject: [adegenet-forum] repooling random rows from genind objects
In-Reply-To:
References:
<2CB2DA8E426F3541AB1907F98ABA6570ABE6AE7F@icexch-m1.ic.ac.uk>,
Message-ID: <2CB2DA8E426F3541AB1907F98ABA6570ABE6AE93@icexch-m1.ic.ac.uk>
What do nrow(F1) and the other nrow(...) calls return?
From goatsrunfaster at gmail.com Thu Oct 23 17:03:54 2014
From: goatsrunfaster at gmail.com (Spencer Bruce)
Date: Thu, 23 Oct 2014 11:03:54 -0400
Subject: [adegenet-forum] repooling random rows from genind objects
In-Reply-To: <2CB2DA8E426F3541AB1907F98ABA6570ABE6AE93@icexch-m1.ic.ac.uk>
References:
<2CB2DA8E426F3541AB1907F98ABA6570ABE6AE7F@icexch-m1.ic.ac.uk>
<2CB2DA8E426F3541AB1907F98ABA6570ABE6AE93@icexch-m1.ic.ac.uk>
Message-ID:
They both return NULL if I just type them into R.
Just to be clear, these genind objects contain microsat data for 11 loci
for thousands of individuals.
I'm rather new to R, so I apologize if I'm missing something obvious here...
From francesco.montinaro at gmail.com Thu Oct 23 17:19:19 2014
From: francesco.montinaro at gmail.com (Francesco Montinaro)
Date: Thu, 23 Oct 2014 16:19:19 +0100
Subject: [adegenet-forum] adegenet-forum Digest, Vol 74, Issue 9
In-Reply-To:
References:
Message-ID:
Hi,
I think the problem is that a genind object is not a matrix or data frame,
so nrow() returns NULL for it.
You probably want to sample using object$tab instead.
Hope it helps.
Best
Francesco Montinaro
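The suggestion above can be sketched in base R. The matrix `tab` below is only a made-up stand-in for the allele table stored in a genind object (one row per individual); the object names and sizes are assumptions for illustration. The point is that the number of individuals comes from the table's row count, and the sampling is done on row indices.

```r
# Stand-in for the allele table of a genind object (obj$tab):
# one row per individual, one column per allele. Values are made up.
set.seed(42)
tab <- matrix(rbinom(50 * 8, size = 2, prob = 0.3), nrow = 50,
              dimnames = list(paste0("ind", 1:50), paste0("allele", 1:8)))

n_total <- nrow(tab)           # number of individuals (rows of the table)
keep <- sample(n_total, 20)    # 20 distinct row indices, without replacement
sub_tab <- tab[keep, , drop = FALSE]
dim(sub_tab)
```

With a real genind object, the same index vector would be used to subset the object itself, as in obj[keep, ], which is the indexing form already used in the original post. adegenet also provides nInd(obj) for the number of individuals; whether it is available depends on the installed version.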
From hilpert at ipk-gatersleben.de Fri Oct 24 09:13:22 2014
From: hilpert at ipk-gatersleben.de (Stefanie Hilpert)
Date: Fri, 24 Oct 2014 07:13:22 +0000
Subject: [adegenet-forum] DAPC & Ploidylevel
Message-ID:
Dear everybody,
I am currently using the adegenet package to analyse the structure of my microsatellite dataset and to compare the results with those from the STRUCTURE software. The organism I am working on is an apomictic plant, and I am aware that STRUCTURE is probably not adequate because it assumes HWE, which asexual reproduction violates. Nevertheless, we use STRUCTURE for apomicts because in most cases the groups it assigns correlate with biological traits. Knowing that STRUCTURE is biased here, we decided to perform a DAPC as well.
But now I have run into another problem with adegenet. I am working with a mixed-ploidy system, with ploidies ranging from 4 to 11. To get the data into adegenet we coded all individuals as 11x, because otherwise it was not possible to load the data. Now I am wondering how large the bias is if the calculation assumes that, for example, a tetraploid is a hendecaploid, and whether I can still trust the results. I am asking because the DAPC results are completely different from those of STRUCTURE, which puzzles me a bit, because I expected at least some correlation (the optimal number of clusters k is the same, but the assignment of individuals to clusters differs completely).
I would appreciate some help.
Stefanie Hilpert
------------------------------------------------------------------------------
Stefanie Hilpert
-PhD Candidate-
Dept. of Cytogenetics and Genome Analysis
Leibniz Institute of Plant Genetics and Crop Plant Research (IPK)
Corrensstraße 3, D-06466 Gatersleben, Germany
+49 (0)39482 5673
IPK Graduate School
International Max-Planck Research School
From tingpu89 at gmail.com Mon Oct 20 19:18:56 2014
From: tingpu89 at gmail.com (Ting Pu)
Date: Mon, 20 Oct 2014 10:18:56 -0700
Subject: [adegenet-forum] $li score in sPCA
Message-ID:
Hi all,
I was just wondering: in sPCA, after I have selected the first positive principal component (which represents global structure), how should I interpret positive and negative values of $li (the entity scores)? Does a high positive $li mean that the spatial correlation is stronger than for a negative $li? Please correct me if I am wrong.
Thank you for your time,
Ting
From t.jombart at imperial.ac.uk Fri Oct 24 12:07:49 2014
From: t.jombart at imperial.ac.uk (Jombart, Thibaut)
Date: Fri, 24 Oct 2014 10:07:49 +0000
Subject: [adegenet-forum] $li score in sPCA
In-Reply-To:
References:
Message-ID: <2CB2DA8E426F3541AB1907F98ABA6570ABE6C2D3@icexch-m1.ic.ac.uk>
Hello,
As in any multivariate analysis, the sign of the PCs is arbitrary. Only the distance between individuals on this PC has a meaning. For example, if you have (using integers to keep things simple):
A = -1
B = 1
C = 3
D = 5
Then the difference between A and C is the same as between B and D.
Cheers
Thibaut
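The point about arbitrary signs can be checked numerically. The sketch below uses an ordinary PCA (prcomp) on random data as a stand-in, which is an assumption made to keep the example self-contained; it is not an sPCA. The same reasoning applies to the $li scores: flipping the sign of an axis leaves every distance between individuals unchanged.

```r
set.seed(7)
x <- matrix(rnorm(40), nrow = 10)    # 10 individuals, 4 made-up variables
scores <- prcomp(x)$x[, 1]           # scores on the first principal component
flipped <- -scores                   # same axis with the opposite sign

# Pairwise distances between individuals are identical before and after.
same_dist <- all.equal(as.vector(dist(scores)), as.vector(dist(flipped)))
same_dist

# The toy values from the message above: A to C equals B to D.
li <- c(A = -1, B = 1, C = 3, D = 5)
(li[["C"]] - li[["A"]]) == (li[["D"]] - li[["B"]])
```

So a large positive score and a large negative score are equally far from zero; what can be interpreted is which individuals sit close together or far apart along the axis.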
From goatsrunfaster at gmail.com Fri Oct 24 16:29:58 2014
From: goatsrunfaster at gmail.com (Spencer Bruce)
Date: Fri, 24 Oct 2014 10:29:58 -0400
Subject: [adegenet-forum] adegenet-forum Digest, Vol 74, Issue 9
In-Reply-To:
References:
Message-ID:
Hello All,
Thanks for the tip, Francesco! This almost works for me...
When I enter the code below in an attempt to randomly sample 20 individuals
from the genind object "tdhybrids", I get back a new genind object called
"Random". I then used genind2genotype to view the contents of "Random", but
there are only 2 individuals (not 20). The code I used is below:
Random <- tdhybrids[sample(tdhybrids$tab, 20), ]
obj <- genind2genotype(Random)
Am I missing something here? A big thank you to everyone in advance for
putting up with my questions!
-Spencer
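A likely explanation, sketched in base R: calling sample() on a matrix draws from its cell values, not from its row numbers. The matrix `tab` below is a made-up stand-in for tdhybrids$tab, and its 0/1/2 coding is an assumption; the actual values stored in the table depend on the adegenet version.

```r
set.seed(3)
# Made-up allele table: 30 individuals, 5 alleles, values 0, 1 or 2.
tab <- matrix(rbinom(30 * 5, size = 2, prob = 0.4), nrow = 30)

vals <- sample(tab, 20)          # 20 *cell values*, each 0, 1 or 2
idx  <- sample(nrow(tab), 20)    # 20 *row numbers* between 1 and 30

range(vals)                      # never above 2, so useless as row indices
# Used as indices, the values can only ever pick rows 1 and 2
# (an index of 0 selects nothing), so at most 2 individuals come back.
length(unique(vals[vals > 0]))
nrow(tab[idx, , drop = FALSE])   # 20 rows, one per sampled individual
```

Under that assumption, sampling indices first and then subsetting, for example tdhybrids[sample(nrow(tdhybrids$tab), 20), ], should return 20 distinct individuals.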
On Thu, Oct 23, 2014 at 11:19 AM, Francesco Montinaro <
francesco.montinaro at gmail.com> wrote:
> Hi,
> I think that the problem is that since a genind object is a list, the nrow
> is NULL.
>
> Probably you want to sample from object$tab instead.
>
> Hope it helps.
>
> Best
>
>
>
> Francesco Montinaro
>
> On 23 October 2014 16:04, <
> adegenet-forum-request at lists.r-forge.r-project.org> wrote:
>
>> Send adegenet-forum mailing list submissions to
>> adegenet-forum at lists.r-forge.r-project.org
>>
>> To subscribe or unsubscribe via the World Wide Web, visit
>>
>> https://lists.r-forge.r-project.org/cgi-bin/mailman/listinfo/adegenet-forum
>>
>> or, via email, send a message with subject or body 'help' to
>> adegenet-forum-request at lists.r-forge.r-project.org
>>
>> You can reach the person managing the list at
>> adegenet-forum-owner at lists.r-forge.r-project.org
>>
>> When replying, please edit your Subject line so it is more specific
>> than "Re: Contents of adegenet-forum digest..."
>>
>>
>> Today's Topics:
>>
>> 1. repooling random rows from genind objects (Spencer Bruce)
>> 2. Re: repooling random rows from genind objects (Jombart, Thibaut)
>> 3. Re: repooling random rows from genind objects (Spencer Bruce)
>> 4. Re: repooling random rows from genind objects (Jombart, Thibaut)
>> 5. Re: repooling random rows from genind objects (Spencer Bruce)
>>
>>
>> ----------------------------------------------------------------------
>>
>> Message: 1
>> Date: Thu, 23 Oct 2014 10:29:10 -0400
>> From: Spencer Bruce
>> To: adegenet-forum at lists.r-forge.r-project.org
>> Subject: [adegenet-forum] repooling random rows from genind objects
>> Message-ID:
>> > UFFOSerHR1qeKXHSwVzF-0whQf2do83wOV38s5w at mail.gmail.com>
>> Content-Type: text/plain; charset="utf-8"
>>
>> Hello All!
>>
>> I have three seperate populations as genind objects. What I would like to
>> do is pull a certain number of random individuals from each, to form a new
>> single genind population.
>>
>> I would then like individuals from this new genind population to mate
>> randomly, producing another genind object which would contain their
>> offspring.
>>
>> Below is the code I came up with (which does not work):
>>
>> Year1 <- repool(F1[sample(nrow(F1), 500), ], pop1[sample(nrow(pop1), 750),
>> ], pop2[sample(nrow(pop2), 750), ], n=2000)
>>
>> Year2 <- hybridize(Year1[sample(nrow(Year1), 1000), ],
>> Year1[sample(nrow(Year1), 1000), ], n=2000)
>>
>>
>> any help would be greatly appreciated!
>>
>> Best,
>> Spencer
>>
>> --
>> Spencer A Bruce
>> 200 Washington St.
>> Troy, NY 12180
>> 518 225 0787
>> -------------- next part --------------
>> An HTML attachment was scrubbed...
>> URL: <
>> http://lists.r-forge.r-project.org/pipermail/adegenet-forum/attachments/20141023/cb94a767/attachment-0001.html
>> >
>>
>> ------------------------------
>>
>> Message: 2
>> Date: Thu, 23 Oct 2014 14:50:24 +0000
>> From: "Jombart, Thibaut"
>> To: Spencer Bruce ,
>> "adegenet-forum at lists.r-forge.r-project.org"
>>
>> Subject: Re: [adegenet-forum] repooling random rows from genind
>> objects
>> Message-ID:
>> <2CB2DA8E426F3541AB1907F98ABA6570ABE6AE7F at icexch-m1.ic.ac.uk>
>> Content-Type: text/plain; charset="iso-8859-1"
>>
>>
>> Hello,
>> hard to figure out what is wrong without the error message..
>> Cheers
>> Thibaut
>> ________________________________
>> From: adegenet-forum-bounces at lists.r-forge.r-project.org [
>> adegenet-forum-bounces at lists.r-forge.r-project.org] on behalf of Spencer
>> Bruce [goatsrunfaster at gmail.com]
>> Sent: 23 October 2014 15:29
>> To: adegenet-forum at lists.r-forge.r-project.org
>> Subject: [adegenet-forum] repooling random rows from genind objects
>>
>> Hello All!
>>
>> I have three seperate populations as genind objects. What I would like to
>> do is pull a certain number of random individuals from each, to form a new
>> single genind population.
>>
>> I would then like individuals from this new genind population to mate
>> randomly, producing another genind object which would contain their
>> offspring.
>>
>> Below is the code I came up with (which does not work):
>>
>> Year1 <- repool(F1[sample(nrow(F1), 500), ], pop1[sample(nrow(pop1),
>> 750), ], pop2[sample(nrow(pop2), 750), ], n=2000)
>>
>> Year2 <- hybridize(Year1[sample(nrow(Year1), 1000), ],
>> Year1[sample(nrow(Year1), 1000), ], n=2000)
>>
>>
>> any help would be greatly appreciated!
>>
>> Best,
>> Spencer
>>
>> --
>> Spencer A Bruce
>> 200 Washington St.
>> Troy, NY 12180
>> 518 225 0787
>>
>> ------------------------------
>>
>> Message: 3
>> Date: Thu, 23 Oct 2014 10:52:59 -0400
>> From: Spencer Bruce
>> To: "Jombart, Thibaut"
>> Cc: "adegenet-forum at lists.r-forge.r-project.org"
>>
>> Subject: Re: [adegenet-forum] repooling random rows from genind
>> objects
>> Message-ID:
>> <
>> CAGjKGeZhjLhurZbKiMZxSmZV_GFC3quw58FPt06LrXwakRDPeA at mail.gmail.com>
>> Content-Type: text/plain; charset="utf-8"
>>
>> Error message:
>>
>> Error in sample.int(length(x), size, replace, prob) :
>> invalid first argument
>>
>>
>> ------------------------------
>>
>> Message: 4
>> Date: Thu, 23 Oct 2014 14:54:53 +0000
>> From: "Jombart, Thibaut"
>> To: Spencer Bruce
>> Cc: "adegenet-forum at lists.r-forge.r-project.org"
>>
>> Subject: Re: [adegenet-forum] repooling random rows from genind
>> objects
>> Message-ID:
>> <2CB2DA8E426F3541AB1907F98ABA6570ABE6AE93 at icexch-m1.ic.ac.uk>
>> Content-Type: text/plain; charset="iso-8859-1"
>>
>>
>>
>> What does nrow(F1) and other nrow(...)'s say?
>>
>>
>>
>>
>>
>> ------------------------------
>>
>> Message: 5
>> Date: Thu, 23 Oct 2014 11:03:54 -0400
>> From: Spencer Bruce
>> To: "Jombart, Thibaut"
>> Cc: "adegenet-forum at lists.r-forge.r-project.org"
>>
>> Subject: Re: [adegenet-forum] repooling random rows from genind
>> objects
>> Message-ID:
>> <
>> CAGjKGeZiyS-27oF2CKb6JBPgS+VSyTA1-NS2Q8g2UC0JqOs4VA at mail.gmail.com>
>> Content-Type: text/plain; charset="utf-8"
>>
>> They both return NULL when I type them into R.
>>
>> Just to be clear, these genind objects contain microsatellite data for
>> 11 loci for thousands of individuals.
>>
>> I'm rather new to R, so I apologize if I'm missing something obvious
>> here...
>>
>>
>> ------------------------------
>>
>> _______________________________________________
>> adegenet-forum mailing list
>> adegenet-forum at lists.r-forge.r-project.org
>>
>> https://lists.r-forge.r-project.org/cgi-bin/mailman/listinfo/adegenet-forum
>>
>> End of adegenet-forum Digest, Vol 74, Issue 9
>> *********************************************
>>
--
Spencer A Bruce
200 Washington St.
Troy, NY 12180
518 225 0787
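A note on the sample.int error quoted above. Since nrow(F1) returns NULL, the
genind objects apparently expose no dimensions to nrow(), so
sample(nrow(F1), 500) ends up as sample(NULL, 500), which fails with "invalid
first argument". The sketch below first reproduces that failure in base R,
then shows one possible workaround that counts individuals with adegenet's
nInd() accessor. The helper names sample_genind and build_year are
hypothetical, the sample sizes are simply those from the post, and the
adegenet part has not been run against the poster's data.

```r
# Base R part: reproduces the error without needing adegenet.
# nrow() of an object without dimensions is NULL, and sample(NULL, n) fails.
no_dims <- NULL
err <- tryCatch(sample(nrow(no_dims), 5), error = function(e) e)
print(conditionMessage(err))

# Hypothetical helper (not part of adegenet): draw n random individuals
# from a genind object, counting individuals with nInd() instead of nrow().
sample_genind <- function(x, n) {
  x[sample(adegenet::nInd(x), n)]
}

# Hypothetical helper: pool random subsets of three genind objects, then
# let two random subsets of the pool "mate" via hybridize().
# Assumes adegenet is installed; repool() is called without an 'n' argument.
build_year <- function(F1, pop1, pop2, n_offspring = 2000) {
  year1 <- adegenet::repool(sample_genind(F1, 500),
                            sample_genind(pop1, 750),
                            sample_genind(pop2, 750))
  year2 <- adegenet::hybridize(sample_genind(year1, 1000),
                               sample_genind(year1, 1000),
                               n = n_offspring)
  list(year1 = year1, year2 = year2)
}
```

With adegenet loaded, build_year(F1, pop1, pop2) would be the call that
corresponds to the two lines in the original post.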
From caitiecollins at gmail.com Fri Oct 24 19:03:11 2014
From: caitiecollins at gmail.com (Caitlin Collins)
Date: Fri, 24 Oct 2014 18:03:11 +0100
Subject: [adegenet-forum] Question about how to interpret Cross
validation in my analysis. Thanks!
In-Reply-To:
References:
Message-ID:
Hello again,
In response to your two questions:
*1)*
The output element "Median and Confidence Interval for Random Chance"
provides the values that are used to draw the horizontal solid (median) and
dashed (CI) lines on the plot generated for cross-validation.
In your case, the median and CI for random chance was 49% (43%, 60%). The
interpretation of this would be that if the highest success in outcome
prediction that you were able to achieve with any model was between 43% and
60%, then you could be 95% confident that the ability of even the best
model to assign individuals to the correct group does not differ
significantly from the success rate you could achieve by assigning
individuals to a group at random by, say, flipping a coin as a method of
determining what group they belonged to. Ergo, you would not have succeeded
in creating a useful model.
However, your results indicate that with 25 PCs retained, your model had a
success rate of 69.5%, so you *have* created a "useful" model. Even though
it is not a particularly successful model, it still has a mean success rate
that is 20 percentage points higher than the median success for the coin
toss approach, and 10 percentage points higher than the upper limit of the
CI for random chance. So you can be 95% confident that the somewhat modest
ability of your best model to discriminate between groups is not just
happening by chance: the model is truly doing something useful.
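
The comparison described above can be checked with a few lines of base R. This
is only a sketch: the numbers are copied by hand from the xvalDapc output
quoted further down in this message, and the variable names are made up for
illustration.

```r
# Values copied from the quoted xvalDapc output (not recomputed here).
random_chance <- c(lower = 0.4294840, median = 0.4928747, upper = 0.5962807)
mean_success <- c("5" = 0.5871429, "10" = 0.6000000, "15" = 0.5819048,
                  "20" = 0.6014286, "25" = 0.6952381, "30" = 0.6747619,
                  "35" = 0.6333333, "40" = 0.6109524)

# Number of PCs with the highest mean success, and that success rate.
best_n_pca   <- names(which.max(mean_success))
best_success <- max(mean_success)

# Improvement over random assignment, in proportion units
# (multiply by 100 for percentage points).
gain_over_median <- best_success - random_chance[["median"]]
gain_over_upper  <- best_success - random_chance[["upper"]]

# Which numbers of retained PCs beat the upper limit of the random-chance CI?
beats_chance <- mean_success > random_chance[["upper"]]

print(round(c(best = best_success,
              gain_over_median = gain_over_median,
              gain_over_upper = gain_over_upper), 3))
print(names(mean_success)[beats_chance])
```

The two gains are the "20 points over the coin toss" and "10 points over the
upper CI limit" figures discussed above.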
------
*2)*
While your interpretation is generally true, in that group
membership is not well-predicted by any model, I think you have misread
the results. The way they are laid out, at least in the text you copied
into the e-mail, has shifted the values given for the means to the right of
the number of PCs that they should correspond to. With 25 PCs, your
optimal model is actually achieving a mean success of nearly 70%. Still not
too good, but better than 63%. The root mean squared error for 25 PCs is
32.4%, which is indeed quite high.
However, the interpretation of this is not that you can only be "sure" of
correctly predicting around 20% to the right pre-defined group. Rather, you
can be "sure" of correctly predicting almost 70%! I think your confusion
here may come from your interpretation of what the random chance values
mean. Finding that the mean success for your best model is 20 percentage
points above the success for random chance does not mean you can only be
sure of 20% correct predictions. Rather, you could say that while you can in
fact expect a 70% success rate (your highest mean success), your model is
only providing an improvement of ~ 20 percentage points over the success
rate you could have achieved by tossing a coin.
This changes the severity of your final conclusion. First, I should mention
that it's not fair to say that "[your] set of microsatellites can't explain
well [your] pre-defined groups". Instead, it might be more accurate to say,
"*With* the set of microsatellites available, you are unable to build a
*model* with DAPC that explains well the variation between your pre-defined
groups." Finally, in light of the points above, while it is still true that
the model does not explain the variation between groups particularly well,
it does assign about 70% of individuals to the correct group, so I wouldn't
consider it to be "unsuccessful".
-----
Sorry for the long answer, but I hope it helps a bit at least!
Please let me know if it doesn't though, or if you have any more questions.
All the best,
Caitlin.
On Thu, Oct 16, 2014 at 11:30 PM, Angela Merino <
Angela.Merino at cawthron.org.nz> wrote:
> Thank you very much! It was really helpful! :)
>
> So I understand that my model is not significantly better than the other
> models that could be found using my variables (in my case,
> microsatellites). If I use a model with n.pca=20 or n.pca=40 I get pretty
> much the same success of membership prediction (and the same large root
> mean squared error).
>
> 1) My last question (I hope!) about the output of the
> *cross.validation* function: what do the Median and Confidence Interval
> for Random Chance (below in yellow) mean? I think they mean that, with a
> confidence of 95%, the value of successful assignment would be between 43%
> and 60%, which again means that the optimization of my model was
> "not successful". (??)
>
> 2) About the global interpretation of these results, I would say that
> membership of my predefined groups is not well predicted by any model, as
> the mean successful assignment is not higher than 63% (maximum when
> n.pcs=25) and in addition the mean squared error is quite high (30-40%). I
> would be "sure" of predicting only around 20% to the right predefined
> group. In short, my set of microsatellites can't explain my predefined
> groups well.
>
>
>
>
>
> $`Median and Confidence Interval for Random Chance`
>      2.5%       50%     97.5%
> 0.4294840 0.4928747 0.5962807
>
> $`Mean Successful Assignment by Number of PCs of PCA`
>         5        10        15        20        25        30        35        40
> 0.5871429 0.6000000 0.5819048 0.6014286 0.6952381 0.6747619 0.6333333 0.6109524
>
> $`Number of PCs Achieving Highest Mean Success`
> [1] "25"
>
> $`Root Mean Squared Error by Number of PCs of PCA`
>         5        10        15        20        25        30        35        40
> 0.4301795 0.4141872 0.4389381 0.4131429 0.3241735 0.3531491 0.3885084 0.4145894
>
> $`Number of PCs Achieving Lowest MSE`
> [1] "25"
>
> Thanks in advance! I am learning a lot about R and the adegenet package,
> and I find it really interesting for assessing weak genetic population
> structure.
>
>
>
> Kind regards,
>
>
>
> Angela
>
> *From:* Caitlin Collins [mailto:caitiecollins at gmail.com]
> *Sent:* Friday, 17 October 2014 1:28 a.m.
> *To:* Angela Merino
> *Cc:* Collins, Caitlin; Jombart, Thibaut
> *Subject:* Re: Question about how to interpret Cross validation in my
> analysis. Thanks!
>
>
>
> Hi Angela,
>
> Well, I have two pieces of good news for you, and one piece of mediocre
> news.
>
> First, there's nothing to worry about with respect to the "NULL" that you
> are seeing. It just gets printed when xval.plot=TRUE as an artefact of one
> of the lines of the printing function. It has no meaning, and certainly
> does not imply that your model is not valid. (Given the stress that I now
> realise this glaring "NULL" may cause, I've changed the way the plots print
> now, so in the next release of adegenet this won't happen.)
>
> Second, you are absolutely correct in your interpretation of the results
> of xvalDapc (which are stored in whatever object you assigned the results
> to, in your case, "xval").
>
>
>
> This brings me to the mediocre news: given that your interpretation is
> correct, it seems that the best model you can achieve with DAPC, where
> n.pca=25, is only able to predict the group membership of validation set
> individuals in 63% of the cases, with a 32% root mean squared error.
> Arguably, this is not great. Your final comment on the matter, though, is
> quite insightful. The fact that you can achieve the same modest level of
> success with 20-80 PCs indicates that the optimisation procedure has not
> been particularly successful. Ideally, one would like to see an arch, with
> a maximum success point somewhere in the middle. In your case, there is a
> bit of an arch, but it isn't particularly striking.
>
>
>
> The only thing I might add to your interpretation of this result is that
> it's not so much that the model is poor because a similar level of success
> can be achieved with variable numbers of PCs. If mean success was virtually
> constant, but varying around 90%, the interpretation would not be that the
> model is poor, but rather that most levels of PC retention can compose a
> model that effectively discriminates between groups.
>
> I hope this has helped answer some of your questions. If you have any
> more, please feel free to ask.
>
> Best,
> Caitlin.
>
>
>
>
>
> On Mon, Oct 13, 2014 at 11:48 PM, Angela Merino <
> Angela.Merino at cawthron.org.nz> wrote:
>
> Hi Caitlin Collins and Thibaut Jombart,
>
>
>
> My name is Angela Parody-Merino and I am a PhD student at Massey
> University (New Zealand). I am studying the population genetic structure of
> a migratory bird (the New Zealand Godwit) with 23 microsatellites. Anyway,
> maybe this is a very simple question but I really want to understand and be
> sure about the meaning and interpretation of the output when doing
> cross-validation. I have spent some days searching the internet and reading
> explanations etc. without being able to really understand what's going on
> with my analysis. Could you help me please? :)
>
>
>
> This is the script of the analysis:
>
> > x <- ELpop
> > mat <- as.matrix(na.replace(x, method="mean"))
>
> Replaced 371 missing values
>
> > grp <- pop(x)
> > xval <- xvalDapc(mat, grp, n.pca.max = 40, training.set = 0.9,
> +                  result = "groupMean", center = TRUE, scale = FALSE,
> +                  n.pca = NULL, n.rep = 500, xval.plot = TRUE)
> NULL
>
> *>>> What does this NULL mean? Does it mean that the model is not valid?*
>
> $`Median and Confidence Interval for Random Chance`
>      2.5%       50%     97.5%
> 0.4294840 0.4928747 0.5962807
>
> $`Mean Successful Assignment by Number of PCs of PCA`
>         5        10        15        20        25        30        35        40
> 0.5871429 0.6000000 0.5819048 0.6014286 0.6952381 0.6747619 0.6333333 0.6109524
>
> $`Number of PCs Achieving Highest Mean Success`
> [1] "25"
>
> $`Root Mean Squared Error by Number of PCs of PCA`
>         5        10        15        20        25        30        35        40
> 0.4301795 0.4141872 0.4389381 0.4131429 0.3241735 0.3531491 0.3885084 0.4145894
>
> $`Number of PCs Achieving Lowest MSE`
> [1] "25"
>
>
>
> *From the screenshot and the output results of the cross validation (in
> blue), I would say that my model (retaining 25 PCs) can predict with a mean
> success of 63%, but it is not such a good model because most of the models
> that can be obtained by retaining 20, 40, 60, 80 PCs are about equally
> successful. Is my interpretation correct?*
>
>
>
>
>
>
>
> Thanks in advance,
>
>
>
> Kind regards,
>
>
>
> Angela Parody-Merino
From goatsrunfaster at gmail.com Mon Oct 27 14:57:21 2014
From: goatsrunfaster at gmail.com (Spencer Bruce)
Date: Mon, 27 Oct 2014 09:57:21 -0400
Subject: [adegenet-forum] Hybridize Function / df2genind error message
Message-ID:
Hello All,
After hybridizing two populations, I converted the genind object to a
data frame to randomly extract individuals. I then attempted to convert this
data frame back into a genind object, but got the error message below:
> F1_G1 <- df2genind(randomF1)
Error in df2genind(randomF1) :
2 alleles cannot be coded by a total of 19 characters
I'm assuming this is because the "pop" column, instead of being coded by a
number, contains the text generated by the hybridize function:
"honnedaga-tdhybrids"
I tried to resolve this by using the following code, but ran into a second
error message:
> randomF1$pop[randomF1$pop == "honnedaga-tdhybrids"] <- 1
Warning message:
In `[<-.factor`(`*tmp*`, randomF1$pop == "honnedaga-tdhybrids", :
invalid factor level, NA generated
Any idea how I might be able to fix this? Thanks in advance!
-Spencer
--
Spencer A Bruce
200 Washington St.
Troy, NY 12180
518 225 0787
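A note on the two messages above. The label "honnedaga-tdhybrids" is exactly
19 characters long, which fits the guess that df2genind was treating the pop
column as genotype data. The warning in the second attempt comes from
assigning a value that is not an existing level of a factor. The base R
sketch below shows one way to handle both points. The data frame is a made-up
stand-in for randomF1 (the locus name and allele codes are hypothetical), and
the final df2genind call is left as a comment because it depends on how the
alleles are really coded.

```r
# Hypothetical stand-in for the data frame in the post: one 'pop' column
# (a factor holding the text label) plus one made-up genotype column.
randomF1 <- data.frame(
  pop  = factor(rep("honnedaga-tdhybrids", 3)),
  loc1 = c("101103", "101101", "103103"),
  stringsAsFactors = FALSE
)

# The label is 19 characters long, matching the df2genind error message.
label_length <- nchar("honnedaga-tdhybrids")

# Assigning a value that is not an existing level yields NA with a warning.
# Renaming the level itself avoids that.
levels(randomF1$pop)[levels(randomF1$pop) == "honnedaga-tdhybrids"] <- "1"

# Keep population labels apart from the genotype columns, so that only
# genotypes would be passed on as allele data.
pop_labels <- randomF1$pop
genotypes  <- randomF1[, setdiff(names(randomF1), "pop"), drop = FALSE]

# Assumption, not run here: with adegenet loaded, something like
#   F1_G1 <- df2genind(genotypes, pop = pop_labels, ncode = 3)
# where ncode must match how the alleles are actually coded.
```

If the only goal is to draw random individuals, subsetting the genind object
directly may avoid the round trip through a data frame altogether.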
From roberto at geodev.com.br Thu Oct 30 16:40:18 2014
From: roberto at geodev.com.br (Roberto Oliveira Santos)
Date: Thu, 30 Oct 2014 15:40:18 +0000
Subject: [adegenet-forum] find.clusters without PCA
Message-ID:
Dear all
Is it possible to run find.clusters without the PCA analysis?
I am interested in the clustering procedure but would like to compare the
results with and without PCA transformation.
Best wishes
Roberto
From f.calboli at imperial.ac.uk Thu Oct 30 16:56:34 2014
From: f.calboli at imperial.ac.uk (Federico Calboli)
Date: Thu, 30 Oct 2014 15:56:34 +0000
Subject: [adegenet-forum] find.clusters without PCA
In-Reply-To:
References:
Message-ID:
On 30 Oct 2014, at 15:40, Roberto Oliveira Santos wrote:
> Dear all
>
> Is it possible to run find.clusters without the PCA analysis?
I would not know whether find.clusters would like it, but in general you can surely find clusters without bothering with a PCA first: you have a formula, you input some data, you get your results.
It would also be completely and utterly idiotic.
You use a PCA first because of correlation between the variables: the PCA transforms the data into a set of uncorrelated variables (and, in the bargain, gives you an idea of which linear combinations explain little or nothing). You use a PCA to get some signal out of the noise.
So, you can well skip the PCA and cluster. You will get some results that might, or might not, look like the results you get after a PCA decomposition. You will also have biased your clustering by an unknown amount, in a way whose meaning is unclear.
BW
F
> I am interested in the clustering procedure but would like to compare the results with and without PCA transformation.
>
> Best wishes
>
> Roberto
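For anyone wanting to make the comparison asked about in this thread, here is
a small base R sketch. It uses simulated numeric data rather than a genind
object, plain kmeans() rather than find.clusters, and all object names are
made up. It illustrates one fact worth keeping in mind: a PCA that retains
all components is only a rotation of the centred data, so Euclidean distances
between individuals are unchanged. Differences between clustering with and
without PCA therefore come from discarding components, not from the
transformation itself. Whether the same holds when retaining all components
through the n.pca argument of find.clusters is an assumption not checked here.

```r
set.seed(1)

# Simulated stand-in data: two groups of 50 "individuals", 6 variables each.
X <- rbind(matrix(rnorm(50 * 6, mean = 0), ncol = 6),
           matrix(rnorm(50 * 6, mean = 4), ncol = 6))

# PCA on centred, unscaled data, keeping every component.
pca <- prcomp(X, center = TRUE, scale. = FALSE)

# Pairwise distances are the same before and after the full rotation.
same_distances <- all.equal(as.vector(dist(X)), as.vector(dist(pca$x)))

# k-means on the raw data versus k-means on the first two PCs only.
km_raw <- kmeans(X, centers = 2, nstart = 10)
km_pca <- kmeans(pca$x[, 1:2], centers = 2, nstart = 10)

# Cross-tabulate the two partitions (cluster labels are arbitrary).
agreement <- table(raw = km_raw$cluster, pca = km_pca$cluster)
print(same_distances)
print(agreement)
```

With groups this well separated the two partitions should coincide; on real
data with weak structure they may not, which is the comparison of interest.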
From roberto at geodev.com.br Thu Oct 30 17:02:57 2014
From: roberto at geodev.com.br (Roberto Oliveira Santos)
Date: Thu, 30 Oct 2014 16:02:57 +0000
Subject: [adegenet-forum] find.clusters without PCA
In-Reply-To:
References:
Message-ID:
Dear Federico
Many thanks. Very kind of you the "It would also be completely and utterly
idiotic.".
Best wishes
Roberto
From f.calboli at imperial.ac.uk Thu Oct 30 17:16:33 2014
From: f.calboli at imperial.ac.uk (Federico Calboli)
Date: Thu, 30 Oct 2014 16:16:33 +0000
Subject: [adegenet-forum] find.clusters without PCA
In-Reply-To:
References:
Message-ID: <43B55DB4-31DF-4C47-A4E7-F10B05131A3A@imperial.ac.uk>
You're welcome. I would not be presenting the results to referees, PhD examiners or colleagues.
http://judgestarling.tumblr.com/post/79974811093/shaming-reputations-as-a-means-of-reducing-the
Happy reading!
F
From roberto at geodev.com.br Thu Oct 30 19:41:17 2014
From: roberto at geodev.com.br (Roberto Oliveira Santos)
Date: Thu, 30 Oct 2014 18:41:17 +0000
Subject: [adegenet-forum] find.clusters without PCA
In-Reply-To: <43B55DB4-31DF-4C47-A4E7-F10B05131A3A@imperial.ac.uk>
References:
<43B55DB4-31DF-4C47-A4E7-F10B05131A3A@imperial.ac.uk>
Message-ID:
Hi Federico
"shaming reputations"? Sorry, I'm pretty sure I don't have any
reputation :-) If anyone asks a naive question, should this be the response? I
disagree... Anyway, thanks for the text. I'll keep it in mind.
Cheers,
Roberto
From andres.susrud at gmail.com Thu Oct 30 21:02:12 2014
From: andres.susrud at gmail.com (=?UTF-8?Q?Andres_Schj=C3=B8nhaug_Susrud?=)
Date: Thu, 30 Oct 2014 21:02:12 +0100
Subject: [adegenet-forum] problems adding predicted points to scatter plot
Message-ID:
Dear list,
I'm having problems adding points to a dapc scatter plot.
grp = find.clusters(human_DR_bind_2[1:200,])
dapc1 <- dapc(human_DR_bind_2[1:200,],grp$grp)
pred.sup <- predict.dapc(dapc1, newdata=x.sup2)
names(pred.sup)
scatter(dapc1, cell=2.5, pch=1, cstar=0, axesel=FALSE, col=c(2,3,4))
par(xpd=T)
points(pred.sup$ind.scores[,1],pred.sup$ind.scores[,2],pch = 2,col = 6)
the problem is that the predicted points are "all" visible, but completely
out of placement.
when plotting the dapc1$ind.scores[,1],dapc1$ind.scores
plot(dapc1$ind.scores[,1],dapc1$ind.scores)
points(pred.sup$ind.scores[,1],pred.sup$ind.scores[,2],pch = 2,col = 6)
the alignment seems fine.
Thanks for any help on this matter.
BR
Andres
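
One possible cause, offered as an assumption and not a confirmed diagnosis, is the inset barplot of eigenvalues that scatter() draws by default for dapc objects: drawing it can change the graphical parameters, so points added afterwards may land in a different coordinate system. The sketch below uses simulated matrices as stand-ins for human_DR_bind_2 and x.sup2 (train, x_sup and make_group are hypothetical names), fixes n.pca and n.da so the calls are not interactive, and turns the inset off with scree.da = FALSE before adding the predicted points.

```r
library(adegenet)

set.seed(42)
# Simulated stand-in data: 3 groups, 30 numeric variables (hypothetical).
make_group <- function(n, shift) matrix(rnorm(n * 30, mean = shift), nrow = n)
train <- rbind(make_group(40, 0), make_group(40, 2), make_group(40, 4))
x_sup <- rbind(make_group(5, 0), make_group(5, 2), make_group(5, 4))
grp   <- factor(rep(1:3, each = 40))

# n.pca and n.da are given explicitly so dapc() does not prompt for them.
dapc1 <- dapc(train, grp, n.pca = 10, n.da = 2)

# Project the supplementary individuals onto the discriminant functions.
pred_sup <- predict(dapc1, newdata = x_sup)

# Draw the scatter plot without the inset, then overlay the predicted points.
scatter(dapc1, scree.da = FALSE, cellipse = 2.5, pch = 1, cstar = 0,
        axesell = FALSE, col = c(2, 3, 4))
par(xpd = TRUE)
points(pred_sup$ind.scores[, 1], pred_sup$ind.scores[, 2], pch = 2, col = 6)
```

If the points still look misplaced with the inset disabled, comparing range(dapc1$ind.coord[, 1]) with range(pred_sup$ind.scores[, 1]) shows whether the two sets of coordinates are on the same scale at all.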
From hilpert at ipk-gatersleben.de Wed Oct 29 11:24:26 2014
From: hilpert at ipk-gatersleben.de (Stefanie Hilpert)
Date: Wed, 29 Oct 2014 10:24:26 +0000
Subject: [adegenet-forum] DAPC & Ploidylevel
Message-ID:
I already asked this question a week ago, but I'll try again, so here we go:
Dear everybody,
I am currently using the adegenet package to perform a structure analysis of my microsatellite dataset and compare it to the results of an analysis using the STRUCTURE software. The organism I am working on is an apomictic plant, and I am aware that STRUCTURE is probably not adequate because it assumes HWE and asexuality violates HWE. Nevertheless, we use STRUCTURE analysis for apomicts because in most cases the assigned groups correlate with biological traits. Knowing that there is a bias when using STRUCTURE, we decided to perform a DAPC in addition.
But now I have run into another problem using adegenet. I am working with a mixed-ploidy system with ploidies ranging from 4 to 11. To load the data into the adegenet package we coded all individuals as 11x, because otherwise it was not possible to load the data. Now I am wondering how big the bias is if the calculation assumes that, for example, a tetraploid is a hendecaploid, and whether I can still trust the results. I am asking because the results of the DAPC are completely different from those of STRUCTURE, which puzzles me a bit because I expected at least some correlation (the optimal number of clusters k is the same, but the assignment of individuals to the clusters differs completely).
I would appreciate some help.
Stefanie Hilpert
------------------------------------------------------------------------------
Stefanie Hilpert
-PhD Candidate-
Dept. of Cytogenetics and Genome Analysis
Leibniz Institute of Plant Genetics and Crop Plant Research (IPK)
Corrensstraße 3, D-06466 Gatersleben Germany
+49 (0)39482 5673
IPK Graduate School
International Max-Planck Research School