From noreply at r-forge.r-project.org Wed May 8 15:19:07 2013 From: noreply at r-forge.r-project.org (noreply at r-forge.r-project.org) Date: Wed, 8 May 2013 15:19:07 +0200 (CEST) Subject: [adegenet-commits] r1121 - pkg/vignettes www Message-ID: <20130508131908.0DF731802F0@r-forge.r-project.org> Author: jombart Date: 2013-05-08 15:19:07 +0200 (Wed, 08 May 2013) New Revision: 1121 Modified: pkg/vignettes/adegenet-dapc.tex www/literature.html Log: +1ref Modified: pkg/vignettes/adegenet-dapc.tex =================================================================== --- pkg/vignettes/adegenet-dapc.tex 2013-04-30 09:59:19 UTC (rev 1120) +++ pkg/vignettes/adegenet-dapc.tex 2013-05-08 13:19:07 UTC (rev 1121) @@ -21,7 +21,7 @@ \newcommand{\code}[1]{{{\tt #1}}} -\title{A tutorial for Discriminant Analysis of Principal Components (DAPC) using \textit{adegenet} 1.3-0} +\title{A tutorial for Discriminant Analysis of Principal Components (DAPC) using \textit{adegenet} 1.3-7} \author{Thibaut Jombart} \date{\today} @@ -204,12 +204,12 @@ We specify that we want to evaluate up to $k=40$ groups (\texttt{max.n.clust=40}): \begin{Schunk} \begin{Sinput} -> grp <- find.clusters(x, max.n.clust = 40) +> grp <- find.clusters(x, max.n.clust=40) \end{Sinput} \end{Schunk} \begin{center} - \includegraphics[width=.7\textwidth]{figs/findclust-pca.pdf} + \includegraphics[width=.7\textwidth]{findclust-pca.pdf} \end{center} \noindent @@ -221,7 +221,7 @@ Then, the function displays a graph of BIC values for increasing values of $k$: \begin{center} - \includegraphics[width=.7\textwidth]{figs/findclust-bic.pdf} + \includegraphics[width=.7\textwidth]{findclust-bic.pdf} \end{center} \noindent This graph shows a clear decrease of BIC until $k=6$ clusters, after which BIC increases. @@ -254,14 +254,14 @@ \end{Sinput} \begin{Soutput} 001 002 003 004 005 006 007 008 009 010 - 1 1 1 5 1 1 1 1 1 1 + 6 6 6 2 6 6 6 6 6 6 Levels: 1 2 3 4 5 6 \end{Soutput} \begin{Sinput} > grp$size \end{Sinput} \begin{Soutput} -[1] 98 97 102 99 105 99 +[1] 97 105 102 99 99 98 \end{Soutput} \end{Schunk} @@ -276,19 +276,18 @@ \end{Sinput} \begin{Soutput} 1 2 3 4 5 6 - 1 97 0 0 0 3 0 - 2 0 0 0 99 1 0 - 3 0 2 0 0 0 98 + 1 0 3 0 0 0 97 + 2 0 1 0 0 99 0 + 3 2 0 0 98 0 0 4 0 0 100 0 0 0 - 5 1 95 2 0 2 0 - 6 0 0 0 0 99 1 + 5 95 2 2 0 0 1 + 6 0 99 0 1 0 0 \end{Soutput} \begin{Sinput} -> table.value(table(pop(x), grp$grp), col.lab = paste("inf", 1:6), -+ row.lab = paste("ori", 1:6)) +> table.value(table(pop(x), grp$grp), col.lab=paste("inf", 1:6), row.lab=paste("ori", 1:6)) \end{Sinput} \end{Schunk} -\includegraphics{figs/dapc-006} +\includegraphics{dapc-006} \noindent Rows correspond to actual groups ("ori''), while columns correspond to inferred groups ("inf''). @@ -313,7 +312,7 @@ better, more efficient summaries of the data than others. For instance, in the following case: \begin{center} - \includegraphics[width=.7\textwidth]{figs/findclust-noclearcut.pdf} + \includegraphics[width=.7\textwidth]{findclust-noclearcut.pdf} \end{center} \noindent , the concept of "true $k$" is fairly hypothetical. This does not mean that clutering @@ -384,7 +383,7 @@ probabilities returned by the method (see section below about the stability of membership probabilities). \begin{center} - \includegraphics[width=.7\textwidth]{figs/findclust-pca.pdf} + \includegraphics[width=.7\textwidth]{findclust-pca.pdf} \end{center} \noindent The bottomline is therefore retaining a few PCs without sacrificing too much information. @@ -394,7 +393,7 @@ Then, the method displays a barplot of eigenvalues for the discriminant analysis, asking for a number of discriminant functions to retain (unless argument \texttt{n.da} is provided). \begin{center} - \includegraphics[width=.7\textwidth]{figs/eigen-dapc.pdf} + \includegraphics[width=.7\textwidth]{eigen-dapc.pdf} \end{center} For small number of clusters, all eigenvalues can be retained since all discriminant functions can @@ -409,9 +408,9 @@ > dapc1 \end{Sinput} \begin{Soutput} - ######################################### + ################################################# # Discriminant Analysis of Principal Components # - ######################################### + ################################################# class: dapc $call: dapc.genind(x = x, pop = grp$grp, n.pca = 40, n.da = 100) @@ -452,7 +451,7 @@ > scatter(dapc1) \end{Sinput} \end{Schunk} -\includegraphics{figs/dapc-010} +\includegraphics{dapc-010} \noindent The obtained graph represents the individuals as dots and the groups as inertia ellipses. Eigenvalues of the analysis are displayed in inset. These graphs are fairly easy to @@ -481,33 +480,31 @@ \begin{Schunk} \begin{Sinput} -> scatter(dapc1, posi.da = "bottomright", bg = "white", pch = 17:22) +> scatter(dapc1, posi.da="bottomright", bg="white", pch=17:22) \end{Sinput} \end{Schunk} -\includegraphics{figs/dapc-011} +\includegraphics{dapc-011} \noindent This is still not entirely satisfying: we need to define other colors more visible over a white background, and we can remove the segments linking the points to their ellipses: \begin{Schunk} \begin{Sinput} -> myCol <- c("darkblue", "purple", "green", "orange", "red", "blue") -> scatter(dapc1, posi.da = "bottomright", bg = "white", pch = 17:22, -+ cstar = 0, col = myCol, scree.pca = TRUE, posi.pca = "bottomleft") +> myCol <- c("darkblue","purple","green","orange","red","blue") +> scatter(dapc1, posi.da="bottomright", bg="white", pch=17:22, cstar=0, col=myCol, scree.pca=TRUE, posi.pca="bottomleft") \end{Sinput} \end{Schunk} -\includegraphics{figs/dapc-012} +\includegraphics{dapc-012} \noindent Another possibility is remove the labels within the ellipses and add a legend to the plot. We also use the same symbol for all individuals, but use bigger dots and transparent colours to have a better feel for the density of individuals on the factorial plane. \begin{Schunk} \begin{Sinput} -> scatter(dapc1, scree.da = FALSE, bg = "white", pch = 20, cell = 0, -+ cstar = 0, col = myCol, solid = 0.4, cex = 3, clab = 0, leg = TRUE, -+ txt.leg = paste("Cluster", 1:6)) +> scatter(dapc1, scree.da=FALSE, bg="white", pch=20, cell=0, cstar=0, col=myCol, solid=.4, ++ cex=3,clab=0, leg=TRUE, txt.leg=paste("Cluster",1:6)) \end{Sinput} \end{Schunk} -\includegraphics{figs/dapc-013} +\includegraphics{dapc-013} We can also add a minimum spanning tree based on the (squared) distances between populations within the entire space. @@ -519,27 +516,22 @@ using \texttt{scree.pca=TRUE}). \begin{Schunk} \begin{Sinput} -> scatter(dapc1, ratio.pca = 0.3, bg = "white", pch = 20, cell = 0, -+ cstar = 0, col = myCol, solid = 0.4, cex = 3, clab = 0, mstree = TRUE, -+ scree.da = FALSE, posi.pca = "bottomright", leg = TRUE, txt.leg = paste("Cluster", -+ 1:6)) -> par(xpd = TRUE) -> points(dapc1$grp.coord[, 1], dapc1$grp.coord[, 2], pch = 4, cex = 3, -+ lwd = 8, col = "black") -> points(dapc1$grp.coord[, 1], dapc1$grp.coord[, 2], pch = 4, cex = 3, -+ lwd = 2, col = myCol) -> myInset <- function() { +> scatter(dapc1, ratio.pca=0.3, bg="white", pch=20, cell=0, cstar=0, col=myCol, solid=.4, ++ cex=3, clab=0, mstree=TRUE, scree.da=FALSE, ++ posi.pca="bottomright", leg=TRUE, txt.leg=paste("Cluster",1:6)) +> par(xpd=TRUE) +> points(dapc1$grp.coord[,1], dapc1$grp.coord[,2], pch=4, cex=3, lwd=8, col="black") +> points(dapc1$grp.coord[,1], dapc1$grp.coord[,2], pch=4, cex=3, lwd=2, col=myCol) +> myInset <- function(){ + temp <- dapc1$pca.eig -+ temp <- 100 * cumsum(temp)/sum(temp) -+ plot(temp, col = rep(c("black", "lightgrey"), c(dapc1$n.pca, -+ 1000)), ylim = c(0, 100), xlab = "PCA axis", ylab = "Cumulated variance (%)", -+ cex = 1, pch = 20, type = "h", lwd = 2) ++ temp <- 100* cumsum(temp)/sum(temp) ++ plot(temp, col=rep(c("black","lightgrey"), c(dapc1$n.pca,1000)), ylim=c(0,100), ++ xlab="PCA axis", ylab="Cumulated variance (%)", cex=1, pch=20, type="h", lwd=2) + } -> add.scatter(myInset(), posi = "bottomright", inset = c(-0.03, -+ -0.01), ratio = 0.28, bg = transp("white")) +> add.scatter(myInset(), posi="bottomright", inset=c(-0.03,-0.01), ratio=.28, bg=transp("white")) \end{Sinput} \end{Schunk} -\includegraphics{figs/dapc-014} +\includegraphics{dapc-014} Lastly, note that \texttt{scatter} can also represent a single discriminant function, which is @@ -548,11 +540,10 @@ different colors for different groups: \begin{Schunk} \begin{Sinput} -> scatter(dapc1, 1, 1, col = myCol, bg = "white", scree.da = FALSE, -+ legend = TRUE, solid = 0.4) +> scatter(dapc1,1,1, col=myCol, bg="white", scree.da=FALSE, legend=TRUE, solid=.4) \end{Sinput} \end{Schunk} -\includegraphics{figs/dapc-015} +\includegraphics{dapc-015} @@ -604,7 +595,7 @@ \end{Soutput} \begin{Sinput} > pop(H3N2) <- H3N2$other$epid -> dapc.flu <- dapc(H3N2, n.pca = 30, n.da = 10) +> dapc.flu <- dapc(H3N2, n.pca=30,n.da=10) \end{Sinput} \end{Schunk} @@ -612,22 +603,20 @@ second one shows the originality of 2006 strains. \begin{Schunk} \begin{Sinput} -> myPal <- colorRampPalette(c("blue", "gold", "red")) -> scatter(dapc.flu, col = transp(myPal(6)), scree.da = FALSE, cell = 1.5, -+ cex = 2, bg = "white", cstar = 0) +> myPal <- colorRampPalette(c("blue","gold","red")) +> scatter(dapc.flu, col=transp(myPal(6)), scree.da=FALSE, cell=1.5, cex=2, bg="white",cstar=0) \end{Sinput} \end{Schunk} -\includegraphics{figs/dapc-017} +\includegraphics{dapc-017} We can assess which alleles most highlight the originality of 2006 using \texttt{loadingplot}: \begin{Schunk} \begin{Sinput} > set.seed(4) -> contrib <- loadingplot(dapc.flu$var.contr, axis = 2, thres = 0.07, -+ lab.jitter = 1) +> contrib <- loadingplot(dapc.flu$var.contr, axis=2, thres=.07, lab.jitter=1) \end{Sinput} \end{Schunk} -\includegraphics{figs/dapc-018} +\includegraphics{dapc-018} \noindent \texttt{temp} is a list invisibly returned by \texttt{loadingplot} which contains the most contributing alleles (i.e., contributions above a given threshold -- argument \texttt{threshold}). @@ -638,10 +627,8 @@ > temp <- seploc(H3N2) > snp906 <- truenames(temp[["906"]])$tab > snp399 <- truenames(temp[["399"]])$tab -> freq906 <- apply(snp906, 2, function(e) tapply(e, pop(H3N2), -+ mean, na.rm = TRUE)) -> freq399 <- apply(snp399, 2, function(e) tapply(e, pop(H3N2), -+ mean, na.rm = TRUE)) +> freq906 <- apply(snp906, 2, function(e) tapply(e, pop(H3N2), mean, na.rm=TRUE)) +> freq399 <- apply(snp399, 2, function(e) tapply(e, pop(H3N2), mean, na.rm=TRUE)) > freq906 \end{Sinput} \begin{Soutput} @@ -666,16 +653,14 @@ 2006 0.357142857 0.6428571 \end{Soutput} \begin{Sinput} -> par(mfrow = c(1, 2), mar = c(5.1, 4.1, 4.1, 0.1), las = 3) -> matplot(freq906, pch = c("a", "c"), type = "b", xlab = "year", -+ ylab = "allele frequency", xaxt = "n", cex = 1.5, main = "SNP # 906") -> axis(side = 1, at = 1:6, lab = 2001:2006) -> matplot(freq399, pch = c("c", "t"), type = "b", xlab = "year", -+ ylab = "allele frequency", xaxt = "n", cex = 1.5, main = "SNP # 399") -> axis(side = 1, at = 1:6, lab = 2001:2006) +> par(mfrow=c(1,2), mar=c(5.1,4.1,4.1,.1),las=3) +> matplot(freq906, pch=c("a","c"), type="b",xlab="year",ylab="allele frequency", xaxt="n", cex=1.5, main="SNP # 906") +> axis(side=1, at=1:6, lab=2001:2006) +> matplot(freq399, pch=c("c","t"), type="b", xlab="year",ylab="allele frequency", xaxt="n", cex=1.5, main="SNP # 399") +> axis(side=1, at=1:6, lab=2001:2006) \end{Sinput} \end{Schunk} -\includegraphics{figs/dapc-019} +\includegraphics{dapc-019} In both cases, a new allele appeared in 2005 at a very low frequency, and reached high or even dominant frequencies a year later. @@ -714,16 +699,16 @@ [1] 600 6 \end{Soutput} \begin{Sinput} -> round(head(dapc1$posterior), 3) +> round(head(dapc1$posterior),3) \end{Sinput} \begin{Soutput} - 1 2 3 4 5 6 -001 1.000 0 0 0 0.000 0 -002 1.000 0 0 0 0.000 0 -003 1.000 0 0 0 0.000 0 -004 0.016 0 0 0 0.984 0 -005 1.000 0 0 0 0.000 0 -006 1.000 0 0 0 0.000 0 + 1 2 3 4 5 6 +001 0 0.000 0 0 0 1.000 +002 0 0.000 0 0 0 1.000 +003 0 0.000 0 0 0 1.000 +004 0 0.984 0 0 0 0.016 +005 0 0.000 0 0 0 1.000 +006 0 0.000 0 0 0 1.000 \end{Soutput} \end{Schunk} Each row corresponds to an individual, each column to a group. @@ -744,17 +729,17 @@ $assign.per.pop 1 2 3 4 5 6 -1.0000000 1.0000000 0.9901961 1.0000000 0.9904762 1.0000000 +1.0000000 0.9904762 0.9901961 1.0000000 1.0000000 1.0000000 $prior.grp.size 1 2 3 4 5 6 - 98 97 102 99 105 99 + 97 105 102 99 99 98 $post.grp.size 1 2 3 4 5 6 - 99 97 101 99 105 99 + 97 105 101 99 99 99 \end{Soutput} \end{Schunk} The slot \texttt{assign.per.pop} indicates the proportions of successful reassignment (based on @@ -766,10 +751,10 @@ options); here, we choose to represent only the first 50 individuals to make the figure readable: \begin{Schunk} \begin{Sinput} -> assignplot(dapc1, subset = 1:50) +> assignplot(dapc1, subset=1:50) \end{Sinput} \end{Schunk} -\includegraphics{figs/dapc-022} +\includegraphics{dapc-022} \noindent This figure is the simple graphical translation of the \texttt{posterior} table above. Heat colors @@ -787,27 +772,25 @@ We can plot information of all individuals to have a global picture of the clusters composition. \begin{Schunk} \begin{Sinput} -> compoplot(dapc1, posi = "bottomright", txt.leg = paste("Cluster", -+ 1:6), lab = "", ncol = 1, xlab = "individuals") +> compoplot(dapc1, posi="bottomright", txt.leg=paste("Cluster", 1:6), lab="", ncol=1, xlab="individuals") \end{Sinput} \end{Schunk} -\includegraphics{figs/dapc-023} +\includegraphics{dapc-023} \noindent We can also have a closer look at a subset of individuals; for instance, for the first 50 individuals: \begin{Schunk} \begin{Sinput} -> compoplot(dapc1, subset = 1:50, posi = "bottomright", txt.leg = paste("Cluster", -+ 1:6), lab = "", ncol = 2, xlab = "individuals") +> compoplot(dapc1, subset=1:50, posi="bottomright", txt.leg=paste("Cluster", 1:6), lab="", ncol=2, xlab="individuals") \end{Sinput} \end{Schunk} -\includegraphics{figs/dapc-024} +\includegraphics{dapc-024} Obviously, we can use the power of R to lead our investigation further. For instance, which are the most 'admixed' individuals? Let us consider as admixed individuals having no more than 90\% of probability of membership in a single cluster: \begin{Schunk} \begin{Sinput} -> temp <- which(apply(dapc1$posterior, 1, function(e) all(e < 0.9))) +> temp <- which(apply(dapc1$posterior,1, function(e) all(e<0.9))) > temp \end{Sinput} \begin{Soutput} @@ -815,11 +798,10 @@ 21 47 243 280 \end{Soutput} \begin{Sinput} -> compoplot(dapc1, subset = temp, posi = "bottomright", txt.leg = paste("Cluster", -+ 1:6), ncol = 2) +> compoplot(dapc1, subset=temp, posi="bottomright", txt.leg=paste("Cluster", 1:6), ncol=2) \end{Sinput} \end{Schunk} -\includegraphics{figs/dapc-025} +\includegraphics{dapc-025} @@ -881,18 +863,16 @@ @other: a list containing: coun breed spe \end{Soutput} \begin{Sinput} -> temp <- summary(dapc(microbov, n.da = 100, n.pca = 3))$assign.per.pop * -+ 100 +> temp <- summary(dapc(microbov, n.da=100, n.pca=3))$assign.per.pop*100 \end{Sinput} \end{Schunk} \begin{Schunk} \begin{Sinput} -> par(mar = c(4.5, 7.5, 1, 1)) -> barplot(temp, xlab = "% of reassignment to actual breed", horiz = TRUE, -+ las = 1) +> par(mar=c(4.5,7.5,1,1)) +> barplot(temp, xlab="% of reassignment to actual breed", horiz=TRUE, las=1) \end{Sinput} \end{Schunk} -\includegraphics{figs/dapc-027} +\includegraphics{dapc-027} \noindent We can see that some breeds are well discriminated (e.g. Zebu, Lagunaire, > 90\%) while others are @@ -901,18 +881,16 @@ We repeat the analysis, this time keeping 300 PCs: \begin{Schunk} \begin{Sinput} -> temp <- summary(dapc(microbov, n.da = 100, n.pca = 300))$assign.per.pop * -+ 100 +> temp <- summary(dapc(microbov, n.da=100, n.pca=300))$assign.per.pop*100 \end{Sinput} \end{Schunk} \begin{Schunk} \begin{Sinput} -> par(mar = c(4.5, 7.5, 1, 1)) -> barplot(temp, xlab = "% of reassignment to actual breed", horiz = TRUE, -+ las = 1) +> par(mar=c(4.5,7.5,1,1)) +> barplot(temp, xlab="% of reassignment to actual breed", horiz=TRUE, las=1) \end{Sinput} \end{Schunk} -\includegraphics{figs/dapc-029} +\includegraphics{dapc-029} \noindent We now obtain almost 100\% of discrimination for all groups. Is this result satisfying? Actually not. @@ -923,18 +901,16 @@ \begin{Sinput} > x <- microbov > pop(x) <- sample(pop(x)) -> temp <- summary(dapc(x, n.da = 100, n.pca = 300))$assign.per.pop * -+ 100 +> temp <- summary(dapc(x, n.da=100, n.pca=300))$assign.per.pop*100 \end{Sinput} \end{Schunk} \begin{Schunk} \begin{Sinput} -> par(mar = c(4.5, 7.5, 1, 1)) -> barplot(temp, xlab = "% of reassignment to actual breed", horiz = TRUE, -+ las = 1) +> par(mar=c(4.5,7.5,1,1)) +> barplot(temp, xlab="% of reassignment to actual breed", horiz=TRUE, las=1) \end{Sinput} \end{Schunk} -\includegraphics{figs/dapc-031} +\includegraphics{dapc-031} \noindent Groups have been randomised, and yet we still get very good discrimination. @@ -955,7 +931,7 @@ groups, and computing $a$-scores for each group, as well as the average $a$-score: \begin{Schunk} \begin{Sinput} -> dapc2 <- dapc(microbov, n.da = 100, n.pca = 10) +> dapc2 <- dapc(microbov, n.da=100, n.pca=10) > temp <- a.score(dapc2) > names(temp) \end{Sinput} @@ -963,7 +939,7 @@ [1] "tab" "pop.score" "mean" \end{Soutput} \begin{Sinput} -> temp$tab[1:5, 1:5] +> temp$tab[1:5,1:5] \end{Sinput} \begin{Soutput} Borgou Zebu Lagunaire NDama Somba @@ -995,7 +971,7 @@ The number of retained PCs can be chosen so as to optimize the $a$-score; this is achived by \texttt{optim.a.score}: \begin{Schunk} \begin{Sinput} -> dapc2 <- dapc(microbov, n.da = 100, n.pca = 50) +> dapc2 <- dapc(microbov, n.da=100, n.pca=50) \end{Sinput} \end{Schunk} \begin{Schunk} @@ -1004,7 +980,7 @@ \end{Sinput} \end{Schunk} \begin{center} - \includegraphics[width=.7\textwidth]{figs/ascore.pdf} + \includegraphics[width=.7\textwidth]{ascore.pdf} \end{center} @@ -1019,23 +995,23 @@ We perform the analysis with 20 PCs retained, and then map the membership probabilities as before: \begin{Schunk} \begin{Sinput} -> dapc3 <- dapc(microbov, n.da = 100, n.pca = 20) +> dapc3 <- dapc(microbov, n.da=100, n.pca=20) > myCol <- rainbow(15) \end{Sinput} \end{Schunk} \begin{Schunk} \begin{Sinput} -> par(mar = c(5.1, 4.1, 1.1, 1.1), xpd = TRUE) -> compoplot(dapc3, lab = "", posi = list(x = 12, y = -0.01), cleg = 0.7) +> par(mar=c(5.1,4.1,1.1,1.1), xpd=TRUE) +> compoplot(dapc3, lab="", posi=list(x=12,y=-.01), cleg=.7) \end{Sinput} \end{Schunk} -\includegraphics{figs/dapc-036} +\includegraphics{dapc-036} And as before, we can investigate further admixed individuals, which we arbitrarily define as those having no more than 0.5 probability of membership to any group: \begin{Schunk} \begin{Sinput} -> temp <- which(apply(dapc3$posterior, 1, function(e) all(e < 0.5))) +> temp <- which(apply(dapc3$posterior,1, function(e) all(e<0.5))) > temp \end{Sinput} \begin{Soutput} @@ -1052,12 +1028,11 @@ \end{Soutput} \begin{Sinput} > lab <- pop(microbov) -> par(mar = c(8, 4, 5, 1), xpd = TRUE) -> compoplot(dapc3, subset = temp, cleg = 0.6, posi = list(x = 0, -+ y = 1.2), lab = lab) +> par(mar=c(8,4,5,1), xpd=TRUE) +> compoplot(dapc3, subset=temp, cleg=.6, posi=list(x=0,y=1.2),lab=lab) \end{Sinput} \end{Schunk} -\includegraphics{figs/dapc-037} +\includegraphics{dapc-037} \noindent Admixture appears to be the strongest between a few breeds (Blonde d'Aquitaine, Bretonne Pie-Noire, Limousine and Gascone). Some features are fairly surprising; for instance, the last individual is @@ -1102,8 +1077,7 @@ \begin{Sinput} > data(microbov) > set.seed(2) -> kept.id <- unlist(tapply(1:nInd(microbov), pop(microbov), function(e) sample(e, -+ 20, replace = FALSE))) +> kept.id <- unlist(tapply(1:nInd(microbov), pop(microbov), function(e) sample(e, 20,replace=FALSE))) > x <- microbov[kept.id] > x.sup <- microbov[-kept.id] > nInd(x) @@ -1126,8 +1100,8 @@ supplementary individuals: \begin{Schunk} \begin{Sinput} -> dapc4 <- dapc(x, n.pca = 20, n.da = 15) -> pred.sup <- predict.dapc(dapc4, newdata = x.sup) +> dapc4 <- dapc(x,n.pca=20,n.da=15) +> pred.sup <- predict.dapc(dapc4, newdata=x.sup) > names(pred.sup) \end{Sinput} \begin{Soutput} @@ -1141,7 +1115,7 @@ 15 Levels: Borgou Zebu Lagunaire NDama Somba Aubrac ... Salers \end{Soutput} \begin{Sinput} -> pred.sup$ind.scores[1:5, 1:3] +> pred.sup$ind.scores[1:5,1:3] \end{Sinput} \begin{Soutput} LD1 LD2 LD3 @@ -1152,7 +1126,7 @@ 005 -4.718570 -0.200391 -0.9196541 \end{Soutput} \begin{Sinput} -> round(pred.sup$posterior[1:5, 1:5], 3) +> round(pred.sup$posterior[1:5, 1:5],3) \end{Sinput} \begin{Soutput} Borgou Zebu Lagunaire NDama Somba @@ -1173,26 +1147,23 @@ \begin{Schunk} \begin{Sinput} > col <- rainbow(length(levels(pop(x)))) -> col.points <- transp(col[as.integer(pop(x))], 0.2) -> scatter(dapc4, col = col, bg = "white", scree.da = 0, pch = "", -+ cstar = 0, clab = 0, xlim = c(-10, 10), legend = TRUE) -> par(xpd = TRUE) -> points(dapc4$ind.coord[, 1], dapc4$ind.coord[, 2], pch = 20, -+ col = col.points, cex = 5) +> col.points <- transp(col[as.integer(pop(x))],.2) +> scatter(dapc4, col=col, bg="white", scree.da=0, pch="", cstar=0, clab=0, xlim=c(-10,10), legend=TRUE) +> par(xpd=TRUE) +> points(dapc4$ind.coord[,1], dapc4$ind.coord[,2], pch=20, col=col.points, cex=5) > col.sup <- col[as.integer(pop(x.sup))] -> points(pred.sup$ind.scores[, 1], pred.sup$ind.scores[, 2], pch = 15, -+ col = transp(col.sup, 0.7), cex = 2) -> add.scatter.eig(dapc4$eig, 15, 1, 2, posi = "bottomright", inset = 0.02) +> points(pred.sup$ind.scores[,1], pred.sup$ind.scores[,2], pch=15, col=transp(col.sup,.7), cex=2) +> add.scatter.eig(dapc4$eig,15,1,2, posi="bottomright", inset=.02) \end{Sinput} \end{Schunk} -\includegraphics{figs/dapc-040} +\includegraphics{dapc-040} \noindent Light dots and ellipses correspond to the original analysis, while more solid squares indicate supplementary individuals. Results are fairly satisfying: \begin{Schunk} \begin{Sinput} -> mean(as.character(pred.sup$assign) == as.character(pop(x.sup))) +> mean(as.character(pred.sup$assign)==as.character(pop(x.sup))) \end{Sinput} \begin{Soutput} [1] 0.7549505 @@ -1204,10 +1175,10 @@ table of the actual cluster \textit{vs} the inferred one: \begin{Schunk} \begin{Sinput} -> table.value(table(pred.sup$assign, pop(x.sup)), col.lab = levels(pop(x.sup))) +> table.value(table(pred.sup$assign, pop(x.sup)), col.lab=levels(pop(x.sup))) \end{Sinput} \end{Schunk} -\includegraphics{figs/dapc-042} +\includegraphics{dapc-042} \noindent Columns correspond to actual clusters of the supplementary individuals, while rows correspond to inferred clusters. Modified: www/literature.html =================================================================== --- www/literature.html 2013-04-30 09:59:19 UTC (rev 1120) +++ www/literature.html 2013-05-08 13:19:07 UTC (rev 1121) @@ -63,6 +63,7 @@ + the bublisher's website]

@@ -95,6 +96,7 @@ + abstract]

- the paper presenting the spatial @@ -112,6 +114,7 @@ + principal component analysis (sPCA, function spca), global and @@ -132,6 +135,7 @@ + cryptic spatial patterns in genetic variability by a new multivariate method.  Heredity 101: 92-103. doi: @@ -153,6 +157,7 @@ + abstract]

@@ -176,6 +181,7 @@ + simulations of genealoies of haplotypes (haploGen):
Jombart T, Eggo RM, Dodd PJ, Balloux F (2010) @@ -199,6 +205,7 @@ + of Principal Components (DAPC, functions find.clusters @@ -230,6 +237,7 @@ + Behaviour76: 87-95.

@@ -250,6 +258,7 @@ + Genomics
9: 256.
@@ -285,6 +294,7 @@ + marmota.Molecular @@ -299,6 +309,7 @@ + Ecology 18: 1491-1503.

@@ -352,6 +363,7 @@ + australis in North America. Biological Invasions. doi: 10.1007/s10530-010-9699-6.
@@ -509,6 +521,7 @@ + Oct 6. [Epub ahead of print]

[24] SANTOS, H., BURBAN, C., ROUSSELET, J., @@ -527,6 +540,7 @@ + pityocampa, Lepidoptera, Notodontidae). Journal of Evolutionary Biology, no. doi: 10.1111/j.1420-9101.2010.02147.x
@@ -550,6 +564,7 @@ + Vol. Sci. Pap. ICCAT, 65(3): 988-995

[26] Vandewoestijne @@ -566,6 +581,7 @@ + S, Van Dyck H, 2010 Population Genetic @@ -585,6 +601,7 @@ + ONE5(11): e13810. doi:10.1371/journal.pone.0013810
@@ -615,6 +632,7 @@ +
DOI: 10.1007/s10329-010-0232-4

@@ -682,6 +700,7 @@ + tetradactylum: Polynemidae). Molecular Ecology, 20: no. doi: 10.1111/j.1365-294X.2011.05097.x

@@ -707,6 +726,7 @@ + neoformans Variety grubii Multilocus Sequence Types from Thailand Are Consistent with an Ancestral African Origin. PLoS @@ -964,6 +984,7 @@ + 10.1007/s10709-012-9640-2

[76] Samantha Baldwin, Meeghan Pither-Joyce, Kathryn Wright, @@ -1608,8 +1629,13 @@ divergence in Atlantic cod. Molecular Ecology, 22: 2653?2667. doi: 10.1111/mec.12284

+ [186] Costantini F, Carlesi L, Abbiati M (2013) Quantifying + Spatial Genetic Structuring in Mesophotic Populations of the + Precious Coral Corallium rubrum. PLoS ONE 8(4): e61546. + doi:10.1371/journal.pone.0061546


+

* adegenet not or wrongly cited, but actually used in the paper.
From noreply at r-forge.r-project.org Tue May 14 11:17:14 2013 From: noreply at r-forge.r-project.org (noreply at r-forge.r-project.org) Date: Tue, 14 May 2013 11:17:14 +0200 (CEST) Subject: [adegenet-commits] r1122 - www Message-ID: <20130514091714.E605B18508E@r-forge.r-project.org> Author: jombart Date: 2013-05-14 11:17:14 +0200 (Tue, 14 May 2013) New Revision: 1122 Modified: www/literature.html Log: +1 ref Modified: www/literature.html =================================================================== --- www/literature.html 2013-05-08 13:19:07 UTC (rev 1121) +++ www/literature.html 2013-05-14 09:17:14 UTC (rev 1122) @@ -64,6 +64,7 @@ + the bublisher's website]

@@ -97,6 +98,7 @@ + abstract]

- the paper presenting the spatial @@ -115,6 +117,7 @@ + principal component analysis (sPCA, function spca), global and @@ -136,6 +139,7 @@ + cryptic spatial patterns in genetic variability by a new multivariate method.  Heredity 101: 92-103. doi: @@ -158,6 +162,7 @@ + abstract]

@@ -182,6 +187,7 @@ + simulations of genealoies of haplotypes (haploGen):
Jombart T, Eggo RM, Dodd PJ, Balloux F (2010) @@ -206,6 +212,7 @@ + of Principal Components (DAPC, functions find.clusters @@ -238,6 +245,7 @@ + Behaviour76: 87-95.

@@ -259,6 +267,7 @@ + Genomics
9: 256.
@@ -295,6 +304,7 @@ + marmota.Molecular @@ -310,6 +320,7 @@ + Ecology 18: 1491-1503.

@@ -364,6 +375,7 @@ + australis in North America. Biological Invasions. doi: 10.1007/s10530-010-9699-6.
@@ -522,6 +534,7 @@ + Oct 6. [Epub ahead of print]

[24] SANTOS, H., BURBAN, C., ROUSSELET, J., @@ -541,6 +554,7 @@ + pityocampa, Lepidoptera, Notodontidae). Journal of Evolutionary Biology, no. doi: 10.1111/j.1420-9101.2010.02147.x
@@ -565,6 +579,7 @@ + Vol. Sci. Pap. ICCAT, 65(3): 988-995

[26] Vandewoestijne @@ -582,6 +597,7 @@ + S, Van Dyck H, 2010 Population Genetic @@ -602,6 +618,7 @@ + ONE5(11): e13810. doi:10.1371/journal.pone.0013810
@@ -633,6 +650,7 @@ +
DOI: 10.1007/s10329-010-0232-4

@@ -701,6 +719,7 @@ + tetradactylum: Polynemidae). Molecular Ecology, 20: no. doi: 10.1111/j.1365-294X.2011.05097.x

@@ -727,6 +746,7 @@ + neoformans Variety grubii Multilocus Sequence Types from Thailand Are Consistent with an Ancestral African Origin.
PLoS @@ -985,6 +1005,7 @@ + 10.1007/s10709-012-9640-2

[76] Samantha Baldwin, Meeghan Pither-Joyce, Kathryn Wright, @@ -1634,8 +1655,11 @@ Precious Coral Corallium rubrum. PLoS ONE 8(4): e61546. doi:10.1371/journal.pone.0061546

+ [187] Hendry et al. (2013) EVOLUTIONARY INFERENCES FROM THE + ANALYSIS OF EXCHANGEABILITY. Evolution. DOI: 10.1111/evo.12160


+

* adegenet not or wrongly cited, but actually used in the paper.
From noreply at r-forge.r-project.org Tue May 14 15:39:36 2013 From: noreply at r-forge.r-project.org (noreply at r-forge.r-project.org) Date: Tue, 14 May 2013 15:39:36 +0200 (CEST) Subject: [adegenet-commits] r1123 - in pkg: . R man Message-ID: <20130514133936.149951851D8@r-forge.r-project.org> Author: jombart Date: 2013-05-14 15:39:35 +0200 (Tue, 14 May 2013) New Revision: 1123 Modified: pkg/ChangeLog pkg/R/auxil.R pkg/man/auxil.Rd Log: Fixes to auxiliary color functions Modified: pkg/ChangeLog =================================================================== --- pkg/ChangeLog 2013-05-14 09:17:14 UTC (rev 1122) +++ pkg/ChangeLog 2013-05-14 13:39:35 UTC (rev 1123) @@ -1,3 +1,16 @@ + CHANGES IN ADEGENET VERSION 1.3-8 + +NEW FEATURES + + o new palettes: azur, wasp + + o new function any2col translates (numeric, factor, character) + vectors into colors, also providing information for a legend + + o + + + CHANGES IN ADEGENET VERSION 1.3-7 NEW FEATURES Modified: pkg/R/auxil.R =================================================================== --- pkg/R/auxil.R 2013-05-14 09:17:14 UTC (rev 1122) +++ pkg/R/auxil.R 2013-05-14 13:39:35 UTC (rev 1123) @@ -197,7 +197,7 @@ ## translate numeric values into colors of a palette num2col <- function(x, col.pal=heat.colors, reverse=FALSE, x.min=min(x), x.max=max(x), na.col="green"){ - if(any(is.na(x))) warning("NAs detected in x") + ## if(any(is.na(x))) warning("NAs detected in x") x[x < x.min] <- x.min x[x > x.max] <- x.max x <- x-x.min # min=0 @@ -225,17 +225,21 @@ ########### ## translate a factor into colors of a palette ## colors are randomized based on the provided seed -fac2col <- function(x, col.pal=funky, na.col="grey", seed=1){ +fac2col <- function(x, col.pal=funky, na.col="grey", seed=NULL){ ## get factors and levels x <- factor(x) lev <- levels(x) nlev <- length(lev) ## get colors corresponding to levels - set.seed(seed) - newseed <- round(runif(1,1,1e9)) - on.exit(set.seed(newseed)) - col <- sample(col.pal(nlev)) + if(!is.null(seed)){ + set.seed(seed) + newseed <- round(runif(1,1,1e9)) + on.exit(set.seed(newseed)) + col <- sample(col.pal(nlev)) + } else { + col <- col.pal(nlev) + } ## get output colors res <- rep(na.col, length(x)) @@ -246,20 +250,45 @@ } +########### +## any2col +########### +any2col <- function(x, col.pal=seasun, na.col="transparent"){ + ## handle numeric data + if(is.numeric(x)){ + col <- num2col(x, col.pal=col.pal) + leg.col <- num2col(pretty(x), x.min=min(x, na.rm=TRUE), + x.max=max(x, na.rm=TRUE), col.pal=col.pal, + na.col=na.col) + leg.txt <- pretty(x) + } else{ ## handle factor + x <- factor(x) + col <- fac2col(x, col.pal=col.pal) + leg.col <- col.pal(length(levels(x))) + leg.txt <- levels(x) + } + + return(list(col=col, leg.col=leg.col, leg.txt=leg.txt)) +} # end any2col + + + ## pre-defined palettes ## ## mono color bluepal <- colorRampPalette(c("lightgrey","blue")) redpal <- colorRampPalette(c("lightgrey","red")) -greenpal <- colorRampPalette(c("lightgrey","green")) +greenpal <- colorRampPalette(c("lightgrey","green3")) ## bi-color -flame <- colorRampPalette(c("gold","red")) +flame <- colorRampPalette(c("gold","red3")) +azur <- colorRampPalette(c("gold","royalblue")) ## tri-color seasun <- colorRampPalette(c("blue","gold","red")) lightseasun <- colorRampPalette(c("deepskyblue2","gold","red1")) deepseasun <- colorRampPalette(c("blue2","gold","red2")) +wasp <- colorRampPalette(c("yellow2","brown", "black")) ## psychedelic -funky <- colorRampPalette(c("blue","green3","gold","orange","red","brown4","purple")) +funky <- colorRampPalette(c("blue","green3","gold","orange","red","brown4","purple","pink2")) Modified: pkg/man/auxil.Rd =================================================================== --- pkg/man/auxil.Rd 2013-05-14 09:17:14 UTC (rev 1122) +++ pkg/man/auxil.Rd 2013-05-14 13:39:35 UTC (rev 1123) @@ -10,14 +10,17 @@ \alias{corner} \alias{num2col} \alias{fac2col} +\alias{any2col} \alias{transp} \alias{bluepal} \alias{redpal} \alias{greenpal} \alias{flame} +\alias{azur} \alias{seasun} \alias{lightseasun} \alias{deepseasun} +\alias{wasp} \alias{funky} \title{ Auxiliary functions for adegenet} @@ -27,41 +30,51 @@ variables (numeric or factors) onto a color scale, adding transparency to existing colors, pre-defined color palettes, extra functions to access documentation, and low-level treatment of character vectors. - + These functions are mostly auxiliary procedures used internally in adegenet, with the exception of, which opens the adegenet website in the default navigator.\cr - These items include:\cr - - \code{adegenetWeb}: opens the adegenet website in a web navigator - - \code{num2col}: translates a numeric vector into colors. \cr - - \code{fac2col}: translates a numeric vector into colors. \cr - - \code{transp}: adds transparency to a vector of colors. Note that - transparent colors are not supported on some graphical devices.\cr - - \code{corner}: adds text to a corner of a figure. \cr - - \code{checkType}: checks the type of markers being used in a - function and issues an error if appropriate.\cr - - \code{.rmspaces}: remove peripheric spaces in a character string. \cr - - \code{.genlab}: generate labels in a correct alphanumeric ordering. \cr - - \code{.readExt}: read the extension of a given file. \cr + These items include: + \itemize{ + \item \code{adegenetWeb}: opens the adegenet website in a web navigator + \item \code{num2col}: translates a numeric vector into colors. + \item \code{fac2col}: translates a factor into colors. + \item \code{any2col}: translates a vector of type numeric, character + or factor into colors. + \item \code{transp}: adds transparency to a vector of colors. Note that + transparent colors are not supported on some graphical devices. + \item \code{corner}: adds text to a corner of a figure. + \item \code{checkType}: checks the type of markers being used in a + function and issues an error if appropriate. + \item \code{.rmspaces}: remove peripheric spaces in a character string. + \item \code{.genlab}: generate labels in a correct alphanumeric ordering. + \item \code{.readExt}: read the extension of a given file. + } - Color palettes include:\cr - - \code{bluepal}: white->blue\cr - - \code{redpal}: white->red\cr - - \code{greenpal}: white->green\cr - - \code{flame}: gold->red\cr - - \code{seasun}: blue->gold->red\cr - - \code{lightseasun}: blue->gold->red (light variant)\cr - - \code{deepseasun}: blue->gold->red (deep variant)\cr - - \code{funky}: many colors\cr + Color palettes include: + \itemize{ + \item \code{bluepal}: white->blue + \item \code{redpal}: white->red + \item \code{greenpal}: white->green + \item \code{flame}: gold->red + \item \code{azur}: gold->blue + \item \code{seasun}: blue->gold->red + \item \code{lightseasun}: blue->gold->red (light variant) + \item \code{deepseasun}: blue->gold->red (deep variant) + \item \code{wasp}: gold->brown->black + \item \code{funky}: many colors + } } \usage{ adegenetWeb() .genlab(base, n) corner(text, posi="topleft", inset=0.1, \dots) num2col(x, col.pal=heat.colors, reverse=FALSE, - x.min=min(x), x.max=max(x), na.col="green") -fac2col(x, col.pal=funky, na.col="grey", seed=1) + x.min=min(x,na.rm=TRUE), x.max=max(x,na.rm=TRUE), + na.col="green") +fac2col(x, col.pal=funky, na.col="grey", seed=NULL) +any2col(x, col.pal=seasun, na.col="transparent") transp(col, alpha=.5) } \arguments{ @@ -82,13 +95,19 @@ \item{x.max}{the maximal value from which to start the color scale} \item{na.col}{the color to be used for missing values (NAs)} \item{seed}{a seed for R's random number generated, used to fix the - random permutation of colors in the palette used.} + random permutation of colors in the palette used; if NULL, no + randomization is used and the colors are taken from the palette + according to the ordering of the levels.} \item{col}{a vector of colors} \item{alpha}{a numeric value between 0 and 1 representing the alpha - coefficient; 0: total transparency; 1: no transparency.} + coefficient; 0: total transparency; 1: no transparency.} } \value{ For \code{.genlab}, a character vector of size "n". + \code{num2col} and \code{fac2col} return a vector of + colors. \code{any2col} returns a list with the following components: + \code{$col} (a vector of colors), \code{$leg.col} (colors for the + legend), and \code{$leg.txt} (text for the legend). } \author{Thibaut Jombart \email{t.jombart at imperial.ac.uk} } \examples{ @@ -108,8 +127,8 @@ plot(1:100, col=num2col(1:100), pch=20, cex=4) plot(1:100, col=num2col(1:100, col.pal=bluepal), pch=20, cex=4) plot(1:100, col=num2col(1:100, col.pal=flame), pch=20, cex=4) -plot(1:100, col=num2col(1:100, col.pal=seasun), pch=20, cex=4) -plot(1:100, col=num2col(1:100, col.pal=seasun,rev=TRUE), pch=20, cex=4) +plot(1:100, col=num2col(1:100, col.pal=wasp), pch=20, cex=4) +plot(1:100, col=num2col(1:100, col.pal=azur,rev=TRUE), pch=20, cex=4) ## factor as colors using fac2col dat <- cbind(c(rnorm(50,8), rnorm(100), rnorm(150,3), @@ -119,5 +138,16 @@ plot(dat, col=transp(fac2col(fac)), pch=19, cex=4) plot(dat, col=transp(fac2col(fac,seed=2)), pch=19, cex=4) +## use of any2col +x <- factor(1:10) +col.info <- any2col(x, col.pal=funky) +plot(x, col=col.info$col, main="Use of any2col on a factor") +legend("bottomleft", fill=col.info$leg.col, legend=col.info$leg.txt, bg="white") + +x <- 100:1 +col.info <- any2col(x, col.pal=wasp) +barplot(x, col=col.info$col, main="Use of any2col on a numeric") +legend("bottomleft", fill=col.info$leg.col, legend=col.info$leg.txt, bg="white") + } \keyword{manip} \ No newline at end of file From noreply at r-forge.r-project.org Tue May 14 15:39:54 2013 From: noreply at r-forge.r-project.org (noreply at r-forge.r-project.org) Date: Tue, 14 May 2013 15:39:54 +0200 (CEST) Subject: [adegenet-commits] r1124 - pkg/R Message-ID: <20130514133954.84C9B184988@r-forge.r-project.org> Author: jombart Date: 2013-05-14 15:39:54 +0200 (Tue, 14 May 2013) New Revision: 1124 Modified: pkg/R/auxil.R Log: Fixes to auxiliary color functions Modified: pkg/R/auxil.R =================================================================== --- pkg/R/auxil.R 2013-05-14 13:39:35 UTC (rev 1123) +++ pkg/R/auxil.R 2013-05-14 13:39:54 UTC (rev 1124) @@ -196,7 +196,8 @@ ########### ## translate numeric values into colors of a palette num2col <- function(x, col.pal=heat.colors, reverse=FALSE, - x.min=min(x), x.max=max(x), na.col="green"){ + x.min=min(x,na.rm=TRUE), x.max=max(x,na.rm=TRUE), + na.col="green"){ ## if(any(is.na(x))) warning("NAs detected in x") x[x < x.min] <- x.min x[x > x.max] <- x.max From noreply at r-forge.r-project.org Tue May 14 16:30:32 2013 From: noreply at r-forge.r-project.org (noreply at r-forge.r-project.org) Date: Tue, 14 May 2013 16:30:32 +0200 (CEST) Subject: [adegenet-commits] r1125 - pkg Message-ID: <20130514143032.5DB131855F7@r-forge.r-project.org> Author: greatsage Date: 2013-05-14 16:30:28 +0200 (Tue, 14 May 2013) New Revision: 1125 Modified: pkg/ChangeLog Log: updated Changelog -- Fede Modified: pkg/ChangeLog =================================================================== --- pkg/ChangeLog 2013-05-14 13:39:54 UTC (rev 1124) +++ pkg/ChangeLog 2013-05-14 14:30:28 UTC (rev 1125) @@ -7,7 +7,8 @@ o new function any2col translates (numeric, factor, character) vectors into colors, also providing information for a legend - o + o new function xval.dapc (and its wrapper xval), that performs + cross-validation for a dapc analysis. From noreply at r-forge.r-project.org Tue May 14 17:20:13 2013 From: noreply at r-forge.r-project.org (noreply at r-forge.r-project.org) Date: Tue, 14 May 2013 17:20:13 +0200 (CEST) Subject: [adegenet-commits] r1126 - pkg/R Message-ID: <20130514152013.AF17018444A@r-forge.r-project.org> Author: jombart Date: 2013-05-14 17:20:13 +0200 (Tue, 14 May 2013) New Revision: 1126 Modified: pkg/R/auxil.R Log: fixed handling of missing data Modified: pkg/R/auxil.R =================================================================== --- pkg/R/auxil.R 2013-05-14 14:30:28 UTC (rev 1125) +++ pkg/R/auxil.R 2013-05-14 15:20:13 UTC (rev 1126) @@ -257,14 +257,14 @@ any2col <- function(x, col.pal=seasun, na.col="transparent"){ ## handle numeric data if(is.numeric(x)){ - col <- num2col(x, col.pal=col.pal) + col <- num2col(x, col.pal=col.pal, na.col=na.col) leg.col <- num2col(pretty(x), x.min=min(x, na.rm=TRUE), x.max=max(x, na.rm=TRUE), col.pal=col.pal, na.col=na.col) leg.txt <- pretty(x) } else{ ## handle factor x <- factor(x) - col <- fac2col(x, col.pal=col.pal) + col <- fac2col(x, col.pal=col.pal, na.col=na.col) leg.col <- col.pal(length(levels(x))) leg.txt <- levels(x) } From noreply at r-forge.r-project.org Tue May 14 17:21:11 2013 From: noreply at r-forge.r-project.org (noreply at r-forge.r-project.org) Date: Tue, 14 May 2013 17:21:11 +0200 (CEST) Subject: [adegenet-commits] r1127 - in pkg: R man Message-ID: <20130514152111.30C1918444A@r-forge.r-project.org> Author: jombart Date: 2013-05-14 17:21:10 +0200 (Tue, 14 May 2013) New Revision: 1127 Modified: pkg/R/auxil.R pkg/man/auxil.Rd Log: fixed handling of missing data Modified: pkg/R/auxil.R =================================================================== --- pkg/R/auxil.R 2013-05-14 15:20:13 UTC (rev 1126) +++ pkg/R/auxil.R 2013-05-14 15:21:10 UTC (rev 1127) @@ -197,7 +197,7 @@ ## translate numeric values into colors of a palette num2col <- function(x, col.pal=heat.colors, reverse=FALSE, x.min=min(x,na.rm=TRUE), x.max=max(x,na.rm=TRUE), - na.col="green"){ + na.col="transparent"){ ## if(any(is.na(x))) warning("NAs detected in x") x[x < x.min] <- x.min x[x > x.max] <- x.max @@ -226,7 +226,7 @@ ########### ## translate a factor into colors of a palette ## colors are randomized based on the provided seed -fac2col <- function(x, col.pal=funky, na.col="grey", seed=NULL){ +fac2col <- function(x, col.pal=funky, na.col="transparent", seed=NULL){ ## get factors and levels x <- factor(x) lev <- levels(x) Modified: pkg/man/auxil.Rd =================================================================== --- pkg/man/auxil.Rd 2013-05-14 15:20:13 UTC (rev 1126) +++ pkg/man/auxil.Rd 2013-05-14 15:21:10 UTC (rev 1127) @@ -72,8 +72,8 @@ corner(text, posi="topleft", inset=0.1, \dots) num2col(x, col.pal=heat.colors, reverse=FALSE, x.min=min(x,na.rm=TRUE), x.max=max(x,na.rm=TRUE), - na.col="green") -fac2col(x, col.pal=funky, na.col="grey", seed=NULL) + na.col="transparent") +fac2col(x, col.pal=funky, na.col="transparent", seed=NULL) any2col(x, col.pal=seasun, na.col="transparent") transp(col, alpha=.5) } From noreply at r-forge.r-project.org Tue May 14 23:54:19 2013 From: noreply at r-forge.r-project.org (noreply at r-forge.r-project.org) Date: Tue, 14 May 2013 23:54:19 +0200 (CEST) Subject: [adegenet-commits] r1128 - in pkg: . R man vignettes Message-ID: <20130514215419.9D7BB184ECF@r-forge.r-project.org> Author: jombart Date: 2013-05-14 23:54:19 +0200 (Tue, 14 May 2013) New Revision: 1128 Modified: pkg/ChangeLog pkg/R/dapc.R pkg/R/find.clust.R pkg/R/scale.R pkg/R/spca.R pkg/man/as.genlight.Rd pkg/man/ascore.Rd pkg/man/dapc.Rd pkg/man/dapcGraphics.Rd pkg/man/eHGDP.Rd pkg/man/find.clusters.Rd pkg/man/inbreeding.Rd pkg/man/loadingplot.Rd pkg/man/read.structure.Rd pkg/man/scale.Rd pkg/man/seqTrack.Rd pkg/man/sequences.Rd pkg/man/spca.Rd pkg/man/spcaIllus.Rd pkg/vignettes/adegenet-spca.Rnw Log: Fixed release 1.3-8; new version of xvalDapc;removed method argument in scaleGen Modified: pkg/ChangeLog =================================================================== --- pkg/ChangeLog 2013-05-14 15:21:10 UTC (rev 1127) +++ pkg/ChangeLog 2013-05-14 21:54:19 UTC (rev 1128) @@ -7,8 +7,8 @@ o new function any2col translates (numeric, factor, character) vectors into colors, also providing information for a legend - o new function xval.dapc (and its wrapper xval), that performs - cross-validation for a dapc analysis. + o new function xvalDapc which performs cross-validation for a dapc + analysis. Modified: pkg/R/dapc.R =================================================================== --- pkg/R/dapc.R 2013-05-14 15:21:10 UTC (rev 1127) +++ pkg/R/dapc.R 2013-05-14 21:54:19 UTC (rev 1128) @@ -64,7 +64,7 @@ ## keep relevant PCs - stored in XU X.rank <- sum(pcaX$eig > 1e-14) n.pca <- min(X.rank, n.pca) - if(n.pca >= N) stop("number of retained PCs of PCA is greater than N") + if(n.pca >= N) n.pca <- N-1 if(n.pca > N/3) warning("number of retained PCs of PCA may be too large (> N /3)\n results may be unstable ") n.pca <- round(n.pca) @@ -152,7 +152,7 @@ ## dapc.genind ############# dapc.genind <- function(x, pop=NULL, n.pca=NULL, n.da=NULL, - scale=FALSE, scale.method=c("sigma", "binom"), truenames=TRUE, var.contrib=TRUE, pca.info=TRUE, + scale=FALSE, truenames=TRUE, var.contrib=TRUE, pca.info=TRUE, pca.select=c("nbEig","percVar"), perc.pca=NULL, ...){ ## FIRST CHECKS @@ -176,7 +176,7 @@ ## PERFORM PCA ## maxRank <- min(dim(x at tab)) - X <- scaleGen(x, center = TRUE, scale = scale, method = scale.method, + X <- scaleGen(x, center = TRUE, scale = scale, missing = "mean", truenames = truenames) ## CALL DATA.FRAME METHOD ## @@ -299,7 +299,7 @@ N <- nInd(x) X.rank <- sum(pcaX$eig > 1e-14) n.pca <- min(X.rank, n.pca) - if(n.pca >= N) stop("number of retained PCs of PCA is greater than N") + if(n.pca >= N) n.pca <- N-1 if(n.pca > N/3) warning("number of retained PCs of PCA may be too large (> N /3)\n results may be unstable ") U <- pcaX$loadings[, 1:n.pca, drop=FALSE] # principal axes @@ -983,37 +983,89 @@ -## ############ -## ## crossval -## ############ +############ +## crossval +############ -xval.dapc <- function(object, n.pca, n.da, training.set = 90, ...){ - training.set = training.set/100 - kept.id <- unlist(tapply(1:nInd(object), pop(object), function(e) {pop.size = length(e); pop.size.train = round(pop.size * training.set); sample(e, pop.size.train, replace=FALSE)})) - training <- object[kept.id] - validating <- object[-kept.id] - post = vector(mode = 'list', length = n.pca) - asgn = vector(mode = 'list', length = n.pca) - ind = vector(mode = 'list', length = n.pca) - mtch = vector(mode = 'list', length = n.pca) - for(i in 1:n.pca){ - dapc.base = dapc(training, n.pca = i, n.da = 15) - dapc.p = predict.dapc(dapc.base, newdata = validating) - match.prp = mean(as.character(dapc.p$assign)==as.character(pop(validating))) - post[[i]] = dapc.p$posterior - asgn[[i]] = dapc.p$assign - ind[[i]] = dapc.p$ind.score - mtch[[i]] = match.prp - } - res = list(assign = asgn, posterior = post, ind.score = ind, match.prp = mtch) - return(res) -} # end of xval.dapc +xvalDapc <- function (x, ...) UseMethod("xvalDapc") -xval <- function (object, n.pca, n.da, training.set, ...) UseMethod("xval") -xval.genind <- function(object, n.pca, n.da, training.set = 90, ...){ - res = xval.dapc(object = object, n.pca = n.pca, n.da = n.da, training.set = training.set) - return(res) +xvalDapc.data.frame <- function(x, grp, n.pca.max, n.da=NULL, training.set = 1/2, + center=TRUE, scale=FALSE, + n.pca=NULL, n.rep=10, ...){ + + ## CHECKS ## + grp <- factor(grp) + n.pca <- n.pca[n.pca>0] + if(is.null(n.da)) { + n.da <- length(levels(grp))-1 + } + + ## GET TRAINING SET SIZE ## + N <- nrow(x) + N.training <- round(N*training.set) + + ## GET FULL PCA ## + pcaX <- dudi.pca(x, nf=n.pca.max, scannf=FALSE, center=center, scale=scale) + n.pca.max <- min(n.pca.max,pcaX$rank,N.training-1) + + ## DETERMINE N.PCA IF NEEDED ## + if(is.null(n.pca)){ + n.pca <- round(pretty(1:n.pca.max,10)) + } + n.pca <- n.pca[n.pca>0 & n.pca<(N.training-1)] + + ## FUNCTION GETTING THE % OF ACCURATE PREDICTION FOR ONE NUMBER OF PCA PCs ## + ## n.pca is a number of retained PCA PCs + get.prop.pred <- function(n.pca){ + f1 <- function(){ + toKeep <- sample(1:N, N.training) + temp.pca <- pcaX + temp.pca$li <- temp.pca$li[toKeep,,drop=FALSE] + temp.dapc <- suppressWarnings(dapc(x[toKeep,,drop=FALSE], grp[toKeep], n.pca=n.pca, n.da=n.da, dudi=temp.pca)) + temp.pred <- predict.dapc(temp.dapc, newdata=x[-toKeep,,drop=FALSE]) + return(mean(temp.pred$assign==grp[-toKeep])) + } + return(replicate(n.rep, f1())) + } + + + ## GET %SUCCESSFUL OF ACCURATE PREDICTION FOR ALL VALUES ## + res.all <- unlist(lapply(n.pca, get.prop.pred)) + res <- list(success=res.all, n.pca=factor(rep(n.pca, each=n.rep))) + return(res) } + + +xvalDapc.matrix <- xvalDapc.data.frame + + +## There's a bunch of problems down there, commenting it for now? +## xval.dapc <- function(object, n.pca, n.da, training.set = 90, ...){ +## training.set = training.set/100 +## kept.id <- unlist(tapply(1:nInd(object), pop(object), function(e) {pop.size = length(e); pop.size.train = round(pop.size * training.set); sample(e, pop.size.train, replace=FALSE)})) # this can't work: nInd/pop not defined for DAPC objects +## training <- object[kept.id] +## validating <- object[-kept.id] +## post = vector(mode = 'list', length = n.pca) +## asgn = vector(mode = 'list', length = n.pca) +## ind = vector(mode = 'list', length = n.pca) +## mtch = vector(mode = 'list', length = n.pca) +## for(i in 1:n.pca){ +## dapc.base = dapc(training, n.pca = i, n.da = 15) # Why 15?? +## dapc.p = predict.dapc(dapc.base, newdata = validating) +## match.prp = mean(as.character(dapc.p$assign)==as.character(pop(validating))) +## post[[i]] = dapc.p$posterior +## asgn[[i]] = dapc.p$assign +## ind[[i]] = dapc.p$ind.score +## mtch[[i]] = match.prp +## } +## res = list(assign = asgn, posterior = post, ind.score = ind, match.prp = mtch) +## return(res) +## } # end of xval.dapc + +## xval.genind <- function(object, n.pca, n.da, training.set = 90, ...){ +## res = xval.dapc(object = object, n.pca = n.pca, n.da = n.da, training.set = training.set) +## return(res) +## } ## ############### ## ## randtest.dapc ## ############### Modified: pkg/R/find.clust.R =================================================================== --- pkg/R/find.clust.R 2013-05-14 15:21:10 UTC (rev 1127) +++ pkg/R/find.clust.R 2013-05-14 21:54:19 UTC (rev 1128) @@ -193,7 +193,7 @@ find.clusters.genind <- function(x, clust=NULL, n.pca=NULL, n.clust=NULL, stat=c("BIC", "AIC", "WSS"), choose.n.clust=TRUE, criterion=c("diffNgroup", "min","goesup", "smoothNgoesup", "goodfit"), max.n.clust=round(nrow(x at tab)/10), n.iter=1e5, n.start=10, - scale=FALSE, scale.method=c("sigma", "binom"), truenames=TRUE, ...){ + scale=FALSE, truenames=TRUE, ...){ ## CHECKS ## if(!require(ade4, quietly=TRUE)) stop("ade4 library is required.") @@ -210,7 +210,7 @@ ## PERFORM PCA ## maxRank <- min(dim(x at tab)) - X <- scaleGen(x, center = TRUE, scale = scale, method = scale.method, + X <- scaleGen(x, center = TRUE, scale = scale, missing = "mean", truenames = truenames) ## CALL DATA.FRAME METHOD Modified: pkg/R/scale.R =================================================================== --- pkg/R/scale.R 2013-05-14 15:21:10 UTC (rev 1127) +++ pkg/R/scale.R 2013-05-14 21:54:19 UTC (rev 1128) @@ -4,13 +4,11 @@ setGeneric("scaleGen", function(x,...){standardGeneric("scaleGen")}) setMethod("scaleGen", "genind", function(x, center=TRUE, scale=TRUE, - method=c("sigma", "binom"), missing=c("NA","0","mean"), truenames=TRUE){ + missing=c("NA","0","mean"), truenames=TRUE){ THRES <- 1e-10 - method <- match.arg(method) missing <- match.arg(missing) ## checkType(x) - if(method=="binom" & x at type=="PA") stop("This scaling is not available for presence/absence markers.") ## handle "missing" arg if(missing %in% c("0","mean")){ @@ -18,7 +16,7 @@ } ## handle specific cases - if(scale[1] & tolower(method)=="binom"){ + if(scale[1]){ ## get allele freq temp <- apply(x$tab,2,mean,na.rm=TRUE) if(x at type=="codom"){ @@ -59,13 +57,11 @@ setMethod("scaleGen", "genpop", function(x, center=TRUE, scale=TRUE, - method=c("sigma", "binom"), missing=c("NA","0","mean"), truenames=TRUE){ + missing=c("NA","0","mean"), truenames=TRUE){ THRES <- 1e-10 - method <- match.arg(method) missing <- match.arg(missing) ## checkType(x) - if(method=="binom" & x at type=="PA") stop("This scaling is not available for presence/absence markers.") ## make allele frequencies here if(x at type=="codom"){ @@ -75,7 +71,7 @@ } ## handle specific cases - if(scale[1] & tolower(method)=="binom"){ + if(scale[1]){ ## get allele freq temp <- apply(X,2,mean,na.rm=TRUE) if(x at type=="codom"){ Modified: pkg/R/spca.R =================================================================== --- pkg/R/spca.R 2013-05-14 15:21:10 UTC (rev 1127) +++ pkg/R/spca.R 2013-05-14 21:54:19 UTC (rev 1128) @@ -16,7 +16,7 @@ # spca genind ################ spca <- function(obj, xy=NULL, cn=NULL, matWeight=NULL, - scale=FALSE, scale.method=c("sigma","binom"), + scale=FALSE, scannf=TRUE, nfposi=1, nfnega=1, type=NULL, ask=TRUE, plot.nb=TRUE, edit.nb=FALSE, truenames=TRUE, d1=NULL, d2=NULL, k=NULL, a=NULL, dmin=NULL){ @@ -89,7 +89,7 @@ } ## handle NAs, centring and scaling - X <- scaleGen(obj, center=TRUE, scale=scale, method=scale.method, missing="mean", truenames=truenames) + X <- scaleGen(obj, center=TRUE, scale=scale, missing="mean", truenames=truenames) ## perform analyses pcaX <- dudi.pca(X, center=FALSE, scale=FALSE, scannf=FALSE) Modified: pkg/man/as.genlight.Rd =================================================================== --- pkg/man/as.genlight.Rd 2013-05-14 15:21:10 UTC (rev 1127) +++ pkg/man/as.genlight.Rd 2013-05-14 21:54:19 UTC (rev 1128) @@ -45,6 +45,7 @@ - \code{\linkS4class{genind}} } \examples{ +\dontrun{ ## data to be converted dat <- list(toto=c(1,1,0,0,2,2,1,2,NA), titi=c(NA,1,1,0,1,1,1,0,0), tata=c(NA,0,3, NA,1,1,1,0,0)) @@ -58,7 +59,7 @@ identical(x1,x2) identical(x1,x3) +} - } \keyword{classes} Modified: pkg/man/ascore.Rd =================================================================== --- pkg/man/ascore.Rd 2013-05-14 15:21:10 UTC (rev 1127) +++ pkg/man/ascore.Rd 2013-05-14 21:54:19 UTC (rev 1128) @@ -10,7 +10,8 @@ \usage{ a.score(x, n.sim=10, \ldots) -optim.a.score(x, n.pca=1:ncol(x$tab), smart=TRUE, n=10, plot=TRUE, n.sim=10, n.da=length(levels(x$grp)), \ldots) +optim.a.score(x, n.pca=1:ncol(x$tab), smart=TRUE, n=10, plot=TRUE, + n.sim=10, n.da=length(levels(x$grp)), \ldots) } \arguments{ \item{x}{a \code{dapc} object.} Modified: pkg/man/dapc.Rd =================================================================== --- pkg/man/dapc.Rd 2013-05-14 15:21:10 UTC (rev 1127) +++ pkg/man/dapc.Rd 2013-05-14 21:54:19 UTC (rev 1128) @@ -9,9 +9,9 @@ \alias{print.dapc} \alias{summary.dapc} \alias{predict.dapc} -\alias{xval.dapc} -\alias{xval} -\alias{xval.genind} +\alias{xvalDapc} +\alias{xvalDapc.data.frame} +\alias{xvalDapc.matrix} \alias{as.lda} \alias{as.lda.dapc} \title{Discriminant Analysis of Principal Components (DAPC)} @@ -31,18 +31,19 @@ - \code{matrix} (only numeric data)\cr - \code{\linkS4class{genind}} objects (genetic markers)\cr - \code{\linkS4class{genlight}} objects (genome-wide SNPs) - + These methods all return an object with class \code{dapc}. Functions that can be applied to these objects are (the ".dapc" can be ommitted): - \code{print.dapc}: prints the content of a \code{dapc} object.\cr - - \code{summary.dapc}: extracts useful information from a \code{dapc} object.\cr + - \code{summary.dapc}: extracts useful information from a \code{dapc} + object.\cr - \code{predict.dapc}: predicts group memberships based on DAPC results.\cr - - \code{xval.dapc}: performs cross-validation of DAPC function varying the number of PCs and keeping the number of DAs fixed. - - \code{xval}: performs cross-validation of DAPC function varying the number of PCs and keeping the number of DAs fixed. - - \code{xval.genind}: performs cross-validation of DAPC function varying the number of PCs and keeping the number of DAs fixed. + - \code{xvalDapc}: performs cross-validation of DAPC using varying + numbers of PCs (and keeping the number of discriminant functions + fixed); it currently has methods for \code{data.frame} and \code{matrix}.\cr @@ -55,7 +56,6 @@ \code{as.lda} is a generic with a method for \code{dapc} object which converts these objects into outputs similar to that of \code{lda.default}. - } \usage{ \method{dapc}{data.frame}(x, grp, n.pca=NULL, n.da=NULL, center=TRUE, @@ -65,8 +65,8 @@ \method{dapc}{matrix}(x, \ldots) \method{dapc}{genind}(x, pop=NULL, n.pca=NULL, n.da=NULL, scale=FALSE, - scale.method=c("sigma", "binom"), truenames=TRUE, var.contrib=TRUE, - pca.info=TRUE, pca.select=c("nbEig","percVar"), perc.pca=NULL, \ldots) + truenames=TRUE, var.contrib=TRUE, pca.info=TRUE, + pca.select=c("nbEig","percVar"), perc.pca=NULL, \ldots) \method{dapc}{genlight}(x, pop = NULL, n.pca = NULL, n.da = NULL, scale = FALSE, var.contrib = TRUE, pca.info=TRUE, pca.select = c("nbEig", "percVar"), @@ -81,9 +81,13 @@ \method{predict}{dapc}(object, newdata, prior = object$prior, dimen, method = c("plug-in", "predictive", "debiased"), ...) -\method{xval}{dapc}(object, n.pca, n.da, training.set = 90, \ldots) +\method{xvalDapc}{data.frame}(x, grp, n.pca.max, n.da=NULL, + training.set = 1/2, center=TRUE, scale=FALSE, + n.pca=NULL, n.rep=10, \ldots) -\method{xval}{genind}(object, n.pca, n.da, training.set = 90, \ldots) +\method{xvalDapc}{matrix}(x, grp, n.pca.max, n.da=NULL, + training.set = 1/2, center=TRUE, scale=FALSE, + n.pca=NULL, n.rep=10, \ldots) } \arguments{ \item{x}{\code{a data.frame}, \code{matrix}, or \code{\linkS4class{genind}} @@ -100,8 +104,7 @@ \item{scale}{a \code{logical} indicating whether variables should be scaled (TRUE) or not (FALSE, default). Scaling consists in dividing variables by their (estimated) standard deviation to account for trivial differences in - variances. Further scaling options are available for \linkS4class{genind} - objects (see argument \code{scale.method}).} + variances.} \item{var.contrib}{a \code{logical} indicating whether the contribution of original variables (alleles, for \linkS4class{genind} objects) should be provided (TRUE, default) or not (FALSE). Such output can be useful, @@ -127,10 +130,6 @@ dimension reduction is not performed (saving computational time) but taken directly from this object.} \item{object}{a \code{dapc} object.} - \item{scale.method}{a \code{character} specifying the scaling method to be used - for allele frequencies, which must match "sigma" (usual estimate of standard - deviation) or "binom" (based on binomial distribution). See \code{\link{scaleGen}} for - further details.} \item{truenames}{a \code{logical} indicating whether true (i.e., user-specified) labels should be used in object outputs (TRUE, default) or not (FALSE).} \item{dudi}{optionally, a multivariate analysis with the class @@ -143,10 +142,11 @@ original ('training') data. In particular, variables must be exactly the same as in the original data. For \linkS4class{genind} objects, see \code{\link{repool}} to ensure matching of alleles.} - \item{training.set}{the percentage of individuals randomly chosen in each population - as the training set used for cross-validation. This value is applied to all groups/pops - defined in the object. The default is set to 90\%. - For meaningful cross-validation it is recommended not to go below 80\%} + \item{n.pca.max}{maximum number of PCA components to retain.} + \item{training.set}{the proportion of data (individuals) to be used + for the training set; defaults to one half.} + \item{n.rep}{the number of replicate to be used for each number of PCA + components retained.} \item{prior,dimen,method}{see \code{?predict.lda}.} } \details{ @@ -179,8 +179,9 @@ \item{tab}{matrix of retained principal components of PCA} \item{loadings}{principal axes of DAPC, giving coefficients of the linear combination of retained PCA axes.} - \item{ind.coord}{principal components of DAPC, giving the coordinates of individuals onto - principal axes of DAPC; also called the discriminant functions.} + \item{ind.coord}{principal components of DAPC, giving the coordinates + of individuals onto principal axes of DAPC; also called the + discriminant functions.} \item{grp.coord}{coordinates of the groups onto the principal axes of DAPC.} \item{posterior}{a data.frame giving posterior membership probabilities for all individuals and all clusters.} @@ -198,10 +199,12 @@ \code{assign.prop} (proportion of overall correct assignment), \code{assign.per.pop} (proportion of correct assignment per group), \code{prior.grp.size} (prior group sizes), and \code{post.grp.size} (posterior - group sizes), \code{xval.dapc}, \code{xval.genind} and \code{xval} (all return a list of four lists, each one with as - many items as cross-validation runs. The first item is a list of \code{assign} components, - the secon is a list of \code{posterior} components, the thirs is a list of \code{ind.score} - components and the fourth is a list of \code{match.prp} items, i.e. the prortion of the validation + group sizes), \code{xval.dapc}, \code{xval.genind} and \code{xval} + (all return a list of four lists, each one with as many items as + cross-validation runs. The first item is a list of \code{assign} + components, the secon is a list of \code{posterior} components, the + thirs is a list of \code{ind.score} components and the fourth is a + list of \code{match.prp} items, i.e. the prortion of the validation set correctly matched to its original population) } \references{ @@ -314,30 +317,12 @@ title("30 indiv popA, 30 indiv pop B, 30 hybrids") ## CROSS-VALIDATION ## -# select dataset -data(microbov) -summary(microbov) # the dataset contains 15 populations of different sizes +data(sim2pop) +xval <- xvalDapc(sim2pop at tab, pop(sim2pop), n.pca.max=100, n.rep=3) +xval +boxplot(xval$success~xval$n.pca, xlab="Number of PCA components", +ylab="Classification succes", main="DAPC - cross-validation") -# we take a fixed number of disriminant functions (15 in this case) -# and we test how the cross-validation does varying the number of PCs -# we specify the *maximum* number of PCs, and we will test how -# the cross-validation performs by going from 1 PC to the maximum -# we specified in the fucntion call - -crossval.test <- xval.dapc(microbov, n.pca = 40, n.da = 15, training.set = 90) - -attributes(crossval.test) # we get four lists of lists -# namely "assign" "posterior" "ind.score" "match.prp" -# a quick visual inspection of the cross-validation - -plot(unlist(crossval.test$match.prp)) - -# the use can also just call xval: -crossval.test2 <- xval(microbov, n.pca = 40, n.da = 15, training.set = 90) -plot(unlist(crossval.test2$match.prp)) - - - } Modified: pkg/man/dapcGraphics.Rd =================================================================== --- pkg/man/dapcGraphics.Rd 2013-05-14 15:21:10 UTC (rev 1127) +++ pkg/man/dapcGraphics.Rd 2013-05-14 21:54:19 UTC (rev 1128) @@ -22,23 +22,23 @@ } \usage{ \method{scatter}{dapc}(x, xax=1, yax=2, grp=x$grp, col=rainbow(length(levels(grp))), - pch=20, bg="lightgrey", solid=.7, scree.da=TRUE, - scree.pca=FALSE, posi.da="bottomright", - posi.pca="bottomleft", bg.inset="white", ratio.da=.25, - ratio.pca=.25, inset.da=0.02, inset.pca=0.02, - inset.solid=.5, onedim.filled=TRUE, mstree=FALSE, lwd=1, - lty=1, segcol="black", legend=FALSE, posi.leg="topright", - cleg=1, txt.leg=levels(grp), cstar = 1, cellipse = 1.5, - axesell = FALSE, label = levels(grp), clabel = 1, xlim = - NULL, ylim = NULL, grid = FALSE, addaxes = TRUE, origin = - c(0,0), include.origin = TRUE, sub = "", csub = 1, possub = - "bottomleft", cgrid = 1, pixmap = NULL, contour = NULL, area - = NULL, \ldots) + pch=20, bg="lightgrey", solid=.7, scree.da=TRUE, + scree.pca=FALSE, posi.da="bottomright", + posi.pca="bottomleft", bg.inset="white", ratio.da=.25, + ratio.pca=.25, inset.da=0.02, inset.pca=0.02, + inset.solid=.5, onedim.filled=TRUE, mstree=FALSE, lwd=1, + lty=1, segcol="black", legend=FALSE, posi.leg="topright", + cleg=1, txt.leg=levels(grp), cstar = 1, cellipse = 1.5, + axesell = FALSE, label = levels(grp), clabel = 1, xlim = + NULL, ylim = NULL, grid = FALSE, addaxes = TRUE, origin = + c(0,0), include.origin = TRUE, sub = "", csub = 1, possub = + "bottomleft", cgrid = 1, pixmap = NULL, contour = NULL, area + = NULL, \ldots) assignplot(x, only.grp=NULL, subset=NULL, new.pred=NULL, cex.lab=.75,pch=3) compoplot(x, only.grp=NULL, subset=NULL, new.pred=NULL, col=NULL, lab=NULL, - legend=TRUE, txt.leg=NULL, ncol=4, posi=NULL, cleg=.8, bg=transp("white"), ...) + legend=TRUE, txt.leg=NULL, ncol=4, posi=NULL, cleg=.8, bg=transp("white"), ...) } \arguments{ \item{x}{a \code{dapc} object.} @@ -156,7 +156,8 @@ inset.pca=c(.01,.3), lab=paste("year\n",2001:2006), axesel=FALSE, col=terrain.colors(10)) ## without ellipses, use legend for groups -scatter(dapc1, cell=0, cstar=0, scree.da=FALSE, clab=0, cex=3, solid=.4, bg="white", leg=TRUE, posi.leg="topleft") +scatter(dapc1, cell=0, cstar=0, scree.da=FALSE, clab=0, cex=3, +solid=.4, bg="white", leg=TRUE, posi.leg="topleft") ## only one axis scatter(dapc1,1,1,scree.da=FALSE, legend=TRUE, solid=.4,bg="white") @@ -174,7 +175,8 @@ dapc2 ## plot results -scatter(dapc2, scree.da=FALSE, leg=TRUE, txt.leg=paste("group", c('A','B')), col=c("red","blue")) +scatter(dapc2, scree.da=FALSE, leg=TRUE, txt.leg=paste("group", +c('A','B')), col=c("red","blue")) ## SNP contributions loadingplot(dapc2$var.contr) Modified: pkg/man/eHGDP.Rd =================================================================== --- pkg/man/eHGDP.Rd 2013-05-14 15:21:10 UTC (rev 1127) +++ pkg/man/eHGDP.Rd 2013-05-14 21:54:19 UTC (rev 1128) @@ -30,7 +30,7 @@ d'Etude du Polymorphisme Humain (CEPH). See reference [4] for Native American populations. - This copy of the dataset was prepared by Francois Balloux (f.balloux at imperial.ac.uk). + This copy of the dataset was prepared by Francois Balloux. } \references{ [1] Rosenberg NA, Pritchard JK, Weber JL, Cann HM, Kidd KK, et @@ -62,7 +62,8 @@ ## PERFORM DAPC - USE POPULATIONS AS CLUSTERS ## to reproduce exactly analyses from the paper, use "n.pca=1000" -dapc1 <- dapc(eHGDP, all.contrib=TRUE, scale=FALSE, n.pca=200, n.da=80) # takes 2 minutes +dapc1 <- dapc(eHGDP, all.contrib=TRUE, scale=FALSE, +n.pca=200, n.da=80) # takes 2 minutes dapc1 ## (see ?dapc for details about the output) @@ -70,7 +71,8 @@ ## SCREEPLOT OF EIGENVALUES -barplot(dapc1$eig, main="eHGDP - DAPC eigenvalues", col=c("red","green","blue", rep("grey", 1000))) +barplot(dapc1$eig, main="eHGDP - DAPC eigenvalues", +col=c("red","green","blue", rep("grey", 1000))) @@ -111,7 +113,8 @@ ## and then ## plot(grp$Kstat, type="b", col="blue") -grp <- find.clusters(eHGDP, max.n=30, n.pca=200, scale=FALSE, n.clust=4) # takes about 2 minutes +grp <- find.clusters(eHGDP, max.n=30, n.pca=200, +scale=FALSE, n.clust=4) # takes about 2 minutes names(grp) ## (see ?find.clusters for details about the output) @@ -120,7 +123,8 @@ ## PERFORM DAPC - USE POPULATIONS AS CLUSTERS ## to reproduce exactly analyses from the paper, use "n.pca=1000" -dapc2 <- dapc(eHGDP, pop=grp$grp, all.contrib=TRUE, scale=FALSE, n.pca=200, n.da=80) # takes around a 1 minute +dapc2 <- dapc(eHGDP, pop=grp$grp, all.contrib=TRUE, +scale=FALSE, n.pca=200, n.da=80) # takes around a 1 minute dapc2 @@ -131,7 +135,8 @@ ## MAP DAPC2 RESULTS if(require(maps)){ -xy <- cbind(eHGDP$other$popInfo$Longitude, eHGDP$other$popInfo$Latitude) +xy <- cbind(eHGDP$other$popInfo$Longitude, +eHGDP$other$popInfo$Latitude) myCoords <- apply(dapc2$ind.coord, 2, tapply, pop(eHGDP), mean) Modified: pkg/man/find.clusters.Rd =================================================================== --- pkg/man/find.clusters.Rd 2013-05-14 15:21:10 UTC (rev 1127) +++ pkg/man/find.clusters.Rd 2013-05-14 21:54:19 UTC (rev 1128) @@ -50,8 +50,7 @@ stat=c("BIC","AIC", "WSS"), choose.n.clust=TRUE, criterion=c("diffNgroup", "min","goesup", "smoothNgoesup", "goodfit"), max.n.clust=round(nrow(x at tab)/10), n.iter=1e5, - n.start=10, scale=FALSE, scale.method=c("sigma", "binom"), - truenames=TRUE, \ldots) + n.start=10, scale=FALSE, truenames=TRUE, \ldots) \method{find.clusters}{genlight}(x, clust=NULL, n.pca=NULL, n.clust=NULL, stat=c("BIC", "AIC", @@ -118,11 +117,7 @@ percentage (interactively, or via \code{perc.pca}). } \item{perc.pca}{a \code{numeric} value between 0 and 100 indicating the minimal percentage of the total variance of the data to be expressed by the - retained axes of PCA.} - \item{scale.method}{a \code{character} specifying the scaling method to be used - for allele frequencies, which must match "sigma" (usual estimate of standard - deviation) or "binom" (based on binomial distribution). See - \code{\link{scaleGen}} for further details.} + retained axes of PCA.} \item{truenames}{a \code{logical} indicating whether true (i.e., user-specified) labels should be used in object outputs (TRUE, default) or not (FALSE), in which case generic labels are used.} @@ -231,7 +226,8 @@ data(eHGDP) ## here, n.clust is specified, so that only on K value is used -grp <- find.clusters(eHGDP, max.n=30, n.pca=200, scale=FALSE, n.clust=4) # takes about 2 minutes +grp <- find.clusters(eHGDP, max.n=30, n.pca=200, scale=FALSE, +n.clust=4) # takes about 2 minutes names(grp) grp$Kstat grp$stat @@ -258,7 +254,8 @@ ## DETECTION WITH AIC (less clear-cut) foo.AIC <- find.clusters(sim2pop, n.pca=100, choose=FALSE, stat="AIC") -plot(foo.AIC$Kstat, type="o", xlab="number of clusters (K)", ylab="AIC", col="purple", main="Detection based on AIC") +plot(foo.AIC$Kstat, type="o", xlab="number of clusters (K)", +ylab="AIC", col="purple", main="Detection based on AIC") points(2, foo.AIC$Kstat[2], pch="x", cex=3) mtext(3, tex="'X' indicates the actual number of clusters") @@ -276,7 +273,8 @@ x plot(x) grp <- find.clusters(x, n.pca=100, choose=FALSE, stat="BIC") -plot(grp$Kstat, type="o", xlab="number of clusters (K)",ylab="BIC",main="find.clusters on a genlight object\n(two groups)") +plot(grp$Kstat, type="o", xlab="number of clusters (K)", +ylab="BIC",main="find.clusters on a genlight object\n(two groups)") } } \keyword{multivariate} Modified: pkg/man/inbreeding.Rd =================================================================== --- pkg/man/inbreeding.Rd 2013-05-14 15:21:10 UTC (rev 1127) +++ pkg/man/inbreeding.Rd 2013-05-14 21:54:19 UTC (rev 1128) @@ -11,7 +11,8 @@ 1 is acceptable. } \usage{ -inbreeding(x, pop = NULL, truenames = TRUE, res.type = c("sample", "function"), N = 200, M = N * 10) +inbreeding(x, pop = NULL, truenames = TRUE, res.type = c("sample","function"), + N = 200, M = N * 10) } \arguments{ \item{x}{an object of class \linkS4class{genind}.} @@ -22,9 +23,9 @@ used (TRUE, default) instead of generic labels (FALSE); used if res.type is "matrix".} \item{res.type}{a character string matching "sample" or "function", - specifying whether the output should be a function giving the density of probability - of F values ("function") or a sample of F values taken from this - distribution ("sample", default).} + specifying whether the output should be a function giving the density + of probability of F values ("function") or a sample of F values taken + from this distribution ("sample", default).} \item{N}{an integer indicating the size of the sample to be taken from the distribution of F values.} \item{M}{an integer indicating the number of different F values to be @@ -43,10 +44,10 @@ probability for an individual to inherit two identical alleles from a single ancestor. - Let \eqn{p_i} refer to the frequency of allele \eqn{i} in the population. Let - \eqn{h} be an variable which equates 1 if the individual is - homozygote, and 0 otherwise. For one locus, the probability of being - homozygote is computed as: + Let \eqn{p_i} refer to the frequency of allele \eqn{i} in the + population. Let \eqn{h} be an variable which equates 1 if the + individual is homozygote, and 0 otherwise. For one locus, the + probability of being homozygote is computed as: \eqn{ F + (1-F) \sum_i p_i^2} @@ -72,11 +73,13 @@ Fsamp <- inbreeding(lagun, N=30) ## plot the first 10 results -invisible(sapply(Fsamp[1:10], function(e) plot(density(e), xlab="F", xlim=c(0,1), main="Density of the sampled F values"))) +invisible(sapply(Fsamp[1:10], function(e) plot(density(e), xlab="F", +xlim=c(0,1), main="Density of the sampled F values"))) ## compute means for all individuals Fmean=sapply(Fsamp, mean) -hist(Fmean, col="orange", xlab="mean value of F", main="Distribution of mean F across individuals") +hist(Fmean, col="orange", xlab="mean value of F", +main="Distribution of mean F across individuals") ## estimate inbreeding - return proba density functions Fdens <- inbreeding(lagun, res.type="function") @@ -85,6 +88,7 @@ Fdens[[1]] ## plot the first 10 functions -invisible(sapply(Fdens[1:10], plot, ylab="Density", main="Density of probability of F values")) +invisible(sapply(Fdens[1:10], plot, ylab="Density", +main="Density of probability of F values")) } } \ No newline at end of file Modified: pkg/man/loadingplot.Rd =================================================================== --- pkg/man/loadingplot.Rd 2013-05-14 15:21:10 UTC (rev 1127) +++ pkg/man/loadingplot.Rd 2013-05-14 21:54:19 UTC (rev 1128) @@ -13,10 +13,11 @@ \usage{ loadingplot(x, \dots) -\method{loadingplot}{default}(x, at=NULL, threshold=quantile(x,0.75), axis=1, fac=NULL, byfac=FALSE, +\method{loadingplot}{default}(x, at=NULL, threshold=quantile(x,0.75), + axis=1, fac=NULL, byfac=FALSE, lab=NULL, cex.lab=0.7, cex.fac=1, lab.jitter=0, - main="Loading plot", xlab="Variables", ylab="Loadings", srt = 0, adj = NULL, \dots) - + main="Loading plot", xlab="Variables", ylab="Loadings", + srt = 0, adj = NULL, \dots) } \arguments{ \item{x}{either a vector with numeric values to be plotted, or a Modified: pkg/man/read.structure.Rd =================================================================== --- pkg/man/read.structure.Rd 2013-05-14 15:21:10 UTC (rev 1127) +++ pkg/man/read.structure.Rd 2013-05-14 21:54:19 UTC (rev 1128) @@ -17,7 +17,9 @@ \code{\link{df2genind}}. } \usage{ -read.structure(file, n.ind=NULL, n.loc=NULL, onerowperind=NULL, col.lab=NULL, col.pop=NULL, col.others=NULL, row.marknames=NULL, NA.char="-9", pop=NULL, missing=NA, ask=TRUE, quiet=FALSE) +read.structure(file, n.ind=NULL, n.loc=NULL, onerowperind=NULL, + col.lab=NULL, col.pop=NULL, col.others=NULL, row.marknames=NULL, + NA.char="-9", pop=NULL, missing=NA, ask=TRUE, quiet=FALSE) } \arguments{ \item{file}{ a character string giving the path to the file to @@ -31,9 +33,9 @@ labels of genotypes. '0' if absent.} \item{col.pop}{an integer giving the index of the column containing population to which genotypes belong. '0' if absent.} [TRUNCATED] To get the complete diff run: svnlook diff /svnroot/adegenet -r 1128 From noreply at r-forge.r-project.org Wed May 15 12:57:43 2013 From: noreply at r-forge.r-project.org (noreply at r-forge.r-project.org) Date: Wed, 15 May 2013 12:57:43 +0200 (CEST) Subject: [adegenet-commits] r1129 - in pkg: . R Message-ID: <20130515105743.F41C918453F@r-forge.r-project.org> Author: jombart Date: 2013-05-15 12:57:43 +0200 (Wed, 15 May 2013) New Revision: 1129 Removed: pkg/TODO Modified: pkg/R/glPlot.R Log: fixed a NOTE due to glPlot; removed useless TODO file Modified: pkg/R/glPlot.R =================================================================== --- pkg/R/glPlot.R 2013-05-14 21:54:19 UTC (rev 1128) +++ pkg/R/glPlot.R 2013-05-15 10:57:43 UTC (rev 1129) @@ -31,6 +31,9 @@ +## hack to remove the NOTE in R CMD check about: +## "plot,genlight: no visible binding for global variable ?y?" +if(getRversion() >= "2.15.1") utils::globalVariables("y") ## plot method setMethod("plot", signature(x="genlight", y="ANY"), function(x, y=NULL, col=NULL, legend=TRUE, Deleted: pkg/TODO =================================================================== --- pkg/TODO 2013-05-14 21:54:19 UTC (rev 1128) +++ pkg/TODO 2013-05-15 10:57:43 UTC (rev 1129) @@ -1,82 +0,0 @@ -####################### -# -# adegenet TODO list -# -####################### -# -# please list here all intended modifications -# and all detected bugs -# -# please add a "-- done" or "-- fixed" tag when -# you achieved something -# '*' indicates stuff to do -# 'o' indicates done stuff -# '*o*' indicates partly done stuff -# -# Inside a given section, priority goes decreasing. -# -# Delete fixed things each new release. -# -# T.J. 2008 -# -###################### - - - -# FOR NEXT STABLE VERSION -========================= -========================= - -# CODE ISSUES: -============== -* fix request 1.2-2.04 (implement adjusted heretozygosity in summary) - - -# DOCUMENTATION ISSUES: -======================= -* explain new changes inside the tutorial (handling of AFLP/RAPD...) - - -# NEW IMPLEMENTATIONS: -===================== - -o allow genind2df to export alleles on separate columns -- done - - -# TESTING: -========== - - -# LOW PRIORITY / MINOR ISSUES -=========================== -=========================== -* in spca, when nfposi=0, the returned object actually contains what corresponds to nfposi=1. Comes from multispati in ade4. To correct in ade4. -* use spcaIllus to illustrate global.rtest and local.rtest -* Implement a method to merge different markers for the same individuals -*o* Build accessors for: -- marker names -- done (TJ) -- indiv names -- pop names/factor -- done (TJ) -- spatial coords -- nb of loci -- done (TJ) -* Return a spatial object from monmonier (class sp?) -* implement classical Fst sensu Weir 1996 ? Or wait for EP to do it... -* Implement different levels of ploidy in: -- hybridize - done (TJ) -- read.structure - - - -# LONG TERM -========================== -========================== -* import from pegas -* import from geneticsBase -- wait for geneSet to be stable -* export to geneticsBase -- same thing -* see where code needs tuning, and use C/C++ -* implement global.rtest and local.rtest for genind/genpop objects -* Check the formulae provided for Reynolds (consistent with Felsenstein's -formulae, not straightforward reading the original article) -* Implement wrappers for spatial function (Moran's I, Mantel's correlogram, etc.). - - From noreply at r-forge.r-project.org Wed May 15 19:02:18 2013 From: noreply at r-forge.r-project.org (noreply at r-forge.r-project.org) Date: Wed, 15 May 2013 19:02:18 +0200 (CEST) Subject: [adegenet-commits] r1130 - pkg/R www www/images www/images/gimp Message-ID: <20130515170218.59EC6184C9A@r-forge.r-project.org> Author: jombart Date: 2013-05-15 19:02:17 +0200 (Wed, 15 May 2013) New Revision: 1130 Modified: pkg/R/dapc.R www/download.html www/images/acceuil.png www/images/gimp/acceuil.xcf www/news.html Log: update website for release; added a new function to assess the quality of discrimination with the number of PCs... doesn't seem to kick asses. Modified: pkg/R/dapc.R =================================================================== --- pkg/R/dapc.R 2013-05-15 10:57:43 UTC (rev 1129) +++ pkg/R/dapc.R 2013-05-15 17:02:17 UTC (rev 1130) @@ -984,14 +984,13 @@ ############ -## crossval +## xvalDapc ############ xvalDapc <- function (x, ...) UseMethod("xvalDapc") xvalDapc.data.frame <- function(x, grp, n.pca.max, n.da=NULL, training.set = 1/2, - center=TRUE, scale=FALSE, - n.pca=NULL, n.rep=10, ...){ + center=TRUE, scale=FALSE, n.pca=NULL, n.rep=10, ...){ ## CHECKS ## grp <- factor(grp) @@ -1005,6 +1004,7 @@ N.training <- round(N*training.set) ## GET FULL PCA ## + if(missing(n.pca.max)) n.pca.max <- min(dim(x)) pcaX <- dudi.pca(x, nf=n.pca.max, scannf=FALSE, center=center, scale=scale) n.pca.max <- min(n.pca.max,pcaX$rank,N.training-1) @@ -1031,14 +1031,64 @@ ## GET %SUCCESSFUL OF ACCURATE PREDICTION FOR ALL VALUES ## res.all <- unlist(lapply(n.pca, get.prop.pred)) - res <- list(success=res.all, n.pca=factor(rep(n.pca, each=n.rep))) + res <- data.frame(n.pca=rep(n.pca, each=n.rep), success=res.all) return(res) -} +} # end xvalDapc.data.frame xvalDapc.matrix <- xvalDapc.data.frame + + + + +############# +## discriVal +############# + +discriVal <- function (x, ...) UseMethod("discriVal") + +discriVal.data.frame <- function(x, grp, n.pca.max, n.da=NULL, center=TRUE, scale=FALSE, n.pca=NULL, ...){ + + ## CHECKS ## + grp <- factor(grp) + n.pca <- n.pca[n.pca>0] + if(is.null(n.da)) { + n.da <- length(levels(grp))-1 + } + + ## GET FULL PCA ## + if(missing(n.pca.max)) n.pca.max <- min(dim(x)) + pcaX <- dudi.pca(x, nf=n.pca.max, scannf=FALSE, center=center, scale=scale) + n.pca.max <- min(n.pca.max,pcaX$rank) + + ## DETERMINE N.PCA IF NEEDED ## + if(is.null(n.pca)){ + n.pca <- round(pretty(1:n.pca.max,10)) + } + n.pca <- n.pca[n.pca>0 & n.pca The current stable - version (adegenet_1.3-6) is available as:
- - linux sources
- - MacOS X binary
- - Windows binary
+ version (adegenet_1.3-8) is available as:
+ - linux sources
+ - MacOS X binary
+ - Windows binary

The devel version - (adegenet_1.3-7) is also available from R-Forge's daily snapshots.
It can be installed directly from R console using:
@@ -47,6 +47,8 @@ height: 10px;"> Patches  + +
Patches correct minor bugs or implement new functionnalities, and will be included into the next CRAN release. It is recommended to @@ -58,17 +60,31 @@ use + + a patch.
addedFeatures.R: (for adegenet >=1.3-1) added features not included in the CRAN version of the package; includes conversion from seqTrack outputs to graphNEL objects.
+ style="font-family: monospace;">graphNEL objects. If you + have a choice though, consider using current igraph + implementation, which is probably better and still maintained.

Older versions:
+ adegenet _1.3-7
+ - linux sources
+ - MacOS X binary
+ - Windows binary
+
+ adegenet _1.3-6
+ - linux sources
+ - MacOS X binary
+ - Windows binary
+
adegenet _1.3-5
- linux sources
- MacOS X binary
Modified: www/images/acceuil.png =================================================================== (Binary files differ) Modified: www/images/gimp/acceuil.xcf =================================================================== (Binary files differ) Modified: www/news.html =================================================================== --- www/news.html 2013-05-15 10:57:43 UTC (rev 1129) +++ www/news.html 2013-05-15 17:02:17 UTC (rev 1130) @@ -28,15 +28,15 @@ 0, 0);">

What's next?
- Next release (1.3-7) is under development.
+ Next release (1.3-9) is under development.


Today
Current stable version of - adegenet is 1.3-6 for - R.2.15.2.
+ adegenet is 1.3-8 for + R.3.0.0.
You can install the devel version from R-Forge.
See the current

- 30/01/2013
+ New adegenet version (1.3-8) + has been released for R-3.0.0! New features + include a new function for cross-validation of DAPC results, + especially useful for assessing the optimal number of PCA + components to retain during the dimension reduction step. + Check out xvalDapc.  Other sources of minor excitement + include new color palettes, and the function any2col + which converts factors and numeric into colors, prodiving info + for a legend as well. See the
ChangeLog file for more details.
+
+
+ ??/??/????
+
At some point in time, version 1.3-7 was released, but + never acknowledged on this website. See the ChangeLog file for clues about what + happened.
+
+
+ 30/01/2013

New adegenet version (1.3-6) - has - been - released - for R-2.15.2! New features include a function - for identifying mutations (position and nature) between pairs - of DNA sequences (findMutations), with a - possibility of visualization using graphs ( New features + include a function for identifying mutations (position and + nature) between pairs of DNA sequences (findMutations), + with a possibility of visualization using graphs (graphMutations). - See the ). See the ChangeLog - file for more details.
+ style="color: rgb(0, 0, 0);"> file for more details.


08/11/2012
New adegenet version (1.3-5) - has - been - released - for R-2.15.2! New features include a - graph-based - clutering approach (gengraph), + has been released for R-2.15.2! New features + include a graph-based clutering approach (gengraph), and a new function for reading fasta alignments into DNAbin objects with - minimum - memory requirements (fasta2DNAbin). - See the ). See the ChangeLog - file for more details.
+ style="color: rgb(0, 0, 0);"> file for more details.


10/08/2012 23/01/2012
New adegenet version (1.3-4) - has - been - released - for R-2.14.1! The package has undergone some - internal changes in the documentation which should make no - difference - to users, except for a drastic reduction in download - and - installed - size - (respectively 4-5MB and 6-7MB). Many examples have - been 'inactivated' to comply with CRAN check runtime policy, - meaning - these won't be run by 'example(myFunction)' - anymore, - but - will - still be available in the documentation. See the The package has undergone + some internal changes in the documentation which should make no + difference to users, except for a drastic reduction in download and installed size (respectively 4-5MB and + 6-7MB). Many examples have been 'inactivated' to comply with + CRAN check runtime policy, meaning these won't be run by 'example(myFunction)' + anymore, but will still be available in the documentation. See + the ChangeLog - file for more details.
+ style="color: rgb(0, 0, 0);"> file for more details.


23/12/2011
New adegenet version (1.3-3) - has - been - released - for R-2.14.1! Just in time for Christmas, ain't - that nice? This release includes a brand new version of propShared - which fixes an important bug, and now allows for any ploidy level - while - running core computations in C. The package is also slightly - smaller - than before. See the Just in time for + Christmas, ain't that nice? This release includes a brand new + version of propShared which fixes an important bug, and + now allows for any ploidy level while running core computations in + C. The package is also slightly smaller than before. See the ChangeLog - file for more details.
+ style="color: rgb(0, 0, 0);"> file for more details.


11/11/2011
New adegenet version (1.3-2) - has - been - released - for R-2.14.0! This minor release fixes some - issues in the detection of the number of cores occuring on some - platforms, as well as a few fixes in manpages. Function This minor release fixes + some issues in the detection of the number of cores occuring on + some platforms, as well as a few fixes in manpages. Function fstat - is - now gone for good as hierfstat - package - seems - to - be - gone for good from CRAN, but the documentation shows how to - use Fst - from the pegas package - instead. See the hierfstat + + package seems to be gone for good from CRAN, but the + documentation shows how to use Fst from the pegas package instead. See the ChangeLog - file for more details.
+ style="color: rgb(0, 0, 0);"> file for more details.


02/09/2011
New adegenet version (1.3-1) - has - been - released - for R-2.13.1! Less is more: this new release is - much smaller in size than 1.3-0, while containing the same - material. It - also fixes a few minor issues in documentation and in the legend - of - dapc scatterplots. See the Less is more: this new + release is much smaller in size than 1.3-0, while containing the + same material. It also fixes a few minor issues in documentation + and in the legend of dapc scatterplots. See the ChangeLog - file for more details.
+ style="color: rgb(0, 0, 0);"> file for more details.



27/06/2011
New adegenet version (1.3-0) - has - been - released - for R-2.13.0! - This new major release implements new classes and methods for - the + has been released for R-2.13.0! This new + major release implements new classes and methods for the handling and analysis of genome-wide SNP data, along with - several - improvements to previous tools (especially new graphics for - DAPC) and - a few bug fixes. It also include 4 new tutorial distributed as - vignettes. See the and a few bug fixes. + It also include 4 new tutorial distributed as vignettes. See + the ChangeLog - file for details of the release. Given this is a - major - release, version number switched directly from 1.2-8 to 1.3-0.
+ style="color: rgb(0, 0, 0);"> file for details of the release.
+ Given this is a major release, version number switched directly + from 1.2-8 to 1.3-0.


02/06/2011
A new tutorial dedicated to DAPC has been uploaded on the website - (see - section "documents").
+ (see section "documents").


14/12/2010
@@ -260,68 +245,50 @@ 05/11/2010
New adegenet version (1.2-8) - has - been - released - for R-2.12.0! - This release fixes a bug in exports - from adegenet to - hierfstat (function genind2hierfstat), - which - lead - to - wrong fstat - computations in some cases. See the This + release fixes a bug in exports from + adegenet to hierfstat (function genind2hierfstat), + which lead to wrong fstat computations in some + cases. See the ChangeLog - file for details.
+ style="color: rgb(0, 0, 0);"> file for details.


29/10/2010
New adegenet version (1.2-7) - has - been - released - for R-2.12.0! - This release includes the stable implementation of the This + release includes the stable implementation of the Discriminant Analysis of - Principal - Components (DAPC, - ?dapc, - ?find.clusters) (DAPC, ?dapc, ?find.clusters), - and fixes a bug from the ape package which impacted adegenet - and - prevented windows binaries to be generated with R-2.12.0. See - the , and fixes a bug from the ape package which impacted + adegenet and prevented windows binaries to be generated with + R-2.12.0. See the ChangeLog - file for more information.
+ style="color: rgb(0, 0, 0);"> file for more information.


22/09/2010
New adegenet version (1.2-6) - has - been - released - for R-2.11.1! - This release adds the computation of pairwise Fst ( This + release adds the computation of pairwise Fst () and - a new function for extracting the polymorphism of aligned - proteic - sequences (class alignment - in - the seqinr package) into genind objects (and a new function for extracting the polymorphism of + aligned proteic sequences (class alignment in the + seqinr package) into genind objects (?alignment2genind). - See the ). See the ChangeLog - file for more information, and check out new updates of the - tutorial.
+ style="color: rgb(0, 0, 0);"> file for more information, and + check out new updates of the tutorial.


30/07/2010

New adegenet version (1.2-5) - has - been - released - for R-2.11.1! - This release includes a stable implementation of the This + release includes a stable implementation of the SeqTrack algorithm  - (see literature section for a reference) for reconstructing - genealogies - of haplotypes; see (see literature section for a reference) for + reconstructing genealogies of haplotypes; see ?seqTrack Author: jombart Date: 2013-05-16 10:14:12 +0200 (Thu, 16 May 2013) New Revision: 1131 Added: www/files/adegenet_1.3-7.tar.gz www/files/adegenet_1.3-7.tgz www/files/adegenet_1.3-7.zip www/files/adegenet_1.3-8.tar.gz Log: adding versions of the package to files/ Added: www/files/adegenet_1.3-7.tar.gz =================================================================== (Binary files differ) Property changes on: www/files/adegenet_1.3-7.tar.gz ___________________________________________________________________ Added: svn:mime-type + application/octet-stream Added: www/files/adegenet_1.3-7.tgz =================================================================== (Binary files differ) Property changes on: www/files/adegenet_1.3-7.tgz ___________________________________________________________________ Added: svn:mime-type + application/octet-stream Added: www/files/adegenet_1.3-7.zip =================================================================== (Binary files differ) Property changes on: www/files/adegenet_1.3-7.zip ___________________________________________________________________ Added: svn:mime-type + application/octet-stream Added: www/files/adegenet_1.3-8.tar.gz =================================================================== (Binary files differ) Property changes on: www/files/adegenet_1.3-8.tar.gz ___________________________________________________________________ Added: svn:mime-type + application/octet-stream From noreply at r-forge.r-project.org Thu May 16 17:22:54 2013 From: noreply at r-forge.r-project.org (noreply at r-forge.r-project.org) Date: Thu, 16 May 2013 17:22:54 +0200 (CEST) Subject: [adegenet-commits] r1132 - pkg/R Message-ID: <20130516152254.B016018445B@r-forge.r-project.org> Author: jombart Date: 2013-05-16 17:22:54 +0200 (Thu, 16 May 2013) New Revision: 1132 Modified: pkg/R/dapc.R Log: changing the default colors of scatter dapc Modified: pkg/R/dapc.R =================================================================== --- pkg/R/dapc.R 2013-05-16 08:14:12 UTC (rev 1131) +++ pkg/R/dapc.R 2013-05-16 15:22:54 UTC (rev 1132) @@ -486,7 +486,7 @@ ############## ## scatter.dapc ############## -scatter.dapc <- function(x, xax=1, yax=2, grp=x$grp, col=rainbow(length(levels(grp))), pch=20, bg="lightgrey", solid=.7, +scatter.dapc <- function(x, xax=1, yax=2, grp=x$grp, col=seasun(length(levels(grp))), pch=20, bg="white", solid=.7, scree.da=TRUE, scree.pca=FALSE, posi.da="bottomright", posi.pca="bottomleft", bg.inset="white", ratio.da=.25, ratio.pca=.25, inset.da=0.02, inset.pca=0.02, inset.solid=.5, onedim.filled=TRUE, mstree=FALSE, lwd=1, lty=1, segcol="black", @@ -514,8 +514,8 @@ par(mar = c(0.1, 0.1, 0.1, 0.1), bg=bg) on.exit(par(opar)) axes <- c(xax,yax) + ## basic empty plot - ## s.label(x$ind.coord[,axes], clab=0, cpoint=0, grid=FALSE, addaxes = FALSE, cgrid = 1, include.origin = FALSE, ...) s.class(x$ind.coord[,axes], fac=grp, col=col, cpoint=0, cstar = cstar, cellipse = cellipse, axesell = axesell, label = label, clabel = clabel, xlim = xlim, ylim = ylim, grid = grid, addaxes = addaxes, origin = origin, include.origin = include.origin, sub = sub, csub = csub, possub = possub, cgrid = cgrid, pixmap = pixmap, contour = contour, area = area) @@ -547,6 +547,8 @@ } } else { + ## set screeplot of DA to FALSE (just 1 bar) + scree.da <- FALSE ## get plotted axis if(ncol(x$ind.coord)==1) { @@ -989,12 +991,14 @@ xvalDapc <- function (x, ...) UseMethod("xvalDapc") -xvalDapc.data.frame <- function(x, grp, n.pca.max, n.da=NULL, training.set = 1/2, +xvalDapc.data.frame <- function(x, grp, n.pca.max, n.da=NULL, training.set = 0.9, + result=c("groupMean","overall"), center=TRUE, scale=FALSE, n.pca=NULL, n.rep=10, ...){ ## CHECKS ## grp <- factor(grp) n.pca <- n.pca[n.pca>0] + result <- match.arg(result) if(is.null(n.da)) { n.da <- length(levels(grp))-1 } @@ -1023,7 +1027,13 @@ temp.pca$li <- temp.pca$li[toKeep,,drop=FALSE] temp.dapc <- suppressWarnings(dapc(x[toKeep,,drop=FALSE], grp[toKeep], n.pca=n.pca, n.da=n.da, dudi=temp.pca)) temp.pred <- predict.dapc(temp.dapc, newdata=x[-toKeep,,drop=FALSE]) - return(mean(temp.pred$assign==grp[-toKeep])) + if(result=="overall"){ + out <- mean(temp.pred$assign==grp[-toKeep]) + } + if(result=="groupMean"){ + out <- mean(tapply(temp.pred$assign==grp[-toKeep], grp[-toKeep], mean)) + } + return(out) } return(replicate(n.rep, f1())) } @@ -1043,48 +1053,48 @@ -############# -## discriVal -############# +## ############# +## ## discriVal +## ############# -discriVal <- function (x, ...) UseMethod("discriVal") +## discriVal <- function (x, ...) UseMethod("discriVal") -discriVal.data.frame <- function(x, grp, n.pca.max, n.da=NULL, center=TRUE, scale=FALSE, n.pca=NULL, ...){ +## discriVal.data.frame <- function(x, grp, n.pca.max, n.da=NULL, center=TRUE, scale=FALSE, n.pca=NULL, ...){ - ## CHECKS ## - grp <- factor(grp) - n.pca <- n.pca[n.pca>0] - if(is.null(n.da)) { - n.da <- length(levels(grp))-1 - } +## ## CHECKS ## +## grp <- factor(grp) +## n.pca <- n.pca[n.pca>0] +## if(is.null(n.da)) { +## n.da <- length(levels(grp))-1 +## } - ## GET FULL PCA ## - if(missing(n.pca.max)) n.pca.max <- min(dim(x)) - pcaX <- dudi.pca(x, nf=n.pca.max, scannf=FALSE, center=center, scale=scale) - n.pca.max <- min(n.pca.max,pcaX$rank) +## ## GET FULL PCA ## +## if(missing(n.pca.max)) n.pca.max <- min(dim(x)) +## pcaX <- dudi.pca(x, nf=n.pca.max, scannf=FALSE, center=center, scale=scale) +## n.pca.max <- min(n.pca.max,pcaX$rank) - ## DETERMINE N.PCA IF NEEDED ## - if(is.null(n.pca)){ - n.pca <- round(pretty(1:n.pca.max,10)) - } - n.pca <- n.pca[n.pca>0 & n.pca0 & n.pca Author: jombart Date: 2013-05-20 16:40:11 +0200 (Mon, 20 May 2013) New Revision: 1133 Modified: www/download.html www/files/adegenet_1.3-8.tar.gz Log: new versions of the package Modified: www/download.html =================================================================== --- www/download.html 2013-05-16 15:22:54 UTC (rev 1132) +++ www/download.html 2013-05-20 14:40:11 UTC (rev 1133) @@ -32,8 +32,8 @@ 10px;"> The current stable version (adegenet_1.3-8) is available as:
- linux sources
- - MacOS X binary
- - Windows binary
+ - MacOS X binary
+ - Windows binary

The devel version @@ -49,6 +49,7 @@ +
Patches correct minor bugs or implement new functionnalities, and will be included into the next CRAN release. It is recommended to @@ -62,6 +63,7 @@ + a patch.
addedFeatures.R: (for adegenet Modified: www/files/adegenet_1.3-8.tar.gz =================================================================== (Binary files differ) From noreply at r-forge.r-project.org Mon May 20 16:40:44 2013 From: noreply at r-forge.r-project.org (noreply at r-forge.r-project.org) Date: Mon, 20 May 2013 16:40:44 +0200 (CEST) Subject: [adegenet-commits] r1134 - www/files Message-ID: <20130520144044.B9A78184FA3@r-forge.r-project.org> Author: jombart Date: 2013-05-20 16:40:44 +0200 (Mon, 20 May 2013) New Revision: 1134 Added: www/files/adegenet_1.3-8.tgz www/files/adegenet_1.3-8.zip Log: adding the binaries Added: www/files/adegenet_1.3-8.tgz =================================================================== (Binary files differ) Property changes on: www/files/adegenet_1.3-8.tgz ___________________________________________________________________ Added: svn:mime-type + application/octet-stream Added: www/files/adegenet_1.3-8.zip =================================================================== (Binary files differ) Property changes on: www/files/adegenet_1.3-8.zip ___________________________________________________________________ Added: svn:mime-type + application/octet-stream From noreply at r-forge.r-project.org Tue May 21 12:26:08 2013 From: noreply at r-forge.r-project.org (noreply at r-forge.r-project.org) Date: Tue, 21 May 2013 12:26:08 +0200 (CEST) Subject: [adegenet-commits] r1135 - www Message-ID: <20130521102608.F031218429E@r-forge.r-project.org> Author: jombart Date: 2013-05-21 12:26:08 +0200 (Tue, 21 May 2013) New Revision: 1135 Modified: www/literature.html Log: +2ref Modified: www/literature.html =================================================================== --- www/literature.html 2013-05-20 14:40:44 UTC (rev 1134) +++ www/literature.html 2013-05-21 10:26:08 UTC (rev 1135) @@ -65,6 +65,7 @@ + the bublisher's website]

@@ -99,6 +100,7 @@ + abstract]

- the paper presenting the spatial @@ -118,6 +120,7 @@ + principal component analysis (sPCA, function spca), global and @@ -140,6 +143,7 @@ + cryptic spatial patterns in genetic variability by a new multivariate method.  Heredity 101: 92-103. doi: @@ -163,6 +167,7 @@ + abstract]

@@ -188,6 +193,7 @@ + simulations of genealoies of haplotypes (haploGen):
Jombart T, Eggo RM, Dodd PJ, Balloux F (2010) @@ -213,6 +219,7 @@ + of Principal Components
(DAPC, functions find.clusters @@ -246,6 +253,7 @@ + Behaviour76: 87-95.

@@ -268,6 +276,7 @@ + Genomics
9: 256.
@@ -305,6 +314,7 @@ + marmota
.
Molecular @@ -321,6 +331,7 @@ + Ecology 18: 1491-1503.

@@ -376,6 +387,7 @@ + australis in North America. Biological Invasions. doi: 10.1007/s10530-010-9699-6.
@@ -535,6 +547,7 @@ + Oct 6. [Epub ahead of print]

[24] SANTOS, H., BURBAN, C., ROUSSELET, J., @@ -555,6 +568,7 @@ + pityocampa, Lepidoptera, Notodontidae). Journal of Evolutionary Biology, no. doi: 10.1111/j.1420-9101.2010.02147.x
@@ -580,6 +594,7 @@ + Vol. Sci. Pap. ICCAT, 65(3): 988-995

[26] Vandewoestijne @@ -598,6 +613,7 @@ + S, Van Dyck H, 2010 Population Genetic @@ -619,6 +635,7 @@ + ONE5(11): e13810. doi:10.1371/journal.pone.0013810
@@ -651,6 +668,7 @@ +
DOI: 10.1007/s10329-010-0232-4

@@ -720,6 +738,7 @@ + tetradactylum: Polynemidae). Molecular Ecology, 20: no. doi: 10.1111/j.1365-294X.2011.05097.x

@@ -747,6 +766,7 @@ + neoformans Variety grubii Multilocus Sequence Types from Thailand Are Consistent with an Ancestral African Origin.
PLoS @@ -1006,6 +1026,7 @@ + 10.1007/s10709-012-9640-2

[76] Samantha Baldwin, Meeghan Pither-Joyce, Kathryn Wright, @@ -1658,8 +1679,19 @@ [187] Hendry et al. (2013) EVOLUTIONARY INFERENCES FROM THE ANALYSIS OF EXCHANGEABILITY. Evolution. DOI: 10.1111/evo.12160

+ [188] Berbegal et al. (2013) Evidence for multiple introductions + and clonality in Spanish populations of Fusarium circinatum. + Phytopathology. http://dx.doi.org/10.1094/PHYTO-11-12-0281-R

+ [189]* Viera et al. (2013) Development of interspecific hybrids + of cassava and paternity analysis with molecular markers. + Journal of Agricultural Science.DOI: + http://dx.doi.org/10.1017/S0021859612000871

+
+
+
+

* adegenet not or wrongly cited, but actually used in the paper.
From noreply at r-forge.r-project.org Tue May 28 21:58:19 2013 From: noreply at r-forge.r-project.org (noreply at r-forge.r-project.org) Date: Tue, 28 May 2013 21:58:19 +0200 (CEST) Subject: [adegenet-commits] r1136 - www Message-ID: <20130528195820.0F02E1855DE@r-forge.r-project.org> Author: jombart Date: 2013-05-28 21:58:19 +0200 (Tue, 28 May 2013) New Revision: 1136 Modified: www/literature.html Log: +3 ref Modified: www/literature.html =================================================================== --- www/literature.html 2013-05-21 10:26:08 UTC (rev 1135) +++ www/literature.html 2013-05-28 19:58:19 UTC (rev 1136) @@ -66,6 +66,7 @@ + the bublisher's website]

@@ -101,6 +102,7 @@ + abstract]

- the paper presenting the spatial @@ -121,6 +123,7 @@ + principal component analysis (sPCA, function spca), global and @@ -144,6 +147,7 @@ + cryptic spatial patterns in genetic variability by a new multivariate method.  Heredity 101: 92-103. doi: @@ -168,6 +172,7 @@ + abstract]

@@ -194,6 +199,7 @@ + simulations of genealoies of haplotypes (haploGen):
Jombart T, Eggo RM, Dodd PJ, Balloux F (2010) @@ -220,6 +226,7 @@ + of Principal Components (DAPC, functions find.clusters @@ -254,6 +261,7 @@ + Behaviour76: 87-95.

@@ -277,6 +285,7 @@ + Genomics
9: 256.
@@ -315,6 +324,7 @@ + marmota.Molecular @@ -332,6 +342,7 @@ + Ecology 18: 1491-1503.

@@ -388,6 +399,7 @@ + australis in North America. Biological Invasions. doi: 10.1007/s10530-010-9699-6.
@@ -548,6 +560,7 @@ + Oct 6. [Epub ahead of print]

[24] SANTOS, H., BURBAN, C., ROUSSELET, J., @@ -569,6 +582,7 @@ + pityocampa, Lepidoptera, Notodontidae). Journal of Evolutionary Biology, no. doi: 10.1111/j.1420-9101.2010.02147.x
@@ -595,6 +609,7 @@ + Vol. Sci. Pap. ICCAT, 65(3): 988-995

[26] Vandewoestijne @@ -614,6 +629,7 @@ + S, Van Dyck H, 2010 Population Genetic @@ -636,6 +652,7 @@ + ONE5(11): e13810. doi:10.1371/journal.pone.0013810
@@ -669,6 +686,7 @@ +
DOI: 10.1007/s10329-010-0232-4

@@ -739,6 +757,7 @@ + tetradactylum: Polynemidae). Molecular Ecology, 20: no. doi: 10.1111/j.1365-294X.2011.05097.x

@@ -767,6 +786,7 @@ + neoformans Variety grubii Multilocus Sequence Types from Thailand Are Consistent with an Ancestral African Origin.
PLoS @@ -1027,6 +1047,7 @@ + 10.1007/s10709-012-9640-2

[76] Samantha Baldwin, Meeghan Pither-Joyce, Kathryn Wright, @@ -1688,10 +1709,27 @@ Journal of Agricultural Science.DOI: http://dx.doi.org/10.1017/S0021859612000871

+ [190] Tysklind, N., Taylor, M. I., Lyons, B. P., Goodsir, F., + McCarthy, I. D. and Carvalho, G. R. (2013), Population genetics + provides new insights into biomarker prevalence in dab (Limanda + limanda L.): a key marine biomonitoring species. Evolutionary + Applications. doi: 10.1111/eva.12074

+ [191] H?glund, J., Wang, B., Axelsson, T. and Quintela, M. + (2013), Phylogeography of willow grouse (Lagopus lagopus) in the + Arctic: taxonomic discordance as inferred from molecular data. + Biological Journal of the Linnean Society. doi: + 10.1111/bij.12109

+ [192] Lalis et al. (2013) Development and characterization of 8 + polymorphic microsatellite loci in a North African toad, + Amietophrynus mauritanicus (Schlegel, 1841). Conservation + Genetics Resources. DOI: 10.1007/s12686-013-9963-z


+
+
+

* adegenet not or wrongly cited, but actually used in the paper.