[adegenet-forum] extracting subset of SNPs with the highest weight

Jean-Luc LEGRAS legrasjl at supagro.inra.fr
Thu Jun 25 16:32:03 CEST 2015


Hello
Thank you for your answer and solution.

Indeed, I could obtain a plot and the list of SNPs with the highest contributions to axis 1 using
Axis1 <- loadingplot(abs(GWEVariant.PCA$loadings[,1]), threshold=quantile(abs(GWEVariant.PCA$loadings[,1]), probs = .95), lab=rownames(GWEVariant.PCA$loadings), cex.lab=0.7, cex.fac=1, lab.jitter=0, main="Loading plot", xlab="SNP positions", ylab="Contributions", srt = 90, adj = c(0, 0.5))

and then subset <- as.matrix(GWEVariant[,Axis1$var.idx])
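
For reference, the same top-5% selection can be sketched in base R without the plot. This is only an illustration on toy data: geno and load1 are hypothetical stand-ins for the genotype matrix and for GWEVariant.PCA$loadings[,1], not objects from my analysis.

```r
# Toy stand-ins (hypothetical): 10 individuals x 200 SNPs, one axis of loadings.
set.seed(1)
geno <- matrix(rbinom(10 * 200, size = 2, prob = 0.3), nrow = 10,
               dimnames = list(NULL, paste0("snp", 1:200)))
load1 <- rnorm(200)              # stands in for GWEVariant.PCA$loadings[, 1]
names(load1) <- colnames(geno)

# Keep SNPs whose absolute loading exceeds the 95% quantile.
cutoff  <- quantile(abs(load1), probs = 0.95)
top.idx <- which(abs(load1) > cutoff)   # integer column positions

# Subset the columns by integer position rather than by name.
geno.top <- geno[, top.idx, drop = FALSE]
dim(geno.top)
```

Here top.idx plays the same role as Axis1$var.idx above.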

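The loop over the seven axes in my original code (quoted below) can be sketched with integer positions instead of SNP names. Again this is a toy illustration under assumptions: the loadings matrix is simulated, and keep.idx is a hypothetical name.

```r
# Toy stand-in (hypothetical): loadings for 200 SNPs on 7 axes.
set.seed(2)
loadings <- matrix(rnorm(200 * 7), nrow = 200,
                   dimnames = list(paste0("snp", 1:200), paste0("Axis", 1:7)))

# For each axis, positions of SNPs outside the central 95% of the loadings.
tails <- lapply(seq_len(ncol(loadings)), function(k) {
  q <- quantile(loadings[, k], probs = c(0.025, 0.975))
  which(loadings[, k] < q[1] | loadings[, k] > q[2])
})

# Union over all axes: sorted integer positions without duplicates.
keep.idx <- sort(unique(unlist(tails)))
length(keep.idx)
```

On the real data the corresponding call would presumably be as.matrix(GWEVariant[,keep.idx]), by analogy with the var.idx subsetting above.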

Best regards.
Jean-Luc

On 24 June 2015 at 17:00, Jombart, Thibaut <t.jombart at imperial.ac.uk> wrote:

> Hi there, 
> 
> Can you try with 'loadingplot'? It invisibly returns the list of the most contributing alleles.
> 
> Best
> Thibaut 
> 
> ________________________________________
> From: adegenet-forum-bounces at lists.r-forge.r-project.org [adegenet-forum-bounces at lists.r-forge.r-project.org] on behalf of Jean-Luc LEGRAS [legrasjl at supagro.inra.fr]
> Sent: 24 June 2015 15:04
> To: adegenet-forum at lists.r-forge.r-project.org
> Subject: [adegenet-forum] extracting subset of SNPs with the highest weight
> 
> Hello
> I am using adegenet 1.4-2 on a set of genomic data. I have converted my data to the PLINK raw format: 326000 SNPs for 82 diploid individuals. Each variant position has an ID of the form chromosome number + coordinates.
> I performed a PCA on the genotypes, which nicely separates the main groups, and I wanted to extract the SNPs with the highest contributions (top 5%) to the PCA in order to make a subset of the initial genotype matrix. I can obtain the list of SNPs with the highest loadings, but when I use it to subset the genotypes I obtain an empty list. Is this wrong? Do you have any suggestions?
> 
> Thank you in advance.
> Best regards.
> Jean-Luc
> Here is the code I used:
> 
> GWEVariant <- read.PLINK(file="GWE.raw",map.file = "GWE.map",multicore= FALSE)
> 
> GWEVariant.PCA <-glPca(GWEVariant, center = TRUE, scale = FALSE, nf = 7, loadings = TRUE, alleleAsUnit = FALSE, useC = TRUE,n.cores = 4, returnDotProd=FALSE, matDotProd=NULL)
> DTloadings<- data.frame(GWEVariant@loc.names,GWEVariant.PCA$loadings)
> 
> top <-matrix(nrow=7,ncol=2)
> Mqdiscriminants<-matrix(,ncol=8)
> colnames(Mqdiscriminants)<-colnames(DTloadings)
> liste <-list()
> i=1
> for (i in 1:7) {
> top[i,1]<-quantile(DTloadings[, i+1], probs = .025)
> top[i,2]<-quantile(DTloadings[, i+1], probs = .975)
> liste <-  which(DTloadings[,i+1]<top[i,1] | DTloadings[,i+1]>top[i,2])
> Mqdiscriminants<-rbind(Mqdiscriminants,DTloadings[liste,])
> }
> 
> Mqdiscriminants <-unique(Mqdiscriminants)
> Mqdiscriminants<-na.omit(Mqdiscriminants)
> 
> subset<-as.matrix(GWEVariant[,Mqdiscriminants[,1]])
> 
> 
> _______________________________________________
> adegenet-forum mailing list
> adegenet-forum at lists.r-forge.r-project.org
> https://lists.r-forge.r-project.org/cgi-bin/mailman/listinfo/adegenet-forum


