From carlo.pecoraro2 at unibo.it  Thu Sep  3 15:32:43 2015
From: carlo.pecoraro2 at unibo.it (Carlo Pecoraro)
Date: Thu, 3 Sep 2015 13:32:43 +0000
Subject: [adegenet-forum] F statistics
Message-ID: <386B59F78C25A74C9D53373962B04CF80121C6CE27@E10-MBX1-CS.personale.dir.unibo.it>

Dear adegenet users,

I would like to use the G-statistic test to check whether my Fst values are significant.

Gtest <- gstat.randtest(x,nsim=99) ### x is my genind object (4.7 MB) and I am using 99 simulations for the randtest
Gtest

The result obtained is very weird:

sim: num [1:99] 00000000

obs: num 0

pvalue: num 1


Do you know what the mistake is? Given that I have 1400 loci, do I have to increase nsim?


Thanks a lot in advance

Cheers

Carlo

From swahuaxi at yahoo.com  Mon Sep  7 15:32:08 2015
From: swahuaxi at yahoo.com (Stephen Attwood)
Date: Mon, 7 Sep 2015 13:32:08 +0000 (UTC)
Subject: [adegenet-forum] df2genind ignoring ind.names
Message-ID: <1112291306.2619199.1441632728454.JavaMail.yahoo@mail.yahoo.com>

Apologies for bothering you all, but I suddenly started getting this error. The script below used to work fine on the same dataset; I reinstalled the latest adegenet after the problem appeared, but the error persisted:

library(PopGenReport)
dataset.df <- read.csv("/home/.../dataset.csv", header = FALSE, sep = ",")
dataset.gen <- df2genind(dataset.df[, -c(1, 2)], ind.names = dataset.df[[1]], pop = dataset.df[[2]], loc.names = c("locusA","locusB","locusC","locusD","locusE","locusF","locusG","locusH","locusJ"), type = "codom", ploidy = 2, sep = "/")
dataset.gen@ind.names # returns "Error: no slot of name "ind.names" for this object of class "genind""
dataset.gen@pop # returns the populations correctly
dataset.gen@tab # returns the genotypes correctly
dataset.df[[1]] # returns the ind.names correctly

Can anyone advise as to how to fix the above code so that the ind.names are correctly slotted into the genind object?
I run R on linux Mint.

I also noticed that PopGenReport began to return the error "could not find function "pairwise.fst"" when computing Nei's pairwise Fst, and that this error started appearing at the same time as the one above. Are both problems likely to be due to changes in adegenet?
These are great tools by the way - I use them often.

Thank you.

From t.jombart at imperial.ac.uk  Mon Sep  7 15:39:47 2015
From: t.jombart at imperial.ac.uk (Jombart, Thibaut)
Date: Mon, 7 Sep 2015 13:39:47 +0000
Subject: [adegenet-forum] df2genind ignoring ind.names
In-Reply-To: <1112291306.2619199.1441632728454.JavaMail.yahoo@mail.yahoo.com>
References: <1112291306.2619199.1441632728454.JavaMail.yahoo@mail.yahoo.com>
Message-ID: <2CB2DA8E426F3541AB1907F98ABA6570B1293557@icexch-m1.ic.ac.uk>


Hi there,

I am not sure which version of adegenet you are using, but it would be worth updating everything just to make sure.

@ind.names disappeared in adegenet 2.0.0, alongside other slots that were no longer useful. See ChangeLog for more info:
https://github.com/thibautjombart/adegenet/blob/master/ChangeLog

You basically want to use accessors instead, e.g. indNames(dataset.gen) etc.

They are all documented in the new 'basics' tutorial.
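
As a minimal sketch of the accessor-based approach (assuming adegenet >= 2.0.0 is installed; the toy data frame and the names toy.df, toy.gen and ind1 to ind3 are made up for illustration and are not from the original dataset):

```r
library(adegenet)

# Hypothetical toy genotypes: 3 diploid individuals, 2 loci,
# alleles separated by "/"
toy.df <- data.frame(
  locusA = c("11/12", "12/12", "11/11"),
  locusB = c("20/21", "21/21", "20/20"),
  stringsAsFactors = FALSE
)

toy.gen <- df2genind(toy.df, sep = "/",
                     ind.names = c("ind1", "ind2", "ind3"),
                     pop = c("pop1", "pop1", "pop2"),
                     type = "codom", ploidy = 2)

indNames(toy.gen)   # individual labels (replaces the old @ind.names slot)
pop(toy.gen)        # population factor
locNames(toy.gen)   # locus names
head(tab(toy.gen))  # table of allele counts
```

The labels passed as ind.names are still stored in the object; they are simply read back through indNames() rather than through a slot.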

Best
Thibaut



From cesaradx at gmail.com  Fri Sep 11 15:20:31 2015
From: cesaradx at gmail.com (cesar augusto diniz xavier)
Date: Fri, 11 Sep 2015 10:20:31 -0300
Subject: [adegenet-forum] Input file DAPC
Message-ID: <CAD=WoiFw16Mwf=juu4vKztMJT_s0NQC70yMcBo_6aZqYPOBuiw@mail.gmail.com>

I am a PhD student at the Universidade Federal de Viçosa, Brazil, currently
working on virus population genetics (genus Begomovirus, family
Geminiviridae) under the supervision of Murilo Zerbini. I need a method for
analyzing the population structure of begomoviruses, and of the available
methods DAPC seems to fit this purpose best. However, I could not run the
analysis because I do not know what input file format is required. My data
are genomic DNA sequences. I am counting on your help.

Thank you in advance!

César
César Augusto Diniz Xavier
Engenheiro Agrônomo
Doutorando em Fitopatologia - UFV
Laboratório de Virologia Vegetal Molecular - Bioagro
(31) 9339-7658

From t.jombart at imperial.ac.uk  Fri Sep 11 16:49:25 2015
From: t.jombart at imperial.ac.uk (Jombart, Thibaut)
Date: Fri, 11 Sep 2015 14:49:25 +0000
Subject: [adegenet-forum] Input file DAPC
In-Reply-To: <CAD=WoiFw16Mwf=juu4vKztMJT_s0NQC70yMcBo_6aZqYPOBuiw@mail.gmail.com>
References: <CAD=WoiFw16Mwf=juu4vKztMJT_s0NQC70yMcBo_6aZqYPOBuiw@mail.gmail.com>
Message-ID: <2CB2DA8E426F3541AB1907F98ABA6570B1293D42@icexch-m1.ic.ac.uk>


Hi there,

it is all documented - go through the 'basics' tutorial for inputs and data handling, and then the dapc tutorial for the method itself.

https://github.com/thibautjombart/adegenet/wiki/Tutorials

Cheers
Thibaut


==============================
Dr Thibaut Jombart
MRC Centre for Outbreak Analysis and Modelling
Department of Infectious Disease Epidemiology
Imperial College - School of Public Health
Norfolk Place, London W2 1PG, UK
Tel. : 0044 (0)20 7594 3658
http://sites.google.com/site/thibautjombart/
http://sites.google.com/site/therepiproject/
http://adegenet.r-forge.r-project.org/
Twitter: @thibautjombart



From ebowles at ucalgary.ca  Tue Sep 22 21:16:31 2015
From: ebowles at ucalgary.ca (Ella Bowles)
Date: Tue, 22 Sep 2015 13:16:31 -0600
Subject: [adegenet-forum] how do I know if missing data is affecting PCA or
	DAPC results
Message-ID: <CAHpKFdAudJ9gZRDObSD94L+D86RfTDx326vQRCQNCvBWK9-shA@mail.gmail.com>

Hello,

I'm attempting to do a PCA and a DAPC on genomic data: 186 individuals
spread over 11 putative populations, with just over 4000 loci. I have
converted the data to a genlight object. I know that I have some missing
data (each marker is present in at least 65% of individuals), and the
adegenet manual says that missing data could bias results. How do I know
if I have too much missing data? Or should I just remove all the loci
that have missing values before doing the analysis?

With thanks,
Ella

-- 
Ella Bowles
PhD Candidate
Biological Sciences
University of Calgary

e-mail: ebowles at ucalgary.ca, bowlese at gmail.com
website: http://ellabowlesphd.wordpress.com/

From ebowles at ucalgary.ca  Tue Sep 22 21:28:48 2015
From: ebowles at ucalgary.ca (Ella Bowles)
Date: Tue, 22 Sep 2015 13:28:48 -0600
Subject: [adegenet-forum] question about warning message
Message-ID: <CAHpKFdBu9mY4w61BhVuKCO9BkFSrh_ciN88=h9-UBo0FjfYuZA@mail.gmail.com>

Hello,

After I converted my data into a genlight object, I got the warning message
Warning message:
In matrix(txt, ncol = 4, byrow = TRUE) :
  data length [16397] is not a sub-multiple or multiple of the number of
rows [4100]

How do I know if this is a problem?

I have just over 4000 SNPs spread over 186 individuals. My data import line
is:

data <- read.PLINK(file = 'plink.raw', map.file = 'batch_1.plink.map',
                   quiet = FALSE, chunkSize = 186, parallel = FALSE)


With thanks,
Ella


From f.calboli at imperial.ac.uk  Tue Sep 22 21:30:11 2015
From: f.calboli at imperial.ac.uk (Federico Calboli)
Date: Tue, 22 Sep 2015 22:30:11 +0300
Subject: [adegenet-forum] how do I know if missing data is affecting PCA
	or DAPC results
In-Reply-To: <CAHpKFdAudJ9gZRDObSD94L+D86RfTDx326vQRCQNCvBWK9-shA@mail.gmail.com>
References: <CAHpKFdAudJ9gZRDObSD94L+D86RfTDx326vQRCQNCvBWK9-shA@mail.gmail.com>
Message-ID: <ECC90361-83A7-44D6-A82D-F2BA8DE1FAA2@imperial.ac.uk>

As a general rule you should QC your data in some way, say by removing all SNPs with more than X% missing data. Allowing 35% missing looks very generous to me; I would personally use a 5% threshold. One way of testing the effect of your missing data is to run the PCA and DAPC multiple times, starting with no missing data and relaxing the threshold a little each time, until your results are unacceptably different from those obtained with the no-missing dataset.
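
A rough sketch of such a threshold sweep in base R, under stated assumptions: the genotype matrix geno is simulated here (60 individuals in two groups, 200 loci, allele counts 0/1/2), NAs are replaced by the locus mean before the PCA, and the helper name pca_pc1 is made up for illustration. With real data, geno would be your own individuals-by-loci matrix.

```r
set.seed(1)
n.ind <- 60
n.loc <- 200

# Simulated allele counts with two groups differing in allele frequency
grp <- rep(1:2, each = n.ind / 2)
p <- ifelse(grp == 1, 0.2, 0.6)
geno <- sapply(seq_len(n.loc), function(j) rbinom(n.ind, size = 2, prob = p))

# 20 fully typed loci; the others get a missing rate between 0 and 40%
na.rate <- c(rep(0, 20), runif(n.loc - 20, 0, 0.4))
mask <- sapply(na.rate, function(r) runif(n.ind) < r)
geno[mask] <- NA

# Proportion of missing entries per locus
prop.na <- colMeans(is.na(geno))

# First principal component using only loci with <= max.missing NAs
pca_pc1 <- function(max.missing) {
  x <- geno[, prop.na <= max.missing, drop = FALSE]
  for (j in seq_len(ncol(x))) {
    x[is.na(x[, j]), j] <- mean(x[, j], na.rm = TRUE)  # mean imputation
  }
  prcomp(x, center = TRUE, scale. = FALSE)$x[, 1]
}

thresholds <- c(0, 0.05, 0.10, 0.20, 0.35)
n.kept <- sapply(thresholds, function(thr) sum(prop.na <= thr))
ref <- pca_pc1(0)  # reference: loci with no missing data

for (i in seq_along(thresholds)) {
  pc1 <- pca_pc1(thresholds[i])
  cat(sprintf("max missing %.2f: %3d loci kept, |cor(PC1, reference)| = %.2f\n",
              thresholds[i], n.kept[i], abs(cor(ref, pc1))))
}
```

The correlation of PC1 with the reference is only one possible measure of "unacceptably different"; comparing group assignments or scatterplots would serve the same purpose.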

Cheers

F



From kirsty.m.medcalf at gmail.com  Wed Sep 23 06:45:15 2015
From: kirsty.m.medcalf at gmail.com (Kirsty Medcalf)
Date: Tue, 22 Sep 2015 21:45:15 -0700
Subject: [adegenet-forum] DAPC
Message-ID: <CAJ5RjHhCOuPfyUwGC3gWjKabjrspCrypWQwvVTuj-jKsqs+6tQ@mail.gmail.com>

Dear Forum

This is my first post, so thank you for your patience. The multivariate
data that I am using contain two categorical groups (V4 and G8) in the
column Family (response variable) and 12 accompanying predictor variables.
The data are called LDA.scores and can be found at the bottom of my Stack
Overflow page by following the link below, which shows my attempted
step-by-step logic and figures.

http://stackoverflow.com/questions/32704902/discriminant-analysis-of-principal-components-and-how-to-graphically-show-the-di

I have been attempting to show graphically the distance of the data points
to their multivariate centroid using DAPC and the function `scatter' in the
`adegenet' package in R. After splitting the two groups into two separate
data frames (code below), I attempt to produce these scatterplots. I
understand this package is intended for the analysis of genetic markers,
but I am also under the impression that all types of multivariate data can
be analysed with it. I tried to manipulate the data but to no avail.

Code used to produce the figure:

# Split the data frame into just V4 and G8
Just.V4 <- LDA.scores[LDA.scores$Family == "V4", ]
Just.G8 <- LDA.scores[LDA.scores$Family == "G8", ]

# Attempt to produce a scatterplot for the categorical factor V4
library(adegenet)
x <- Just.V4[2:13]

# Find the clusters
grp <- find.clusters(x, max.n.clust = 12, na.action = "omit")

The next step is to perform the discriminant analysis of principal
components:

dapc1 <- dapc(x, grp$grp)
scatter(dapc1)

I have tried many different combinations of code and here are some of the
error messages

Error in dapc.data.frame(x, grp1$grp1) : Inconsistent length for grp
Warning in find.clusters.data.frame(as.data.frame(x), ...) :
NAs introduced by coercion
Error in if (n.pca >= N) warning("number of retained PCs of PCA is
greater than N") :
missing value where TRUE/FALSE needed


If anyone can show me how to produce two figures, one for each group,
illustrating the clusters (12 parameters measured) relative to their
multivariate centroid, I would be very grateful. I have followed lots of
tutorials, searched online and read papers, and still do not understand
these error and warning messages.

Thank you if anyone can help.

Best wishes,
Kaikash

From t.jombart at imperial.ac.uk  Wed Sep 23 16:01:30 2015
From: t.jombart at imperial.ac.uk (Jombart, Thibaut)
Date: Wed, 23 Sep 2015 14:01:30 +0000
Subject: [adegenet-forum] how do I know if missing data is affecting PCA
 or	DAPC results
In-Reply-To: <CAHpKFdAudJ9gZRDObSD94L+D86RfTDx326vQRCQNCvBWK9-shA@mail.gmail.com>
References: <CAHpKFdAudJ9gZRDObSD94L+D86RfTDx326vQRCQNCvBWK9-shA@mail.gmail.com>
Message-ID: <2CB2DA8E426F3541AB1907F98ABA6570B12A14C9@icexch-m1.ic.ac.uk>

Dear Ella,

there is no one-size-fits-all answer to this question, but some general ideas may be useful.

Missing data should ideally be i) not too numerous and ii) randomly distributed in the dataset. In a situation like yours, individuals are more precious than markers, so I would discard loci with a majority of NAs, and briefly check the structure of the remaining missing entries.

NAs are basically replaced by the mean allele frequency. This means individuals with NAs will tend to be placed closer to the origin. Also, individuals with similar patterns of NAs will be seen as more similar than they probably are in reality.
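
A tiny base-R illustration of why mean replacement pulls individuals towards the origin (the allele counts in x are made up for the example): once the locus is centred for the PCA, the imputed entry contributes exactly zero.

```r
# Made-up allele counts at one locus for 5 individuals; the third is missing
x <- c(0, 2, NA, 1, 2)

# Replace the NA by the mean of the observed values (1.25 here)
x.imp <- ifelse(is.na(x), mean(x, na.rm = TRUE), x)

# Centred values, as used by a centred PCA
x.cent <- x.imp - mean(x.imp)
x.cent  # -1.25  0.75  0.00 -0.25  0.75
```

An individual with many such entries has many zero coordinates, hence its position near the origin.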

If you really have a big missing value problem, and a lot of NAs you cannot discard, one possibility would be to build a matrix of 1s and 0s where '1' indicates an NA, and do the PCA of this matrix. If you obtain a structure, then this is a sign of a problem: your NAs are not randomly distributed.
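
A sketch of that check in base R, assuming the genotypes are available as an individuals-by-loci matrix. The matrix geno below is simulated, with missing data deliberately concentrated in the first 10 individuals at the first 30 loci, so that the PCA of the NA indicator has something to find.

```r
set.seed(42)
n.ind <- 40
n.loc <- 100

# Simulated allele counts (0/1/2)
geno <- matrix(rbinom(n.ind * n.loc, size = 2, prob = 0.4), nrow = n.ind)

# Background missingness: about 5% of entries, at random
geno[runif(length(geno)) < 0.05] <- NA

# Non-random missingness: individuals 1-10 lack most of loci 1-30
block <- matrix(runif(10 * 30) < 0.8, nrow = 10)
geno[1:10, 1:30][block] <- NA

# Indicator matrix: 1 = missing, 0 = observed
na.mat <- 1 * is.na(geno)

# PCA of the missingness pattern
pca.na <- prcomp(na.mat, center = TRUE, scale. = FALSE)
pc1 <- pca.na$x[, 1]

# Share of variance carried by the first three axes
summary(pca.na)$importance[2, 1:3]

# Mean PC1 score of the affected individuals versus the others
tapply(pc1, rep(c("ind 1-10", "ind 11-40"), c(10, 30)), mean)
```

If the missing entries were spread at random, no axis would stand out and the two groups would have similar mean scores.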

Hope this helps.

Cheers
Thibaut



From t.jombart at imperial.ac.uk  Wed Sep 23 16:03:07 2015
From: t.jombart at imperial.ac.uk (Jombart, Thibaut)
Date: Wed, 23 Sep 2015 14:03:07 +0000
Subject: [adegenet-forum] question about warning message
In-Reply-To: <CAHpKFdBu9mY4w61BhVuKCO9BkFSrh_ciN88=h9-UBo0FjfYuZA@mail.gmail.com>
References: <CAHpKFdBu9mY4w61BhVuKCO9BkFSrh_ciN88=h9-UBo0FjfYuZA@mail.gmail.com>
Message-ID: <2CB2DA8E426F3541AB1907F98ABA6570B12A14D9@icexch-m1.ic.ac.uk>

Dear Ella,

this is hard to tell without the input file. However, you may want to use genind objects for that kind of data, as the dataset is fairly small. genlight is only recommended when basically no other option is available, i.e. with hundreds of thousands or millions of SNPs.

Cheers
Thibaut



From t.jombart at imperial.ac.uk  Wed Sep 23 16:11:10 2015
From: t.jombart at imperial.ac.uk (Jombart, Thibaut)
Date: Wed, 23 Sep 2015 14:11:10 +0000
Subject: [adegenet-forum] DAPC
In-Reply-To: <CAJ5RjHhCOuPfyUwGC3gWjKabjrspCrypWQwvVTuj-jKsqs+6tQ@mail.gmail.com>
References: <CAJ5RjHhCOuPfyUwGC3gWjKabjrspCrypWQwvVTuj-jKsqs+6tQ@mail.gmail.com>
Message-ID: <2CB2DA8E426F3541AB1907F98ABA6570B12A14FF@icexch-m1.ic.ac.uk>

Hi there,

sorry I did not go into the details, but checking for potential mistakes:

#1
 'na.action' is not an argument in find.clusters

#2

x<-Just.V4[2:13]

this looks like a vector to me, not a matrix/data.frame; I am not quite sure what you expect 'grp' to be then.

As for producing scatterplots with varying factors, see argument 'grp' in ?scatter.dapc
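
A few base-R checks that may help narrow this down. This is only a sketch: the data frame LDA.scores below is a made-up stand-in with three predictors (the real data is not reproduced here). Note that if Just.V4 is a data frame, single-index subsetting such as Just.V4[2:13] selects columns and returns a data frame, which can be verified directly.

```r
set.seed(1)

# Hypothetical stand-in for the poster's data: a grouping column plus
# predictors, one of which was read in as text
LDA.scores <- data.frame(
  Family = rep(c("V4", "G8"), each = 6),
  A = rnorm(12),
  B = c(rnorm(1), NA, rnorm(10)),                   # one missing value
  C = c("n/a", as.character(round(rnorm(11), 2))),  # numbers stored as text
  stringsAsFactors = FALSE
)

Just.V4 <- LDA.scores[LDA.scores$Family == "V4", ]
x <- Just.V4[2:4]            # single index on a data frame selects columns

is.data.frame(x)             # is x what we think it is?
num.cols <- sapply(x, is.numeric)
num.cols                     # non-numeric columns lead to coercion warnings

# Convert everything to numeric; entries that cannot be parsed become NA
x[] <- lapply(x, function(col) suppressWarnings(as.numeric(as.character(col))))

# Drop incomplete rows before clustering
x.clean <- x[complete.cases(x), , drop = FALSE]
nrow(x.clean)

# With adegenet loaded, the calls from the original post would then be
# (find.clusters has no 'na.action' argument):
# grp   <- find.clusters(x.clean, max.n.clust = ...)
# dapc1 <- dapc(x.clean, grp$grp)   # grp$grp has one entry per row of x.clean
```

The group vector given to dapc needs one entry per row of the data actually analysed, so clustering and DAPC should be run on the same cleaned object.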

Cheers
Thibaut


From claudiafigueira at ua.pt  Thu Sep 24 12:53:56 2015
From: claudiafigueira at ua.pt (Cláudia Figueira)
Date: Thu, 24 Sep 2015 10:53:56 +0000
Subject: [adegenet-forum] adegenet
Message-ID: <DB5PR02MB1046C256055F4F557DB31512B1430@DB5PR02MB1046.eurprd02.prod.outlook.com>

Hello,


my name is Cláudia, I'm doing my thesis and I recently started working with the adegenet package. Some questions have come up, and I hope you will be able to help me.


My main question is which type of data (genotypes, genetic distances, etc.) is needed for DAPC and Monmonier analysis. Also, when I create my genind object, how can I store my xy information in it?


Best regards,

Cláudia

From t.jombart at imperial.ac.uk  Thu Sep 24 13:03:44 2015
From: t.jombart at imperial.ac.uk (Jombart, Thibaut)
Date: Thu, 24 Sep 2015 11:03:44 +0000
Subject: [adegenet-forum] adegenet
In-Reply-To: <DB5PR02MB1046C256055F4F557DB31512B1430@DB5PR02MB1046.eurprd02.prod.outlook.com>
References: <DB5PR02MB1046C256055F4F557DB31512B1430@DB5PR02MB1046.eurprd02.prod.outlook.com>
Message-ID: <2CB2DA8E426F3541AB1907F98ABA6570B12A16DC@icexch-m1.ic.ac.uk>

Hi Claudia,

please have a look at the documentation. The xy question has been asked on this forum before, and is documented in the basics and spatial tutorial I believe. As for the DAPC, there is a whole tutorial on it, besides the already quite exhaustive documentation of the function itself.

Cheers
Thibaut



From ebowles at ucalgary.ca  Thu Sep 24 21:07:24 2015
From: ebowles at ucalgary.ca (Ella Bowles)
Date: Thu, 24 Sep 2015 13:07:24 -0600
Subject: [adegenet-forum] adegenet not recognizing file extensions
Message-ID: <CAHpKFdAZnxi+P6SPcUfvWTL6SjS0Ta-X_7eEHnhmcqbgkoW3Nw@mail.gmail.com>

Hello,

I'm trying to read files into adegenet and convert to genind objects, but
am getting an error. After I have correctly specified/set the directory
that the files are in I have used the line

data <- read.genetix(system.file("8c9.gtx", package = "adegenet"))

for many different permutations of file type. These include read.genepop for
genepop files and read.structure for structure files. I keep getting the error

 Converting data from GENETIX to a genind object...
Error in if (toupper(.readExt(file)) != "GTX") stop("File extension .gtx
expected") :
  argument is of length zero

The only possible issue I can see is that my files originally had different
extensions (from the populations program of Stacks). They were previously
".genepop" and ".structure", and I changed them manually to the ".gen" and
".stru" suffixes. Can you see a way around this?

With thanks,
Ella


From t.jombart at imperial.ac.uk  Fri Sep 25 17:10:39 2015
From: t.jombart at imperial.ac.uk (Jombart, Thibaut)
Date: Fri, 25 Sep 2015 15:10:39 +0000
Subject: [adegenet-forum] adegenet not recognizing file extensions
In-Reply-To: <CAHpKFdAZnxi+P6SPcUfvWTL6SjS0Ta-X_7eEHnhmcqbgkoW3Nw@mail.gmail.com>
References: <CAHpKFdAZnxi+P6SPcUfvWTL6SjS0Ta-X_7eEHnhmcqbgkoW3Nw@mail.gmail.com>
Message-ID: <2CB2DA8E426F3541AB1907F98ABA6570B12A191D@icexch-m1.ic.ac.uk>

Hi there,

please check the forum archives, this has been answered several times. I think I even mention it in the tutorial. You need to remove 'system.file' and just indicate the path to your file.
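
To illustrate with a small sketch: system.file() only looks inside an installed package, so for a file that is not shipped with that package it returns an empty string, which the reader function cannot use. The package "stats" is used below purely so the snippet runs without adegenet, and the path in my.file is a hypothetical placeholder.

```r
# system.file() searches inside an installed package; a file that is
# not part of the package gives an empty string
p <- system.file("8c9.gtx", package = "stats")
p           # ""
nzchar(p)   # FALSE: there is no usable path here

# Instead, give the path to your own file directly
my.file <- file.path("path", "to", "8c9.gtx")  # hypothetical location
file.exists(my.file)                           # check before reading

# With adegenet loaded and the file in place:
# data <- read.genetix(my.file)
```

The tutorials use system.file() only because their example files are distributed with the package.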

Cheers
Thibaut


==============================
Dr Thibaut Jombart
MRC Centre for Outbreak Analysis and Modelling
Department of Infectious Disease Epidemiology
Imperial College - School of Public Health
Norfolk Place, London W2 1PG, UK
Tel. : 0044 (0)20 7594 3658
http://sites.google.com/site/thibautjombart/
http://sites.google.com/site/therepiproject/
http://adegenet.r-forge.r-project.org/
Twitter: @thibautjombart



From karine.bounan at florimond-desprez.fr  Sun Sep 20 08:53:43 2015
From: karine.bounan at florimond-desprez.fr (karine henry)
Date: Sun, 20 Sep 2015 06:53:43 +0000
Subject: [adegenet-forum] adding labels to compoplot
Message-ID: <4DE9ABC5E00F7544A4E8A6758C0217E4F21BE86A@srv-exchange.florimond-desprez.fr>

Apologies for bothering you all, but I'd like to add individual labels to a compoplot. I ran the following script, which works well:

compoplot(dapc1, subset=1:200,posi="bottomright",
txt.leg=paste("Cluster", 1:4), lab=rownames(d),
ncol=1, xlab="individuals", col=funky(6),cex.lab=.3)

But I have a problem with the font size of the labels. It seems that cex.lab has no effect (whatever the value, the font is big and unreadable). I tried other parameters such as cex and clab, but they did not work either.

Does anybody know how to reduce the font size here?

Sorry to bother you with a "stupid" question.

cheers

Karine



From roman.lustrik at biolitika.si  Sat Sep 26 09:18:43 2015
From: roman.lustrik at biolitika.si (Roman Lustrik)
Date: Sat, 26 Sep 2015 09:18:43 +0200 (CEST)
Subject: [adegenet-forum] adding labels to compoplot
In-Reply-To: <4DE9ABC5E00F7544A4E8A6758C0217E4F21BE86A@srv-exchange.florimond-desprez.fr>
References: <4DE9ABC5E00F7544A4E8A6758C0217E4F21BE86A@srv-exchange.florimond-desprez.fr>
Message-ID: <907527293.209085.1443251923535.JavaMail.zimbra@biolitika.si>

Hi, 

if you look at `?barplot` you'll notice the argument `cex.names`. `barplot` has separate size arguments for the numeric axis annotation (`cex.axis`) and for the bar names (`cex.names`).

This works for me: 


library(adegenet) 

data(microbov) 
dapc3 <- dapc(microbov, n.pca=20, n.da=15) 
compoplot(dapc3, subset = 1:10, cex.names = 0.5, legend = FALSE) 
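
As a rough illustration of the `barplot` arguments involved, the base R sketch below uses made-up membership values (not taken from any DAPC) and sets the two sizes separately. It assumes, as the call above suggests, that `compoplot` hands extra arguments on to `barplot`.

```r
# Made-up membership probabilities for three individuals (columns sum to 1).
memb <- matrix(c(0.7, 0.3,
                 0.2, 0.8,
                 0.5, 0.5),
               nrow = 2,
               dimnames = list(c("Cluster 1", "Cluster 2"),
                               c("ind_A", "ind_B", "ind_C")))

pdf(NULL)  # draw to a null device so the sketch also runs without a display
# cex.names scales the labels under the bars; cex.axis scales the numeric axis.
mids <- barplot(memb, cex.names = 0.5, cex.axis = 1, las = 2)
dev.off()
```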


Cheers, 
Roman 

---- 
In god we trust, all others bring data. 


From kirsty.m.medcalf at gmail.com  Sun Sep 27 01:50:09 2015
From: kirsty.m.medcalf at gmail.com (Kirsty Medcalf)
Date: Sat, 26 Sep 2015 16:50:09 -0700
Subject: [adegenet-forum] DAPC
In-Reply-To: <2CB2DA8E426F3541AB1907F98ABA6570B12A14FF@icexch-m1.ic.ac.uk>
References: <CAJ5RjHhCOuPfyUwGC3gWjKabjrspCrypWQwvVTuj-jKsqs+6tQ@mail.gmail.com>
 <2CB2DA8E426F3541AB1907F98ABA6570B12A14FF@icexch-m1.ic.ac.uk>
Message-ID: <CAJ5RjHgkg=y=9v43Ya5pr9iKHAYOD=aEWXyH0hdE4pWptOeU1A@mail.gmail.com>

Hi Mariana and Jombart,

Thank you for your suggestions. I finally succeeded in creating and
customising the scatter plot. I am now attempting to cross-validate the
DAPC analysis using the code:

x1 <- LDA.scores
mat <- as.matrix(x1, method="mean")
grp <- x1
xval <- xvalDapc(mat, grp2, n.pca.max = 300, training.set = 0.7,
                 result = "groupMean", center = TRUE, scale = FALSE,
                 n.pca = NULL, n.rep = 30, xval.plot = TRUE)

The output is this error:

Error in sort.list(y) : 'x' must be atomic for 'sort.list'
Have you called 'sort' on a list?

I have been playing around with this code for a few days and have
searched online. Would you happen to have any idea why this is
occurring? Thank you if you can help.

Best wishes

Kirsty

Kirsty Medcalf

kirsty.m.medcalf at gmail.com

+447963374030

skype contact: kirsty.medcalf

On Wed, Sep 23, 2015 at 7:11 AM, Jombart, Thibaut <t.jombart at imperial.ac.uk>
wrote:

> Hi there,
>
> sorry I did not go into the details, but checking for potential mistakes:
>
> #1
>  'na.action' is not an argument in find.clusters
>
> #2
>
> x<-Just.V4[2:13]
>
> this looks like a vector to me, not a matrix/data.frame; I am not quite
> sure what you expect 'grp' to be then.
>
> As for producing scatterplots with varying factors, see argument 'grp' in
> ?scatter.dapc
>
> Cheers
> Thibaut
> ==============================
> Dr Thibaut Jombart
> MRC Centre for Outbreak Analysis and Modelling
> Department of Infectious Disease Epidemiology
> Imperial College - School of Public Health
> Norfolk Place, London W2 1PG, UK
> Tel. : 0044 (0)20 7594 3658
> http://sites.google.com/site/thibautjombart/
> http://sites.google.com/site/therepiproject/
> http://adegenet.r-forge.r-project.org/
> Twitter: @thibautjombart
>
>
> ------------------------------
> *From:* adegenet-forum-bounces at lists.r-forge.r-project.org [
> adegenet-forum-bounces at lists.r-forge.r-project.org] on behalf of Kirsty
> Medcalf [kirsty.m.medcalf at gmail.com]
> *Sent:* 23 September 2015 05:45
> *To:* adegenet-forum at lists.r-forge.r-project.org
> *Subject:* [adegenet-forum] DAPC
>
>
> Dear Forum
>
> This is my first post, so I would like to thank you for your patience.  The
> multivariate data that I am using contains two categorical grouping factors
> (V4 or G8) under the column family (response variable) and 12 accompanying
> predictor variables. The data is called LDA.scores and is found at the
> bottom of my Stack Overflow page by following the link below, which shows
> my attempted step-by-step logic and figures.
>
>
> http://stackoverflow.com/questions/32704902/discriminant-analysis-of-principal-components-and-how-to-graphically-show-the-di
>
> I have been attempting to graphically show the distance of data points to
> its multivariate centroid using DAPC analysis and the function `scatter' in
> the `adegenet' package in R. After splitting the two categorical factors
> into two separate data frames (coding below), I attempt to produce these
> scatterplot. I understand this package is used for the analysis of genetic
> markers, however, I am also under the impression that all types of
> multivariate data can be analysed using this package. I tried to manipulate
> the data but to no avail.
>
> Code used to produce figure
> *Split the dataframe into just V4 and G8
>
> Just.V4<-LDA.scores[LDA.scores$Family=="V4",]
> Just.G8 <-LDA.scores[LDA.scores$Family=="G8",]
>
> #Attempt to produce a scatterplot for the categorical factor V4
> library(adegenet)
> x<-Just.V4[2:13]
>
> *Find the clusters
>
> grp<-find.clusters(x, max.n.clust=12, na.action="omit")
>
> The next step is the perform the discriminant analysis of principal
> components
>
>  dapc1<-dapc(x, grp$grp)
>  scatter(dapc1)
>
> I have tried many different combinations of code and here are some of the
> error messages
>
> Error in dapc.data.frame(x, grp1$grp1) : Inconsistent length for grp
> Warning in find.clusters.data.frame(as.data.frame(x), ...) :
> NAs introduced by coercion
> Error in if (n.pca >= N) warning("number of retained PCs of PCA is  greater than N") :
> missing value where TRUE/FALSE needed
>
>
> If anyone has a solution in terms of how to produce two figures for each
> categorical factor which illustrates the clusters (12 parameters measured)
> to its multivariate centroid, then thank so much. I have followed lots of
> tutorials, searched online and read papers, and still do not understand
> these error and warning messages.
>
> Thank you if anyone can help.
>
> Best wishes,
> Kaikash
>
>

From kirsty.m.medcalf at gmail.com  Mon Sep 28 04:44:06 2015
From: kirsty.m.medcalf at gmail.com (Kirsty Medcalf)
Date: Sun, 27 Sep 2015 19:44:06 -0700
Subject: [adegenet-forum] Cross validation using xvalDapc
Message-ID: <CAJ5RjHjCUzLjqovDA1v5kzknpWts+MFpd3YUYKB2VVbzhr-f-w@mail.gmail.com>

Hi

I am attempting to cross-validate my results from a DAPC analysis with a 70%
training set using the function xvalDapc (code below). My data frame is
called LDA.scores. I was wondering if anyone had a solution to the error
message below. I have looked online and through the available tutorials
and still cannot solve this issue.

Error in sort.list(y) : 'x' must be atomic for 'sort.list'
Have you called 'sort' on a list?

Also, I am confused about the argument n.pca.max. My data frame
has two grouping (dependent) factors, 12 predictor variables and 80
observations. Would n.pca.max=80 be correct?

If you are able to help, thank you.

Best wishes,
Kirsty

CODE

#Permute the data
set.seed(999)

#DAPC analysis

windows(width=10, height=7)
x<-LDA.scores[,2:13]
grp1<-find.clusters(x, max.n.clust=12)
dapc1<-dapc(x, grp1$grp)
dapc1

windows(width=10, height=7)
x1 <- LDA.scores
mat <- as.matrix(x1, method="mean")
grp2 <- x1
xval <- xvalDapc(mat, grp2, n.pca.max = 80, training.set = 0.7,
                 result = "groupMean", center = TRUE, scale = FALSE,
                 n.pca = NULL, n.rep = 30, xval.plot = TRUE)


Error in sort.list(y) : 'x' must be atomic for 'sort.list'
Have you called 'sort' on a list?


Kirsty Medcalf

kirsty.m.medcalf at gmail.com

+447963374030

skype contact: kirsty.medcalf

From roman.lustrik at biolitika.si  Mon Sep 28 09:38:08 2015
From: roman.lustrik at biolitika.si (Roman Lustrik)
Date: Mon, 28 Sep 2015 09:38:08 +0200 (CEST)
Subject: [adegenet-forum] Cross validation using xvalDapc
In-Reply-To: <CAJ5RjHjCUzLjqovDA1v5kzknpWts+MFpd3YUYKB2VVbzhr-f-w@mail.gmail.com>
References: <CAJ5RjHjCUzLjqovDA1v5kzknpWts+MFpd3YUYKB2VVbzhr-f-w@mail.gmail.com>
Message-ID: <503574782.220304.1443425888118.JavaMail.zimbra@biolitika.si>

Hi, 

can you make your example reproducible (simulate some data)? 

As for the n.pca.max argument: PCA is a dimension reduction technique, which means it tries to represent data with N variables (columns) using as few principal components as possible. Think of a 3D cloud shaped like a sphere (described by 3 variables; call them x, y and z). PCA will try to show you the data in 2D (two principal components; call them PC1 and PC2). What you expect to see is a circle that explains most of the variation, since a circle is quite a good approximation of a sphere. If you add PC3, you get a sphere again (and all of the variation is explained).

Retaining all components is not practical, so the function will retain at most `n.pca.max` components.
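
The sphere example can be sketched with base R's `prcomp`. The data are simulated, and the object names are made up for illustration; this shows PCA in general, not what `xvalDapc` does internally.

```r
set.seed(1)
# A roughly spherical cloud: 100 points described by three variables.
xyz <- matrix(rnorm(300), ncol = 3, dimnames = list(NULL, c("x", "y", "z")))

pca <- prcomp(xyz)            # centred PCA (prcomp's default)
ncol(pca$x)                   # 3 variables give 3 principal components

# Share of the total variance carried by each component, and the running total.
var_share <- pca$sdev^2 / sum(pca$sdev^2)
cum_share <- cumsum(var_share)

# Keeping only PC1 and PC2 gives the flattened, circle-like 2D view.
scores_2d <- pca$x[, 1:2]
```

The running total in `cum_share` reaches 1 only once all three components are kept.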

Cheers, 
Roman 

---- 
In god we trust, all others bring data. 


From t.jombart at imperial.ac.uk  Mon Sep 28 13:18:21 2015
From: t.jombart at imperial.ac.uk (Jombart, Thibaut)
Date: Mon, 28 Sep 2015 11:18:21 +0000
Subject: [adegenet-forum] Cross validation using xvalDapc
In-Reply-To: <503574782.220304.1443425888118.JavaMail.zimbra@biolitika.si>
References: <CAJ5RjHjCUzLjqovDA1v5kzknpWts+MFpd3YUYKB2VVbzhr-f-w@mail.gmail.com>,
 <503574782.220304.1443425888118.JavaMail.zimbra@biolitika.si>
Message-ID: <2CB2DA8E426F3541AB1907F98ABA6570B12A1B1D@icexch-m1.ic.ac.uk>

Hi Kirsty,

if you have N individuals and P variables, a PCA has at most min(N,P) PCs: in your case, you cannot have more than 12 PCs associated with non-null eigenvalues.

Otherwise, regarding the quoted code, I don't know what this is:
mat <- as.matrix(x1, method="mean")

Also grp2 does not look like a factor at all to me.

The input for xvalDapc should mimic the input of a dapc. Please check the examples.
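
A minimal sketch of inputs in that shape, using simulated data in place of LDA.scores. The column layout (one grouping column named Family followed by 12 numeric predictors) is an assumption based on the earlier posts, and the `xvalDapc` call is left commented out so the sketch runs without adegenet.

```r
set.seed(999)
# Simulated stand-in for the real data: one grouping column followed by
# 12 numeric predictors, 80 observations (layout assumed from the post).
sim.scores <- data.frame(Family = rep(c("V4", "G8"), each = 40),
                         matrix(rnorm(80 * 12), nrow = 80))

x   <- as.matrix(sim.scores[, 2:13])   # numeric variables only
grp <- factor(sim.scores$Family)       # one group label per row

is.atomic(sim.scores)   # FALSE: a data frame is a list, not a group vector
is.factor(grp)          # TRUE

# Upper bound on the number of PCs, min(N, P).
max.pcs <- min(nrow(x), ncol(x))
max.pcs                 # 12

# With adegenet loaded, the call would then take this shape:
# xval <- xvalDapc(x, grp, n.pca.max = max.pcs, training.set = 0.7,
#                  result = "groupMean", center = TRUE, scale = FALSE,
#                  n.pca = NULL, n.rep = 30, xval.plot = TRUE)
```

Passing a whole data frame as the grouping argument would be consistent with the 'x' must be atomic error quoted in this thread, since a data frame is a list.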

Cheers
Thibaut



From t.jombart at imperial.ac.uk  Tue Sep 29 12:01:48 2015
From: t.jombart at imperial.ac.uk (Jombart, Thibaut)
Date: Tue, 29 Sep 2015 10:01:48 +0000
Subject: [adegenet-forum] adegenet not recognizing file extensions
In-Reply-To: <CAHpKFdD-dNTY=HDdmc-d9HfyDXbN=KNhABXsztoBCQ-AZFYwpA@mail.gmail.com>
References: <CAHpKFdAZnxi+P6SPcUfvWTL6SjS0Ta-X_7eEHnhmcqbgkoW3Nw@mail.gmail.com>
 <2CB2DA8E426F3541AB1907F98ABA6570B12A191D@icexch-m1.ic.ac.uk>
 <CAHpKFdAkm+o2E57t+BpApEvBT=UBjtFLGYxK9zLGYszK9TmgMQ@mail.gmail.com>
 <CAHpKFdBKJ1C=MS+7CcaA=XQ0EFVTL1HmK9Bjt6hqd4vu+rwF6A@mail.gmail.com>,
 <CAHpKFdD-dNTY=HDdmc-d9HfyDXbN=KNhABXsztoBCQ-AZFYwpA@mail.gmail.com>
Message-ID: <2CB2DA8E426F3541AB1907F98ABA6570B12A1CFC@icexch-m1.ic.ac.uk>

Hi Ella,

It looks like the path to your file is incorrect  - can you double-check?

Cheers
Thibaut

________________________________
From: bowlese at gmail.com [bowlese at gmail.com] on behalf of Ella Bowles [ebowles at ucalgary.ca]
Sent: 28 September 2015 18:56
To: Jombart, Thibaut
Subject: Re: [adegenet-forum] adegenet not recognizing file extensions

Hi Thibaut,

I'm now using the line (all in one line)
data <- read.structure("C:/Users/Ella Bowles/Desktop/batch_1_8c9.str", n.ind = 186, n.loc = 4099, onerowperind = FALSE, col.lab = 1, col.pop = 2, col.others = NULL, row.marknames = 2, NA.char = "0", ask = TRUE)

for the attached dataset

and I'm still getting the error
Which other optional columns should be read (press 'return' when done)? 1:

 Converting data from a STRUCTURE .stru file to a genind object...

Error in file(file, "r") : cannot open the connection
In addition: Warning message:
In file(file, "r") :
  cannot open file 'C:/Users/Ella Bowles/Desktop/batch_1_8c9.str': No such file or directory

Can you see what may be wrong?

Thank you for your time,
Ella



On Fri, Sep 25, 2015 at 11:54 AM, Ella Bowles <ebowles at ucalgary.ca> wrote:
thank you!


From kirsty.m.medcalf at gmail.com  Tue Sep 29 18:44:08 2015
From: kirsty.m.medcalf at gmail.com (Kirsty Medcalf)
Date: Tue, 29 Sep 2015 09:44:08 -0700
Subject: [adegenet-forum] Cross validation using xvalDapc
Message-ID: <CAJ5RjHjjLZ3v=h9Si8csTd3WQyKnEpcGe_dDFDJdzD7GDyCr3g@mail.gmail.com>

Hi

I am attempting to cross-validate my results from a DAPC analysis with a 70%
training set using the function xvalDapc (code below). My data frame is
called LDA.scores. This is an updated version of a previous post; I have
taken the recommendations into account but am still getting the same
error message. Do I have to change my data frame into a list? If so, what
would be the correct way to transform the data frame into that format?
I was wondering if anyone had a solution to the error message below. I have
looked online and through the available tutorials and still cannot solve
this issue. Words cannot describe my gratitude for any help.

 #Permute the data

set.seed(999)

x<-LDA.scores[,2:13]

   grp1<-find.clusters(x, max.n.clust=12)
   dapc1<-dapc(x, grp1$grp)

#DAPC analysis

windows(width=10, height=7)
x<-LDA.scores[,2:13]
grp1<-find.clusters(x, max.n.clust=12)
dapc1<-dapc(x, grp1$grp)
dapc1

#Loadings plot

contrib <- loadingplot(dapc1$var.contr, axis=2,
                       thres=.07, lab.jitter=1)


#Cross Validation
windows(width=10, height=7)
set.seed(1234)
x1 <- LDA.scores
str(x1)
x1$Matriline<-as.factor(x1$Matriline)
xval <- xvalDapc(x1, grp1, n.pca.max = 2, training.set = 0.7,
                 result = "groupMean", center = TRUE, scale = FALSE,
                 n.pca = NULL, n.rep = 30, xval.plot = TRUE)

Error in sort.list(y) : 'x' must be atomic for 'sort.list'
Have you called 'sort' on a list?

During the DAPC analysis, I chose to retain 2 PCs and 2 LDs, and there
appear to be 3 clusters. Would n.pca.max=2 be correct?

My reproducible data, the logical steps that I took to choose the number of
PCs and LDs to retain, and the number of chosen clusters are available on
Stack Overflow:

http://stackoverflow.com/questions/32704902/discriminant-analysis-of-principal-components-and-how-to-graphically-show-the-di

If it is possible to help me, then thank you

Best wishes,
Kirsty

From t.jombart at imperial.ac.uk  Tue Sep 29 21:10:42 2015
From: t.jombart at imperial.ac.uk (Jombart, Thibaut)
Date: Tue, 29 Sep 2015 19:10:42 +0000
Subject: [adegenet-forum] adegenet not recognizing file extensions
In-Reply-To: <CAHpKFdBUZtRWg-n-96R3dpE6YW-FG+OC-DiYX_9CP0z5swp0ZQ@mail.gmail.com>
References: <CAHpKFdAZnxi+P6SPcUfvWTL6SjS0Ta-X_7eEHnhmcqbgkoW3Nw@mail.gmail.com>
 <2CB2DA8E426F3541AB1907F98ABA6570B12A191D@icexch-m1.ic.ac.uk>
 <CAHpKFdAkm+o2E57t+BpApEvBT=UBjtFLGYxK9zLGYszK9TmgMQ@mail.gmail.com>
 <CAHpKFdBKJ1C=MS+7CcaA=XQ0EFVTL1HmK9Bjt6hqd4vu+rwF6A@mail.gmail.com>
 <CAHpKFdD-dNTY=HDdmc-d9HfyDXbN=KNhABXsztoBCQ-AZFYwpA@mail.gmail.com>
 <2CB2DA8E426F3541AB1907F98ABA6570B12A1CFC@icexch-m1.ic.ac.uk>,
 <CAHpKFdBUZtRWg-n-96R3dpE6YW-FG+OC-DiYX_9CP0z5swp0ZQ@mail.gmail.com>
Message-ID: <2CB2DA8E426F3541AB1907F98ABA6570B12A1F73@icexch-m1.ic.ac.uk>

Hi again,

I think a quadruple check is in order ;)

The extension of the file you sent is wrong. It is a .str.tsv file (?) while the correct extension should read .str. Judging from your command line, I suspect Windows is hiding extensions by default on your computer, so you are assuming there is a file called

C:/Users/Ella Bowles/Desktop/batch_1_8c9.str

where there is none - your file must be, in fact

C:/Users/Ella Bowles/Desktop/batch_1_8c9.str.tsv
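
One way to check this from within R, sketched with base functions only. The temporary folder below is a made-up stand-in for the real Desktop, and the empty file only mimics the file name, not its contents.

```r
# Stand-in for the real Desktop folder (a fresh temporary directory).
folder <- tempfile("Desktop")
dir.create(folder)
file.create(file.path(folder, "batch_1_8c9.str.tsv"))

wanted <- file.path(folder, "batch_1_8c9.str")
exists.before <- file.exists(wanted)   # FALSE: no file has exactly this name

# list.files() shows the full names, hidden extensions included.
found <- list.files(folder, pattern = "^batch_1_8c9")
found                                  # "batch_1_8c9.str.tsv"

# Rename the file so that its extension really is .str.
file.rename(file.path(folder, found), wanted)
exists.after <- file.exists(wanted)    # TRUE
```

Running `list.files()` on the real folder shows the names as R sees them, whatever the file browser displays.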


After correcting the extension it all works for me:


> library(adegenet)
Loading required package: ade4
   /// adegenet 2.0.1 is loaded ////////////

   > overview: '?adegenet'
   > tutorials/doc/questions: 'adegenetWeb()'
   > bug reports/feature resquests: adegenetIssues()


> x=read.structure("batch_1_8c9.str",  n.ind = 186, n.loc = 4099, onerowperind = FALSE, col.lab = 1, col.pop = 2, col.others = NULL, row.marknames = 2, NA.char = "0", ask = TRUE)

 Which other optional columns should be read (press 'return' when done)? 1:

 Converting data from a STRUCTURE .stru file to a genind object...

> x
/// GENIND OBJECT /////////

 // 186 individuals; 4,099 loci; 8,198 alleles; size: 7.7 Mb

 // Basic content
   @tab:  186 x 8198 matrix of allele counts
   @loc.n.all: number of alleles per locus (range: 2-2)
   @loc.fac: locus factor for the 8198 columns of @tab
   @all.names: list of allele names for each locus
   @ploidy: ploidy of each individual  (range: 2-2)
   @type:  codom
   @call: read.structure(file = "batch_1_8c9.str", n.ind = 186, n.loc = 4099,
    onerowperind = FALSE, col.lab = 1, col.pop = 2, col.others = NULL,
    row.marknames = 2, NA.char = "0", ask = TRUE)

 // Optional content
   @pop: population of each individual (group size range: 12-20)


And yes, in case you wonder, window$ is an evil system and probably hates all of us.

Cheers
Thibaut


________________________________
From: bowlese at gmail.com [bowlese at gmail.com] on behalf of Ella Bowles [ebowles at ucalgary.ca]
Sent: 29 September 2015 18:01
To: Jombart, Thibaut
Subject: Re: [adegenet-forum] adegenet not recognizing file extensions

Hi Thibaut,

I checked, double and triple checked, and it is correct. Can you read it in?

I don't think this should be an issue, but the only thing that seems potentially amiss is that after I use the read.structure line R always asks for which other optional columns should be read. I just press enter here, so no other columns are read. Just seems weird that I would be getting a warning that looks like it should be for a filepath. Also, I'm running R v 3.2.2.

Will be good to know if you can read in the file.

Cheers

Thanks

On Tue, Sep 29, 2015 at 4:01 AM, Jombart, Thibaut <t.jombart at imperial.ac.uk<mailto:t.jombart at imperial.ac.uk>> wrote:
Hi Ella,

It looks like the path to your file is incorrect  - can you double-check?

Cheers
Thibaut

________________________________
From: bowlese at gmail.com [bowlese at gmail.com] on behalf of Ella Bowles [ebowles at ucalgary.ca]
Sent: 28 September 2015 18:56
To: Jombart, Thibaut
Subject: Re: [adegenet-forum] adegenet not recognizing file extensions

Hi Thibaut,

I'm now using the line (all in one line)
data <- read.structure("C:/Users/Ella Bowles/Desktop/batch_1_8c9.str", n.ind = 186, n.loc = 4099, onerowperind = FALSE, col.lab = 1, col.pop = 2, col.others = NULL, row.marknames = 2, NA.char = "0", ask = TRUE)

for the attached dataset

and I'm still getting the error
Which other optional columns should be read (press 'return' when done)? 1:

 Converting data from a STRUCTURE .stru file to a genind object...

Error in file(file, "r") : cannot open the connection
In addition: Warning message:
In file(file, "r") :
  cannot open file 'C:/Users/Ella Bowles/Desktop/batch_1_8c9.str': No such file or directory

Can you see what may be wrong?

Thank you for your time,
Ella


On Fri, Sep 25, 2015 at 11:54 AM, Ella Bowles <ebowles at ucalgary.ca> wrote:
thank you!

On Fri, Sep 25, 2015 at 9:10 AM, Jombart, Thibaut <t.jombart at imperial.ac.uk> wrote:
Hi there,

please check the forum archives, this has been answered several times. I think I even mention it in the tutorial. You need to remove 'system.file' and just indicate the path to your file.
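
To illustrate why system.file does not work for your own data: it only searches inside an installed package's directory and returns an empty string for anything it does not find there, which is presumably why the extension check then fails with "argument is of length zero". The sketch below uses the base package stats and a made-up file name so that it runs without adegenet; the commented last line assumes a file called 8c9.gtx in the working directory.

```r
# system.file() searches inside an installed package, not the working directory.
# "no_such_file.gtx" is a made-up name that no package ships.
p <- system.file("no_such_file.gtx", package = "stats")
p            # "" (empty string): nothing was found
nzchar(p)    # FALSE

# For your own data, pass the path directly instead (assumed file name):
# data <- read.genetix("8c9.gtx")
```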

Cheers
Thibaut


==============================
Dr Thibaut Jombart
MRC Centre for Outbreak Analysis and Modelling
Department of Infectious Disease Epidemiology
Imperial College - School of Public Health
Norfolk Place, London W2 1PG, UK
Tel. : 0044 (0)20 7594 3658
http://sites.google.com/site/thibautjombart/
http://sites.google.com/site/therepiproject/
http://adegenet.r-forge.r-project.org/
Twitter: @thibautjombart


________________________________
From: adegenet-forum-bounces at lists.r-forge.r-project.org [adegenet-forum-bounces at lists.r-forge.r-project.org] on behalf of Ella Bowles [ebowles at ucalgary.ca]
Sent: 24 September 2015 20:07
To: adegenet-forum at lists.r-forge.r-project.org
Subject: [adegenet-forum] adegenet not recognizing file extensions

Hello,

I'm trying to read files into adegenet and convert to genind objects, but am getting an error. After I have correctly specified/set the directory that the files are in I have used the line

data <- read.genetix(system.file("8c9.gtx", package = "adegenet"))

for many different permutations of file type. These include read.genepop for genepop files and read.structure for structure files. I keep getting the error

 Converting data from GENETIX to a genind object...
Error in if (toupper(.readExt(file)) != "GTX") stop("File extension .gtx expected") :
  argument is of length zero

The possible issue I can see is that my files originally had different extensions (from the populations program of Stacks). They were previously ".genepop" and ".structure", and I manually changed them to the ".gen" and ".stru" suffixes. Can you see a way around this?

With thanks,
Ella

--
Ella Bowles
PhD Candidate
Biological Sciences
University of Calgary

e-mail: ebowles at ucalgary.ca, bowlese at gmail.com
website: http://ellabowlesphd.wordpress.com/



_______________________________________________
adegenet-forum mailing list
adegenet-forum at lists.r-forge.r-project.org
https://lists.r-forge.r-project.org/cgi-bin/mailman/listinfo/adegenet-forum



From cabreracano.1 at buckeyemail.osu.edu  Sun Sep 27 05:26:27 2015
From: cabreracano.1 at buckeyemail.osu.edu (Cabrera Cano, Antonio)
Date: Sun, 27 Sep 2015 03:26:27 +0000
Subject: [adegenet-forum] help with genind2hierfstat
Message-ID: <29238C81-DC8E-44F7-8648-804DBBDA9293@buckeyemail.osu.edu>

Hi adegenet forum,
I am having some trouble trying to run this command.
data<-genind2hierfstat(nancycats)

I am using adegenet 2.0.0.
Does anyone know about this problem?

thank you very much

From t.jombart at imperial.ac.uk  Wed Sep 30 10:46:28 2015
From: t.jombart at imperial.ac.uk (Jombart, Thibaut)
Date: Wed, 30 Sep 2015 08:46:28 +0000
Subject: [adegenet-forum] help with genind2hierfstat
In-Reply-To: <29238C81-DC8E-44F7-8648-804DBBDA9293@buckeyemail.osu.edu>
References: <29238C81-DC8E-44F7-8648-804DBBDA9293@buckeyemail.osu.edu>
Message-ID: <2CB2DA8E426F3541AB1907F98ABA6570B12A2082@icexch-m1.ic.ac.uk>

Hi there, 

this function has been removed from adegenet, as hierfstat is now meant to work natively with genind objects.
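
A short sketch of what this could look like in practice, assuming both adegenet (2.0.0 or later) and a recent hierfstat are installed. As far as I know, the converter now lives in hierfstat under the same name, and functions such as basic.stats accept a genind object directly; please check the hierfstat documentation for your version.

```r
# Sketch only: runs only if both adegenet and hierfstat are installed.
if (requireNamespace("adegenet", quietly = TRUE) &&
    requireNamespace("hierfstat", quietly = TRUE)) {
  data(nancycats, package = "adegenet")

  # Explicit conversion, now provided by hierfstat rather than adegenet
  cats_hf <- hierfstat::genind2hierfstat(nancycats)
  print(dim(cats_hf))  # one row per individual; population column plus loci

  # Passing the genind object directly, without converting first
  bs <- hierfstat::basic.stats(nancycats)
  print(bs$overall)
}
```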

Other major changes are outlined here:
https://raw.githubusercontent.com/thibautjombart/adegenet/master/ChangeLog

Cheers
Thibaut



From kirsty.m.medcalf at gmail.com  Wed Sep 30 20:19:29 2015
From: kirsty.m.medcalf at gmail.com (Kirsty Medcalf)
Date: Wed, 30 Sep 2015 11:19:29 -0700
Subject: [adegenet-forum] Cross validation using xvalDapc
In-Reply-To: <CAJ5RjHjjLZ3v=h9Si8csTd3WQyKnEpcGe_dDFDJdzD7GDyCr3g@mail.gmail.com>
References: <CAJ5RjHjjLZ3v=h9Si8csTd3WQyKnEpcGe_dDFDJdzD7GDyCr3g@mail.gmail.com>
Message-ID: <CAJ5RjHg_HvVOPJCXKheaa1fawemL4Xc_o-G8Q8sjrf2bT8+1kA@mail.gmail.com>

Hi,

Firstly, I would like to thank you for your previous recommendations; they
were greatly appreciated. The solution was not obvious at first, but I
persevered. Thank you again, because I am fairly new to R.

Kind regards to this forum

Kirsty

xval <- xvalDapc(x, grp1$grp, n.pca.max = 2, training.set = 0.7,
                 result = "groupMean", center = TRUE, scale = FALSE,
                 n.pca = NULL, n.rep = 30, xval.plot = TRUE)

$`Cross-Validation Results`
   n.pca   success
1      1 0.6111111
2      1 0.6666667
3      1 0.6666667
4      1 0.6111111
5      1 0.6190476
6      1 0.6190476
7      1 0.6111111
8      1 0.5634921
9      1 0.6111111
10     1 0.6111111
11     1 0.6190476
12     1 0.6666667
13     1 0.5079365
14     1 0.6190476
15     1 0.6190476
16     1 0.6666667
17     1 0.6111111
18     1 0.6111111
19     1 0.4603175
20     1 0.6111111
21     1 0.6111111
22     1 0.6666667
23     1 0.5634921
24     1 0.6666667
25     1 0.6666667
26     1 0.5079365
27     1 0.6111111
28     1 0.6190476
29     1 0.6111111
30     1 0.6666667

$`Median and Confidence Interval for Random Chance`
     2.5%       50%     97.5%
0.2411765 0.3303922 0.4377002

$`Mean Successful Assignment by Number of PCs of PCA`
        1
0.6124339

$`Number of PCs Achieving Highest Mean Success`
[1] "1"

$`Root Mean Squared Error by Number of PCs of PCA`
        1
0.3907175

$`Number of PCs Achieving Lowest MSE`
[1] "1"

$DAPC
#################################################
# Discriminant Analysis of Principal Components #
#################################################
class: dapc
$call: dapc.data.frame(x = x, grp = grp, n.pca = n.pca, n.da = n.da)

$n.pca: 1 first PCs of PCA used
$n.da: 1 discriminant functions saved
$var (proportion of conserved variance): 0.605

$eig (eigenvalues): 58.23

  vector    length content
1 $eig      1      eigenvalues
2 $grp      80     prior group assignment
3 $prior    3      prior group probabilities
4 $assign   80     posterior group assignment
5 $pca.cent 12     centring vector of PCA
6 $pca.norm 12     scaling vector of PCA
7 $pca.eig  12     eigenvalues of PCA

  data.frame    nrow ncol content
1 $tab          80   1    retained PCs of PCA
2 $means        3    1    group means
3 $loadings     1    1    loadings of variables
4 $ind.coord    80   1    coordinates of individuals (principal components)
5 $grp.coord    3    1    coordinates of groups
6 $posterior    80   3    posterior membership probabilities
7 $pca.loadings 12   1    PCA loadings of original variables
8 $var.contr    12   1    contribution of original variables
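
For anyone hitting the same error: the working call at the top of this message passes grp1$grp, whereas the earlier attempt (quoted below) passed grp1, the whole list returned by find.clusters. The base-R sketch below mocks that list to show the difference; the values are made up, and the assumption that xvalDapc sorts or factors its grp argument internally is a guess based on the error message, not taken from the adegenet source.

```r
# Mock of the list returned by find.clusters(); the values are made up,
# only the structure (a list with a factor in $grp) matters here.
grp1 <- list(
  Kstat = NULL,
  stat  = c(K = 100),
  grp   = factor(c(1, 2, 3, 1, 2, 3)),
  size  = c(2L, 2L, 2L)
)

is.atomic(grp1)      # FALSE: the whole list is not a valid grouping vector
is.factor(grp1$grp)  # TRUE: a factor, as used in the working call

# Trying to sort a list fails, which matches the kind of error reported
res <- try(sort.list(grp1), silent = TRUE)
inherits(res, "try-error")  # TRUE
```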


Kirsty Medcalf

kirsty.m.medcalf at gmail.com

+447963374030

skype contact: kirsty.medcalf

On Tue, Sep 29, 2015 at 9:44 AM, Kirsty Medcalf <kirsty.m.medcalf at gmail.com>
wrote:

> Hi
>
> I am attempting to cross-validate my results from DAPC analysis with a 70
> % training set using the function xvalDapc (code below).  My data frame is
> called LDA.scores. This is an updated version of a previous post after
> taking into account the recommendations, but I am still getting the same
> error message.  Do I have to change my data frame into a list? If so, what
> would be the correct way to transform the data frame into this format?
> If this is possible, I was wondering if anyone had a solution for how to
> solve this error message (below).  I have looked online and through
> available tutorials and still cannot solve this issue.  Words cannot
> describe my gratitude if this is possible.
>
>  #Permute the data
>
> set.seed(999)
>
> x<-LDA.scores[,2:13]
>
>    grp1<-find.clusters(x, max.n.clust=12)
>    dapc1<-dapc(x, grp1$grp)
>
> #DAPC analysis
>
> windows(width=10, height=7)
> x<-LDA.scores[,2:13]
> grp1<-find.clusters(x, max.n.clust=12)
> dapc1<-dapc(x, grp1$grp)
> dapc1
>
> #Loadings plot
>
> contrib <- loadingplot(dapc1$var.contr, axis=2,
>                        thres=.07, lab.jitter=1)
>
>
> #Cross Validation
> windows(width=10, height=7)
> set.seed(1234)
> x1 <- LDA.scores
> str(x1)
> x1$Matriline<-as.factor(x1$Matriline)
> xval <- xvalDapc(x1, grp1, n.pca.max = 2, training.set = 0.7,
>                  result = "groupMean", center = TRUE, scale = FALSE,
>                  n.pca = NULL, n.rep = 30, xval.plot = TRUE)
>
> Error in sort.list(y) : 'x' must be atomic for 'sort.list'
> Have you called 'sort' on a list?
>
> During the DAPC analysis,  I chose to retain 2 PCs and 2 LD's, and there
> appear to be 3 clusters. Would n.pca.max=2 be correct?
>
> My reproducible data, the logical steps that I took to choose the number of
> PC's and LD's to retain,  and the number of chosen clusters are available on
> stack overflow
>
>
> http://stackoverflow.com/questions/32704902/discriminant-analysis-of-principal-components-and-how-to-graphically-show-the-di
>
> If it is possible to help me, then thank you
>
> Best wishes,
> Kirsty
>
>
>
>
>
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.r-forge.r-project.org/pipermail/adegenet-forum/attachments/20150930/2dabb7af/attachment.html>

From caitiecollins at gmail.com  Wed Sep 30 23:14:46 2015
From: caitiecollins at gmail.com (Caitlin Collins)
Date: Wed, 30 Sep 2015 22:14:46 +0100
Subject: [adegenet-forum] Cross validation using xvalDapc
In-Reply-To: <CAJ5RjHg_HvVOPJCXKheaa1fawemL4Xc_o-G8Q8sjrf2bT8+1kA@mail.gmail.com>
References: <CAJ5RjHjjLZ3v=h9Si8csTd3WQyKnEpcGe_dDFDJdzD7GDyCr3g@mail.gmail.com>
 <CAJ5RjHg_HvVOPJCXKheaa1fawemL4Xc_o-G8Q8sjrf2bT8+1kA@mail.gmail.com>
Message-ID: <CAMon0MCREp-U88txhJskmp=O-x5Q-fF9mc=3w0L_SaOOCL8-5Q@mail.gmail.com>

Hi Kirsty,

Now that you seem to have cross-validation working, I was wondering which,
if any, of your questions still remain to be answered. Are you still
looking for help on any of the questions you posted?

If you are still looking for help, I was wondering if you could offer me a
clarification:
I took a look at the post you made to StackOverflow, copied your data, and
tried to run through the code in your e-mail. But I got stuck because I am
not sure where this came from: x1$Matriline. It didn't seem to be one of
the variables in the "mydat" dataset at the bottom of your post that you
said contained the LDA.scores data you had been working with...

Please let us know what questions or problems you are still running into.

Thanks,
Caitlin.
