From mw14533.2014 at my.bristol.ac.uk  Wed Nov 23 21:05:17 2016
From: mw14533.2014 at my.bristol.ac.uk (Max Williams)
Date: Wed, 23 Nov 2016 20:05:17 +0000
Subject: [adegenet-forum] PCoA
Message-ID: <CAPhBhmHHp=oNLjYouMQ-ksJ4YrisLOt6=OWU4bjYqekoJNVt1w@mail.gmail.com>

Dear All

I have recently used PCoA in adegenet to produce the following graph, using
the s.class function (shown in attachment). Once I had defined my PCoA data as
the variable "pca.kelp", I passed it to s.class as follows:

s.class(pca.cows$li, fac=pop(microbov), col=funky(15))

s.class(pca.kelp$li, fac=pop(genind1),
   col=transp(funky(15),.6),
   axesell=FALSE, cstar=0, cpoint=3)

I am fairly new to R statistics and would appreciate your time.

Many thanks
Max
-------------- next part --------------
A non-text attachment was scrubbed...
Name: pcoc,3axes.jpg
Type: image/jpeg
Size: 16081 bytes
Desc: not available
URL: <http://lists.r-forge.r-project.org/pipermail/adegenet-forum/attachments/20161123/5879ad2e/attachment.jpg>

From roman.lustrik at biolitika.si  Thu Nov 24 08:48:03 2016
From: roman.lustrik at biolitika.si (Roman Luštrik)
Date: Thu, 24 Nov 2016 08:48:03 +0100 (CET)
Subject: [adegenet-forum] PCoA
In-Reply-To: <CAPhBhmHHp=oNLjYouMQ-ksJ4YrisLOt6=OWU4bjYqekoJNVt1w@mail.gmail.com>
References: <CAPhBhmHHp=oNLjYouMQ-ksJ4YrisLOt6=OWU4bjYqekoJNVt1w@mail.gmail.com>
Message-ID: <1646837837.194970.1479973683914.JavaMail.zimbra@biolitika.si>

Hello Max, 

Is there a question here? If you have problems with your code, it's best to provide a reproducible example. Feel free to use any of the datasets that ship with adegenet, or you could simulate your own data.

Cheers, 
Roman 

---- 
In god we trust, all others bring data. 


_______________________________________________ 
adegenet-forum mailing list 
adegenet-forum at lists.r-forge.r-project.org 
https://lists.r-forge.r-project.org/cgi-bin/mailman/listinfo/adegenet-forum 

From j.kro365 at gmail.com  Wed Nov  2 18:41:10 2016
From: j.kro365 at gmail.com (John Kronenberger)
Date: Wed, 02 Nov 2016 17:41:10 -0000
Subject: [adegenet-forum] Squared distance between groups
Message-ID: <CAMEPmDHuER8ZwOcXAB9iYQK_U8BkGC1wZT39f4a=uPLQiWGtsw@mail.gmail.com>

Hey there,

I'm aware of the mstree argument, but is there any way to output the
squared distance between groups? I'm interested in the relative
similarity between each pair of groups.

Best,

John

-- 
John A. Kronenberger
Master's Student and Teaching Assistant
Graduate Degree Program in Ecology
Department of Biology
Colorado State University
johnkronenberger.weebly.com

From roman.lustrik at biolitika.si  Thu Nov 24 10:27:47 2016
From: roman.lustrik at biolitika.si (Roman Luštrik)
Date: Thu, 24 Nov 2016 10:27:47 +0100 (CET)
Subject: [adegenet-forum] Fwd:  PCoA
In-Reply-To: <CAPhBhmF8tZMK1YF41nt9q1KWhhBcQg3MKssNyL_EYaROB2vPXw@mail.gmail.com>
References: <CAPhBhmHHp=oNLjYouMQ-ksJ4YrisLOt6=OWU4bjYqekoJNVt1w@mail.gmail.com>
 <1646837837.194970.1479973683914.JavaMail.zimbra@biolitika.si>
 <CAPhBhmF8tZMK1YF41nt9q1KWhhBcQg3MKssNyL_EYaROB2vPXw@mail.gmail.com>
Message-ID: <1753299968.286478.1479979667007.JavaMail.zimbra@biolitika.si>

Max is asking how to remove the labels.

This example was taken from the adegenet vignette (see page 57) and it gets you pretty close to what you want.

library(adegenet) 

data(microbov) 

mb <- scaleGen(microbov, NA.method = "mean") 
pca.cows <- dudi.pca(mb, center = FALSE, scale = FALSE, scannf = FALSE, nf = 3) 

par(mfrow = c(2, 1)) 
s.class(pca.cows$li, pop(microbov),
        xax = 1, yax = 3, col = transp(funky(15), .6), axesell = FALSE,
        cstar = 0, cpoint = 3, grid = FALSE)

colorplot(pca.cows$li[c(1,3)], pca.cows$li, transp=TRUE, cex=3, xlab="PC 1", ylab="PC 3") 
title("PCA of microbov dataset\naxes 1-3") 
abline(v=0,h=0,col="grey", lty=2) 
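
To remove the group labels specifically, one option is the clabel argument of s.class (from ade4, which adegenet loads). A minimal sketch on the same microbov example follows; the legend call is an optional extra, and its position and size are arbitrary choices made for illustration.

```r
library(adegenet)   # also attaches ade4, which provides s.class() and dudi.pca()

data(microbov)
mb <- scaleGen(microbov, NA.method = "mean")
pca.cows <- dudi.pca(mb, center = FALSE, scale = FALSE, scannf = FALSE, nf = 3)

# clabel = 0 suppresses the group labels drawn at the group centroids
s.class(pca.cows$li, fac = pop(microbov),
        col = transp(funky(15), .6),
        axesell = FALSE, cstar = 0, cpoint = 3, clabel = 0)

# Optional: identify the groups with a legend instead of labels
legend("topright", legend = levels(pop(microbov)),
       col = funky(15), pch = 20, cex = 0.6)
```

With your own data, the same clabel = 0 argument should be all that needs adding to the existing s.class call.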


Cheers, 
Roman 

---- 
In god we trust, all others bring data. 


From: "mw14533 2014" <mw14533.2014 at my.bristol.ac.uk> 
To: "Roman Luštrik" <roman.lustrik at biolitika.si> 
Sent: Thursday, November 24, 2016 9:39:45 AM 
Subject: Re: [adegenet-forum] PCoA 

Hello Roman 

The question I meant to ask was: how do I remove the labels from the graph? Sorry, I seem to have been quite unhelpful and forgot to put that in my initial email.

Many thanks 
Max 



-------------- next part --------------
A non-text attachment was scrubbed...
Name: Rplot.jpeg
Type: image/jpeg
Size: 140648 bytes
Desc: not available
URL: <http://lists.r-forge.r-project.org/pipermail/adegenet-forum/attachments/20161124/b5a2f5f3/attachment-0001.jpeg>

From thibautjombart at gmail.com  Fri Nov 25 19:04:03 2016
From: thibautjombart at gmail.com (Thibaut Jombart)
Date: Fri, 25 Nov 2016 18:04:03 +0000
Subject: [adegenet-forum] Squared distance between groups
In-Reply-To: <CAMEPmDHuER8ZwOcXAB9iYQK_U8BkGC1wZT39f4a=uPLQiWGtsw@mail.gmail.com>
References: <CAMEPmDHuER8ZwOcXAB9iYQK_U8BkGC1wZT39f4a=uPLQiWGtsw@mail.gmail.com>
Message-ID: <CANPRA+qW14faWAEMCCga=A5k1JjMDwsQkHU6keNWE00+fWKPow@mail.gmail.com>

I would go for:
- dist.genpop
- various measures of pairwise Fst in hierfstat
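
As a rough illustration of the first suggestion, here is a minimal sketch using the microbov dataset that ships with adegenet. The choice of method = 1 (Nei's distance) is an assumption made for the example; dist.genpop offers several other distances.

```r
library(adegenet)

data(microbov)

# Pool individuals into populations, then compute pairwise distances
cows.pop <- genind2genpop(microbov)
d <- dist.genpop(cows.pop, method = 1)   # method 1 = Nei's distance

# View the result as a square population-by-population matrix
round(as.matrix(d), 3)
```

The result is a standard "dist" object, so it can be passed on to functions such as hclust or as.matrix.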

Best
Thibaut

--
Dr Thibaut Jombart
Lecturer, Department of Infectious Disease Epidemiology, Imperial College London
Head of RECON: repidemicsconsortium.org
sites.google.com/site/thibautjombart/
github.com/thibautjombart
Twitter: @TeebzR



From biz.sheedy at gmail.com  Fri Nov 25 10:44:16 2016
From: biz.sheedy at gmail.com (Biz Sheedy)
Date: Fri, 25 Nov 2016 18:44:16 +0900
Subject: [adegenet-forum] Discrepancy in NA counts
Message-ID: <CAHbbAD1yCtXajYwDd8nNJQLvs9aictO4pRymQQynk4b7mzcjeA@mail.gmail.com>

Dear All,

I am trying to read SNP data from Stacks into adegenet. I have tried
read.structure and read.genepop but they both give (the same) NA counts
that are higher than expected. Using read.table on the structure-formatted
file (with "ind" and "pop" inserted into the first two columns of row one)
gave the expected number of missing data.

I looked at a single population subset (both the original and the converted
data) in excel and found a locus where in the original data, all nine
individuals were "3", but in the converted data one individual was "NA".
The loci before and after this one both matched/were correct.

I am not sure what I have missed for this to happen; my R skills are
beginner at best. Any help with reading the data in correctly would be
greatly appreciated!

Thank you,
Elizabeth


R version 3.3.2
adegenet version 2.0.1

Data: 44 individuals, diploid, 4279 loci.

all<-read.structure("all_batch_1.stru", NA.char="0")

Total cells in excel: 376552
After read.structure/genepop: 44*8558=376552

0s in excel: 3952
0s after read.table; length(which(X==0)): 3952
NA after read.structure/genepop; sum(is.na(all$tab)): 4008
Difference: 56

Subset Chichi
Total cells: 77022
After read.structure/genepop: 9*8558=77022

0s in excel: 742
NA after read.structure/genepop; sum(is.na(chi$tab)): 756
Difference: 14


-- 
4-1-1 Amakubo
Department of Botany
National Museum of Nature and Science
Tsukuba, Ibaraki 305-0005
Japan

biz.sheedy at gmail.com

From panda143526 at gmail.com  Fri Nov 25 16:23:47 2016
From: panda143526 at gmail.com (Da Pan)
Date: Fri, 25 Nov 2016 16:23:47 +0100
Subject: [adegenet-forum] question on scaleGen()
Message-ID: <CADdQ8WqpAmMTZuJH9ZdvpVFBqjbGk=awKRGA8bd32iJS9mtboA@mail.gmail.com>

Dear Thibaut and adegenet users,
Thank you for your time and help.
I probably have some more naive questions about adegenet.
I am attempting a PCA analysis on my SNP dataset with the following
arguments:

test <- read.structure("batch_1.str",n.ind = 17, n.loc =  12451, col.lab =
1, col.pop = 2, row.marknames = 1, NA.char = "0")

test2 <- scaleGen(test, NA.method = "mean")

After this, R shows:

Warning message:
In .local(x, ...) : Some scaling values are null.
 Corresponding alleles are removed.

I checked pop(test), which returned:

 1  3  5  7  9 11 13 15 17 19 21 23 25 27 29 31 33
02 02 04 04 06 06 07 07 19 19 38 38 39 39 46 46 46
Levels: 02 04 06 07 19 38 39 46

How can I solve this problem?

thanks in advance

Best wishes,
Da
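
A hedged note on the warning quoted above: it appears to arise when some alleles have a standard deviation of zero across individuals (for example, alleles that are fixed in the sample), so they cannot be scaled and are dropped. The following base R sketch uses a hypothetical frequency table to illustrate the idea; it is not scaleGen's actual code, and the 1e-10 threshold is an assumption for the example.

```r
# Toy allele-frequency table: 4 individuals x 3 alleles (hypothetical values)
freq <- cbind(a1 = c(0, 0.5, 1, 0.5),
              a2 = c(1, 0.5, 0, 0.5),
              a3 = c(1, 1, 1, 1))          # a3 is fixed: no variation

scaled <- scale(freq, center = TRUE, scale = TRUE)

# Per-column standard deviations used for scaling
sds <- attr(scaled, "scaled:scale")

# Columns whose scaling value is (numerically) zero cannot be scaled
null_sd <- sds < 1e-10
dropped <- names(sds)[null_sd]

# What remains once those columns are removed
kept <- scaled[, !null_sd, drop = FALSE]
```

If this is indeed the cause, the warning is informative rather than fatal: the returned table simply has fewer allele columns than the original object.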

From arsalan at pobox.com  Sun Nov 27 22:03:04 2016
From: arsalan at pobox.com (Arsalan Emami-Khoyi)
Date: Mon, 28 Nov 2016 00:33:04 +0330
Subject: [adegenet-forum] scatter plot
Message-ID: <1480280584.2062924.800481737.58E61574@webmail.messagingengine.com>

Dear Dr. Jombart and other users,

I am not a very experienced user of adegenet. I am just following the
tutorials, and at the final stage, when I use scatter(dapc1), all I get is
a few squares with numbers (screenshot attached) showing my clusters; the
individuals are not shown. I am wondering if anybody can help me solve the
issue? Apologies for the basic question.
Many thanks in advance


-------------- next part --------------
A non-text attachment was scrubbed...
Name: scatter.jpg
Type: image/jpeg
Size: 76470 bytes
Desc: not available
URL: <http://lists.r-forge.r-project.org/pipermail/adegenet-forum/attachments/20161128/9979173e/attachment-0001.jpg>

From roman.lustrik at biolitika.si  Mon Nov 28 08:27:19 2016
From: roman.lustrik at biolitika.si (Roman Luštrik)
Date: Mon, 28 Nov 2016 08:27:19 +0100 (CET)
Subject: [adegenet-forum] scatter plot
In-Reply-To: <1480280584.2062924.800481737.58E61574@webmail.messagingengine.com>
References: <1480280584.2062924.800481737.58E61574@webmail.messagingengine.com>
Message-ID: <922431229.570707.1480318039300.JavaMail.zimbra@biolitika.si>

Hi, 

Can you share the code you are using? On which dataset are you running it?
If you are following the code from the vignette (along with the dataset that comes with adegenet), have you tried clearing your R session and running the script in a clean environment?

Cheers, 
Roman 

---- 
In god we trust, all others bring data. 



From roman.lustrik at biolitika.si  Mon Nov 28 08:31:07 2016
From: roman.lustrik at biolitika.si (Roman Luštrik)
Date: Mon, 28 Nov 2016 08:31:07 +0100 (CET)
Subject: [adegenet-forum] Discrepancy in NA counts
In-Reply-To: <CAHbbAD1yCtXajYwDd8nNJQLvs9aictO4pRymQQynk4b7mzcjeA@mail.gmail.com>
References: <CAHbbAD1yCtXajYwDd8nNJQLvs9aictO4pRymQQynk4b7mzcjeA@mail.gmail.com>
Message-ID: <1051082989.570757.1480318267345.JavaMail.zimbra@biolitika.si>

Hi, 

Can you share a subset of the dataset? It's hard to pinpoint where things might be going wrong without some data in hand.

Cheers, 
Roman 

---- 
In god we trust, all others bring data. 



From Oliver.Berry at csiro.au  Mon Nov 28 08:31:16 2016
From: Oliver.Berry at csiro.au (Oliver.Berry at csiro.au)
Date: Mon, 28 Nov 2016 07:31:16 +0000
Subject: [adegenet-forum] scatter plot
In-Reply-To: <1480280584.2062924.800481737.58E61574@webmail.messagingengine.com>
References: <1480280584.2062924.800481737.58E61574@webmail.messagingengine.com>
Message-ID: <bbec899f876e45688a041a61cfa52254@exch3-cdc.nexus.csiro.au>

Hi Arsalan,

It looks like you may have your labels (clab) set to 1 (or some other positive value), so that they are covering up your points. Perhaps your symbols (cex) are also set to either zero or a very small value. You might also have cstar set to zero, so you won't see any of the lines joining points to the centroid for that group.

I suggest trying higher values for your cex and cstar. You could also set clab=0 to remove the labels and see if they're obscuring the points. I prefer to use a legend rather than direct labels.
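
These settings can be sketched on the microbov dataset that ships with adegenet. This is only an illustration: the values of n.pca and n.da are arbitrary choices made to skip the interactive prompts, and the cex value is a guess at a reasonable point size.

```r
library(adegenet)

data(microbov)

# n.pca and n.da are set explicitly to avoid the interactive prompts;
# the values are arbitrary and chosen only for this example
dapc1 <- dapc(microbov, n.pca = 20, n.da = 3)

scatter(dapc1,
        clabel = 0,      # no labels drawn on top of the points
        cex = 2,         # larger point symbols
        cstar = 1,       # draw lines from points to group centroids
        legend = TRUE)   # identify groups with a legend instead
```

If the points appear once clabel is set to 0, the labels were hiding them.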

Cheers,

Olly
Dr Oliver Berry
Senior Research Scientist | CSIRO Ocean and Atmosphere
Team Leader | Coastal Ecosystems and Modelling
Adjunct Senior Lecturer, School of Animal Biology, The University of Western Australia
Phone: +61 8 9333 6584 | 0400 747 197 |Fax: +61 8 9333 6499
oliver.berry at csiro.au | www.csiro.au/en/Research/OandA
Address: Centre for Environment and Life Sciences, Cnr Underwood Ave. & Brockway Rd, Floreat, WA, 6014
Postal: Private Mail Bag 5, Wembley, WA, 6913



From roman.lustrik at biolitika.si  Mon Nov 28 10:03:39 2016
From: roman.lustrik at biolitika.si (Roman Luštrik)
Date: Mon, 28 Nov 2016 10:03:39 +0100 (CET)
Subject: [adegenet-forum] Discrepancy in NA counts
In-Reply-To: <CAHbbAD2rmrv95vvcZTWHVhmV2qiXELvW099Gs2-z7kYNKf32sw@mail.gmail.com>
References: <CAHbbAD1yCtXajYwDd8nNJQLvs9aictO4pRymQQynk4b7mzcjeA@mail.gmail.com>
 <1051082989.570757.1480318267345.JavaMail.zimbra@biolitika.si>
 <CAHbbAD2rmrv95vvcZTWHVhmV2qiXELvW099Gs2-z7kYNKf32sw@mail.gmail.com>
Message-ID: <656891479.571300.1480323819198.JavaMail.zimbra@biolitika.si>

Hi, 

I think the problem is that adegenet, for consistency, adds NAs to accommodate the extra alleles present for a particular locus. Take for example C_KH1238 (bottom row in the example pasted below).
In the raw file, it has missing values for locus 1378_53, but this locus has three alleles, ergo 3 NAs and not 2. I can't go through all the NAs right now, but I think there's a pretty good chance this is what is causing the discrepancy between what you see in "excel" and in adegenet.

         1369_41.11 1372_14.22 1372_14.24 1373_9.44 1373_9.24 1377_42.44 1377_42.24 1378_53.22 1378_53.24 1378_53.44 1379_10.33 1379_10.13 1382_37.33
...
C_KH1238          0          1          0         1         0          1          0         NA         NA         NA          1          0          1
# Notice 3 NAs, one for each available allele of 1378_53, not just two (as expected for a diploid).
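
The counting effect described above can be sketched in base R. This is a toy illustration with made-up genotypes for a single hypothetical locus ("loc1"), not adegenet's actual conversion code: one individual with a missing genotype contributes two zeros in the raw two-column layout, but one NA per allele in the allele-count table.

```r
# Toy raw data: one diploid locus, two allele columns, 0 = missing
raw <- data.frame(
  ind = c("A", "B", "C", "D"),
  a1  = c(2, 2, 3, 0),
  a2  = c(2, 4, 4, 0)
)

# Alleles observed at this locus (excluding the missing-data code)
alleles <- sort(setdiff(unique(c(raw$a1, raw$a2)), 0))

# Build an individuals x alleles count table; a missing genotype
# becomes NA in every allele column of the locus
counts <- t(sapply(seq_len(nrow(raw)), function(i) {
  g <- c(raw$a1[i], raw$a2[i])
  if (any(g == 0)) {
    rep(NA_integer_, length(alleles))
  } else {
    tabulate(match(g, alleles), nbins = length(alleles))
  }
}))
dimnames(counts) <- list(raw$ind, paste0("loc1.", alleles))

n_zero_cells <- sum(raw[, c("a1", "a2")] == 0)   # zeros in the raw layout
n_na_cells   <- sum(is.na(counts))               # NAs in the count table
```

Here the locus has three alleles, so the single missing genotype yields 2 zeros in the raw data but 3 NAs in the table, which is the kind of mismatch described in the message.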


Here is the code I used to explore this: 

library(adegenet) 

xy <- read.table("Sub_batch_1.stru", header = TRUE, sep = "\t") 
xy <- xy[, c(-1, -2)] 
table(as.matrix(xy)) 

# 0 1 2 3 4 
# 16 467 618 760 867 


xy <- read.structure("Sub_batch_1.stru", NA.char = "0",
                     n.ind = 44, n.loc = 31, onerowperind = FALSE,
                     col.lab = 1, col.pop = 2, row.marknames = 1,
                     sep = "\t", col.others = 0)

xy <- tab(xy) 
xy[grepl("C_KH1238", rownames(xy)), grepl("1378_53", colnames(xy))] 

Cheers, 
Roman 

---- 
In god we trust, all others bring data. 


From: "Biz Sheedy" <biz.sheedy at gmail.com> 
To: "Roman Luštrik" <roman.lustrik at biolitika.si> 
Sent: Monday, November 28, 2016 9:11:39 AM 
Subject: Re: [adegenet-forum] Discrepancy in NA counts 

My apologies. First time posting to a forum so I am a little unsure of things. I have attached a subset of the data, which includes the locus that I saw had problems. 

In this case there are 31 loci with 16 zeroes counted (excel), and 20 NAs counted (adegenet). The additional NAs occur in locus 1401_25. 

Thanks so much, 
Elizabeth 


From biz.sheedy at gmail.com  Mon Nov 28 11:00:53 2016
From: biz.sheedy at gmail.com (Biz Sheedy)
Date: Mon, 28 Nov 2016 19:00:53 +0900
Subject: [adegenet-forum] Discrepancy in NA counts
In-Reply-To: <656891479.571300.1480323819198.JavaMail.zimbra@biolitika.si>
References: <CAHbbAD1yCtXajYwDd8nNJQLvs9aictO4pRymQQynk4b7mzcjeA@mail.gmail.com>
 <1051082989.570757.1480318267345.JavaMail.zimbra@biolitika.si>
 <CAHbbAD2rmrv95vvcZTWHVhmV2qiXELvW099Gs2-z7kYNKf32sw@mail.gmail.com>
 <656891479.571300.1480323819198.JavaMail.zimbra@biolitika.si>
Message-ID: <CAHbbAD1SSpXideucLg-wM-Di28YOXny832=MVqFZueE9TGJ5yw@mail.gmail.com>

Thanks for looking into this.

Something I did differently from the code you provided was that I only
answered the prompts for the read.structure function. This meant I did not
use sep="\t", and the number of alleles was 62 instead of 72, which I think
should be comparable to the excel count. Following the code you provided,
'is.na' finds 23 NAs (instead of 20 NAs at 62 alleles and 16 zeroes in
excel).

Your explanation makes sense to me for the additional three NAs in
adegenet, but I still don't understand how in locus 1401_25 the data for
two individuals (C_KH1059 and M_KH1834) changed from being homozygous for
"3" to being "NA"?

I would really appreciate any further help on this.

Thanks again,
Elizabeth



-- 
4-1-1 Amakubo
Department of Botany
National Museum of Nature and Science
Tsukuba, Ibaraki 305-0005
Japan

biz.sheedy at gmail.com

From roman.lustrik at biolitika.si  Mon Nov 28 13:40:24 2016
From: roman.lustrik at biolitika.si (Roman Luštrik)
Date: Mon, 28 Nov 2016 13:40:24 +0100 (CET)
Subject: [adegenet-forum] Discrepancy in NA counts
In-Reply-To: <CAHbbAD1SSpXideucLg-wM-Di28YOXny832=MVqFZueE9TGJ5yw@mail.gmail.com>
References: <CAHbbAD1yCtXajYwDd8nNJQLvs9aictO4pRymQQynk4b7mzcjeA@mail.gmail.com>
 <1051082989.570757.1480318267345.JavaMail.zimbra@biolitika.si>
 <CAHbbAD2rmrv95vvcZTWHVhmV2qiXELvW099Gs2-z7kYNKf32sw@mail.gmail.com>
 <656891479.571300.1480323819198.JavaMail.zimbra@biolitika.si>
 <CAHbbAD1SSpXideucLg-wM-Di28YOXny832=MVqFZueE9TGJ5yw@mail.gmail.com>
Message-ID: <237564767.572607.1480336824232.JavaMail.zimbra@biolitika.si>

Hi Elizabeth, 

it looks like the code is tripped up by locus names that are numeric. This has happened before in another function. Until we fix this, you can rename your loci so that each name starts with a letter.

Here is an excerpt from the genind object showing that both samples carry allele 33 at locus 1401_25:

         X1401_25.13 X1401_25.33 X1403_13.11 X1403_13.13 X1403_13.33 X1404_17.13 X1404_17.33 X1404_17.11
C_KH1059           0           1           1           0           0           0           1           0
M_KH1834           0           1           1           0           0           1           0           0
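
The renaming workaround could be done in R before importing the file. This is only a minimal sketch, assuming the marker names sit in the first, tab-separated row of the file; the helper `prefix_numeric_names` and the toy file are hypothetical and serve as illustration, not as part of adegenet:

```r
# Hypothetical helper: prepend a letter to any name that starts with a digit.
prefix_numeric_names <- function(x, prefix = "L") {
  ifelse(grepl("^[0-9]", x), paste0(prefix, x), x)
}

# Toy stand-in for a tab-separated STRUCTURE file (made-up content),
# with the marker names in the first row.
stru <- tempfile(fileext = ".stru")
writeLines(c("ind\tpop\t1401_25\t1403_13",
             "C_KH1059\t1\t3\t1",
             "C_KH1059\t1\t3\t1"), stru)

# Rewrite only the header row; the genotype rows are left untouched.
lines <- readLines(stru)
header <- strsplit(lines[1], "\t", fixed = TRUE)[[1]]
lines[1] <- paste(prefix_numeric_names(header), collapse = "\t")
writeLines(lines, stru)

readLines(stru, n = 1)
```

Names that already start with a letter (here "ind" and "pop") are left as they are, so the sketch can be run on a file more than once without stacking prefixes.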


Cheers, 
Roman 


---- 
In god we trust, all others bring data. 


From: "Biz Sheedy" <biz.sheedy at gmail.com> 
To: "Roman Luštrik" <roman.lustrik at biolitika.si>
Cc: adegenet-forum at lists.r-forge.r-project.org 
Sent: Monday, November 28, 2016 11:00:53 AM 
Subject: Re: [adegenet-forum] Discrepancy in NA counts 

Thanks for looking into this. 

One thing I did differently from the code you provided was that I only answered the prompts of the read.structure function. This meant I did not use sep="\t", and the number of alleles was 62 instead of 72, which I think should be comparable to the excel count. Following the code you provided, is.na finds 23 NAs (instead of 20 NAs at 62 alleles and 16 zeroes in excel).

Your explanation makes sense to me for the additional three NAs in adegenet, but I still don't understand how in locus 1401_25 the data for two individuals (C_KH1059 and M_KH1834) changed from being homozygous for "3" to being "NA"? 

I would really appreciate any further help on this. 

Thanks again, 
Elizabeth 


On 28 November 2016 at 18:03, Roman Luštrik < roman.lustrik at biolitika.si > wrote:


Hi, 

I think the problem is that adegenet, for consistency, sets NA in every allele column of a locus when a genotype is missing. Take for example C_KH1238 (bottom row in the example pasted below).
In the raw file, it has missing values for locus 1378_53, but this locus has three alleles, ergo 3 NAs and not 2. I can't go through all the NAs right now, but I think there's a pretty good chance this is what is causing the discrepancy between what you see in "excel" and in adegenet.

         1369_41.11 1372_14.22 1372_14.24 1373_9.44 1373_9.24 1377_42.44 1377_42.24 1378_53.22 1378_53.24 1378_53.44 1379_10.33 1379_10.13 1382_37.33
...
C_KH1238          0          1          0         1         0          1          0         NA         NA         NA          1          0          1
# notice 3 NAs for all available alleles for 1378_53, not just two (as expected for diploid)
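
To see why the two counts differ, here is a small base R sketch on a made-up allele table laid out like the output of tab(), with one column per locus.allele. The toy data, the column names, and the assumption that a missing genotype is NA in every allele column of its locus are for illustration only:

```r
# Toy allele counts for two diploid individuals (made-up data, not from the file).
# Locus L1 has three alleles, locus L2 has two.
tab <- matrix(c( 1,  1,  0, 2, 0,
                NA, NA, NA, 1, 1),
              nrow = 2, byrow = TRUE,
              dimnames = list(c("ind1", "ind2"),
                              c("L1.2", "L1.4", "L1.3", "L2.1", "L2.3")))

# Locus of each column: everything before the last dot.
locus <- sub("\\.[^.]*$", "", colnames(tab))

# NA cells, as counted by sum(is.na(...)) on the allele table.
na_cells <- sum(is.na(tab))

# Missing genotypes: inspect only one column per locus.
missing_genotypes <- sum(is.na(tab[, !duplicated(locus), drop = FALSE]))

# Zeros expected in a diploid file with two rows per individual.
raw_zeros <- 2 * missing_genotypes

c(na_cells = na_cells, missing_genotypes = missing_genotypes, raw_zeros = raw_zeros)
```

Here ind2 is missing at L1, which would appear as 2 zeros in the raw file but as 3 NAs in the allele table, because L1 has three allele columns.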


Here is the code I used to explore this: 

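library(adegenet)

# Read the raw file as a plain table and tabulate the allele codes (0 = missing)
xy <- read.table("Sub_batch_1.stru", header = TRUE, sep = "\t")
xy <- xy[, c(-1, -2)]  # drop the first two columns (individual and population)
table(as.matrix(xy))

#   0   1   2   3   4
#  16 467 618 760 867

# Import the same file as a genind object
xy <- read.structure("Sub_batch_1.stru", NA.char = "0",
                     n.ind = 44, n.loc = 31, onerowperind = FALSE,
                     col.lab = 1, col.pop = 2, row.marknames = 1,
                     sep = "\t", col.others = 0)

# Extract the allele table and look at C_KH1238 at locus 1378_53
xy <- tab(xy)
xy[grepl("C_KH1238", rownames(xy)), grepl("1378_53", colnames(xy))]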

Cheers, 
Roman 

---- 
In god we trust, all others bring data. 


From: "Biz Sheedy" < biz.sheedy at gmail.com > 
To: "Roman Luštrik" < roman.lustrik at biolitika.si >
Sent: Monday, November 28, 2016 9:11:39 AM 
Subject: Re: [adegenet-forum] Discrepancy in NA counts 

My apologies. First time posting to a forum so I am a little unsure of things. I have attached a subset of the data, which includes the locus that I saw had problems. 

In this case there are 31 loci with 16 zeroes counted (excel), and 20 NAs counted (adegenet). The additional NAs occur in locus 1401_25. 

Thanks so much, 
Elizabeth 

On 28 November 2016 at 16:31, Roman Luštrik < roman.lustrik at biolitika.si > wrote:

BQ_BEGIN

Hi, 

can you share (a subset of) the dataset? It's hard to pinpoint where things might be going wrong without some data in hand.

Cheers, 
Roman 

---- 
In god we trust, all others bring data. 


From: "Biz Sheedy" < biz.sheedy at gmail.com > 
To: adegenet-forum at lists.r-forge.r-project.org 
Sent: Friday, November 25, 2016 10:44:16 AM 
Subject: [adegenet-forum] Discrepancy in NA counts 

Dear All, 

I am trying to read SNP data from Stacks into adegenet. I have tried read.structure and read.genepop but they both give (the same) NA counts that are higher than expected. Using read.table on the structure-formatted file (with "ind" and "pop" inserted into the first two columns of row one) gave the expected number of missing data. 

I looked at a single population subset (both the original and the converted data) in excel and found a locus where in the original data, all nine individuals were "3", but in the converted data one individual was "NA". The loci before and after this one both matched/were correct. 

I am not sure what I have missed for this to happen, my R skills are beginner at best. Any help with reading the data in correctly would be greatly appreciated! 

Thank you, 
Elizabeth 


R version 3.3.2 
adegenet version 2.0.1 

Data: 44 individuals, diploid, 4279 loci. 

all<-read.structure("all_batch_1.stru", NA.char="0") 

Total cells in excel: 376552 
After read.structure/genepop: 44*8558=376552 

0s in excel: 3952 
0s after read.table; length(which(X==0)): 3952 
NA after read.structure/genepop; sum(is.na(all$tab)): 4008
Difference: 56 

Subset Chichi 
Total cells: 77022 
After read.structure/genepop: 9*8558=77022 

0s in excel: 742 
NA after read.structure/genepop; sum(is.na(chi$tab)): 756
Difference: 14 


-- 
4-1-1 Amakubo 
Department of Botany 
National Museum of Nature and Science 
Tsukuba, Ibaraki 305-0005 
Japan 

biz.sheedy at gmail.com 


BQ_END



From perezalquicira at gmail.com  Wed Nov 30 14:28:52 2016
From: perezalquicira at gmail.com (Jessica Perez Alquicira)
Date: Wed, 30 Nov 2016 13:28:52 -0000
Subject: [adegenet-forum] tetraploid DAPC
Message-ID: <CA+OwOyVM_scL6E-K7S7U3ao6h367Q69gHV_TxVj-tWFRE0MCCg@mail.gmail.com>

Hi, I would like to run a DAPC on tetraploid data. My file is in STRUCTURE
format.
I have not found this information in the manual. Could you please let me
know how I could do that?

Best