[adegenet-forum] problems with na.replace and values for n.pca & n.da

D. Magdalena Sorger dm.sorger at gmail.com
Fri Oct 9 21:00:56 CEST 2015


Dear all,

I am trying to construct a DAPC scatterplot with adegenet and have three
questions that after consulting online resources, tutorials, etc. still
haven't been answered definitively.


My data set consists of (diploid) microsatellite data (8 markers) for 20
ant colonies of 9-28 workers each (mean=15), 301 workers total. I have
missing data in 7 spots (i.e. individuals with missing data at one or more
loci). My first question is about reading in the genepop file:

*1) What is the proper command for reading in my file in regards to missing
data? *

I had replaced all missing data ("0000") with "NA" in the genepop file and
used the below code assuming that it would recognize my missing data as NA
(first line of code) and replace missing values with means (second line):


*msts_m2<-read.genepop("BOR-Od_m2_301w.gen",missing="NA")*
*na.replace(msts_m2,"mean", quiet=FALSE)*

However, when I run this code, it informs me that it replaced 119 missing
values. This seems like too many, as it should have replaced only 7.
I'm not sure why this isn't working; see the output below.


*OUTPUT:*
*> na.replace(msts_m2,"mean", quiet=FALSE) *

* Replaced 119 missing values *

*   #####################*
*   ### Genind object ### *
*   #####################*
*- genotypes of individuals - *

*S4 class:  genind*
*@call: read.genepop(file = "BOR-Od_m2_301w.gen", missing = "NA")*

*@tab:  301 x 93 matrix of genotypes*

*@ind.names: vector of  301 individual names*
*@loc.names: vector of  8 locus names*
*@loc.nall: number of alleles per locus*
*@loc.fac: locus factor for the  93 columns of @tab*
*@all.names: list of  8 components yielding allele names for each locus*
*@ploidy:  2*
*@type:  codom*

*Optionnal contents: *
*@pop:  factor giving the population of each individual*
*@pop.names:  factor giving the population of each individual*

*@other: - empty -*
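
One possible explanation for the mismatch, offered as an assumption rather than a confirmed diagnosis: the @tab slot holds one column per allele (93 columns here), so a single missing genotype would show up as NA in every allele column of that locus. Counting NA cells would then give a much larger number than counting missing genotypes. The base-R sketch below uses a small hypothetical table (the names tab and loc_fac are made up for illustration) to show how the three counts differ:

```r
# Toy allele-count table: 4 individuals, 2 loci (hypothetical data).
# Locus A has 3 alleles and locus B has 2, laid out like a genind @tab slot.
tab <- matrix(
  c( 1,  1,  0,  2,  0,
     2,  0,  0,  1,  1,
    NA, NA, NA,  0,  2,   # individual 3: genotype missing at locus A
     0,  1,  1, NA, NA),  # individual 4: genotype missing at locus B
  nrow = 4, byrow = TRUE,
  dimnames = list(paste0("ind", 1:4),
                  c("A.1", "A.2", "A.3", "B.1", "B.2"))
)
loc_fac <- factor(c("A", "A", "A", "B", "B"))  # locus of each column

# NA cells: one per allele column of every missing genotype
n_na_cells <- sum(is.na(tab))

# Missing genotypes: one per individual-by-locus combination
na_by_locus <- sapply(levels(loc_fac), function(l) {
  rowSums(is.na(tab[, loc_fac == l, drop = FALSE])) > 0
})
n_missing_genotypes <- sum(na_by_locus)

# Individuals with at least one missing locus
n_ind_with_na <- sum(rowSums(na_by_locus) > 0)

print(c(cells = n_na_cells,
        genotypes = n_missing_genotypes,
        individuals = n_ind_with_na))
```

On the real object, the same three counts could presumably be taken from msts_m2@tab and msts_m2@loc.fac, the slots listed in the output above.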





My second and third questions relate to the selection of the # of PCs and
the # of discriminant functions to retain. It seems that each time I make
slight changes to these numbers, the output changes vastly and so I want to
make sure I input the proper numbers.

First I use the find.clusters function. I set max.n.pca to 20 here since
I have 20 colonies:

*clusters<-find.clusters(msts_m2,max.n.pca=20)*

When asked for the number of PCs to retain I select 30, which is the level
at which the points seem to level off. The BIC graph looks nothing like the
graph in the vignette (it does not level off, but below a certain level
shows a downward zigzag pattern), so I choose 20 clusters given that I have 20
colonies.

*dapc_m2<-dapc(msts_m2,clusters$grp)*
[image: Inline image 1][image: Inline image 2]
Next, when asked for PCs to retain I again choose the level at which the
points start to level off (30) but when asked for discriminant functions to
retain, I'm at a loss. I have about 18 relatively constantly decreasing
bars. I usually choose the point between the first few ones and the rest
where there is some kind of bigger (sometimes arbitrary) break and use that
number (in this case: 4):

*2) How to choose the appropriate number for PCs to retain and discr.
functions to retain?*
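
For context, two quantities frame this choice. The cumulative variance curve shows how much of the total variance the first k PCs keep, and a discriminant analysis can return at most min(K - 1, n.pca) discriminant functions for K groups, which would be consistent with seeing fewer than 20 bars for 20 clusters. The base-R sketch below illustrates both on simulated data. The matrix X, the 80% threshold, and the helper max_n_da are assumptions made for illustration, not adegenet defaults:

```r
set.seed(42)

# Hypothetical stand-in for an allele table: 100 individuals x 30 columns
X <- matrix(rnorm(100 * 30), nrow = 100)
pca <- prcomp(X)

# Cumulative proportion of variance kept by the first k PCs
cum_var <- cumsum(pca$sdev^2) / sum(pca$sdev^2)

# Smallest number of PCs keeping at least 80% of the variance
# (the 80% threshold is arbitrary and only for illustration)
n_pca_80 <- which(cum_var >= 0.80)[1]
print(n_pca_80)

# Upper bound on the number of discriminant functions for a given
# number of groups and retained PCs
max_n_da <- function(n_groups, n_pca) min(n_groups - 1, n_pca)

print(max_n_da(20, 30))  # 19
print(max_n_da(20, 7))   # 7
```

Within that upper bound, how many discriminant functions to keep remains a judgment call based on the eigenvalue barplot.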


*[image: Inline image 3][image: Inline image 4]*



*best.n.pca<-a.score(dapc_m2)*
*temp<-optim.a.score(dapc_m2)*

After running the optim.a.score function I receive a graph that tells me at
the top "optimal number of PCs". This is the number I put into my last line
of code (below) for n.pca, and for n.da I choose the number I used when
prompted earlier:

*3) Are these the correct numbers to use for this last line of code? *

*dapc_m2<-dapc(msts_m2,n.pca=7,n.da=4)*
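
To make concrete what an a-score is meant to capture, here is a simplified stand-in written with MASS::lda rather than adegenet. It compares the reassignment rate for the real groups with the average rate for randomly permuted groups, for several numbers of retained PCs. The simulated data and the helpers reassign_rate and simple_a_score are hypothetical, and the pooled calculation is an assumption that may differ from what a.score and optim.a.score do internally:

```r
library(MASS)  # assumption: the MASS package is installed

set.seed(1)
grp <- factor(rep(c("a", "b", "c"), each = 20))

# Hypothetical data: 60 individuals, 30 variables, groups differ on two of them
X <- matrix(rnorm(length(grp) * 30), nrow = length(grp))
X[grp == "b", 1] <- X[grp == "b", 1] + 3
X[grp == "c", 2] <- X[grp == "c", 2] + 3
pcs <- prcomp(X)$x

# Proportion of individuals reassigned to their own group by an LDA
# fitted on the supplied PC scores
reassign_rate <- function(scores, groups) {
  fit <- lda(scores, grouping = groups)
  mean(predict(fit, scores)$class == groups)
}

# Simplified a-score: observed reassignment rate minus the mean rate
# obtained with randomly permuted group labels
simple_a_score <- function(k, n_sim = 20) {
  scores <- pcs[, seq_len(k), drop = FALSE]
  observed <- reassign_rate(scores, grp)
  random <- mean(replicate(n_sim, reassign_rate(scores, sample(grp))))
  observed - random
}

n_pcs <- c(2, 5, 10, 30)
a_scores <- sapply(n_pcs, simple_a_score)
names(a_scores) <- n_pcs
print(round(a_scores, 2))
```

Values close to 0 would indicate that the real groups are reassigned no better than random ones, which is the overfitting that retaining too many PCs can produce.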



-- 
Magdalena Sorger
____________
Department of Applied Ecology
North Carolina State University
127 David Clark Labs, Box 7617
100 Eugene Brooks Ave.
Raleigh, NC 27695, USA
# 919-513-7464
dmsorger at ncsu.edu
*www.theantlife.com* <http://www.theantlife.com>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: image.png
Type: image/png
Size: 19507 bytes
Desc: not available
URL: <http://lists.r-forge.r-project.org/pipermail/adegenet-forum/attachments/20151009/15162ba8/attachment-0004.png>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: image.png
Type: image/png
Size: 15234 bytes
Desc: not available
URL: <http://lists.r-forge.r-project.org/pipermail/adegenet-forum/attachments/20151009/15162ba8/attachment-0005.png>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: image.png
Type: image/png
Size: 18215 bytes
Desc: not available
URL: <http://lists.r-forge.r-project.org/pipermail/adegenet-forum/attachments/20151009/15162ba8/attachment-0006.png>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: image.png
Type: image/png
Size: 19074 bytes
Desc: not available
URL: <http://lists.r-forge.r-project.org/pipermail/adegenet-forum/attachments/20151009/15162ba8/attachment-0007.png>