[adegenet-forum] read.genepop (adegenet 2.0.0 with R v. 3.2.1)

Thu Jul 16 13:25:45 CEST 2015

Fixed now:
https://github.com/thibautjombart/adegenet/issues/71#issuecomment-121790358

And readily available in the devel version:

install.packages("devtools")
library(devtools)
install_github("thibautjombart/adegenet")
library("adegenet")

Cheers
Thibaut

________________________________________
From: adegenet-forum-bounces at lists.r-forge.r-project.org [adegenet-forum-bounces at lists.r-forge.r-project.org] on behalf of Jombart, Thibaut [t.jombart at imperial.ac.uk]
Sent: 16 July 2015 11:50
To: Zhian Kamvar; adegenet-forum at lists.r-forge.r-project.org
Subject: Re: [adegenet-forum] read.genepop (adegenet 2.0.0 with R v. 3.2.1)

Looks like a bug indeed. Thanks for spotting it. Will fix today.

Cheers
Thibaut

________________________________________
From: adegenet-forum-bounces at lists.r-forge.r-project.org [adegenet-forum-bounces at lists.r-forge.r-project.org] on behalf of Zhian Kamvar [zkamvar at gmail.com]
Sent: 16 July 2015 01:35
To: adegenet-forum at lists.r-forge.r-project.org
Subject: Re: [adegenet-forum] read.genepop (adegenet 2.0.0 with R v. 3.2.1)

This smells like a bug. After poking around, it is indeed one, and it affects both read.fstat and read.genepop (read.genetix and read.structure still work):

> obj <- read.genepop(system.file("files/nancycats.gen",package="adegenet"))

 Converting data from a Genepop .gen file to a genind object...

File description:  Genotypes of cats from 17 colonies of Nancy (France)

...done.

> obj
/// GENIND OBJECT /////////

 // 237 individuals; 9 loci; 111 alleles; size: 138.5 Kb

 // Basic content
   @tab:  237 x 111 matrix of allele counts
   @loc.n.all: number of alleles per locus (range: 8-18)
   @loc.fac: locus factor for the 111 columns of @tab
   @all.names: list of allele names for each locus
   @ploidy: ploidy of each individual  (range: 2-2)
   @type:  codom
   @call: read.genepop(file = system.file("files/nancycats.gen", package = "adegenet"))

 // Optional content
   @pop: population of each individual (group size range: 9-23)
> summary(obj)

 # Total number of genotypes:  237

 # Population sample sizes:
 1  2  3  4  5  6  7  8  9 10 11 12 13 14 15 16 17
10 22 12 23 15 11 14 10  9 11 20 14 13 17 11 12 13

 # Number of alleles per locus:
 fca8 fca23 fca43 fca45 fca77 fca78 fca90 fca96 fca37
   17    11    10    10    12     8    12    13    18

 # Number of alleles per population:
 1  2  3  4  5  6  7  8  9 10 11 12 13 14 15 16 17
37 53 50 67 48 56 43 54 43 46 73 53 44 62 42 40 37

 # Percentage of missing data:
[1] 0

 # Observed heterozygosity:
     fca8     fca23     fca43     fca45     fca77     fca78     fca90     fca96     fca37
0.6118143 0.6666667 0.6793249 0.6455696 0.6329114 0.5654008 0.6497890 0.5949367 0.4514768

 # Expected heterozygosity:
     fca8     fca23     fca43     fca45     fca77     fca78     fca90     fca96     fca37
0.8803076 0.7928751 0.7953319 0.7930531 0.8702576 0.6884669 0.8157881 0.7767630 0.6062686

This will be reported and fixed.

Cheers,
Zhian

> On Jul 15, 2015, at 11:16 , adegenet-forum-request at lists.r-forge.r-project.org wrote:
>
> On closer inspection, it appears the new version stores missing data as
> alleles (i.e. *.00 columns in @tab), so using tab() to replace the allele
> counts doesn't work. For example, x@tab <- tab(x, NA.method="mean") does
> nothing because missing data is stored as normal data. Here's a workaround
> I created. It is probably not the most clever method, but it fixed my
> problem. Hopefully this helps someone!
> Paul
>
> # Fix missing values to reflect deprecated option, missing = "mean"
> x@tab <- x@tab[,-grep("\\.00",colnames(x@tab))] #remove "00" alleles
> rep <- gsub("([^\\.]+)\\.\\d+","\\1",colnames(x@tab)) #locus names
> loci <- unique(x@loc.fac) #unique locus names
> x@loc.fac <- as.factor(rep)
> for (i in 1:length(x@all.names))
>  if ("00" %in% x@all.names[[i]]) #remove "00" from allele names
>    x@all.names[[i]] <- x@all.names[[i]][-which(x@all.names[[i]]=="00")]
> for (i in 1:length(x@loc.n.all)) #remove "00" from allele counts
>  x@loc.n.all[[i]] <- length(x@all.names[[i]])
> for (i in 1:length(loci)) { #replace missing data with mean allele counts
>  df <- data.frame(x@tab[,which(loci[i] == rep)]) #df, alleles for one locus
>  for (j in 1:nrow(df)) {
>    if (sum(df[j,]) == 0) {
>      for (k in 1:length(df[j,])) { #mean allele counts from rows with data
>        df[j,k] <- round(mean( df[which(apply(df,1,sum) != 0),k] ))
>      }
>      x@tab[j,which(loci[i] == rep)] <- as.numeric(df[j,])
>    }
>  }
> }
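
For readers who want to see the idea of the workaround in isolation, here is a minimal base-R sketch on a toy allele-count matrix. It does not use adegenet at all. The matrix, its dimnames, and the variable names (tab_buggy, tab_fixed) are made up for illustration, and the column naming "locus.allele" with ".00" marking missing data is assumed from the description above. Unlike the workaround, this sketch does not round the imputed means.

```r
# Toy allele-count matrix mimicking the buggy import (hypothetical data):
# the "L1.00" column flags a missing genotype instead of NA.
tab_buggy <- matrix(
  c(2, 0, 0,
    1, 1, 0,
    0, 0, 2,   # ind3 is missing at locus L1
    0, 2, 0),
  nrow = 4, byrow = TRUE,
  dimnames = list(paste0("ind", 1:4), c("L1.135", "L1.137", "L1.00"))
)

# 1. Drop the ".00" columns with a logical mask (drop = FALSE keeps a matrix).
is_missing_col <- grepl("\\.00$", colnames(tab_buggy))
tab_fixed <- tab_buggy[, !is_missing_col, drop = FALSE]

# 2. Within each locus, a row whose remaining counts sum to zero is missing.
locus <- sub("\\.[^.]+$", "", colnames(tab_fixed))
for (l in unique(locus)) {
  cols <- which(locus == l)
  missing_rows <- rowSums(tab_fixed[, cols, drop = FALSE]) == 0
  tab_fixed[missing_rows, cols] <- NA
}

# 3. Replace each NA with the mean of its column, computed from typed rows.
col_means <- colMeans(tab_fixed, na.rm = TRUE)
na_idx <- which(is.na(tab_fixed), arr.ind = TRUE)
tab_fixed[na_idx] <- col_means[na_idx[, "col"]]

print(tab_fixed)
```

Using a logical mask from grepl() rather than a negative grep() index means no columns are lost when the object contains no ".00" column at all, since x[, -integer(0)] selects zero columns.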

_______________________________________________
adegenet-forum mailing list
adegenet-forum at lists.r-forge.r-project.org
https://lists.r-forge.r-project.org/cgi-bin/mailman/listinfo/adegenet-forum