[adegenet-forum] Difference in assignment between versions of adegenet (DAPC)

Wed Dec 14 15:07:41 CET 2016

Hi Thibaut and Co

We are a team that has used the DAPC assignment method in adegenet (versions 1.4-1 and 1.4-2) for several earlier studies. We are now running into a problem: the new version, adegenet 2.0.1, assigns "old" individuals, which we used in those earlier studies, differently from the earlier versions of the package.

We use SNP data, with alleles coded by three digits. Our gen-files look like the example below.
______________________________________________
GenePop file, with 5 samples & 96 loci
cgpGmo-S1017
cgpGmo-S1018a
cgpGmo-S1026
cgpGmo-S1070
cgpGmo-S1095
cgpGmo-S1103
POP
DAB08_01 , 001001 002002 001002 002001 001001 001001
DAB08_02 , 001001 002002 001001 002002 001001 002002
DAB08_03 , 001001 002002 001001 002002 001001 002001
POP
INC02_01 , 001001 002002 001002 002002 001001 002001
INC02_02 , 001001 002002 002002 002002 001001 002002
INC02_03 , 001001 002002 001002 002002 001001 002001
__________________________________________
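
As a minimal sketch of what three-digit coding (ncode = 3) means for the genotypes above, the following base R snippet splits each six-character genotype into its two alleles. The helper split_genotype() is hypothetical and only for illustration; it is not part of adegenet and does not reproduce how read.genepop() works internally.

```r
# Hypothetical helper (not part of adegenet): split one GenePop genotype
# string into its two alleles, given the number of digits per allele.
split_genotype <- function(genotype, ncode = 3) {
  stopifnot(nchar(genotype) == 2 * ncode)
  c(substr(genotype, 1, ncode), substr(genotype, ncode + 1, 2 * ncode))
}

# Genotypes of the first individual in the example file (DAB08_01)
genotypes <- c("001001", "002002", "001002", "002001", "001001", "001001")

# One row per locus, one column per allele copy
alleles <- t(sapply(genotypes, split_genotype))
alleles
```

Note that "001002" and "002001" code the same heterozygote with the two alleles written in a different order, so the order in which alleles first appear can differ from locus to locus and from file to file.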

We have two issues

1) Last year we assigned individuals using adegenet 1.4-1. We suspect that the difference has something to do with how the files are read, so we wanted to compare against the older versions (1.4-1 and 1.4-2). We have tried installing the older versions with install_version() to compare all three (1.4-1, 1.4-2 and 2.0.1), but we keep getting the following error message when using the older versions.
___________________________________________
 Converting data from a Genepop .gen file to a genind object...

File description:  GenePop file, with 5 samples & 96 loci
Error in while (keepCheck) { : missing value where TRUE/FALSE needed
____________________________________________________________

We do not understand why we get this error message, when we use the exact same files as we have always used. Any idea?

2) When we use the newest version, we get different assignment results compared to earlier versions of the package.
I have my previous assignment results for individuals assigned with 1.4-1 and 1.4-2. I reassigned these individuals with the new package (2.0.1) and compared the assignments between package versions. They differ, even though we retain the same number of PCs, use the same reference file and use the same script, with only minor corrections to the file reading to accommodate the new version. Any idea why this is the case? Have there been any changes to how each locus and allele is read from version to version?

I have also noticed that, with adegenet 2.0.1, the assignment depends on which individuals I include in a gen-file. When I assign all my individuals from all years in one file, I get a different result than when I assign them from separate files divided up by year.
Could it be that the positioning of alleles at each locus has changed? We are not sure what is going wrong, but we suspect that it has something to do with the reading of our files.
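
One way to check this suspicion, as a minimal sketch: compare the allele column names of the reference table with those of the table to be assigned, and reorder the latter if they differ. The helper align_alleles() and the toy tables are hypothetical and only for illustration; whether the column order really differs between files in 2.0.1 is an assumption, not something confirmed here.

```r
# Hypothetical helper (not part of adegenet): reorder the allele columns of
# `new_tab` so that they follow the column order of `ref_tab`.
align_alleles <- function(ref_tab, new_tab) {
  missing_cols <- setdiff(colnames(ref_tab), colnames(new_tab))
  if (length(missing_cols) > 0) {
    stop("Alleles absent from new data: ",
         paste(missing_cols, collapse = ", "))
  }
  new_tab[, colnames(ref_tab), drop = FALSE]
}

# Toy allele-count tables: same alleles, different column order
ref_tab <- matrix(c(2, 0, 1, 1), nrow = 2, byrow = TRUE,
                  dimnames = list(c("ref1", "ref2"),
                                  c("S1017.001", "S1017.002")))
new_tab <- matrix(c(0, 2, 1, 1), nrow = 2, byrow = TRUE,
                  dimnames = list(c("new1", "new2"),
                                  c("S1017.002", "S1017.001")))

# FALSE here means the two tables do not share the same column order
same_order <- identical(colnames(ref_tab), colnames(new_tab))

# After alignment the columns of the new table match the reference
aligned <- align_alleles(ref_tab, new_tab)
```

With genind objects in adegenet 2.x, the two matrices would presumably come from tab(Ref) and tab(Assign).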

Below is some R history, which will hopefully be helpful.
R-script:
______________________________________________
library(adegenet)
# Reading files
Ref <- read.genepop("Ref.gen", ncode = 3)
Assign <- read.genepop("TBA_All.gen", ncode = 3)
# DAPC on the reference populations
DAPC_Ref <- dapc(Ref, pop(Ref), n.pca = 100, n.da = 3)
# Assignment of the new individuals
Predict <- predict.dapc(DAPC_Ref, newdata = Assign)
Predict$assign

Genind objects after read.genepop():
___________________________________
>Reference
/// GENIND OBJECT /////////

 // 487 individuals; 96 loci; 192 alleles; size: 451.5 Kb

 // Basic content
   @tab:  487 x 192 matrix of allele counts
   @loc.n.all: number of alleles per locus (range: 2-2)
   @loc.fac: locus factor for the 192 columns of @tab
   @all.names: list of allele names for each locus
   @ploidy: ploidy of each individual  (range: 2-2)
   @type:  codom
   @call: read.genepop(file = "Ref.gen", ncode = 3)

 // Optional content
   @pop: population of each individual (group size range: 62-215)

>AssignAll #All individuals for all years
/// GENIND OBJECT /////////

 // 1,357 individuals; 96 loci; 192 alleles; size: 1.1 Mb

 // Basic content
   @tab:  1357 x 192 matrix of allele counts
   @loc.n.all: number of alleles per locus (range: 2-2)
   @loc.fac: locus factor for the 192 columns of @tab
   @all.names: list of allele names for each locus
   @ploidy: ploidy of each individual  (range: 2-2)
   @type:  codom
   @call: read.genepop(file = "TBA_All.gen", ncode = 3)

 // Optional content
   @pop: population of each individual (group size range: 1357-1357)

> Assign2015 #individuals for year 2015 only
/// GENIND OBJECT /////////

 // 469 individuals; 96 loci; 192 alleles; size: 434.2 Kb

 // Basic content
   @tab:  469 x 192 matrix of allele counts
   @loc.n.all: number of alleles per locus (range: 2-2)
   @loc.fac: locus factor for the 192 columns of @tab
   @all.names: list of allele names for each locus
   @ploidy: ploidy of each individual  (range: 2-2)
   @type:  codom
   @call: read.genepop(file = "TBA_Fisk2015.gen", ncode = 3)

 // Optional content
   @pop: population of each individual (group size range: 469-469)

Assignment results after predict.dapc(), showing that the assignment differs depending on which individuals are included in the input file (gen-file):
_______________________________________________________
> Predict$assign #All individuals for all years
   [1] TAS10_30 TAS10_30 TAS10_30 TAS10_30 UMM45_39 UMM45_39
   [7] UMM45_39 UMM45_39 TAS10_30 TAS10_30 UMM45_39 UMM45_39
  [13] ISC02_39 UMM45_39 ISC02_39 ISC02_39 ISC02_39 ISC02_39
  [19] UMM45_39 QOR08_30 ISC02_39 UMM45_39 TAS10_30 UMM45_39
  [25] QOR08_30 QOR08_30 UMM45_39 QOR08_30 QOR08_30 UMM45_39
  [31] UMM45_39 UMM45_39 QOR08_30 UMM45_39 UMM45_39 ISC02_39
  [37] ISC02_39 UMM45_39 UMM45_39 QOR08_30 UMM45_39 QOR08_30
  [43] UMM45_39 UMM45_39 UMM45_39 UMM45_39 QOR08_30 UMM45_39
                             etc.

> Predict$assign #individuals for year 2015 only
  [1] TAS10_30 TAS10_30 TAS10_30 TAS10_30 TAS10_30 TAS10_30
  [7] TAS10_30 TAS10_30 TAS10_30 TAS10_30 UMM45_39 UMM45_39
 [13] UMM45_39 UMM45_39 UMM45_39 UMM45_39 UMM45_39 UMM45_39
 [19] UMM45_39 UMM45_39 ISC02_39 ISC02_39 ISC02_39 ISC02_39
 [25] ISC02_39 ISC02_39 TAS10_30 ISC02_39 ISC02_39 ISC02_39
 [31] ISC02_39 ISC02_39 ISC02_39 ISC02_39 ISC02_39 TAS10_30
 [37] ISC02_39 ISC02_39 ISC02_39 ISC02_39 ISC02_39 TAS10_30
 [43] ISC02_39 ISC02_39 ISC02_39 ISC02_39 ISC02_39 ISC02_39
                             etc.
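
To put a number on such differences, two assignment vectors for the same individuals can be cross-tabulated. The sketch below is base R only; compare_assignments() is a hypothetical helper, the individual names (ind1, ind2, ...) are made up, and it assumes the assignments have been stored as character vectors named by individual.

```r
# Hypothetical helper: compare two assignments of the same individuals.
# Both inputs are character vectors named by individual.
compare_assignments <- function(a, b) {
  shared <- intersect(names(a), names(b))
  a <- a[shared]
  b <- b[shared]
  list(
    agreement = mean(a == b),            # proportion assigned identically
    table = table(run_a = a, run_b = b)  # cross-tabulation of the two runs
  )
}

# Toy data for illustration only (group labels taken from the output above)
all_years <- c(ind1 = "TAS10_30", ind2 = "UMM45_39",
               ind3 = "ISC02_39", ind4 = "QOR08_30")
only_2015 <- c(ind1 = "TAS10_30", ind2 = "TAS10_30", ind3 = "ISC02_39")

res <- compare_assignments(all_years, only_2015)
res$agreement
res$table
```

Only individuals present in both runs are compared, so the same helper could be used for an all-years file against a single-year file, or for results from two package versions.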

Thank you
Sincerely
Ole and team
