[adegenet-forum] problem to import data with read.structure

Thibaut Jombart jombart at biomserv.univ-lyon1.fr
Wed Jul 23 11:16:57 CEST 2008


Hello Stéphanie,
> Dear all,
>
> My question is about the data importation and the use of 
> read.structure command.
>
> In the tutorial I read:
> "In all cases, it should be possible to store data in an individuals x 
> markers table where each element is a character string coding 2  alleles
> Such data are interpretable when all strings contain 2,4 or 6 characters."
That passage was about df2genind, and it has been obsolete since version 
1.2-0. Any ploidy level is now allowed, so the function tries to infer the 
number of characters coding each allele from the maximum number of 
characters and the ploidy. In any case, it is safer to provide the number 
of characters coding the genotypes (argument ncode) or to use a separator 
between alleles. I have updated the tutorial; it should be online within a 
few hours.
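
To illustrate the ncode option, here is a minimal sketch. The allele 
values, the object names and the choice of zero-padding to two characters 
are made up for the example, and the df2genind call is only attempted if 
adegenet is installed.

```r
# Toy data (hypothetical): two alleles per individual at one locus,
# some coded with one digit and some with two.
a1 <- c(1, 12, 3)
a2 <- c(12, 12, 7)

# Pad every allele to two characters so that each genotype string has
# exactly 4 characters (2 alleles x 2 characters).
geno <- data.frame(
  locA = paste0(sprintf("%02d", a1), sprintf("%02d", a2)),
  row.names = c("ind1", "ind2", "ind3"),
  stringsAsFactors = FALSE
)
stopifnot(all(nchar(geno$locA) == 4))

# Stating ncode explicitly avoids relying on the automatic guess.
if (requireNamespace("adegenet", quietly = TRUE)) {
  obj <- adegenet::df2genind(geno, ncode = 2, ploidy = 2)
  print(obj)
}
```

With fixed-width strings like these, nothing has to be inferred from the 
longest string in the table.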
>
> In my case the alleles are not stored together and should *not* be 
> coded with two characters?
That applied to df2genind, not to read.structure. In the STRUCTURE format, 
alleles are always separated, so there should be no problem.
>
> In more detail:
> In fact I have a problem importing data with adegenet using 
> read.structure (which is the most convenient when the two alleles of 
> each locus are not stored together but in two different columns). When 
> the data to import have alleles coded with both 2 and 1 characters, I 
> have no problem importing the data. However, when all the 
> alleles are coded with only one character, I cannot import the file and I 
> get the following error message:
> _I used the following command:_
> dataadegenet<-read.structure(nameoffilebis,n.ind=200,n.loc=20, 
> onerowperind=T,col.lab=1,col.pop=0,col.other=c(2,3),row.marknames=0,NA.char="-9")
> _Error message:_
> Error in df2genind(X = X, pop = pop, missing = missing) :
>         Invalid number of coding characters (should be 2, 4, or 6)
>
This may be a problem in read.structure, not in your data. But I need to 
be able to reproduce the problem to be sure, and to fix the bug if there 
is one. Could you send me a toy dataset reproducing the problem?
> _My questions are:_
> - is the error message due to the fact that the alleles are coded with 
> only one character?
It should not. If it is, I'll fix this.
> - when alleles in a dataset are coded with both two and one characters, 
> does R read them all as two characters?
No, it does not.
> - what is the best way to import such data (a text file with alleles not 
> stored together)?
I'd say the best way is whichever is simplest for you. If your data are in 
STRUCTURE format, then read.structure should do the job, and I'll fix 
any problems that come up. If you do not have a file in one of the 
recognized formats (GENETIX, Hierfstat, Genepop, STRUCTURE), then use 
df2genind. The advantage of df2genind is that any separator between 
alleles can be used.
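
As a rough sketch of the df2genind route when the two alleles of each 
locus sit in separate columns: the table below, its column names and the 
"/" separator are all invented for the example, and the df2genind call is 
only attempted if adegenet is installed.

```r
# Hypothetical table: one row per individual, two columns per locus.
raw <- data.frame(
  loc1_a = c(1, 2, 1), loc1_b = c(2, 2, 3),
  loc2_a = c(4, 4, 5), loc2_b = c(4, 5, 5),
  row.names = c("ind1", "ind2", "ind3")
)

# Index of the first allele column of each locus.
first <- seq(1, ncol(raw), by = 2)

# Paste each pair of columns into one "a/b" string per individual and locus.
geno <- sapply(first, function(j) paste(raw[[j]], raw[[j + 1]], sep = "/"))
dimnames(geno) <- list(rownames(raw), c("loc1", "loc2"))
geno <- as.data.frame(geno, stringsAsFactors = FALSE)

# With an explicit separator, allele codes may have different lengths.
if (requireNamespace("adegenet", quietly = TRUE)) {
  obj <- adegenet::df2genind(geno, sep = "/", ploidy = 2)
  print(obj)
}
```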
> - one allele is coded by 0: is that a problem?
Yes, because it will be understood as an NA. In many formats, "0" (or 
"00", or "000", etc.) stands for NA. In STRUCTURE, NAs are coded by "-9" 
by default, but read.structure uses df2genind internally, which 
treats both NAs and zeros as missing data. I added a comment about 
this in ?read.structure.
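
One possible workaround, sketched here in base R on a made-up table, is to 
recode the "0" allele to a value not used elsewhere before importing. The 
column names and the choice of the new code are assumptions for the 
example; "-9" is left untouched as the missing-data code.

```r
# Toy STRUCTURE-like table (hypothetical): two columns per locus,
# 0 is a real allele here and -9 marks missing data.
alleles <- data.frame(
  loc1_a = c(0, 1, 2),
  loc1_b = c(1, 1, -9),
  loc2_a = c(3, 0, 3),
  loc2_b = c(3, 4, 0)
)

# Pick a code that no existing allele uses.
new_code <- max(alleles) + 1
stopifnot(!any(alleles == new_code))

# Replace every 0 by the new code; -9 stays as it is.
recoded <- alleles
recoded[recoded == 0] <- new_code
```

The recoded table can then be written back to a file and imported as 
before, keeping in mind which allele the new code stands for.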

Best regards,

Thibaut.
>
> Thank you for your attention
> Stéphanie
>
>
>
>
> ___________________________________
> Stéphanie Manel
> Université Joseph Fourier,
> Laboratoire d'Ecologie Alpine, Equipe GPB
> UMR-CNRS 5553, BPX53 Grenoble 38041
> tél: 04 76 51 41 15
> http://www-leca.ujf-grenoble.fr/membres/manel.htm
> ------------------------------------------------------------------------
>
> _______________________________________________
> adegenet-forum mailing list
> adegenet-forum at lists.r-forge.r-project.org
> https://lists.r-forge.r-project.org/cgi-bin/mailman/listinfo/adegenet-forum
>   


-- 
######################################
Thibaut JOMBART
CNRS UMR 5558 - Laboratoire de Biométrie et Biologie Evolutive
Universite Lyon 1
43 bd du 11 novembre 1918
69622 Villeurbanne Cedex
Tél. : 04.72.43.29.35
Fax : 04.72.43.13.88
jombart at biomserv.univ-lyon1.fr
http://biomserv.univ-lyon1.fr/%7Ejombart/
http://adegenet.r-forge.r-project.org/

