[Seqinr-forum] read.alignment truncated FASTA header

Jean Lobry jean.lobry at univ-lyon1.fr
Sat Feb 24 15:57:41 CET 2018


Dear All,

I have committed a fix:

http://seqinr.r-forge.r-project.org/src/appendix/releasenotes.pdf

available in the development version of seqinr:

install.packages("seqinr", repos="http://R-Forge.R-project.org")
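
The truncation discussed in this thread can be illustrated without seqinr at all. The following is only a minimal base-R sketch, not the code inside read.fasta(): the helpers fasta_headers() and cut_at_first_space() are hypothetical names introduced here, and the two-record file is a made-up stand-in that only mimics the LTP header layout (tab-separated fields, with a space inside the organism name).

```r
# Hypothetical helper (not part of seqinr): return the full header lines
# of a FASTA file, without the leading ">".
fasta_headers <- function(path) {
  lines <- readLines(path, warn = FALSE)
  sub("^>", "", lines[startsWith(lines, ">")])
}

# Hypothetical helper: what a name looks like when it is cut at the
# first space character (tabs are kept, only " " ends the name).
cut_at_first_space <- function(x) sub(" .*$", "", x)

# Tiny stand-in file; the sequences are placeholders, not real LTP data.
tmp <- tempfile(fileext = ".fasta")
writeLines(c(
  ">D50541\t1\t1411\t1411bp\trna\tAbiotrophia defectiva\tAerococcaceae",
  "ACGU--ACGU",
  ">KP233895\t1\t1520\t1520bp\trna\tAbyssivirga alkaniphila\tLachnospiraceae",
  "ACGU--ACGA"
), tmp)

full <- fasta_headers(tmp)            # whole header lines
truncated <- cut_at_first_space(full) # names cut after the genus
print(truncated)

# With seqinr installed, the names it returns could be compared to `full`:
# aln <- seqinr::read.alignment(tmp, format = "fasta")
# identical(aln$nam, full)
```

If the commented comparison returns FALSE, the installed version is presumably still shortening the names. Whether the whole header is kept by default, or only on request, should be checked in the release notes linked above.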

Best,

JLO

On 23/02/2018 at 12:40, Jean Lobry wrote:
> Dear Simon,
> 
> I was able to reproduce the behaviour described by
> Haruo Suzuki below.
> 
> I've found the culprit in read.fasta() which is
> called by read.alignment(). The name is indeed
> truncated after the first space.
> 
> I'll commit a fix that doesn't break previous
> code asap.
> 
> Best,
> 
> JLO
> 
> On 22/02/2018 at 09:39, Haruo Suzuki wrote:
>> Dear Simon,
>>
>> I hope all is well with you.
>>
>> LTP datasets based on SILVA release 128 were downloaded from 
>> [Archive](https://www.arb-silva.de/no_cache/download/archive/living_tree/LTP_release_128/) 
>> using:
>> -----------------------
>>     wget 
>> https://www.arb-silva.de/fileadmin/silva_databases/living_tree/LTP_release_128/LTPs128_SSU/LTPs128_SSU_aligned.fasta.tar.gz 
>>
>>     tar xvzf LTPs128_SSU_aligned.fasta.tar.gz
>> -----------------------
>>
>> Here are FASTA header lines:
>> -----------------------
>> $ grep "^>" LTPs128_SSU_aligned.fasta | head -n 2
>> >D50541    1    1411    1411bp    rna    Abiotrophia defectiva    Aerococcaceae
>> >KP233895    1    1520    1520bp    rna    Abyssivirga alkaniphila    Lachnospiraceae
>> -----------------------
>>
>> The `read.alignment` function of SeqinR (version 3.4-5) did not return 
>> the whole FASTA header lines. The descriptions are truncated, probably 
>> because there is a space " " between genus and species in the organism 
>> names (e.g. "Abiotrophia defectiva" and "Abyssivirga alkaniphila"), as 
>> follows:
>> -----------------------
>> > aln <- read.alignment("LTPs128_SSU_aligned.fasta", format = "fasta")
>> > head(aln$nam, 2)
>> [1] "D50541\t1\t1411\t1411bp\trna\tAbiotrophia"
>> [2] "KP233895\t1\t1520\t1520bp\trna\tAbyssivirga"
>> -----------------------
>>
>> # References
>> -----------------------
>> https://www.arb-silva.de/fileadmin/silva_databases/living_tree/LTP_release_128/readme_LTP_SSUs128_LSUs123.pdf 
>>
>> LTPs128_SSU_aligned.fasta: multifasta alignments of type strains. The 
>> headers of the sequences accordingly stand for the following 
>> information: accession number, start and stop position, length, type 
>> of sequence, fullname_ltp, hi_tax_ltp. Also compressed for download as 
>> LTPs128_SSU_aligned.fasta.tar.gz
>> -----------------------
>>
>> Yours sincerely,
>>
>> Haruo Suzuki
>>
>>
>> On Nov 27, 2017, at 18:14, Simon Penel <simon.penel at univ-lyon1.fr> 
>> wrote:.
>>
> 
> _______________________________________________
> Seqinr-forum mailing list
> Seqinr-forum at lists.r-forge.r-project.org
> https://lists.r-forge.r-project.org/cgi-bin/mailman/listinfo/seqinr-forum


