[Seqinr-forum] read.alignment truncated FASTA header
Jean Lobry
jean.lobry at univ-lyon1.fr
Sat Feb 24 15:57:41 CET 2018
Dear All,
I have committed a fix:
http://seqinr.r-forge.r-project.org/src/appendix/releasenotes.pdf
It is available in the development version of seqinr:
install.packages("seqinr", repos="http://R-Forge.R-project.org")
Best,
JLO
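
For readers following the thread, here is a minimal shell sketch of the truncation reported below. It is not taken from the seqinr sources and only illustrates the difference between cutting a name at the first space and keeping the whole header line. The temporary file, the dummy sequences and the variable names are made up for illustration. The tab between the organism name and the family is an assumption based on the readme's field list.

```shell
# Two records with headers modelled on the LTP layout quoted below:
# tab-separated fields, with a space inside the organism name.
# Assumption: the last field (family) is also tab-separated.
tmp=$(mktemp)
printf '>D50541\t1\t1411\t1411bp\trna\tAbiotrophia defectiva\tAerococcaceae\nACGU\n' > "$tmp"
printf '>KP233895\t1\t1520\t1520bp\trna\tAbyssivirga alkaniphila\tLachnospiraceae\nACGU\n' >> "$tmp"

# Name cut at the first space: everything after the genus is lost.
truncated=$(grep '^>' "$tmp" | sed -e 's/^>//' -e 's/ .*//')

# Whole header line kept, minus the leading '>'.
whole=$(grep '^>' "$tmp" | sed 's/^>//')

printf 'truncated:\n%s\n\nwhole:\n%s\n' "$truncated" "$whole"
rm -f "$tmp"
```

The first form gives the names shown in the report below. The second keeps the species epithet and the family.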
On 23/02/2018 at 12:40, Jean Lobry wrote:
> Dear Simon,
>
> I was able to reproduce the behaviour described by
> Haruo Suzuki below.
>
> I've found the culprit in read.fasta() which is
> called by read.alignment(). The name is indeed
> truncated after the first space.
>
> I'll commit a fix that doesn't break previous
> code asap.
>
> Best,
>
> JLO
>
> On 22/02/2018 at 09:39, Haruo Suzuki wrote:
>> Dear Simon,
>>
>> I hope all is well with you.
>>
>> The LTP dataset based on SILVA release 128 was downloaded from
>> [Archive](https://www.arb-silva.de/no_cache/download/archive/living_tree/LTP_release_128/)
>> using:
>> -----------------------
>> wget https://www.arb-silva.de/fileadmin/silva_databases/living_tree/LTP_release_128/LTPs128_SSU/LTPs128_SSU_aligned.fasta.tar.gz
>>
>> tar xvzf LTPs128_SSU_aligned.fasta.tar.gz
>> -----------------------
>>
>> Here are FASTA header lines:
>> -----------------------
>> $ grep "^>" LTPs128_SSU_aligned.fasta | head -n 2
>> >D50541 1 1411 1411bp rna Abiotrophia defectiva Aerococcaceae
>> >KP233895 1 1520 1520bp rna Abyssivirga alkaniphila Lachnospiraceae
>> -----------------------
>>
>> The `read.alignment` function of SeqinR (Version: 3.4-5) did not return
>> the whole FASTA header lines. The descriptions are truncated, probably
>> because there is a space " " between genus and species in organism
>> names (e.g. "Abiotrophia defectiva" and "Abyssivirga alkaniphila"), as
>> follows:
>> -----------------------
>> > aln <- read.alignment("LTPs128_SSU_aligned.fasta", format = "fasta")
>> > head(aln$nam, 2)
>> [1] "D50541\t1\t1411\t1411bp\trna\tAbiotrophia"
>> [2] "KP233895\t1\t1520\t1520bp\trna\tAbyssivirga"
>> -----------------------
>>
>> # References
>> -----------------------
>> https://www.arb-silva.de/fileadmin/silva_databases/living_tree/LTP_release_128/readme_LTP_SSUs128_LSUs123.pdf
>>
>> LTPs128_SSU_aligned.fasta: multifasta alignments of type strains. The
>> headers of the sequences accordingly stand for the following
>> information: accession number, start and stop position, length, type
>> of sequence, fullname_ltp, hi_tax_ltp. Also compressed for download as
>> LTPs128_SSU_aligned.fasta.tar.gz
>> -----------------------
>>
>> Yours sincerely,
>>
>> Haruo Suzuki
>>
>>
>> On Nov 27, 2017, at 18:14, Simon Penel <simon.penel at univ-lyon1.fr> wrote:
>>
>
> _______________________________________________
> Seqinr-forum mailing list
> Seqinr-forum at lists.r-forge.r-project.org
> https://lists.r-forge.r-project.org/cgi-bin/mailman/listinfo/seqinr-forum