[Seqinr-forum] read.alignment truncated FASTA header
Jean Lobry
jean.lobry at univ-lyon1.fr
Fri Feb 23 12:40:27 CET 2018
Dear Simon,
I was able to reproduce the behaviour described by
Haruo Suzuki below.
I've found the culprit in read.fasta() which is
called by read.alignment(). The name is indeed
truncated after the first space.
I'll commit a fix that doesn't break existing
code as soon as possible.
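In the meantime, here is a minimal sketch of a workaround in base R, assuming the only goal is to recover the complete header lines. The helper read_fasta_headers() is hypothetical (it is not a seqinr function), and the sample headers assume that all fields are tab-separated, as the "\t" in the output quoted below suggests. It reads the lines starting with ">" directly and then mimics the truncation at the first space for comparison:

```r
# Hypothetical helper (not part of seqinr): return the complete header
# lines of a FASTA file, without the leading ">".
read_fasta_headers <- function(path) {
  lines <- readLines(path, warn = FALSE)
  headers <- grep("^>", lines, value = TRUE)
  sub("^>", "", headers)
}

# Self-contained demonstration on a temporary file that mimics the LTP
# headers (assumption: fields are tab-separated, and the only spaces are
# inside the organism names).
tmp <- tempfile(fileext = ".fasta")
writeLines(c(
  ">D50541\t1\t1411\t1411bp\trna\tAbiotrophia defectiva\tAerococcaceae",
  "ACGU",
  ">KP233895\t1\t1520\t1520bp\trna\tAbyssivirga alkaniphila\tLachnospiraceae",
  "ACGU"
), tmp)

full_names <- read_fasta_headers(tmp)

# Cutting each header at its first space gives names like those
# currently returned in aln$nam.
truncated <- sub(" .*$", "", full_names)
```

If the sequences come back in file order, the full names could presumably be put back with aln$nam <- read_fasta_headers("LTPs128_SSU_aligned.fasta"); that ordering is an assumption to verify, not something confirmed here.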
Best,
JLO
Le 22/02/2018 à 09:39, Haruo Suzuki a écrit :
> Dear Simon,
>
> I hope all is well with you.
>
> The LTP datasets based on SILVA release 128 were downloaded from [Archive](https://www.arb-silva.de/no_cache/download/archive/living_tree/LTP_release_128/) using:
> -----------------------
> wget https://www.arb-silva.de/fileadmin/silva_databases/living_tree/LTP_release_128/LTPs128_SSU/LTPs128_SSU_aligned.fasta.tar.gz
> tar xvzf LTPs128_SSU_aligned.fasta.tar.gz
> -----------------------
>
> Here are FASTA header lines:
> -----------------------
> $ grep "^>" LTPs128_SSU_aligned.fasta | head -n 2
> >D50541 1 1411 1411bp rna Abiotrophia defectiva Aerococcaceae
> >KP233895 1 1520 1520bp rna Abyssivirga alkaniphila Lachnospiraceae
> -----------------------
>
> The `read.alignment` function of SeqinR (version 3.4-5) did not return the whole FASTA header lines. The descriptions are truncated, probably because there is a space " " between genus and species in the organism names (e.g. "Abiotrophia defectiva" and "Abyssivirga alkaniphila"), as follows:
> -----------------------
> > aln <- read.alignment("LTPs128_SSU_aligned.fasta", format = "fasta")
> > head(aln$nam, 2)
> [1] "D50541\t1\t1411\t1411bp\trna\tAbiotrophia"
> [2] "KP233895\t1\t1520\t1520bp\trna\tAbyssivirga"
> -----------------------
>
> # References
> -----------------------
> https://www.arb-silva.de/fileadmin/silva_databases/living_tree/LTP_release_128/readme_LTP_SSUs128_LSUs123.pdf
> LTPs128_SSU_aligned.fasta: multifasta alignments of type strains. The headers of the sequences accordingly stand for the following information: accession number, start and stop position, length, type of sequence, fullname_ltp, hi_tax_ltp. Also compressed for download as LTPs128_SSU_aligned.fasta.tar.gz
> -----------------------
>
> Yours sincerely,
>
> Haruo Suzuki
>
>
> On Nov 27, 2017, at 18:14, Simon Penel <simon.penel at univ-lyon1.fr> wrote:
>