[Seqinr-forum] extract A/C/G/T positions in a FASTA file
Jean Lobry
jean.lobry at univ-lyon1.fr
Mon Aug 23 10:35:21 CEST 2021
Dear Jie,
I'm not sure what you are trying to do. Here is some
code you may use as a starting point:
library(seqinr)
# Read a DNA alignment from a FASTA file
myfile <- system.file("sequences/Anouk.fasta", package = "seqinr")
myali <- read.alignment(myfile, format = "fasta")
# Get the indices of "a" in the alignment
# (one row per match: "row" is the sequence, "col" is the position)
which(as.matrix(myali) == "a", arr.ind = TRUE)
# Get the indices of "a" in the consensus sequence
mycon <- consensus(myali)
which(mycon == "a")
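If you need the positions of all four bases at once, the same which() idea can be wrapped in a small helper. This is only a sketch: base_positions() is a hypothetical name, not a seqinr function, and it assumes a character vector with one lower-case base per element (read.alignment() converts to lower case by default).

```r
# Hypothetical helper (not part of seqinr): positions of each base in a
# character vector holding one lower-case nucleotide per element.
base_positions <- function(x, bases = c("a", "c", "g", "t")) {
  # simplify = FALSE keeps a list, named after the elements of `bases`
  sapply(bases, function(b) which(x == b), simplify = FALSE)
}

# Toy sequence, split into single characters with base R
toy <- strsplit("acgtaacg", "")[[1]]
toy_pos <- base_positions(toy)
toy_pos$a  # 1 5 6

# Same idea on the consensus of the example alignment, if seqinr is installed
if (requireNamespace("seqinr", quietly = TRUE)) {
  myfile  <- system.file("sequences/Anouk.fasta", package = "seqinr")
  myali   <- seqinr::read.alignment(myfile, format = "fasta")
  mycon   <- seqinr::consensus(myali)
  con_pos <- base_positions(mycon)
  lengths(con_pos)  # number of positions found for each base
}
```

The result is a list with one integer vector per base, so con_pos$g gives the "g" positions in the consensus.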
HTH,
JLO
On 09/08/2021 at 21:33, jiehuang001 at gmail.com wrote:
> Hi, guys:
>
> Previously I have been using library(Biostrings).
>
> For example, I have used the following 2 lines to read in a SARS-CoV-2
> FASTA file and find the positions of all "A" alleles.
>
> fa <- readDNAStringSet("MY-FASTA.fa", format="fasta")
>
> I could then use vmatchPattern("A", fa, max.mismatch=0)
>
> However, the output from the above vmatchPattern() command is a bit messy.
>
> I hope the SeqinR package can do this in a more straightforward way.
>
> If so, can someone please let me know how to write the above Biostrings
> commands in SeqinR?
>
> Thank you very much & best regards,
>
> Jie