[Seqinr-forum] inconsistent behaviour of the getAnnot() function

Coghlan, Avril A.Coghlan at ucc.ie
Wed Mar 17 18:25:00 CET 2010


Dear all,

I've been using the getAnnot() function to retrieve annotations for
UniProt sequences.

For each query, I've been asking for a set of UniProt sequences. I've
then been retrieving the annotations for each set of sequences using
getAnnot().

I noticed that usually getAnnot() returns the annotations for a set of
sequences in the form of a list variable, where each element of the list
variable is a vector containing the annotations for a particular
sequence.

However, sometimes getAnnot() returns the annotations for a set of
sequences in the form of a matrix variable, where each column of the
matrix variable contains the annotations for one sequence.  

I've included the R code for two examples below, one where getAnnot()
returns the annotations in a matrix variable (Example 1 below), and one
where getAnnot() returns the annotations in a list variable (Example 2
below). 

I am a bit confused about why getAnnot() does not always return the
annotations in the same sort of data structure (e.g. always as a list
variable).
I'm wondering whether this could be due to a bug in getAnnot().
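
One possible explanation, which is only a guess and not something I have
checked against the seqinr source code: base R's sapply() simplifies its
result to a matrix whenever every element it collects has the same length,
and otherwise returns a list. If the annotations are gathered in that way,
four sequences with 40 annotation lines each would come back as a 40 x 4
matrix, while sequences with differing numbers of lines would come back as
a list. The sketch below reproduces that pattern in base R only; fake_annot()
is a hypothetical stand-in and is not part of seqinr.

```r
# Hypothetical stand-in for per-sequence annotation retrieval (not a seqinr
# function): returns n_lines lines of placeholder annotation text.
fake_annot <- function(n_lines) paste("annotation line", seq_len(n_lines))

# Four sequences with the same number of annotation lines:
# sapply() simplifies the result to a character matrix.
equal <- sapply(c(40, 40, 40, 40), fake_annot)
is.matrix(equal)      # TRUE
dim(equal)            # 40 4
length(equal)         # 160

# Sequences with differing numbers of annotation lines:
# sapply() cannot simplify, so the result stays a list.
unequal <- sapply(c(47, 40, 52), fake_annot)
is.list(unequal)      # TRUE
dim(unequal)          # NULL

# lapply() never simplifies, so it returns a list in both cases.
always_list <- lapply(c(40, 40, 40, 40), fake_annot)
is.list(always_list)  # TRUE
```

If this guess is right, the behaviour would depend only on whether all the
sequences in a query happen to have the same number of annotation lines.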

Kind regards,
Avril

Avril Coghlan
University College Cork
Ireland


EXAMPLE 1 - RETURNS ANNOTATIONS AS A MATRIX

> # Veillonella parvula (strain ATCC 10790 / DSM 2008 / JCM 12972 / Te3)
> query = paste("K=@MotA@ AND TID=479436")
> query("myquery",`query`)
> myquery$nelem 
[1] 4 
> myannot <- getAnnot(myquery)
> dim(myannot)
[1] 40 4
> length(myannot[[1]])
[1] 1
> # A matrix with all 4 sequences' annotations glued together, one
> # sequence's annotations per matrix column
> length(myannot)
[1] 160
> myannot[,1] # The first sequence's annotations
> myannot[,2] # The second sequence's annotations
> myannot[,3] # The third sequence's annotations
> myannot[,4] # The fourth sequence's annotations
> is.list(myannot)
[1] FALSE
> is.matrix(myannot)
[1] TRUE


EXAMPLE 2 - RETURNS ANNOTATIONS AS A LIST

> # Bacillus cereus strain AH820
> query = paste("K=@MotA@ AND TID=405535")
> query("myquery",`query`)
> myquery$nelem 
[1] 20
> myannot <- getAnnot(myquery)
> dim(myannot)
NULL
> length(myannot[[1]])
[1] 47
> # A list with all 20 sequences, one sequence's annotations per list element
> length(myannot)
[1] 20
> myannot[[1]] # The first sequence's annotations
> myannot[[2]] # The second sequence's annotations 
> is.list(myannot) 
[1] TRUE
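
In the meantime, a small wrapper can turn either return shape into a list
with one element per sequence, so that downstream code only has to handle
one data structure. This is a minimal sketch in base R; as_annot_list() is
a hypothetical helper name of my own and is not part of seqinr.

```r
# Hypothetical helper (not part of seqinr): coerce an annotation result to
# a list with one character vector per sequence.
as_annot_list <- function(annot) {
  # Already a list: return unchanged.
  if (is.list(annot)) return(annot)
  # Matrix: each column holds one sequence's annotations.
  if (is.matrix(annot)) {
    return(lapply(seq_len(ncol(annot)), function(j) annot[, j]))
  }
  # Anything else (e.g. a plain character vector for a single sequence)
  # is wrapped as a one-element list.
  list(annot)
}

# Small demonstration with placeholder data: 3 lines for each of 2 sequences.
m <- matrix(c("a1", "a2", "a3", "b1", "b2", "b3"), nrow = 3)
from_matrix <- as_annot_list(m)
length(from_matrix)    # 2
from_matrix[[2]]       # "b1" "b2" "b3"

# A list passes through untouched.
l <- list(c("x1", "x2"), c("y1", "y2", "y3"))
identical(as_annot_list(l), l)  # TRUE
```

With this in place, something like myannot <- as_annot_list(getAnnot(myquery))
should allow myannot[[1]], myannot[[2]], ... to be used in both examples above.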







