# [Seqinr-commits] r1594 - pkg/man

Thu Apr 23 20:24:12 CEST 2009

Author: lobry
Date: 2009-04-23 20:24:12 +0200 (Thu, 23 Apr 2009)
New Revision: 1594

Modified:
pkg/man/kaks.Rd
Log:
doc polish after Darren Obbard post

Modified: pkg/man/kaks.Rd
===================================================================
--- pkg/man/kaks.Rd	2009-04-23 16:45:50 UTC (rev 1593)
+++ pkg/man/kaks.Rd	2009-04-23 18:24:12 UTC (rev 1594)
@@ -1,18 +1,17 @@
\name{kaks}
\alias{kaks}
-\title{ to Get an Estimation of Ka and Ks }
-\description{
-  Ks and Ka  are respectively the number of substitutions per synonymous site and per nonsynonymous site between two protein-coding genes. The ratio of nonsynonymous (Ka) to synonymous (Ks) nucleotide substitution rates is an indicator of selective pressures on genes. A ratio significantly greater than 1 indicates positive selective pressure. A ratio around 1 indicates either neutral evolution at the protein level or an averaging of sites under positive and negative selective pressures. A ratio less than 1 indicates pressures to conserve protein sequence (i.e. purifying selection). This function estimates the Ka and Ks values for a set of aligned sequences using the method published by Li (1993) and gives the associated variance matrix.
+\title{Ka and Ks, also known as dn and ds, computation}
+\description{ Ks and Ka are, respectively, the number of substitutions per synonymous site and per nonsynonymous site between two protein-coding genes. They are also denoted as ds and dn in the literature. The ratio of nonsynonymous (Ka) to synonymous (Ks) nucleotide substitution rates is an indicator of selective pressures on genes. A ratio significantly greater than 1 indicates positive selective pressure. A ratio around 1 indicates either neutral evolution at the protein level or an averaging of sites under positive and negative selective pressures. A ratio less than 1 indicates pressures to conserve protein sequence (i.e. purifying selection). This function estimates the Ka and Ks values for a set of aligned sequences using the method published by Li (1993) and gives the associated variance matrix.
}
\usage{
kaks(x, debug = FALSE, forceUpperCase = TRUE)
}
\arguments{
-  \item{x}{ An object of class \code{alignment} }
-  \item{debug}{ If TRUE turns debug mode on}
+  \item{x}{ An object of class \code{alignment}, obtained for instance by importing into R the data from an alignment file with the \code{\link{read.alignment}} function. This is typically a set of coding sequences aligned at the protein level, see \code{\link{reverse.align}}.}
\item{forceUpperCase}{ If TRUE, the default value, all characters in the sequences are forced to upper case
if at least one 'a', 'c', 'g', or 't' is found in the sequences.
Setting it to FALSE when the sequences are already in upper case saves time.}
+  \item{debug}{ If TRUE, turns debug mode on.}
}
\value{
\item{ ks }{ matrix of Ks values }
@@ -23,27 +22,42 @@
\references{
Li, W.-H. (1993) Unbiased estimation of the rates of synonymous and nonsynonymous substitution.
\emph{J. Mol. Evol.}, \bold{36}:96-99.\cr
+
Hurst, L.D. (2002) The Ka/Ks ratio: diagnosing the form of sequence evolution.
\emph{Trends Genet.}, \bold{18}:486-486.\cr
+
The C program implementing this method was provided by Manolo Gouy. More info is
needed here to trace back the original C source so as to credit the correct source.
The original FORTRAN-77 code by Chung-I Wu modified by Ken Wolfe is available
here \url{http://wolfe.gen.tcd.ie/lab/pub/li93/}.\cr
-For a recent discussion about the estimation of Ka and Ks see:\cr
+
+For a more recent discussion about the estimation of Ka and Ks see:\cr
+
Tzeng, Y.H., Pan, R., Li, W.-H. (2004) Comparison of three methods for estimating
rates of synonymous and nonsynonymous nucleotide substitutions.
\emph{Mol. Biol. Evol.}, \bold{21}:2290-2298.\cr
+
The method implemented here is denoted LWL85 in the above paper.\cr
-\code{citation("seqinr")}
+
+To cite this package in a publication, as for any R package, try something like \code{citation("seqinr")}
+at your R prompt.
}
\note{
- When the alignment does not contain enough information (i.e we approach saturation), the Ka and Ks values take the value 10.
- Negative values indicate that Ka and Ks can not be computed.\cr
- Codons with ambiguous bases are treated as gaps.\cr
- Codons with gaps are not used for computations.
+Computing Ka and Ks makes sense for coding sequences that have been aligned at the amino-acid level before retro-translating the alignment at the nucleic acid level to ensure that sequences are compared on a codon-by-codon basis. Function \code{\link{reverse.align}} may help with this.
+
+There is an internal check at the C level to ensure that sequences have a multiple of 3 nucleotides after gap removal. A problem at this level most likely means that the alignment was not done at the amino-acid level.
+
+When there is at least one ambiguous base in a codon, this codon is considered as a gap-codon (\code{---}).
+
+Gap-codons (\code{---}) are not used for computations.
+
+When the alignment does not contain enough information (\emph{i.e.} close to saturation), the Ka and Ks values are forced to 10.
+
+Negative values indicate that Ka and Ks can not be computed.
+
}
\author{ D. Charif, J.R. Lobry }
+\seealso{\code{\link{read.alignment}} to import alignments from files, \code{\link{reverse.align}} to align CDS at the aa level.}
\examples{
#
# Simple Toy example:
@@ -56,9 +70,9 @@
data(AnoukResult)
Anouk <- read.alignment(file = system.file("sequences/Anouk.fasta", package = "seqinr"), format = "fasta")
if( ! all.equal(kaks(Anouk), AnoukResult) ) {
-   warning("Poor numeric results with Anouk test file")
+   warning("Poor numeric results with respect to AnoukResult standard")
} else {
-   print("Results are OK with Anouk test file")
+   print("Results are consistent with AnoukResult standard")
}
}
\keyword{ manip }
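
The codon-level rules stated in the revised \note section (a codon with an ambiguous base is treated as a gap-codon, and gap-codons are not used) can be pictured with a small base-R sketch. This is only an illustration under stated assumptions: `mask_codons` and `usable_codons` are hypothetical helpers that are not part of seqinr, the length check is applied to the aligned sequence rather than after gap removal, and treating partially gapped codons such as `A-T` as gap-codons is an assumption. The actual work in `kaks()` is done in C.

```r
# Illustrative sketch only; this is NOT the C code called by kaks().
# mask_codons() and usable_codons() are hypothetical helper names.

# Split an aligned coding sequence into codons and turn every codon that
# is not made of plain A, C, G, T into a gap-codon ("---").
mask_codons <- function(s) {
  s <- toupper(s)
  # Simplified check: the aligned length must be a positive multiple of 3
  stopifnot(nchar(s) > 0, nchar(s) %% 3 == 0)
  starts <- seq(1, nchar(s), by = 3)
  codons <- substring(s, starts, starts + 2)
  # Assumption: anything outside A, C, G, T (ambiguity codes, gaps) masks the codon
  masked <- !grepl("^[ACGT]{3}$", codons)
  codons[masked] <- "---"
  codons
}

# Codon positions that can be compared between two aligned sequences:
# those where neither sequence has a gap-codon.
usable_codons <- function(a, b) {
  ca <- mask_codons(a)
  cb <- mask_codons(b)
  stopifnot(length(ca) == length(cb))
  which(ca != "---" & cb != "---")
}

usable_codons("ATGNCTAAA---", "ATGGCTAAGTTT")
```

In this toy pair, the second codon of the first sequence contains `N` and its fourth codon is a gap, so only codon positions 1 and 3 remain available for comparison.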