[CHNOSZ-commits] r8 - in pkg: . inst inst/doc inst/tests man vignettes
noreply at r-forge.r-project.org
noreply at r-forge.r-project.org
Wed Sep 5 01:20:57 CEST 2012
Author: jedick
Date: 2012-09-05 01:20:56 +0200 (Wed, 05 Sep 2012)
New Revision: 8
Removed:
pkg/inst/doc/protactiv.pdf
pkg/vignettes/protactiv.Rnw
pkg/vignettes/protactiv.lyx
Modified:
pkg/DESCRIPTION
pkg/inst/NEWS
pkg/inst/doc/anintro.pdf
pkg/inst/tests/test-findit.R
pkg/man/iprotein.Rd
pkg/man/transfer.Rd
pkg/vignettes/anintro.Rnw
pkg/vignettes/anintro.lyx
Log:
remove protactiv.Rnw; add donttest protection to transfer.Rd, iprotein.Rd
add --as-cran (R_CHECK_TIMINGS) check to test-findit.Rd
change scale of protein predominance diagram in anintro.Rnw
Modified: pkg/DESCRIPTION
===================================================================
--- pkg/DESCRIPTION 2012-09-02 15:11:09 UTC (rev 7)
+++ pkg/DESCRIPTION 2012-09-04 23:20:56 UTC (rev 8)
@@ -1,6 +1,6 @@
+Date: 2012-09-05
Package: CHNOSZ
Version: 0.9-7.98
-Date: 2012-09-02
Title: Chemical Thermodynamics and Activity Diagrams
Author: Jeffrey M. Dick
Maintainer: Jeffrey M. Dick <jmdick at asu.edu>
Modified: pkg/inst/NEWS
===================================================================
--- pkg/inst/NEWS 2012-09-02 15:11:09 UTC (rev 7)
+++ pkg/inst/NEWS 2012-09-04 23:20:56 UTC (rev 8)
@@ -1,4 +1,4 @@
-CHANGES IN CHNOSZ 0.9-7.98 (2012-09-02)
+CHANGES IN CHNOSZ 0.9-7.98 (2012-09-05)
---------------------------------------
SIGNIFICANT USER-VISIBLE CHANGES:
@@ -234,8 +234,8 @@
- Move vignette sources to 'vignettes' directory.
-- Remove vignettes formation.Rnw and xadditivity.Rnw (and, for the
- latter, the package Suggests dependency on xtable).
+- Remove vignettes formation.Rnw, protactiv.Rnw and xadditivity.Rnw
+ (and, for the latter, the package Suggests dependency on xtable).
- Add vignette wjd.Rnw to accompany the new function wjd().
Modified: pkg/inst/doc/anintro.pdf
===================================================================
(Binary files differ)
Deleted: pkg/inst/doc/protactiv.pdf
===================================================================
(Binary files differ)
Modified: pkg/inst/tests/test-findit.R
===================================================================
--- pkg/inst/tests/test-findit.R 2012-09-02 15:11:09 UTC (rev 7)
+++ pkg/inst/tests/test-findit.R 2012-09-04 23:20:56 UTC (rev 8)
@@ -1,5 +1,7 @@
context("findit")
+# skip test if we're on R CMD check --as-cran
+if(!any(grepl("R_CHECK_TIMINGS", names(Sys.getenv())))) {
test_that("findit() returns known values encoded in a species distribution", {
# test the findit function in three dimensions
@@ -34,3 +36,5 @@
expect_equal(tail(f$value[[3]],1), -sqrt(2), tol=1e-1)
# we could decrease the tolerance by increasing the resolution and/or iterations in findit()
})
+
+}
Modified: pkg/man/iprotein.Rd
===================================================================
--- pkg/man/iprotein.Rd 2012-09-02 15:11:09 UTC (rev 7)
+++ pkg/man/iprotein.Rd 2012-09-04 23:20:56 UTC (rev 8)
@@ -95,6 +95,7 @@
subcrt("ALAT1_HUMAN", c("cr", "aq"), c(-1, 1))$out[[1]]$H
}
+\donttest{
# read a fasta file, calculate H/C and O/C for all proteins
# and averages by polypeptide chain, residue and mass
ffile <- system.file("extdata/fasta/HTCC1062.faa.xz", package="CHNOSZ")
@@ -122,6 +123,7 @@
title(main=paste("O/C vs H/C for HTCC1062 proteins\n",
"n =", nrow(aa)))
}
+}
\seealso{
\code{\link{protein}} for examples of calculations using proteins, \code{\link{protein.info}} for higher-level functions (chemical formulas, summaries of reaction coefficients and energies), \code{\link{more.aa}} for getting amino acid compositions for model organisms from additional data files in the \code{\link{extdata}/protein} directory, and \code{\link{read.expr}} for working with protein abundance and localization data.
Modified: pkg/man/transfer.Rd
===================================================================
--- pkg/man/transfer.Rd 2012-09-02 15:11:09 UTC (rev 7)
+++ pkg/man/transfer.Rd 2012-09-04 23:20:56 UTC (rev 8)
@@ -64,8 +64,9 @@
}
-\examples{
-\dontshow{data(thermo)}\donttest{## react potassium feldspar in a closed system
+\examples{\dontshow{data(thermo)}
+\donttest{
+## react potassium feldspar in a closed system
## after Steinmann et al., 1994 and Helgeson et al., 1969
basis(c("Al+3", "H4SiO4", "K+", "H2O", "H+", "O2"), c(0, -6, -6, 0, 0, 0))
species(c("k-feldspar", "muscovite", "pyrophyllite", "kaolinite", "gibbsite"))
Modified: pkg/vignettes/anintro.Rnw
===================================================================
--- pkg/vignettes/anintro.Rnw 2012-09-02 15:11:09 UTC (rev 7)
+++ pkg/vignettes/anintro.Rnw 2012-09-04 23:20:56 UTC (rev 8)
@@ -1,4 +1,4 @@
-%% LyX 2.0.3 created this file. For more info, see http://www.lyx.org/.
+%% LyX 2.0.4 created this file. For more info, see http://www.lyx.org/.
%% Do not edit unless you really know what you are doing.
\documentclass[english,noae,round]{article}
\usepackage{mathpazo}
@@ -699,7 +699,7 @@
<<PredominanceDiagram,fig=T,width=4,height=4>>=
species(c("SLAP_ACEKI", "SLAP_GEOSE", "SLAP_BACLI", "SLAP_AERSA"))
basis(c("NH3", "H2S"), c(-1, -10))
-a <- affinity(O2=c(-90, -70), H2O=c(-10, 0))
+a <- affinity(O2=c(-85, -70), H2O=c(-5, 0))
diagram(a)
@
\setkeys{Gin}{width=1.0\textwidth}
Modified: pkg/vignettes/anintro.lyx
===================================================================
--- pkg/vignettes/anintro.lyx 2012-09-02 15:11:09 UTC (rev 7)
+++ pkg/vignettes/anintro.lyx 2012-09-04 23:20:56 UTC (rev 8)
@@ -128,7 +128,7 @@
\begin_layout Standard
\begin_inset Branch load
-status open
+status collapsed
\begin_layout Standard
This document will orient you to the basic functionality of CHNOSZ, a package
@@ -195,7 +195,7 @@
\begin_layout Standard
\begin_inset Branch stuff
-status open
+status collapsed
\begin_layout Standard
CHNOSZ is made up of a set of functions and supporting datasets.
@@ -304,7 +304,7 @@
\begin_layout Standard
\begin_inset Branch load
-status open
+status collapsed
\begin_layout Standard
If you have just installed R, and you are online, installing the CHNOSZ
@@ -372,7 +372,7 @@
\begin_layout Standard
\begin_inset Branch stuff
-status open
+status collapsed
\begin_layout Subsection
@@ -500,7 +500,7 @@
\begin_layout Standard
\begin_inset Branch stuff
-status open
+status collapsed
\begin_layout Subsection
@@ -596,7 +596,7 @@
\begin_layout Standard
\begin_inset Branch stuff
-status open
+status collapsed
\begin_layout Subsection
@@ -724,7 +724,7 @@
\begin_layout Standard
\begin_inset Branch stuff
-status open
+status collapsed
\begin_layout Subsection
@@ -815,7 +815,7 @@
\begin_layout Standard
\begin_inset Branch stuff
-status open
+status collapsed
\begin_layout Subsection
@@ -897,7 +897,7 @@
\begin_layout Standard
\begin_inset Branch stuff
-status open
+status collapsed
\begin_layout Subsection
A single species
@@ -1200,7 +1200,7 @@
\begin_layout Standard
\begin_inset Branch stuff
-status open
+status collapsed
\begin_layout Subsection
What are basis species?
@@ -2008,7 +2008,7 @@
\begin_layout Chunk
-a <- affinity(O2=c(-90, -70), H2O=c(-10, 0))
+a <- affinity(O2=c(-85, -70), H2O=c(-5, 0))
\end_layout
\begin_layout Chunk
@@ -2272,7 +2272,7 @@
\begin_layout Standard
\begin_inset Branch stuff
-status open
+status collapsed
\begin_layout Standard
You can explore the package documentation through R's help system; just
@@ -2397,7 +2397,7 @@
\begin_layout Standard
\begin_inset Branch more
-status open
+status collapsed
\begin_layout Standard
The following pages contain activity diagrams created with more complex
@@ -2438,7 +2438,7 @@
\begin_layout Standard
\begin_inset Branch more
-status open
+status collapsed
\begin_layout Subsection
Protein Eh-pH
@@ -2697,7 +2697,7 @@
\begin_layout Standard
\begin_inset Branch more
-status open
+status collapsed
\begin_layout Subsection
Subcellular proteins
@@ -2879,7 +2879,7 @@
\begin_layout Standard
\begin_inset Branch more
-status open
+status collapsed
\begin_layout Subsection
Buffers
@@ -3132,7 +3132,7 @@
\begin_layout Standard
\begin_inset Branch more
-status open
+status collapsed
\begin_layout Subsection
Revisit
@@ -3358,7 +3358,7 @@
\begin_layout Standard
\begin_inset Branch more
-status open
+status collapsed
\begin_layout Subsection
Findit
@@ -3576,7 +3576,7 @@
\begin_layout Standard
\begin_inset Branch stuff
-status open
+status collapsed
\begin_layout Standard
Revision history:
Deleted: pkg/vignettes/protactiv.Rnw
===================================================================
--- pkg/vignettes/protactiv.Rnw 2012-09-02 15:11:09 UTC (rev 7)
+++ pkg/vignettes/protactiv.Rnw 2012-09-04 23:20:56 UTC (rev 8)
@@ -1,699 +0,0 @@
-%% LyX 2.0.3 created this file. For more info, see http://www.lyx.org/.
-%% Do not edit unless you really know what you are doing.
-\documentclass[english,noae]{article}
-\usepackage{mathpazo}
-\usepackage[T1]{fontenc}
-\usepackage[latin9]{inputenc}
-\usepackage[letterpaper]{geometry}
-\geometry{verbose,tmargin=2.5cm,bmargin=2.5cm,lmargin=2.5cm,rmargin=2.5cm}
-\usepackage{babel}
-\usepackage{amsbsy}
-\usepackage{amssymb}
-\usepackage[numbers]{natbib}
-\usepackage[unicode=true,pdfusetitle,
- bookmarks=true,bookmarksnumbered=false,bookmarksopen=false,
- breaklinks=false,pdfborder={0 0 1},backref=false,colorlinks=false]
- {hyperref}
-\usepackage{breakurl}
-
-\makeatletter
-%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% Textclass specific LaTeX commands.
-<<echo=F>>=
- if(exists(".orig.enc")) options(encoding = .orig.enc)
-@
-
-%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% User specified LaTeX commands.
-%\VignetteIndexEntry{Protein abundances vs. equilibrium activities}
-
-% so DOIs in bibliography show up as hyperlinks
-\newcommand*{\doi}[1]{\href{http://dx.doi.org/#1}{doi: #1}}
-
-\makeatother
-
-\begin{document}
-
-\title{Protein abundances vs. equilibrium activities}
-
-
-\author{Jeffrey M. Dick}
-
-\maketitle
-
-\section{Introduction}
-
-The \texttt{diagram()} function serves multiple purposes that might
-be confusing to the new user. From its name, we know that it produces
-diagrams of some sort. These are equilibrium chemical activity diagrams
--- that is the primary purpose of the function. However, inspecting
-the arguments to the function reveals that the input values are the
-affinities of formation reactions of species in the system. How do
-we go from chemical affinities to chemical activities? This problem
-defines the purpose of two auxiliary functions (\texttt{equil.react()}
-and \texttt{equil.boltz()}) whose algorithms are described below.
-
-Some explanation of terminology is in order. By chemical activity
-we mean the quantity $a_{i}$ that appears in the expression
-\begin{equation}
-\mu_{i}=\mu_{i}^{\circ}+RT\ln a_{i}\,,\label{eq:mu}
-\end{equation}
-where $\mu_{i}$ and $\mu_{i}^{\circ}$ stand for the chemical potential
-and the standard chemical potential of the $i$th species, and $R$
-and $T$ represent the gas constant and the temperature in Kelvin.
-Chemical activity is related to molality ($m_{i}$) by
-\begin{equation}
-a_{i}=\gamma_{i}m_{i}\,,\label{eq:activity}
-\end{equation}
-where $\gamma_{i}$ stands for the activity coefficient of the $i$th
-species. For this discussion, we take $\gamma_{i}=1$ for all species,
-so chemical activity is assumed to be numerically equivalent to molality.
-Since molality is a measure of concentration, calculating the equilibrium
-chemical activities can be a theoretical tool to help understand the
-relative abundances of species, including proteins.
-
-After going over the methods used in CHNOSZ for equilibrium activity
-calculations, some comparisons with experimental protein abundance
-data are made.
-
-
-\section{\label{sec:point}Calculations at a single point}
-
-Here we discuss two procedures for calculating equilibrium activities
-of species. The first is a reaction-matrix approach and the second
-takes advantage of the Boltzmann distribution. We show (by example)
-that the two approaches are equivalent when the formation reactions
-of residue equivalents of proteins are used. The example system here
-has also been described in a paper \citep{Dic08}.
-
-
-\subsection{Reaction-matrix approach}
-
-The next two sections give examples of calculating the equilibrium
-activities of two proteins using a matrix of equations representing
-reactions to form the proteins. Although the examples below include
-only two proteins, each additional protein introduces one more equation
-and unknown, so this procedure can be carried out for any number of
-proteins given the necessary computational requirements.
-
-
-\subsubsection{Whole proteins}
-
-Let us calculate the equilibrium activities of two proteins in metastable
-equilibrium. To do this we start by writing the formation reactions
-of each protein as
-\begin{equation}
-stuff_{3}\rightleftharpoons\mathrm{CSG\_METVO}\,\label{react:CSG_METVO}
-\end{equation}
-and
-\begin{equation}
-stuff_{4}\rightleftharpoons\mathrm{CSG\_METJA}\,.\label{react:CSG_METJA}
-\end{equation}
-The basis species in the reactions are collectively symbolized by
-$stuff$; the subscripts simply refer to the reaction number in this
-document. In these examples, $stuff$ consists of $\mathrm{CO_{2}}$,
-$\mathrm{H_{2}O}$, $\mathrm{NH_{3}}$, $\mathrm{O_{2}}$, $\mathrm{H_{2}S}$
-and $\mathrm{H^{+}}$ in different molar proportions. To see what
-$stuff$ is, try out these commands in CHNOSZ:
-
-<<ProteinFormation>>=
-library(CHNOSZ)
-basis("CHNOS+")
-species("CSG",c("METVO","METJA"))
-@
-
-
-Although the basis species are defined, the temperature is not yet
-specified, so it is not immediately possible to calculate the ionization
-states of the proteins. That is why the coefficient on $\mathrm{H^{+}}$
-is zero in the output above. To see what the computed protein charges
-are at 25 $^{\circ}$C and 1 bar and at pH 7 (which is the opposite
-of the logarithm of activity of $\mathrm{H^{+}}$ in the basis species),
-try this:
-
-<<ProteinInfo>>=
-protein.info(species()$name)
-@
-
-
-Note that \texttt{affinity()} is called twice by \texttt{protein.info()};
-this so that both charges and standard Gibbs energies of ionization
-of the proteins can be calculated. The \texttt{Z} values in the table
-are the charges of the proteins computed using the ionization constants
-of sidechain and terminal groups, and the \texttt{G.Z} values are
-the calculated Gibbs energies of formation of the ionized proteins
-\citep{DLH06}. The \texttt{ZC} values are the average oxidation states
-of carbon of the proteins. Let us now calculate the chemical affinities
-of formation of the ionized proteins:
-
-<<ProteinAffinity>>=
-a <- affinity()
-a$values
-@
-
-
-Since \texttt{affinity()} returns a list with a lot of information
-(such as the basis species and species definitions) the last command
-was written to only print the \texttt{values} part of that list. The
-values are actually dimensionless, i.e. $\boldsymbol{A}/2.303RT$.
-
-The affinities of the formation reactions above were calculated for
-a \emph{reference value of activity of the proteins, which is not
-the equilibrium value}. Those non-equilibrium activities were $10^{-3}$.
-How do we calculate the equilibrium values? Let us write specific
-statements of the expression for chemical affinity (2.303 is used
-here to stand for $\ln10$),
-\begin{equation}
-\boldsymbol{A}=2.303RT\log(K/Q)\,,\label{eq:affinity}
-\end{equation}
-for Reactions \ref{react:CSG_METVO} and \ref{react:CSG_METJA} as
-\begin{equation}
-\begin{array}{rl}
-\boldsymbol{A}_{3}/2.303RT & =\log K_{3}-\log Q_{3}\\
- & =\log K_{3}+\log a_{stuff,3}-\log a_{\mathrm{CSG\_METVO}}\\
- & =\boldsymbol{A}_{3}^{*}/2.303RT-\log a_{\mathrm{CSG\_METVO}}
-\end{array}\label{eq:A3_METVO}
-\end{equation}
-and
-
-\begin{equation}
-\begin{array}{rl}
-\boldsymbol{A}_{4}/2.303RT & =\log K_{4}-\log Q_{4}\\
- & =\log K_{4}+\log a_{stuff,4}-\log a_{\mathrm{CSG\_METJA}}\\
- & =\boldsymbol{A}_{4}^{*}/2.303RT-\log a_{\mathrm{CSG\_METJA}}\,.
-\end{array}\label{eq:A4_METJA}
-\end{equation}
-The $\boldsymbol{A}^{*}$ denote the affinities of the formation reactions
-when the activities of the proteins are zero. From the output above
-it follows that $\boldsymbol{A}_{3}^{*}/2.303RT=104.6774$ and $\boldsymbol{A}_{4}^{*}/2.303RT=314.1877$.
-
-Next we must specify how reactions are balanced in this system: what
-is conserved during transformations between species (let us call it
-the immobile component)? For proteins, one possibility is to use the
-repeating protein backbone group. Let us use $n_{i}$ to designate
-the number of residues in the $i$th protein, which is equal to the
-number of backbone groups, which is equal to the length of the sequence.
-If $\gamma_{i}=1$ in Eq. (\ref{eq:activity}), the relationship between
-the activity of the $i$th protein ($a_{i}$) and the activity of
-the residue equivalent of the $i$th protein ($a_{residue,i}$) is
-\begin{equation}
-a_{residue,i}=n_{i}\times a_{i}\,.\label{eq:a_residue_i}
-\end{equation}
- We can use this to write a statement of mass balance:
-\begin{equation}
-553\times a_{\mathrm{CSG\_METVO}}+530\times a_{\mathrm{CSG\_METJA}}=1.083\,.\label{eq:a_residue}
-\end{equation}
-
-
-At equilibrium, the affinities of the formation reactions, per conserved
-quantity (in this case protein backbone groups) are equal. Therefore
-$\boldsymbol{A}=\boldsymbol{A}_{3}/553=\boldsymbol{A}_{4}/530$ is
-a condition for equilibrium. Combining this with Eqs. (\ref{eq:A3_METVO})
-and (\ref{eq:A4_METJA}) gives
-\begin{equation}
-\boldsymbol{A}/2.303RT=\left(104.6774-\log a_{\mathrm{CSG\_METVO}}\right)/553\label{eq:A_METVO}
-\end{equation}
-and
-\begin{equation}
-\boldsymbol{A}/2.303RT=\left(314.1877-\log a_{\mathrm{CSG\_METJA}}\right)/530\,.\label{eq:A_METJA}
-\end{equation}
-Now we have three equations (\ref{eq:a_residue}--\ref{eq:A_METJA})
-with three unknowns. The solution can be displayed in CHNOSZ as follows.
-The argument \texttt{residue=FALSE} overrides the default setting
-for \texttt{diagram} when proteins are the species of interest and
-instructs it to use the function named \texttt{equil.react()}, which
-implements the equation-solving strategy described in the next section.
-Here we retrieve the equilibrium activities using \texttt{diagram()}
-without letting it actually do any plotting.
-
-<<ProteinActivities>>=
-d <- diagram(a,residue=FALSE,plot.it=FALSE)
-d$logact
-@
-
-
-Those are the logarithms of the equilibrium activities of the proteins.
-Combining these values with either Eqs. (\ref{eq:A_METVO}) or (\ref{eq:A_METJA})
-gives us the same value for affinity of the formation reactions per
-residue (or per protein backbone group), $\boldsymbol{A}/2.303RT=0.5978817$.
-Equilibrium activities that differ by such great magnitudes make it
-appear that the proteins are very unlikely to coexist in metastable
-equilibrium. Later we explain the concept of using residue equivalents
-of the proteins to achieve a different result.
-
-
-\subsubsection{Implementing the reaction-matrix approach}
-
-The implementation used in CHNOSZ for finding a solution to the system
-of equations relies on a difference function for the activity of the
-immobile component. The steps to obtain this difference function are:
-\begin{enumerate}
-\item Set the total activity of the immobile (conserved) component as $a_{\mathrm{ic}}$
-(e.g., the 1.083 in Eqn. \ref{eq:a_residue}).
-\item Write a function for the logarithm of activity of each of the species
-of interest: $\boldsymbol{A}=\left(\boldsymbol{A}_{i}^{*}-2.303RT\log a_{i}\right)/n_{\mathrm{ic},i}$,
-where $n_{\mathrm{ic},i}$ stands for the number of moles of the immobile
-component that react in the formation of one mole of the $i$th species.
-(e.g., for systems of proteins where the backbone group is conserved,
-$n_{\mathrm{ic},i}$ is the same as $n_{i}$ in Eq. \ref{eq:a_residue_i}).
-Calculate values for each of the $\boldsymbol{A}_{i}^{*}$. Metastable
-equilibrium is implied by the identity of $\boldsymbol{A}$ in all
-of the equations.
-\item Write a function for the total activity of the immobile component:
-$a_{\mathrm{ic}}^{'}=\sum n_{\mathrm{ic},i}a_{i}$.
-\item The difference function is now $\delta a_{\mathrm{ic}}=a_{\mathrm{ic}}^{'}-a_{\mathrm{ic}}$.
-\end{enumerate}
-Now all we have to do is solve for the value of $\boldsymbol{A}$
-where $\delta a_{\mathrm{ic}}=0$. This is achieved in the code by
-first looking for a range of values of $\boldsymbol{A}$ where at
-one end $\delta a_{\mathrm{ic}}<0$ and at the other end $\delta a_{\mathrm{ic}}>0$,
-then using the \texttt{uniroot()} function that is part of R to find
-the solution.
-
-This approach is subject to failure if for all trial ranges of $\boldsymbol{A}$
-the $\delta a_{\mathrm{ic}}$ are of the same sign, which gives an
-error message like ``i tried it 1000 times but can't make it work''.
-Even if values of $\delta a_{\mathrm{ic}}$ on either side of zero
-can be located, the algorithm does not guarantee an accurate solution
-and may give a warning about poor convergence if a certain (currently
-hard-coded) tolerance is not reached.
-
-
-\subsubsection{Residue equivalents}
-
-Let us consider the formation reactions of the \emph{residue equivalents}
-of proteins, for example
-\begin{equation}
-stuff_{12}\rightleftharpoons\mathrm{CSG\_METVO(residue)}\,\label{react:CSG_METVO_residue}
-\end{equation}
-and
-\begin{equation}
-stuff_{13}\rightleftharpoons\mathrm{CSG\_METJA(residue)}\,.\label{react:CSG_METJA_residue}
-\end{equation}
-The formulas of the residue equivalents are those of the proteins
-divided by the number of residues in each protein. With the \texttt{protein.basis()}
-function it is possible to see the coefficients on the basis species
-in these reactions:
-
-<<>>=
-protein.basis(species()$name, residue=TRUE)
-@
-
-
-Let us denote by $\boldsymbol{A}_{12}$ and $\boldsymbol{A}_{13}$
-the chemical affinities of Reactions \ref{react:CSG_METVO_residue}
-and \ref{react:CSG_METJA_residue}. We can write
-\begin{equation}
-\boldsymbol{A}_{12}/2.303RT=\log K_{12}+\log a_{stuff,12}-\log a_{\mathrm{CSG\_METVO(residue)}}\label{eq:A3_METVO_residue}
-\end{equation}
-and
-
-\begin{equation}
-\boldsymbol{A}_{13}/2.303RT=\log K_{13}+\log a_{stuff,13}-\log a_{\mathrm{CSG\_METJA(residue)}}\,,\label{eq:A4_METJA_residue}
-\end{equation}
-For metastable equilibrium we have $\boldsymbol{A}_{12}/1=\boldsymbol{A}_{13}/1$.
-The $1$'s in the denominators are there as a reminder that we are
-still conserving residues, and that each reaction now is written for
-the formation of a single residue equivalent. So, let us write $\boldsymbol{A}$
-for $\boldsymbol{A}_{12}=\boldsymbol{A}_{13}$ and also define $\boldsymbol{A}_{12}^{*}=\boldsymbol{A}_{12}+2.303RT\log a_{\mathrm{CSG\_METVO(residue)}}$
-and $\boldsymbol{A}_{13}^{*}=\boldsymbol{A}_{13}+2.303RT\log a_{\mathrm{CSG\_METJA(residue)}}$.
-At the same temperature, pressure and activities of basis species
-and proteins as shown in the previous section, we can write $\boldsymbol{A}_{12}^{*}=\boldsymbol{A}_{3}^{*}/553=2.303RT\times0.1892901$
-and $\boldsymbol{A}_{13}^{*}=\boldsymbol{A}_{4}^{*}/530=2.303RT\times0.5928069$
-to give
-\begin{equation}
-\boldsymbol{A}/2.303RT=0.1892901-\log a_{\mathrm{CSG\_METVO(residue)}}\label{eq:A_METVO_residue}
-\end{equation}
-and
-\begin{equation}
-\boldsymbol{A}/2.303RT=0.5928069-\log a_{\mathrm{CSG\_METJA(residue)}}\,,\label{eq:A_METJA_residue}
-\end{equation}
-which are equivalent to Equations 12 and 13 in the paper \citep{Dic08}
-but with more decimal places shown. A third equation arises from the
-conservation of amino acid residues:
-\begin{equation}
-a_{\mathrm{CSG\_METVO(residue)}}+a_{\mathrm{CSG\_METJA(residue)}}=1.083\,.\label{eq:a_residue_2}
-\end{equation}
-The solution to these equations is $a_{\mathrm{CSG\_METVO(residue)}}=0.3065982$,
-$a_{\mathrm{CSG\_METJA(residue)}}=0.7764018$ and $\boldsymbol{A}/2.303RT=0.7027204$.
-
-The corresponding logarithms of activities of the proteins are $\log\left(0.307/553\right)=-3.256$
-and $\log\left(0.776/530\right)=-2.834$. These activities of the
-proteins are much closer to each other than those calculated using
-formation reactions for whole protein formulas, so this result seems
-more compatible with the actual coexistence of proteins in nature.
-
-The approach just described is not used by \texttt{diagram()} when
-\texttt{residue=TRUE} (which is the default setting). Instead, the
-Boltzmann distribution, described next, is implemented for that situation.
-
-
-\subsection{Boltzmann distribution}
-
-An expression for Boltzmann distribution, relating equilibrium activities
-of species to the affinities of their formation reactions, can be
-written as (using the same definitions of the symbols above)
-\begin{equation}
-\frac{a_{i}}{\sum a_{i}}=\frac{e^{\boldsymbol{A}_{i}^{*}/RT}}{\sum e^{\boldsymbol{A}_{i}^{*}/RT}}\,.\label{eq:Boltzmann}
-\end{equation}
-Using this equation, we can very quickly (without setting up a system
-of equations) calculate the equilibrium activities of proteins using
-their residue equivalents. Above, we saw $\boldsymbol{A}_{12}^{*}/2.303RT=0.1892901$
-and $\boldsymbol{A}_{13}^{*}/2.303RT=0.5928069$. Multiplying by $\ln10=2.302585$
-gives $\boldsymbol{A}_{12}^{*}/RT=0.4358565$ and $\boldsymbol{A}_{13}^{*}/RT=1.364988$.
-We then have $e^{\boldsymbol{A}_{12}^{*}/RT}=1.546287$ and $e^{\boldsymbol{A}_{13}/RT}=3.915678$.
-This gives us $\sum e^{A_{i}^{*}/RT}=5.461965$, $a_{12}/\sum a_{i}=0.2831009$
-and $a_{13}/\sum a_{i}=0.7168991$. Since $\sum a_{i}=1.083$, we
-arrive at $a_{12}=0.3065982$ and $a_{13}=0.7764018$, the same result
-as above. This example was also described in a recent paper \citep{DS11}.
-
-This computation can be carried out in CHNOSZ using the following
-commands, which implies \texttt{residue=TRUE} as the default setting
-for systems of proteins. This setting signifies to consider the formation
-reactions of the residue equivalents instead of the whole proteins,
-AND consequently to make a call to \texttt{equil.boltz()} rather than
-\texttt{equil.react()}.
-
-<<ResidueActivities1>>=
-d <- diagram(a,plot.it=FALSE)
-as.numeric(d$logact)
-@
-
-
-We can also specify \texttt{as.residue=TRUE} (which means to return
-the logarithms of activities of the residue equivalents rather than
-converting them to logarithms of activities of the proteins):
-
-<<ResidueActivities2>>=
-d <- diagram(a,as.residue=TRUE,plot.it=FALSE)
-10^as.numeric(d$logact)
-@
-
-
-Although this example includes only two proteins, this procedure is
-suitable for calculating the metastable equilibrium activities of
-any number of proteins.
-
-
-\section{Calculations as a function of a single variable}
-
-A comparison of the outcomes of equilibrium calculations that do and
-do not use the residue equivalents for proteins was given in a publication
-\citep{Dic08}. An expanded version of a diagram in that paper is
-below (though, without labels on the figures).
-
-<<ProteinSpeciation,fig=T>>=
-organisms <- c("METSC","METJA","METFE","HALJP","METVO","METBU","ACEKI","GEOSE","BACLI","AERSA")
-proteins <- c(rep("CSG",6),rep("SLAP",4))
-basis("CHNOS+")
-species(proteins,organisms)
-a <- affinity(O2=c(-100,-65))
-par(mfrow=c(2,1))
-diagram(a,ylim=c(-5,-1),legend.x=NULL,residue=FALSE)
-title(main="Equilibrium activities of proteins, whole formulas")
-diagram(a,ylim=c(-5,-1),legend.x=NULL)
-title(main="Equilibrium activities of proteins, residue equivalents")
-@
-
-
-The reaction-matrix approach described above can also be applied to
-systems having conservation coefficients that differ from unity, such
-as many mineral and inorganic systems, where the immobile component
-has different molar coefficients in the formulas. For example, consider
-a system like that described by Seewald, 1997 \citep{See96}:
-
-<<SulfurSpeciation,fig=T>>=
-basis('CHNOS+')
-basis('pH',5)
-species(c('H2S','S2-2','S3-2','S2O3-2','S2O4-2','S3O6-2','S5O6-2','S2O6-2','HSO3-','SO2','HSO4-'))
-a <- affinity(O2=c(-50,-15),T=325,P=350)
-par(mfrow=c(2,1))
-diagram(a,loga.balance=-2,ylim=c(-30,0),legend.x="topleft",cex.names=0.8)
-title(main="Aqueous sulfur speciation, whole formulas")
-diagram(a,loga.balance=-2,ylim=c(-30,0),legend.x="topleft",cex.names=0.8,residue=TRUE)
-title(main="Aqueous sulfur speciation, 'residue' equivalents")
-@
-
-
-The first diagram is quantitatively similar to the one shown by Seewald,
-1997, but in the second (where we have set \texttt{residue=TRUE})
-the range of activities is lower at any given $\log f_{\mathrm{O_{2}}_{\left(g\right)}}$.
-There, the function was told to rewrite the formation reactions of
-the aqueous sulfur species for their residue equivalents in the same
-way the formation reactions for the proteins were rewritten above.
-The number of ``residues'' in each species is the coefficient of
-the immobile component, in this case $\mathrm{H_{2}S}$, in the formation
-reaction.
-
-Maybe \texttt{residue=TRUE} doesn't make sense for systems where the
-formulas of species are similar in size to those of the basis species.
-For molecules as large as proteins it might be a useful concept. It
-is now (since CHNOSZ version 0.9) the defaut mode for \texttt{diagram()}
-when working with proteins.
-
-With the potential for calculating equilibrium activities of proteins
-comes the desire to compare these calculations to actual measurements!
-
-
-\section{Blood plasma proteins}
-
-Let's look at some protein abundance levels in human blood plasma.
-First get going with the experimental data. In CHNOSZ is a table listing
-the upper limits of the intervals, or ranges, of protein abundances
-taken from figures available in Anderson and Anderson, 2002, 2003
-\citep{AA02,AA03}. The protein abundances in the tables are in log10(pg/ml);
-let's convert that to molality. First locate the file with the abundance
-data. Then read it, with ``as.is'' so that strings are read as characters
-not factors (to avoid an error in the \texttt{species()} call further
-down). Then identify the protein named ``INS.C'' and drop it from
-the list. The reason for doing so is that preliminary calculations
-show it is much more stable than any other protein in the list. It
-is therefore an interesting outlier in terms of relative stabilities
-of the proteins.
-
-Then get the species indices of the proteins for thermodynamic calculations
-(with parameters based on amino acid compositions of the proteins
-listed in \texttt{thermo\$protein} ... and suppress messages that
-would fill up a whole page here). Then calculate the masses of the
-proteins. Then convert log10(pg/ml) to log10(mol/L) (logarithm of
-molarity). The conversion from pg/ml to g/L involves a factor of $10^{-9}$
-($\frac{10^{0}\mathrm{g}}{10^{12}\mathrm{pg}}\times\frac{10^{3}\mathrm{ml}}{10^{0}\mathrm{L}}$);
-then to get molarity we divide by mass ($\frac{\mathrm{g}}{\mathrm{mol}}$).
-
-<<PlasmaExpt>>=
-f <- system.file("extdata/abundance/AA03.csv",package="CHNOSZ")
-pdata <- read.csv(f, as.is=TRUE)
-pdrop <- which(pdata$name == "INS.C")
-pname <- pdata$name[-pdrop]
-iip <- suppressMessages(info(paste(pname,"HUMAN",sep="_")))
-pmass <- mass(thermo$obigt$formula[iip])
-loga.expt <- logm <- log10( 10^pdata$log10.pg.ml.[-pdrop] / 10^9 / pmass )
-@
-
-
-As implied by the ``loga'', we are assuming for the comparisons
-offered below that molarity (derived from the published abundance
-data) can be taken to be equal to molality and that molality can equated
-with chemical activity. The latter equality (the assumption of ideal
-behavior) especially should be subject to more scrutiny. We'll go
-ahead anyway and calculate, for ideality, the equilibrium activities
-of the proteins. First we need to calculate the total activity of
-residues from the experimental data, but to do that we need even more
-firstly the lengths of the proteins.
-
-<<PlasmaAct>>=
-pl <- protein.length(paste(pname,"HUMAN",sep="_"))
-logares.tot <- sum(10^loga.expt * pl)
-@
-
-
-Our total activity (\emph{not} the logarithm of it) of residues turns
-out to be about 200, which for our average protein length of 637 works
-out to about 0.3 for the average protein, if the total activity of
-residues could be attributed to that single average protein.
-
-Now let's get down to the stuff CHNOSZ is made for. First define the
-basis species. Then define the species, being the proteins. Then calculate
-the affinities of the formation reactions of the proteins. Then calculate
-the equilibrium activities, but don't plot them by themselves. Instead,
-use \texttt{revisit} to compare the equilibrium activities to the
-experimental abundances.
-
-\setkeys{Gin}{width=0.6\textwidth}
-<<PlasmaComparisonRed,fig=T>>=
-basis("CHNOS+")
-s <- species(pname,"HUMAN")
-a <- affinity()
-d <- diagram(a,loga.balance=logares.tot,plot.it=FALSE)
-pch <- as.numeric(as.factor(pdata$type))
-revisit(d,"rmsd",loga.ref=loga.expt,pch=pch)
-legend("bottomright",pch=unique(pch),legend=unique(pdata$type))
-@
-\setkeys{Gin}{width=1.0\textwidth}
-
-
-There seems to be almost no relation between the reference values
-and the calculated ones. But what if we increase the oxygen fugacity?
-$\log f_{\mathrm{O_{2}}_{\left(g\right)}}=-80$ might be appropriate
-for some subcellular conditions, or reduced hydrothermal systems.
-Blood is exposed to oxygen after all... let's try $\log f_{\mathrm{O_{2}}_{\left(g\right)}}=-60$.
-
-\setkeys{Gin}{width=0.6\textwidth}
-<<PlasmaComparisonOxid,fig=T>>=
-basis("O2",-60)
-a <- affinity()
-d <- diagram(a,loga.balance=logares.tot,plot.it=FALSE)
-revisit(d,"rmsd",loga.ref=loga.expt,pch=pch)
-legend("topleft",pch=unique(pch),legend=unique(pdata$type))
-@
-\setkeys{Gin}{width=1.0\textwidth}
-
-
-Well it's still quite scattered. However, the RMSD has decreased considerably,
-the loess fit has a positive slope, and the dynamic ranges of the
-calculations and observations are more similar.
-
-
-\section{Comparison with expression profile in \emph{E. coli}}
-
-Amino acid compositions of proteins in \emph{Escherichia coli} are
-provided with CHNOSZ at \texttt{extdata/protein/ECO.csv.xz}. Protein
[TRUNCATED]
To get the complete diff run:
svnlook diff /svnroot/chnosz -r 8
More information about the CHNOSZ-commits
mailing list