[Qpcr-commits] r122 - pkg/ReadqPCR/inst/doc

Thu Sep 9 09:56:42 CEST 2010

Author: jperkins
Date: 2010-09-09 09:56:42 +0200 (Thu, 09 Sep 2010)
New Revision: 122

Added:
   pkg/ReadqPCR/inst/doc/ReadqPCR.Rnw
Log:
readded ReadqPCR.Rnw, strange svn behaviour when adding an old version then trying to re-commit

Added: pkg/ReadqPCR/inst/doc/ReadqPCR.Rnw
===================================================================

--- pkg/ReadqPCR/inst/doc/ReadqPCR.Rnw	                        (rev 0)
+++ pkg/ReadqPCR/inst/doc/ReadqPCR.Rnw	2010-09-09 07:56:42 UTC (rev 122)
@@ -0,0 +1,206 @@
+%\VignetteIndexEntry{Functions to load RT-qPCR data into R}
+%\VignetteDepends{stats,Biobase,methods}                            
+%\VignetteKeywords{real-time, quantitative, PCR, housekeeper, reference gene, geNorm, NormFinder}
+%\VignettePackage{ReadqPCR}
+%
+\documentclass[11pt]{article}
+\usepackage{geometry}\usepackage{color}
+\definecolor{darkblue}{rgb}{0.0,0.0,0.75}
+\usepackage[%
+baseurl={http://qpcr.r-forge.r-project.org/},%
+pdftitle={ReadqPCR: Functions to load RT-qPCR data into R},%
+pdfauthor={James Perkins},%
+pdfsubject={Functions to load RT-qPCR data into R},%
+pdfkeywords={real-time, quantitative, PCR, qPCR, RT-qPCR, load data},%
+pagebackref,bookmarks,colorlinks,linkcolor=darkblue,citecolor=darkblue,%
+pagecolor=darkblue,raiselinks,plainpages,pdftex]{hyperref}
+%
+\markboth{\sl Package ``{\tt ReadqPCR}''}{\sl Package ``{\tt ReadPCR}''}
+%
+% -------------------------------------------------------------------------------
+\newcommand{\code}[1]{{\tt #1}}
+\newcommand{\pkg}[1]{{\tt "#1"}}
+\newcommand{\myinfig}[2]{%
+%  \begin{figure}[htbp]
+    \begin{center}
+      \includegraphics[width = #1\textwidth]{#2}
+%      \caption{\label{#1}#3}
+    \end{center}
+%  \end{figure}
+}
+% -------------------------------------------------------------------------------
+%
+% -------------------------------------------------------------------------------
+\begin{document}
+\SweaveOpts{keep.source = TRUE, eval = TRUE, include = FALSE}
+%-------------------------------------------------------------------------------
+\title{ReadqPCR: Functions to load RT-qPCR data into R}
+%-------------------------------------------------------------------------------
+\author{James Perkins\\
+University College London\medskip\\
+}
+\maketitle
+\tableofcontents
+%-------------------------------------------------------------------------------
+\section{Introduction}
+The package \pkg{ReadqPCR} contains different functions for reading qPCR data into R. Different proprietary software used for the different high throghput real-time quantitative polymerase chain reaction (RT-qPCR) systems available produce different formats of output data. ReadqPCR contains functions to read some of these different formats into R so that the data can be manipulated. It also allows the user to read their own RT-qPCR data into R.
+
+As well as the functions to read in the data, \pkg{ReadqPCR} contains the \code{qPCRBatch} class definition.
+The data output by these RT-qPCR systems is in the form of cycle threshold, or Ct values, which represents the number of cycles of amplification needed in order to detect the expression of a given gene from a sample.
+
+ReadqPCR is designed to be complementary to 2 other R modules: QCqPCR and NormqPCR, which are intended for (respectively) the quality control and normalisation of qPCR data. It must be installed before the other two modules can be.
+%-------------------------------------------------------------------------------
+\section{read.qPCR}
+
+\code{read.qPCR} allows the user to read in qPCR data and populate a \code{qPCRBatch} R object (see section \code{qPCRBatch}) using their own data matrix. 
+The format of the data file should be tab delimited and have the following columns, the first two of which are optional (although they should either be provided both together, or not at all):
+
+\begin{description}
+    \item[Well] Optional, this represents the position of the detector on a given plate. This information, if given, will be used to check the plates are of the same size and will also be used in order to plot a representation of the card to look for spatial effects and other potential problems. Both Well number and Plate ID must be present to enable a plate to be plotted.
+    \item[Plate ID] Optional, this is an identifier for the plate on which an experiment was performed. It is not possible to have duplicate plate IDs with the same Well number. Neither is it possible to have Plate Ids without Well numbers. Both Well number and Plate ID must be present to enable a plate to be plotted.
+    \item[Sample] The sample being analysed. Each sample must contain the same detectors in order to combine and compare samples effectively and to form a valid expression set matrix.
+    \item[Detector] This is the identifier for the gene being investigated. The Detectors must be identical for each sample.
+    \item[Ct] This is the cycle threshold for a given detector in the corresponding sample.
+\end{description}
+
+The generic function \code{read.qPCR} is called to read in the qPCR file. It is similar to the \code{read.affybatch} function of the \pkg{affy} package, in that it reads a file and automatically populates an R object, \code{qPCRBatch} described below. However it is different in that the file is user formatted. In addition, unlike \code{read.affybatch}, and also unlike the \code{read.taqman} function detailed below, only one file may be read in at a time.
+
+If \code{Well} and \code{Plate ID} information are given, then these are used to populate the \code{exprs.well.order}, a new \code{assayData} slot introduced in the \code{qPCRBatch} object, as detailed below in section \code{qPCRBatch}.
+
+So for the \code{qPCR.example.txt} file, in directory \code{exData} of this library, which contains \code{Well} and \code{Plate ID} information, as well as the mandatory \code{Sample}, \code{Detector} and \code{Ct} information, we can read in the data as follows.
+<<read.qPCR>>=
+library(ReadqPCR) # load the ReadqPCR library
+path <- system.file("exData", package = "ReadqPCR")
+qPCR.example <- paste(path, "/qPCR.example.txt", sep="")
+qPCRBatch.qPCR <- read.qPCR(qPCR.example)
+@
+
+\code{qPCRBatch.qPCR} will be a \code{qPCRBatch} object with an exprs and exprs.well.order, as well as a phenoData slot which gets automatically populated in the same way as when using \code{read.affybatch}. More detail is given in the \code{qPCRBatch} section below.
+
+read.qPCR can deal with technical replicates. If the same detector and sample identifier occurs more than once, the suffix \code{\_TechRep.n} is concatenated to the detector name, where $n$ in \{$1, 2...N$
+\} is the number of the replication in the total number of replicates, $N$, based on
+order of appearence in the qPCR data file.
+So for a qPCR file with 2 technical replicates of 8 detectors on each sample, with one sample per plate, the detector names would be amended as follows:
+
+<<read.qPCR.tech.reps>>=
+qPCR.example.techReps <- paste(path,"/qPCR.techReps.txt", sep = "")
+qPCRBatch.qPCR.techReps <- read.qPCR(qPCR.example.techReps)
+rownames(exprs(qPCRBatch.qPCR.techReps))[1:8]
+@
+
+
+The reason for appending the suffix when technical replicates are encountered is in order to populate the \code{exprs} and \code{exprs.well.order} slots correctly and keep them to the \code{assayData} format.
+It also allows the decisions on how to deal with the analysis and combination of technical replicates to be controlled by the user, either using the \pkg{NormqPCR} package, or potentially some other function that takes \code{assayData} format R objects as input.
+
+\section{read.taqman}
+
+\code{read.taqman} allows the user to read in the data output by the Sequence Detection Systems (SDS) software which is the software used to analyse the Taqman Low Density Arrays. 
+This data consists of the header section, which gives some general information about the experiment, run date etc., followed by the raw Cts values detected by the software, followed by summary data about the experiment.
+\code{read.taqman} is a generic function, and is called in a way similar to the \code{read.affybatch} function of the \pkg{affy} package.
+
+<<read.taqman>>=
+taqman.example <- paste(path, "/example.txt", sep="")
+qPCRBatch.taq <- read.taqman(taqman.example)
+@
+
+Currently the SDS software only allows up to 10 plates to be output onto one file. read.taqman allows any number of SDS output files to be combined to make a single \code{qPCRBatch}, as long as they have matching detector identifiers.
+
+<<read.taqman.two>>=
+path <- system.file("exData", package = "ReadqPCR")
+taqman.example <- paste(path, "/example.txt", sep="")
+taqman.example.second.file <- paste(path, "/example2.txt", sep="")
+qPCRBatch.taq.two.files <- read.taqman(taqman.example, 
+                             taqman.example.second.file)
+@
+
+SDS output will not necessarily contain plate identifiers, in which case a numeric identifier will be generated, which will increment for each plate, depending on the order of the plates within the SDS files.
+This is important for filling the \code{exprs.well.order} slot of the \code{qPCRBatch}, which is used for assessing the quality of different arrays, using the \pkg{QCqPCR} package, as explained in section \code{qPCRBatch} and in the vignette for \pkg{QCqPCR}.
+
+read.taqman can also deal with technical replicates. If the same detector and
+sample identifier occurs more than once, the suffix \code{\_TechRep.n} will be concatenated to the detector name, where $n$ in \{$1, 2...N$\} is the number of the replication in the total number of replicates $N$, based on the order of occurence in the taqman data file.
+So for a taqman file with 4 technical replicates of 96 detectors per sample, with one sample per plate, the detector names would be amended as follows:
+
+<<read.taqman.tech.reps>>=
+taqman.example.tech.reps <- paste(path,"/exampleTechReps.txt", sep = "")
+qPCRBatch.taq.tech.reps <- read.taqman(taqman.example.tech.reps)
+rownames(exprs(qPCRBatch.taq.tech.reps))[1:8]
+@
+
+As with read.qPCR, the motivation for appending the suffix when technical replicates are encountered is in order to populate the \code{exprs} and \code{exprs.well.order} slots correctly and keep them to the \code{assayData} format.
+Again it allows the decisions on how to deal with the analysis of technical replicates to be controlled by the user, either using the \pkg{NormqPCR} package, or otherwise.
+
+\section{qPCRBatch}
+\code{qPCRBatch} is an S4 class, designed to store information on the Ct raw values
+which  represents the relative gene expression for a given sample, phenotypic information on the different samples which enable the user to compare expression accross different conditions or cell lines, and information on the spatial location of the different detectors used to measure Ct. This is achieved by
+making qPCRBatch an an extension of eSet, which means we can recycle slots such as exprs and pData, and by introducing a new \code{assyData} slot.
+Here is an example of what a qPCRBatch looks like. note the similarity to eSet:
+
+<<taqman.qPCRBatch>>=
+qPCRBatch.taq
+@
+
+pData will be filled automatically if no data is given, in a way analagous to read.affybatch:
+
+<<taqman.pData>>=
+pData(qPCRBatch.taq)
+@
+
+However it is advantageous to use pData as this information can be read by methods and functions in the \pkg{NormqPCR} and \pkg{QCqPCR} packages.
+In addition there is a new slot, \code{exprs.well.order} which extends the \code{assayData} slot used for \code{exprs()}.
+It has the same dimensions as \code{exprs} (as every instance of \code{assayData} must), which are m rows of genes and n rows of samples.
+The cells contain further details on the position on the arrays where the different meaurements were taken.
+
+The data provided by this slot can be used in order to identify certain problems with arrays, perhaps due to spatial effects and other  problems with the microfluidics technology that is used by many of these systems (see \pkg{QCqPCR} for more details).
+
+This is conceptually similar to the cdf file information being stored in the \code{AffyBatch} class, which contains information on the spatial layout of features on an affy chip. However it differs since it allows for different arrays within the same \code{affyBatch} object to have different layouts to each other.
+This information can be viewed using the \code{exprs.well.order()} function and is later used in the \pkg{QCqPCR} package in order to produce pseudoplots of the qPCR cards, in a method analagous to the pseudo-images produced by \pkg{affyPLM}.
+
+When using \code{read.taqman}, if the input file includes identifiers for the different arrays in the experiment, the identifiers will be of the format \code{<plate.id>-<plate.position>}.
+However if no names are given for the different plates, \pkg{ReadqPCR} will assign them a numeric identifier, which increments depending on the order of plates in the original file.
+When several input files are given, as in the case of SDS files, the order in which they are supplied as arguments to the \code{read.taqman} function will be mirrored in the order of the numeric identifiers for the different plates. However, to minimise confusion, we recommend the useR giving the plates their own unique identifiers where possible.
+
+Without plate names:
+<<taqman.exprs.well.order>>=
+head(exprs.well.order(qPCRBatch.taq))
+@
+
+With plate names:
+<<taqman.exprs.well.order.plate.names>>=
+taqman.example.plateNames <-paste(path,"/exampleWithPlateNames.txt",sep="")
+qPCRBatch.taq.plateNames <- read.taqman(taqman.example.plateNames)
+head(exprs.well.order(qPCRBatch.taq.plateNames))
+@
+
+In addition, a mixture of files with and without plate identifiers is possible. 
+
+<<taqman.exprs.well.order.plate.and.not>>=
+taqman.example <- paste(path, "/example.txt", sep="")
+taqman.example.plateNames <-paste(path,"/exampleWithPlateNames.txt",
+                                                               sep="")
+qPCRBatch.taq.mixedPlateNames <- read.taqman(taqman.example, 
+                                    taqman.example.plateNames)
+head(exprs.well.order(qPCRBatch.taq.mixedPlateNames))
+@
+
+If the files to be combined do not have matching detector names, or if duplicate sample or plate names are given, read.taqman will stop and give an error message.
+\\
+When reading in \code{qPCR} files with \code{read.qPCR}, \code{exprs.well.order} will be populated as long as \code{Well} and \code{Plate ID} columns are given in the input file, otherwise the \code{exprs.well.order} slot will be \code{NULL}.
+
+So when plate ID and Well data are given:
+
+<<qPCR.exprs.well.order.withPlateId>>=
+head(exprs.well.order(qPCRBatch.qPCR))
+@
+
+And when they are not:
+
+<<qPCR.exprs.well.order.noPlateId>>=
+qPCR.example.noPlateOrWell <- paste(path, "/qPCR.noPlateOrWell.txt", sep="")
+qPCRBatch.qPCR.noPlateOrWell <- read.qPCR(qPCR.example.noPlateOrWell)
+exprs.well.order(qPCRBatch.qPCR.noPlateOrWell)
+@
+
+Once a qPCRBatch has been populated it is theoretically possible to use any tool which takes as it's input an \code{exprs} set matrix. However it is important to bear in mind the values are not raw expression values but Ct values, which means amongst other things that the values will be on the log scale, and a lower the Ct will indicate a higher expression level for a given transcript in the sample.
+Therefore we recommend the use of an algorithm such as those available in \pkg{NormqPCR} in order to calculate a more meaningful expression value. Also bear in mind that the amount is relative and is intended to be compared to another condition or tissue type in order to look for differential expression between condition; the technology is not designed to give absolute quantification.
+
+\end{document}