[Genabel-commits] r1130 - in pkg/DatABEL: . inst/doc vignettes
noreply at r-forge.r-project.org
Tue Mar 12 00:35:07 CET 2013
Author: lckarssen
Date: 2013-03-12 00:35:07 +0100 (Tue, 12 Mar 2013)
New Revision: 1130
Added:
pkg/DatABEL/vignettes/
pkg/DatABEL/vignettes/intro_DatABEL.Rnw
Removed:
pkg/DatABEL/inst/doc/intro_DatABEL.Rnw
Log:
Fixed the last warnings in DatABEL when building for CRAN.
Deleted: pkg/DatABEL/inst/doc/intro_DatABEL.Rnw
===================================================================
--- pkg/DatABEL/inst/doc/intro_DatABEL.Rnw 2013-03-11 23:21:57 UTC (rev 1129)
+++ pkg/DatABEL/inst/doc/intro_DatABEL.Rnw 2013-03-11 23:35:07 UTC (rev 1130)
@@ -1,330 +0,0 @@
-%\VignetteIndexEntry{Introduction to DatABEL}
-\documentclass{article}
-
-\usepackage{hyperref}
-
-\hypersetup{colorlinks,%
- citecolor=black,%
- linkcolor=blue,%
- urlcolor=blue,%
- }
-
-\newcommand{\DA}{\texttt{DatABEL} }
-
-\title{Introduction to \DA}
-\author{Yurii S. Aulchenko, Stepan Yakovenko}
-%\date{\today}
-
-\begin{document}
-
-\maketitle
-\tableofcontents
-
-\section{Introduction}
-
-This vignette demonstrates the use of all major \texttt{DatABEL}
-functions. Central to the \texttt{DatABEL} library is the \texttt{databel} class, which
-is defined as follows:
-\begin{verbatim}
-setClass(
- Class = "databel",
- representation = representation(
- usedRowIndex = "integer",
- usedColIndex = "integer",
- uninames = "list",
- backingfilename = "character",
- cachesizeMb = "integer",
- data = "externalptr"
- ),
- package = "DatABEL"
-);
-\end{verbatim}
-Here, \texttt{data} is an external pointer to an instance of the \texttt{FilteredMatrix}
-class of the \texttt{filevector} library, \texttt{usedRowIndex} and \texttt{usedColIndex}
-keep the indexes of the unmasked rows and columns, \texttt{backingfilename} is the
-base name of the \texttt{filevector} data/index files, and \texttt{cachesizeMb} specifies
-the amount of RAM used for the cache. The \texttt{uninames} list specifies whether
-the column and/or row names are unique and thus may be used to access the data.
-
-The methods defined for the \texttt{databel} class are similar to those
-defined for standard matrices and allow the user to
-(throughout, \texttt{DAdata} refers to an object of the \texttt{databel} class):
-\begin{itemize}
-\item Obtain information about underlying data (\texttt{show}, \texttt{dim}, \texttt{dimnames},
-\texttt{get\_dimnames}, \texttt{length}, \texttt{backingfilename} and \texttt{cachesizeMb}).
-The function \texttt{get\_dimnames} returns a list with the row and column names defined
-for the data object; the function \texttt{dimnames} does so only if the names are unique;
-if the row or column names are not unique, \texttt{NULL} is returned for that dimension.
-\item Set some attributes (\texttt{dimnames<-}, \texttt{set\_dimnames<-},
-\texttt{cachesizeMb<-} and \texttt{setReadOnly<-}).
-\item Connect and disconnect an R object of the \texttt{databel} class to/from the
-underlying binary data (\texttt{connect} and \texttt{disconnect}; these functions
-initiate or destroy an instance of \texttt{FilteredMatrix}).
-\item Save a (sub-set of a) \texttt{databel} matrix as a new binary set of files (\texttt{save\_as})
-or export to plain text files (\texttt{databel2text}).
-\item Obtain sub-sets of a \texttt{databel} object (operation \texttt{[}).
-\item Replace values in the matrix (operation \texttt{[<-}).
-\item Coerce a \texttt{databel} matrix to a standard R matrix or vector, and
-coerce an R matrix to a \texttt{databel} matrix.
-\end{itemize}
-
-Internally, \texttt{databel} data may be stored as one of eight different types
-(float, double, signed/unsigned (short) int, signed/unsigned byte).
-In C++, two of these (double and float) have support for missing
-values ('not a number'). For the rest, the maximal value of the type
-is reserved to represent missing data.
-
-Additionally, functions to convert plain text files to \texttt{databel} format
- (\texttt{text2databel}) and
-to export \texttt{databel} data to plain text (\texttt{databel2text}) are provided.
-Another function (\texttt{apply2dfo}) is similar to the standard R \texttt{apply}
-and allows application of a user-defined function to all rows/columns of the
-data.
-
-\section{Conversion of the data to \texttt{databel} format, initialization of
-\texttt{databel} objects, and value modifications}
-
-<<echo=FALSE>>=
-unlink("*.fv?")
-unlink("*.txt")
-@
-
-To start using \texttt{DatABEL} you first need to load the library:
-<<>>=
-library(DatABEL)
-@
-
-We will first create an R matrix and then convert it to
-\texttt{databel} format. To begin,
-create the R matrix:
-<<>>=
-matr <- matrix (c(1:12),ncol=3,nrow=4)
-matr[3,2] <- NA
-matr
-dimnames(matr) <- list(paste("row",1:4,sep=""),paste("col",1:3,sep=""))
-matr
-@
-
-Conversion from an R matrix to \texttt{databel} may be performed in
-two ways: using the generic 'as' function or the \texttt{matrix2databel}
-function. The difference is that when using 'as' the backing data
-file is given a randomly generated name and the type used for
-storage is 'double', while with the
-\texttt{matrix2databel} function the user may choose the backing data
-file name and the data type themselves. Thus, 'as' should be used to
-create temporary \texttt{databel} objects:
-<<>>=
-list.files(pattern=glob2rx("*.fv?"))
-dat1 <- as(matr,"databel")
-list.files(pattern=glob2rx("*.fv?"))
-@
-
-You can see that after application of the \texttt{as} method,
-two files backing 'dat1' have appeared.
-
-The 'show' method shows basic information for the object:
-<<>>=
-dat1
-@
-
-Note that for big matrices only summaries and a small part of the data
-will appear on the screen.
-
-To keep the naming of the backing files, the underlying data type
-and other details under control, use the
-\texttt{matrix2databel} function:
-<<>>=
-dat2 <- matrix2databel(matr, filename="matr",cachesizeMb=16, type="UNSIGNED_CHAR",readonly=FALSE)
-dat2
-@
-
-You can see that now the backing files are \texttt{matr.fvd} and \texttt{matr.fvi}:
-<<>>=
-list.files(pattern=glob2rx("*.fv?"))
-@
-
-If you try to create a new object with the same backing files,
-an error will appear.
-
-A new \texttt{databel} object can be initialized directly from
-the backing file:
-<<>>=
-dat3 <- databel("matr")
-dat3
-@
-
-A \texttt{databel} object can also be created from a text file.
-First, we will create a text file
-<<>>=
-write.table(matr,"matr.txt",row.names=TRUE,col.names=TRUE,quote=FALSE)
-@
-and then convert that to \texttt{databel} format
-<<>>=
-dat4 <- text2databel("matr.txt",outfile="matr1",R_matrix=TRUE,type="UNSIGNED_INT")
-dat4
-@
-
-Finally, a \texttt{databel} object can be initialized from another \texttt{databel}
-object
-<<>>=
-dat5 <- dat4
-@
-or, through use of \texttt{'['}
-<<>>=
-dat6 <- dat1[c("row1","row3"),c("col1","col2")]
-dat6
-@
-
-Thus, at the moment we have generated five \texttt{databel} objects containing
-identical data (though the underlying types differ: double, unsigned byte and
-unsigned int) and one object ('dat6') which contains a subset of the data.
-Objects 'dat1' and 'dat6' are using the same backing data file
-\texttt{\Sexpr{backingfilename(dat1)}},
-objects 'dat4' and 'dat5' are connected to \texttt{\Sexpr{backingfilename(dat4)}}, and
-'dat2' and 'dat3' are connected to \texttt{\Sexpr{backingfilename(dat2)}}.
-
-The data contained in \texttt{databel} matrices may be modified
-using the \texttt{[<-} method:
-<<>>=
-dat1[1,1] <- 321
-@
-
-Note that because 'dat1' and 'dat6' are connected to the same binary
-data, modification of 'dat1' leads automatically to modification of
-'dat6':
-<<>>=
-dat6
-@
-
-To avoid read/write conflicts, all subsequent objects
-based on the same backing files are connected in
-read-only mode (so that trying '\texttt{dat6[1,1] <- 123}'
-will generate an error). We will show how to work around this
-situation at the end of the next section.
-
-\section{Obtaining and modifying attributes}
-
-Several standard methods defined for matrices are defined for
-\texttt{databel} matrices as well. For example:
-<<>>=
-dim(dat1)
-length(dat1)
-dimnames(dat1)
-colnames(dat1)
-rownames(dat1)
-@
-
-The method \texttt{dimnames<-} may be used to modify the
-names:
-<<>>=
-dimnames(dat1) <- list(paste("ID",1:4,sep=""),paste("SNP",1:3,sep=""))
-dimnames(dat1)
-@
-
-Additional methods defined for \texttt{databel} matrices
-provide information about the backing file name
-<<>>=
-backingfilename(dat1)
-@
-and the size of the cache used
-<<>>=
-cachesizeMb(dat1)
-@
-
-The size of cache can be modified by
-<<>>=
-cachesizeMb(dat1) <- 1
-cachesizeMb(dat1)
-@
-
-The method \texttt{get\_dimnames} is defined to obtain
-row/column names in case these are not unique.
-To demonstrate the use of this method, we first need to
-create a \texttt{databel} matrix with non-unique
-dimnames. To set such non-unique names, we will use the
-method \texttt{set\_dimnames<-}:
-<<>>=
-set_dimnames(dat1) <- list(dimnames(dat1)[[1]],c("duplicate","col2","duplicate"))
-@
-
-Now \texttt{dimnames} returns \texttt{NULL} for the second dimension
-names:
-<<>>=
-dimnames(dat1)
-@
-while \texttt{get\_dimnames} still allows access to the names:
-<<>>=
-get_dimnames(dat1)
-@
-
-Finally, the read-only flag can be modified. The following code
-demonstrates how to make the 'dat6' object writable and modify it:
-<<>>=
-disconnect(dat1)
-setReadOnly(dat6) <- FALSE
-dat6[1,1] <- 123
-dat6
-dat1
-@
-
-\section{Coercion and exports}
-
-A standard R matrix can be obtained from a \texttt{databel} matrix
-using the 'as' function:
-<<>>=
-newm <- as(dat2,"matrix")
-class(newm)
-class(newm[1,1])
-newm
-@
-
-Data from a \texttt{databel} matrix may be exported to a text file
-using the \texttt{databel2text} function:
-<<>>=
-databel2text(dat2,file="dat2.txt")
-@
-
-Now 'dat2.txt' contains the data, which can be read with
-<<>>=
-read.table("dat2.txt")
-@
-
-\section{Using \texttt{apply2dfo} function}
-
-The \texttt{apply2dfo} function allows
-complex analyses of data stored in a \texttt{databel}
-matrix. We will demonstrate the basic use of this function here.
-First, we will compute column and row sums:
-<<>>=
-apply2dfo(SNP,dfodata=dat2,anFUN="sum",MAR=2)
-apply2dfo(SNP,dfodata=dat2,anFUN="sum",MAR=1)
-@
-Here 'SNP' stands for the current analysis variable (row or column)
-and allows specification of more complicated analyses, e.g.
-<<>>=
-apply2dfo(SNP^2,dfodata=dat2,anFUN="sum",MAR=2)
-@
-or analyses such as consecutive linear regressions
-<<>>=
-Y <- rnorm(4)
-apply2dfo(Y~SNP,dfodata=dat2,anFUN="lm",MAR=2)
-apply2dfo(Y~SNP+I(SNP^2),dfodata=dat2,anFUN="lm",MAR=2)
-@
-
-Even more complicated analysis may be done by the user specifying
-their own analysis and result processing functions (see package
-documentation).
-
-\section{Citation}
-
-WILL BE UPDATED AT THE TIME THE PAPER IS ACCEPTED
-
-<<echo=FALSE>>=
-rm(list=ls())
-gc()
-unlink("*.fv?")
-unlink("*.txt")
-@
-
-
-\end{document}
Copied: pkg/DatABEL/vignettes/intro_DatABEL.Rnw (from rev 1128, pkg/DatABEL/inst/doc/intro_DatABEL.Rnw)
===================================================================
--- pkg/DatABEL/vignettes/intro_DatABEL.Rnw (rev 0)
+++ pkg/DatABEL/vignettes/intro_DatABEL.Rnw 2013-03-11 23:35:07 UTC (rev 1130)
@@ -0,0 +1,330 @@
+%\VignetteIndexEntry{Introduction to DatABEL}
+\documentclass{article}
+
+\usepackage{hyperref}
+
+\hypersetup{colorlinks,%
+ citecolor=black,%
+ linkcolor=blue,%
+ urlcolor=blue,%
+ }
+
+\newcommand{\DA}{\texttt{DatABEL} }
+
+\title{Introduction to \DA}
+\author{Yurii S. Aulchenko, Stepan Yakovenko}
+%\date{\today}
+
+\begin{document}
+
+\maketitle
+\tableofcontents
+
+\section{Introduction}
+
+This vignette demonstrates the use of all major \texttt{DatABEL}
+functions. Central to the \texttt{DatABEL} library is the \texttt{databel} class, which
+is defined as follows:
+\begin{verbatim}
+setClass(
+ Class = "databel",
+ representation = representation(
+ usedRowIndex = "integer",
+ usedColIndex = "integer",
+ uninames = "list",
+ backingfilename = "character",
+ cachesizeMb = "integer",
+ data = "externalptr"
+ ),
+ package = "DatABEL"
+);
+\end{verbatim}
+Here, \texttt{data} is an external pointer to an instance of the \texttt{FilteredMatrix}
+class of the \texttt{filevector} library, \texttt{usedRowIndex} and \texttt{usedColIndex}
+keep the indexes of the unmasked rows and columns, \texttt{backingfilename} is the
+base name of the \texttt{filevector} data/index files, and \texttt{cachesizeMb} specifies
+the amount of RAM used for the cache. The \texttt{uninames} list specifies whether
+the column and/or row names are unique and thus may be used to access the data.
+
+The methods defined for the \texttt{databel} class are similar to those
+defined for standard matrices and allow the user to
+(throughout, \texttt{DAdata} refers to an object of the \texttt{databel} class):
+\begin{itemize}
+\item Obtain information about underlying data (\texttt{show}, \texttt{dim}, \texttt{dimnames},
+\texttt{get\_dimnames}, \texttt{length}, \texttt{backingfilename} and \texttt{cachesizeMb}).
+The function \texttt{get\_dimnames} returns a list with the row and column names defined
+for the data object; the function \texttt{dimnames} does so only if the names are unique;
+if the row or column names are not unique, \texttt{NULL} is returned for that dimension.
+\item Set some attributes (\texttt{dimnames<-}, \texttt{set\_dimnames<-},
+\texttt{cachesizeMb<-} and \texttt{setReadOnly<-}).
+\item Connect and disconnect an R object of the \texttt{databel} class to/from the
+underlying binary data (\texttt{connect} and \texttt{disconnect}; these functions
+initiate or destroy an instance of \texttt{FilteredMatrix}).
+\item Save a (sub-set of a) \texttt{databel} matrix as a new binary set of files (\texttt{save\_as})
+or export to plain text files (\texttt{databel2text}).
+\item Obtain sub-sets of a \texttt{databel} object (operation \texttt{[}).
+\item Replace values in the matrix (operation \texttt{[<-}).
+\item Coerce a \texttt{databel} matrix to a standard R matrix or vector, and
+coerce an R matrix to a \texttt{databel} matrix.
+\end{itemize}
+
+Internally, \texttt{databel} data may be stored as one of eight different types
+(float, double, signed/unsigned (short) int, signed/unsigned byte).
+In C++, two of these (double and float) have support for missing
+values ('not a number'). For the rest, the maximal value of the type
+is reserved to represent missing data.
+
+Additionally, functions to convert plain text files to \texttt{databel} format
+ (\texttt{text2databel}) and
+to export \texttt{databel} data to plain text (\texttt{databel2text}) are provided.
+Another function (\texttt{apply2dfo}) is similar to the standard R \texttt{apply}
+and allows application of a user-defined function to all rows/columns of the
+data.
+
+\section{Conversion of the data to \texttt{databel} format, initialization of
+\texttt{databel} objects, and value modifications}
+
+<<echo=FALSE>>=
+unlink("*.fv?")
+unlink("*.txt")
+@
+
+To start using \texttt{DatABEL} you first need to load the library:
+<<>>=
+library(DatABEL)
+@
+
+We will first create an R matrix and then convert it to
+\texttt{databel} format. To begin,
+create the R matrix:
+<<>>=
+matr <- matrix (c(1:12),ncol=3,nrow=4)
+matr[3,2] <- NA
+matr
+dimnames(matr) <- list(paste("row",1:4,sep=""),paste("col",1:3,sep=""))
+matr
+@
+
+Conversion from an R matrix to \texttt{databel} may be performed in
+two ways: using the generic 'as' function or the \texttt{matrix2databel}
+function. The difference is that when using 'as' the backing data
+file is given a randomly generated name and the type used for
+storage is 'double', while with the
+\texttt{matrix2databel} function the user may choose the backing data
+file name and the data type themselves. Thus, 'as' should be used to
+create temporary \texttt{databel} objects:
+<<>>=
+list.files(pattern=glob2rx("*.fv?"))
+dat1 <- as(matr,"databel")
+list.files(pattern=glob2rx("*.fv?"))
+@
+
+You can see that after application of the \texttt{as} method,
+two files backing 'dat1' have appeared.
+
+The 'show' method shows basic information for the object:
+<<>>=
+dat1
+@
+
+Note that for big matrices only summaries and a small part of the data
+will appear on the screen.
+
+To keep the naming of the backing files, the underlying data type
+and other details under control, use the
+\texttt{matrix2databel} function:
+<<>>=
+dat2 <- matrix2databel(matr, filename="matr",cachesizeMb=16, type="UNSIGNED_CHAR",readonly=FALSE)
+dat2
+@
+
+You can see that now the backing files are \texttt{matr.fvd} and \texttt{matr.fvi}:
+<<>>=
+list.files(pattern=glob2rx("*.fv?"))
+@
+
+If you try to create a new object with the same backing files,
+an error will appear.
+
+A new \texttt{databel} object can be initialized directly from
+the backing file:
+<<>>=
+dat3 <- databel("matr")
+dat3
+@
+
+A \texttt{databel} object can also be created from a text file.
+First, we will create a text file
+<<>>=
+write.table(matr,"matr.txt",row.names=TRUE,col.names=TRUE,quote=FALSE)
+@
+and then convert that to \texttt{databel} format
+<<>>=
+dat4 <- text2databel("matr.txt",outfile="matr1",R_matrix=TRUE,type="UNSIGNED_INT")
+dat4
+@
+
+Finally, a \texttt{databel} object can be initialized from another \texttt{databel}
+object
+<<>>=
+dat5 <- dat4
+@
+or, through use of \texttt{'['}
+<<>>=
+dat6 <- dat1[c("row1","row3"),c("col1","col2")]
+dat6
+@
+
+Thus, at the moment we have generated five \texttt{databel} objects containing
+identical data (though the underlying types differ: double, unsigned byte and
+unsigned int) and one object ('dat6') which contains a subset of the data.
+Objects 'dat1' and 'dat6' are using the same backing data file
+\texttt{\Sexpr{backingfilename(dat1)}},
+objects 'dat4' and 'dat5' are connected to \texttt{\Sexpr{backingfilename(dat4)}}, and
+'dat2' and 'dat3' are connected to \texttt{\Sexpr{backingfilename(dat2)}}.
+
+The data contained in \texttt{databel} matrices may be modified
+using the \texttt{[<-} method:
+<<>>=
+dat1[1,1] <- 321
+@
+
+Note that because 'dat1' and 'dat6' are connected to the same binary
+data, modification of 'dat1' leads automatically to modification of
+'dat6':
+<<>>=
+dat6
+@
+
+To avoid read/write conflicts, all subsequent objects
+based on the same backing files are connected in
+read-only mode (so that trying '\texttt{dat6[1,1] <- 123}'
+will generate an error). We will show how to work around this
+situation at the end of the next section.
+
+\section{Obtaining and modifying attributes}
+
+Several standard methods defined for matrices are defined for
+\texttt{databel} matrices as well. For example:
+<<>>=
+dim(dat1)
+length(dat1)
+dimnames(dat1)
+colnames(dat1)
+rownames(dat1)
+@
+
+The method \texttt{dimnames<-} may be used to modify the
+names:
+<<>>=
+dimnames(dat1) <- list(paste("ID",1:4,sep=""),paste("SNP",1:3,sep=""))
+dimnames(dat1)
+@
+
+Additional methods defined for \texttt{databel} matrices
+provide information about the backing file name
+<<>>=
+backingfilename(dat1)
+@
+and the size of the cache used
+<<>>=
+cachesizeMb(dat1)
+@
+
+The size of cache can be modified by
+<<>>=
+cachesizeMb(dat1) <- 1
+cachesizeMb(dat1)
+@
+
+The method \texttt{get\_dimnames} is defined to obtain
+row/column names in case these are not unique.
+To demonstrate the use of this method, we first need to
+create a \texttt{databel} matrix with non-unique
+dimnames. To set such non-unique names, we will use the
+method \texttt{set\_dimnames<-}:
+<<>>=
+set_dimnames(dat1) <- list(dimnames(dat1)[[1]],c("duplicate","col2","duplicate"))
+@
+
+Now \texttt{dimnames} returns \texttt{NULL} for the second dimension
+names:
+<<>>=
+dimnames(dat1)
+@
+while \texttt{get\_dimnames} still allows access to the names:
+<<>>=
+get_dimnames(dat1)
+@
+
+Finally, the read-only flag can be modified. The following code
+demonstrates how to make the 'dat6' object writable and modify it:
+<<>>=
+disconnect(dat1)
+setReadOnly(dat6) <- FALSE
+dat6[1,1] <- 123
+dat6
+dat1
+@
+
+\section{Coercion and exports}
+
+A standard R matrix can be obtained from a \texttt{databel} matrix
+using the 'as' function:
+<<>>=
+newm <- as(dat2,"matrix")
+class(newm)
+class(newm[1,1])
+newm
+@
+
+Data from a \texttt{databel} matrix may be exported to a text file
+using the \texttt{databel2text} function:
+<<>>=
+databel2text(dat2,file="dat2.txt")
+@
+
+Now 'dat2.txt' contains the data, which can be read with
+<<>>=
+read.table("dat2.txt")
+@
+
+\section{Using \texttt{apply2dfo} function}
+
+The \texttt{apply2dfo} function allows
+complex analyses of data stored in a \texttt{databel}
+matrix. We will demonstrate the basic use of this function here.
+First, we will compute column and row sums:
+<<>>=
+apply2dfo(SNP,dfodata=dat2,anFUN="sum",MAR=2)
+apply2dfo(SNP,dfodata=dat2,anFUN="sum",MAR=1)
+@
+Here 'SNP' stands for the current analysis variable (row or column)
+and allows specification of more complicated analyses, e.g.
+<<>>=
+apply2dfo(SNP^2,dfodata=dat2,anFUN="sum",MAR=2)
+@
+or analyses such as consecutive linear regressions
+<<>>=
+Y <- rnorm(4)
+apply2dfo(Y~SNP,dfodata=dat2,anFUN="lm",MAR=2)
+apply2dfo(Y~SNP+I(SNP^2),dfodata=dat2,anFUN="lm",MAR=2)
+@
+
+Even more complicated analysis may be done by the user specifying
+their own analysis and result processing functions (see package
+documentation).
+
+\section{Citation}
+
+WILL BE UPDATED AT THE TIME THE PAPER IS ACCEPTED
+
+<<echo=FALSE>>=
+rm(list=ls())
+gc()
+unlink("*.fv?")
+unlink("*.txt")
+@
+
+
+\end{document}