[Genabel-commits] r1151 - tutorials/GenABEL_general

noreply at r-forge.r-project.org
Fri Mar 15 19:20:00 CET 2013


Author: lckarssen
Date: 2013-03-15 19:19:59 +0100 (Fri, 15 Mar 2013)
New Revision: 1151

Modified:
   tutorials/GenABEL_general/GenABEL-tutorial.Rnw
   tutorials/GenABEL_general/intro.Rnw
   tutorials/GenABEL_general/introR.Rnw
Log:
Tutorial: Fixed some small spelling errors; added the \cf, \eg and \ie commands to save typing and avoid spacing problems. This requires the xspace package.
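
A minimal sketch of how the new abbreviation commands behave, assuming a bare article document rather than the tutorial's own preamble; the two example sentences are illustrative and not taken from the tutorial. The xspace package makes \xspace add a space after the command unless punctuation follows, so no trailing {} or "\ " is needed:

```latex
\documentclass{article}
\usepackage{xspace}

% Same definitions as the ones added in this revision
\newcommand{\cf}{cf.\xspace}
\newcommand{\eg}{e.g.\xspace}
\newcommand{\ie}{i.e.\xspace}

\begin{document}
% \xspace supplies the space between "e.g." and the next word.
Some functions, \eg \texttt{mean}, return a single value.

% Before a comma, \xspace adds no space, so "\eg," is typeset as "e.g.,".
Detach the data frame when, \eg, another one is needed.
\end{document}
```

Without \xspace, LaTeX would swallow the space after \eg, and each use would need a trailing {} or "\ ".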

Modified: tutorials/GenABEL_general/GenABEL-tutorial.Rnw
===================================================================
--- tutorials/GenABEL_general/GenABEL-tutorial.Rnw	2013-03-15 11:25:46 UTC (rev 1150)
+++ tutorials/GenABEL_general/GenABEL-tutorial.Rnw	2013-03-15 18:19:59 UTC (rev 1151)
@@ -11,6 +11,7 @@
 \usepackage{hyperref}
 \usepackage{cite}
 \usepackage{natbib}
+\usepackage{xspace}
 
 \renewcommand{\ExerciseHeader}{
 \vskip \baselineskip
@@ -47,6 +48,11 @@
 \newcommand{\PA}{\texttt{ProbABEL-package}}
 \newcommand{\DA}{\texttt{DatABEL-package}}
 
+% Commonly used abbreviations
+\newcommand{\cf}{cf.\xspace}
+\newcommand{\eg}{e.g.\xspace}
+\newcommand{\ie}{i.e.\xspace}
+
 \newcounter{ex}
 \newenvironment{ex}{
 \noindent \footnotesize
@@ -131,7 +137,7 @@
 \MakeUppercase{
 This work is licensed under the Creative Commons Attribution-ShareAlike 
 3.0 Unported License. To view a copy of this license, visit 
-http://creativecommons.org/licenses/by-sa/3.0/ or send a letter to 
+\url{http://creativecommons.org/licenses/by-sa/3.0/} or send a letter to 
 Creative Commons, 444 Castro Street, Suite 900, Mountain View, 
 California, 94041, USA.
 }

Modified: tutorials/GenABEL_general/intro.Rnw
===================================================================
--- tutorials/GenABEL_general/intro.Rnw	2013-03-15 11:25:46 UTC (rev 1150)
+++ tutorials/GenABEL_general/intro.Rnw	2013-03-15 18:19:59 UTC (rev 1151)
@@ -21,7 +21,7 @@
 in analysis of population-based data; it supports analysis of 
 binary and quantitative traits, and of survival 
 (time-till-event) data. 
-Most up-to-date information about \GA{} can be found at the web-site
+Most up-to-date information about \GA{} can be found at the web site
 \url{http://www.genabel.org}.
 
 This tutorial was originally written to serve as a set of exercises for the 
@@ -33,7 +33,7 @@
 not-so-strictly-necessary stuff, start directly from the 
 section \ref{sec:GWA} ("\nameref{sec:GWA}").  
 
-Otherwise, you can start with R basics and simple association analyises 
+Otherwise, you can start with R basics and simple association analyses 
 using few SNPs in section \ref{sec:introR}, 
 "\nameref{sec:introR}".
 In the next section, \ref{sec:workgwaaclass} 
@@ -48,10 +48,10 @@
 This section is the core of this tutorial. 
 
 The section \ref{sec:strat} ("\nameref{sec:strat}") is 
-dedicated to analysis in presence of populational 
+dedicated to analysis in the presence of population
 stratification and analysis of family-based data. 
 
-Genetic data imputations are covered in section 
+Genetic data imputations are covered in the section 
 \ref{sec:impute}, "\nameref{sec:impute}".
 
 The last section, \ref{sec:reg} ("\nameref{sec:reg}"), is 
@@ -61,9 +61,9 @@
 %Appendix \ref{sec:GWAprotocol} outlines the formal step-by-step 
 %protocol for GWA analysis. 
 Information on importing the data from 
-different formats to \GA{} is given in appendix 
+different file formats to \GA{} is given in appendix 
 \ref{sec:dataimport} ("\nameref{sec:dataimport}").
-Answers to exercises are provided at the end of respective chapters.
+Answers to exercises are provided at the end of the respective chapters.
 
 Experienced R users start directly with 
 the section (\ref{sec:workgwaaclass}, "\nameref{sec:workgwaaclass}"). 

Modified: tutorials/GenABEL_general/introR.Rnw
===================================================================
--- tutorials/GenABEL_general/introR.Rnw	2013-03-15 11:25:46 UTC (rev 1150)
+++ tutorials/GenABEL_general/introR.Rnw	2013-03-15 18:19:59 UTC (rev 1151)
@@ -1,33 +1,33 @@
 \chapter{Introduction to R}
 \label{sec:introR}
 
-In this section we will consider base R data types and operations, 
-and tools for analysis of qualitative and quantitative traits. 
+In this section we will consider the basic R data types and operations, 
+as well as tools for the analysis of qualitative and quantitative traits. 
 Only basic R functionality -- the things which are crucial to know 
 before we can proceed to genetic association analysis -- will be covered 
 within this section. If you want to make most of your data, though, 
 we strongly recommend that you improve your knowledge 
-of R using books other than this. A number of excellent manuals 
+of R using books other than this one. A number of excellent manuals 
 ('An introduction to R', 'Simple R', 'Practical Regression and Anova using R', 
 and others) is available free of charge from the R project web-site 
-(http://www.r-project.org). 
+(\url{http://www.r-project.org}). 
 
 %%Only base R functionality (no extensions, packages) will be used in this chapter. 
 In the first part of this chapter you will learn about the most 
 important R data types 
 and will learn how to work with R data. Next, we will cover 
-exploratory data analysis. The chapter will end with introduction to 
+exploratory data analysis. The chapter will end with an introduction to 
 regression analysis. 
 
 \section{Basic R data types and operations}
 \label{subs:basicR}
 
-On the contrast to many other statistical analysis package, analysis in R is not 
-based on graphic user interface, but is command line-based. 
+In contrast with many other statistical analysis packages, analysis in R is not 
+based on a graphical user interface, but is command line-based. 
 When you first start R, a command prompt appears. To get help 
 and overview of R, type \texttt{help.start()} on the command line 
 and press \texttt{enter}. 
-This will start internet browser and open the main page of the R documentation. 
+This will start your default internet browser and open the main page of the R documentation. 
 
 Let us first use R as a powerful calculator. 
 You can directly operate with numbers in R. Try multiplying two by three:
@@ -50,7 +50,7 @@
 <<>>=
 2+3
 @
-(summation)\footnote{For complete list of arithmetic operations try \texttt{help("+")}}.
+(summation)\footnote{For a complete list of arithmetic operations try \texttt{help("+")}.}.
 \index{arithmetic operations}
 \index{operations!arithmetic}
 
@@ -62,10 +62,10 @@
 exp(0.35)
 @
 
-Here, we have computed \emph{e} to the power of base-10 logarithm 
+Here, we have computed $e$ to the power of the base-10 logarithm 
 of the square root of the sum of two and three. After each 
-operation performed, we have rounded the result 
-to the two digits after the floating point -- just in order to 
+operation, we have rounded the result 
+to two digits after the decimal point -- just to 
 do less typing. 
 \index{mathematical functions}
 \index{functions!mathematical}
@@ -83,10 +83,10 @@
 extensive use of these at a later stage, when computing significance 
 and estimating statistical power. 
 
-For any function with name say '\texttt{fun}', help may be obtained 
+For any function with a name, say '\texttt{fun}', help may be obtained 
 by typing '\texttt{help(fun)}' (or \texttt{?fun}) on the command line. 
 
-R help pages have standard layout, documenting usage of the 
+R help pages have a standard layout, documenting usage of the 
 function, explaining function arguments, providing details 
 of implementation and/or usage, explaining the value returned by 
 the function, and giving references and examples of the function 
@@ -104,13 +104,13 @@
 
 Most of the documented functions have examples of their usage 
 at the end of the 'help' page, and these examples can be evaluated 
-in R. E.g. try '\texttt{example(log10)}'.
+in R. E.g.~try '\texttt{example(log10)}'.
 
 \begin{Exercise}[title=Explore help for Wilcoxon test]
 Explore the help page for the Wilcoxon test 
 (function: \texttt{wilcox.test}) and answer 
-the questions:
-\Question When exact Wilcoxon test is computed by default?
+the following questions:
+\Question When is the exact Wilcoxon test computed by default?
 \Question If the default conditions for the exact test are not 
 satisfied, what approximation is used?
 \end{Exercise}
@@ -126,7 +126,7 @@
 keyword. 
 
 \begin{Exercise}[title=Finding functions and help pages]
-Try to find out what are the functions to do
+Try to find out which functions perform the
 \Question Fisher exact test
 \Question T-test
 \end{Exercise}
@@ -135,13 +135,13 @@
 You will find that the corresponding functions are \texttt{fisher.test} and \texttt{t.test}.
 \end{Answer}
 
-One of important R operations is \emph{assignment}, which is 
-done with '\texttt{<-}' operator. A (new) variable name should 
+One of the important R operations is \emph{assignment}, which is 
+done with the '\texttt{<-}' operator. A (new) variable name should 
 be provided on the left-hand side of this operator and on the 
-right-hand side, there must be either name of already existing 
-variable or an expression. For example, we if want to assign 
-value '2' to variable '\texttt{a}', and value '3' to the variable 
-'\texttt{b}' we would use the assignment operator:
+right-hand side, there must be either the name of an already existing 
+variable or an expression. For example, if we want to assign the
+value '2' to the variable '\texttt{a}', and value '3' to the variable 
+'\texttt{b}' we would use the assignment operator in the following way:
 \index{assignment}
 \index{operation!assignment}
 <<>>=
@@ -149,7 +149,7 @@
 b <- 3
 @
 
-Typing the variable name in R command line will return its' value, e.g.
+Typing the variable name on the R command line will return its value, \eg
 <<>>=
 b
 @
@@ -158,14 +158,14 @@
 <<>>=
 exp(log10(sqrt(a+b)))
 @
-gives the expected result we have obtained above using numerical arguments. 
+gives the expected result we have obtained earlier using numerical arguments. 
 
 While the variables 'a' and 'b' contain single numeric values, variables 
-in general can be multi-dimensional; an one-dimensional example of such is a 
-vector (array). Let us create an example vector and experiment 
+in general can be multi-dimensional; a one-dimensional example of such is the 
+vector (or array). Let us create an example vector and experiment 
 with it:
 <<>>=
-v <- c(1,3,5,7,11)
+v <- c(1, 3, 5, 7, 11)
 @
 Here, '\texttt{c()}' is a function, which \textbf{c}ombines its arguments to 
 make a vector. This vector is then assigned to a variable named '\texttt{v}'.
@@ -181,7 +181,7 @@
 It is easy to see that the result is a vector, which is obtained by 
 adding one to each element of the original vector \texttt{v}.
 Other arithmetic operations and mathematical functions behave 
-in the same way, e.g. the operation is performed for each element 
+in the same way, \eg the operation is performed for each element 
 of the vector, and the results are returned:
 
 <<>>=
@@ -192,14 +192,14 @@
 What happens if two vectors are supplied as function arguments?
 Let us define a new vector
 <<>>=
-ov <- c(1,2,3,4,5)
+ov <- c(1, 2, 3, 4, 5)
 @
 and add it to the vector \texttt{v}:
 <<>>=
 v + ov
 @
-You can see that the summation was done element-wise, i.e.
-the first element of the result vector is obtained as 
+You can see that the summation was done element-wise, \ie the 
+first element of the result vector is obtained as 
 the sum of the first elements of \texttt{v} and \texttt{ov}, 
 the second is the sum of the second elements, and so forth. 
 
@@ -218,15 +218,15 @@
 vector as output. There are others -- statistical and summary 
 -- functions which evaluate a vector as a whole 
 and return a single value as output. For example, to 
-obtain a sum of vector's elements, use
+obtain a sum of elements of a vector, use
 \index{statistical functions}
 \index{functions!statistical}
 <<>>=
 sum(v)
 @
 
-Other examples of such functions involve \texttt{length}, returning 
-number of elements of a vector, \texttt{mean}, returning the mean, 
+Other examples of such functions are \texttt{length}, returning 
+the number of elements of a vector, \texttt{mean}, returning the mean, 
 \texttt{var}, returning the variance, etc.:
 <<>>=
 length(v)
@@ -238,12 +238,13 @@
 R is \emph{sub-setting}.
 \index{sub-setting}
 \index{operation!sub-setting}
-This refers to an operations which help you deriving a subset of 
+This refers to an operation which helps you derive a subset of 
 the data. Let us create a short vector and play a bit with sub-setting.
 This vector will contain 5 simple character strings:
 
 <<>>=
-a <- c("I am element 1","I am element 2","I am element 3","I am element 4","I am element 5")
+a <- c("I am element 1", "I am element 2", "I am element 3", 
+       "I am element 4", "I am element 5")
 a
 @
 
@@ -252,58 +253,57 @@
 <<>>=
 a[3]
 @
-You can also select a bigger sub-set, e.g. all
-elements from 2 to 4:
+You can also select a bigger sub-set, \eg all elements from 2 to 4:
 <<>>=
 a[c(2:4)]
 @
-Here, operation \texttt{c(2:4)} stays for 'combine numbers from 2 to 4
+Here, the operation \texttt{c(2:4)} stands for 'combine numbers from 2 to 4
 into a vector'. An equivalent result is obtained by
 <<>>=
-a[c(2,3,4)]
+a[c(2, 3, 4)]
 @
 
-We can also easily get disjoint elements; e.g. if you want to retrieve elements
+We can also easily get disjoint elements; \eg if you want to retrieve elements
 1, 3, and 5, you can do that with
 <<>>=
-dje <- c(1,3,5)
+dje <- c(1, 3, 5)
 dje
 a[dje]
 @
 
-One of very attractive features of R data objects is possibility 
+One of the very attractive features of R data objects is the possibility 
 to derive a sub-set based on some condition. Let us consider two 
 vectors, \texttt{tmphgt}, containing the height of some subjects, 
 and \texttt{tmpids}, containing their identification codes (IDs):
 <<>>=
-tmphgt <- c(150,175,182,173,192,168)
+tmphgt <- c(150, 175, 182, 173, 192, 168)
 tmphgt
-tmpids <- c("fem1","fem2","man1","fem3","man2","man3")
+tmpids <- c("fem1", "fem2", "man1", "fem3", "man2", "man3")
 tmpids
 @
 
 Imagine you need to derive the IDs of the people with height over 170 cm.
-To do that, we need to combine several operations. First, we shoudl run 
-the logical function \texttt{>170} on the height data:
+To do that, we need to combine several operations. First, we should run 
+the logical function \texttt{> 170} on the height data:
 
 <<>>=
-vec <- (tmphgt>170)
+vec <- (tmphgt > 170)
 vec
 @
 
 This returns a logical vector whose elements are '\texttt{TRUE}', when 
 a particular element of the \texttt{tmphgt} satisfies the condition 
-\texttt{>170}. The returned logical vector, in turn,
+\texttt{> 170}. The returned logical vector, in turn,
 can be applied to sub-set any other vector of the same length\footnote{
 Actually, you can apply it to a longer vector too, and then the logical 
-vector will be "expanded" to total length by repeating the original vector 
-head-to-tail. However, we will not use this in our exercises.
-}, including itself. Thus if you need to see
-what are the heights in people, which are taller than 170 cm, you can use
+vector will be "expanded" to the total length by repeating the original vector 
+head-to-tail. However, we will not use this in our exercises.}, including 
+itself. Thus if you want to see the heights of people who are taller than 
+170 cm, you can use  
 <<>>=
 tmphgt[vec]
 @
-As you can see, only the elements of \texttt{tmphgt}, for which the 
+As you can see, only the elements of \texttt{tmphgt} for which the 
 corresponding value of \texttt{vec} was '\texttt{TRUE}', are returned. 
 In the same manner, the logical vector \texttt{vec} can be applied to 
 select elements of the vector of IDs:
@@ -311,8 +311,9 @@
 tmpids[vec]
 @
 
-You can combine more than one logical condition to derive sub-sets. For example, to 
-see what are the IDs of people taller than 170 but shorter than 190 cm, you can use
+You can combine more than one logical condition to derive sub-sets. For 
+example, to see the IDs of people taller than 170 but shorter 
+than 190 cm, you can use
 
 <<>>=
 vec <- (tmphgt>170 & tmphgt<190)
@@ -320,78 +321,77 @@
 tmpids[vec]
 @
 
-A better\footnote{
-Because it treats NAs for you
-} way to do logical sub-setting assumes use of the \texttt{which()}
-\index{which()}
-function on the top of the logical vector. This function reports which elements are 
-\texttt{TRUE}. To obtain above results you can run:
+A better\footnote{Because it treats NAs for you} way to do logical 
+sub-setting is to use the \texttt{which()}\index{which()} function on
+top of the logical vector. This function reports which elements are 
+\texttt{TRUE}. To obtain the aforementioned result you can run:
 
 <<>>=
 vec <- which(tmphgt>170 & tmphgt<190)
 vec
 tmpids[vec]
 @
-You can see that no \texttt{vec} contains a vector, whose elements are the 
-indexes of the elements of \texttt{tmphgt} for which the logical condition 
-satisfies. 
+You can see that now \texttt{vec} contains a vector whose elements are the 
+indices of the elements of \texttt{tmphgt} for which the logical condition 
+holds. 
 
 
-Sub-setting for 2D objects (matrices) is done in similar
+Sub-setting for 2D objects (matrices) is done in a similar
 manner. Let us construct a simple matrix and do several 
 sub-setting operations on it:
 \index{matrix}
 
 <<>>=
-a <- matrix(c(	11,12,13,
-		21,22,23,
-		31,32,33
-	      	),nrow=3,ncol=3)
+a <- matrix(c(11, 12, 13,
+              21, 22, 23,
+              31, 32, 33
+              ), 
+            nrow=3, ncol=3)
 a
 @
 
 To obtain the element in the 2nd row and 2nd column, you can use
 <<>>=
-a[2,2]
+a[2, 2]
 @
-To access the elemnt from the second row and third column, use
+To access the element from the second row and third column, use
 <<>>=
-a[2,3]
+a[2, 3]
 @
 Note that here, the row index (2) comes first, and the column 
 index (3) comes second. 
 
-To obtain the 2x2 set of elements contained in upper left 
+To obtain the $2 \times 2$ set of elements contained in the upper-left 
 corner, you can do
 <<>>=
-a[1:2,1:2]
+a[1:2, 1:2]
 @
 
-Or you can even get the variables, which reside in corners:
+Or you can even get the variables that reside in the corners:
 <<>>=
-a[c(1,3),c(1,3)]
+a[c(1, 3), c(1, 3)]
 @
 
-If one of the dimensions is not specified, complete vector 
+If one of the dimensions is not specified, a complete vector 
 is returned for this dimension. For example, here we retrieve 
 the first row
 <<>>=
 a[1,]
 @
-...and the third column
+\ldots and the third column
 <<>>=
-a[,3]
+a[, 3]
 @
-...or columns 1 and 3:
+\ldots or columns 1 and 3:
 <<>>=
-a[,c(1,3)]
+a[, c(1, 3)]
 @
 
-Other way to address elements of a matrix is to use one-dimensional 
-index. For example, if you want to access element in the 2nd row 
+Another way to address elements of a matrix is to use a one-dimensional 
+index. For example, if you want to access the element in the 2nd row 
 and 2nd column, instead of 
 <<>>=
-a[2,2]
+a[2, 2]
 @
 you can use
 <<>>=
@@ -410,61 +410,61 @@
 \label{tab:matrix}
 \end{center}
 \end{table}
-This way of accessing the elements of a matrix is based on the fact, 
-that each matrix can be preseted as a vector, whose elements are 
-numbered consequtively: the element in the upper-left corner has index 1, 
+This way of accessing the elements of a matrix is based on the fact 
+that each matrix can be represented as a vector whose elements are 
+numbered consecutively: the element in the upper-left corner has index 1, 
 the element in the second row of the first column has index 2, and the 
-last elemnt in the borrom-right corner has the maximal value, as shown 
+last element in the bottom-right corner has the maximal value, as shown 
 in Table \ref{tab:matrix}.
 
-As well as with vectors, you can sub-set matrices using 
-logical conditions or indexes. 
-For example, if we want to see what elements of a are greater than 
+You can sub-set matrices using logical conditions or indexes like you can 
+with vectors.
+For example, if we want to see which elements of \texttt{a} are greater than 
 21, we can run
 <<>>=
-a>21
+a > 21
 @
 or, better
 <<>>=
-which(a>21)
+which(a > 21)
 @
-Note that in the latter case, a vector whose elements give the 1-D indexes of 
+Note that in the latter case, a vector whose elements give the 1-D indices of 
 the matrix, is returned. This vector 
-indicates the elemnts of matrix \texttt{a}, for which the condition \texttt{(a>21)}
-is satisfied. 
+indicates the elements of matrix \texttt{a}, for which the condition 
+\texttt{(a > 21)} is satisfied. 
 
-You can obtain the values of the matrix's elements, for which 
-the condition isfulfilled, either by 
+You can obtain the values of the matrix's elements for which 
+the condition is fulfilled either by 
 <<>>=
-a[a>21]
+a[a > 21]
 @
-or
+or using 
 <<>>=
-a[which(a>21)]
+a[which(a > 21)]
 @
 
-Once again, the latter method should be prefered. Consider an example, where some 
-elements of the matrix are missing (\texttt{NA}) -- a situation which is 
-common in real data analysis. Let us replace the elemnt number 
+Once again, the latter method should be preferred. Consider the example where 
+some elements of the matrix are missing (\texttt{NA}) -- a situation which is 
+common in real data analysis. Let us replace element number 
 5 with \texttt{NA} and perform sub-setting operations on the resulting matrix:
 <<>>=
 a
 a[5] <- NA
 a
-a[a>21]
-a[which(a>21)]
+a[a > 21]
+a[which(a > 21)]
 @
-You can see that when \texttt{a[a>21]} was used, not only the elements which
+You can see that when \texttt{a[a > 21]} was used, not only the elements which
 are greater than 21 were returned, but also \texttt{NA} was. As a rule, this 
 is not what you want, and \texttt{which} should be used unless you do want 
 to make some use of the \texttt{NA} elements. 
 \index{which()}
 
 In this section, we have generated a number of R data objects. Some of 
-these were numeric (e.g. vector of heights, \texttt{tmphgt}) and 
-some were character, or string (e.g. vector of study IDs, \texttt{tmpids}).
-Some times you need to figure out what is the class of a certain 
-object. This can be done using the \texttt{class()} function. 
+these were numeric (\eg vector of heights, \texttt{tmphgt}) and 
+some were character, or string (\eg vector of study IDs, \texttt{tmpids}).
+Sometimes you need to figure out what the class of a certain 
+object is. This can be done using the \texttt{class()} function. 
 \index{class of an R object}
 \index{class()}
 For example, 
@@ -486,13 +486,13 @@
 
 Results are expected -- we find out that \texttt{a} is a matrix, which is correct. 
 At the same time, a matrix is an upper-level class, which contains 
-a number of elemnts, belonging to some lower-level (e.g. character/numeric) 
-class. To see what is the class of the matrix's elements, try
+a number of elements, belonging to some lower-level (\eg character/numeric) 
+class. To see what the class of the matrix elements is, try
 <<>>=
-a[1,]
-class(a[1,])
+a[1, ]
+class(a[1, ])
 @
-which says that elemnts (at least of the first row) are numeric. Because 
+which says that elements (at least of the first row) are numeric. Because 
 all elements of a matrix should have the same class, we can conclude that 
 \texttt{a} is a matrix containing numeric values.
 
@@ -512,22 +512,22 @@
 \index{list of data objects}
 
 Obviously, this ''list'' command is very useful -- you will soon find that 
-it is just too easy to forget the name of a variable which it 
-took long time to create. 
-Some times you may wish to remove some of the data objects 
+it is just too easy to forget the name of a variable which 
+took a long time to create. 
+Sometimes you may wish to remove some of the data objects 
 because you do not need them anymore. 
 You can remove an object using the \texttt{rm()} command, where 
 the names of objects to be deleted are listed as arguments. 
-For example, to remove \texttt{tmphgt} and \texttt{tmpids} variable you 
+For example, to remove the \texttt{tmphgt} and \texttt{tmpids} variables you 
 can use
 <<>>=
-rm(tmphgt,tmpids)
+rm(tmphgt, tmpids)
 @
 \index{rm()}
 \index{remove data object}
 
 
-If you now look up what data obejcts are still left in you workspace with the \texttt{ls()} command
+If you now look up what data objects are still left in your workspace with the \texttt{ls()} command
 <<>>=
 ls()
 @
@@ -541,31 +541,31 @@
 \index{quit R}
 
 \begin{summary}
-\item You can get access to the top-level R documentation by 
+\item You can get access to the top-level R documentation via the
 \texttt{help.start()} command. To search help for some keyword \texttt{keywrd},
-you can use \texttt{help.search(keywrd)} command. 
-To get description of some function \texttt{fun}, use \texttt{help(fun)}.
+you can use the \texttt{help.search(keywrd)} command. 
+To get a description of some function \texttt{fun}, use \texttt{help(fun)}.
 \item You can use R as a powerful calculator  
 \item It is possible to get sub-sets of vectors and matrices by 
-specifying index value or a logical condition (of the same length as
+specifying an index value or a logical condition (of the same length as
 the vector / matrix) between square brackets 
 (\texttt{[}, \texttt{]})
-\item When you obtain an element of a matrix with \texttt{[i,j]},
+\item When you obtain an element of a matrix with \texttt{[i, j]},
 \texttt{i} is the row and \texttt{j} is the column of the matrix.
-\item Function \texttt{which(A)} returns index of the elements 
+\item The function \texttt{which(A)} returns the index of the elements 
 of A which are \texttt{TRUE}
-\item You can see objects available in your workspace 
+\item You can see the objects available in your workspace 
 by using the \texttt{ls()} command
-\item Unnecessary object (say, \texttt{tmphgt}) can be 
-deleted from the workspace using \texttt{rm} command, 
-e.g. \texttt{rm(tmphgt)}
+\item Unnecessary objects (say, \texttt{tmphgt}) can be 
+deleted from the workspace using the \texttt{rm} command, 
+\eg \texttt{rm(tmphgt)}
 \item You can leave R using the \texttt{q()} command
 \end{summary}
 
 
 \begin{Exercise}[title=Exploring srdta]
-In this exercise, you will explore few vectors representing 
-different data on study subjects described in \texttt{srdta} 
+In this exercise, you will explore a few vectors representing 
+different data on study subjects described in the \texttt{srdta} 
 example data set supplied together with \GA{}. First, you need 
 to load \GA{} by typing
 <<results=hide>>=
@@ -575,54 +575,55 @@
 <<>>=
 data(srdta)
 @
-The vector containing study subjects sex can be accessed 
+The vector containing the study subjects' sex can be accessed 
 through \texttt{male(srdta)}; this vector's value 
-is one when the corresponding person is male and zero 
+is 1 when the corresponding person is male and 0
 otherwise. The vector containing SNP names can be accessed 
 via \texttt{snpnames(srdta)}, chromosome ID -- through 
 \texttt{chromosome(srdta)} and map -- through 
 \texttt{map(srdta)}. Explore these vectors and answer 
 the questions.
 \Question What is the ID and sex of the first person in the data set? 
-\Question Of the 22nd person? 
-\Question How many males are observed among first hundred subjects? 
-\Question How many FEMALES are among 4th hundred?
-\Question What is the male proportion in first 1000 people?
+\Question Of the $22^\text{nd}$ person? 
+\Question How many males are observed among the first hundred subjects? 
+\Question How many FEMALES are among the $4^\text{th}$ hundred?
+\Question What is the male proportion in the first 1000 people?
 \Question What is the FEMALE proportion in the second 1000 (1001:2000) people?
-\Question What is name, chromosome and map position of 33rd maker? 
+\Question What are the name, chromosome and map position of the 33$^\text{rd}$ marker? 
 \Question What is the distance between markers 25 and 26?
 \end{Exercise}
 \begin{Answer}
-For the first person id is "\Sexpr{idnames(srdta)[1]}" and 
+For the first person the id is "\Sexpr{idnames(srdta)[1]}" and the
 sex code is \Sexpr{male(srdta)[1]} (1=male, 0=female)
 <<>>=
 idnames(srdta)[1]
 male(srdta)[1]
 @
-For the 22nd person id is "\Sexpr{idnames(srdta)[22]}" and 
+The id for the 22$^\text{nd}$ person is "\Sexpr{idnames(srdta)[22]}" and 
 sex code is \Sexpr{male(srdta)[22]}:
 <<>>=
 idnames(srdta)[22]
 male(srdta)[22]
 @
-Among first 100 subjects, there are \Sexpr{sum(male(srdta)[1:100])}
+Among the first 100 subjects, there are \Sexpr{sum(male(srdta)[1:100])}
 males:
 <<>>=
 sum(male(srdta)[1:100])
 @
-Among 4th hundred subjects there are \Sexpr{sum(male(srdta)[301:400]==0)} females:
+Among the 4$^\text{th}$ hundred subjects there are 
+\Sexpr{sum(male(srdta)[301:400]==0)} females:
 <<>>=
 100-sum(male(srdta)[301:400])
 @
-Male proportion among first 1000 people is 
+The male proportion among the first 1000 people is 
 <<>>=
 mean(male(srdta)[1:1000])
 @
-Female proportion among second 1000 people is
+The female proportion among the second 1000 people is
 <<>>=
 1 - mean(male(srdta)[1001:2000])
 @
-Name, chromosome and map position of the 33rd marker are:
+Name, chromosome and map position of the 33$^\text{rd}$ marker are:
 <<>>=
 snpnames(srdta)[33]
 chromosome(srdta)[33]
@@ -634,7 +635,7 @@
 pos25
 pos26 <- map(srdta)[26]
 pos26
-pos26-pos25
+pos26 - pos25
 @
 \end{Answer}
 
@@ -656,7 +657,7 @@
 referencing to these names\footnote{This 
 may also be true for matrices; more fundamental 
 difference is though that a matrix \emph{always} contains variables 
-of the same data type, e.g. character or numeric, while a data frame 
+of the same data type, \eg character or numeric, while a data frame 
 may contain variables of different types}. 
 \index{data frame}
 
@@ -758,7 +759,7 @@
 assoc[75,]
 @
 
-In the same manner as with matrices, you can get data for e.g. subjects 
+In the same manner as with matrices, you can get data for \eg subjects 
 5 to 15 by 
 <<>>=
 assoc[5:15,]
@@ -816,7 +817,7 @@
 <<>>=
 attach(assoc)
 @
-After that, the variables can be accessed directly, e.g. 
+After that, the variables can be accessed directly, \eg 
 <<>>=
 subj[75]
 @
@@ -827,7 +828,7 @@
 elements using the assignment (''\texttt{<-}'') operation, 
 you can also explore and modify the data contained in a data frame\footnote{and also 
 a matrix} by 
-using \texttt{fix()} command (e.g. try \texttt{fix(assoc)}). 
+using the \texttt{fix()} command (\eg try \texttt{fix(assoc)}). 
 However, normally this is not necessary. 
 
 
@@ -877,7 +878,7 @@
 The variable which will be used when you directly use the name 
 would be the one from the data frame attached last. You can use the 
 \texttt{detach()} function to remove a certain data frame from 
-the search path, e.g. after
+the search path, \eg after
 <<>>=
 detach(assoc)
 @
@@ -912,7 +913,7 @@
 \item You can attach the data frame to the search path by 
 \texttt{attach(frame)}. Then the variables contained in this 
 data frame may be accessed directly. To detach the data 
-frame (because, e.g., you are now interested in other data 
+frame (because, \eg, you are now interested in other data 
 frame), use \texttt{detach(frame)}.
 \end{summary}
 
@@ -992,7 +993,7 @@
 @
 
 However, that would not have worked if the sex was coded differently, 
-e.g. with ''1'' for males and ''2'' for females.
+\eg with ''1'' for males and ''2'' for females.
 
 Let us now try to find out the mean of the quantitative trait \texttt{qt}. 
 By definition, the mean of a variable, say $x$ (with i-th element denoted 


