[IPSUR-commits] r146 - pkg/IPSUR/inst/doc

noreply at r-forge.r-project.org
Mon Jan 18 18:05:46 CET 2010


Author: gkerns
Date: 2010-01-18 18:05:46 +0100 (Mon, 18 Jan 2010)
New Revision: 146

Modified:
   pkg/IPSUR/inst/doc/IPSUR.Rnw
   pkg/IPSUR/inst/doc/IPSUR.bib
Log:
small changes, mostly add to bibliography and fix BLANKS


Modified: pkg/IPSUR/inst/doc/IPSUR.Rnw
===================================================================
--- pkg/IPSUR/inst/doc/IPSUR.Rnw	2010-01-18 13:39:24 UTC (rev 145)
+++ pkg/IPSUR/inst/doc/IPSUR.Rnw	2010-01-18 17:05:46 UTC (rev 146)
@@ -2219,8 +2219,8 @@
 represents a middle or general tendency of the data. Of course, there
 are usually several values that would serve as a center, and our later
 tasks will be focused on choosing an appropriate one for the data
-at hand. Judging from the histogram that we saw before, a measure
-of center would be about BLANK. 
+at hand. Judging from the histogram that we saw in Figure \ref{fig:histograms-bins},
+a measure of center would be about \Sexpr{round(mean(precip))}. 
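+As a quick check, both common measures of center for the \texttt{precip}
+data can be computed directly (a sketch; \texttt{precip} is the built-in
+dataset used above):
+
+<<>>=
+c(mean(precip), median(precip))
+@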
 
 
 \subsection{Spread\label{sub:Spread}}
@@ -3157,7 +3157,8 @@
 it comes from, type \inputencoding{latin9}\lstinline[showstringspaces=false]!?RcmdrTestDrive!\inputencoding{utf8}
 at the command line.
 \begin{xca}
-Perform a summary of all variables in \inputencoding{latin9}\lstinline[showstringspaces=false]!RcmdrTestDrive!\inputencoding{utf8}.
+\label{xca:summary-RcmdrTestDrive}Perform a summary of all variables
+in \inputencoding{latin9}\lstinline[showstringspaces=false]!RcmdrTestDrive!\inputencoding{utf8}.
 You can do this with the command \inputencoding{latin9}
 \begin{lstlisting}[showstringspaces=false]
 summary(RcmdrTestDrive)
@@ -3375,9 +3376,8 @@
 In this problem we will compare the variables \emph{before} and \emph{after}.
 Don't forget \inputencoding{latin9}\lstinline[showstringspaces=false]!library(e1071)!\inputencoding{utf8}.
 \begin{enumerate}
-\item Examine the two measures of center for both variables that you found
-in Exercise BLANK. Judging from these measures, which variable has
-a higher center?
+\item Examine the two measures of center for both variables. Judging from
+these measures, which variable has a higher center?
 \item Which measure of center is more appropriate for \emph{before}? (You
 may want to look at a boxplot.) Which measure of center is more appropriate
 for \emph{after}?
@@ -3402,8 +3402,8 @@
 
 
 We may take a look at the \inputencoding{latin9}\lstinline[showstringspaces=false]!summary(RcmdrTestDrive)!\inputencoding{utf8}
-output from Exercise BLANK. Here we will repeat the relevant summary
-statistics.
+output from Exercise \ref{xca:summary-RcmdrTestDrive}. Here we will
+repeat the relevant summary statistics.
 
 <<>>=
 c(mean(before), median(before))
@@ -3682,7 +3682,7 @@
 of general interest.
 
 
-\subsection{Sampling from Urns}
+\subsection{Sampling from Urns\label{sub:sampling-from-urns}}
 
 This is perhaps the most fundamental type of random experiment. We
 have an urn that contains a bunch of distinguishable objects (balls)
@@ -3729,8 +3729,9 @@
 arguments are logical and specify how sampling will be performed.
 We will discuss each in turn.
 \begin{example}
-Let our urn simply contain three balls, labeled 1, 2, and 3, respectively.
-We are going to take a sample of size 2 from the urn. 
+\label{exa:sample-urn-two-from-three}Let our urn simply contain three
+balls, labeled 1, 2, and 3, respectively. We are going to take a sample
+of size 2 from the urn. 
 
 \subsubsection*{Ordered, With Replacement}
 
@@ -3826,7 +3827,7 @@
 \inputencoding{latin9}\lstinline[showstringspaces=false]!ordered = TRUE!\inputencoding{utf8}
 even when, in fact, the call to the function was \inputencoding{latin9}\lstinline[breaklines=true,showstringspaces=false]!urnsamples(..., ordered = FALSE)!\inputencoding{utf8}.
 Similar remarks apply for the \inputencoding{latin9}\lstinline[showstringspaces=false]!replace!\inputencoding{utf8}
-argument. We investigate this issue further in Section BLANK.
+argument. 
 
 
 \section{Events\label{sec:Events}}
@@ -4304,7 +4305,7 @@
 rolldie(1, makespace = TRUE)
 @ 
 
-or just \inputencoding{latin9}\lstinline[showstringspaces=false]!rolldie(1, TRUE)!\inputencoding{utf8}.
+\noindent or just \inputencoding{latin9}\lstinline[showstringspaces=false]!rolldie(1, TRUE)!\inputencoding{utf8}.
 Many of the other sample space functions (\inputencoding{latin9}\lstinline[showstringspaces=false]!tosscoin!\inputencoding{utf8},
 \inputencoding{latin9}\lstinline[showstringspaces=false]!cards!\inputencoding{utf8},
 \inputencoding{latin9}\lstinline[showstringspaces=false]!roulette!\inputencoding{utf8},
@@ -4350,12 +4351,12 @@
 some scintillating question. (Bear in mind: rolling a die just 9 times
 has a sample space with over \emph{10 million} outcomes.)
 
-Alas! Even if there were enough RAM to barely hold the sample space
+Alas, even if there were enough RAM to barely hold the sample space
 (and there were enough time to wait for it to be generated), the infinitesimal
-probabilities that are associated with SO MANY outcomes make it difficult
-for the underlying machinery to handle reliably. In some cases, special
-algorithms need to be called just to give something that holds asymptotically.
-User beware.
+probabilities that are associated with \emph{so many} outcomes make
+it difficult for the underlying machinery to handle them reliably.
+In some cases, special algorithms need to be called just to give something
+that holds asymptotically. User beware.
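+
+The count mentioned above is easy to verify (a one-line sketch):
+
+<<>>=
+6^9   # outcomes from 9 rolls of a die: over 10 million
+@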
 
 
 \section{Properties of Probability\label{sec:Properties-of-Probability}}
@@ -4786,8 +4787,9 @@
 \par\end{center}
 \begin{example}
 We will compute the number of outcomes for each of the four \inputencoding{latin9}\lstinline[showstringspaces=false]!urnsamples!\inputencoding{utf8}
-examples that we saw in Example BLANK. Recall that we took a sample
-of size two from an urn with three distinguishable elements.
+examples that we saw in Example \ref{exa:sample-urn-two-from-three}.
+Recall that we took a sample of size two from an urn with three distinguishable
+elements.
 \end{example}
 <<echo=TRUE,print=TRUE>>= 
 nsamp(n=3, k=2, replace = TRUE, ordered = TRUE) 
@@ -8339,6 +8341,9 @@
 with a right-skewed distribution.
 \item The $\mathsf{chisq}(\mathtt{df}=p)$ distribution is the same as a
 $\mathsf{gamma}(\mathtt{shape}=p/2,\,\mathtt{rate}=1/2)$ distribution.
+\item The MGF of $X\sim\mathsf{chisq}(\mathtt{df}=p)$ is\begin{equation}
+M_{X}(t)=\left(1-2t\right)^{-p/2},\quad t<1/2.\label{eq:mgf-chisq}\end{equation}
+
 \end{enumerate}
 \end{rem}
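+We can check Equation \ref{eq:mgf-chisq} numerically (a sketch, with
+hypothetical values for $p$ and $t$):
+
+<<>>=
+p <- 4; tt <- 0.2   # hypothetical df and argument t < 1/2
+integrate(function(x) exp(tt * x) * dchisq(x, df = p), 0, Inf)$value
+(1 - 2 * tt)^(-p / 2)   # should match the integral above
+@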
 
@@ -9206,6 +9211,10 @@
 and this last quantity is exactly $\left(\E u(X)\right)\left(\E v(Y)\right)$.
 \end{proof}
 
+
+Now that we have Proposition \ref{pro:indep-implies-prodexpect}, we
+mention a corollary that will help us later to quickly identify those
+random variables which are \emph{not} independent.
 \begin{cor}
 If $X$ and $Y$ are independent, then $\mbox{Cov}(X,Y)=0$, and consequently,
 $\mbox{Corr}(X,Y)=0$.\label{cor:indep-implies-uncorr}
@@ -9221,7 +9230,24 @@
 \label{rem:cov0-not-imply-indep}Unfortunately, the converse of Corollary
 \ref{cor:indep-implies-uncorr} is not true. That is, there are many
 random variables which are dependent yet their covariance and correlation
-is zero. For more details, see Casella and Berger \cite{Casella2002}. \end{rem}
+is zero. For more details, see Casella and Berger \cite{Casella2002}. 
+\end{rem}
+Proposition \ref{pro:indep-implies-prodexpect} is useful to us and
+we will get a lot of mileage out of it, but there is another fact
+which will play an even more important role. Unfortunately, the proof
+is beyond the techniques presented here. The inquisitive reader should
+consult Casella and Berger \cite{Casella2002}, Resnick \cite{Resnick1999},
+\emph{etc.}
+\begin{fact}
+\label{fac:indep-then-function-indep}If $X$ and $Y$ are independent,
+then $u(X)$ and $v(Y)$ are independent for any functions $u$ and
+$v$.
+\end{fact}
+
+\subsection{Combining Independent Random Variables\label{sub:Combining-Independent-Random}}
+
+Another important corollary of Proposition \ref{pro:indep-implies-prodexpect}
+will allow us to find the distribution of sums of random variables.
 \begin{cor}
 If $X$ and $Y$ are independent, then the moment generating function
 of $X+Y$ is \begin{equation}
@@ -9234,19 +9260,27 @@
 \end{proof}
 
 
-Proposition \ref{pro:indep-implies-prodexpect} is useful to us and
-we will receive mileage out of it, but there is another fact which
-will play an even more important role. Unfortunately, the proof is
-beyond the techniques presented here. The inquisitive reader should
-consult Casella and Berger \cite{Casella2002}, Resnick \cite{Resnick1999},
-\emph{etc}.
-\begin{fact}
-\label{fac:indep-then-function-indep}If $X$ and $Y$ are independent,
-then $u(X)$ and $v(Y)$ are independent for any functions $u$ and
-$v$.
-\end{fact}
+Let us take a look at some examples of the corollary in action.
+\begin{example}
+Let $X\sim\mathsf{binom}(\mathtt{size}=n_{1},\,\mathtt{prob}=p)$
+and $Y\sim\mathsf{binom}(\mathtt{size}=n_{2},\,\mathtt{prob}=p)$
+be independent. Then $X+Y$ has MGF\[
+M_{X+Y}(t)=M_{X}(t)\, M_{Y}(t)=\left(q+p\me^{t}\right)^{n_{1}}\left(q+p\me^{t}\right)^{n_{2}}=\left(q+p\me^{t}\right)^{n_{1}+n_{2}},\]
+which is the MGF of a $\mathsf{binom}(\mathtt{size}=n_{1}+n_{2},\,\mathtt{prob}=p)$
+distribution. Therefore, $X+Y\sim\mathsf{binom}(\mathtt{size}=n_{1}+n_{2},\,\mathtt{prob}=p)$.
+\end{example}
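+
+A quick numerical check of this fact (a sketch, separate from the
+derivation above): convolve the two binomial PMFs and compare with
+\texttt{dbinom} for the combined size.
+
+<<>>=
+n1 <- 3; n2 <- 5; p <- 0.4   # hypothetical sizes and success probability
+# P(X + Y = k) by direct convolution of the two PMFs
+conv <- sapply(0:(n1 + n2), function(k)
+    sum(dbinom(0:k, n1, p) * dbinom(k - (0:k), n2, p)))
+all.equal(conv, dbinom(0:(n1 + n2), n1 + n2, p))   # TRUE
+@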
 
-\subsection{Combining Independent Random Variables\label{sub:Combining-Independent-Random}}
+\begin{example}
+Let $X\sim\mathsf{norm}(\mathtt{mean}=\mu_{1},\,\mathtt{sd}=\sigma_{1})$
+and $Y\sim\mathsf{norm}(\mathtt{mean}=\mu_{2},\,\mathtt{sd}=\sigma_{2})$
+be independent. Then $X+Y$ has MGF\[
+M_{X+Y}(t)=M_{X}(t)\, M_{Y}(t)=\exp\left\{ \mu_{1}t+t^{2}\sigma_{1}^{2}/2\right\} \exp\left\{ \mu_{2}t+t^{2}\sigma_{2}^{2}/2\right\} =\exp\left\{ \left(\mu_{1}+\mu_{2}\right)t+t^{2}\left(\sigma_{1}^{2}+\sigma_{2}^{2}\right)/2\right\} ,\]
+which is the MGF of a $\mathsf{norm}(\mathtt{mean}=\mu_{1}+\mu_{2},\,\mathtt{sd}=\sqrt{\sigma_{1}^{2}+\sigma_{2}^{2}})$
+distribution. 
+\end{example}
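+Again a numerical check is possible (a sketch with simulated data and
+hypothetical parameters; the sample mean and standard deviation of
+$X+Y$ should be near $\mu_{1}+\mu_{2}$ and $\sqrt{\sigma_{1}^{2}+\sigma_{2}^{2}}$):
+
+<<>>=
+set.seed(42)                          # reproducibility
+x <- rnorm(10000, mean = 1, sd = 2)   # hypothetical parameters
+y <- rnorm(10000, mean = 3, sd = 1)
+c(mean(x + y), sd(x + y))   # compare with 4 and sqrt(5) = 2.236...
+@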
+Even when we cannot use the MGF trick to identify the exact distribution
+of a linear combination of random variables, we can still say something
+about its mean and variance.
 \begin{prop}
 \label{pro:mean-sd-lin-comb-two}Let $X_{1}$ and $X_{2}$ be independent
 with respective population means $\mu_{1}$ and $\mu_{2}$ and population
@@ -9290,11 +9324,7 @@
 to a \emph{lack of influence} between the two variables, exchangeability
 aims to capture the \emph{symmetry} between them.
 \begin{example}
-BLANK.
-\end{example}
-
-\begin{example}
-Here is another one, more complicated than the one above.\begin{multline}
+Let $X$ and $Y$ have joint PDF\begin{multline}
 f_{X,Y}(x,y)=(1+\alpha)\lambda^{2}\me^{-\lambda(x+y)}+\alpha(2\lambda)^{2}\me^{-2\lambda(x+y)}-2\alpha\lambda^{2}\left(\me^{-\lambda(2x+y)}+\me^{-\lambda(x+2y)}\right).\end{multline}
 It is straightforward and tedious to check that $\iint f=1$. We may
 see immediately that $f_{X,Y}(x,y)=f_{X,Y}(y,x)$ for all $(x,y)$,
@@ -9304,13 +9334,33 @@
 \cite{Kotz2000}.
 \end{example}
 
-\begin{rem}
+\begin{example}
+\label{exa:binom-exchangeable}Suppose $X$ and $Y$ are i.i.d.~$\mathsf{binom}(\mathtt{size}=n,\,\mathtt{prob}=p)$.
+Then their joint PMF is\begin{eqnarray*}
+f_{X,Y}(x,y) & = & f_{X}(x)f_{Y}(y)\\
+ & = & {n \choose x}\, p^{x}(1-p)^{n-x}\,{n \choose y}\, p^{y}(1-p)^{n-y}\\
+ & = & {n \choose x}{n \choose y}\, p^{x+y}(1-p)^{2n-(x+y)},\end{eqnarray*}
+and the value is the same if we exchange $x$ and $y$. Therefore
+$(X,Y)$ are exchangeable.
+\end{example}
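+
+A tiny numerical illustration of the symmetry (a sketch, with
+hypothetical $n$ and $p$):
+
+<<>>=
+n <- 4; p <- 0.3   # hypothetical parameters
+f <- function(x, y) dbinom(x, n, p) * dbinom(y, n, p)
+c(f(1, 3), f(3, 1))   # equal, as exchangeability requires
+@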
+
+
+Looking at Example \ref{exa:binom-exchangeable} more closely we see
+that the fact that $(X,Y)$ are exchangeable has nothing to do with
+the $\mathsf{binom}(\mathtt{size}=n,\,\mathtt{prob}=p)$ distribution;
+it only matters that they are independent (so that the joint PDF factors)
+and that they are identically distributed (in which case we may swap
+letters to no effect). We could just as easily have used any other
+marginal distribution.
+We will take this as a proof of the following proposition.
+\begin{prop}
 If $X$ and $Y$ are i.i.d.~(with common marginal distribution $F$)
-then $X$ and $Y$ are exchangeable because\[
-F_{X,Y}(x,y)=F(x)F(y)=F(y)F(x)=F_{X,Y}(y,x).\]
- 
-\end{rem}
+then $X$ and $Y$ are exchangeable. 
+\end{prop}
 
+
+Exchangeability thus contains i.i.d.~as a special case. 
+
+
 \section{The Bivariate Normal Distribution\label{sec:The-Bivariate-Normal}}
 
 The bivariate normal PDF is given by the unwieldy formula\begin{multline}
@@ -9803,7 +9853,19 @@
 Prove that $\mbox{Cov}(X,Y)=\E(XY)-(\E X)(\E Y).$\label{xca:Prove-cov-shortcut}
 \end{xca}
 
+\begin{xca}
+\label{xca:sum-indep-chisq}Suppose $X\sim\mathsf{chisq}(\mathtt{df}=p_{1})$
+and $Y\sim\mathsf{chisq}(\mathtt{df}=p_{2})$ are independent. Find
+the distribution of $X+Y$ (you may want to refer to Equation \ref{eq:mgf-chisq}).
+\end{xca}
 
+\begin{xca}
+\label{xca:diff-indep-norm}Show that when $X$ and $Y$ are independent
+the MGF of $X-Y$ is $M_{X}(t)M_{Y}(-t)$. Use this to find the distribution
+of $X-Y$ when $X\sim\mathsf{norm}(\mathtt{mean}=\mu_{1},\,\mathtt{sd}=\sigma_{1})$
+and $Y\sim\mathsf{norm}(\mathtt{mean}=\mu_{2},\,\mathtt{sd}=\sigma_{2})$
+are independent. 
+\end{xca}
 
 
 
@@ -10123,11 +10185,12 @@
 (see Exercise \ref{xca:clt123} at the end of this chapter). Its purpose
 is to investigate what happens to the sampling distribution of $\Xbar$
 when the population distribution is mound shaped, finite support,
-and skewed, namely $\mathsf{dt}(\mathtt{df}=3)$, $\mathsf{unif}(\mathtt{a}=0,\,\mathtt{b}=10)$
-and $\mathsf{gamma}(\mathtt{shape}=,\,\mathtt{scale}=)$, respectively. 
+and skewed, namely $\mathsf{t}(\mathtt{df}=3)$, $\mathsf{unif}(\mathtt{a}=0,\,\mathtt{b}=10)$
+and $\mathsf{gamma}(\mathtt{shape}=1.21,\,\mathtt{scale}=1/2.37)$,
+respectively. 
 
 For example, when the command \inputencoding{latin9}\lstinline[showstringspaces=false]!clt1()!\inputencoding{utf8}
-is issued a plot window opens to show a graph of the PDF of a $\mathsf{dt}(\mathtt{df}=3)$
+is issued a plot window opens to show a graph of the PDF of a $\mathsf{t}(\mathtt{df}=3)$
 distribution. On the display are shown numerical values of the population
 mean and variance. While the students examine the graph the computer
 is simulating random samples of size \inputencoding{latin9}\lstinline[showstringspaces=false]!sample.size = 2!\inputencoding{utf8}
@@ -10223,7 +10286,7 @@
 normal for $n_{2}$ sufficiently large, also by the CLT. Further,
 $\hat{p}_{1}$ and $\hat{p}_{2}$ are independent since they are derived
 from independent samples. And a difference of independent (approximately)
-normal distributions is (approximately) normal, by Proposition BLANK%
+normal random variables is (approximately) normal, by Exercise \ref{xca:diff-indep-norm}%
 \footnote{\begin{rem}
 This does not explicitly follow, because of our cavalier use of {}``approximately''
 in too many places. To be more thorough, however, would require more
@@ -11222,9 +11285,7 @@
 $Y_{m}$ sample.
 
 Suppose that $\sigma_{X}$ and $\sigma_{Y}$ are known. We would like
-a confidence interval for $\mu_{X}-\mu_{Y}$.
-
-We know that \begin{equation}
+a confidence interval for $\mu_{X}-\mu_{Y}$. We know that \begin{equation}
 \Xbar-\Ybar\sim\mathsf{norm}\left(\mathtt{mean}=\mu_{X}-\mu_{Y},\,\mathtt{sd}=\sqrt{\frac{\sigma_{X}^{2}}{n}+\frac{\sigma_{Y}^{2}}{m}}\right).\end{equation}
 Therefore, a $100(1-\alpha)$\% confidence interval for $\mu_{X}-\mu_{Y}$
 is given by\begin{equation}
@@ -11246,7 +11307,7 @@
 \Xbar-\Ybar\sim\mathsf{norm}\left(\mathtt{mean}=\mu_{X}-\mu_{Y},\,\mathtt{sd}=\sigma\sqrt{\frac{1}{n}+\frac{1}{m}}\right).\end{equation}
 Now let \begin{equation}
 U=\frac{n-1}{\sigma^{2}}S_{X}^{2}+\frac{m-1}{\sigma^{2}}S_{Y}^{2}.\end{equation}
-Then by BLANK we know that $U\sim\mathsf{chisq}(\mathtt{df}=n+m-2)$
+Then by Exercise \ref{xca:sum-indep-chisq} we know that $U\sim\mathsf{chisq}(\mathtt{df}=n+m-2)$
 and it is not a large leap to believe that $U$ is independent of $\Xbar-\Ybar$;
 thus \begin{equation}
 T=\frac{Z}{\sqrt{\left.U\right\slash (n+m-2)}}\sim\mathsf{t}(\mathtt{df}=n+m-2).\end{equation}
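+In practice this pooled interval is computed by \texttt{t.test} with
+\texttt{var.equal = TRUE} (a sketch with simulated data; the default
+\texttt{var.equal = FALSE} gives the Welch procedure discussed below):
+
+<<>>=
+set.seed(1)                        # hypothetical data
+x <- rnorm(10, mean = 5, sd = 2)
+y <- rnorm(12, mean = 4, sd = 2)
+t.test(x, y, var.equal = TRUE)$conf.int   # pooled interval
+@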
@@ -11261,10 +11322,12 @@
 S_{p}=\sqrt{\frac{(n-1)S_{X}^{2}+(m-1)S_{Y}^{2}}{n+m-2}}\end{equation}
 is called the {}``pooled'' estimator of $\sigma$.
 \item if one of the samples is small, and both underlying populations are
-normal, but $\sigma_{X}\neq\sigma_{Y}$, then we may use a procedure
-attributed to Welch-Aspin (BLANK). The idea is to use an interval
-of the form\begin{equation}
-\left(\Xbar-\Ybar\right)\pm t_{\alpha/2}(\mathtt{df}=r)\,\sqrt{\frac{S_{X}^{2}}{n}+\frac{S_{Y}^{2}}{m}},\end{equation}
+normal, but $\sigma_{X}\neq\sigma_{Y}$, then we may use a Welch (or
+Satterthwaite) approximation to the degrees of freedom. See Welch
+\cite{Welch1947}, Satterthwaite \cite{Satterthwaite1946}, or Neter
+\emph{et al.}~\cite{Neter1996}. The idea is to use an interval of
+the form\begin{equation}
+\left(\Xbar-\Ybar\right)\pm\mathsf{t}_{\alpha/2}(\mathtt{df}=r)\,\sqrt{\frac{S_{X}^{2}}{n}+\frac{S_{Y}^{2}}{m}},\end{equation}
 where the degrees of freedom $r$ is chosen so that the interval has
 nice statistical properties. It turns out that a good choice for $r$
 is given by\begin{equation}
@@ -11489,6 +11552,8 @@
 
 
 \section{Chapter Exercises}
+
+\setcounter{thm}{0}
 \begin{xca}
 Let $X_{1}$, $X_{2}$, \ldots{}, $X_{n}$ be an $SRS(n)$ from a
 $\mathsf{norm}(\mathtt{mean}=\mu,\,\mathtt{sd}=\sigma)$ distribution.
@@ -12282,7 +12347,9 @@
 
 \section{Chapter Exercises}
 
+\setcounter{thm}{0}
 
+
 \chapter{\label{cha:Simple-Linear-Regression}Simple Linear Regression}
 
 
@@ -12594,7 +12661,8 @@
 and the desired $x$ value(s) represented by a data frame. See the
 example below.
 \begin{example}
-Using the regression line for the \inputencoding{latin9}\lstinline[showstringspaces=false]!cars!\inputencoding{utf8}
+\label{exa:regline-cars-interpret}Using the regression line for the
+\inputencoding{latin9}\lstinline[showstringspaces=false]!cars!\inputencoding{utf8}
 data:
 \begin{enumerate}
 \item What is the meaning of $\mu(8)=\beta_{0}+\beta_{1}(8)$? 
@@ -12617,17 +12685,17 @@
 This would represent the mean stopping distance for a car traveling
 0\,mph (which our regression line estimates to be \Sexpr{round(coef(cars.lm)[1],2)}).
 Of course, this interpretation does not make any sense for this example,
-because a car travelling 0\,mph takes 0~ft to stop (it was not moving
-in the first place)! What went wrong? Looking at the data, we notice
-that the smallest speed for which we have measured data is 4\,mph.
-Therefore, if we predict what would happen for slower speeds then
-we would be \emph{extrapolating}, a dangerous practice which often
-gives nonsensical results.
+because a car traveling 0\,mph takes 0\,ft to stop (it was not
+moving in the first place)! What went wrong? Looking at the data,
+we notice that the smallest speed for which we have measured data
+is 4\,mph. Therefore, if we predict what would happen for slower
+speeds then we would be \emph{extrapolating}, a dangerous practice
+which often gives nonsensical results.
 
 \end{enumerate}
 \end{example}
 
-\subsection{Point Estimates of the Regression Line}
+\subsection{Point Estimates of the Regression Line\label{sub:slr-point-est-regline}}
 
 We said at the beginning of the chapter that our goal was to estimate
 $\mu=\E Y$, and the arguments in Section \ref{sub:point-estimate-mle-slr}
@@ -12642,11 +12710,12 @@
 The first is a number, $\mu(x_{0})$, and the second is a random variable,
 $Y(x_{0})$, but our point estimate is the same for both: $\hat{\mu}(x_{0})$.
 \begin{example}
-We may use the regression line to obtain a point estimate of the mean
-stopping distance for a car traveling 8\,mph: $\hat{\mu}(15)=b_{0}+8b_{1}\approx$
-\Sexpr{round(coef(cars.lm)[1],2)} $+(8)$ (\Sexpr{round(coef(cars.lm)[2],2)})$\approx13.88$.
-We would also use 13.88 as a point estimate for the stopping distance
-of a future car traveling 8\,mph.
+\label{exa:regline-cars-pe-8mph}We may use the regression line to
+obtain a point estimate of the mean stopping distance for a car traveling
+8\,mph: $\hat{\mu}(8)=b_{0}+8b_{1}\approx$ \Sexpr{round(coef(cars.lm)[1],2)}
+$+(8)$ (\Sexpr{round(coef(cars.lm)[2],2)})$\approx13.88$. We would
+also use 13.88 as a point estimate for the stopping distance of a
+future car traveling 8\,mph.
 \end{example}
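+We can reproduce this arithmetic directly from the fitted model (a
+quick sketch, assuming the \texttt{cars.lm} fit from earlier in the
+chapter):
+
+<<>>=
+sum(coef(cars.lm) * c(1, 8))   # b0 + 8*b1
+@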
 Note that we actually have observed data for a car traveling 8\,mph;
 its stopping distance was 16\,ft as listed in the fifth row of the
@@ -12668,13 +12737,14 @@
 belong to the original data if the context of the statement obviates
 any danger of confusion.
 
-We saw in Example BLANK that spooky things can happen when we are
-cavalier about point estimation. While it is usually acceptable to
-predict/estimate at values of $x_{0}$ that fall within the range
-of the original $x$ data, it is reckless to use $\hat{\mu}$ for
-point estimates at locations outside that range. Such estimates are
-usually worthless. \emph{Do not extrapolate} unless there are compelling
-external reasons, and even then, temper it with a good deal of caution.
+We saw in Example \ref{exa:regline-cars-interpret} that spooky things
+can happen when we are cavalier about point estimation. While it is
+usually acceptable to predict/estimate at values of $x_{0}$ that
+fall within the range of the original $x$ data, it is reckless to
+use $\hat{\mu}$ for point estimates at locations outside that range.
+Such estimates are usually worthless. \emph{Do not extrapolate} unless
+there are compelling external reasons, and even then, temper it with
+a good deal of caution.
 
 
 \subsection*{How to do it with \textsf{R}}
@@ -12705,7 +12775,7 @@
 
 Note that there were no observed cars that traveled 6\,mph or 21\,mph.
 Also note that our estimate for a car traveling 8\,mph matches the
-value we computed by hand in Example BLANK.
+value we computed by hand in Example \ref{exa:regline-cars-pe-8mph}.
 
 
 \subsection{Mean Square Error and Standard Error}
@@ -12743,17 +12813,17 @@
 found it to be approximately $\hat{\mu}(8)\approx$\Sexpr{round(predict(cars.lm, newdata = data.frame(speed = 8)), 2)}.
 Now, it turns out that there was only one recorded observation at
 $x=8$, and we have seen this value in the output of \inputencoding{latin9}\lstinline[showstringspaces=false]!head(cars)!\inputencoding{utf8}
-in Example \ref{exa:Speed-and-Stopping}; it was $\mathtt{dist}=16$~ft
-for a car with $\mathtt{speed}=8$~mph. Therefore, the residual should
-be $E=Y-\hat{Y}$ which is $E\approx16-$\Sexpr{round(predict(cars.lm, newdata = data.frame(speed = 8)), 2)}.
+in Example \ref{exa:Speed-and-Stopping}; it was $\mathtt{dist}=16$\,ft
+for a car with $\mathtt{speed}=8$\,mph. Therefore, the residual
+should be $E=Y-\hat{Y}$ which is $E\approx16-$\Sexpr{round(predict(cars.lm, newdata = data.frame(speed = 8)), 2)}.
 Now take a look at the last entry of \inputencoding{latin9}\lstinline[showstringspaces=false]!residuals(cars.lm)!\inputencoding{utf8},
 above. It is not a coincidence.
 
 The estimate $S$ for $\sigma$ is called the \inputencoding{latin9}\lstinline[breaklines=true,showstringspaces=false]!Residual standard error!\inputencoding{utf8}
 and for the \inputencoding{latin9}\lstinline[showstringspaces=false]!cars!\inputencoding{utf8}
 data is shown a few lines up on the \inputencoding{latin9}\lstinline[showstringspaces=false]!summary(cars.lm)!\inputencoding{utf8}
-output (see How to do it with \textsf{R} in Section BLANK). We may
-read it from there to be $S\approx$\Sexpr{round(summary(cars.lm)$sigma, 2)},
+output (see How to do it with \textsf{R} in Section \ref{sub:slr-interval-est-params}).
+We may read it from there to be $S\approx\,$\Sexpr{round(summary(cars.lm)$sigma, 2)},
 or we can access it directly from the \inputencoding{latin9}\lstinline[showstringspaces=false]!summary!\inputencoding{utf8}
 object.
 
@@ -12763,7 +12833,7 @@
 @
 
 
-\subsection{Interval Estimates of the Parameters}
+\subsection{Interval Estimates of the Parameters\label{sub:slr-interval-est-params}}
 
 We discussed general interval estimation in Chapter \ref{cha:Estimation}.
 There we found that we could use what we know about the sampling distribution
@@ -12772,11 +12842,11 @@
 we will determine the sampling distributions of the parameter estimates,
 $b_{1}$ and $b_{0}$.
 
-To that end, we can see from Equation BLANK (and it is made clear
-in Chapter \ref{cha:Multiple-Linear-Regression}) that $b_{1}$ is
-just a linear combination of normally distributed random variables,
-so $b_{1}$ is normally distributed too. Further, it can be shown
-that\begin{equation}
+To that end, we can see from Equation \ref{eq:regline-slope-formula}
+(and it is made clear in Chapter \ref{cha:Multiple-Linear-Regression})
+that $b_{1}$ is just a linear combination of normally distributed
+random variables, so $b_{1}$ is normally distributed too. Further,
+it can be shown that\begin{equation}
 b_{1}\sim\mathsf{norm}\left(\mathtt{mean}=\beta_{1},\,\mathtt{sd}=\sigma_{b_{1}}\right)\end{equation}
  where \begin{equation}
 \sigma_{b_{1}}=\frac{\sigma}{\sqrt{\sum_{i=1}^{n}(x_{i}-\xbar)^{2}}}\end{equation}
@@ -12787,7 +12857,8 @@
 defined by\begin{equation}
 S_{b_{1}}=\frac{S}{\sqrt{\sum_{i=1}^{n}(x_{i}-\xbar)^{2}}}.\end{equation}
 Now, it turns out that $S$ is independent of both $b_{0}$ and $b_{1}$
-(see the footnote in Section BLANK). Therefore, the quantity\begin{equation}
+(see the footnote in Section \ref{sub:mlr-interval-est-params}).
+Therefore, the quantity\begin{equation}
 T=\frac{b_{1}-\beta_{1}}{S_{b_{1}}}\end{equation}
 has a $\mathsf{t}(\mathtt{df}=n-2)$ distribution. Therefore, a $100(1-\alpha)\%$
 confidence interval for $\beta_{1}$ is given by \begin{equation}
@@ -12826,10 +12897,10 @@
 In the \inputencoding{latin9}\lstinline[showstringspaces=false]!Coefficients!\inputencoding{utf8}
 section we find the parameter estimates and their respective standard
 errors in the second and third columns; the other columns are discussed
-in Section BLANK. If we wanted, say, a 95\% confidence interval for
-$\beta_{1}$ we could use $b_{1}=\ $\Sexpr{A[2,1]} and $S_{b_{1}}=\ $\Sexpr{A[2,2]}
-together with a $\mathsf{t}_{0.025}(\mathtt{df}=23)$ critical value
-to calculate $b_{1}\pm\mathsf{t}_{0.025}(\mathtt{df}=23)S_{b_{1}}$.
+in Section \ref{sec:Model-Utility-SLR}. If we wanted, say, a 95\%
+confidence interval for $\beta_{1}$ we could use $b_{1}=\ $\Sexpr{A[2,1]}
+and $S_{b_{1}}=\ $\Sexpr{A[2,2]} together with a $\mathsf{t}_{0.025}(\mathtt{df}=23)$
+critical value to calculate $b_{1}\pm\mathsf{t}_{0.025}(\mathtt{df}=23)S_{b_{1}}$.
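+
+Written out, the computation looks something like this (a sketch for
+a generic fitted model \texttt{fit}; the name is hypothetical):
+
+<<eval=FALSE>>=
+A <- summary(fit)$coefficients   # fit is a hypothetical lm object
+A[2, 1] + c(-1, 1) * qt(0.975, df = fit$df.residual) * A[2, 2]
+@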
 
 Or, we could use the \inputencoding{latin9}\lstinline[showstringspaces=false]!confint!\inputencoding{utf8}
 function.
@@ -12842,7 +12913,7 @@
 covers the parameter $\beta_{1}$.
 
 
-\subsection{Interval Estimates of the Regression Line}
+\subsection{Interval Estimates of the Regression Line\label{sub:slr-interval-est-regline}}
 
 We have seen how to estimate the coefficients of regression line with
 both point estimates and confidence intervals. We even saw how to
@@ -12892,8 +12963,8 @@
 
 Confidence and prediction intervals are calculated in \textsf{R} with
 the \inputencoding{latin9}\lstinline[showstringspaces=false]!predict!\inputencoding{utf8}\index{predict@\texttt{predict}}
-function, which we encountered in Section BLANK. There we neglected
-to take advantage of its additional \inputencoding{latin9}\lstinline[showstringspaces=false]!interval!\inputencoding{utf8}
+function, which we encountered in Section \ref{sub:slr-point-est-regline}.
+There we neglected to take advantage of its additional \inputencoding{latin9}\lstinline[showstringspaces=false]!interval!\inputencoding{utf8}
 argument. The general syntax follows.
 \begin{example}
 We will find confidence and prediction intervals for the stopping
@@ -12970,8 +13041,9 @@
 would be inherently less certain than a bound for an average (mean)
 value; therefore, we expect the CIs for the mean to be tighter than
 the PIs for a new observation. A close look at the standard deviations
-in Equations BLANK and BLANK confirms our guess, but we would like
-to see a picture to drive the point home.
+in Equations \ref{eq:SLR-conf-int-formula} and \ref{eq:SLR-pred-int-formula}
+confirms our guess, but we would like to see a picture to drive the
+point home.
 
 We may plot the confidence and prediction intervals with one fell
 swoop using the \inputencoding{latin9}\lstinline[showstringspaces=false]!ci.plot!\inputencoding{utf8}
@@ -13000,13 +13072,13 @@
 Notice that the bands curve outward away from the regression line
 as the $x$ values move away from the center. This is expected once
 we notice the $(x_{0}-\xbar)^{2}$ term in the standard deviation
-formulas in Equations BLANK and BLANK.
+formulas in Equations \ref{eq:SLR-conf-int-formula} and \ref{eq:SLR-pred-int-formula}.
 
 
 \section{Model Utility and Inference\label{sec:Model-Utility-SLR}}
 
 
-\subsection{Hypothesis Tests for the Parameters}
+\subsection{Hypothesis Tests for the Parameters\label{sub:slr-hypoth-test-params}}
 
 Much of the attention of SLR is directed toward $\beta_{1}$ because
 when $\beta_{1}\neq0$ the mean value of $Y$ increases (or decreases)
@@ -13019,18 +13091,19 @@
 we need to know the sampling distribution of $b_{1}$ when the null
 hypothesis is true.
 
-To this end we already know from Section BLANK that the quantity\begin{equation}
+To this end we already know from Section \ref{sub:slr-interval-est-params}
+that the quantity\begin{equation}
 T=\frac{b_{1}-\beta_{1}}{S_{b_{1}}}\end{equation}
-has a $t(\mathtt{df}=n-2)$ distribution; therefore, when $\beta_{1}=0$
-the quantity $b_{1}/S_{b_{1}}$ has a $t(\mathtt{df}=n-2)$ distribution
-and we can compute a $p$-value by comparing the observed value of
-$b_{1}/S{}_{b_{1}}$ with values under a $\mathsf{t}(\mathtt{df}=n-2)$
+has a $\mathsf{t}(\mathtt{df}=n-2)$ distribution; therefore, when
+$\beta_{1}=0$ the quantity $b_{1}/S_{b_{1}}$ has a $\mathsf{t}(\mathtt{df}=n-2)$
+distribution and we can compute a $p$-value by comparing the observed
+value of $b_{1}/S{}_{b_{1}}$ with values under a $\mathsf{t}(\mathtt{df}=n-2)$
 curve. 
 
 Similarly, we may test the hypothesis $H_{0}:\beta_{0}=0$ versus
 the alternative $H_{1}:\beta_{0}\neq0$ with the statistic $T=b_{0}/S_{b_{0}}$,
-where $S_{b_{0}}$ is given in Section BLANK. The test is conducted
-the same way as for $\beta_{1}$. 
+where $S_{b_{0}}$ is given in Section \ref{sub:slr-interval-est-params}.
+The test is conducted the same way as for $\beta_{1}$. 
 
 
 \subsection*{How to do it with \textsf{R}}
@@ -13169,24 +13242,25 @@
 versus $H_{1}:\beta_{1}\neq0$, but it is done with a new test statistic
 called the \emph{overall} $F$ \emph{statistic}. It is defined by
 \begin{equation}
-F=\frac{SSR}{SSE/(n-2)}.\end{equation}
+F=\frac{SSR}{SSE/(n-2)}.\label{eq:slr-overall-F-statistic}\end{equation}
 Under the regression assumptions and when $H_{0}$ is true, the $F$
 statistic has an $\mathtt{f}(\mathtt{df1}=1,\,\mathtt{df2}=n-2)$
 distribution. We reject $H_{0}$ when $F$ is large -- that is, when
 the explained variation is large relative to the unexplained variation.
 
 All this being said, we have not yet gained much from the overall
-$F$ statistic because we already knew from Section BLANK how to test
-$H_{0}:\beta_{1}=0$\ldots{} we use the Student's $t$ statistic.
-What is worse is that (in the simple linear regression model) it can
-be proved that the $F$ in Equation BLANK is exactly the Student's
-$t$ statistic for $\beta_{1}$ squared,\begin{equation}
+$F$ statistic because we already knew from Section \ref{sub:slr-hypoth-test-params}
+how to test $H_{0}:\beta_{1}=0$\ldots{} we use the Student's $t$
+statistic. What is worse is that (in the simple linear regression
+model) it can be proved that the $F$ in Equation \ref{eq:slr-overall-F-statistic}
+is exactly the Student's $t$ statistic for $\beta_{1}$ squared,\begin{equation}
 F=\left(\frac{b_{1}}{S_{b_{1}}}\right)^{2}.\end{equation}
 So why bother to define the $F$ statistic? Why not just square the
 $t$ statistic and be done with it? The answer is that the $F$ statistic
 has a more complicated interpretation and plays a more important role
 in the multiple linear regression model which we will study in Chapter
-\ref{cha:Multiple-Linear-Regression}. See Section BLANK for details.
+\ref{cha:Multiple-Linear-Regression}. See Section \ref{sub:mlr-Overall-F-Test}
+for details.
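+
+The identity is easy to verify numerically (a sketch, assuming the
+\texttt{cars.lm} fit from before):
+
+<<>>=
+# overall F statistic and squared t statistic for the slope
+c(anova(cars.lm)[1, "F value"],
+  summary(cars.lm)$coefficients[2, "t value"]^2)
+@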
 
 
 \subsection{How to do it with \textsf{R}}
@@ -13279,7 +13353,7 @@
 W=\frac{\left(\sum_{i=1}^{n}a_{i}E_{(i)}\right)^{2}}{\sum_{j=1}^{n}E_{j}^{2}},\end{equation}
 where the $E_{(i)}$ are the ordered residuals and the $a_{i}$ are
 constants derived from the order statistics of a sample of size $n$
-from a normal distribution. See Section BLANK.
+from a normal distribution. See Section \ref{sub:Shapiro-Wilk-Normality-Test}.
 
 We perform the Shapiro-Wilk test below, using the \inputencoding{latin9}\lstinline[showstringspaces=false]!shapiro.test!\inputencoding{utf8}
 function from the \inputencoding{latin9}\lstinline[showstringspaces=false]!stats!\inputencoding{utf8}
@@ -13529,23 +13603,25 @@
 $h_{ii}$ will be close to 1. 
 
 Leverages have nice mathematical properties; for example, they satisfy\begin{equation}
-0\leq h_{ii}\leq1,\end{equation}
+0\leq h_{ii}\leq1,\label{eq:slr-leverage-between}\end{equation}
 and their sum is \begin{eqnarray}
 \sum_{i=1}^{n}h_{ii} & = & \sum_{i=1}^{n}\left[\frac{1}{n}+\frac{(x_{i}-\xbar)^{2}}{\sum_{k=1}^{n}(x_{k}-\xbar)^{2}}\right],\\
[TRUNCATED]

To get the complete diff run:
    svnlook diff /svnroot/ipsur -r 146


More information about the IPSUR-commits mailing list