[IPSUR-commits] r119 - pkg/IPSUR/inst/doc www/book www/rcmdrplugin

Mon Jan 4 19:26:22 CET 2010

Author: gkerns
Date: 2010-01-04 19:26:21 +0100 (Mon, 04 Jan 2010)
New Revision: 119

Modified:
   pkg/IPSUR/inst/doc/IPSUR.Rnw
   pkg/IPSUR/inst/doc/IPSUR.bib
   www/book/feedback.php
   www/rcmdrplugin/index.php
Log:
too many to list


Modified: pkg/IPSUR/inst/doc/IPSUR.Rnw
===================================================================

--- pkg/IPSUR/inst/doc/IPSUR.Rnw	2010-01-04 14:19:50 UTC (rev 118)
+++ pkg/IPSUR/inst/doc/IPSUR.Rnw	2010-01-04 18:26:21 UTC (rev 119)
@@ -491,6 +491,12 @@
 but frankly, I have tried to steer clear of them for the past year
 or so to avoid any undue influence on my own writing.
 
+I would like to make special mention of two other books: \emph{Introduction
+to Statistical Thought} by Michael Lavine \cite{Lavine2009} and \emph{Introduction
+to Probability} by Grinstead and Snell \cite{Grinstead1997}. Both
+of these books are \emph{free} and are what ultimately convinced me
+to release \IPSUR\ under a free license, too.
+
 Please bear in mind that the title of this book is {}``Introduction
 to Probability and Statistics Using \textsf{R}'', and not {}``Introduction
 to \textsf{R} Using Probability and Statistics'', nor even {}``Introduction
@@ -502,10 +508,10 @@
 Still others will just want to learn \textsf{R} and skip all of the
 mathematics.
 
-Despite any misgivings: here it is. I humbly invite said individuals
-to take this book, with the GNU-FDL in hand, and make it better. In
-that spirit there are at least a few ways in which this book could
-be improved in my view.
+Despite any misgivings: here it is, warts and all. I humbly invite
+said individuals to take this book, with the GNU-FDL in hand, and
+make it better. In that spirit there are at least a few ways in which
+this book could be improved in my view.
 \begin{description}
 \item [{Better~data:}] the data analyzed in this book are almost entirely
 from the \inputencoding{latin9}\lstinline[showstringspaces=false]!datasets!\inputencoding{utf8}
@@ -525,7 +531,7 @@
 
 In a perfect world with infinite time I would research and contribute
 recent, \emph{real} data in a context crafted to engage the students
-in \emph{every} example. One day I hope to stumble across that time.
+in \emph{every} example. One day I hope to stumble over said time.
 In the meantime, I will add new data sets incrementally as time permits.
 
 \item [{More~proofs:}] for the sake of completeness (I understand that
@@ -749,9 +755,9 @@
 more attention, at an increasing rate.
 
 This book is devoted mostly to the frequentist viewpoint because that
-is how I was trained, with the conspicuous exception of Sections BLANK,
-BLANK, and BLANK. I plan to add more Bayesian material in later editions
-of this book. 
+is how I was trained, with the conspicuous exception of Sections \ref{sec:Bayes'-Rule}
+and \ref{sec:Conditional-Distributions}. I plan to add more Bayesian
+material in later editions of this book. 
 
 
 \chapter{An Introduction to \textsf{R\label{cha:An-Introduction-to-R}}}
@@ -4931,7 +4937,7 @@
 \end{itemize}
 The quantity $n!/[k!(n-k)!]$ is called a \emph{binomial coefficient}
 and plays a special role in mathematics; it is denoted\begin{equation}
-{n \choose k}=\frac{n!}{k!(n-k)!}\end{equation}
+{n \choose k}=\frac{n!}{k!(n-k)!}\label{eq:binomial-coefficient}\end{equation}
 and is read {}``$n$ choose $k$''.
 \begin{example}
 You rent five movies to watch over the span of two nights, but only
@@ -5714,7 +5720,7 @@
 \textbf{\emph{(Bayes' Rule).}} Let $B_{1}$, $B_{2}$, \ldots{},
 $B_{n}$ be mutually exclusive and exhaustive and let $A$ be an event
 with $\P(A)>0$. Then \begin{equation}
-\P(B_{k}|A)=\frac{\P(B_{k})\P(A|B_{k})}{\sum_{i=1}^{n}\P(B_{i})\P(A|B_{i})},\quad k=1,2,\ldots,n.\end{equation}
+\P(B_{k}|A)=\frac{\P(B_{k})\P(A|B_{k})}{\sum_{i=1}^{n}\P(B_{i})\P(A|B_{i})},\quad k=1,2,\ldots,n.\label{eq:bayes-rule}\end{equation}
 \end{thm}
 \begin{proof}
 The proof follows from looking at $\P(B_{k}\cap A)$ in two different
@@ -9081,7 +9087,7 @@
 By the way, there is a shortcut formula for covariance which is almost
 as handy as the shortcut for the variance:\begin{equation}
 \mbox{Cov}(X,Y)=\E(XY)-(\E X)(\E Y).\end{equation}
-The proof is left to Exercise BLANK.
+The proof is left to Exercise \ref{xca:Prove-cov-shortcut}.
 
 The Pearson product moment correlation between $X$ and $Y$ is the
 covariance between $X$ and $Y$ rescaled to fall in the interval
@@ -9147,7 +9153,8 @@
 To do the continuous case we could use the computer algebra utilities
 of \inputencoding{latin9}\lstinline[showstringspaces=false]!Yacas!\inputencoding{utf8}
 and the associated \textsf{R} package \inputencoding{latin9}\lstinline[showstringspaces=false]!Ryacas!\inputencoding{utf8}
-\cite{ryacas}. See Section BLANK for another example where the \inputencoding{latin9}\lstinline[showstringspaces=false]!Ryacas!\inputencoding{utf8}
+\cite{ryacas}. See Section \ref{sub:bivariate-transf-R} for another
+example where the \inputencoding{latin9}\lstinline[showstringspaces=false]!Ryacas!\inputencoding{utf8}
 package appears.
 
 
@@ -9186,10 +9193,10 @@
 $\theta$ \emph{given the observation} $X=x$, denoted $\pi_{\theta|x}$.
 It may seem a mystery how to obtain $\pi_{\theta|x}$ based only on
 the information provided by $\pi$ and $f_{X|\theta}$, but it should
-not be. We have already studied this in Chapter BLANK where it was
-called Bayes' Rule:\begin{equation}
+not be. We have already studied this in Section \ref{sec:Bayes'-Rule}
+where it was called Bayes' Rule:\begin{equation}
 \pi(\theta|x)=\frac{\pi(\theta)\, f(x|\theta)}{\int\pi(u)\, f(x|u)\diff u}.\end{equation}
-Compare the above expression to Equation BLANK.
+Compare the above expression to Equation \ref{eq:bayes-rule}.
 \begin{example}
 Suppose the parameter $\theta$ is the $\P(\mbox{Heads})$ for a biased
 coin. It could be any value from 0 to 1. Perhaps we have some prior
@@ -9241,8 +9248,8 @@
 
 \subsection{Independent Random Variables\label{sub:Independent-Random-Variables}}
 
-We recall from Chapter BLANK that the events $A$ and $B$ are said
-to be independent when\begin{equation}
+We recall from Chapter \ref{cha:Probability} that the events $A$
+and $B$ are said to be independent when\begin{equation}
 \P(A\cap B)=\P(A)\P(B).\end{equation}
 If it happens that\begin{equation}
 \P(X=x,Y=y)=\P(X=x)\P(Y=y),\quad\mbox{for every }x\in S_{X},\ y\in S_{Y},\end{equation}
@@ -9281,8 +9288,8 @@
 They have many, many, tractable properties. We mention some of the
 more important ones.
 \begin{prop}
-If $X$ and $Y$ are independent, then for any functions $u$ and
-$v$, \begin{equation}
+\label{pro:indep-implies-prodexpect}If $X$ and $Y$ are independent,
+then for any functions $u$ and $v$, \begin{equation}
 \E\left(u(X)v(Y)\right)=\left(\E u(X)\right)\left(\E v(Y)\right).\end{equation}
 
 \end{prop}
@@ -9296,7 +9303,7 @@
 
 \begin{cor}
 If $X$ and $Y$ are independent, then $\mbox{Cov}(X,Y)=0$, and consequently,
-$\mbox{Corr}(X,Y)=0$.
+$\mbox{Corr}(X,Y)=0$.\label{cor:indep-implies-uncorr}
 \end{cor}
 \begin{proof}
 When $X$ and $Y$ are independent then $\E XY=\E X\,\E Y$. And when
@@ -9306,9 +9313,10 @@
 
 
 \begin{rem}
-Unfortunately, the converse of Corollary BLANK is not true. That is,
-there are many random variables which are dependent yet their covariance
-and correlation is zero. For more details, see Casella BLANK. \end{rem}
+Unfortunately, the converse of Corollary \ref{cor:indep-implies-uncorr}
+is not true. That is, there are many random variables which are dependent
+yet their covariance and correlation are zero. For more details, see
+Casella and Berger \cite{Casella2002}. \end{rem}
 \begin{cor}
 If $X$ and $Y$ are independent, then the moment generating function
 of $X+Y$ is \begin{equation}
@@ -9316,15 +9324,16 @@
 
 \end{cor}
 \begin{proof}
-Choose $u(x)=\me^{x}$ and $v(y)=\me^{y}$ in Proposition BLANK, and
-remember the identity $\me^{t(x+y)}=\me^{tx}\,\me^{ty}$.
+Choose $u(x)=\me^{tx}$ and $v(y)=\me^{ty}$ in Proposition \ref{pro:indep-implies-prodexpect},
+and remember the identity $\me^{t(x+y)}=\me^{tx}\,\me^{ty}$.
 \end{proof}
 
 
-Proposition BLANK is useful to us and we will receive mileage out
-of it, but there is another fact which will play an even more important
-role. Unfortunately, the proof is beyond the techniques presented
-here. The inquisitive reader should consult Casella and Berger, Resnick,
+Proposition \ref{pro:indep-implies-prodexpect} is useful to us and
+we will get mileage out of it, but there is another fact which
+will play an even more important role. Unfortunately, the proof is
+beyond the techniques presented here. The inquisitive reader should
+consult Casella and Berger \cite{Casella2002}, Resnick \cite{Resnick1999},
 \emph{etc}.
 \begin{fact}
 If $X$ and $Y$ are independent, then $u(X)$ and $v(Y)$ are independent
@@ -9386,7 +9395,7 @@
 which confirms that $X$ and $Y$ are exchangeable. Here, $\alpha$
 is said to be an association parameter. This particular example is
 one from the Farlie-Gumbel-Morgenstern family of distributions; see
-BLANK.
+\cite{Kotz2000}.
 \end{example}
 
 \begin{rem}
@@ -9502,19 +9511,43 @@
 \item marginal distributions
 \item how to generate randomly
 \end{itemize}
-When $p_{1}=$
-
-We write $(X_{1},\ldots,X_{k})\sim\mathsf{multinom}(\mathtt{size}=n,\,\mathtt{prob}=\mathbf{p}_{\mathrm{k}\times1})$. 
+We sample $n$ times, with replacement, from an urn that contains
+balls of $k$ different types. Let $X_{1}$ denote the number of balls
+in our sample of type 1, $X_{2}$ denote the number of balls of type
+2, \ldots{}, and $X_{k}$ denote the number of balls of type $k$.
+If the urn has proportion $p_{1}$ of balls of type 1, \ldots{},
+$p_{k}$ of type $k$, then the joint PMF of $(X_{1},\ldots,X_{k})$
+is\begin{eqnarray}
+f_{X_{1},\ldots,X_{k}}(x_{1},\ldots,x_{k}) & = & {n \choose x_{1}\, x_{2}\,\cdots\, x_{k}}\, p_{1}^{x_{1}}p_{2}^{x_{2}}\cdots p_{k}^{x_{k}},\quad\mbox{for }(x_{1},\ldots,x_{k})\in S_{X_{1},\ldots,X_{k}},\end{eqnarray}
+which, as usual, represents $\P(X_{1}=x_{1},\, X_{2}=x_{2},\,\ldots,\, X_{k}=x_{k})$.
+We write $(X_{1},\ldots,X_{k})\sim\mathsf{multinom}(\mathtt{size}=n,\,\mathtt{prob}=\mathbf{p}_{\mathrm{k}\times1})$.
+Several comments are in order. First, the support set $S_{X_{1},\ldots,X_{k}}$
+contains all nonnegative integer $k$-tuples $(x_{1},\ldots,x_{k})$
+that satisfy $x_{1}+x_{2}+\cdots+x_{k}=n$. A support set like this
+is called a \emph{simplex}. Second, the proportions $p_{1}$, $p_{2}$,
+\ldots{}, $p_{k}$ satisfy $p_{i}\geq0$ for all $i$ and $p_{1}+p_{2}+\cdots+p_{k}=1$.
+Finally, the symbol\begin{equation}
+{n \choose x_{1}\, x_{2}\,\cdots\, x_{k}}=\frac{n!}{x_{1}!\, x_{2}!\,\cdots x_{k}!}\end{equation}
+is called a \emph{multinomial coefficient}, which generalizes the notion
+of a binomial coefficient we saw in Equation \ref{eq:binomial-coefficient}.
+When $k=2$, we have $x_{1}=x$ and $x_{2}=n-x$, along with $p_{1}=p$
+and $p_{2}=1-p$, and the multinomial coefficient is literally a binomial
+coefficient. In this notation we have just shown, therefore, that
+the $\mathsf{multinom}(\mathtt{size}=n,\,\mathtt{prob}=\mathbf{p}_{2\times1})$
+distribution is the same as a $\mathsf{binom}(\mathtt{size}=n,\,\mathtt{prob}=p)$
+distribution.
 \begin{example}
-Suppose Barack Obama wants to have dinner \url{http://pewresearch.org/pubs/773/fewer-voters-identify-as-republicans}
+Suppose Barack Obama wants to have dinner \url{http://pewresearch.org/pubs/773/fewer-voters-identify-as-republicans} 36
+democrat, 27 republican, 37 independent.
 \end{example}
 
 \subsection{How to do it with \textsf{R}}
 
 There is support for the multinomial distribution in base \textsf{R},
 namely in the \inputencoding{latin9}\lstinline[showstringspaces=false]!stats!\inputencoding{utf8}
-package. The relevant functions are \inputencoding{latin9}\lstinline[showstringspaces=false]!dmultinom!\inputencoding{utf8}
-and \inputencoding{latin9}\lstinline[showstringspaces=false]!rmultinom!\inputencoding{utf8}.
+package. The \inputencoding{latin9}\lstinline[showstringspaces=false]!dmultinom!\inputencoding{utf8}
+function represents the PMF and the \inputencoding{latin9}\lstinline[showstringspaces=false]!rmultinom!\inputencoding{utf8}
+function generates random variates.
 
 <<>>=
 library(combinat)
@@ -9630,7 +9663,7 @@
 distribution. For a more general result see Proposition BLANK.
 \end{rem}
 
-\subsection{How to do it with \textsf{R}}
+\subsection{How to do it with \textsf{R\label{sub:bivariate-transf-R}}}
 
 It is possible to do the computations above in \textsf{R} with the
 \inputencoding{latin9}\lstinline[showstringspaces=false]!Ryacas!\inputencoding{utf8}
@@ -9800,7 +9833,7 @@
 
 
 \begin{xca}
-Prove that $\mbox{Cov}(X,Y)=\E(XY)-(\E X)(\E Y).$
+Prove that $\mbox{Cov}(X,Y)=\E(XY)-(\E X)(\E Y).$\label{xca:Prove-cov-shortcut}
 \end{xca}
 
 
@@ -10150,8 +10183,8 @@
 and \inputencoding{latin9}\lstinline[showstringspaces=false]!clt3!\inputencoding{utf8}
 functions were written so that students could compare what happens
 overall when the shape of the population distribution changes. It
-would be possible to combine all three into one big function \inputencoding{latin9}\lstinline[showstringspaces=false]!clt!\inputencoding{utf8}
-which covers all three cases. 
+would be possible to combine all three into one big function, \inputencoding{latin9}\lstinline[showstringspaces=false]!clt!\inputencoding{utf8}
+which covers all three cases (and more). 
 
 
 \section{Sampling Distributions of Two-Sample Statistics\label{sec:Samp-Dist-Two-Samp}}
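
Note on the multinomial hunk in IPSUR.Rnw above: the added text says the
$k=2$ case has "just been shown" to reduce to the binomial. A minimal sketch
of that step, written with the hunk's own symbols ($n$, $x$, $p$), might read
as follows. It is an illustration of the algebra only, not text from the
commit, and the range $x=0,1,\ldots,n$ is stated here as an assumption that
follows from the support condition $x_{1}+x_{2}=n$.

```latex
\begin{eqnarray*}
f_{X_{1},X_{2}}(x,\, n-x) & = & {n \choose x\,\,(n-x)}\, p^{x}(1-p)^{n-x}\\
 & = & \frac{n!}{x!\,(n-x)!}\, p^{x}(1-p)^{n-x}\\
 & = & {n \choose x}\, p^{x}(1-p)^{n-x},\quad x=0,1,\ldots,n.
\end{eqnarray*}
```

The last line is the $\mathsf{binom}(\mathtt{size}=n,\,\mathtt{prob}=p)$ PMF
of $X_{1}$; once $X_{1}$ is known, $X_{2}=n-X_{1}$ carries no further
information.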

Modified: pkg/IPSUR/inst/doc/IPSUR.bib
===================================================================
--- pkg/IPSUR/inst/doc/IPSUR.bib	2010-01-04 14:19:50 UTC (rev 118)
+++ pkg/IPSUR/inst/doc/IPSUR.bib	2010-01-04 18:26:21 UTC (rev 119)
@@ -1,4 +1,4 @@
-% This file was created with JabRef 2.6b2.
+% This file was created with JabRef 2.3.1.
 % Encoding: UTF-8
 
 @MANUAL{rgl,
@@ -333,6 +333,16 @@
   url = {http://www.jstatsoft.org/v29/i10/}
 }
 
+@BOOK{Grinstead1997,
+  title = {Introduction to Probability},
+  publisher = {American Mathematical Society},
+  year = {1997},
+  author = {Grinstead, Charles M. and Snell, J. Laurie},
+  owner = {jay},
+  timestamp = {2010.01.04},
+  url = {http://www.dartmouth.edu/~chance/teaching_aids/books_articles/probability_book/book.html}
+}
+
 @MANUAL{Harrellhmisc,
   title = {Hmisc: Harrell Miscellaneous},
   author = {Harrell, Jr, Frank E and with contributions from many other users.},
@@ -403,6 +413,47 @@
   timestamp = {2009.12.30}
 }
 
+@BOOK{Johnson1997,
+  title = {Discrete Multivariate Distributions},
+  publisher = {Wiley},
+  year = {1997},
+  author = {Johnson, Norman L. and Kotz, Samuel and Balakrishnan, N.},
+  owner = {jay},
+  timestamp = {2010.01.04}
+}
+
+@BOOK{Johnson1995,
+  title = {Continuous Univariate Distributions},
+  publisher = {Wiley},
+  year = {1995},
+  author = {Johnson, Norman L. and Kotz, Samuel and Balakrishnan, N.},
+  volume = {2},
+  edition = {Second},
+  owner = {jay},
+  timestamp = {2010.01.04}
+}
+
+@BOOK{Johnson1994,
+  title = {Continuous Univariate Distributions},
+  publisher = {Wiley},
+  year = {1994},
+  author = {Johnson, Norman L. and Kotz, Samuel and Balakrishnan, N.},
+  volume = {1},
+  edition = {Second},
+  owner = {jay},
+  timestamp = {2010.01.04}
+}
+
+@BOOK{Johnson1993,
+  title = {Univariate Discrete Distributions},
+  publisher = {Wiley},
+  year = {1993},
+  author = {Johnson, Norman L. and Kotz, Samuel and Kemp, Adrienne W.},
+  edition = {Second},
+  owner = {jay},
+  timestamp = {2010.01.04}
+}
+
 @MISC{howmanyfish,
   author = {Johnson, Roger W.},
   title = {How Many Fish are in the Pond?},
@@ -432,6 +483,17 @@
   url = {http://CRAN.R-project.org/package=RcmdrPlugin.IPSUR}
 }
 
+@BOOK{Kotz2000,
+  title = {Continuous Multivariate Distributions},
+  publisher = {Wiley},
+  year = {2000},
+  author = {Kotz, Samuel and Balakrishnan, N. and Johnson, Norman L.},
+  volume = {1: Models and Applications},
+  edition = {Second},
+  owner = {jay},
+  timestamp = {2010.01.04}
+}
+
 @MANUAL{odfweave,
   title = {odfWeave: Sweave processing of Open Document Format (ODF) files},
   author = {Max Kuhn and Steve Weaston},

Modified: www/book/feedback.php
===================================================================
--- www/book/feedback.php	2010-01-04 14:19:50 UTC (rev 118)
+++ www/book/feedback.php	2010-01-04 18:26:21 UTC (rev 119)
@@ -36,7 +36,7 @@
 <blockquote>
 
 <p>
-I would be happy to hear any (most?) comments, suggestions, problems, questions, or requests that you may have about <span class="name">IPSUR</span>. But if you would like to ask/say something, then it would be better to ask/say it on the <span class="name">IPSUR</span> mailing lists (or my personal address) rather than bog down an already busy R-help mailing list.  I have made two <span class="name">IPSUR</span> specific mailing lists for this purpose.
+I would be happy to hear any (most?) comments, suggestions, problems, questions, or requests that you may have about <span class="name">IPSUR</span>. And if you would like to ask/say something, then it would be better to ask/say it on the <span class="name">IPSUR</span> mailing lists (or my personal address) rather than bog down an already busy R-help mailing list.  There are two <span class="name">IPSUR</span> specific mailing lists for this purpose.
 </p>
 </blockquote>
 

Modified: www/rcmdrplugin/index.php
===================================================================
--- www/rcmdrplugin/index.php	2010-01-04 14:19:50 UTC (rev 118)
+++ www/rcmdrplugin/index.php	2010-01-04 18:26:21 UTC (rev 119)
@@ -38,7 +38,7 @@
 <p class="articleTitle">What RcmdrPlugin.IPSUR Does:</p>
 <blockquote>
         <p> This plugin for the <a href="http://socserv.mcmaster.ca/jfox/Misc/Rcmdr/">R Commander</a> accompanies the text <em>Introduction to Probability and Statistics Using R</em>  by G. Jay Kerns. The plugin contributes functions unique to the book as well as specific configuration and  functionality to R Commander, the pioneering work by John Fox of McMaster University.</p>
-        <p> <span class="name">RcmdrPlugin.IPSUR</span>'s primary goal is to provide a user-friendly graphical user interface (GUI) to the open-source and freely available R statistical computing environment.  <span class="name">RcmdrPlugin.IPSUR</span> is equipped to handle many of the statistical analyses and graphical displays usually encountered by upper division undergraduate Mathematics, Statistics, and Engineering majors.  Available features are comparable to many expensive commercial packages such as<span class="name"> Minitab</span>, <span class="name">SPSS</span>, and <span class="name">JMP-IN</span>.</p>
+        <p> <span class="name">RcmdrPlugin.IPSUR</span>'s primary goal is to provide a user-friendly graphical user interface (GUI) to the open-source and freely available R statistical computing environment.  <span class="name">RcmdrPlugin.IPSUR</span> is equipped to handle many of the statistical analyses and graphical displays usually encountered by upper-division undergraduate mathematics, statistics, and engineering majors.  Available features are comparable to those of many expensive commercial packages such as <span class="name">Minitab</span>, <span class="name">SPSS</span>, and <span class="name">JMP-IN</span>.</p>
         <p> Since the audience of <span class="name">RcmdrPlugin.IPSUR</span> is slightly different than <span class="name">Rcmdr</span>'s, certain functionality has been added and selected error-checks have been disabled to permit the student to explore alternative regions of the statistical landscape. The resulting benefit of increased flexibility is balanced by somewhat increased vulnerability to syntax errors and misuse; the instructor should keep this and the academic audience in mind when using <span class="name">RcmdrPlugin.IPSUR</span> in the classroom. </p>
 </blockquote>