[Sleuth2-commits] r54 - in pkg/Sleuth2: . inst/doc inst/doc/NicholasHorton

noreply at r-forge.r-project.org noreply at r-forge.r-project.org
Sun Sep 16 10:58:09 CEST 2012


Author: berwin
Date: 2012-09-16 10:58:09 +0200 (Sun, 16 Sep 2012)
New Revision: 54

Added:
   pkg/Sleuth2/.Rbuildignore
   pkg/Sleuth2/inst/doc/NicholasHorton/
   pkg/Sleuth2/inst/doc/NicholasHorton/Makefile
   pkg/Sleuth2/inst/doc/NicholasHorton/chapter01.Rnw
   pkg/Sleuth2/inst/doc/NicholasHorton/chapter01.pdf
   pkg/Sleuth2/inst/doc/NicholasHorton/chapter02.Rnw
   pkg/Sleuth2/inst/doc/NicholasHorton/chapter02.pdf
   pkg/Sleuth2/inst/doc/NicholasHorton/chapter03.Rnw
   pkg/Sleuth2/inst/doc/NicholasHorton/chapter03.pdf
   pkg/Sleuth2/inst/doc/NicholasHorton/chapter04.Rnw
   pkg/Sleuth2/inst/doc/NicholasHorton/chapter04.pdf
   pkg/Sleuth2/inst/doc/NicholasHorton/chapter05.Rnw
   pkg/Sleuth2/inst/doc/NicholasHorton/chapter05.pdf
   pkg/Sleuth2/inst/doc/NicholasHorton/chapter06.Rnw
   pkg/Sleuth2/inst/doc/NicholasHorton/chapter06.pdf
   pkg/Sleuth2/inst/doc/NicholasHorton/chapter07.Rnw
   pkg/Sleuth2/inst/doc/NicholasHorton/chapter07.pdf
   pkg/Sleuth2/inst/doc/NicholasHorton/chapter08.Rnw
   pkg/Sleuth2/inst/doc/NicholasHorton/chapter08.pdf
   pkg/Sleuth2/inst/doc/NicholasHorton/chapter09.Rnw
   pkg/Sleuth2/inst/doc/NicholasHorton/chapter09.pdf
   pkg/Sleuth2/inst/doc/NicholasHorton/chapter10.Rnw
   pkg/Sleuth2/inst/doc/NicholasHorton/chapter10.pdf
   pkg/Sleuth2/inst/doc/NicholasHorton/chapter11.Rnw
   pkg/Sleuth2/inst/doc/NicholasHorton/chapter11.pdf
   pkg/Sleuth2/inst/doc/NicholasHorton/chapter12.Rnw
   pkg/Sleuth2/inst/doc/NicholasHorton/chapter12.pdf
   pkg/Sleuth2/inst/doc/NicholasHorton/chapter13.Rnw
   pkg/Sleuth2/inst/doc/NicholasHorton/chapter13.pdf
   pkg/Sleuth2/inst/doc/NicholasHorton/language.sty
   pkg/Sleuth2/inst/doc/chapter01.Rnw
   pkg/Sleuth2/inst/doc/chapter02.Rnw
   pkg/Sleuth2/inst/doc/chapter03.Rnw
   pkg/Sleuth2/inst/doc/chapter04.Rnw
   pkg/Sleuth2/inst/doc/chapter05.Rnw
   pkg/Sleuth2/inst/doc/chapter06.Rnw
   pkg/Sleuth2/inst/doc/chapter07.Rnw
   pkg/Sleuth2/inst/doc/chapter08.Rnw
   pkg/Sleuth2/inst/doc/chapter09.Rnw
   pkg/Sleuth2/inst/doc/chapter10.Rnw
   pkg/Sleuth2/inst/doc/chapter11.Rnw
   pkg/Sleuth2/inst/doc/chapter12.Rnw
   pkg/Sleuth2/inst/doc/chapter13.Rnw
Modified:
   pkg/Sleuth2/DESCRIPTION
   pkg/Sleuth2/inst/doc/Sleuth2-manual.pdf
Log:
Sleuth2:
Put Nick Horton's knitr vignettes into inst/doc/NicholasHorton
together with a suitable Makefile to create .pdf files from his .Rnw
files.  Created dummy Sweave vignettes in inst/doc.
Created .Rbuildignore to avoid inclusion of
inst/doc/NicholasHorton/*.pdf files in the built package.
Updated inst/doc/Sleuth2-manual.pdf.
Bumped version number.


Added: pkg/Sleuth2/.Rbuildignore
===================================================================
--- pkg/Sleuth2/.Rbuildignore	                        (rev 0)
+++ pkg/Sleuth2/.Rbuildignore	2012-09-16 08:58:09 UTC (rev 54)
@@ -0,0 +1 @@
+inst/doc/NicholasHorton/.*.pdf
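
Editorial note: R CMD build reads each line of .Rbuildignore as a Perl-like regular expression and matches it case-insensitively against file paths relative to the package root. The sketch below applies the pattern added above to a few illustrative paths (the path list is an assumption chosen for the example, and grepl only approximates what R CMD build does internally).

```r
# Pattern copied from the .Rbuildignore hunk above.
pattern <- "inst/doc/NicholasHorton/.*.pdf"

# Illustrative paths, relative to the package root (assumed for this example).
paths <- c("inst/doc/NicholasHorton/chapter01.pdf",
           "inst/doc/NicholasHorton/chapter01.Rnw",
           "inst/doc/Sleuth2-manual.pdf")

# Approximate the matching rule: Perl-like regex, case-insensitive.
ignored <- grepl(pattern, paths, perl = TRUE, ignore.case = TRUE)
print(data.frame(path = paths, ignored = ignored))
```

The dot before "pdf" is unescaped, so it matches any character. A stricter (hypothetical) alternative would be ^inst/doc/NicholasHorton/.*\.pdf$.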

Modified: pkg/Sleuth2/DESCRIPTION
===================================================================
--- pkg/Sleuth2/DESCRIPTION	2012-09-16 08:36:48 UTC (rev 53)
+++ pkg/Sleuth2/DESCRIPTION	2012-09-16 08:58:09 UTC (rev 54)
@@ -1,15 +1,17 @@
 Package: Sleuth2
 Title: Data sets from Ramsey and Schafer's "Statistical Sleuth (2nd ed)"
-Version: 1.0-7
-Date: 2012-09-02
+Version: 2.0-1
+Date: 2012-09-16
 Author:  Original by F.L. Ramsey and D.W. Schafer,
-    modifications by Daniel W. Schafer, Jeannie Sifneos and Berwin A. Turlach
+    modifications by Daniel W. Schafer, Jeannie Sifneos and Berwin
+    A. Turlach, vignettes contributed by Nicholas Horton, Kate Aloisio
+    and Ruobing Zhang 
 Description: Data sets from Ramsey, F.L. and Schafer, D.W. (2002), "The
     Statistical Sleuth: A Course in Methods of Data Analysis (2nd
     ed)", Duxbury. 
 Maintainer: Berwin A Turlach <Berwin.Turlach at gmail.com>
 LazyData: yes
-Depends: R (>= 2.12.0)
+Depends: R (>= 2.15.0)
 Suggests: lattice
 License: GPL (>= 2)
 URL: http://r-forge.r-project.org/projects/sleuth2/

Added: pkg/Sleuth2/inst/doc/NicholasHorton/Makefile
===================================================================
--- pkg/Sleuth2/inst/doc/NicholasHorton/Makefile	                        (rev 0)
+++ pkg/Sleuth2/inst/doc/NicholasHorton/Makefile	2012-09-16 08:58:09 UTC (rev 54)
@@ -0,0 +1,24 @@
+.PRECIOUS : %.tex
+
+Rbases   := $(basename $(wildcard *.Rnw))
+Rfiles   := $(foreach base, $(Rbases), $(base).R)
+Rpdfs    := $(foreach base, $(Rbases), $(base).pdf)
+Rtex     := $(foreach base, $(Rbases), \
+		$(addprefix $(base), .aux .log .out .tex .toc))
+
+%.tex : %.Rnw
+	echo "library(knitr); knit(\"$<\")" | R --no-save --no-restore
+
+%.R : %.Rnw
+	echo "library(knitr); knit(\"$<\", tangle=TRUE)" | R --no-save --no-restore
+
+%.pdf : %.Rnw
+	echo "library(knitr); knit2pdf(\"$<\")" | R --no-save --no-restore
+
+all: ${Rpdfs}
+
+clean :
+	rm -f ${Rfiles}
+	rm -f ${Rtex}
+	rm -f *~
+	rm -rf figures


Property changes on: pkg/Sleuth2/inst/doc/NicholasHorton/Makefile
___________________________________________________________________
Added: svn:eol-style
   + native

Added: pkg/Sleuth2/inst/doc/NicholasHorton/chapter01.Rnw
===================================================================
--- pkg/Sleuth2/inst/doc/NicholasHorton/chapter01.Rnw	                        (rev 0)
+++ pkg/Sleuth2/inst/doc/NicholasHorton/chapter01.Rnw	2012-09-16 08:58:09 UTC (rev 54)
@@ -0,0 +1,298 @@
+\documentclass[11pt]{article}
+
+\usepackage[margin=1in,bottom=.5in,includehead,includefoot]{geometry}
+\usepackage{hyperref}
+\usepackage{language}
+\usepackage{alltt}
+\usepackage{fancyhdr}
+\pagestyle{fancy}
+\fancyhf{}
+
+%% Now begin customising things. See the fancyhdr docs for more info.
+
+\chead{}
+\lhead[\sf \thepage]{\sf \leftmark}
+\rhead[\sf \leftmark]{\sf \thepage}
+\lfoot{}
+\cfoot{Statistical Sleuth in R: Chapter 1}
+\rfoot{}
+
+\newcounter{myenumi}
+\newcommand{\saveenumi}{\setcounter{myenumi}{\value{enumi}}}
+\newcommand{\reuseenumi}{\setcounter{enumi}{\value{myenumi}}}
+
+\pagestyle{fancy}
+
+\def\R{{\sf R}}
+\def\Rstudio{{\sf RStudio}}
+\def\RStudio{{\sf RStudio}}
+\def\term#1{\textbf{#1}}
+\def\tab#1{{\sf #1}}
+
+
+\usepackage{relsize}
+
+\newlength{\tempfmlength}
+\newsavebox{\fmbox}
+\newenvironment{fmpage}[1]
+     {
+   \medskip
+   \setlength{\tempfmlength}{#1}
+	 \begin{lrbox}{\fmbox}
+	   \begin{minipage}{#1}
+		 \vspace*{.02\tempfmlength}
+		 \hfill
+	   \begin{minipage}{.95 \tempfmlength}}
+		 {\end{minipage}\hfill
+		 \vspace*{.015\tempfmlength}
+		 \end{minipage}\end{lrbox}\fbox{\usebox{\fmbox}}
+	 \medskip
+	 }
+
+
+\newenvironment{boxedText}[1][.98\textwidth]%
+{%
+\begin{center}
+\begin{fmpage}{#1}
+}%
+{%
+\end{fmpage}
+\end{center}
+}
+
+\newenvironment{boxedTable}[2][tbp]%
+{%
+\begin{table}[#1]
+  \refstepcounter{table}
+  \begin{center}
+\begin{fmpage}{.98\textwidth}
+  \begin{center}
+	\sf \large Box~\expandafter\thetable. #2
+\end{center}
+\medskip
+}%
+{%
+\end{fmpage}
+\end{center}
+\end{table}		% need to do something about exercises that follow boxedTable
+}
+
+
+\newcommand{\cran}{\href{http://www.R-project.org/}{CRAN}}
+
+\title{The Statistical Sleuth in R: \\
+Chapter 1}
+
+\author{
+Ruobing Zhang \and Kate Aloisio \and Nicholas J. Horton\thanks{Department of Mathematics and Statistics, Smith College, nhorton at smith.edu}
+} 
+
+\date{\today}
+
+\begin{document}
+
+
+\maketitle
+\tableofcontents
+
+%\parindent=0pt
+
+
+<<setup, include=FALSE, cache=FALSE>>=
+opts_chunk$set(
+  dev="pdf",
+  fig.path="figures/",
+	fig.height=3,
+	fig.width=4,
+	out.width=".47\\textwidth",
+	fig.keep="high",
+	fig.show="hold",
+	fig.align="center",
+	prompt=TRUE,  # show the prompts; but perhaps we should not do this 
+	comment=NA    # turn off commenting of output (but perhaps we should not do this either)
+  )
+@
+
+<<pvalues, echo=FALSE, message=FALSE>>=
+print.pval = function(pval) {
+  threshold = 0.0001
+    return(ifelse(pval < threshold, paste("p<", sprintf("%.4f", threshold), sep=""),
+                ifelse(pval > 0.1, paste("p=",round(pval, 2), sep=""),
+                       paste("p=", round(pval, 3), sep=""))))
+}
+@
+
+<<setup2,echo=FALSE,message=FALSE>>=
+require(mosaic)
+require(Sleuth2)
+trellis.par.set(theme=col.mosaic())  # get a better color scheme for lattice
+set.seed(123)
+# this allows for code formatting inline.  Use \Sexpr{'function(x,y)'}, for example.
+knit_hooks$set(inline = function(x) {
+if (is.numeric(x)) return(knitr:::format_sci(x, 'latex'))
+x = as.character(x)
+h = knitr:::hilight_source(x, 'latex', list(prompt=FALSE, size='normalsize'))
+h = gsub("([_#$%&])", "\\\\\\1", h)
+h = gsub('(["\'])', '\\1{}', h)
+gsub('^\\\\begin\\{alltt\\}\\s*|\\\\end\\{alltt\\}\\s*$', '', h)
+})
+showOriginal=FALSE
+showNew=TRUE
+@ 
+
+\section{Introduction}
+
+This document is intended to help describe how to undertake analyses introduced as examples in the Second Edition of the \emph{Statistical Sleuth} (2002) by Fred Ramsey and Dan Schafer.
+More information about the book can be found at \url{http://www.proaxis.com/~panorama/home.htm}.  This
+file as well as the associated \pkg{knitr} reproducible analysis source file can be found at
+\url{http://www.math.smith.edu/~nhorton/sleuth}.
+
+This work leverages initiatives undertaken by Project MOSAIC (\url{http://www.mosaic-web.org}), an NSF-funded effort to improve the teaching of statistics, calculus, science and computing in the undergraduate curriculum. In particular, we utilize the 
+\pkg{mosaic} package, which was written to simplify the use of R for introductory statistics courses. A short summary of the R needed to teach introductory statistics can be found in the mosaic package vignette (\url{http://cran.r-project.org/web/packages/mosaic/vignettes/MinimalR.pdf}).
+
+To use a package within R, it must be installed (one time), and loaded (each session). The package can be installed using the following command:
+<<install_mosaic,eval=FALSE>>=
+install.packages('mosaic')               # note the quotation marks
+@
+Once this is installed, it can be loaded by running the command:
+<<load_mosaic,eval=FALSE>>=
+require(mosaic)
+@
+This
+needs to be done once per session.
+
+In addition, the data files for the \emph{Sleuth} case studies can be accessed by installing the \pkg{Sleuth2} package.
+<<install_Sleuth2,eval=FALSE>>=
+install.packages('Sleuth2')               # note the quotation marks
+@
+<<load_Sleuth2,eval=FALSE>>=
+require(Sleuth2)
+@
+
+We also set some options to improve legibility of graphs and output.
+<<eval=TRUE>>=
+trellis.par.set(theme=col.mosaic())  # get a better color scheme for lattice
+options(digits=3)
+@
+
+The specific goal of this document is to demonstrate how to calculate the quantities described in Chapter 1: Drawing Statistical Conclusions using R.
+
+\section{Motivation and Creativity}
+
+For Case Study 1: Motivation and Creativity, the following questions are posed: Do grading systems promote creativity in students? Do ranking systems and incentive awards increase productivity among employees? Do rewards and praise stimulate children to learn?
+
+The data for Case Study 1 was collected by psychologist Teresa Amabile in an experiment concerning the effects of intrinsic and extrinsic motivation on creativity (page 2 of the \emph{Sleuth}).
+
+\subsection{Statistical summary and graphical display}
+
+We begin by reading the data and summarizing the variables.
+<<>>=
+summary(case0101)
+@
+A total of \Sexpr{nrow(case0101)} subjects with considerable experience in creative writing were randomly assigned to one of two treatment groups: \Sexpr{nrow(subset(case0101, Treatment=="Extrinsic"))} were placed into the ``extrinsic'' treatment group and \Sexpr{nrow(subset(case0101, Treatment=="Intrinsic"))} were placed into the ``intrinsic'' treatment group. The creativity scores are summarized below, as in Display 1.1 (\emph{Sleuth}, page 2).
+
+<<eval=TRUE>>=
+favstats(Score ~ Treatment, data=case0101)
+histogram(~ Score | Treatment, data=case0101)
+@
+<<>>=
+with(subset(case0101, Treatment=="Extrinsic"), stem(Score, scale=5))
+with(subset(case0101, Treatment=="Intrinsic"), stem(Score, scale=5))
+@
+
+Similar output can be generated using the following code:
+<<eval=FALSE>>=
+maggregate(Score ~ Treatment, data=case0101, FUN=stem)
+@
+
+The extrinsic group (n=\Sexpr{nrow(subset(case0101, Treatment=="Extrinsic"))}) has an average creativity score that is \Sexpr{round(diff(mean(Score ~ Treatment, data=case0101)), 1)} points less than that of the intrinsic group (n=\Sexpr{nrow(subset(case0101, Treatment=="Intrinsic"))}). The extrinsic group has a relatively larger spread than the intrinsic group (sd=\Sexpr{round(sd(Score, data=subset(case0101, Treatment=="Extrinsic")),2)} for the extrinsic group and sd=\Sexpr{round(sd(Score, subset(case0101, Treatment=="Intrinsic")), 2)} for the intrinsic group). Both distributions are approximately normal.
+
+\subsection{Inferential procedures (two-sample t-test)}
+ 
+<<eval=TRUE>>=
+t.test(Score ~ Treatment, alternative="two.sided", data=case0101)
+@
+The two-sample $t$-test shows strong evidence that a subject would receive a lower creativity score for a poem written after the extrinsic motivation questionnaire than for one written after the intrinsic motivation questionnaire. The two-sided $p$-value is \Sexpr{pval(t.test(Score~Treatment, alternative="two.sided", data=case0101), digits=4)}, which is small enough to reject the null hypothesis. 
+
+Thus, we can conclude that there is a difference between the population mean for the extrinsic group and the population mean for the intrinsic group; the estimated difference between these two means is \Sexpr{round(diff(mean(Score ~ Treatment, data=case0101)), 1)} points on the 0-40 point scale. A 95\% confidence interval for the decrease in score due to having extrinsic motivation rather than intrinsic motivation is between \Sexpr{round(t.test(Score~Treatment, alternative="two.sided", data=case0101)$conf.int[2], 2)} and \Sexpr{round(t.test(Score~Treatment, alternative="two.sided", data=case0101)$conf.int[1], 2)} points (\emph{Sleuth}, page 3).
+
+
+<<eval=TRUE>>=
+summary(lm(Score ~ Treatment, data=case0101))
+@
+
+In the creativity study, the question of whether there is a treatment effect becomes a question about whether the parameter has a nonzero value. The value of the test statistic for the creativity scores is \Sexpr{round(diff(mean(Score ~ Treatment, data=case0101)), 2)}.
+
+\subsection{Permutation test}
+
+<<eval=TRUE>>=
+diffmeans = diff(mean(Score ~ Treatment, data=case0101))
+diffmeans     # observed difference
+numsim = 1000     # set to a sufficient number
+nulldist = do(numsim)*diff(mean(Score~shuffle(Treatment), data=case0101))
+confint(nulldist)
+# Display 1.8 Sleuth
+histogram(~ Intrinsic, nint=50, data=nulldist, v=c(-4.14,4.14)) 
+@
+
+As described on page 12 of the \emph{Sleuth}, if the group assignment changes, we will get different results. First, the test statistics are just as likely to be negative as positive. Second, the majority of values fall in the range from -3.0 to +3.0. Third, only a few of the 1,000 randomizations produced test statistics as large as 4.14. This last point indicates that 4.14 is a value corresponding to an unusually uneven randomization outcome, if the null hypothesis is correct.
+
+\section{Gender Discrimination}
+
+For Case Study 2: Gender Discrimination, the following question is posed: Did a bank discriminatorily pay higher starting salaries to men than to women?  Display 1.3 (page 4 of the \emph{Sleuth}) displays the beginning salaries for male and female skilled entry level clerical employees hired between 1969 and 1977.
+
+\subsection{Statistical summary and graphical display}
+
+We begin by reading the data and summarizing the variables.
+
+<<eval=TRUE>>=
+summary(case0102) # Display 1.3 Sleuth p4
+@
+
+<<eval=TRUE>>=
+favstats(Salary ~ Sex, data=case0102)
+bwplot(Salary ~ Sex, data=case0102)
+densityplot(~ Salary, groups=Sex, auto.key=TRUE, data=case0102)
+@
+
+The \Sexpr{nrow(subset(case0102, Sex=="Male"))} men have an average starting salary that is \$\Sexpr{round(diff(mean(Salary ~ Sex, data=case0102)), 1)} more than the \Sexpr{nrow(subset(case0102, Sex=="Female"))} women (\$\Sexpr{round(mean(Salary, data=subset(case0102, Sex=="Male")),0)} vs \$\Sexpr{round(mean(Salary, data=subset(case0102, Sex=="Female")),0)}).  Both distributions have similar spread (sd=\$\Sexpr{round(sd(Salary, subset(case0102, Sex=="Female")), 2)} for women and sd=\$\Sexpr{round(sd(Salary, subset(case0102, Sex=="Male")), 2)} for men) and are approximately normal (see density plot). The key difference between the groups is the shift (as indicated by the parallel boxplots).
+
+The following simulated histograms mimic the distribution shapes shown in Display 1.13:
+<<>>=
+histogram(rnorm(1000))  # Normal
+histogram(rexp(1000))   # Long-tailed 
+histogram(runif(1000))  # Short-tailed
+histogram(rchisq(1000, df=15)) # Skewed
+@
+
+\subsection{Inferential procedures (two-sample t-test)}
+
+The $t$-test on page 4 of Sleuth can be replicated using the following commands (note that the equal-variance t-test is specified by {\tt var.equal=TRUE} which is not the default).
+
+<<eval=TRUE>>=
+t.test(Salary ~ Sex, var.equal=TRUE, data=case0102)
+@
+
+\subsection{Permutation test}
+
+We undertake a permutation test to assess whether the differences in the center of these samples that we are observing are due to chance, if the distributions are actually equivalent back in the populations of male and female possible clerical hires.  We start by calculating our test statistic (the difference in means) for the observed data, then simulate from the null distribution (where the labels can be interchanged) and calculate our $p$-value. 
+
+<<permutetest,eval=TRUE>>=
+obsdiff = diff(mean(Salary ~ Sex, data=case0102)); obsdiff
+numsim = 1000
+res = do(numsim) * diff(mean(Salary~shuffle(Sex), data=case0102))
+densityplot(~ Male, data=res)
+confint(res)
+p = sum(abs(res$Male) >= abs(obsdiff))/numsim; p
+@
+
+Through the permutation test, we observe that the mean starting salary for males is estimated to be \$\Sexpr{round(confint(res)["lower"]+obsdiff, 2)} to \$\Sexpr{round(confint(res)["upper"]+obsdiff, 2)} larger than the mean starting salary for females (95\% confidence interval). We never see a permuted difference in means close to our observed value. Therefore, we reject the null hypothesis ($p<0.001$) and conclude that the salaries of the men are higher than those of the women.
+
+<<eval=TRUE>>=
+t.test(Salary ~ Sex, alternative="less", data=case0102) 
+@
+
+The $p$-value ($<0.001$) from the two-sample t-test shows that the large difference between estimated salaries for males and females is unlikely to be due to chance.
+
+
+\end{document}
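
Editorial note: the permutation tests in chapter01.Rnw rely on the mosaic helpers do() and shuffle(). As a rough illustration of the same idea, here is a minimal base-R sketch. The scores below are hypothetical values made up for this example (they are not the case0101 or case0102 data), and diff_means is a helper name introduced here, not part of any package.

```r
# Hypothetical scores for two groups of eight (NOT data from the Sleuth2 package).
set.seed(123)
score <- c(12.0, 14.5, 17.2, 18.5, 19.8, 20.5, 22.1, 24.0,   # group A
           17.5, 19.3, 21.6, 22.2, 23.1, 24.3, 26.7, 29.7)   # group B
group <- rep(c("A", "B"), each = 8)

# Test statistic: difference in group means (B minus A).
diff_means <- function(y, g) mean(y[g == "B"]) - mean(y[g == "A"])

# Observed statistic.
obs <- diff_means(score, group)

# Null distribution: recompute the statistic after shuffling the labels.
numsim <- 1000
null_dist <- replicate(numsim, diff_means(score, sample(group)))

# Two-sided permutation p-value: share of shuffles at least as extreme as obs.
p_two_sided <- mean(abs(null_dist) >= abs(obs))
print(c(observed = obs, p = p_two_sided))
```

sample(group) plays the role of shuffle(Treatment), and replicate() plays the role of do(numsim) * ...; with only 1000 shuffles the p-value is itself a simulation estimate and varies with the seed.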

Added: pkg/Sleuth2/inst/doc/NicholasHorton/chapter01.pdf
===================================================================
(Binary files differ)


Property changes on: pkg/Sleuth2/inst/doc/NicholasHorton/chapter01.pdf
___________________________________________________________________
Added: svn:mime-type
   + application/pdf

Added: pkg/Sleuth2/inst/doc/NicholasHorton/chapter02.Rnw
===================================================================
--- pkg/Sleuth2/inst/doc/NicholasHorton/chapter02.Rnw	                        (rev 0)
+++ pkg/Sleuth2/inst/doc/NicholasHorton/chapter02.Rnw	2012-09-16 08:58:09 UTC (rev 54)
@@ -0,0 +1,301 @@
+\documentclass[11pt]{article}
+
+\usepackage[margin=1in,bottom=.5in,includehead,includefoot]{geometry}
+\usepackage{hyperref}
+\usepackage{language}
+\usepackage{alltt}
+\usepackage{fancyhdr}
+\pagestyle{fancy}
+\fancyhf{}
+
+%% Now begin customising things. See the fancyhdr docs for more info.
+
+\chead{}
+\lhead[\sf \thepage]{\sf \leftmark}
+\rhead[\sf \leftmark]{\sf \thepage}
+\lfoot{}
+\cfoot{Statistical Sleuth in R: Chapter 2}
+\rfoot{}
+
+\newcounter{myenumi}
+\newcommand{\saveenumi}{\setcounter{myenumi}{\value{enumi}}}
+\newcommand{\reuseenumi}{\setcounter{enumi}{\value{myenumi}}}
+
+\pagestyle{fancy}
+
+\def\R{{\sf R}}
+\def\Rstudio{{\sf RStudio}}
+\def\RStudio{{\sf RStudio}}
+\def\term#1{\textbf{#1}}
+\def\tab#1{{\sf #1}}
+
+
+\usepackage{relsize}
+
+\newlength{\tempfmlength}
+\newsavebox{\fmbox}
+\newenvironment{fmpage}[1]
+     {
+   \medskip
+   \setlength{\tempfmlength}{#1}
+   \begin{lrbox}{\fmbox}
+     \begin{minipage}{#1}
+     \vspace*{.02\tempfmlength}
+		 \hfill
+	   \begin{minipage}{.95 \tempfmlength}}
+		 {\end{minipage}\hfill
+		 \vspace*{.015\tempfmlength}
+		 \end{minipage}\end{lrbox}\fbox{\usebox{\fmbox}}
+	 \medskip
+	 }
+
+
+\newenvironment{boxedText}[1][.98\textwidth]%
+{%
+\begin{center}
+\begin{fmpage}{#1}
+}%
+{%
+\end{fmpage}
+\end{center}
+}
+
+\newenvironment{boxedTable}[2][tbp]%
+{%
+\begin{table}[#1]
+  \refstepcounter{table}
+  \begin{center}
+\begin{fmpage}{.98\textwidth}
+  \begin{center}
+	\sf \large Box~\expandafter\thetable. #2
+\end{center}
+\medskip
+}%
+{%
+\end{fmpage}
+\end{center}
+\end{table}		% need to do something about exercises that follow boxedTable
+}
+
+
+\newcommand{\cran}{\href{http://www.R-project.org/}{CRAN}}
+
+\title{The Statistical Sleuth in R: \\
+Chapter 2}
+
+\author{
+Ruobing Zhang\and Kate Aloisio \and Nicholas J. Horton\thanks{Department of Mathematics and Statistics, Smith College, nhorton at smith.edu}
+} 
+
+\date{\today}
+
+\begin{document}
+
+
+\maketitle
+\tableofcontents
+
+%\parindent=0pt
+
+
+\SweaveOpts{
+  dev="pdf",
+	fig.path="figures/",
+	fig.height=3,
+	fig.width=4,
+	out.width=".47\\textwidth",
+	fig.keep="high",
+	fig.show="hold",
+	fig.align="center",
+	prompt=TRUE,  # show the prompts; but perhaps we should not do this 
+	comment=NA    # turn off commenting of output (but perhaps we should not do this either)
+	}
+
+<<pvalues, echo=FALSE, message=FALSE>>=
+print.pval = function(pval) {
+  threshold = 0.0001
+    return(ifelse(pval < threshold, paste("p<", sprintf("%.4f", threshold), sep=""),
+                ifelse(pval > 0.1, paste("p=",round(pval, 2), sep=""),
+                       paste("p=", round(pval, 3), sep=""))))
+}
+@
+
+<<setup,echo=FALSE,message=FALSE>>=
+require(Sleuth2)
+require(mosaic)
+trellis.par.set(theme=col.mosaic())  # get a better color scheme 
+set.seed(123)
+# this allows for code formatting inline.  Use \Sexpr{'function(x,y)'}, for example.
+knit_hooks$set(inline = function(x) {
+if (is.numeric(x)) return(knitr:::format_sci(x, 'latex'))
+x = as.character(x)
+h = knitr:::hilight_source(x, 'latex', list(prompt=FALSE, size='normalsize'))
+h = gsub("([_#$%&])", "\\\\\\1", h)
+h = gsub('(["\'])', '\\1{}', h)
+gsub('^\\\\begin\\{alltt\\}\\s*|\\\\end\\{alltt\\}\\s*$', '', h)
+})
+showOriginal=FALSE
+showNew=TRUE
+@ 
+
+\section{Introduction}
+
+This document is intended to help describe how to undertake analyses introduced as examples in the Second Edition of the \emph{Statistical Sleuth} (2002) by Fred Ramsey and Dan Schafer.
+More information about the book can be found at \url{http://www.proaxis.com/~panorama/home.htm}.
+This
+file as well as the associated \pkg{knitr} reproducible analysis source file can be found at
+\url{http://www.math.smith.edu/~nhorton/sleuth}.
+
+
+This work leverages initiatives undertaken by Project MOSAIC (\url{http://www.mosaic-web.org}), an NSF-funded effort to improve the teaching of statistics, calculus, science and computing in the undergraduate curriculum. In particular, we utilize the 
+\pkg{mosaic} package, which was written to simplify the use of R for introductory statistics courses. A short summary of the R needed to teach introductory statistics can be found in the mosaic package vignette (\url{http://cran.r-project.org/web/packages/mosaic/vignettes/MinimalR.pdf}). 
+
+To use a package within R, it must be installed (one time), and loaded (each session). The package can be installed using the following command:
+<<install_mosaic,eval=FALSE>>=
+install.packages('mosaic')               # note the quotation marks
+@
+Once this is installed, it can be loaded by running the command:
+<<load_mosaic,eval=FALSE>>=
+require(mosaic)
+@
+This
+needs to be done once per session.
+
+In addition the data files for the \emph{Sleuth} case studies can be accessed by installing the \pkg{Sleuth2} package.
+<<install_Sleuth2,eval=FALSE>>=
+install.packages('Sleuth2')               # note the quotation marks
+@
+<<load_Sleuth2,eval=FALSE>>=
+require(Sleuth2)
+@
+
+We also set some options to improve legibility of graphs and output.
+<<eval=TRUE>>=
+trellis.par.set(theme=col.mosaic())  # get a better color scheme for lattice
+options(digits=3, show.signif.stars=FALSE)
+@
+
+The specific goal of this document is to demonstrate how to calculate the quantities described in Chapter 2: Inference Using \emph{t}-Distributions using R.
+
+\section{Bumpus's Data on Natural Selection}
+
+Is humerus length related to whether the bird would survive or perish? That's the question being addressed by Case Study 2.1 in the \emph{Sleuth}.
+
+\subsection{Statistical summary and graphical display}
+We begin by reading the data and summarizing the variables.
+
+<<>>=
+summary(case0201)
+fav=favstats(Humerus ~ Status, data=case0201); fav
+@
+
+A total of \Sexpr{nrow(case0201)} subjects are included in the data: \Sexpr{nrow(subset(case0201, Status=="Survived"))} are adult male sparrows that survived and \Sexpr{nrow(subset(case0201, Status=="Perished"))} that perished. 
+The following figure replicates Display 2.1 on page 29.
+
+<<>>=
+bwplot(Status ~ Humerus, data=case0201)
+@
+
+<<>>=
+densityplot(~ Humerus, groups=Status, auto.key=TRUE, data=case0201)
+@
+
+Both distributions are approximately normally distributed.
+
+\subsection{Inferential procedures (two-sample t-test)}
+
+First, we calculate the pooled SD and the standard error of the difference between the two sample averages (page 40, Display 2.8).
+<<>>=
+# Calculate Pooled SD
+n1 = fav["Perished", "n"]; n1
+n2 = fav["Survived", "n"]; n2
+s1 = fav["Perished", "sd"]; s1
+s2 = fav["Survived", "sd"]; s2
+Sp = sqrt(((n1-1)*(s1)^2+(n2-1)*(s2)^2)/(n1+n2-2)); Sp
+# Calculate standard error
+SE = Sp*sqrt(1/n1+1/n2); SE
+@
+
+So the pooled SD is \Sexpr{round(Sp, 2)} and the standard error is \Sexpr{round(SE, 1)}.
+
+Based on this information, we can construct a 95\% confidence interval (page 41, Display 2.9).
+
+<<>>=
+Y1 = fav["Perished", "mean"]; Y1
+Y2 = fav["Survived", "mean"]; Y2
+Yd = Y2-Y1; Yd
+df = n1+n2-2; df
+qt = qt(0.975, df); qt
+hw = qt*SE; hw
+lower = Yd-hw; lower
+upper = Yd+hw; upper
+@
+
+So the 95\% confidence interval for the difference between means is (\Sexpr{round(lower, 1)}, \Sexpr{round(upper, 1)}).
+
+Now we want to calculate the $t$-statistic and $p$-value (as shown on page 44, Display 2.10).
+<<>>=
+tstats = (Yd-0)/SE; tstats      # The hypothesis difference=0
+onepval = 1-pt(tstats, df); onepval
+twopval = 2*onepval; twopval
+@
+
+The one-sided $p$-value is \Sexpr{round(onepval, 2)} and the two-sided $p$-value is \Sexpr{round(twopval, 2)}.
+
+We can get the results of ``Summary of Statistical Findings'' (page 29) by using the following code:
+<<>>=
+t.test(Humerus ~ Status, var.equal=TRUE, data=case0201)
+confint(lm(Humerus ~ Status, data=case0201))
+@
+
+\section{Anatomical Abnormalities Associated with Schizophrenia}
+
+Is the area of the left hippocampus related to the development of schizophrenia? That's the question being addressed by Case Study 2.2 in the \emph{Sleuth}.
+
+\subsection{Statistical summary and graphical display}
+We begin by reading the data and summarizing the variables.
+
+<<>>=
+summary(case0202)
+@
+
+A total of \Sexpr{nrow(case0202)} pairs of twins are included in the data; in each pair, one twin has schizophrenia and the other does not. So there are \Sexpr{nrow(case0202["Affected"])} affected subjects and \Sexpr{nrow(case0202["Unaffect"])} unaffected subjects.
+
+The difference in area of left hippocampus of these pairs of twins is:
+<<>>=
+DIFF = case0202[, "Unaffect"]-case0202[, "Affected"]
+favstats(DIFF)
+@
+
+This matches the results on page 30, Display 2.2.
+
+<<>>=
+densityplot(DIFF)
+@
+
+\subsection{Inferential procedures (paired t-test)}
+
+We want to carry out the paired $t$-test and calculate a 95\% confidence interval.
+
+<<>>=
+# Calculate t-statistics
+difmean = favstats(DIFF)[, "mean"]; difmean
+difsd = favstats(DIFF)[, "sd"]; difsd
+difSE = difsd/sqrt(15); difSE
+tscore = (difmean-0)/difSE; tscore         # hypothesis difference=0
+twopvalue = 2*(1-pt(tscore, 15-1)); twopvalue
+# Construct confidence interval
+q = qt(0.975, 15-1); q
+schizolower = difmean-q*difSE; schizolower
+schizoupper = difmean+q*difSE; schizoupper
+@
+
+So the two-sided $p$-value is \Sexpr{round(twopvalue, 3)} and the 95\% confidence interval is (\Sexpr{round(schizolower, 2)}, \Sexpr{round(schizoupper, 2)}).
+
+Or we can get the results displayed on page 31 by conducting a paired $t$-test.
+
+<<>>=
+t.test(case0202[, "Unaffect"], case0202[, "Affected"], paired=TRUE)
+@
+
+\end{document}
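
Editorial note: chapter02.Rnw computes the pooled SD, standard error, confidence interval and t-statistic by hand and then compares them with t.test(). The base-R sketch below repeats that check on hypothetical measurements (the vectors x1 and x2 are made up for this example and are not the case0201 data), using the same formulas as the vignette.

```r
# Hypothetical measurements for two groups (NOT data from the Sleuth2 package).
x1 <- c(0.72, 0.73, 0.71, 0.74, 0.70, 0.75)
x2 <- c(0.74, 0.75, 0.73, 0.76, 0.74, 0.77, 0.72)
n1 <- length(x1)
n2 <- length(x2)

# Pooled SD: weighted average of the two sample variances, then square root.
sp <- sqrt(((n1 - 1) * var(x1) + (n2 - 1) * var(x2)) / (n1 + n2 - 2))

# Standard error of the difference between the two sample means.
se <- sp * sqrt(1 / n1 + 1 / n2)

# 95% confidence interval and t-statistic for mean(x2) - mean(x1).
df <- n1 + n2 - 2
est <- mean(x2) - mean(x1)
ci <- est + c(-1, 1) * qt(0.975, df) * se
tstat <- est / se

# Same quantities from the built-in equal-variance two-sample t-test.
tt <- t.test(x2, x1, var.equal = TRUE)
print(rbind(by_hand = ci, t_test = as.numeric(tt$conf.int)))
```

Using df in the qt() call, rather than a hard-coded value, keeps the interval correct if the group sizes change.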

Added: pkg/Sleuth2/inst/doc/NicholasHorton/chapter02.pdf
===================================================================
(Binary files differ)


Property changes on: pkg/Sleuth2/inst/doc/NicholasHorton/chapter02.pdf
___________________________________________________________________
Added: svn:mime-type
   + application/pdf

Added: pkg/Sleuth2/inst/doc/NicholasHorton/chapter03.Rnw
===================================================================
--- pkg/Sleuth2/inst/doc/NicholasHorton/chapter03.Rnw	                        (rev 0)
+++ pkg/Sleuth2/inst/doc/NicholasHorton/chapter03.Rnw	2012-09-16 08:58:09 UTC (rev 54)
@@ -0,0 +1,315 @@
+\documentclass[11pt]{article}
+
+\usepackage[margin=1in,bottom=.5in,includehead,includefoot]{geometry}
+\usepackage{hyperref}
+\usepackage{language}
+\usepackage{alltt}
+\usepackage{fancyhdr}
+\pagestyle{fancy}
+\fancyhf{}
+
+%% Now begin customising things. See the fancyhdr docs for more info.
+
+\chead{}
+\lhead[\sf \thepage]{\sf \leftmark}
+\rhead[\sf \leftmark]{\sf \thepage}
+\lfoot{}
+\cfoot{Statistical Sleuth in R: Chapter 3}
+\rfoot{}
+
+\newcounter{myenumi}
+\newcommand{\saveenumi}{\setcounter{myenumi}{\value{enumi}}}
+\newcommand{\reuseenumi}{\setcounter{enumi}{\value{myenumi}}}
+
+\pagestyle{fancy}
+
+\def\R{{\sf R}}
+\def\Rstudio{{\sf RStudio}}
+\def\RStudio{{\sf RStudio}}
+\def\term#1{\textbf{#1}}
+\def\tab#1{{\sf #1}}
+
+
+\usepackage{relsize}
+
+\newlength{\tempfmlength}
+\newsavebox{\fmbox}
+\newenvironment{fmpage}[1]
+     {
+   \medskip
+   \setlength{\tempfmlength}{#1}
+   \begin{lrbox}{\fmbox}
+     \begin{minipage}{#1}
+     \vspace*{.02\tempfmlength}
+		 \hfill
+	   \begin{minipage}{.95 \tempfmlength}}
+		 {\end{minipage}\hfill
+		 \vspace*{.015\tempfmlength}
+		 \end{minipage}\end{lrbox}\fbox{\usebox{\fmbox}}
+	 \medskip
+	 }
+
+
+\newenvironment{boxedText}[1][.98\textwidth]%
+{%
+\begin{center}
+\begin{fmpage}{#1}
+}%
+{%
+\end{fmpage}
+\end{center}
+}
+
+\newenvironment{boxedTable}[2][tbp]%
+{%
+\begin{table}[#1]
+  \refstepcounter{table}
+  \begin{center}
+\begin{fmpage}{.98\textwidth}
+  \begin{center}
+	\sf \large Box~\expandafter\thetable. #2
+\end{center}
+\medskip
+}%
+{%
+\end{fmpage}
+\end{center}
+\end{table}		% need to do something about exercises that follow boxedTable
+}
+
+
+\newcommand{\cran}{\href{http://www.R-project.org/}{CRAN}}
+
+\title{The Statistical Sleuth in R: \\
+Chapter 3}
+
+\author{
+Ruobing Zhang \and Kate Aloisio \and Nicholas J. Horton\thanks{Department of Mathematics and Statistics, Smith College, nhorton at smith.edu}
+} 
+
+\date{\today}
+
+\begin{document}
+
+
+\maketitle
+\tableofcontents
+
+%\parindent=0pt
+
+
+\SweaveOpts{
+  dev="pdf",
+	fig.path="figures/",
+	fig.height=3,
+	fig.width=4,
+	out.width=".47\\textwidth",
+	fig.keep="high",
+	fig.show="hold",
+	fig.align="center",
+	prompt=TRUE,  # show the prompts; but perhaps we should not do this 
+	comment=NA    # turn off commenting of output (but perhaps we should not do this either)
+	}
+
+<<pvalues, echo=FALSE, message=FALSE>>=
+print.pval = function(pval) {
+  threshold = 0.0001
+    return(ifelse(pval < threshold, paste("p<", sprintf("%.4f", threshold), sep=""),
+                ifelse(pval > 0.1, paste("p=",round(pval, 2), sep=""),
+                       paste("p=", round(pval, 3), sep=""))))
+}
+@
+
+<<setup,echo=FALSE,message=FALSE>>=
+require(Sleuth2)
+require(mosaic)
+trellis.par.set(theme=col.mosaic())  # get a better color scheme 
+set.seed(123)
+# this allows for code formatting inline.  Use \Sexpr{'function(x,y)'}, for example.
+knit_hooks$set(inline = function(x){
+if (is.numeric(x)) return(knitr:::format_sci(x, 'latex'))
+x = as.character(x)
+h = knitr:::hilight_source(x, 'latex', list(prompt=FALSE, size='normalsize'))
+h = gsub("([_#$%&])", "\\\\\\1", h)
+h = gsub('(["\'])', '\\1{}', h)
+gsub('^\\\\begin\\{alltt\\}\\s*|\\\\end\\{alltt\\}\\s*$', '', h)
+})
+showOriginal=FALSE
+showNew=TRUE
+@ 
+
+\section{Introduction}
+
+This document describes how to carry out the analyses introduced as examples in the Second Edition of the \emph{Statistical Sleuth} (2002) by Fred Ramsey and Dan Schafer.
+More information about the book can be found at \url{http://www.proaxis.com/~panorama/home.htm}.
+This file, as well as the associated \pkg{knitr} reproducible analysis source file, can be found at
+\url{http://www.math.smith.edu/~nhorton/sleuth}.
+
+
+This work leverages initiatives undertaken by Project MOSAIC (\url{http://www.mosaic-web.org}), an NSF-funded effort to improve the teaching of statistics, calculus, science and computing in the undergraduate curriculum. In particular, we utilize the 
+\pkg{mosaic} package, which was written to simplify the use of R for introductory statistics courses. A short summary of the R needed to teach introductory statistics can be found in the mosaic package vignette (\url{http://cran.r-project.org/web/packages/mosaic/vignettes/MinimalR.pdf}). 
+
+To use a package within R, it must be installed (one time), and loaded (each session). The package can be installed using the following command:
+<<install_mosaic,eval=FALSE>>=
+install.packages('mosaic')               # note the quotation marks
+@
+Once this is installed, it can be loaded by running the command:
+<<load_mosaic,eval=FALSE>>=
+require(mosaic)
+@
+This needs to be done once per session.
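+
+For scripts that are run repeatedly, the two steps can be combined so that installation happens only when the package is missing. The following is a minimal sketch rather than part of the original analysis; it assumes an internet connection and a configured CRAN mirror:
+<<install_if_missing,eval=FALSE>>=
+# install mosaic only if it is not already available, then load it
+if (!requireNamespace("mosaic", quietly=TRUE)) {
+  install.packages('mosaic')
+}
+require(mosaic)
+@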
+
+In addition, the data files for the \emph{Sleuth} case studies can be accessed by installing the \pkg{Sleuth2} package.
+<<install_Sleuth2,eval=FALSE>>=
+install.packages('Sleuth2')               # note the quotation marks
+@
+<<load_Sleuth2,eval=FALSE>>=
+require(Sleuth2)
+@
+
+We also set some options to improve legibility of graphs and output.
+<<eval=TRUE>>=
+trellis.par.set(theme=col.mosaic())  # get a better color scheme for lattice
+options(digits=3, show.signif.stars=FALSE)
+@
+
+The specific goal of this document is to demonstrate how to use R to calculate the quantities described in \emph{Sleuth} Chapter 3: A Closer Look at Assumptions.
+
+\section{Cloud Seeding to Increase Rainfall}
+
+Does seeding clouds lead to more rainfall? This is the question addressed by case study 3.1 in the \emph{Sleuth}.
+
+\subsection{Summary statistics and graphical displays (untransformed)}
+
+We begin by reading the data and summarizing the variables.
+
+<<>>=
+summary(case0301)
+favstats(Rainfall ~ Treatment, data=case0301)
+@
+
+The data include a total of \Sexpr{nrow(case0301)} days: \Sexpr{nrow(subset(case0301, Treatment=="Seeded"))} seeded days and \Sexpr{nrow(subset(case0301, Treatment=="Unseeded"))}
+unseeded days (Display 3.1, page 57).
+
+<<fig.height=8, fig.width=8>>=
+bwplot(Rainfall ~ Treatment, data=case0301)
+@
+
+<<fig.height=8, fig.width=8>>=
+densityplot(~Rainfall, groups=Treatment, auto.key=TRUE, data=case0301)
+@
+
+According to the boxplot and the density plot, the rainfall on seeded days seems to be larger than that on unseeded days. Both density curves are highly skewed to the right.
+
+\subsection{Summary statistics and graphical display (transformed)}
+
+The skewness suggests applying a logarithmic transformation. The transformed data are shown on page 71 (Display 3.9).
+
+<<>>=
+case0301$lograin=log(case0301$Rainfall)
+favstats(lograin ~ Treatment, data=case0301)
+@
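+
+One reason the log scale is convenient is that a difference in the means of the logged rainfall back-transforms to a ratio on the original scale. As a rough sketch (not taken from the original analysis; the names \texttt{logmeans} and \texttt{ratio} are introduced here purely for illustration), the estimated multiplicative effect of seeding could be computed as:
+<<backtransform_sketch>>=
+require(Sleuth2)
+# mean of log(Rainfall) within each treatment group, indexed by level name
+logmeans = with(case0301, tapply(log(Rainfall), Treatment, mean))
+# exponentiating the difference in log means gives the ratio of
+# geometric means (seeded relative to unseeded)
+ratio = exp(logmeans[["Seeded"]] - logmeans[["Unseeded"]])
+ratio
+@
+Indexing the group means by level name, rather than by position, avoids relying on the order of the factor levels of \texttt{Treatment}.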
+
+<<fig.height=8, fig.width=8>>=
[TRUNCATED]

To get the complete diff run:
    svnlook diff /svnroot/sleuth2 -r 54

