[Sleuth2-commits] r64 - / pkg/Sleuth3/vignettes

Wed Jun 15 03:47:38 CEST 2016

Author: rpruim
Date: 2016-06-15 03:47:37 +0200 (Wed, 15 Jun 2016)
New Revision: 64

Modified:
   /
   pkg/Sleuth3/vignettes/chapter01-HortonMosaic.Rnw
   pkg/Sleuth3/vignettes/chapter02-HortonMosaic.Rnw
   pkg/Sleuth3/vignettes/chapter05-HortonMosaic.Rnw
Log:
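Update vignettes for compatibility with the new version of the mosaic package: sd() and mean() calls that take a data= argument now use one-sided formulas (e.g. sd(~Score, data=...)). Also strip trailing whitespace.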
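
The substantive change in this revision is the switch from bare variable names to one-sided formulas in calls such as sd() and mean(). The sketch below illustrates that pattern on case0101. It assumes that the mosaic and Sleuth3 packages are installed and that the installed mosaic version accepts the formula form. The names extrinsic, sd_formula, and sd_base are illustrative and do not appear in the vignettes.

```r
library(mosaic)   # assumed: a mosaic release that supports the formula interface
library(Sleuth3)  # provides the case0101 data set

# Illustrative subset; the vignette builds it inline inside \Sexpr{}
extrinsic <- subset(case0101, Treatment == "Extrinsic")

# Form used before this revision (bare variable name plus data=):
#   sd(Score, data = extrinsic)
# Form used after this revision (one-sided formula plus data=):
sd_formula <- sd(~ Score, data = extrinsic)

# Base R equivalent that does not depend on the mosaic version
sd_base <- with(extrinsic, sd(Score))

# Two-sided formulas (response ~ group) are left unchanged by this revision
mean(Score ~ Treatment, data = case0101)
```

Both sd_formula and sd_base should describe the same standard deviation, so comparing them is a quick check that the rewritten calls behave as intended.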


Property changes on: 
___________________________________________________________________
Added: svn:ignore
   + .Rproj.user
.Rhistory
.RData
.Ruserdata


Modified: pkg/Sleuth3/vignettes/chapter01-HortonMosaic.Rnw
===================================================================

--- pkg/Sleuth3/vignettes/chapter01-HortonMosaic.Rnw	2016-01-02 09:13:13 UTC (rev 63)
+++ pkg/Sleuth3/vignettes/chapter01-HortonMosaic.Rnw	2016-06-15 01:47:37 UTC (rev 64)
@@ -88,7 +88,7 @@
 
 \author{
 Linda Loi \and Ruobing Zhang \and Kate Aloisio \and Nicholas J. Horton\thanks{Department of Mathematics and Statistics, Smith College, nhorton at smith.edu}
-} 
+}
 
 \date{\today}
 
@@ -112,7 +112,7 @@
         fig.keep="high",
         fig.show="hold",
         fig.align="center",
-        prompt=TRUE,  # show the prompts; but perhaps we should not do this 
+        prompt=TRUE,  # show the prompts; but perhaps we should not do this
         comment=NA    # turn off commenting of output (but perhaps we should not do this either)
   )
 @
@@ -142,7 +142,7 @@
 })
 showOriginal=FALSE
 showNew=TRUE
-@ 
+@
 
 \section{Introduction}
 
@@ -151,7 +151,7 @@
 file as well as the associated \pkg{knitr} reproducible analysis source file can be found at
 \url{http://www.math.smith.edu/~nhorton/sleuth3}.
 
-This work leverages initiatives undertaken by Project MOSAIC (\url{http://www.mosaic-web.org}), an NSF-funded effort to improve the teaching of statistics, calculus, science and computing in the undergraduate curriculum. In particular, we utilize the 
+This work leverages initiatives undertaken by Project MOSAIC (\url{http://www.mosaic-web.org}), an NSF-funded effort to improve the teaching of statistics, calculus, science and computing in the undergraduate curriculum. In particular, we utilize the
 \pkg{mosaic} package, which was written to simplify the use of R for introductory statistics courses. A short summary of the R needed to teach introductory statistics can be found in the mosaic package vignette (\url{http://cran.r-project.org/web/packages/mosaic/vignettes/MinimalR.pdf}).
 
 To use a package within R, it must be installed (one time), and loaded (each session). The package can be installed using the following command:
@@ -183,7 +183,7 @@
 
 \section{Motivation and Creativity}
 
-For Case Study 1: Motivation and Creativity, the following questions are posed: Do grading systems promote creativity in students? Do ranking systems and incentive awards increase productivity among employees? Do rewards and praise stimulate children to learn? 
+For Case Study 1: Motivation and Creativity, the following questions are posed: Do grading systems promote creativity in students? Do ranking systems and incentive awards increase productivity among employees? Do rewards and praise stimulate children to learn?
 
 The data for Case Study 1 was collected by psychologist Teresa Amabile in an experiment concerning the effects of intrinsic and extrinsic motivation on creativity (page 2 of the \emph{Sleuth}).
 
@@ -209,14 +209,14 @@
 maggregate(Score ~ Treatment, data=case0101, FUN=stem)
 @
 
-The extrinsic group (n=\Sexpr{nrow(subset(case0101, Treatment=="Extrinsic"))}) has an average creativity score that is \Sexpr{round(diff(mean(Score ~ Treatment, data=case0101)), 1)} points less than the intrinsic group (n=\Sexpr{nrow(subset(case0101, Treatment=="Intrinsic"))}). The extrinsic group has relatively larger spread than the intrinsic group (sd=\Sexpr{round(sd(Score, data=subset(case0101, Treatment=="Extrinsic")),2)} for extrinsic group and sd=\Sexpr{round(sd(Score, data=subset(case0101, Treatment=="Intrinsic")), 2)} for intrinsic group). Both distributions are approximately normally distributed.
+The extrinsic group (n=\Sexpr{nrow(subset(case0101, Treatment=="Extrinsic"))}) has an average creativity score that is \Sexpr{round(diff(mean(Score ~ Treatment, data=case0101)), 1)} points less than the intrinsic group (n=\Sexpr{nrow(subset(case0101, Treatment=="Intrinsic"))}). The extrinsic group has relatively larger spread than the intrinsic group (sd=\Sexpr{round(sd(~Score, data=subset(case0101, Treatment=="Extrinsic")),2)} for extrinsic group and sd=\Sexpr{round(sd(~Score, data=subset(case0101, Treatment=="Intrinsic")), 2)} for intrinsic group). Both distributions are approximately normally distributed.
 
 \subsection{Inferential procedures (two-sample t-test)}
- 
+
 <<eval=TRUE>>=
 t.test(Score ~ Treatment, alternative="two.sided", data=case0101)
 @
-The two-sample $t$-test shows strong evidence that a subject would receive a lower creativity score for a poem written after the extrinsic motivation questionnaire than for one written after the intrinsic motivation questionnaire. The two-sided $p$-value is \Sexpr{pval(t.test(Score~Treatment, alternative="two.sided", data=case0101), digits=4)}, which is small enough to reject the null hypothesis. 
+The two-sample $t$-test shows strong evidence that a subject would receive a lower creativity score for a poem written after the extrinsic motivation questionnaire than for one written after the intrinsic motivation questionnaire. The two-sided $p$-value is \Sexpr{pval(t.test(Score~Treatment, alternative="two.sided", data=case0101), digits=4)}, which is small enough to reject the null hypothesis.
 
 Thus, we can conclude that there is a difference between the population mean in the extrinsic group and the population mean in the intrinsic group; the estimated difference between these two scores is \Sexpr{round(diff(mean(Score ~ Treatment, data=case0101)), 1)} points on the 0-40 point scale. A 95\% confidence interval for the decrease in score due to having extrinsic motivation rather than intrinsic motivation is between \Sexpr{round(t.test(Score~Treatment, alternative="two.sided", data=case0101)$conf.int[2], 2)} and  \Sexpr{round(t.test(Score~Treatment, alternative="two.sided", data=case0101)$conf.int[1], 2)} points (\emph{Sleuth}, page 3).
 
@@ -236,7 +236,7 @@
 nulldist = do(numsim)*diff(mean(Score~shuffle(Treatment), data=case0101))
 confint(nulldist)
 # Display 1.8 Sleuth
-histogram(~ Intrinsic, nint=50, data=nulldist, v=c(-4.14,4.14)) 
+histogram(~ Intrinsic, nint=50, data=nulldist, v=c(-4.14,4.14))
 @
 
 As described in the \emph{Sleuth} on page 12, if the group assignment changes, we will get different results. First, the test statistics will be just as likely to be negative as positive. Second, the majority of values fall in the range from -3.0 to +3.0. Third, only a few of the 1,000 randomizations produced test statistics as large as 4.14. This last point indicates that 4.14 is a value corresponding to an unusually uneven randomization outcome, if the null hypothesis is correct.
@@ -259,12 +259,12 @@
 densityplot(~ Salary, groups=Sex, auto.key=TRUE, data=case0102)
 @
 
-The \Sexpr{nrow(subset(case0102, Sex=="Male"))} men have an average starting salary that is \$\Sexpr{round(diff(mean(Salary ~ Sex, data=case0102)), 1)} more than the \Sexpr{nrow(subset(case0102, Sex=="Female"))} women (\$\Sexpr{round(mean(Salary, data=subset(case0102, Sex=="Male")),0)} vs \$\Sexpr{round(mean(Salary, data=subset(case0102, Sex=="Female")),0)}).  Both distributions have similar spread (sd=\$\Sexpr{round(sd(Salary, data=subset(case0102, Sex=="Female")), 2)} for women and sd=\$\Sexpr{round(sd(Salary, data=subset(case0102, Sex=="Male")), 2)} for men) and distributions that are approximately normally distributed (see density plot). The key difference between the groups is the shift (as indicated by the parallel boxplots).
+The \Sexpr{nrow(subset(case0102, Sex=="Male"))} men have an average starting salary that is \$\Sexpr{round(diff(mean(Salary ~ Sex, data=case0102)), 1)} more than the \Sexpr{nrow(subset(case0102, Sex=="Female"))} women (\$\Sexpr{round(mean(~Salary, data=subset(case0102, Sex=="Male")),0)} vs \$\Sexpr{round(mean(~Salary, data=subset(case0102, Sex=="Female")),0)}).  Both distributions have similar spread (sd=\$\Sexpr{round(sd(~Salary, data=subset(case0102, Sex=="Female")), 2)} for women and sd=\$\Sexpr{round(sd(~Salary, data=subset(case0102, Sex=="Male")), 2)} for men) and distributions that are approximately normally distributed (see density plot). The key difference between the groups is the shift (as indicated by the parallel boxplots).
 
 To show Display 1.13
 <<>>=
 histogram(rnorm(1000))  # Normal
-histogram(rexp(1000))   # Long-tailed 
+histogram(rexp(1000))   # Long-tailed
 histogram(runif(1000))  # Short-tailed
 histogram(rchisq(1000, df=15)) # Skewed
 @
@@ -279,13 +279,13 @@
 
 \subsection{Permutation test}
 
-We undertake a permutation test to assess whether the differences in the center of these samples that we are observing are due to chance, if the distributions are actually equivalent back in the populations of male and female possible clerical hires.  We start by calculating our test statistic (the difference in means) for the observed data, then simulate from the null distribution (where the labels can be interchanged) and calculate our $p$-value. 
+We undertake a permutation test to assess whether the differences in the center of these samples that we are observing are due to chance, if the distributions are actually equivalent back in the populations of male and female possible clerical hires.  We start by calculating our test statistic (the difference in means) for the observed data, then simulate from the null distribution (where the labels can be interchanged) and calculate our $p$-value.
 
 <<obsdiff,eval=TRUE>>=
 obsdiff = diff(mean(Salary ~ Sex, data=case0102)); obsdiff
 @
 
-The labeling for the difference in means isn't ideal (but will be given 
+The labeling for the difference in means isn't ideal (but will be given
 as ``Male'' by R).
 <<permutetest>>=
 numsim = 1000
@@ -304,7 +304,7 @@
 Through the permutation test, we observe that the mean starting salary for males is significantly larger than the mean starting salary for females, as we never see a permuted difference in means close to our observed value. Therefore, we reject the null hypothesis ($p<0.001$) and conclude that the salaries of the men are higher than those of the women back in the population.
 
 <<eval=TRUE>>=
-t.test(Salary ~ Sex, alternative="less", data=case0102) 
+t.test(Salary ~ Sex, alternative="less", data=case0102)
 @
 
 The $p$-value ($<0.001$) from the two-sample t-test shows that the large difference between estimated salaries for males and females is unlikely to be due to chance.
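
The permutation test described in the chapter 1 hunks above can also be sketched in base R, without mosaic's do() and shuffle(). This is an illustrative alternative, not the vignette's own chunk: the helper mean_diff and the variable pvalue are hypothetical names, and the seed is arbitrary. It assumes the Sleuth3 package is installed.

```r
library(Sleuth3)  # provides case0102 (columns Salary and Sex)

set.seed(123)     # arbitrary seed, used only for reproducibility
numsim <- 1000    # number of label shuffles

# Hypothetical helper: difference in mean salary, Male minus Female
mean_diff <- function(salary, sex) {
  mean(salary[sex == "Male"]) - mean(salary[sex == "Female"])
}

# Test statistic for the observed group labels
obsdiff <- mean_diff(case0102$Salary, case0102$Sex)

# Null distribution: recompute the statistic after shuffling the labels
nulldist <- replicate(numsim, mean_diff(case0102$Salary, sample(case0102$Sex)))

# One-sided p-value: share of shuffles with a difference at least as large
pvalue <- mean(nulldist >= obsdiff)
pvalue
```

With only 1,000 shuffles the estimated p-value is resolved to steps of 0.001, so an estimate of zero is best reported as p < 0.001, as the vignette text does.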

Modified: pkg/Sleuth3/vignettes/chapter02-HortonMosaic.Rnw
===================================================================
--- pkg/Sleuth3/vignettes/chapter02-HortonMosaic.Rnw	2016-01-02 09:13:13 UTC (rev 63)
+++ pkg/Sleuth3/vignettes/chapter02-HortonMosaic.Rnw	2016-06-15 01:47:37 UTC (rev 64)
@@ -88,7 +88,7 @@
 
 \author{
 Linda Loi \and Ruobing Zhang\and Kate Aloisio \and Nicholas J. Horton\thanks{Department of Mathematics and Statistics, Smith College, nhorton at smith.edu}
-} 
+}
 
 \date{\today}
 
@@ -130,7 +130,7 @@
 <<setup2,echo=FALSE,message=FALSE>>=
 require(Sleuth3)
 require(mosaic)
-trellis.par.set(theme=col.mosaic())  # get a better color scheme 
+trellis.par.set(theme=col.mosaic())  # get a better color scheme
 set.seed(123)
 # this allows for code formatting inline.  Use \Sexpr{'function(x,y)'}, for example.
 knit_hooks$set(inline = function(x) {
@@ -143,7 +143,7 @@
 })
 showOriginal=FALSE
 showNew=TRUE
-@ 
+@
 
 \section{Introduction}
 
@@ -154,8 +154,8 @@
 \url{http://www.math.smith.edu/~nhorton/sleuth3}.
 
 
-This work leverages initiatives undertaken by Project MOSAIC (\url{http://www.mosaic-web.org}), an NSF-funded effort to improve the teaching of statistics, calculus, science and computing in the undergraduate curriculum. In particular, we utilize the 
-\pkg{mosaic} package, which was written to simplify the use of R for introductory statistics courses. A short summary of the R needed to teach introductory statistics can be found in the mosaic package vignette (\url{http://cran.r-project.org/web/packages/mosaic/vignettes/MinimalR.pdf}). 
+This work leverages initiatives undertaken by Project MOSAIC (\url{http://www.mosaic-web.org}), an NSF-funded effort to improve the teaching of statistics, calculus, science and computing in the undergraduate curriculum. In particular, we utilize the
+\pkg{mosaic} package, which was written to simplify the use of R for introductory statistics courses. A short summary of the R needed to teach introductory statistics can be found in the mosaic package vignette (\url{http://cran.r-project.org/web/packages/mosaic/vignettes/MinimalR.pdf}).
 
 To use a package within R, it must be installed (one time), and loaded (each session). The package can be installed using the following command:
 <<install_mosaic,eval=FALSE>>=
@@ -196,7 +196,7 @@
 fav = favstats(Depth ~ Year, data=case0201); fav
 @
 
-A total of \Sexpr{nrow(case0201)} subjects are included in the data: \Sexpr{nrow(subset(case0201, Year=="1976"))} are finches that were caught in 1976 and \Sexpr{nrow(subset(case0201, Year=="1978"))} are finches that were caught in 1978. 
+A total of \Sexpr{nrow(case0201)} subjects are included in the data: \Sexpr{nrow(subset(case0201, Year=="1976"))} are finches that were caught in 1976 and \Sexpr{nrow(subset(case0201, Year=="1978"))} are finches that were caught in 1978.
 The following figure replicates Display 2.1 on page 30.
 
 <<>>=
@@ -249,7 +249,7 @@
 
 The one-sided $p$-value is approximately \Sexpr{round(onepval, 2)} and the two-sided $p$-value is also approximately \Sexpr{round(twopval, 2)}.
 
-We can get the results of ``Summary of Statistical Findings" (page 29) by using the following code: 
+We can get the results of ``Summary of Statistical Findings" (page 29) by using the following code:
 <<>>=
 t.test(Depth ~ Year, var.equal=TRUE, data=case0201)
 confint(lm(Depth ~ Year, data=case0201))

Modified: pkg/Sleuth3/vignettes/chapter05-HortonMosaic.Rnw
===================================================================
--- pkg/Sleuth3/vignettes/chapter05-HortonMosaic.Rnw	2016-01-02 09:13:13 UTC (rev 63)
+++ pkg/Sleuth3/vignettes/chapter05-HortonMosaic.Rnw	2016-06-15 01:47:37 UTC (rev 64)
@@ -88,7 +88,7 @@
 
 \author{
 Linda Loi \and Kate Aloisio \and Ruobing Zhang \and Nicholas J. Horton\thanks{Department of Mathematics and Statistics, Smith College, nhorton at smith.edu}
-} 
+}
 
 \date{\today}
 
@@ -117,7 +117,7 @@
   )
 @
 
-  
+
 <<pvalues, echo=FALSE, message=FALSE>>=
 print.pval = function(pval) {
   threshold = 0.0001
@@ -143,7 +143,7 @@
 })
 showOriginal=FALSE
 showNew=TRUE
-@ 
+@
 
 \section{Introduction}
 
@@ -154,8 +154,8 @@
 \url{http://www.math.smith.edu/~nhorton/sleuth3}.
 
 
-This work leverages initiatives undertaken by Project MOSAIC (\url{http://www.mosaic-web.org}), an NSF-funded effort to improve the teaching of statistics, calculus, science and computing in the undergraduate curriculum. In particular, we utilize the 
-\pkg{mosaic} package, which was written to simplify the use of R for introductory statistics courses. A short summary of the R needed to teach introductory statistics can be found in the mosaic package vignette (\url{http://cran.r-project.org/web/packages/mosaic/vignettes/MinimalR.pdf}). 
+This work leverages initiatives undertaken by Project MOSAIC (\url{http://www.mosaic-web.org}), an NSF-funded effort to improve the teaching of statistics, calculus, science and computing in the undergraduate curriculum. In particular, we utilize the
+\pkg{mosaic} package, which was written to simplify the use of R for introductory statistics courses. A short summary of the R needed to teach introductory statistics can be found in the mosaic package vignette (\url{http://cran.r-project.org/web/packages/mosaic/vignettes/MinimalR.pdf}).
 
 To use a package within R, it must be installed (one time), and loaded (each session). The package can be installed using the following command:
 <<install_mosaic,eval=FALSE>>=
@@ -178,7 +178,7 @@
 
 We also set some options to improve legibility of graphs and output.
 <<eval=FALSE>>=
-trellis.par.set(theme=col.mosaic())  # get a better color scheme 
+trellis.par.set(theme=col.mosaic())  # get a better color scheme
 options(digits=3)
 @
 
@@ -189,7 +189,7 @@
 in case study 5.1 in the \emph{Sleuth}.
 
 
-\subsection{Summary statistics and graphical display} 
+\subsection{Summary statistics and graphical display}
 
 We begin by reading the data and summarizing the variables.
 
@@ -289,11 +289,11 @@
 
 \subsection{Residual analysis and diagnostics}
 
-The residuals versus fitted graph does not demonstrate dramatic lack of fit (though some of the 
+The residuals versus fitted graph does not demonstrate dramatic lack of fit (though some of the
 mice had very small residuals).  The following figure is akin to Display 5.14 (page 132).
 <<fig.height=8, fig.width=8>>=
 aov1 = aov(lm(Lifetime ~ Diet, data=case0501))
-plot(aov1, which=1) 
+plot(aov1, which=1)
 @
 
 The quantile plot of the residuals indicates that the normality assumption may be violated.
@@ -305,7 +305,7 @@
 
 
 \section{Spock Conspiracy Trial}
-Did Dr. Benjamin Spock have a fair trial?  More specifically, were women underrepresented on his jury pool?  This is the question considered in 
+Did Dr. Benjamin Spock have a fair trial?  More specifically, were women underrepresented on his jury pool?  This is the question considered in
 case study 5.2 in the \emph{Sleuth}.
 
 
@@ -332,7 +332,7 @@
 
 First we fit the one way analysis of variance (ANOVA) model, with all of the groups. These results are summarized on page 118 and shown in Display 5.10 (page 127).
 <<>>=
-aov1 = anova(lm(Percent ~ Judge, data=case0502)); aov1 
+aov1 = anova(lm(Percent ~ Judge, data=case0502)); aov1
 @
 
 By default, the use of the linear model (regression) function displays the pairwise differences between the first group and each of the other groups.  Note that the overall test of the model is the same.
@@ -361,15 +361,15 @@
 tally(twoJudge ~ Judge, format="count", data=case0502)
 
 @
-Recall that the book calculates the extra-sum-of-squares $F$-statistic as ((2,190.90 - 1,864.45)/(44-39)) / (1,864.45/39) = 1.37, with df 5 and 39.  P(F $\textgreater$ 1.366) = 0.26 (page 130).  Below are the calculations for the results found on page 128. 
+Recall that the book calculates the extra-sum-of-squares $F$-statistic as ((2,190.90 - 1,864.45)/(44-39)) / (1,864.45/39) = 1.37, with df 5 and 39.  P(F $\textgreater$ 1.366) = 0.26 (page 130).  Below are the calculations for the results found on page 128.
 
 <<>>=
 numdf1 = aov1["Residuals", "Df"]; numdf1 # Within
 ss1 = aov1["Residuals", "Sum Sq"]; ss1 # Within
 aov2 = anova(lm(Percent ~ as.factor(twoJudge), data=case0502)); aov2
-df2 = aov2["Residuals", "Df"]; df2 # Spock and others 
-ss2 = aov2["Residuals", "Sum Sq"]; ss2 # Spock and others 
-Fstat = ((ss2 - ss1)/(df2 - numdf1)) / (ss1 / numdf1); Fstat 
+df2 = aov2["Residuals", "Df"]; df2 # Spock and others
+ss2 = aov2["Residuals", "Sum Sq"]; ss2 # Spock and others
+Fstat = ((ss2 - ss1)/(df2 - numdf1)) / (ss1 / numdf1); Fstat
 1-pf(Fstat, length(levels(case0502$Judge))-2, numdf1)
 @
 
@@ -378,7 +378,7 @@
 anova(lm(Percent ~ as.factor(Judge), data=case0502), lm(Percent ~ as.factor(twoJudge), data=case0502))
 @
 
-There are some other ways to compare whether the other judges differ from Dr. Spock's judge in their female composition using contrasts. 
+There are some other ways to compare whether the other judges differ from Dr. Spock's judge in their female composition using contrasts.
 
 <<>>=
 # test all of the other judges vs. Spock's judge using a contrast page 118