[Sleuth2-commits] r65 - pkg/Sleuth2/vignettes

Wed Jun 15 03:54:45 CEST 2016

Author: rpruim
Date: 2016-06-15 03:54:44 +0200 (Wed, 15 Jun 2016)
New Revision: 65

Modified:
   pkg/Sleuth2/vignettes/chapter01-HortonMosaic.Rnw
Log:
update code to match new version of mosaic package
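
Note on the change: judging from the diff below, the only non-whitespace edits replace bare variable names with one-sided formulas in the inline summary calls, so that sd(Score, data=...) becomes sd(~Score, data=...) and mean(Salary, data=...) becomes mean(~Salary, data=...). The remaining hunks only strip trailing whitespace. The following is a minimal sketch of that calling convention, not code from the vignette. The data frame `toy` and its values are invented for illustration, and the comparison against mosaic runs only if that package happens to be installed.

```r
# Hypothetical toy data standing in for case0101 (values are invented).
toy <- data.frame(
  Score     = c(12, 15, 19, 22, 17, 20, 24, 29),
  Treatment = factor(rep(c("Extrinsic", "Intrinsic"), each = 4))
)

# Base R equivalent of the inline expression
#   sd(~Score, data=subset(case0101, Treatment=="Extrinsic"))
extrinsic <- subset(toy, Treatment == "Extrinsic")
sd_base   <- stats::sd(extrinsic$Score)

# Base R equivalent of mean(Score ~ Treatment, data=...): one mean per group.
means_base <- tapply(toy$Score, toy$Treatment, mean)

# Formula interface, as used after this revision. Assumes the mosaic
# package is installed; skipped otherwise so the sketch still runs.
if (requireNamespace("mosaic", quietly = TRUE)) {
  sd_formula    <- mosaic::sd(~Score, data = extrinsic)
  means_formula <- mosaic::mean(Score ~ Treatment, data = toy)

  # Both routes should agree on the same numbers.
  stopifnot(isTRUE(all.equal(unname(sd_formula), sd_base)))
  stopifnot(isTRUE(all.equal(as.numeric(means_formula),
                             as.numeric(means_base))))
}

print(sd_base)
print(diff(means_base))
```

The two-sided form (Score ~ Treatment) is untouched by this revision. Only the calls that previously passed a bare column name together with data= were changed.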

Modified: pkg/Sleuth2/vignettes/chapter01-HortonMosaic.Rnw
===================================================================

--- pkg/Sleuth2/vignettes/chapter01-HortonMosaic.Rnw	2016-06-15 01:47:37 UTC (rev 64)
+++ pkg/Sleuth2/vignettes/chapter01-HortonMosaic.Rnw	2016-06-15 01:54:44 UTC (rev 65)
@@ -88,7 +88,7 @@
 
 \author{
 Ruobing Zhang \and Kate Aloisio \and Nicholas J. Horton\thanks{Department of Mathematics, Amherst College, nhorton at amherst.edu}
-} 
+}
 
 \date{\today}
 
@@ -112,7 +112,7 @@
         fig.keep="high",
         fig.show="hold",
         fig.align="center",
-        prompt=TRUE,  # show the prompts; but perhaps we should not do this 
+        prompt=TRUE,  # show the prompts; but perhaps we should not do this
         comment=NA    # turn off commenting of ouput (but perhaps we should not do this either
   )
 @
@@ -142,7 +142,7 @@
 })
 showOriginal=FALSE
 showNew=TRUE
-@ 
+@
 
 \section{Introduction}
 
@@ -151,7 +151,7 @@
 file as well as the associated \pkg{knitr} reproducible analysis source file can be found at
 \url{http://www.amherst.edu/~nhorton/sleuth}.
 
-This work leverages initiatives undertaken by Project MOSAIC (\url{http://www.mosaic-web.org}), an NSF-funded effort to improve the teaching of statistics, calculus, science and computing in the undergraduate curriculum. In particular, we utilize the 
+This work leverages initiatives undertaken by Project MOSAIC (\url{http://www.mosaic-web.org}), an NSF-funded effort to improve the teaching of statistics, calculus, science and computing in the undergraduate curriculum. In particular, we utilize the
 \pkg{mosaic} package, which was written to simplify the use of R for introductory statistics courses. A short summary of the R needed to teach introductory statistics can be found in the mosaic package vignette (\url{http://cran.r-project.org/web/packages/mosaic/vignettes/MinimalR.pdf}).
 
 To use a package within R, it must be installed (one time), and loaded (each session). The package can be installed using the following command:
@@ -183,7 +183,7 @@
 
 \section{Motivation and Creativity}
 
-For Case Study 1: Motivation and Creativity, the following questions are posed: Do grading systems promote creativity in students? Do ranking systems and incentive awards increase productivity among employees? Do rewards and praise stimulate children to learn? Do rewards and praise stimulate children to learn? 
+For Case Study 1: Motivation and Creativity, the following questions are posed: Do grading systems promote creativity in students? Do ranking systems and incentive awards increase productivity among employees? Do rewards and praise stimulate children to learn? Do rewards and praise stimulate children to learn?
 
 The data for Case Study 1 was collected by psychologist Teresa Amabile in an experiment concerning the effects of intrinsic and extrinsic motivation on creativity (page 2 of the \emph{Sleuth}).
 
@@ -209,14 +209,14 @@
 maggregate(Score ~ Treatment, FUN=stem, data=case0101)
 @
 
-The extrinsic group (n=\Sexpr{nrow(subset(case0101, Treatment=="Extrinsic"))}) has an average creativity score that is \Sexpr{round(diff(mean(Score ~ Treatment, data=case0101)), 1)} points less than the intrinsic group (n=\Sexpr{nrow(subset(case0101, Treatment=="Intrinsic"))}). The extrinsic group has relatively larger spread than the intrinsic group (sd=\Sexpr{round(sd(Score, data=subset(case0101, Treatment=="Extrinsic")),2)} for extrinsic group and sd=\Sexpr{round(sd(Score, data=subset(case0101, Treatment=="Intrinsic")), 2)} for intrinsic group). Both distributions are approximately normally distributed.
+The extrinsic group (n=\Sexpr{nrow(subset(case0101, Treatment=="Extrinsic"))}) has an average creativity score that is \Sexpr{round(diff(mean(Score ~ Treatment, data=case0101)), 1)} points less than the intrinsic group (n=\Sexpr{nrow(subset(case0101, Treatment=="Intrinsic"))}). The extrinsic group has relatively larger spread than the intrinsic group (sd=\Sexpr{round(sd(~Score, data=subset(case0101, Treatment=="Extrinsic")),2)} for extrinsic group and sd=\Sexpr{round(sd(~Score, data=subset(case0101, Treatment=="Intrinsic")), 2)} for intrinsic group). Both distributions are approximately normally distributed.
 
 \subsection{Inferential procedures (two-sample t-test)}
- 
+
 <<eval=TRUE>>=
 t.test(Score ~ Treatment, alternative="two.sided", data=case0101)
 @
-The two-sample $t$-test shows strong evidence that a subject would receive a lower creativity score for a poem written after the extrinsic motivation questionnaire than for one written after the intrinsic motivation questionnaire. The two-sided $p$-value is \Sexpr{pval(t.test(Score~Treatment, alternative="two.sided", data=case0101), digits=4)}, which is small enough to reject the null hypothesis. 
+The two-sample $t$-test shows strong evidence that a subject would receive a lower creativity score for a poem written after the extrinsic motivation questionnaire than for one written after the intrinsic motivation questionnaire. The two-sided $p$-value is \Sexpr{pval(t.test(Score~Treatment, alternative="two.sided", data=case0101), digits=4)}, which is small enough to reject the null hypothesis.
 
 Thus, we can conclude that there is a difference between the population mean in extrinsic group and the population mean for the intrinsic group; the estimated difference between these two scores is \Sexpr{round(diff(mean(Score ~ Treatment, data=case0101)), 1)} points on the 0-40 point scale. A 95\% confidence interval for the decrease in score due to having extrinsic motivation rather than intrinsic motivation is between \Sexpr{round(t.test(Score~Treatment, alternative="two.sided", data=case0101)$conf.int[2], 2)} and  \Sexpr{round(t.test(Score~Treatment, alternative="two.sided", data=case0101)$conf.int[1], 2)} points (\emph{Sleuth}, page 3).
 
@@ -236,7 +236,7 @@
 nulldist = do(numsim)*diff(mean(Score~shuffle(Treatment), data=case0101))
 confint(nulldist)
 # Display 1.8 Sleuth
-histogram(~ Intrinsic, nint=50, data=nulldist, v=c(-4.14,4.14)) 
+histogram(~ Intrinsic, nint=50, data=nulldist, v=c(-4.14,4.14))
 @
 
 As described in the Sleuth on page 12, if the group assignment changes, we will get different results. First, the test statistics will be just as likely to be negative as positive. Second, the majority of values fall in the range from -3.0 to +3.0. Third, only few of the 1,000 randomization produced test statistics as large as 4.14. This last point indicates that 4.14 is a value corresponding to an unusually uneven randomization outcome, if the null hypothesis is correct.
@@ -259,14 +259,14 @@
 densityplot(~ Salary, groups=Sex, auto.key=TRUE, data=case0102)
 @
 
-The \Sexpr{nrow(subset(case0102, Sex=="MALE"))} men have an average starting salary that is \$\Sexpr{round(diff(mean(Salary ~ Sex, data=case0102)), 1)} more than the \Sexpr{nrow(subset(case0102, Sex=="Female"))} women (\$\Sexpr{round(mean(Salary, data=subset(case0102, Sex=="Male")),0)} vs \$\Sexpr{round(mean(Salary, data=subset(case0102, Sex=="Female")),0)}).  Both distributions have similar spread (sd=\$\Sexpr{round(sd(Salary, data=subset(case0102, Sex=="Female")), 2)} for women and sd=\$\Sexpr{round(sd(Salary, data=subset(case0102, Sex=="Male")), 2)} for men) and distributions that are approximately normally distributed (see density plot). The key difference between the groups is the shift (as indicated by the parallel boxplots).
+The \Sexpr{nrow(subset(case0102, Sex=="MALE"))} men have an average starting salary that is \$\Sexpr{round(diff(mean(Salary ~ Sex, data=case0102)), 1)} more than the \Sexpr{nrow(subset(case0102, Sex=="Female"))} women (\$\Sexpr{round(mean(~Salary, data=subset(case0102, Sex=="Male")),0)} vs \$\Sexpr{round(mean(~Salary, data=subset(case0102, Sex=="Female")),0)}).  Both distributions have similar spread (sd=\$\Sexpr{round(sd(~Salary, data=subset(case0102, Sex=="Female")), 2)} for women and sd=\$\Sexpr{round(sd(~Salary, data=subset(case0102, Sex=="Male")), 2)} for men) and distributions that are approximately normally distributed (see density plot). The key difference between the groups is the shift (as indicated by the parallel boxplots).
 
 To show Display 1.13
 <<>>=
 x = rnorm(1000)
 histogram(~ x)  # Normal
 x = rexp(1000)
-histogram(~ x)   # Long-tailed 
+histogram(~ x)   # Long-tailed
 x = runif(1000)
 histogram(~ x)  # Short-tailed
 x = rchisq(1000, df=15)
@@ -283,7 +283,7 @@
 
 \subsection{Permutation test}
 
-We undertake a permutation test to assess whether the differences in the center of these samples that we are observing are due to chance, if the distributions are actually equivalent back in the populations of male and female possible clerical hires.  We start by calculating our test statistic (the difference in means) for the observed data, then simulate from the null distribution (where the labels can be interchanged) and calculate our $p$-value. 
+We undertake a permutation test to assess whether the differences in the center of these samples that we are observing are due to chance, if the distributions are actually equivalent back in the populations of male and female possible clerical hires.  We start by calculating our test statistic (the difference in means) for the observed data, then simulate from the null distribution (where the labels can be interchanged) and calculate our $p$-value.
 
 <<permutetest,eval=TRUE>>=
 obsdiff = diff(mean(Salary ~ Sex, data=case0102)); obsdiff
@@ -298,7 +298,7 @@
 Through the permutation test, we observe that the mean starting salary for males is estimated to be \$\Sexpr{as.numeric(round(confint(res)["lower"]+obsdiff, 2))} to \$\Sexpr{as.numeric(round(confint(res)["upper"]+obsdiff, 2))} larger than the mean starting salary  for females (95\% confidence interval). We never see a permuted difference in means close to our observed value. Therefore, we reject the null hypothesis ($p<0.001$) and conclude that the salaries of the men are significantly higher than that of the women.
 
 <<>>=
-t.test(Salary ~ Sex, alternative="less", data=case0102) 
+t.test(Salary ~ Sex, alternative="less", data=case0102)
 @
 
 The $p$-value ($<0.001$) from the two-sample t-test shows that the large difference between estimated salaries for males and females is unlikely to be due to chance.