[Prob-commits] r47 - in pkg: . R man

noreply at r-forge.r-project.org
Thu Feb 16 03:44:20 CET 2017


Author: gkerns
Date: 2017-02-16 03:44:19 +0100 (Thu, 16 Feb 2017)
New Revision: 47

Added:
   pkg/R/genData.R
   pkg/man/gen2wayTable.Rd
   pkg/man/genIndepTable.Rd
   pkg/man/genLogRegData.Rd
   pkg/man/genXdata.Rd
Modified:
   pkg/DESCRIPTION
   pkg/NAMESPACE
   pkg/man/prob-package.Rd
Log:
updated and added gendata


Modified: pkg/DESCRIPTION
===================================================================
--- pkg/DESCRIPTION	2013-12-11 16:59:22 UTC (rev 46)
+++ pkg/DESCRIPTION	2017-02-16 02:44:19 UTC (rev 47)
@@ -1,17 +1,18 @@
-Package: prob
-Version: 0.9-5
-Date: 2013-11-20
-Title: Elementary Probability on Finite Sample Spaces
-Authors@R: person(given = "G. Jay", family = "Kerns", role = c("aut", "cre", "cph"), email = "gkerns at ysu.edu")
-Depends: combinat, fAsianOptions, hypergeo, VGAM
-Description:
- This package provides a framework for performing elementary probability
- calculations on finite sample spaces, which may be represented by data frames
- or lists.  Functionality includes setting up sample spaces, counting tools,
- defining probability spaces, performing set algebra, calculating probability
- and conditional probability, tools for simulation and checking the law of
- large numbers, adding random variables, and finding marginal distributions.
- Characteristic functions for all base R distributions are included.
-License: GPL (>= 3)
-URL: http://prob.r-forge.r-project.org, http://people.ysu.edu/~gkerns/
-
+Package: prob
+Version: 1.0-0
+Date: 2017-02-15
+Title: Elementary Probability on Finite Sample Spaces
+Authors@R: person(given = "G. Jay", family = "Kerns", role = c("aut", "cre", "cph"), email = "gkerns at ysu.edu")
+Depends: combinat, fAsianOptions
+Suggests: VGAM, reshape, MASS, hypergeo
+Description:
+ This package provides a framework for performing elementary probability
+ calculations on finite sample spaces, which may be represented by data frames
+ or lists.  Functionality includes setting up sample spaces, counting tools,
+ defining probability spaces, performing set algebra, calculating probability
+ and conditional probability, tools for simulation and checking the law of
+ large numbers, adding random variables, and finding marginal distributions.
+ Characteristic functions for all base R distributions are included.
+License: GPL (>= 3)
+URL: http://prob.r-forge.r-project.org, http://people.ysu.edu/~gkerns/
+

Modified: pkg/NAMESPACE
===================================================================
--- pkg/NAMESPACE	2013-12-11 16:59:22 UTC (rev 46)
+++ pkg/NAMESPACE	2017-02-16 02:44:19 UTC (rev 47)
@@ -1,73 +1,79 @@
-
-
-export(
-addrv,
-cards,
-cfbeta,
-cfbinom,
-cfcauchy,
-cfchisq,
-cfexp,
-cff,
-cfgamma,
-cfgeom,
-cfhyper,
-cflnorm,
-cflogis,
-cfnbinom,
-cfnorm,
-cfpois,
-cfsignrank,
-cft,
-cfunif,
-cfweibull,
-cfwilcox,
-countrep,
-empirical,
-euchredeck,
-iidspace,
-intersect,
-is.probspace,
-isin,
-isrep,
-marginal,
-noorder,
-nsamp,
-permsn,
-prob,
-Prob,
-probspace,
-rolldie,
-roulette,
-setdiff,
-sim,
-subset,
-tosscoin,
-union,
-urnsamples)
-
-S3method(union, default)
-S3method(union, data.frame)
-S3method(union, ps)
-S3method(intersect, default)
-S3method(intersect, data.frame)
-S3method(intersect, ps)
-S3method(setdiff, default)
-S3method(setdiff, data.frame)
-S3method(setdiff, ps)
-S3method(urnsamples, default)
-S3method(urnsamples, data.frame)
-S3method(probspace, default)
-S3method(probspace, list)
-S3method(Prob, default)
-S3method(Prob, ps)
-S3method(isin, default)
-S3method(isin, data.frame)
-S3method(isrep, default)
-S3method(isrep, data.frame)
-S3method(countrep, default)
-S3method(countrep, data.frame)
-S3method(subset, ps)
-S3method(sim, default)
-S3method(sim, ps)
-
+importFrom("stats", "aggregate", "dbeta", "dsignrank", "dt",
+             "dweibull", "dwilcox", "integrate", "rbinom")
+importFrom("utils", "combn")
+importFrom("fAsianOptions", "kummerM", "kummerU")
+importFrom("combinat", "permn")
+
+# Export all names
+exportPattern(".")
+export(
+addrv,
+cards,
+cfbeta,
+cfbinom,
+cfcauchy,
+cfchisq,
+cfexp,
+cff,
+cfgamma,
+cfgeom,
+cfhyper,
+cflnorm,
+cflogis,
+cfnbinom,
+cfnorm,
+cfpois,
+cfsignrank,
+cft,
+cfunif,
+cfweibull,
+cfwilcox,
+countrep,
+empirical,
+euchredeck,
+iidspace,
+intersect,
+is.probspace,
+isin,
+isrep,
+marginal,
+noorder,
+nsamp,
+permsn,
+prob,
+Prob,
+probspace,
+rolldie,
+roulette,
+setdiff,
+sim,
+subset,
+tosscoin,
+union,
+urnsamples)
+
+S3method(union, default)
+S3method(union, data.frame)
+S3method(union, ps)
+S3method(intersect, default)
+S3method(intersect, data.frame)
+S3method(intersect, ps)
+S3method(setdiff, default)
+S3method(setdiff, data.frame)
+S3method(setdiff, ps)
+S3method(urnsamples, default)
+S3method(urnsamples, data.frame)
+S3method(probspace, default)
+S3method(probspace, list)
+S3method(Prob, default)
+S3method(Prob, ps)
+S3method(isin, default)
+S3method(isin, data.frame)
+S3method(isrep, default)
+S3method(isrep, data.frame)
+S3method(countrep, default)
+S3method(countrep, data.frame)
+S3method(subset, ps)
+S3method(sim, default)
+S3method(sim, ps)
+

Added: pkg/R/genData.R
===================================================================
--- pkg/R/genData.R	                        (rev 0)
+++ pkg/R/genData.R	2017-02-16 02:44:19 UTC (rev 47)
@@ -0,0 +1,142 @@
+####################################
+# Functions to generate data
+# for pedagogical purposes
+
+#######################################################
+# continuous X data
+
+genXdata <- function(n, nvar = 1,
+                     mu = rep(0, nvar),
+                     Sigma = diag(length(mu)),
+                     varnames = paste("x", 1:length(mu), sep = ""),
+                     roundto = NULL
+                     ){
+  tmp <- as.data.frame(MASS::mvrnorm(n, mu = mu, Sigma = Sigma))
+  names(tmp) <- varnames
+  if (!is.null(roundto)){
+    tmp <- round(tmp, roundto)
+  }
+  tmp
+}
+
+# genXdata(10, nvar = 3, roundto = 2)
+# X = genXdata(10, nvar = 3, roundto = 2)
+
+#######################################################
+# logistic regression data
+
+genLogRegData <- function(xdata,
+                          beta = rep(1, ncol(xdata)),
+                          yname = "y"){
+  tmp <- as.matrix(xdata) %*% beta
+  probs <- exp(tmp)/(1 + exp(tmp))
+  y <- apply(probs, 1, function(p){rbinom(1, size = 1, prob = p)})
+  resdata <- as.data.frame(cbind(xdata, y))  # col.names is ignored for data frames, so rename explicitly
+  stats::setNames(resdata, c(names(resdata)[-ncol(resdata)], yname))
+}
+
+
+#params <- c(1,2,3,4)
+#require(MASS)
+#xmean <- Null(params)[ , 1]
+#X = genXdata(10, mu = xmean, roundto = 2)
+#genLogRegData(X, beta = params)
+
+
+#######################################################
+# contingency tables
+
+genIndepTable <- function(n = sample(100:500, size = 1),
+                          prow = 1:3, pcol = 1:4,
+                          dmnames = list(X = paste("x", 1:length(prow), sep = ""),
+                                         Y = paste("y", 1:length(pcol), sep = "")),
+                          addmargins = TRUE,
+                          as.df = FALSE, untable = TRUE){
+  prow <- prow/sum(prow)
+  pcol <- pcol/sum(pcol)
+  pmatrix <- outer(prow, pcol)
+  probs <- as.numeric(pmatrix)
+  x <- factor(sample(1:length(probs), size = n, replace = TRUE, prob = probs),
+              levels = 1:length(probs))
+  tmp <- matrix(as.integer(table(x)), nrow = length(prow))
+  dimnames(tmp) <- dmnames
+  tmp <- as.table(tmp)
+
+  if (as.df){
+    tmp <- as.data.frame(tmp)
+    if (untable){
+      tmp <- with(tmp, reshape::untable(tmp, Freq))
+      tmp[ , "Freq"] <- NULL
+      rownames(tmp) <- 1:dim(tmp)[1]
+    }
+    tmp
+  } else if (addmargins) {
+    addmargins(tmp)
+  } else {
+    tmp
+  }
+}
+  
+# 
+# genIndepTable(n = 100)
+# genIndepTable(n = 100, addmargins = FALSE)
+# genIndepTable(n = 100, as.df = TRUE)
+# genIndepTable(n = 100, as.df = TRUE, untable = FALSE)
+# 
+# tmp = genIndepTable(n = 10, as.df = TRUE)
+# tmp
+# 
+# model.matrix(~., data = tmp)
+# tmp2 = as.data.frame(model.matrix(~ X*Y, data = tmp))
+# tmp2
+# 
+# genLogRegData(tmp2)
+# 
+# A = genIndepTable(n = 500, as.df = TRUE)
+# chisq.test(xtabs(~., data = A))
+# 
+
+
+#######################################################
+# general two-way tables
+
+gen2wayTable <- function(n = sample(100:500, size = 1),
+                          pmatrix = matrix(1:12, nrow = 3),
+                          dmnames = list(X = paste("x", 1:nrow(pmatrix), sep = ""),
+                                         Y = paste("y", 1:ncol(pmatrix), sep = "")),
+                          addmargins = TRUE,
+                          as.df = FALSE, untable = TRUE){
+  probs <- as.numeric(pmatrix)
+  x <- factor(sample(1:length(probs), size = n, replace = TRUE, prob = probs),
+              levels = 1:length(probs))
+  tmp <- matrix(as.integer(table(x)), nrow = nrow(pmatrix))
+  dimnames(tmp) <- dmnames
+  tmp <- as.table(tmp)
+
+  if (as.df){
+    tmp <- as.data.frame(tmp)
+    if (untable){
+      tmp <- with(tmp, reshape::untable(tmp, Freq))
+      tmp[ , "Freq"] <- NULL
+      rownames(tmp) <- 1:dim(tmp)[1]
+    }
+    tmp
+  } else if (addmargins) {
+    addmargins(tmp)
+  } else {
+    tmp
+  }
+}
+
+# 
+# gen2wayTable(n = 100)
+# gen2wayTable(n = 100, addmargins = FALSE)
+# gen2wayTable(n = 100, as.df = TRUE)
+# gen2wayTable(n = 100, as.df = TRUE, untable = FALSE)
+# 
+# w = matrix(c(8, 5, 3, 2, 5, 5), nrow = 2)
+# 
+# B = gen2wayTable(n = 300, pmatrix = w, addmargins = FALSE)
+# chisq.test(B)
+# 
+

Added: pkg/man/gen2wayTable.Rd
===================================================================
--- pkg/man/gen2wayTable.Rd	                        (rev 0)
+++ pkg/man/gen2wayTable.Rd	2017-02-16 02:44:19 UTC (rev 47)
@@ -0,0 +1,44 @@
+\name{gen2wayTable}
+\alias{gen2wayTable}
+
+\title{
+Generate Two-way Tables
+}
+\description{
+A function to randomly generate arbitrary two-way tables
+}
+\usage{
+gen2wayTable(n = sample(100:500, size = 1), pmatrix = matrix(1:12, nrow = 3), dmnames = list(X = paste("x", 1:nrow(pmatrix), sep = ""), Y = paste("y", 1:ncol(pmatrix), sep = "")), addmargins = TRUE, as.df = FALSE, untable = TRUE)
+}
+%- maybe also 'usage' for other objects documented here.
+\arguments{
+  \item{n}{
+total number of observations to generate
+}
+  \item{pmatrix}{
+  matrix of nonnegative weights for the probability distribution 
+}
+  \item{dmnames}{
+  names of the table dimensions
+}
+  \item{addmargins}{
+should margins be added?
+}
+  \item{as.df}{
+should the table be returned as a data frame?
+}
+  \item{untable}{
+  should the data frame be expanded to one observation per row?
+}
+}
+
+\value{
+  Either an object of class table or, when \code{as.df = TRUE}, a data frame.
+}
+\author{
+G. Jay Kerns
+}
+
+% Add one or more standard keywords, see file 'KEYWORDS' in the
+% R documentation directory.
+\keyword{ datagen }% use one of  RShowDoc("KEYWORDS")

Added: pkg/man/genIndepTable.Rd
===================================================================
--- pkg/man/genIndepTable.Rd	                        (rev 0)
+++ pkg/man/genIndepTable.Rd	2017-02-16 02:44:19 UTC (rev 47)
@@ -0,0 +1,48 @@
+\name{genIndepTable}
+\alias{genIndepTable}
+%- Also NEED an '\alias' for EACH other topic documented here.
+\title{
+Generate Independent Two-way Table
+}
+\description{
+A function to generate a two-way table with independent margins
+}
+\usage{
+genIndepTable(n = sample(100:500, size = 1), prow = 1:3, pcol = 1:4, dmnames = list(X = paste("x", 1:length(prow), sep = ""), Y = paste("y", 1:length(pcol), sep = "")), addmargins = TRUE, as.df = FALSE, untable = TRUE)
+}
+%- maybe also 'usage' for other objects documented here.
+\arguments{
+  \item{n}{
+sum total of observations generated
+}
+  \item{prow}{
+nonnegative weights for the row marginal distribution
+}
+  \item{pcol}{
+nonnegative weights for the col marginal distribution
+}
+  \item{dmnames}{
+names for the table dimensions
+}
+  \item{addmargins}{
+ should margins be added to the table
+}
+  \item{as.df}{
+should the result be returned as a data frame
+}
+  \item{untable}{
+if true then data frame will be expanded to one observation per row
+}
+}
+\details{
+This function will generate a two-way table with independent marginal distributions.
+}
+\value{
+Either an object of class table or a data frame.
+}
+
+\author{
+G. Jay Kerns
+}
+
+\keyword{ datagen }% use one of  RShowDoc("KEYWORDS")

Added: pkg/man/genLogRegData.Rd
===================================================================
--- pkg/man/genLogRegData.Rd	                        (rev 0)
+++ pkg/man/genLogRegData.Rd	2017-02-16 02:44:19 UTC (rev 47)
@@ -0,0 +1,36 @@
+\name{genLogRegData}
+\alias{genLogRegData}
+%- Also NEED an '\alias' for EACH other topic documented here.
+\title{
+Generate data for logistic regression
+}
+\description{
+This function generates data ready for a logistic regression model
+}
+\usage{
+genLogRegData(xdata, beta = rep(1, ncol(xdata)), yname = "y")
+}
+%- maybe also 'usage' for other objects documented here.
+\arguments{
+  \item{xdata}{
+the model matrix
+}
+  \item{beta}{
+vector of parameters to multiply the model matrix
+}
+  \item{yname}{
+the name for the generated y values
+}
+}
+\details{
+This function generates data ready for a logistic regression model
+}
+\value{
+A data frame with the model matrix and the generated y values added
+}
+
+\author{
+G. Jay Kerns
+}
+
+\keyword{ datagen }% use one of  RShowDoc("KEYWORDS")

Added: pkg/man/genXdata.Rd
===================================================================
--- pkg/man/genXdata.Rd	                        (rev 0)
+++ pkg/man/genXdata.Rd	2017-02-16 02:44:19 UTC (rev 47)
@@ -0,0 +1,46 @@
+\name{genXdata}
+\alias{genXdata}
+%- Also NEED an '\alias' for EACH other topic documented here.
+\title{
+Generate continuous model matrix data
+}
+\description{
+This function generates correlated normal data to serve as a model matrix in a regression model.
+}
+\usage{
+genXdata(n, nvar = 1, mu = rep(0, nvar), Sigma = diag(length(mu)), varnames = paste("x", 1:length(mu), sep = ""), roundto = NULL)
+}
+%- maybe also 'usage' for other objects documented here.
+\arguments{
+  \item{n}{
+how many rows
+}
+  \item{nvar}{
+how many columns
+}
+  \item{mu}{
+the mean of the multivariate normal distribution
+}
+  \item{Sigma}{
+the variance-covariance matrix of the normal distribution
+}
+  \item{varnames}{
+how you would like the variables to be named in the result
+}
+  \item{roundto}{
+number of places to round the generated values
+}
+}
+\details{
+This function generates correlated normal data to serve as a model matrix in a regression model.
+
+}
+\value{
+A data frame of generated data
+}
+
+\author{
+G. Jay Kerns
+}
+
+\keyword{ datagen }% use one of  RShowDoc("KEYWORDS")

Modified: pkg/man/prob-package.Rd
===================================================================
--- pkg/man/prob-package.Rd	2013-12-11 16:59:22 UTC (rev 46)
+++ pkg/man/prob-package.Rd	2017-02-16 02:44:19 UTC (rev 47)
@@ -1,39 +1,40 @@
-\name{prob-package}
-\alias{prob-package}
-\docType{package}
-\title{
-Elementary Probability on Finite Sample Spaces
-}
-\description{
- This package provides a framework for performing elementary probability calculations on finite sample spaces.   It is built around the concept of a \emph{probability space}, which is an object of outcomes and an object \code{probs} of probabilities associated with the outcomes.
- 
- There are two ways to represent a probability space in the \code{prob} package.  The first is with a data frame that has a \code{probs} column.  Entries of \code{probs} should be nonnegative and sum to one.  The second way is with a list having two components: \code{outcomes} and \code{probs}.  The component \code{outcomes} is a list containing elements of the most arbitrary sort; they can be data frames, vectors, more lists, whatever.  The \code{probs} component is a vector (of the same length as \code{outcomes}), which associates to each element of \code{outcomes} a nonnegative number.  As before, the only additional requirement is that the \code{sum} of \code{probs} be one.
- 
- There are functions in the \code{prob} package to address many topics in a standard course in elementary probability.  In particular, there are methods for setting up sample spaces, counting tools, defining probability spaces, performing set algebra, calculating probability and conditional probability, tools for simulation and checking the law of large numbers, adding random variables, and finding marginal distributions.  See \code{vignette("prob")} for details.
- 
- There are some functions included to set up some of the standard sample spaces usually encountered in an elementary probability course.  Examples include tossing a coin, rolling a die, drawing from a 52 card deck, \emph{etc}.  If you know of topics that would be of general interest and could be incorporated in the \code{prob} package framework, I would be happy to hear about them.  Comments and suggestions are always welcomed.
- 
- The \code{prob} package is a first step toward addressing probability in \code{R}, and has been written in the spirit of simplicity.  The procedures work best to solve problems that are manageable in scope.  Users that wish to investigate especially large or intricate problems are encouraged to modify and streamline the code to suit their individual needs. 
-
-Characteristic functions for the base probability distributions have been included.  For details, type \code{vignette("charfunc")} at the command prompt.
-}
-
-\details{
-\tabular{ll}{
-Package: \tab prob\cr
-Version: \tab 0.9-5\cr
-Date: \tab 2013-11-20\cr
-Depends: \tab combinat, fAsianOptions, hypergeo, VGAM\cr
-LazyLoad: \tab no\cr
-License: \tab GPL version 3 or newer\cr
-URL: \tab http://prob.r-forge.r-project.org,
-http://people.ysu.edu/~gkerns\cr
-}
-
-}
-\author{
-G. Jay Kerns <gkerns at ysu.edu>
-
-Maintainer: G. Jay Kerns <gkerns at ysu.edu>
-}
-\keyword{package}
+\name{prob-package}
+\alias{prob-package}
+\docType{package}
+\title{
+Elementary Probability on Finite Sample Spaces
+}
+\description{
+ This package provides a framework for performing elementary probability calculations on finite sample spaces.   It is built around the concept of a \emph{probability space}, which is an object of outcomes and an object \code{probs} of probabilities associated with the outcomes.
+ 
+ There are two ways to represent a probability space in the \code{prob} package.  The first is with a data frame that has a \code{probs} column.  Entries of \code{probs} should be nonnegative and sum to one.  The second way is with a list having two components: \code{outcomes} and \code{probs}.  The component \code{outcomes} is a list containing elements of the most arbitrary sort; they can be data frames, vectors, more lists, whatever.  The \code{probs} component is a vector (of the same length as \code{outcomes}), which associates to each element of \code{outcomes} a nonnegative number.  As before, the only additional requirement is that the \code{sum} of \code{probs} be one.
+ 
+ There are functions in the \code{prob} package to address many topics in a standard course in elementary probability.  In particular, there are methods for setting up sample spaces, counting tools, defining probability spaces, performing set algebra, calculating probability and conditional probability, tools for simulation and checking the law of large numbers, adding random variables, and finding marginal distributions.  See \code{vignette("prob")} for details.
+ 
+ There are some functions included to set up some of the standard sample spaces usually encountered in an elementary probability course.  Examples include tossing a coin, rolling a die, drawing from a 52 card deck, \emph{etc}.  If you know of topics that would be of general interest and could be incorporated in the \code{prob} package framework, I would be happy to hear about them.  Comments and suggestions are always welcomed.
+ 
+ The \code{prob} package is a first step toward addressing probability in \code{R}, and has been written in the spirit of simplicity.  The procedures work best to solve problems that are manageable in scope.  Users that wish to investigate especially large or intricate problems are encouraged to modify and streamline the code to suit their individual needs. 
+
+Characteristic functions for the base probability distributions have been included.  For details, type \code{vignette("charfunc")} at the command prompt.
+}
+
+\details{
+\tabular{ll}{
+Package: \tab prob\cr
+Version: \tab 1.0-0\cr
+Date: \tab 2017-02-15\cr
+Depends: \tab combinat, fAsianOptions\cr
+Suggests: \tab VGAM, reshape, MASS, hypergeo\cr
+LazyLoad: \tab no\cr
+License: \tab GPL version 3 or newer\cr
+URL: \tab http://prob.r-forge.r-project.org,
+http://gkerns.people.ysu.edu/\cr
+}
+
+}
+\author{
+G. Jay Kerns <gkerns at ysu.edu>
+
+Maintainer: G. Jay Kerns <gkerns at ysu.edu>
+}
+\keyword{package}
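
Note for readers of this revision: the snippet below is a minimal base-R sketch of the sampling steps that the new pkg/R/genData.R functions genIndepTable() and genLogRegData() appear to perform, restated so it runs without the prob, MASS, or reshape packages. It is not the package code itself. The seed, the sample sizes, the object names, and the response column name "resp" are assumptions chosen for illustration only.

```r
# Sketch only: base-R restatement of the sampling steps in genData.R (rev 47).
# All object names and constants here are illustrative assumptions.
set.seed(47)

## 1. Two-way table with independent margins (cf. genIndepTable)
n    <- 200
prow <- 1:3 / sum(1:3)            # row weights, normalised to sum to one
pcol <- 1:4 / sum(1:4)            # column weights, normalised to sum to one

# Under independence the cell probabilities are the outer product.
pmatrix <- outer(prow, pcol)

# Draw n cell indices (column-major order) and tabulate them;
# fixing the factor levels keeps empty cells in the table.
cells <- factor(sample(seq_along(pmatrix), size = n, replace = TRUE,
                       prob = as.numeric(pmatrix)),
                levels = seq_along(pmatrix))
tab <- matrix(as.integer(table(cells)), nrow = length(prow),
              dimnames = list(X = paste0("x", 1:3), Y = paste0("y", 1:4)))
tab <- as.table(tab)

addmargins(tab)                   # analogous to addmargins = TRUE

## 2. Logistic-regression responses (cf. genLogRegData)
X    <- data.frame(x1 = rnorm(10), x2 = rnorm(10))
beta <- c(1, 1)
eta  <- as.matrix(X) %*% beta     # linear predictor
p    <- plogis(eta)               # same as exp(eta) / (1 + exp(eta))
y    <- rbinom(nrow(X), size = 1, prob = p)

res <- cbind(X, y)
names(res)[ncol(res)] <- "resp"   # rename the response column explicitly
res
```

The response column is renamed with names<- rather than through the col.names argument of as.data.frame(), since that argument is not honoured when the input is already a data frame.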

