[Distr-commits] r767 - in branches/distr-2.4/pkg/distrMod: . inst vignettes

noreply at r-forge.r-project.org noreply at r-forge.r-project.org
Fri Nov 18 13:36:22 CET 2011


Author: ruckdeschel
Date: 2011-11-18 13:36:22 +0100 (Fri, 18 Nov 2011)
New Revision: 767

Added:
   branches/distr-2.4/pkg/distrMod/vignettes/
   branches/distr-2.4/pkg/distrMod/vignettes/Estimate.pdf
   branches/distr-2.4/pkg/distrMod/vignettes/ParamFamParameter.pdf
   branches/distr-2.4/pkg/distrMod/vignettes/ProbFamily.pdf
   branches/distr-2.4/pkg/distrMod/vignettes/distrMod.Rnw
   branches/distr-2.4/pkg/distrMod/vignettes/distrMod.bib
Removed:
   branches/distr-2.4/pkg/distrMod/inst/doc/
Modified:
   branches/distr-2.4/pkg/distrMod/DESCRIPTION
Log:
[distrMod] (branches)
reaction to http://developer.r-project.org/214update.txt
+ created folder vignettes and moved content of inst/doc/ to it
+ removed lazyload tag in DESCRIPTION
+ updated affiliation info in newDistributions.Rnw
+ deleted inst/doc folder

Modified: branches/distr-2.4/pkg/distrMod/DESCRIPTION
===================================================================
--- branches/distr-2.4/pkg/distrMod/DESCRIPTION	2011-11-18 12:33:04 UTC (rev 766)
+++ branches/distr-2.4/pkg/distrMod/DESCRIPTION	2011-11-18 12:36:22 UTC (rev 767)
@@ -1,6 +1,6 @@
 Package: distrMod
 Version: 2.4
-Date: 2011-02-14
+Date: 2011-11-18
 Title: Object oriented implementation of probability models
 Description: Object oriented implementation of probability models based on packages 'distr' and
         'distrEx'
@@ -9,10 +9,9 @@
 Depends: R(>= 2.6.0), methods, startupmsg, distr(>= 2.2), distrEx(>= 2.2), RandVar(>= 0.6.3),
         MASS, stats4
 ByteCompile: yes
-LazyLoad: yes
 License: LGPL-3
 Encoding: latin1
 URL: http://distr.r-forge.r-project.org/
 LastChangedDate: {$LastChangedDate$}
 LastChangedRevision: {$LastChangedRevision$}
-SVNRevision: 699
+SVNRevision: 767

Copied: branches/distr-2.4/pkg/distrMod/vignettes/Estimate.pdf (from rev 742, branches/distr-2.4/pkg/distrMod/inst/doc/Estimate.pdf)
===================================================================
(Binary files differ)

Copied: branches/distr-2.4/pkg/distrMod/vignettes/ParamFamParameter.pdf (from rev 742, branches/distr-2.4/pkg/distrMod/inst/doc/ParamFamParameter.pdf)
===================================================================
(Binary files differ)

Copied: branches/distr-2.4/pkg/distrMod/vignettes/ProbFamily.pdf (from rev 742, branches/distr-2.4/pkg/distrMod/inst/doc/ProbFamily.pdf)
===================================================================
(Binary files differ)

Copied: branches/distr-2.4/pkg/distrMod/vignettes/distrMod.Rnw (from rev 742, branches/distr-2.4/pkg/distrMod/inst/doc/distrMod.Rnw)
===================================================================
--- branches/distr-2.4/pkg/distrMod/vignettes/distrMod.Rnw	                        (rev 0)
+++ branches/distr-2.4/pkg/distrMod/vignettes/distrMod.Rnw	2011-11-18 12:36:22 UTC (rev 767)
@@ -0,0 +1,1141 @@
+\documentclass[nojss]{jss}
+%\documentclass[article]{jss}
+
+%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
+%% declarations for jss.cls %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
+%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
+%
+\usepackage{amssymb}%
+%
+\newcommand{\N}{\mathbb N}
+\newcommand{\R}{\mathbb R}
+\newcommand{\B}{\mathbb B}
+%
+\newcommand\smkreis[1]{{(M#1)}}
+% -------------------------------------------------------------------------------
+\SweaveOpts{keep.source=TRUE}
+% -------------------------------------------------------------------------------
+
+%\VignetteIndexEntry{R Package distrMod: S4 Classes and Methods for Probability Models}
+%\VignetteDepends{distr,distrEx,distrMod}
+%\VignetteKeywords{probability models, minimum criterion estimators, maximum likelihood estimators, minimum distance estimators, S4 classes, S4 methods}
+%\VignettePackage{distrMod} 
+
+%% almost as usual
+\author{Matthias Kohl\\FH Furtwangen\And
+        Peter Ruckdeschel\\Fraunhofer ITWM Kaiserslautern}
+\title{\proglang{R} Package~\pkg{distrMod}: \proglang{S}4 Classes and Methods for Probability Models}
+
+%% for pretty printing and a nice hypersummary also set:
+\Plainauthor{Matthias Kohl, Peter Ruckdeschel} %% comma-separated
+\Plaintitle{R Package distrMod: S4 Classes and Methods for Probability Models} %% without formatting
+\Shorttitle{R Package distrMod} %% a short title (if necessary)
+
+%% an abstract and keywords
+\Abstract{
+  This vignette is published as \citet{distrMod}.
+  Package~\pkg{distrMod} provides an object oriented (more specifically
+  \proglang{S}4-style)
+  implementation of probability models. Moreover, it contains functions
+  and methods to compute minimum criterion estimators -- in particular,
+  maximum likelihood and minimum distance estimators.
+}
+\Keywords{probability models, minimum criterion estimators, minimum distance estimators,
+maximum likelihood estimators, \proglang{S}4 classes, \proglang{S}4 methods}
+\Plainkeywords{probability models, minimum criterion estimators, maximum likelihood 
+estimators, minimum distance estimators, S4 classes, S4 methods} %% without formatting
+%% at least one keyword must be supplied
+
+%% publication information
+%% NOTE: Typically, this can be left commented and will be filled out by the technical editor
+%% \Volume{13}
+%% \Issue{9}
+%% \Month{September}
+%% \Year{2004}
+%% \Submitdate{2004-09-29}
+%% \Acceptdate{2004-09-29}
+
+%% The address of (at least) one author should be given
+%% in the following format:
+\Address{
+  Matthias Kohl\\
+  Hochschule Furtwangen\\
+Fakult\"at Maschinenbau und Verfahrenstechnik\\
+Jakob-Kienzle-Strasse 17\\
+78054 Villingen-Schwenningen \\
+  E-mail: \email{Matthias.Kohl@hs-furtwangen.de}\\
+  \bigskip \\
+  Peter Ruckdeschel\\
+  TU Kaiserslautern\\ 
+  FB Mathematik\\ 
+  P.O.Box 3049\\ 
+  67653 Kaiserslautern, Germany\\
+  and\\ 
+  Fraunhofer-ITWM\\ 
+  Fraunhofer-Platz 1\\ 
+  67663 Kaiserslautern, Germany\\
+  E-mail: \email{Peter.Ruckdeschel@itwm.fraunhofer.de}
+}
+%% It is also possible to add a telephone and fax number
+%% before the e-mail in the following format:
+%% Telephone: +43/1/31336-5053
+%% Fax: +43/1/31336-734
+
+%% for those who use Sweave please include the following line (with % symbols):
+%% need no \usepackage{Sweave.sty}
+
+%% end of declarations %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
+
+
+\begin{document}
+%----------------------------------------------------------------------------
+<<library, echo=FALSE, results = hide>>=
+library(distrMod)
+distrModoptions(show.details="minimal")
+options(prompt = "R> ", continue = "+  ", width = 70, 
+        useFancyQuotes = FALSE)
+@
+%----------------------------------------------------------------------------
+\section{Introduction}
+% -----------------------------------------------------------------------------
+\subsection[Aims of package distrMod]{Aims of package \pkg{distrMod}}
+% -----------------------------------------------------------------------------
+\textbf{What is \pkg{distrMod}? }
+It is an extension package for the statistical software \proglang{R} \citep{RCore}
+and is the latest member of a family of packages, which we call the {\tt distr}-family.
+The family so far consists of packages~\pkg{distr}, \pkg{distrEx},
+\pkg{distrEllipse}, \pkg{distrSim}, \pkg{distrTEst}, \pkg{distrTeach}, and \pkg{distrDoc};
+see~\citet{distr} and \citet{distrTeach}.\\ 
+Package~\pkg{distrMod} makes extensive use of the distribution classes of 
+package~\pkg{distr} as well as the functions and methods of package~\pkg{distrEx}. 
+Its purpose is to extend support in base \proglang{R} for distributions 
+and in particular for parametric modelling by ``object oriented'' implementation of 
+probability models via several new \proglang{S}4 classes and methods; see
+Section~\ref{OOPinR} and \citet{Ch98} for more details. 
+In addition, it includes functions and methods to compute minimum criterion estimators --
+in particular, maximum likelihood (ML[E]) (i.e.\ minimum negative log-likelihood)
+and minimum distance estimators (MDE).\\ 
+Admittedly, \pkg{distrMod} is not the first package to provide infrastructure for
+ML estimation; we compete in some sense with such prominent functions as
+\code{fitdistr} from package~\pkg{MASS} \citep{V:R:02} and, already using
+the \proglang{S}4 paradigm, \code{mle} from package~\pkg{stats4}
+\citep{RCore}.\\
+Our implementation, however, goes beyond the scope of these packages, as we work
+with distribution objects and have quite general methods available to operate
+on these objects.
+
+\textbf{Who should use it? }
+It is aimed at users who want to use non-standard parametric models, allowing them 
+to either explore these models, or fit them to data by non-standard techniques. 
+The user will receive standardized output on which she/he may
+apply standard \proglang{R} functions like
+\code{plot}, \code{show}, \code{confint}, \code{profile}.\\
+By non-standard parametric models we mean models not in the list of 
+explicit models covered by \code{fitdistr}; that is, \code{"Poisson"}, \code{"beta"}, 
+\code{"cauchy"}, \code{"chi-squared"}, \code{"exponential"}, 
+\code{"gamma"}, \code{"geometric"}, \code{"lognormal"}, \code{"logistic"}, 
+\code{"negative binomial"}, \code{"normal"}, \code{"f"}, \code{"t"}, \code{"weibull"}.
+Standard as well as non-standard models can easily be implemented based on the infrastructure 
+provided by packages \pkg{distr} and \pkg{distrEx}. We will demonstrate this using 
+examples $\smkreis{2}$ and $\smkreis{4}$ specified in Section~\ref{examples}.\\
+Non-standard techniques may include minimum criterion estimation, 
+minimum distance estimation, a particular optimization routine not covered 
+by \code{optim}/\code{optimize} in the MLE case, or some explicit expression 
+for the MLE not covered by the standard class-room examples. Non-standard 
+techniques may also stand for estimation of a (differentiable) function of the 
+parameter as illustrated in example $\smkreis{3}$.\\
+Despite this flexibility, we need not modify our code to cover all this.
+In short, we are able to implement {\bf one} static algorithm which by \proglang{S}4
+method dispatch dynamically takes care of various models and optimization
+techniques, thus avoiding redundancy and simplifying maintenance. We will explain this 
+more precisely in Section~\ref{OOApproach}.\\
+All information relevant for a specific parametric model is grouped within an object of 
+class \code{ParamFamily} or subclasses for which it may for instance be of interest 
+to explore the (perhaps automatically derived, as in the case of example~$\smkreis{2}$) 
+score function and the corresponding Fisher information. The return value of the model 
+fit, an estimate of the parameter, is an object of class \code{Estimator} or subclasses 
+for which one may want to have confidence intervals, some profiling, etc. For 
+objects of these classes we provide various methods for standard \proglang{R} functions; 
+see Sections~\ref{modelsec} and \ref{estsec} for more details.
+
+\textbf{Availability } The current version of package~\pkg{distrMod} is 2.2
+and can be found on the Comprehensive \proglang{R} Archive Network at 
+\url{http://CRAN.R-project.org/package=distrMod}. The development version of the distr-family is located at R-Forge; 
+see~\citet{RForge}.
+% -----------------------------------------------------------------------------
+\subsection{Running examples}\label{examples}
+% -----------------------------------------------------------------------------
+For illustrating the functionality of \pkg{distrMod}, we will use four running
+examples for each of which we assume i.i.d.\ observations $X_i$ ($i=1,\ldots,n$, $n\in\N$) 
+distributed according to the respective $P_\theta$:
+\begin{itemize}
+\item[$\smkreis{1}$] the one-dimensional normal location family
+${\cal P}:=\{P_\theta\,|\, \theta\in\R\}$ with $P_\theta = {\cal N}(\theta,1)$. 
+This model is $L_2$-differentiable (i.e.\ smoothly parametrized) with scores
+$\Lambda_\theta(x)=x-\theta$.
+\item[$\smkreis{2}$]
+a one-dimensional location and scale family
+${\cal P}:=\{P_{\theta}\,|\, \theta=(\mu,\sigma)'\in\R\times(0,\infty)\}$ with
+some non-standard $P_\theta$. More precisely, we assume
+\begin{equation}
+   X_i=\mu+\sigma V_i \qquad \mbox{for}\quad V_i\stackrel{\rm i.i.d.}{\sim}  P
+\end{equation}
+where $P=P_{\theta_0}$ ($\theta_0=(0,1)'$) is the following central distribution
+\begin{equation}
+P(dx)=p(x)\,dx,\qquad \;\;p(x)\propto e^{-|x|^3}
+\end{equation}
+${\cal P}$ is $L_2$-differentiable with scores
+$\Lambda_\theta(x)= (3\,{\rm sign}(y) y^2, 3 |y|^3-1)/\sigma$
+for $y=(x-\mu)/\sigma$.
+\item[$\smkreis{3}$]
+the gamma family
+${\cal P}:=\{P_{\theta}={\rm gamma}(\theta)\,|\, \theta=(\beta,\xi)'
+  \in(0,\infty)^2\}$ for scale parameter $\beta$ and shape parameter $\xi$.
+This model is $L_2$-differentiable with scores
+$\Lambda_\theta(x)= \big(\frac{y - \xi}\beta,
+                     \log(y) - (\log \Gamma)'(\xi)\big)$
+for $y=x/\beta$ and
+\item[$\smkreis{4}$] a censored Poisson family:
+${\cal P}:=\{P_{\theta}\,|\, \theta\in(0,\infty)\}$
+where $P_{\theta}={\cal L}_\theta(X | X>1)$ for $X \sim {\rm Pois}(\theta)$,
+that is, we only observe counts greater than or equal to $2$ in a Poisson model.
+This model is $L_2$-differentiable with scores
+$\Lambda_\theta(x)= x/\theta - (1- e^{-\theta})/(1-(1+\theta) e^{-\theta})$.
+\end{itemize}
+We will estimate $\theta$ from $X_1,\ldots, X_n$ with mean squared
+error (MSE) as risk. This makes the MLE asymptotically optimal. Other considerations, 
+in particular robustness issues, suggest that one should also look at alternatives. 
+For the sake of this paper, we will limit ourselves to one alternative in each model. 
+In model~$\smkreis{1}$ we will use the median as most-robust estimator, in
+model~$\smkreis{2}$ we will look at the very robust estimator
+$\theta_r=({\rm median},{\rm mad})$ (mad = suitably standardized MAD), while
+in models~$\smkreis{3}$ and $\smkreis{4}$ we use minimum distance
+estimators (MDE) with respect to the Cram\'er-von-Mises distance.\\
+The four examples were chosen for the following reasons:\newline
+In Example~$\smkreis{1}$, nothing has to be redefined. Estimation by MDE or
+MLE is straightforward: We define an object of class \code{NormLocationFamily}
+and generate some data.
+%----------------------------------------------------------------------------
+<<locNorm, eval = TRUE>>=
+(N <- NormLocationFamily(mean = 3))
+x <- r(N)(20)
+@
+%----------------------------------------------------------------------------
+We compute the MLE and the Cram\'er-von-Mises MDE using some (preliminary) 
+method for the computation of the asymptotic covariance of the MDE.
+%----------------------------------------------------------------------------
+<<locNorm1, eval = TRUE>>=
+MLEstimator(x,N)
+MDEstimator(x,N,distance=CvMDist,
+            asvar.fct = distrMod:::.CvMMDCovariance)
+@
+%----------------------------------------------------------------------------
+Example~$\smkreis{2}$ illustrates the use of a ``parametric group model''
+in the sense of \citet[Section~1.3, pp.~19--26]{Leh:83}, and as 
+this model is quite non-standard, we use it to demonstrate some capabilities
+of our generating functions. Example~$\smkreis{3}$
+illustrates the use of a predefined \proglang{S}4 class; specifically, 
+class \code{GammaFamily}. In this case there are various equivalent 
+parameterizations, which in our setup can easily be transformed into each 
+other; see Section~\ref{ParamFamP}.  
+Example~$\smkreis{4}$, also available in package~\pkg{distrMod}
+as demo \code{censoredPois}, illustrates a situation where we have to set
+up a model completely anew.
+% -----------------------------------------------------------------------------
+\subsection{Organization of the paper}
+% -----------------------------------------------------------------------------
+We first explain some aspects of the specific way object orientation (OO) is 
+realized in \proglang{R}. We then present the new model \proglang{S}4 classes 
+and demonstrate how package~\pkg{distrMod} can be used to compute minimum 
+criterion estimators. The global options which may be set in our package and 
+some general programming practices are given in the appendix.
+% -----------------------------------------------------------------------------
+\section[Object orientation in S4]{Object orientation in \proglang{S}4}   \label{OOPinR}
+% -----------------------------------------------------------------------------
+In \proglang{R}, OO is realized in the \proglang{S}3 class concept as introduced
+in \citet{Cham:93a,Cham:93b} and by its successor, the \proglang{S}4 class concept, 
+as developed in \citet{Ch98,Ch99,Ch01} and described in detail in \citet{Ch08}. 
+Of course, \citet[Section~5]{RLangDef} may also serve as a reference.\\
+An account of some of the differences from standard OO may be found in \citet{ChL01},
+\citet{Beng:03}, and \citet{Ch06}.\\
+Using the terminology of \citet{Beng:03}, mainstream software engineering (e.g.~\proglang{C++}) 
+uses \emph{COOP\/} (class-object-oriented programming) style whereas the \proglang{S}3/\proglang{S}4 
+concept of \proglang{R} uses \emph{FOOP\/} (function-object-oriented programming) style 
+or, according to \citet{Ch06}, at least \emph{F+COOP} (i.e.\ both styles).\\
+In COOP style, methods providing access to or manipulation of an object are part 
+of the object, while in FOOP style, they are not, but belong 
+to \emph{generic functions\/} -- abstract functions which allow for arguments of 
+varying type/class. A dispatching mechanism then decides at run-time which method
+best fits the \emph{signature\/} of the function, that is, the types/classes of
+(a certain subset of) its arguments. {\tt C++} has a similar concept, 
+``overloaded functions'' as discussed by \citet[Section 4.6.6]{Stro:92}.
+\par
+In line with the different design of OO within \proglang{R}, some 
+notions have different names in \proglang{R} context as well. This is in part 
+justified by slightly different meanings; e.g., members in \proglang{R} are 
+called \emph{slots}, and constructors are called \emph{generating functions}. 
+In the case of the latter, the notion does mean something similar but not identical 
+to a constructor: a generating function according to \citet{Ch01} is a user-friendly 
+wrapper to a call to \code{new()}, the actual constructor in the \proglang{S}4 system. 
+In general it does not have the same flexibility as the full-fledged constructor 
+in that some calling possibilities will still be reserved to a call to \code{new()}.
+\par
+Following the (partial) FOOP style of \proglang{R}, we sometimes have to
+deviate from best practice in mainstream OO, namely documenting the methods of
+each class hierarchy together as a group. Instead we document the
+corresponding particular methods in the help file for the corresponding 
+generic.\\
+Although the use of OO in the \proglang{R} context will certainly not be able
+to gain benefits using object identity, information hiding and encapsulation,
+the mere use of inheritance and polymorphism does provide advantages:\newline
+Polymorphism is a very important feature in interactively used languages
+as the user will not have to remember a lot of different function names but
+instead is able to say \code{plot} to many different objects of classes
+among which there need not be any inheritance structure.
+On the other hand, inheritance will make it possible to have a general (default) 
+code which applies if nothing else is known while still any user may register 
+his own particular method for a derived class, without interference of the authors 
+of the class and generic function definitions. Of course, this could also be 
+achieved by functional arguments, but using method dispatch we have much more 
+control over the input and output types of the corresponding function. This
+is important, as common \proglang{R} functions neither have type checking for 
+input arguments nor for return values. In addition to simple type checking 
+we could even impose some refined checking by means of the \proglang{S}4 
+validity checking.
+% -----------------------------------------------------------------------------
+\section[S4 classes: Models and parameters]{\proglang{S}4 classes: Models and parameters}\label{modelsec}
+% -----------------------------------------------------------------------------
+% -----------------------------------------------------------------------------
+\subsection{Model classes}
+% -----------------------------------------------------------------------------
+\textbf{Models in Statistics and in \proglang{R} }
+In Statistics, a probability model, or model for short, is a family of probability
+distributions; more precisely, it is a subset ${\cal P}\subset {\cal M}_1({\cal A})$
+of all probability measures on some sample space $(\Omega,{\cal A})$.
+In case we are dealing with a parametric model, there is a finite-dimensional 
+parameter domain $\Theta$ (usually an open subset of $\R^k$) and a mapping 
+$\theta\mapsto P_\theta$, assigning each parameter $\theta\in\Theta$ a corresponding 
+member of the family ${\cal P}$. If this parametrization is smooth, more specifically
+$L_2$-differentiable, see~\citet[Section~2.3]{Ri94}, we additionally have an 
+$L_2$-derivative $\Lambda_\theta$ for each $\theta\in\Theta$; that is, some random 
+variable (RV) in $L_2(P_\theta)$ and its corresponding (co)variance, the Fisher 
+information ${\cal I}_\theta$. In most cases, 
+$\Lambda_\theta=\frac{d}{d\theta} \log p_\theta$ (the classical scores) 
+for $p_\theta$ the density of $P_\theta$ w.r.t.\ Lebesgue or counting measure.
+\par
+One of the strengths of \proglang{R} (or more accurately of \proglang{S}) right 
+from the introduction of \proglang{S}3 in \citet{B:C:W:88} is that models, more 
+specifically [generalized] linear models (see functions \code{lm} and \code{glm} 
+in package \pkg{stats}) may be explicitly formulated in terms of the language.
+The key advantage of this is grouping of relevant information, re-usability,
+and of course the formula interface (see \code{formula} in package \pkg{stats}) 
+by which computations \emph{on} the model are possible in \proglang{S}.\\
+From a mathematical point of view however, these models are somewhat incomplete: In 
+the case of \code{lm}, there is an implicit assumption of Gaussian errors, while in 
+the case of \code{glm} only a limited number of explicit families and explicit link 
+functions are ``hard-coded''. So in fact, again the user will not enter any 
+distributional assumption.\\
+Other models like the more elementary location and scale family (with general central 
+distribution) so far have not even been implemented.
+\par
+With our distribution classes available from package \pkg{distr} we go ahead 
+in this direction in package \pkg{distrMod}, although admittedly, up to now, 
+we have not yet implemented any regression model or integrated any formula 
+interface, but this will hopefully be done in the future.
+
+\textbf{Packages \pkg{distr} and \pkg{distrEx} } Much of our infrastructure
+relies on our \proglang{R} packages \pkg{distr} and \pkg{distrEx} available on
+\href{http://cran.r-project.org/}{\tt CRAN}.
+Package \pkg{distr}, see~\citet{distr,distrTeach}, aims to provide a conceptual
+treatment of distributions by means of \proglang{S}4 classes. A mother class 
+\code{Distribution} is introduced with slots for a parameter and for functions 
+\code{ r}, \code{ d}, \code{ p} and \code{ q} for simulation, for 
+evaluation of density, c.d.f.\ and quantile function of the corresponding 
+distribution, respectively. All distributions of the \pkg{stats} package are implemented as 
+subclasses of either \code{AbscontDistribution} or \code{DiscreteDistribution},
+which themselves are again subclasses of \code{UnivariateDistribution}.
+As usual in stochastics, we identify distributions with RVs distributed accordingly.
+By means of these classes, we may automatically generate new objects of these 
+classes for the laws of RVs under standard univariate mathematical transformations 
+and under standard bivariate arithmetical operations acting on independent RVs. 
+Here is a short example: We create objects of ${\cal N}\,(2,1.69)$ and 
+${\rm Pois}\,(1.2)$ and convolve an affine transformation of them.
+%----------------------------------------------------------------------------
+<<exam1, eval = TRUE, fig = TRUE, height = 4, include = FALSE>>=
+library(distr)
+N <- Norm(mean = 2, sd = 1.3)
+P <- Pois(lambda = 1.2)
+Z <- 2*N + 3 + P
+Z
+plot(Z, cex.inner = 0.9)
+@
+%----------------------------------------------------------------------------
+The new distribution has corresponding slots \code{ r}, \code{ d}, \code{ p} 
+and \code{ q}.
+%----------------------------------------------------------------------------
+<<exam11, eval = TRUE>>=
+p(Z)(0.4)
+q(Z)(0.3)
+r(Z)(5)
+@
+%----------------------------------------------------------------------------
+\begin{figure}[!ht]
+  \begin{center}
+    \includegraphics[width=12cm]{distrMod-exam1.pdf}%
+    \caption{Plot of \code{Z}, an object of class \code{AbscontDistribution}.}
+  \end{center}
+\end{figure}%
+\par
+Package~\pkg{distrEx} extends \pkg{distr} by covering statistical functionals like
+expectation, variance or the median evaluated at distributions, as well as
+distances between distributions and basic support for multivariate and
+conditional distributions. E.g., using the distributions generated above, we 
+can write
+%----------------------------------------------------------------------------
+<<expectation, eval = TRUE>>=
+library(distrEx)
+E(N)
+E(P)
+E(Z)
+E(Z, fun=function(x) sin(x))
+@
+where \code{E(N)} and \code{E(P)} return the analytic value whereas the last
+two calls invoke some numerical computations.
+
+\textbf{Models in \pkg{distrMod} }
+Based on class \code{Distribution} of package \pkg{distr} and its subclasses
+we define classes for families of probability measures in package \pkg{distrMod}. 
+So far, we specialized this to parametric families of probability measures in class 
+\code{ParamFamily}; see~Figure~\ref{figPF}. The concept, however, also allows
+the derivation of subclasses for other (e.g.\ semiparametric) families of 
+probability measures. In the case of $L_2$-differentiable parametric families
+we introduce several additional slots for scores $\Lambda_\theta$ and Fisher 
+information ${\cal I}_\theta$. In particular, slot \code{L2deriv} for the score 
+function is of class \code{EuclRandVarList}, a class defined in package~\pkg{RandVar} 
+\citep{RandVar}.
+\begin{figure}[!ht]
+  \begin{center}
+    \includegraphics[width=12cm]{ProbFamily.pdf}%
+    \caption{\label{figPF}Inheritance relations and slots of the
+    corresponding \mbox{(sub-)}classes for \code{ProbFamily} where we do not
+    repeat inherited slots.}
+  \end{center}
+\end{figure}%
+The mother class \code{ProbFamily} is virtual, so objects can only be created for
+its derived classes.
+\par
+Class \code{ParamFamily} and all its subclasses have pairs of slots: actual
+value slots and functional slots, the latter following the COOP paradigm.
+The actual value slots like \code{distribution}, \code{param}, \code{L2deriv},
+and \code{FisherInfo} are used for computations at a certain value of
+the parameter, while functional slots like \code{modifyParam}, \code{L2deriv.fct},
+and \code{FisherInfo.fct} provide mappings $\Theta\to {\cal M}_1(\B)$,
+$\theta\mapsto P_\theta$, $\Theta \to \bigcup_{\theta\in\Theta} L_2(P_\theta)$,
+$\theta \mapsto \Lambda_\theta$, and $\Theta \to \R^{k \times k}$,
+$\theta \mapsto {\cal I}_\theta$, respectively, and are needed to modify the
+actual parameter of the model, or to move the model from one parameter value
+to another. The different modifications due after a change in the parameter
+are grouped in \proglang{S}4 method \code{modifyModel}.
+
+\textbf{Generating functions } Generating objects of class
+\code{L2ParamFamily} and derived classes involves filling a considerable
+number of slots. Hence, for convenience, there are several user-friendly
+generating functions as displayed in Table~\ref{TabPF}.
+\begin{table}[!ht]
+  \begin{center}\small
+    \begin{tabular}{|r|l|}
+    \hline
+      \textbf{Name} & \textbf{Family}\\ \hline
+      \code{ParamFamily} & general parametric family\\
+      \code{L2ParamFamily} & general $L_2$ differentiable parametric family \\
+      \code{L2LocationFamily} & general $L_2$ differentiable location family \\
+      \code{L2LocationUnknownScaleFamily} & general $L_2$ differentiable location family\\
+                                          & with unknown scale (nuisance parameter)\\
+      \code{L2ScaleFamily} & general $L_2$ differentiable scale family\\
+      \code{L2ScaleUnknownLocationFamily} & general $L_2$ differentiable scale family\\
+                                          & with unknown location (nuisance parameter)\\
+      \code{L2LocationScaleFamily} & general $L_2$ differentiable location and\\
+                                   & scale family\\
+      \code{BetaFamily} & beta family\\
+      \code{BinomFamily} & binomial family\\
+      \code{GammaFamily} & gamma family\\
+      \code{PoisFamily} & Poisson family\\
+      \code{ExpScaleFamily} & exponential scale family\\
+      \code{GumbelLocationFamily} & Gumbel location family\\
+      \code{LnormScaleFamily} & log-normal scale family\\
+      \code{NormLocationFamily} & normal location family\\
+      \code{NormLocationUnknownScaleFamily} & normal location family with\\
+                                            & unknown scale (nuisance parameter)\\
+      \code{NormScaleFamily} & normal scale family\\
+      \code{NormScaleUnknownLocationFamily} & normal scale family with\\
+                                            & unknown location (nuisance parameter)\\
+      \code{NormLocationScaleFamily} & normal location and scale family\\
+    \hline
+    \end{tabular}
+    \caption{\label{TabPF}Generating functions for \code{ParamFamily} and
+    derived classes.}
+  \end{center}
+\end{table}%
+
+\textbf{Examples }
+In order to follow our running example~$\smkreis{2}$, consider the 
+following code: we first define the (non-standard) central distribution \code{myD} 
+and then generate the location and scale model. For the central distribution, 
+the corresponding standardizing constant could be expressed in closed form 
+in terms of the gamma function, but instead we present the more general approach 
+in which (by argument \code{withS}) standardization to mass $1$ is enforced 
+by numerical integration.
+<<centralD, eval = TRUE>>=
+myD <- AbscontDistribution(d = function(x) exp(-abs(x)^3), 
+                           withS = TRUE)
+@
[TRUNCATED]

To get the complete diff run:
    svnlook diff /svnroot/distr -r 767
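
The layout change recorded in the log (content of inst/doc/ moved to a new
vignettes/ folder, inst/doc removed, LazyLoad dropped from DESCRIPTION) can be
sketched on a plain directory tree as follows. This is only an illustration
under assumptions: the function name restructure_pkg and the throwaway demo
package are hypothetical, and plain mv/sed stand in for the Subversion
operations the actual commit used.

```shell
#!/bin/sh
# Hypothetical helper (not part of the commit): apply the layout change
# described in the log to a package directory given as first argument.
# Assumes inst/doc holds no hidden files; otherwise rmdir will refuse.
restructure_pkg() {
    pkg=$1
    mkdir -p "$pkg/vignettes"
    # Move the former inst/doc content (Rnw, bib, figure PDFs) to vignettes/.
    for f in "$pkg"/inst/doc/*; do
        if [ -e "$f" ]; then
            mv "$f" "$pkg/vignettes/"
        fi
    done
    rmdir "$pkg/inst/doc"
    # Drop the LazyLoad field from DESCRIPTION, leaving other fields untouched.
    sed '/^LazyLoad:/d' "$pkg/DESCRIPTION" > "$pkg/DESCRIPTION.tmp" &&
        mv "$pkg/DESCRIPTION.tmp" "$pkg/DESCRIPTION"
}

# Demo on a throwaway skeleton, so no real working copy is touched.
demo=$(mktemp -d)
mkdir -p "$demo/inst/doc"
printf 'Package: distrMod\nByteCompile: yes\nLazyLoad: yes\nLicense: LGPL-3\n' \
    > "$demo/DESCRIPTION"
touch "$demo/inst/doc/distrMod.Rnw" "$demo/inst/doc/distrMod.bib"
restructure_pkg "$demo"
ls "$demo/vignettes"
cat "$demo/DESCRIPTION"
```

In a real Subversion working copy one would use the corresponding svn
subcommands instead, so that file history is preserved, as the "Copied: ...
(from rev 742, ...)" entries above indicate.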

