[Rsiena-commits] r166 - in pkg: RSiena RSienaTest RSienaTest/doc

Tue Aug 2 18:12:12 CEST 2011

Author: tomsnijders
Date: 2011-08-02 18:12:12 +0200 (Tue, 02 Aug 2011)
New Revision: 166

Modified:
   pkg/RSiena/changeLog
   pkg/RSienaTest/changeLog
   pkg/RSienaTest/doc/RSiena.bib
   pkg/RSienaTest/doc/RSiena_Manual.tex
Log:
Corrections to the manual and small format corrections to bibtex file.

Modified: pkg/RSiena/changeLog
===================================================================

--- pkg/RSiena/changeLog	2011-07-28 08:49:47 UTC (rev 165)
+++ pkg/RSiena/changeLog	2011-08-02 16:12:12 UTC (rev 166)
@@ -1,3 +1,11 @@
+2011-08-02 R-forge revision 166
+   * doc/RSiena_Manual.tex: corrections
+   * doc/RSiena.bib: small format corrections
+
+2011-07-28 R-forge revision 165
+   * doc/RSiena_Manual.tex: corrections
+   * doc/RSiena.bib: a few additions
+
 2011-07-24 R-forge revision 164
 
 	* R/effects.r: change default behavior effects to include quadratic

Modified: pkg/RSienaTest/changeLog
===================================================================
--- pkg/RSienaTest/changeLog	2011-07-28 08:49:47 UTC (rev 165)
+++ pkg/RSienaTest/changeLog	2011-08-02 16:12:12 UTC (rev 166)
@@ -1,3 +1,11 @@
+2011-08-02 R-forge revision 166
+   * doc/RSiena_Manual.tex: corrections
+   * doc/RSiena.bib: small format corrections
+
+2011-07-28 R-forge revision 165
+   * doc/RSiena_Manual.tex: corrections
+   * doc/RSiena.bib: a few additions
+
 2011-07-24 R-forge revision 164
 
 	* R/effects.r: change default behavior effects to include quadratic

Modified: pkg/RSienaTest/doc/RSiena.bib
===================================================================
--- pkg/RSienaTest/doc/RSiena.bib	2011-07-28 08:49:47 UTC (rev 165)
+++ pkg/RSienaTest/doc/RSiena.bib	2011-08-02 16:12:12 UTC (rev 166)
@@ -1794,9 +1794,9 @@
     JOURNAL={Advances in Data Analysis and Computation},
     YEAR={2011},
     doi = {DOI: 10.1016/j.socnet.2010.03.001},
-    URL = {http://www.stats.ox.ac.uk/~lospinos}
-  volume =   5,
-  pages =    {147--176}}
+    URL = {http://www.stats.ox.ac.uk/~lospinos},
+    volume =   5,
+    pages =    {147--176}}
 
 @Article{Lospinoso2011b,
     TITLE = {Goodness of Fit for Stochastic Actor Oriented Models},
@@ -2958,7 +2958,7 @@
   editor =    {Kees van Montfort and Han Oud and Albert Satorra}}
 
 @Article{SnijdersEA10a,
-  author =       {T.A.B. Snijders and Johan H. Koskinen and Michael
+  author =       {T. A. B. Snijders and Johan H. Koskinen and Michael
 Schweinberger},
   title =        {Maximum Likelihood Estimation for Social Network Dynamics},
   journal =      {Annals of Applied Statistics},

Modified: pkg/RSienaTest/doc/RSiena_Manual.tex
===================================================================
--- pkg/RSienaTest/doc/RSiena_Manual.tex	2011-07-28 08:49:47 UTC (rev 165)
+++ pkg/RSienaTest/doc/RSiena_Manual.tex	2011-08-02 16:12:12 UTC (rev 166)
@@ -591,6 +591,8 @@
   or a \sfn{Siena network file} (an edgelist,
   containing three or four columns: (from, to, value, wave (optional)), not yet
   tested for dyadic covariates!).
+  By specifying the waves in the fourth column in the \sfn{Siena} format,
+  one file can be used to contain data for all the waves.
 %(\sfn{.paj} file support will be added
   %later, with a specific button to load a complete project.)
 \item[\sfn{Period(s)}] Only relevant for networks and dyadic covariates. All
@@ -857,7 +859,7 @@
 \label{S_multipleProcesses}
 \begin{enumerate}
 \item
-If multiple processes are available, then using
+If multiple processors are available, then using
 multiple processes can speed up the estimation in \sfn{siena07}.
 
 \item In Phases 1 and 3 the simulations are performed in parallel. In Phase 2,
@@ -1019,8 +1021,8 @@
 and then possibly continue with new
 model specifications followed by estimation or simulation.
 
-The main output is written to a text file named \textsf{{\em
-pname}.out}, where \textsf{{\em pname}} is the name
+The main output is written to a text file named
+\textsf{\textsl{pname}.out}, where \textsf{\textsl{pname}} is the name
 specified in the call of \textsf{sienaModelCreate()}.
 
 \newpage
@@ -1132,6 +1134,8 @@
       Like the Pajek format, this has the advantage that absent ties
       (tie variables with the value 0) do not need to be mentioned
       in the data file.
+      By specifying the waves in the fourth column in the \sfn{Siena} format,
+      one file can be used to contain data for all the waves.
 \end{enumerate}
 
 Missing values must be indicated in the way usual for \Rn,
@@ -1207,6 +1211,8 @@
 # put edge values in desired places
 adj[edges[, 1:2]] <- edges[, 3]
 \end{verbatim}
+Note that this starts with a matrix having all 0 entries,
+and results in a matrix with no 0 entries at all.
 To check the results, after doing these two operations, the command
 \begin{verbatim}
 length(which(a != adj))
@@ -1340,7 +1346,8 @@
 all observation moments, and has the role of an independent
 variable. Changing covariates, on the other hand, have one such
 value for each period between measurement points. If there are $M$
-waves of network data, this covers $M-1$ periods, and accordingly,
+waves (i.e., observation moments) of network data,
+this covers $M-1$ periods, and accordingly,
 for specifying a single changing dyadic covariate, $M-1$ data files
 with covariate matrices are needed.
 
@@ -1649,18 +1656,27 @@
 In some data sets, a dependent variable only increases, or only decreases.
 For a network, this means that ties can be created but not terminated,
 or the other way around.
-This will be noted by \RS and mentioned in the output file generated by
-\textsf{print01Report}.
-This constraint then is also respected in the simulations. This is represented
+This may be the case for all periods (a period is defined by the
+two consecutive observation waves at its start and end points)
+or just in some of the periods.
+\RS will note when a dependent variable only increases or only decreases,
+and mention this in the output file generated by \textsf{print01Report}.
+This constraint then is also respected in the simulations, in the periods
+where it is observed.
+This is represented
 internally by a variable called \texttt{uponly} indicating that the
 dependent variable cannot decrease,
 and a variable \texttt{downonly} indicating that the
 dependent variable cannot increase.
 
-In such cases, the outdegree effect for a dependent network variable,
-and the linear shape effect for a dependent behavior variable (these effects
-are defined below), are not identified and should be dropped from the
-model specification.
+If a dependent variable is only increasing or only decreasing
+for all periods, then two of the basic effects defined below
+are not identified.
+These are the outdegree effect for a dependent network variable,
+and the linear shape effect for a dependent behavior variable;
+these effects define the balance between the probabilities of
+going up and going down.
+These effects then are dropped automatically from the effects object.
 
 
 \newpage
@@ -1689,7 +1705,7 @@
 
 
 \begin{itemize}
-\item {\em rate function}\\
+\item \emph{rate function}\\
 The rate function models the speed by which the dependent variable
 changes; more precisely: the speed by which each network actor
 gets an opportunity for changing her score on the dependent
@@ -1796,7 +1812,7 @@
 At any given moment, let the network be denoted $x^0$.
 The rate function for actor $i$ is denoted $\lambda_i(x)$;
 the evaluation function is $f_i(x)$; the creation function is $c_i(x)$;
-and the endowment function is $g_i(x)$.
+and the endowment function is $e_i(x)$.
 
 At any given moment, let the current network be denoted $x^0$.
 The time duration until the next opportunity of change
@@ -1853,7 +1869,7 @@
 \[
    u_i(x^0, x) \,=\, \big(f_i(x) - f_i(x^0)\big)
                    \,+\,  \Delta^+(x^0, x)\,\big(c_i(x) - c_i(x^0)\big)
-                   \,+\,  \Delta^-(x^0, x)\,\big(g_i(x) - g_i(x^0)\big)    \ .
+                   \,+\,  \Delta^-(x^0, x)\,\big(e_i(x) - e_i(x^0)\big)    \ .
 \]
 This shows that the change in creation function plays a role
 only if a tie is created ($\Delta^+(x^0,x) = 1$), and the change in
@@ -2693,7 +2709,9 @@
 \end{description}
 Whether an effect is an ego effect or a dyadic effect is defined by
 the column \texttt{interactionType} in the effects data frame.
-You can see the values of the \texttt{interactionType} by requesting,
+This column can be inspected by using the \sfn{fix} editor.
+Another way of seeing
+the values of the \texttt{interactionType} is by requesting,
 if the network variable is called, e.g., \texttt{friendship}, the following:
 \begin{verbatim}
 cbind(myeff[myeff$name=="friendship","effectName"],
@@ -3089,7 +3107,8 @@
 These are interactions on the ego level, in line with the
 actor-oriented nature of the model.
 
-Only selected behavior effects can be interacted with each other.
+There are some restrictions on what is permitted
+as interactions between behavior effects.
 Of course,  they should refer to the same dependent behavior variable.
 What is permitted depends on the so-called \texttt{interactionType} of the
 effects, which for behavior effects can be \texttt{OK}\footnote{The value
@@ -3211,9 +3230,13 @@
 \subsection{Time heterogeneity in model parameters}
 \label{S_timetest1}
 
-Currently for the case of a one mode network, you can include
-time heterogeneous parameters in your model. Consider the reformulation of
-the evaluation function into
+When working with two or more periods, i.e., three or more waves,
+there is the question whether parameters are constant across the periods.
+This can be tested by the \sfn{sienaTimeTest} function, as explained
+in Section~\ref{S_timetest2}.
+To specify a model with time heterogeneous parameters, the function
+\sfn{includeTimeDummy} can be used, as follows.
+Consider the reformulation of the evaluation function into
 \begin{align}
 f^{(m)}_{ij}(\mathbf{x})= \sum_k \Big(\beta_k + \delta_k^{(m)} h_k^{(m)}\Big)
                               \,      s_{ik}\big(\mathbf{x}(i \leadsto j)\big)
@@ -3231,14 +3254,12 @@
 which would add three time dummy terms to each effect listed in the function.
 
 We recommend that you start with simple models,
-and use the score type test for
-assessing heterogeneity, i.e., if \texttt{ans}
-is the object of results produced
-by \texttt{siena07},
-\begin{verbatim}
-tt <- sienaTimeTest(ans)
-\end{verbatim}
-to decide which dummy terms to include.
+and base the decision to include time heterogeneous parameters
+on your theoretical and empirical insight into the data
+(e.g., whether the different waves cover a period where the importance
+of some of the modeled `mechanisms' may have changed) and
+the score type test that is implemented in the \sfn{sienaTimeTest} function,
+see Section~\ref{S_timetest2}.
 
 See \citet{Lospinoso2011} for a technical presentation of how the test works,
 and \citet{Lospinoso2010b} for a walkthrough on model selection.
@@ -3784,7 +3805,6 @@
 
 \subsubsection{Initial Values and Convergence Check}
 
-\hyperlink{T_initial}{
-The \emph{initial values} can be given in three ways.}
+The \emph{initial values} can be given in three ways.
 \begin{enumerate}
 \item If \texttt{useStdInits = TRUE} is used in the call of
@@ -3810,8 +3830,14 @@
 \begin{verbatim}
 myeff$initialValue[myeff$include]
 \end{verbatim}
-      \hyperlink{T_change_init}{Below, we give a way to change}
-       the initial values contained in the \textsf{sienaEffects} object.
+      Changing these values is hardly ever necessary, because the
+      parameter \textsf{prevAns}, as explained in the next item,
+      does this behind the scenes.
+      If one does wish to change the initial values contained in
+      the effects object, this can be done by the \textsf{setEffect}
+      function in which the \textsf{initialValue} then must be set.
+      Of course one could also operate directly on the vector
+      \texttt{myeff\$initialValue[myeff\$include]}.
 \item If \texttt{useStdInits = FALSE} and the \textsf{prevAns}
       (`previous answer')
       parameter is used, such as in
@@ -3895,46 +3921,7 @@
 Large values of the averages and standard deviations are
 in themselves not at all a reason for concern; only the
 $t$-ratio is important.
-\medskip
 
-\noindent
-\hypertarget{T_change_init}
-        {\emph{Changing the starting values in the effects object}}
-\smallskip
-
-In a \textsf{sienaEffects} object, which here we shall denote
-by the name \texttt{myeff},
-the initial values are contained in the column \texttt{myeff\$initialValue}.
-To change them, in most cases it is most convenient to access only the values
-for effects included in the model.
-Unless time dummies have been requested using \textsf{sienaTimeFix},
-the estimates obtained from a previous fit can be included as
-initial values by a command such as
-\begin{verbatim}
-myeff$initialValue[results1$requestedEffects$effectNumber] <-
-                   results1$theta
-\end{verbatim}
-where \texttt{results1} is the \textsf{sienaFit} object produced by an earlier
-estimation run. The column with name \textsf{theta} contains the parameter
-estimates.  Of course, instead of \texttt{results1\$theta} one could use any
-other desired vector of the correct length.
-
-Warning: this code will not work if you have requested time dummies
-using \textsf{sienaTimeFix}. In such more complicated cases a
-suitable \textsf{match} command can be used.
-
-Another way is to select the parameters for which initial values are
-changed not by the result of a fit, but by the effects object itself.
-To do this, one may assign the desired value to the vector
-\begin{verbatim}
-myeff$initialValue[myeff$include]
-\end{verbatim}
-The rows of \texttt{myeff} for which \texttt{myeff\$include} is true
-are the same rows that are listed by requesting
-\begin{verbatim}
-myeff
-\end{verbatim}
-
 \subsection{Some important components of the \textsf{sienaFit} object}
 \label{S_fitcomp}
 
@@ -3942,26 +3929,30 @@
 useful to know about the following components of  \textsf{sienaFit} objects.
 Suppose the object is called \texttt{ans}. Some of the components
 are the following.
-Further details are in the help file for \textsf{siena07}.
+Further details are in the help file for \textsf{siena07}
+(which is kept more up to date than this manual).
 
 \begin{tabbing}
  \texttt{ans\$theta  }\hspace{3em}    \=  estimates  \\
-                                       \> (but not for the rate parameter used for conditioning;\\
-                                       \> if time dummies were requested using \textsf{sienaTimeFix},\\
-                                       \> these are also in \texttt{theta}).           \\
+                                      \> (but not for the rate parameter used
+                                                       for conditioning;\\
+                                      \> if time dummies were requested using
+                                            \textsf{sienaTimeFix},\\
+                                      \> these are also in \texttt{theta}). \\
  \texttt{ans\$covtheta   }            \> covariance matrix of the estimates \\
  \texttt{ans\$pp}                     \> number of parameters  \\
- \texttt{ans\$targets }               \> targets (observed statistics) for Method of Moments estimation  \\
+ \texttt{ans\$targets }               \> targets (observed statistics)
+                                         for Method of Moments estimation  \\
  \texttt{ans\$tconv   }               \> $t$-ratios for convergence  \\
  \texttt{ans\$sf    }                 \>  generated statistics in Phase 3 \\
  \texttt{ans\$msf   }                 \>  covariance matrix of
                                            \texttt{ans\$sf    } \\
  \texttt{ans\$dfra  }                 \> estimated derivative of expected
-                                            statistics w.r.t.\ parameters  \\
- \texttt{ans\$effects }               \> effects object used in the call
-                                           of \textsf{siena07}  \\
+                                            statistics w.r.t.\ parameters,
+                                         for Method of Moments estimation  \\
  \texttt{ans\$sims }               \> simulated values of dependent variables
-                                      in Phase 3 of the algorithm      \\
+                                      in Phase 3 of the algorithm
+                                         for Method of Moments estimation  \\
                                     \>     (see Section~\ref{S_sims}),
                                         if \textsf{returnDeps = TRUE}
                                         in the call of \textsf{siena07}.
@@ -3983,7 +3974,7 @@
 \begin{verbatim}
 # Compute the covariance matrix of the generated statistics
 print(covsf <- cov(ans$sf))
-# This is the same as ans$msf
+# This is the same as ans$msf, provided there are no fixed parameters.
 # The means and standard deviations of the generated statistics minus targets:
 (v <- colMeans(ans$sf))
 (s <- apply(ans$sf, 2, sd))
@@ -4074,7 +4065,7 @@
         requesting just \texttt{ans} or \texttt{print(ans)}
         produces output on the \R console. The function
         \texttt{summary(ans)} produces more extensive output.
-  \item A table in latex or htlm format can be produced by the
+  \item A table in latex or html format can be produced by the
         \textsf{xtable.sienaFit} method.
         For example.
         \begin{verbatim}
@@ -4082,12 +4073,12 @@
         \end{verbatim}
         produces in the working directory a html file with the \texttt{ans}
         results in
-        tabular form. The \textsf{xtable} library has many further options.
+        tabular form. The \textsf{xtable} package has many further options.
   \item The function \textsf{siena07}
         writes an output file which is an ASCII (`text') file that can be
         read by any text editor.
-        It is called \textsf{{\em pname}.out},
-        where \textsf{{\em pname}} is the name
+        It is called \textsf{\textsl{pname}.out},
+        where \textsf{\textsl{pname}} is the name
         specified in the call of \textsf{sienaModelCreate()}.
 
         This output file is divided into sections
@@ -4438,8 +4429,7 @@
 and the coding was given correctly, and then re-specify the model
 or restart the estimation with other (e.g., 0) parameter values.
 Sometimes starting from different parameter values (e.g., the
-default values implied by the
-\hyperlink{T_initial}{model option}
+default values implied by the model option
 of ``standard initial values") will lead to a good result.
 Sometimes, however, it works better to delete this effect
 altogether from the model.
@@ -4656,10 +4646,14 @@
 \begin{verbatim}
 Wald.RSiena <- function(A, ans)
 {
+    if (is.vector(A))
+    {
+        A <- matrix(A, nrow=1)
+    }
     th    <- A %*% ans$theta
     cov   <- A %*% ans$covtheta %*% t(A)
     chisq <- t(th) %*% solve(cov) %*% th
-    df    <- nrow(as.matrix(A))
+    df    <- nrow(A)
     pval  <- 1 - pchisq(chisq,df)
     c(chisquare = chisq, df = df, pvalue = pval)
 }
@@ -5123,6 +5117,7 @@
 a list of edgelists according to the format of the \textsf{sna} package
 \citep{Butts08}, and then calculate the maximum $k$-core numbers
 in the networks.
+This assumes that a one-mode network is being analyzed.
 \begin{verbatim}
 # First define a function that extracts the desired component
 # from the list element,
@@ -5521,8 +5516,7 @@
       or from values obtained as the estimates for a simpler model
       that gave no problems.
       The initial default parameter values can be obtained
-      by choosing the  \hyperlink{T_initial}{model option}
-      ``standard initial values".   \\
+      by choosing the model option ``standard initial values".   \\
 \iffalse
       When starting estimations with Model Type 2
       (see Section~\ref{S_modeltype}), there may be some problems to
@@ -7105,7 +7099,7 @@
 (here only the endowment function is treated and not the creation function,
 but they are similar in an opposite way).
 
-\subsubsection{Network endowment function} \label{S_g}
+\subsubsection{Network endowment function} \label{S_e}
 
 The network endowment function
 is the way of modeling effects which operate in
@@ -7114,8 +7108,8 @@
 The network endowment function is zero for creation of ties,
 and is given by
 \begin{equation}
-g^{\rm net}(x) \, = \, \sum_k \gamma^{\rm net}_k \, s^{\rm net}_{ik}(x)
-                                                           \label{g_net}
+e^{\rm net}(x) \, = \, \sum_k \gamma^{\rm net}_k \, s^{\rm net}_{ik}(x)
+                                                           \label{e_net}
 \end{equation}
 for dissolution of ties.
 In this formula, the $\gamma_k^{\rm net}$
@@ -7274,7 +7268,9 @@
 Next there is a list of effects that have to do with the influence of
 the network on the behavior.
 To specify such effects in \RS using, e.g., function \sfn{includeEffects},
-it is necessary to specify the dependent behavior variable
+it is necessary\footnote{If this behavior variable is the only dependent
+variable, then this is not necessary. But this seldom happens.}
+to specify the dependent behavior variable
 in the keyword \sfn{name}
 as well as the network in the keyword \sfn{interaction1}.
 For example,
@@ -7554,7 +7550,7 @@
 \subsubsection{Behavioral endowment function}
 Also the behavioral model knows the distinction between evaluation and
 endowment effects. The formulae of the effects that can be included
-in the behavioral endowment function $g^{\rm beh}$ are the same as
+in the behavioral endowment function $e^{\rm beh}$ are the same as
 those given for the behavioral evaluation function. However, they enter
 calculation of the endowment function only when the actor considers
 decreasing his behavioral score by one unit (downward steps), not
@@ -7663,7 +7659,7 @@
 extend a new tie to $h$ is $e^{0.3} = 1.35$ times as high
 as the probability for $i$ to extend a new tie to $j$.
 
-\subsubsection{Ego -- alter selection tables}
+\subsection{Ego -- alter selection tables}
 
 When some variable $V$ occurs in several effects in the model,
 then its effects can best be understood
@@ -8071,7 +8067,7 @@
 However, the differences between the two fits are not significant,
 as can be shown e.g.\ by score-type tests.
 
-\subsubsection{Ego -- alter influence tables}
+\subsection{Ego -- alter influence tables}
 
 In quite a similar way as in the preceding section,
 from the output tables and the formulae for the effects
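
Note on the Wald.RSiena hunk above: as a reading aid, the quantity that the
function computes can be written out as follows. This is only a sketch derived
from the R code shown in the diff. The symbols $\hat\theta$ (for ans$theta),
$\hat\Sigma$ (for ans$covtheta), $W$ (for chisq) and $r$ (for nrow(A)) are
notation assumed here and are not taken from the manual.

```latex
% Wald-type statistic as computed by Wald.RSiena(A, ans);
% notation is assumed for illustration only.
\[
  W \,=\, (A\hat\theta)^{\top}\,
          \big(A\,\hat\Sigma\,A^{\top}\big)^{-1}\,
          (A\hat\theta) ,
  \qquad
  p \,=\, 1 - F_{\chi^2_r}(W) ,
\]
% F_{\chi^2_r} is the chi-squared distribution function with
% r = nrow(A) degrees of freedom (pchisq in R).
```

With the added is.vector check, a plain vector A is converted to a one-row
matrix, so that r = 1 in that case. The matrix inverse requires that
$A\,\hat\Sigma\,A^{\top}$ be nonsingular, which the code does not check.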