[Returnanalytics-commits] r3460 - in pkg/FactorAnalytics: R man

noreply at r-forge.r-project.org
Thu Jul 3 02:08:54 CEST 2014


Author: pragnya
Date: 2014-07-03 02:08:53 +0200 (Thu, 03 Jul 2014)
New Revision: 3460

Modified:
   pkg/FactorAnalytics/R/fitTSFM.R
   pkg/FactorAnalytics/R/summary.tsfm.r
   pkg/FactorAnalytics/man/fitTSFM.Rd
   pkg/FactorAnalytics/man/summary.tsfm.Rd
Log:
Added HC/HAC stats to summary.tsfm. Edited fitTSFM, covFM to accommodate differing factors across assets.

Modified: pkg/FactorAnalytics/R/fitTSFM.R
===================================================================
--- pkg/FactorAnalytics/R/fitTSFM.R	2014-07-02 06:10:51 UTC (rev 3459)
+++ pkg/FactorAnalytics/R/fitTSFM.R	2014-07-03 00:08:53 UTC (rev 3460)
@@ -21,24 +21,24 @@
 #' Criterion (AIC), improves. And, "all subsets" enables subsets selection 
 #' using \code{\link[leaps]{regsubsets}} that chooses the n-best performing 
 #' subsets of any given size (specified as \code{num.factors.subset} here). 
-#' "lars" and "lasso" correspond to variants of least angle regression using 
-#' \code{\link[lars]{lars}}. If "lars" or "lasso" are chosen, \code{fit.method} 
+#' "lar" and "lasso" correspond to variants of least angle regression using 
+#' \code{\link[lars]{lars}}. If "lar" or "lasso" are chosen, \code{fit.method} 
 #' will be ignored.
 #' 
-#' Note: If  \code{variable.selection}="lars" or "lasso", \code{fit.method} 
-#' will be ignored. And, "Robust" \code{fit.method} is not truly available with 
-#' \code{variable.selection="all subsets"}; instead, results are produced for 
-#' \code{variable.selection="none"} with "Robust" to include all factors.
+#' Note: If \code{variable.selection="lar" or "lasso"}, \code{fit.method} 
+#' will be ignored. And, \code{fit.method="Robust"} is not truly available with 
+#' \code{variable.selection="all subsets"}; instead, 
+#' \code{variable.selection="none"} is used to include all factors.
 #' 
-#' If \code{add.up.market = TRUE}, max(0, Rm-Rf) is added as a factor in the 
+#' If \code{add.up.market=TRUE}, \code{max(0, Rm-Rf)} is added as a factor in the 
 #' regression, following Henriksson & Merton (1981), to account for market 
 #' timing (price movement of the general stock market relative to fixed income 
 #' securities). The coefficient can be interpreted as the number of free put 
-#' options. Similarly, if \code{add.market.sqd = TRUE}, (Rm-Rf)^2 is added as 
+#' options. Similarly, if \code{add.market.sqd=TRUE}, \code{(Rm-Rf)^2} is added as 
 #' a factor in the regression, following Treynor-Mazuy (1966), to account for 
 #' market timing with respect to volatility.
 #' 
-#' Finally, for both the "lars" and "lasso" methods, the "Cp" statistic 
+#' Finally, for both the "lar" and "lasso" methods, the "Cp" statistic 
 #' (defined in page 17 of Efron et al. (2002)) is calculated using 
 #' \code{\link[lars]{summary.lars}} . While, "cv" computes the K-fold 
 #' cross-validated mean squared prediction error using 
@@ -56,7 +56,7 @@
 #' @param fit.method the estimation method, one of "OLS", "DLS" or "Robust". 
 #' See details. 
 #' @param variable.selection the variable selection method, one of "none", 
-#' "stepwise","all subsets","lars" or "lasso". See details.
+#' "stepwise","all subsets","lar" or "lasso". See details.
 #' @param subsets.method one of "exhaustive", "forward", "backward" or "seqrep" 
 #' (sequential replacement) to specify the type of subset search/selection. 
 #' Required if "all subsets" variable selection is chosen. 
@@ -75,7 +75,7 @@
 #' regressor and \code{market.name} is also required. Default is \code{FALSE}.
 #' @param decay a scalar in (0, 1] to specify the decay factor for 
 #' \code{fit.method="DLS"}. Default is 0.95.
-#' @param lars.criterion an option to assess model selection for the "lars" or 
+#' @param lars.criterion an option to assess model selection for the "lar" or 
 #' "lasso" variable.selection methods; one of "Cp" or "cv". See details. 
 #' Default is "Cp".
 #' @param ... optional arguments passed to the \code{step} function for 
@@ -100,7 +100,7 @@
 #' \item{asset.fit}{list of fitted objects for each asset. Each object is of 
 #' class \code{lm} if \code{fit.method="OLS" or "DLS"}, class \code{lmRob} if 
 #' the \code{fit.method="Robust"}, or class \code{lars} if 
-#' \code{variable.selection="lars" or "lasso"}.}
+#' \code{variable.selection="lar" or "lasso"}.}
 #' \item{alpha}{N x 1 vector of estimated alphas.}
 #' \item{beta}{N x K matrix of estimated betas.}
 #' \item{r2}{N x 1 vector of R-squared values.}
@@ -165,7 +165,7 @@
 fitTSFM <- function(asset.names, factor.names, market.name, data=data, 
                     fit.method = c("OLS","DLS","Robust"),
                     variable.selection = c("none","stepwise","all subsets",
-                                           "lars","lasso"),
+                                           "lar","lasso"),
                     subsets.method = c("exhaustive", "backward", "forward", 
                                        "seqrep"),
                     nvmax=8, force.in=NULL, num.factors.subset=1, 
@@ -207,7 +207,7 @@
                                  market.name, fit.method, subsets.method, 
                                  nvmax, force.in, num.factors.subset, 
                                  add.up.market, add.market.sqd, decay)
-  } else if (variable.selection == "lars" | variable.selection == "lasso"){
+  } else if (variable.selection == "lar" | variable.selection == "lasso"){
     result.lars <- SelectLars(dat.xts, asset.names, factor.names, market.name, 
                               variable.selection, add.up.market, add.market.sqd, 
                               decay, lars.criterion)
@@ -217,14 +217,14 @@
   } 
   else {
     stop("Invalid argument: variable.selection must be either 'none',
-         'stepwise','all subsets','lars' or 'lasso'")
+         'stepwise','all subsets','lar' or 'lasso'")
   }
   
   # extract the fitted factor models, coefficients, r2 values and residual vol 
   # from returned factor model fits above
-  coef.mat <- t(sapply(reg.list, coef))
-  alpha <- coef.mat[, 1]
-  beta <- coef.mat[, -1]
+  coef.mat <- makePaddedDataFrame(lapply(reg.list, coef))
+  alpha <- coef.mat[, 1, drop = FALSE]
+  beta <- coef.mat[, -1, drop = FALSE]
   r2 <- sapply(reg.list, function(x) summary(x)$r.squared)
   resid.sd <- sapply(reg.list, function(x) summary(x)$sigma)
   # create list of return values.
@@ -380,7 +380,7 @@
 }
 
 
-### method variable.selection = "lars" or "lasso"
+### method variable.selection = "lar" or "lasso"
 #
 SelectLars <- function(dat.xts, asset.names, factor.names, market.name, 
                        variable.selection, add.up.market, add.market.sqd, 
@@ -467,6 +467,15 @@
   w/sum(w)
 }
 
+### make a data frame (padded with NAs) from columns of unequal length
+#
+makePaddedDataFrame <- function(l){
+  DF <- do.call(rbind, lapply(lapply(l, unlist), "[", 
+                        unique(unlist(c(sapply(l,names))))))
+  DF <- as.data.frame(DF)
+  names(DF) <- unique(unlist(c(sapply(l,names))))
+  DF
+}
 
 #' @param object a fit object of class \code{tsfm} which is returned by 
 #' \code{fitTSFM}
@@ -521,9 +530,10 @@
   }
   
   # get parameters and factors from factor model
-  beta <- object$beta
+  beta <- as.matrix(object$beta)
+  beta[is.na(beta)] <- 0
   sig2.e = object$resid.sd^2
-  factor <- object$data[, colnames(object$beta)]
+  factor <- as.matrix(object$data[, colnames(object$beta)])
   
   # factor covariance matrix 
   factor.cov = var(factor, use="na.or.complete")
@@ -537,9 +547,9 @@
   
   cov.fm = beta %*% factor.cov %*% t(beta) + D.e
   
-  if (any(diag(chol(cov.fm)) == 0)) {
-    warning("Covariance matrix is not positive definite!")
-  }
+#   if (any(diag(chol(cov.fm)) == 0)) {
+#     warning("Covariance matrix is not positive definite!")
+#   }
   
   return(cov.fm)
 }
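
A minimal base-R sketch of the idea behind the makePaddedDataFrame and covariance changes above: coefficient vectors with different factor sets are aligned on the union of their names, padded with NA, and the NA betas are zeroed before the factor-model covariance is formed. The helper name padCoefs, the factor names and every number below are hypothetical and are not taken from the package.

```r
# Hypothetical per-asset coefficient vectors with different factor sets
coef.list <- list(
  A = c("(Intercept)" = 0.01, MKT = 1.2, SMB = 0.3),
  B = c("(Intercept)" = 0.02, MKT = 0.8, HML = -0.1)
)

# Illustrative helper (not the package function): align every vector on the
# union of coefficient names, filling missing entries with NA
padCoefs <- function(l) {
  all.names <- unique(unlist(lapply(l, names)))
  mat <- t(vapply(l, function(x) unname(x[all.names]),
                  numeric(length(all.names))))
  colnames(mat) <- all.names
  mat
}

coef.mat <- padCoefs(coef.list)
alpha <- coef.mat[, 1, drop = FALSE]
beta <- coef.mat[, -1, drop = FALSE]

# A factor that an asset does not load on contributes nothing to its
# covariance, so NA betas are replaced by zero
beta[is.na(beta)] <- 0

# Hypothetical (diagonal) factor covariance and residual variances
factor.cov <- diag(c(0.0016, 0.0001, 0.0004))
dimnames(factor.cov) <- list(colnames(beta), colnames(beta))
sig2.e <- c(A = 0.0009, B = 0.0004)

# Factor-model covariance: B F B' + D
cov.fm <- beta %*% factor.cov %*% t(beta) + diag(sig2.e)
print(coef.mat)
print(cov.fm)
```

Keeping drop = FALSE preserves the matrix shape even when only one factor survives, which is what lets the later matrix products work unchanged.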

Modified: pkg/FactorAnalytics/R/summary.tsfm.r
===================================================================
--- pkg/FactorAnalytics/R/summary.tsfm.r	2014-07-02 06:10:51 UTC (rev 3459)
+++ pkg/FactorAnalytics/R/summary.tsfm.r	2014-07-03 00:08:53 UTC (rev 3460)
@@ -2,23 +2,39 @@
 #' 
 #' @description \code{summary} method for object of class \code{tsfm}. 
 #' Returned object is of class \code{summary.tsfm}.
+#' 
+#' @details The default \code{summary} method for a fitted \code{lm} object 
+#' computes the standard errors and t-statistics under the assumption of 
+#' homoskedasticity. Argument \code{se.type} gives the option to compute 
+#' heteroskedasticity-consistent (HC) or 
+#' heteroskedasticity-autocorrelation-consistent (HAC) standard errors and 
+#' t-statistics using \code{\link[lmtest]{coeftest}}. This option is meaningful 
+#' only if \code{fit.method = "OLS" or "DLS"}.
 #'  
 #' @param object an object of class \code{tsfm} returned by \code{fitTSFM}.
+#' @param se.type one of "Default", "HC" or "HAC"; option for computing 
+#' HC/HAC standard errors and t-statistics. 
 #' @param x an object of class \code{summary.tsfm}.
 #' @param digits number of significant digits to use when printing. 
 #' Default is 3.
 #' @param ... further arguments passed to or from other methods.
 #' 
-#' @return Returns an object of class \code{summary.tsfm}, which is a list
-#' containing the function call to \code{fitTSFM} and the 
-#' \code{summary.lm} objects fitted for each asset in the factor model. 
+#' @return Returns an object of class \code{summary.tsfm}. 
 #' The print method for class \code{summary.tsfm} outputs the call, 
-#' coefficients, r-squared and residual volatilty for all assets.
+#' coefficients (with standard errors and t-statistics), r-squared and 
+#' residual volatility (under the homoskedasticity assumption) for all assets. 
 #' 
+#' Object of class \code{summary.tsfm} is a list of length N + 2 containing:
+#' \item{call}{the function call to \code{fitTSFM}}
+#' \item{se.type}{standard error type as input} 
+#' \item{}{summaries of the N fit objects (of class \code{lm}, \code{lmRob} 
+#' or \code{lars}) for each asset in the factor model.}
+#' 
 #' @note For a more detailed printed summary for each asset, refer to 
-#' \code{print.summary.lm}, which further formats the coefficients, 
-#' standard errors, etc. and additionally gives significance 
-#' stars if \code{signif.stars} is TRUE. 
+#' \code{\link[stats]{summary.lm}} or \code{\link[robust]{lmRob}}, which 
+#' include F-statistics, Multiple R-squared, Adjusted R-squared and further 
+#' format the coefficients, standard errors, etc. and additionally give 
+#' significance stars if \code{signif.stars} is TRUE. 
 #' 
 #' @author Yi-An Chen & Sangeetha Srinivasan.
 #' 
@@ -32,7 +48,7 @@
 #'                fit.method="OLS", variable.selection="none", 
 #'                add.up.market=TRUE, add.market.sqd=TRUE)
 #' # summary of factor model fit for all assets
-#' summary(fit)
+#' summary(fit, "HAC")
 #' 
 #' # summary of lm fit for a single asset
 #' summary(fit$asset.fit[[1]])
@@ -40,15 +56,30 @@
 #' @method summary tsfm
 #' @export
 
-summary.tsfm <- function(object, ...){
+summary.tsfm <- function(object, se.type="Default", ...){
   # check input object validity
   if (!inherits(object, "tsfm")) {
     stop("Invalid 'tsfm' object")
   }
+  if (object$fit.method=="Robust" && se.type!="Default") {
+    stop("Invalid argument: HC/HAC standard errors are applicable only if 
+         fit.method = 'OLS' or 'DLS'")
+  }
+  
   # extract summary.lm objects for each asset
   sum <- lapply(object$asset.fit, summary)
-  # include the call to fitTSFM
-  sum <- c(call=object$call, sum)
+  
+  # convert to HC/HAC standard errors and t-stats if specified
+  for (i in object$asset.names) {
+    if (se.type == "HC") {
+      sum[[i]]$coefficients <- coeftest(object$asset.fit[[i]], vcovHC)[,1:4]
+    } else if (se.type == "HAC") {
+      sum[[i]]$coefficients <- coeftest(object$asset.fit[[i]], vcovHAC)[,1:4]
+    }
+  }
+  
+  # include the call and se.type to fitTSFM
+  sum <- c(call=object$call, Type=se.type, sum)
   class(sum) <- "summary.tsfm"
   return(sum)
 }
@@ -63,12 +94,14 @@
     cat("\nCall:\n")
     dput(cl)
   }
-  cat("\nFactor Model Coefficients:\n")
+  cat("\nFactor Model Coefficients:\n", 
+      sep="")
   n <- length(x)
-  for (i in 2:n) {
+  for (i in 3:n) {
     options(digits = digits)  
-    cat("\nAsset", i-1, ": ", names(x[i]), "\n", sep = "")  
-    table.coef <- t(x[[i]]$coefficients)
+    cat("\nAsset", i-2, ": ", names(x[i]), "\n(",x$Type,
+        " Standard Errors & T-stats)\n\n", sep = "")  
+    table.coef <- x[[i]]$coefficients
     print(table.coef, digits = digits, ...)
     cat("\nR-squared: ", x[[i]]$r.squared,", Residual Volatility: "
         , x[[i]]$sigma,"\n", sep = "")
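
To illustrate what a heteroskedasticity-consistent standard error is, here is a minimal base-R sketch. It is not the code path used by the commit, which passes vcovHC and vcovHAC (presumably from the sandwich package) to coeftest. Instead it computes the simpler HC0 (White) sandwich estimator by hand, so that it runs without extra packages. The simulated data and all variable names are hypothetical.

```r
set.seed(42)
n <- 200
x <- rnorm(n)
# Heteroskedastic errors: the noise scale grows with |x|
y <- 1 + 2 * x + rnorm(n, sd = 0.5 + abs(x))
fit <- lm(y ~ x)

X <- model.matrix(fit)
e <- residuals(fit)

# Sandwich estimator: (X'X)^-1 [X' diag(e^2) X] (X'X)^-1
bread <- solve(crossprod(X))
meat <- crossprod(X * e)   # each row of X is scaled by its residual
vcov.hc0 <- bread %*% meat %*% bread

se.default <- sqrt(diag(vcov(fit)))   # assumes homoskedasticity
se.hc0 <- sqrt(diag(vcov.hc0))
t.hc0 <- coef(fit) / se.hc0

print(cbind(Estimate = coef(fit), "Default SE" = se.default,
            "HC0 SE" = se.hc0, "HC0 t" = t.hc0))
```

The coefficient estimates are unchanged; only the standard errors and the t-statistics derived from them differ, which is why the summary method above replaces just the coefficients table.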

Modified: pkg/FactorAnalytics/man/fitTSFM.Rd
===================================================================
--- pkg/FactorAnalytics/man/fitTSFM.Rd	2014-07-02 06:10:51 UTC (rev 3459)
+++ pkg/FactorAnalytics/man/fitTSFM.Rd	2014-07-03 00:08:53 UTC (rev 3460)
@@ -9,11 +9,10 @@
 \usage{
 fitTSFM(asset.names, factor.names, market.name, data = data,
   fit.method = c("OLS", "DLS", "Robust"), variable.selection = c("none",
-  "stepwise", "all subsets", "lars", "lasso"),
-  subsets.method = c("exhaustive", "backward", "forward", "seqrep"),
-  nvmax = 8, force.in = NULL, num.factors.subset = 1,
-  add.up.market = FALSE, add.market.sqd = FALSE, decay = 0.95,
-  lars.criterion = "Cp", ...)
+  "stepwise", "all subsets", "lar", "lasso"), subsets.method = c("exhaustive",
+  "backward", "forward", "seqrep"), nvmax = 8, force.in = NULL,
+  num.factors.subset = 1, add.up.market = FALSE, add.market.sqd = FALSE,
+  decay = 0.95, lars.criterion = "Cp", ...)
 
 \method{coef}{tsfm}(object, ...)
 
@@ -41,7 +40,7 @@
 See details.}
 
 \item{variable.selection}{the variable selection method, one of "none",
-"stepwise","all subsets","lars" or "lasso". See details.}
+"stepwise","all subsets","lar" or "lasso". See details.}
 
 \item{subsets.method}{one of "exhaustive", "forward", "backward" or "seqrep"
 (sequential replacement) to specify the type of subset search/selection.
@@ -68,7 +67,7 @@
 \item{decay}{a scalar in (0, 1] to specify the decay factor for
 \code{fit.method="DLS"}. Default is 0.95.}
 
-\item{lars.criterion}{an option to assess model selection for the "lars" or
+\item{lars.criterion}{an option to assess model selection for the "lar" or
 "lasso" variable.selection methods; one of "Cp" or "cv". See details.
 Default is "Cp".}
 
@@ -98,7 +97,7 @@
 \item{asset.fit}{list of fitted objects for each asset. Each object is of
 class \code{lm} if \code{fit.method="OLS" or "DLS"}, class \code{lmRob} if
 the \code{fit.method="Robust"}, or class \code{lars} if
-\code{variable.selection="lars" or "lasso"}.}
+\code{variable.selection="lar" or "lasso"}.}
 \item{alpha}{N x 1 vector of estimated alphas.}
 \item{beta}{N x K matrix of estimated betas.}
 \item{r2}{N x 1 vector of R-squared values.}
@@ -134,24 +133,24 @@
 Criterion (AIC), improves. And, "all subsets" enables subsets selection
 using \code{\link[leaps]{regsubsets}} that chooses the n-best performing
 subsets of any given size (specified as \code{num.factors.subset} here).
-"lars" and "lasso" correspond to variants of least angle regression using
-\code{\link[lars]{lars}}. If "lars" or "lasso" are chosen, \code{fit.method}
+"lar" and "lasso" correspond to variants of least angle regression using
+\code{\link[lars]{lars}}. If "lar" or "lasso" are chosen, \code{fit.method}
 will be ignored.
 
-Note: If  \code{variable.selection}="lars" or "lasso", \code{fit.method}
-will be ignored. And, "Robust" \code{fit.method} is not truly available with
-\code{variable.selection="all subsets"}; instead, results are produced for
-\code{variable.selection="none"} with "Robust" to include all factors.
+Note: If \code{variable.selection="lar" or "lasso"}, \code{fit.method}
+will be ignored. And, \code{fit.method="Robust"} is not truly available with
+\code{variable.selection="all subsets"}; instead,
+\code{variable.selection="none"} is used to include all factors.
 
-If \code{add.up.market = TRUE}, max(0, Rm-Rf) is added as a factor in the
+If \code{add.up.market=TRUE}, \code{max(0, Rm-Rf)} is added as a factor in the
 regression, following Henriksson & Merton (1981), to account for market
 timing (price movement of the general stock market relative to fixed income
 securities). The coefficient can be interpreted as the number of free put
-options. Similarly, if \code{add.market.sqd = TRUE}, (Rm-Rf)^2 is added as
+options. Similarly, if \code{add.market.sqd=TRUE}, \code{(Rm-Rf)^2} is added as
 a factor in the regression, following Treynor-Mazuy (1966), to account for
 market timing with respect to volatility.
 
-Finally, for both the "lars" and "lasso" methods, the "Cp" statistic
+Finally, for both the "lar" and "lasso" methods, the "Cp" statistic
 (defined in page 17 of Efron et al. (2002)) is calculated using
 \code{\link[lars]{summary.lars}} . While, "cv" computes the K-fold
 cross-validated mean squared prediction error using
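
As a rough illustration of the two market-timing regressors described in the details above, the following base-R sketch builds max(0, Rm-Rf) and (Rm-Rf)^2 from a simulated market excess return and fits each specification with lm. It does not call fitTSFM; the data, coefficients and variable names are all hypothetical.

```r
set.seed(1)
n <- 60
mkt.excess <- rnorm(n, mean = 0.005, sd = 0.04)   # hypothetical Rm - Rf

# Henriksson-Merton style regressor: elementwise max(0, Rm - Rf),
# hence pmax() rather than max()
up.market <- pmax(0, mkt.excess)

# Treynor-Mazuy style regressor: (Rm - Rf)^2
market.sqd <- mkt.excess^2

# Simulated asset excess return with some up-market timing
asset.excess <- 0.002 + 0.9 * mkt.excess + 0.3 * up.market +
  rnorm(n, sd = 0.01)

hm.fit <- lm(asset.excess ~ mkt.excess + up.market)
tm.fit <- lm(asset.excess ~ mkt.excess + market.sqd)

print(coef(hm.fit))
print(coef(tm.fit))
```

Using pmax keeps the regressor the same length as the return series; max would collapse it to a single number.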

Modified: pkg/FactorAnalytics/man/summary.tsfm.Rd
===================================================================
--- pkg/FactorAnalytics/man/summary.tsfm.Rd	2014-07-02 06:10:51 UTC (rev 3459)
+++ pkg/FactorAnalytics/man/summary.tsfm.Rd	2014-07-03 00:08:53 UTC (rev 3460)
@@ -4,13 +4,16 @@
 \alias{summary.tsfm}
 \title{Summarizing a fitted time series factor model}
 \usage{
-\method{summary}{tsfm}(object, ...)
+\method{summary}{tsfm}(object, se.type = "Default", ...)
 
 \method{print}{summary.tsfm}(x, digits = 3, ...)
 }
 \arguments{
 \item{object}{an object of class \code{tsfm} returned by \code{fitTSFM}.}
 
+\item{se.type}{one of "Default", "HC" or "HAC"; option for computing
+HC/HAC standard errors and t-statistics.}
+
 \item{x}{an object of class \code{summary.tsfm}.}
 
 \item{digits}{number of significant digits to use when printing.
@@ -19,21 +22,36 @@
 \item{...}{further arguments passed to or from other methods.}
 }
 \value{
-Returns an object of class \code{summary.tsfm}, which is a list
-containing the function call to \code{fitTSFM} and the
-\code{summary.lm} objects fitted for each asset in the factor model.
+Returns an object of class \code{summary.tsfm}.
 The print method for class \code{summary.tsfm} outputs the call,
-coefficients, r-squared and residual volatilty for all assets.
+coefficients (with standard errors and t-statistics), r-squared and
+residual volatility (under the homoskedasticity assumption) for all assets.
+
+Object of class \code{summary.tsfm} is a list of length N + 2 containing:
+\item{call}{the function call to \code{fitTSFM}}
+\item{se.type}{standard error type as input}
+\item{}{summaries of the N fit objects (of class \code{lm}, \code{lmRob}
+or \code{lars}) for each asset in the factor model.}
 }
 \description{
 \code{summary} method for object of class \code{tsfm}.
 Returned object is of class \code{summary.tsfm}.
 }
+\details{
+The default \code{summary} method for a fitted \code{lm} object
+computes the standard errors and t-statistics under the assumption of
+homoskedasticity. Argument \code{se.type} gives the option to compute
+heteroskedasticity-consistent (HC) or
+heteroskedasticity-autocorrelation-consistent (HAC) standard errors and
+t-statistics using \code{\link[lmtest]{coeftest}}. This option is meaningful
+only if \code{fit.method = "OLS" or "DLS"}.
+}
 \note{
 For a more detailed printed summary for each asset, refer to
-\code{print.summary.lm}, which further formats the coefficients,
-standard errors, etc. and additionally gives significance
-stars if \code{signif.stars} is TRUE.
+\code{\link[stats]{summary.lm}} or \code{\link[robust]{lmRob}}, which
+include F-statistics, Multiple R-squared, Adjusted R-squared and further
+format the coefficients, standard errors, etc. and additionally give
+significance stars if \code{signif.stars} is TRUE.
 }
 \examples{
 data(managers.df)
@@ -43,7 +61,7 @@
                fit.method="OLS", variable.selection="none",
                add.up.market=TRUE, add.market.sqd=TRUE)
 # summary of factor model fit for all assets
-summary(fit)
+summary(fit, "HAC")
 
 # summary of lm fit for a single asset
 summary(fit$asset.fit[[1]])


