[Returnanalytics-commits] r3474 - in pkg/FactorAnalytics: . R man

noreply at r-forge.r-project.org noreply at r-forge.r-project.org
Mon Jul 21 10:10:52 CEST 2014


Author: pragnya
Date: 2014-07-21 10:10:51 +0200 (Mon, 21 Jul 2014)
New Revision: 3474

Added:
   pkg/FactorAnalytics/R/fitTsfm.control.R
   pkg/FactorAnalytics/man/fitTsfm.control.Rd
Modified:
   pkg/FactorAnalytics/NAMESPACE
   pkg/FactorAnalytics/R/fitTsfm.R
   pkg/FactorAnalytics/man/fitTsfm.Rd
Log:
Add fitTsfm.control function

Modified: pkg/FactorAnalytics/NAMESPACE
===================================================================
--- pkg/FactorAnalytics/NAMESPACE	2014-07-18 01:40:25 UTC (rev 3473)
+++ pkg/FactorAnalytics/NAMESPACE	2014-07-21 08:10:51 UTC (rev 3474)
@@ -29,6 +29,7 @@
 export(fitFundamentalFactorModel)
 export(fitStatisticalFactorModel)
 export(fitTsfm)
+export(fitTsfm.control)
 export(pCornishFisher)
 export(paFm)
 export(qCornishFisher)

Modified: pkg/FactorAnalytics/R/fitTsfm.R
===================================================================
--- pkg/FactorAnalytics/R/fitTsfm.R	2014-07-18 01:40:25 UTC (rev 3473)
+++ pkg/FactorAnalytics/R/fitTsfm.R	2014-07-21 08:10:51 UTC (rev 3474)
@@ -12,22 +12,23 @@
 #' the option to supply a risk free rate variable to subtract from each asset 
 #' return and factor to create excess returns. 
 #' 
-#' Estimation method "OLS" corresponds to ordinary least squares, "DLS" is 
-#' discounted least squares (weighted least squares with exponentially 
-#' declining weights that sum to unity), and, "Robust" is robust 
-#' regression (uses \code{\link[robust]{lmRob}}). 
+#' Estimation method "OLS" corresponds to ordinary least squares using 
+#' \code{\link[stats]{lm}}, "DLS" is discounted least squares (weighted least 
+#' squares with exponentially declining weights that sum to unity), and, 
+#' "Robust" is robust regression (useing \code{\link[robust]{lmRob}}). 
 #' 
 #' If \code{variable.selection="none"}, all chosen factors are used in the 
 #' factor model. Whereas, "stepwise" performs traditional forward/backward 
-#' stepwise OLS regression (using \code{\link[stats]{step}}), that starts from 
-#' the initial set of factors and adds factors only if the regression fit, as 
-#' measured by the Bayesian Information Criterion (BIC) or Akaike Information 
-#' Criterion (AIC), improves. And, "all subsets" enables subsets selection 
-#' using \code{\link[leaps]{regsubsets}} that chooses the n-best performing 
-#' subsets of any given size (specified as \code{num.factor.subsets} here). 
-#' \code{varaible.selection="lars"} corresponds to least angle regression 
-#' using \code{\link[lars]{lars}} with variants "lasso", "lar", 
-#' "forward.stagewise" or "stepwise". Note: If 
+#' stepwise OLS regression (using \code{\link[stats]{step}} or 
+#' \code{\link[robust]{step.lmRob}}), that starts from the initial set of 
+#' factors and adds factors only if the regression fit, as measured by the 
+#' Bayesian Information Criterion (BIC) or Akaike Information Criterion (AIC), 
+#' improves. And, "subsets" enables subsets selection using 
+#' \code{\link[leaps]{regsubsets}}; chooses the best performing subset of any 
+#' given size. See \code{\link{fitTsfm.control}} for more details on the 
+#' control arguments. \code{varaible.selection="lars"} corresponds to least 
+#' angle regression using \code{\link[lars]{lars}} with variants "lasso", 
+#' "lar", "forward.stagewise" or "stepwise". Note: If 
 #' \code{variable.selection="lars"}, \code{fit.method} will be ignored.
 #' 
 #' \code{mkt.timing} allows for market-timing factors to be added to any of the 
@@ -40,13 +41,15 @@
 #' volatility, and \code{market.sqd = (Rm-Rf)^2} is added as a factor in the 
 #' regression. Option "both" adds both of these factors.
 #' 
-#' \code{lars.criterion} selects the criterion (one of "Cp" or "cv") to 
-#' determine the best fitted model for \code{variable.selection="lars"}. The 
-#' "Cp" statistic (defined in page 17 of Efron et al. (2002)) is calculated 
-#' using \code{\link[lars]{summary.lars}}. While, "cv" computes the K-fold 
-#' cross-validated mean squared prediction error using 
-#' \code{\link[lars]{cv.lars}}.
+#' Note about NAs: Before model fitting, incomplete cases are removed for 
+#' every asset (return data combined with respective factors' return data) 
+#' using \code{\link[stats]{na.omit}}. Otherwise, all observations in 
+#' \code{data} are included.
 #' 
+#' Note about spaces in asset/factor names: Spaces in column names of the data 
+#' object will be converetd to periods as the function works with \code{xts} 
+#' objects internally and hence column names can't be retained as such.
+#' 
 #' @param asset.names vector containing names of assets, whose returns or 
 #' excess returns are the dependent variable.
 #' @param factor.names vector containing names of the macroeconomic factors.
@@ -62,32 +65,13 @@
 #' @param fit.method the estimation method, one of "OLS", "DLS" or "Robust". 
 #' See details. Default is "OLS". 
 #' @param variable.selection the variable selection method, one of "none", 
-#' "stepwise","all subsets","lars". See details. Default is "none".
-#' @param subsets.method one of "exhaustive", "forward", "backward" or "seqrep" 
-#' (sequential replacement) to specify the type of subset search/selection. 
-#' Required if "all subsets" variable selection is chosen. Default is 
-#' "exhaustive".
-#' @param nvmax the maximum size of subsets to examine; an option for 
-#' "all subsets" variable selection. Default is 8. 
-#' @param force.in vector containing the names of factors that should always 
-#' be included in the model; an option for "all subsets" variable selection. 
-#' Default is NULL.
-#' @param num.factors.subset number of factors required in the factor model; 
-#' an option for "all subsets" variable selection. Default is 1. 
-#' Note: nvmax >= num.factors.subset >= length(force.in).
+#' "stepwise","subsets","lars". See details. Default is "none".
 #' @param mkt.timing one of "HM", "TM" or "both". Default is NULL. See Details. 
 #' \code{mkt.name} is required if any of these options are specified.
-#' @param decay a scalar in (0, 1] to specify the decay factor for 
-#' \code{fit.method="DLS"}. Default is 0.95.
-#' @param lars.type One of "lasso", "lar", "forward.stagewise" or "stepwise". 
-#' The names can be abbreviated to any unique substring. Default is "lasso".
-#' @param lars.criterion an option to assess model selection for the "lars" 
-#' method; one of "Cp" or "cv". See details. Default is "Cp".
-#' @param ... optional arguments passed to the \code{step} function for 
-#' variable.selection method "stepwise", such as direction, steps and 
-#' the penalty factor k. Note that argument k is available only for "OLS" 
-#' and "DLS" fits. Scope argument is not available presently. Also plan to
-#' include other controls passed to \code{lmRob} soon.
+#' @param control list of control parameters. The default is constructed by 
+#' the function \code{\link{fitTsfm.control}}. See the documentation for 
+#' \code{\link{fitTsfm.control}} for details.
+#' @param ... arguments passed to \code{\link{fitTsfm.control}}
 #' 
 #' @return fitTsfm returns an object of class \code{tsfm}. 
 #' 
@@ -112,7 +96,7 @@
 #' \item{beta}{N x K matrix of estimated betas.}
 #' \item{r2}{N x 1 vector of R-squared values.}
 #' \item{resid.sd}{N x 1 vector of residual standard deviations.}
-#' \item{fitted}{xts data object of fitted values; if and only if 
+#' \item{fitted}{xts data object of fitted values; iff 
 #' \code{variable.selection="lars"}}
 #' \item{call}{the matched function call.}
 #' \item{data}{xts data object containing the assets and factors.}
@@ -173,31 +157,51 @@
 
 fitTsfm <- function(asset.names, factor.names, mkt.name=NULL, rf.name=NULL, 
                     data=data, fit.method=c("OLS","DLS","Robust"),
-                    variable.selection=c("none","stepwise","all subsets",
+                    variable.selection=c("none","stepwise","subsets",
                                          "lars"),
-                    subsets.method=c("exhaustive","backward","forward",
-                                     "seqrep"),
-                    nvmax=8, force.in=NULL, num.factors.subset=1, 
-                    mkt.timing=NULL, decay=0.95,
-                    lars.type=c("lasso","lar","forward.stagewise","stepwise"),
-                    lars.criterion="Cp", ...){
+                    mkt.timing=NULL, control=fitTsfm.control(...), ...) {
   
-  # get all the arguments specified by their full names
+  # record the call as an element to be returned
   call <- match.call()
   
-  fit.method = fit.method[1] # default is OLS
-  variable.selection = variable.selection[1] # default is "none"
-  subsets.method = subsets.method[1] # default is "exhaustive"
-  lars.type=lars.type[1] # default is "lasso"
-  
-  if (!exists("direction")) {direction <- "backward"}
-  if (!exists("steps")) {steps <- 1000}
-  if (!exists("k")) {k <- 2}
+  # set defaults and check input vailidity
+  fit.method = fit.method[1]
+  if (!(fit.method %in% c("OLS","DLS","Robust"))) {
+    stop("Invalid argument: fit.method must be 'OLS', 'DLS' or 'Robust'")
+  }
+  variable.selection = variable.selection[1]
+  if (!(variable.selection %in% c("none","stepwise","subsets","lars"))) {
+    stop("Invalid argument: variable.selection must be either 'none',
+         'stepwise','subsets' or 'lars'")
+  }
   if (xor(is.null(mkt.name), is.null(mkt.timing))) {
     stop("Missing argument: 'mkt.name' and 'mkt.timing' are both required to 
          include market-timing factors.")
   }
   
+  # extract arguments to pass to different fit and variable selection functions
+  decay <- control$decay
+  subset.size <- control$subset.size
+  lars.criterion <- control$lars.criterion
+  m1 <- match(c("weights","method","model","x","y","qr"), 
+              names(control), 0L)
+  lm.args <- control[m1, drop=TRUE]
+  m2 <-  match(c("weights","model","x","y","nrep"), 
+               names(control), 0L)
+  lmRob.args <- control[m2, drop=TRUE]
+  m3 <-  match(c("scope","scale","direction","trace","steps","k"), 
+               names(control), 0L)
+  step.args <- control[m3, drop=TRUE]
+  m4 <-  match(c("weights","nbest","nvmax","force.in","force.out","method",
+                 "really.big"), names(control), 0L)
+  regsubsets.args <- control[m4, drop=TRUE]
+  m5 <-  match(c("type","normalize","eps","max.steps","trace"), 
+               names(control), 0L)
+  lars.args <- control[m5, drop=TRUE]
+  m6 <-  match(c("K","type","mode","normalize","eps","max.steps","trace"), 
+               names(control), 0L)
+  cv.lars.args <- control[m6, drop=TRUE]
+  
   # convert data into an xts object and hereafter work with xts objects
   data.xts <- checkData(data)
   
@@ -207,12 +211,15 @@
   
   # convert all asset and factor returns to excess return form if specified
   if (!is.null(rf.name)) {
-    cat("Excess returns were used for all assets and factors.")
+    cat("Excess returns were computed and used for all assets and factors.")
     dat.xts <- "[<-"(dat.xts,,vapply(dat.xts, function(x) x-data.xts[,rf.name], 
                                      FUN.VALUE = numeric(nrow(dat.xts))))
+  } else {
+    cat("Note: fitTsfm was NOT asked to compute EXCESS returns. Input returns 
+        data was used as it is for all factors and assets.")
   }
   
-  # opt add market-timing factors: up.market=max(0,Rm-Rf), market.sqd=(Rm-Rf)^2
+  # opt add mkt-timing factors: up.market=max(0,Rm-Rf), market.sqd=(Rm-Rf)^2
   if (!is.null(mkt.timing)) {
     if(mkt.timing=="HM" | mkt.timing=="both") {
       up.market <- data.xts[,mkt.name]
@@ -237,17 +244,17 @@
   # Each method returns a list of fitted factor models for each asset.
   if (variable.selection == "none") {
     reg.list <- NoVariableSelection(dat.xts, asset.names, factor.names, 
-                                    fit.method, decay)
-  } else if (variable.selection == "stepwise"){
-    reg.list <- SelectStepwise(dat.xts, asset.names, factor.names, 
-                               fit.method, decay, direction, steps, k)
-  } else if (variable.selection == "all subsets"){
-    reg.list <- SelectAllSubsets(dat.xts, asset.names, factor.names, 
-                                 fit.method, subsets.method, 
-                                 nvmax, force.in, num.factors.subset, decay)
-  } else if (variable.selection == "lars"){
-    result.lars <- SelectLars(dat.xts, asset.names, factor.names, 
-                              lars.type, decay, lars.criterion)
+                                    fit.method, lm.args, lmRob.args, decay)
+  } else if (variable.selection == "stepwise") {
+    reg.list <- SelectStepwise(dat.xts, asset.names, factor.names, fit.method, 
+                               lm.args, lmRob.args, step.args, decay)
+  } else if (variable.selection == "subsets") {
+    reg.list <- SelectAllSubsets(dat.xts, asset.names, factor.names,fit.method, 
+                                 lm.args, lmRob.args, regsubsets.args, 
+                                 subset.size, decay)
+  } else if (variable.selection == "lars") {
+    result.lars <- SelectLars(dat.xts, asset.names, factor.names, lars.args, 
+                              cv.lars.args, lars.criterion)
     input <- list(call=call, data=dat.xts, asset.names=asset.names, 
                   factor.names=factor.names, fit.method=fit.method, 
                   variable.selection=variable.selection)
@@ -255,10 +262,6 @@
     class(result) <- "tsfm"
     return(result)
   } 
-  else {
-    stop("Invalid argument: variable.selection must be either 'none',
-         'stepwise','all subsets','lars'")
-  }
   
   # extract the fitted factor models, coefficients, r2 values and residual vol 
   # from returned factor model fits above
@@ -282,12 +285,12 @@
 ### method variable.selection = "none"
 #
 NoVariableSelection <- function(dat.xts, asset.names, factor.names, fit.method,
-                                decay){
+                                lm.args, lmRob.args, decay){
   # initialize list object to hold the fitted objects
   reg.list <- list()
   
   # loop through and estimate model for each asset to allow unequal histories
-  for (i in asset.names){
+  for (i in asset.names) {
     # completely remove NA cases
     reg.xts <- na.omit(dat.xts[, c(i, factor.names)])
     
@@ -296,15 +299,16 @@
     
     # fit based on time series regression method chosen
     if (fit.method == "OLS") {
-      reg.list[[i]] <- lm(fm.formula, data=reg.xts)
+      reg.list[[i]] <- do.call(lm, c(list(fm.formula,data=reg.xts),lm.args))
     } else if (fit.method == "DLS") {
-      w <- WeightsDLS(nrow(reg.xts), decay)
-      reg.list[[i]] <- lm(fm.formula, data=reg.xts, weights=w)
+      if(!"weights" %in% names(lm.args)) {
+        lm.args$weights <- WeightsDLS(nrow(reg.xts), decay)
+      }
+      reg.list[[i]] <- do.call(lm, c(list(fm.formula,data=reg.xts),lm.args))
     } else if (fit.method == "Robust") {
-      reg.list[[i]] <- lmRob(fm.formula, data=reg.xts)
-    } else {
-      stop("Invalid argument: fit.method must be 'OLS', 'DLS' or 'Robust'")
-    }
+      reg.list[[i]] <- do.call(lmRob, c(list(fm.formula,data=reg.xts),
+                                        lmRob.args))
+    } 
   } 
   reg.list  
 }
@@ -313,12 +317,12 @@
 ### method variable.selection = "stepwise"
 #
 SelectStepwise <- function(dat.xts, asset.names, factor.names, fit.method, 
-                           decay, direction, steps, k){
+                           lm.args, lmRob.args, step.args, decay) {
   # initialize list object to hold the fitted objects
   reg.list <- list()
   
   # loop through and estimate model for each asset to allow unequal histories
-  for (i in asset.names){
+  for (i in asset.names) {
     # completely remove NA cases
     reg.xts <- na.omit(dat.xts[, c(i, factor.names)])
     
@@ -327,76 +331,63 @@
     
     # fit based on time series regression method chosen
     if (fit.method == "OLS") {
-      reg.list[[i]] <- step(lm(fm.formula, data=reg.xts), direction=direction, 
-                            steps=steps, k=k, trace=0)
+      lm.fit <- do.call(lm, c(list(fm.formula,data=reg.xts),lm.args))
+      reg.list[[i]] <- do.call(step, c(list(lm.fit),step.args))
     } else if (fit.method == "DLS") {
-      w <- WeightsDLS(nrow(reg.xts), decay)
-      reg.list[[i]] <- step(lm(fm.formula, data=reg.xts, weights=w), 
-                            direction=direction, steps=steps, k=k, trace=0)
+      if(!"weights" %in% names(lm.args)) {
+        lm.args$weights <- WeightsDLS(nrow(reg.xts), decay)
+      }
+      lm.fit <- do.call(lm, c(list(fm.formula,data=reg.xts),lm.args))
+      reg.list[[i]] <- do.call(step, c(list(lm.fit),step.args))
     } else if (fit.method == "Robust") {
-      reg.list[[i]] <- step.lmRob(lmRob(fm.formula, data=reg.xts), trace=FALSE,
-                                  direction=direction, steps=steps)
-    } else {
-      stop("Invalid argument: fit.method must be 'OLS', 'DLS' or 'Robust'")
-    }
+      lmRob.fit <- do.call(lmRob, c(list(fm.formula,data=reg.xts),lmRob.args))
+      reg.list[[i]] <- do.call(step.lmRob, c(list(lmRob.fit),step.args))
+    } 
   }
   reg.list
 }
 
 
-### method variable.selection = "all subsets"
+### method variable.selection = "subsets"
 #
 SelectAllSubsets <- function(dat.xts, asset.names, factor.names, fit.method, 
-                             subsets.method, nvmax, force.in, 
-                             num.factors.subset, decay){
-  # Check argument validity
-  if (nvmax < num.factors.subset) {
-    stop("Invaid Argument: nvmax should be >= num.factors.subset")
-  }
+                             lm.args, lmRob.args, regsubsets.args, subset.size, 
+                             decay) {
   
   # initialize list object to hold the fitted objects
   reg.list <- list()
   
   # loop through and estimate model for each asset to allow unequal histories
-  for (i in asset.names){
+  for (i in asset.names) {
+    # completely remove NA cases
+    reg.xts <- na.omit(dat.xts[, c(i, factor.names)])
     
-    # choose best subset of factors depending on specified number of factors
-    if (num.factors.subset == length(force.in)) {
-      reg.xts <- na.omit(dat.xts[, c(i, force.in)])
-    } else if (num.factors.subset > length(force.in)) {
-      reg.xts <- na.omit(dat.xts[, c(i, factor.names)])
-      
-      # formula to pass to lm or lmRob
-      fm.formula <- as.formula(paste(i," ~ ."))
-      
-      if (fit.method != "DLS") {decay <- 1}
-      # do weighted least squares if "DLS"
-      w <- WeightsDLS(nrow(reg.xts), decay)
-      
-      # use regsubsets to find the best model with a subset of factors of size 
-      # num.factors.subset
-      fm.subsets <- regsubsets(fm.formula, data=reg.xts, nvmax=nvmax,
-                               force.in=force.in, method=subsets.method, 
-                               weights=w)
-      sum.sub <- summary(fm.subsets)
-      reg.xts <- na.omit(dat.xts[,c(i,names(which(sum.sub$which[
-        as.character(num.factors.subset),-1]==TRUE)))])
-    } else {
-      stop("Invalid Argument: num.factors.subset should be >= 
-             length(force.in)")
+    # formula to pass to lm or lmRob
+    fm.formula <- as.formula(paste(i," ~ ."))
+    
+    if (fit.method == "DLS" && !"weights" %in% names(regsubsets.args)) {
+      regsubsets.args$weights <- WeightsDLS(nrow(reg.xts), decay)
     }
     
+    # choose best subset of factors depending on specified subset size
+    fm.subsets <- do.call(regsubsets, c(list(fm.formula,data=reg.xts),
+                                        regsubsets.args))
+    sum.sub <- summary(fm.subsets)
+    names.sub <- names(which(sum.sub$which[as.character(subset.size),-1]==TRUE))
+    reg.xts <- na.omit(dat.xts[,c(i,names.sub)])
+    
     # fit based on time series regression method chosen
     if (fit.method == "OLS") {
-      reg.list[[i]] <- lm(fm.formula, data=reg.xts)
+      reg.list[[i]] <- do.call(lm, c(list(fm.formula,data=reg.xts),lm.args))
     } else if (fit.method == "DLS") {
-      w <- WeightsDLS(nrow(reg.xts), decay)
-      reg.list[[i]] <- lm(fm.formula, data=reg.xts, weights=w)
+      if(!"weights" %in% names(lm.args)) {
+        lm.args$weights <- WeightsDLS(nrow(reg.xts), decay)
+      }
+      reg.list[[i]] <- do.call(lm, c(list(fm.formula,data=reg.xts),lm.args))
     } else if (fit.method == "Robust") {
-      reg.list[[i]] <- lmRob(fm.formula, data=reg.xts)
-    } else {
-      stop("Invalid argument: fit.method must be 'OLS', 'DLS' or 'Robust'")
-    }
+      reg.list[[i]] <- do.call(lmRob, c(list(fm.formula,data=reg.xts),
+                                        lmRob.args))
+    } 
   }
   reg.list
 }
@@ -404,8 +395,8 @@
 
 ### method variable.selection = "lars"
 #
-SelectLars <- function(dat.xts, asset.names, factor.names, lars.type, 
-                       decay, lars.criterion) {
+SelectLars <- function(dat.xts, asset.names, factor.names, lars.args, 
+                       cv.lars.args, lars.criterion) {
   # initialize list object to hold the fitted objects and, vectors and matrices
   # for the other results
   asset.fit <- list()
@@ -418,38 +409,33 @@
   colnames(beta) <- factor.names
   
   # loop through and estimate model for each asset to allow unequal histories
-  for (i in asset.names){
+  for (i in asset.names) {
     # completely remove NA cases
     reg.xts <- na.omit(dat.xts[, c(i, factor.names)])
     
     # convert to matrix
     reg.mat <- as.matrix(reg.xts)
     # fit lars regression model
-    lars.fit <- lars(reg.mat[,-1], reg.mat[,i], 
-                     type=lars.type, trace = FALSE)
+    lars.fit <- do.call(lars, c(x=list(reg.mat[,-1],y=reg.mat[,i]),lars.args))
     lars.sum <- summary(lars.fit)
+    cv.error <- do.call(cv.lars, c(x=list(reg.mat[,-1],y=reg.mat[,i],
+                                          plot.it=FALSE),cv.lars.args))
     
-    # get the step that minimizes the "Cp" statistic or the "cv" mean-sqd 
-    # prediction error
+    # get the step that minimizes the "Cp" statistic or 
+    # the K-fold "cv" mean-squared prediction error
     if (lars.criterion == "Cp") {
       s <- which.min(lars.sum$Cp)
-    } else if (lars.criterion == "cv") {
-      lars.cv <- cv.lars(reg.mat[,factor.names], reg.mat[,i], trace=FALSE,
-                         type=lars.type, mode="step", plot.it=FALSE)
-      s <- which.min(lars.cv$cv)
     } else {
-      stop("Invalid argument: lars.criterion must be Cp' or 'cv'")
+      s <- which.min(cv.error$cv)
     }
     
     # get factor model coefficients & fitted values at the step obtained above
     coef.lars <- predict(lars.fit, s=s, type="coef", mode="step")
-    fitted.lars <- predict(lars.fit, reg.xts[,-1], s=s, type="fit", 
-                           mode="step")
+    fitted.lars <- predict(lars.fit, reg.xts[,-1], s=s, type="fit",mode="step")
     fitted.list[[i]] <- xts(fitted.lars$fit, index(reg.xts))
     # extract and assign the results
     asset.fit[[i]] = lars.fit
-    alpha[i] <- (fitted.lars$fit - 
-                   reg.xts[,-1]%*%coef.lars$coefficients)[1]
+    alpha[i] <- (fitted.lars$fit - reg.xts[,-1]%*%coef.lars$coefficients)[1]
     beta.names <- names(coef.lars$coefficients)
     beta[i, beta.names] <- coef.lars$coefficients
     r2[i] <-  lars.fit$R2[s]
@@ -459,21 +445,24 @@
   fitted.xts <- do.call(merge, fitted.list)
   results.lars <- list(asset.fit=asset.fit, alpha=alpha, beta=beta, r2=r2, 
                        resid.sd=resid.sd, fitted=fitted.xts)
+  # As a special case for variable.selection="lars", fitted values are also 
+  # returned by fitTsfm. Else, shrinkage s from the best fit is needed to get 
+  # fitted values & residuals.
 }
 
 
 ### calculate weights for "DLS"
 #
-WeightsDLS <- function(t,d){
+WeightsDLS <- function(t,d) {
   # more weight given to more recent observations 
   w <- d^seq((t-1),0,-1)    
   # ensure that the weights sum to unity
   w/sum(w)
 }
 
-### make a data frame (padded with NAs) from columns of unequal length
+### make a data frame (padded with NAs) from unequal vectors with named rows
 #
-makePaddedDataFrame <- function(l){
+makePaddedDataFrame <- function(l) {
   DF <- do.call(rbind, lapply(lapply(l, unlist), "[", 
                               unique(unlist(c(sapply(l,names))))))
   DF <- as.data.frame(DF)
@@ -489,12 +478,12 @@
 #' @method coef tsfm
 #' @export
 
-coef.tsfm <- function(object,...){
+coef.tsfm <- function(object,...) {
   if (object$variable.selection=="lars") {
     coef.mat <- cbind(object$alpha, object$beta)
     colnames(coef.mat)[1] <- "(Intercept)"
   } else {
-    coef.mat <- t(sapply(object$asset.fit, coef))
+    coef.mat <- t(sapply(object$asset.fit, coef, ...))
   }
   return(coef.mat)
 }
@@ -503,13 +492,14 @@
 #' @method fitted tsfm
 #' @export
 
-fitted.tsfm <- function(object,...){  
+fitted.tsfm <- function(object,...) {  
   if (object$variable.selection=="lars") {
     fitted.xts <- object$fitted
   } else {
     # get fitted values from each linear factor model fit 
     # and convert them into xts/zoo objects
-    fitted.list = sapply(object$asset.fit, function(x) checkData(fitted(x)))
+    fitted.list = sapply(object$asset.fit, 
+                         function(x) checkData(fitted(x,...)))
     # this is a list of xts objects, indexed by the asset name
     # merge the objects in the list into one xts object
     fitted.xts <- do.call(merge, fitted.list)
@@ -528,7 +518,8 @@
   } else {
     # get residuals from each linear factor model fit 
     # and convert them into xts/zoo objects
-    residuals.list = sapply(object$asset.fit, function(x) checkData(residuals(x)))
+    residuals.list = sapply(object$asset.fit, 
+                            function(x) checkData(residuals(x,...)))
     # this is a list of xts objects, indexed by the asset name
     # merge the objects in the list into one xts object
     residuals.xts <- do.call(merge, residuals.list)

Added: pkg/FactorAnalytics/R/fitTsfm.control.R
===================================================================
--- pkg/FactorAnalytics/R/fitTsfm.control.R	                        (rev 0)
+++ pkg/FactorAnalytics/R/fitTsfm.control.R	2014-07-21 08:10:51 UTC (rev 3474)
@@ -0,0 +1,194 @@
+#' @title List of control parameters for \code{fitTsfm}
+#' 
+#' @description Creates a list of control parameters for \code{\link{fitTsfm}}. 
+#' All control parameters that are not passed to this function are set to 
+#' default values.
+#' 
+#' @details This control function is primarily used to pass optional arguments 
+#' to \code{\link[stats]{lm}}, \code{\link[robust]{lmRob}}, 
+#' \code{\link[stats]{step}}, \code{\link[leaps]{regsubsets}}, 
+#' \code{\link[lars]{lars}} and \code{\link[lars]{cv.lars}} within 
+#' \code{fitTsfm}. See their respective help files for more details. The 
+#' arguments to each of these functions are listed approximately in the same 
+#' order for user convenience.
+#' 
+#' The scalar \code{decay} is used by \code{\link{fitTsfm}} to compute 
+#' exponentially decaying weights for \code{fit.method="DLS"}. Optionally, one 
+#' can directly specify \code{weights}, a weights vector, to be used with 
+#' "OLS" or "Robust".
+#' 
+#' \code{lars.criterion} selects the criterion (one of "Cp" or "cv") to 
+#' determine the best fitted model for \code{variable.selection="lars"}. The 
+#' "Cp" statistic (defined in page 17 of Efron et al. (2004)) is calculated 
+#' using \code{\link[lars]{summary.lars}}. While, "cv" computes the K-fold 
+#' cross-validated mean squared prediction error using 
+#' \code{\link[lars]{cv.lars}}.
+#' 
+#' @param decay a scalar in (0, 1] to specify the decay factor for "DLS". 
+#' Default is 0.95.
+#' @param weights an optional vector of weights to be used in the fitting 
+#' process for \code{fit.method="OLS","Robust"}, or 
+#' \code{variable.selection="subsets"}. Should be \code{NULL} or a numeric 
+#' vector. If non-\code{NULL}, weighted least squares is performed with weights 
+#' given by \code{weights} (i.e., minimizing sum(w*e^2)). The length of 
+#' \code{weights} must be the same as the number of observations. The weights 
+#' must be nonnegative and strongly recommended to be strictly positive.
+#' @param model,x,y,qr logicals passed to \code{lm} for 
+#' \code{fit.method="OLS"}. If \code{TRUE} the corresponding components of the 
+#' fit (the model frame, the model matrix, the response, the QR decomposition) 
+#' are returned.
+#' @param nrep the number of random subsamples to be drawn for 
+#' \code{fit.method="Robust"}. If the data set is small and "Exhaustive" 
+#' resampling is being used, the value of \code{nrep} is ignored.
+#' @param scope defines the range of models examined in the \code{"stepwise"} 
+#' search. This should be either a single formula, or a list containing 
+#' components \code{upper} and \code{lower}, both formulae. See 
+#' \code{\link[stats]{step}} for how to specify the formulae and usage.
+#' @param scale optional parameter for \code{variable.selection="stepwise"}. 
+#' The argument is passed to \code{\link[stats]{step}} or 
+#' \code{\link[robust]{step.lmRob}} as appropriate.
+#' @param direction the mode of \code{"stepwise"} search, can be one of "both", 
+#' "backward", or "forward", with a default of "both". If the \code{scope} 
+#' argument is missing the default for \code{direction} is "backward". 
+#' @param trace If positive (or, not \code{FALSE}), info is printed during the 
+#' running of \code{\link[stats]{step}}, \code{\link[robust]{step.lmRob}},
+#' \code{\link[lars]{lars}} or \code{\link[lars]{cv.lars}} as relevant. Larger 
+#' values may give more detailed information. Default is \code{FALSE}.
+#' @param steps the maximum number of steps to be considered for 
+#' \code{"stepwise"}. Default is 1000 (essentially as many as required). It is 
+#' typically used to stop the process early. 
+#' @param k the multiple of the number of degrees of freedom used for the 
+#' penalty in \code{"stepwise"}. Only \code{k = 2} gives the genuine AIC. 
+#' \code{k = log(n)} is sometimes referred to as BIC or SBC. Default is 2.
+#' @param nbest number of subsets of each size to record for \code{"subsets"}. 
+#' Default is 1.
+#' @param nvmax maximum size of subsets to examine for \code{"subsets"}. 
+#' Default is 8.
+#' @param force.in index to columns of design matrix that should be in all 
+#' models for \code{"subsets"}. Default is \code{NULL}.
+#' @param force.out index to columns of design matrix that should be in no 
+#' models for \code{"subsets"}. Default is \code{NULL}.
+#' @param method one of "exhaustive", "forward", "backward" or "seqrep" 
+#' (sequential replacement) to specify the type of subset search/selection. 
+#' Required if \code{variable selection="subsets"} is chosen. Default is 
+#' "exhaustive".
+#' @param really.big option for \code{"subsets"}; Must be \code{TRUE} to 
+#' perform exhaustive search on more than 50 variables.
+#' @param subset.size number of factors required in the factor model; 
+#' an option for \code{"subsets"} variable selection. Default is 1. 
+#' Note: \code{nvmax >= subset.size >= length(force.in)}.
+#' @param type option for \code{"lars"}. One of "lasso", "lar", 
+#' "forward.stagewise" or "stepwise". The names can be abbreviated to any 
+#' unique substring. Default is "lasso".
+#' @param normalize option for \code{"lars"}. If \code{TRUE}, each variable is 
+#' standardized to have unit L2 norm, otherwise they are left alone. Default 
+#' is \code{TRUE}.
+#' @param eps option for \code{"lars"}; An effective zero.
+#' @param max.steps Limit the number of steps taken for \code{"lars"}; the 
+#' default is \code{8 * min(m, n-intercept)}, with \code{m} the number of 
+#' variables, and \code{n} the number of samples. For \code{type="lar"} or 
+#' \code{type="stepwise"}, the maximum number of steps is 
+#' \code{min(m,n-intercept)}. For \code{type="lasso"} and especially 
+#' \code{type="forward.stagewise"}, there can be many more terms, because 
+#' although no more than \code{min(m,n-intercept)} variables can be active 
+#' during any step, variables are frequently droppped and added as the 
+#' algorithm proceeds. Although the default usually guarantees that the 
+#' algorithm has proceeded to the saturated fit, users should check.
+#' @param lars.criterion an option to assess model selection for the 
+#' \code{"lars"} method; one of "Cp" or "cv". See details. Default is "Cp".
+#' @param K number of folds for computing the K-fold cross-validated mean 
+#' squared prediction error for \code{"lars"}. Default is 10.
+#' @param mode This refers to the index that is used for cross-validation. The 
+#' default is "fraction" for \code{type="lasso"} or 
+#' \code{type="forward.stagewise"}. For \code{type="lar"} or 
+#' \code{type="stepwise"} the default is "step".
+#' 
+#' @return A list of the above components. This is only meant to be used by 
+#' \code{fitTsfm}.
+#' 
+#' @author Sangeetha Srinivasan
+#' 
+#' @references 
+#' \enumerate{
+#' \item Efron, Bradley, Trevor Hastie, Iain Johnstone, and Robert Tibshirani. 
+#' "Least angle regression." The Annals of statistics 32, no.2 (2004): 407-499. 
+#' }
+#' 
+#' @seealso \code{\link{fitTsfm}}, \code{\link[stats]{lm}}, 
+#' \code{\link[robust]{lmRob}}, \code{\link[stats]{step}}, 
+#' \code{\link[leaps]{regsubsets}}, \code{\link[lars]{lars}} and 
+#' \code{\link[lars]{cv.lars}}
+#' 
+#' @examples
+#' 
+#' # check argument list passed by fitTsfm.control
+#' tsfm.ctrl <- fitTsfm.control(method="exhaustive", nbest=2)
+#' print(tsfm.ctrl)
+#' 
+#' # used internally by fitTsfm
+#' data(managers)
+#' fit <- fitTsfm(asset.names=colnames(managers[,(1:6)]),
+#'                factor.names=colnames(managers[,(7:9)]), 
+#'                data=managers, variable.selection="subsets", 
+#'                method="exhaustive", nbest=2)
[TRUNCATED]

To get the complete diff run:
    svnlook diff /svnroot/returnanalytics -r 3474


More information about the Returnanalytics-commits mailing list