From noreply at r-forge.r-project.org Wed May 20 16:12:49 2015 From: noreply at r-forge.r-project.org (noreply at r-forge.r-project.org) Date: Wed, 20 May 2015 16:12:49 +0200 (CEST) Subject: [Blotter-commits] r1686 - pkg/quantstrat/sandbox/backtest_musings Message-ID: <20150520141249.61B6C187311@r-forge.r-project.org> Author: braverock Date: 2015-05-20 16:12:49 +0200 (Wed, 20 May 2015) New Revision: 1686 Modified: pkg/quantstrat/sandbox/backtest_musings/research_replication.Rmd pkg/quantstrat/sandbox/backtest_musings/research_replication.pdf pkg/quantstrat/sandbox/backtest_musings/stat_process.bib pkg/quantstrat/sandbox/backtest_musings/strat_dev_process.Rmd pkg/quantstrat/sandbox/backtest_musings/strat_dev_process.pdf Log: - add simplifying assumptions to replication paper - minor updates to references Modified: pkg/quantstrat/sandbox/backtest_musings/research_replication.Rmd =================================================================== --- pkg/quantstrat/sandbox/backtest_musings/research_replication.Rmd 2015-04-15 02:33:20 UTC (rev 1685) +++ pkg/quantstrat/sandbox/backtest_musings/research_replication.Rmd 2015-05-20 14:12:49 UTC (rev 1686) @@ -143,12 +143,21 @@ paper you will review during the course of your replication research. The main goal is rather to understand the framework that your research goal fits into, and develop resources for deeper understanding when that is required. +The literature review is to help organize the tools and knowledge needed to +complete the replication. Like a chef measures out ingredients before starting +a recipe, or a carpenter gathers tools, the analyst is gathering tools and +knowledge to make sure they have enough to work with for the replication project. ## organization of the literature review The analyst will need to decide on an organizational scheme for the literature -review. 
There are two models which you will return to; which model is appropriate -will depend both on how the replication is to proceed, and how the analyst's -internal organizational structure is envisioned. +review. In the social sciences, it is common to present the literature review as +a narrative which describes a topic through its available research material +almost as a story arc. The quantitative analyst trying to replicate strategy or +model research should probably avoid this purely narrative form because it +typically lacks enough detail to anchor replication. There are two models which +you will return to; which model is appropriate will depend both on how the +replication is to proceed, and how the analyst's internal organizational +structure is envisioned. The first model is the *annotated bibliography*. It is organized alphabetically, by author, containing the bibliographic entry plus the paragraph of summary as @@ -176,16 +185,18 @@ The literature review should start with the key references of the paper you are trying to replicate. It should have become clear while reading and summarizing the paper which papers provided starting material or key techniques for this -paper. These papers should be located and summarized in a single paragraph -each, as described above. +paper. There will also often be key original references in a particular topic, +cited by many other papers. These papers should be located and summarized in a +single paragraph each, as described above. ## finding similar work Other papers which should be included in the literature review are similar work. Resources like Google Scholar will automatically recommend similar papers, and even order them by number of papers that reference the paper in the results. -The analyst will want to make sure that they at least review a few papers which -are at the top of a literature search on the key topics. +Key words and phrases from the source material may be used to find more recent +papers. 
The analyst will want to make sure that they at least review a few +papers which are at the top of a literature search on the key topics. ## references with implementation hints @@ -198,6 +209,19 @@ interest, aiding the analyst in collecting their thoughts, and aiding any readers to find the relevant material. +## refuting your hypothesis + +\textsc{The literature review may refute your initial hypothesis.} In searching +for more information about the topics or models you hope to replicate, you may +find sources that refute or challenge those methods or theories. This should be +carefully documented, as it is potentially very valuable. If the evidence in +the source materials is very detailed and has a good experimental design, you +may choose to document the evidence and scrap or revise your hypotheses. If the +new source is suggestive, inconclusive, or refutes a related hypothesis that is +not identical to the object of your replication study, then you may choose to +identify the argument, evidence, and tests, and add it to what you will test in +the replication phase of your project. + ___________ # Data @@ -415,7 +439,41 @@ replication. This type of analysis will often be saved for later work on the ideas contained in the source paper. +## going beyond simplifying assumptions +Another necessary place to look for extensions to the replication is in the +assumptions used while doing the analysis. Almost all published papers use +simplifying assumptions. Many of these assumptions exist to fit in well with +other similar literature. Others are made to make the analysis easier on the +authors of the paper. In many cases, these simplifying assumptions will make +a technique unsuitable for use with real data or real portfolios. + +Some examples of common simplifying assumptions and ways to extend or rectify +them appear below: + +1.
**Gaussian assumption** : + Probably the worst offender is the use of a Gaussian distribution to model + volatility, or errors, or noise, or to sample from. Many authors + acknowledge this in their text, and then use it anyway. What would be a + better choice for your modeling, and why? + +1. **sample moments** : + Papers use sample moments because it is easy, under the cover of saying + that sample moments introduce less model risk. Many other and better + methods exist for estimating moments. Note that with volatility, this + may be an extension of the **Gaussian assumption**, above. In other cases, + it introduces additional ancillary model choices (and subsequent model + testing). + +1. **too many/too few parameters**: + Choice of parameters is a key area where you can tune an analysis. Many + papers use a huge number of parameters; others choose a minimal or + parsimonious model with few parameters. Either of these choices, which made + it easier to publish the paper in the first place, is unlikely to make a + usable model on a real portfolio. @Kuhn2013 provides many guidelines on + parameter or feature choice, and when and how to increase or decrease the + number of parameters under consideration. + ## similar techniques It is quite likely that the Literature Review uncovered similar techniques to @@ -424,14 +482,16 @@ the replication. Typically, it is not worth spending a lot of time on this. Exceptions usually include when a paper claims to have improved a technique, but does an incomplete job of reporting results for the original (theoretically -deficient) technique. +deficient) technique. Another case where you should spend more time on similar +techniques is where one of the Literature Review papers extends, clarifies, or +refutes claims made in the paper under replication. ## probability of overfitting -@Peterson2015 discusses multiple techniques for detecting biases and overfitting.
-Most replication reports should contain results of appropriate tests. Specific -categories to pay attention to in most cases include selection biases, -look-ahead bias, and out of sample deterioration. +@Peterson2015 and @Kuhn2013 discuss multiple techniques for detecting biases and +overfitting. Most replication reports should contain results of appropriate +tests. Specific categories to pay attention to in most cases include selection +biases, look-ahead bias, and out of sample deterioration. ___________ @@ -498,4 +558,4 @@ ___________ -# References \ No newline at end of file +# References Modified: pkg/quantstrat/sandbox/backtest_musings/research_replication.pdf =================================================================== (Binary files differ) Modified: pkg/quantstrat/sandbox/backtest_musings/stat_process.bib =================================================================== --- pkg/quantstrat/sandbox/backtest_musings/stat_process.bib 2015-04-15 02:33:20 UTC (rev 1685) +++ pkg/quantstrat/sandbox/backtest_musings/stat_process.bib 2015-05-20 14:12:49 UTC (rev 1686) @@ -69,12 +69,26 @@ Number = {7317}, Pages = {753--753}, Volume = {467}, + Owner = {brian}, Publisher = {Nature Publishing Group}, Timestamp = {2015.01.14}, Url = {http://www.nature.com/news/2010/101013/full/467753a.html} } +@Article{baxter1999, + Title = {Measuring business cycles: approximate band-pass filters for economic time series}, + Author = {Baxter, Marianne and King, Robert G}, + Journal = {Review of economics and statistics}, + Year = {1999}, + Number = {4}, + Pages = {575--593}, + Volume = {81}, + + Publisher = {MIT Press}, + Url = {http://pages.stern.nyu.edu/~dbackus/GE_asset_pricing/ms/Filters/BaxterKing%20bandpass%20NBER%205022.pdf} +} + @Book{Box1987, Title = {Empirical model-building and response surfaces.}, Author = {Box, George E.P.
and Draper, Norman R.}, @@ -115,6 +129,7 @@ Author = {Dudler, Martin and Gmuer, Bruno and Malamud, Semyon}, Journal = {Available at SSRN 2457647}, Year = {2014}, + Abstract = {We introduce a new class of momentum strategies that are based on the long-term averages of risk-adjusted returns and test these strategies on a universe of 64 liquid futures contracts. We show that this risk adjusted momentum strategy outperforms the time series momentum strategy of Ooi, Moskowitz and Pedersen (2012) for almost all combinations of holding- and look-back periods. We construct measures of momentum-specific volatility (risk), (both within and across asset classes) and show that these volatility measures can be used both for risk management and it momentum timing. We find that momentum risk management significantly increases Sharpe ratios, but at the same time leads to more pronounced negative skewness and tail risk; by contrast, combining risk management with momentum timing practically eliminates the negative skewness of momentum returns and significantly reduces tail risk. In addition, momentum risk management leads to a much lower exposure to market, value, and momentum factors. As a result, risk-managed momentum returns offer much higher diversification benefits than the standard momentum returns.}, Owner = {brian}, Timestamp = {2015.01.15}, @@ -142,6 +157,7 @@ Author = {Fox, John and Weisberg, Sanford}, Year = {2011}, + Owner = {brian}, Publisher = {An Appendix to An R Companion to Applied Regression, Sage, Thousand Oaks, CA,}, Timestamp = {2015.01.13} @@ -164,10 +180,14 @@ @Article{Harvey2014, Title = {Evaluating Trading Strategies}, Author = {Harvey, Campbell R. 
and Liu, Yan}, - Journal = {SSRN}, + Journal = {Journal of Portfolio Management}, Year = {2014}, + Number = {5}, + Pages = {108-118}, + Volume = {40}, - Url = {http://ssrn.com/abstract=2474755} + Comment = {preprint at http://ssrn.com/abstract=2474755}, + Url = {https://faculty.fuqua.duke.edu/~charvey/Research/Published_Papers/P116_Evaluating_trading_strategies.pdf} } @Article{Harvey2013backtesting, @@ -219,6 +239,7 @@ Number = {7386}, Pages = {485--488}, Volume = {482}, + Owner = {brian}, Publisher = {Nature Publishing Group}, Timestamp = {2015.01.14}, @@ -241,6 +262,33 @@ Url = {http://journals.plos.org/plosmedicine/article?id=10.1371/journal.pmed.0020124#s6} } +@Article{Jegadeesh2002, + Title = {Cross-sectional and time-series determinants of momentum returns}, + Author = {Jegadeesh, Narasimhan and Titman, Sheridan}, + Journal = {Review of Financial Studies}, + Year = {2002}, + Number = {1}, + Pages = {143--157}, + Volume = {15}, + + Publisher = {Soc Financial Studies}, + Url = {http://www.researchgate.net/profile/Narasimhan_Jegadeesh/publication/5216887_Cross-Sectional_and_Time-Series_Determinants_of_Momentum_Returns/links/0a85e5383ba5d2941e000000.pdf} +} + +@Article{Kaastra1996, + Title = {Designing a neural network for forecasting financial and economic time series}, + Author = {Kaastra, Iebeling and Boyd, Milton}, + Journal = {Neurocomputing}, + Year = {1996}, + Number = {3}, + Pages = {215--236}, + Volume = {10}, + + Owner = {brian}, + Publisher = {Elsevier}, + Timestamp = {2015.05.19} +} + @Book{Kestner2003, Title = {Quantitative trading strategies: {H}arnessing the power of quantitative techniques to create a winning trading program}, Author = {Kestner, Lars}, @@ -248,12 +296,29 @@ Year = {2003} } +@Article{Kim2003, + Title = {Financial time series forecasting using support vector machines}, + Author = {Kim, Kyoung-jae}, + Journal = {Neurocomputing}, + Year = {2003}, + Number = {1}, + Pages = {307--319}, + Volume = {55}, + + Owner = {brian}, +
Publisher = {Elsevier}, + Timestamp = {2015.05.19}, + Url = {http://www.cse.ust.hk/~leichen/courses/comp630p/collection/reference-1-23.pdf} +} + @Book{Kuhn2013, Title = {Applied predictive modeling}, Author = {Kuhn, Max and Johnson, Kjell}, Publisher = {Springer}, Year = {2013}, + Owner = {brian}, + Timestamp = {2015.05.20}, Url = {http://appliedpredictivemodeling.com/} } @@ -278,6 +343,7 @@ Number = {1}, Pages = {181--212}, Volume = {9}, + Owner = {brian}, Publisher = {Informing Science Institute}, Timestamp = {2015.01.13}, @@ -318,6 +384,7 @@ Number = {6060}, Pages = {1226}, Volume = {334}, + Owner = {brian}, Publisher = {NIH Public Access}, Timestamp = {2015.01.14}, Modified: pkg/quantstrat/sandbox/backtest_musings/strat_dev_process.Rmd =================================================================== --- pkg/quantstrat/sandbox/backtest_musings/strat_dev_process.Rmd 2015-04-15 02:33:20 UTC (rev 1685) +++ pkg/quantstrat/sandbox/backtest_musings/strat_dev_process.Rmd 2015-05-20 14:12:49 UTC (rev 1686) @@ -11,8 +11,8 @@ keywords: quantitative trading, backtest, quantitative strategy, scientific method subject: quantitative trading, backtest, quantitative strategy, scientific method -footer: Copyright 2014 Brian G. Peterson CC-BY-NC-SA. Please do not distribute this draft without permission. -copyright: Copyright 2014 Brian G. Peterson CC-BY-NC-SA. Please do not distribute this draft without permission. +footer: Copyright 2014 Brian G. Peterson CC-BY-NC-SA. +copyright: Copyright 2014 Brian G. Peterson CC-BY-NC-SA. abstract: Analysts and portfolio managers face many challenges in developing new systematic trading systems. This paper provides a detailed, repeatable process to aid in evaluating new ideas, developing those ideas into testable hypotheses, measuring results in comparable ways, and avoiding and measuring the ever-present risks of over-fitting. ^[ *Back-testing. I hate it - it's just optimizing over history. You never see a bad back-test. Ever.
In any strategy.* - Josh Diedesch[-@Diedesch2014] ] @@ -1700,18 +1700,16 @@ # Acknowledgements I would like to thank my team for thoughtful comments and questions, John Bollinger, -and Stephen Rush at the University of Connecticut for his insightful comments -on an early draft of this paper. All remaining errors or omissions should be -attributed to the author. All views expressed in this paper are to be viewed -as those of Brian Peterson, and do not necessarily reflect the opinions or -policies of DV Trading or DV Asset Management. +Ilya Kipnis, and Stephen Rush at the University of Connecticut for insightful +comments on early drafts of this paper. All remaining errors or omissions +should be attributed to the author. All views expressed in this paper are to be +viewed as those of Brian Peterson, and do not necessarily reflect the opinions +or policies of DV Trading or DV Asset Management. ©2014-2015 Brian G. Peterson \includegraphics[width=1.75cm]{cc-by-nc-sa} -*Please do not distribute this draft without permission.* - The most recently published version of this document may be found at \url{http://goo.gl/na4u5d} \newpage Modified: pkg/quantstrat/sandbox/backtest_musings/strat_dev_process.pdf =================================================================== (Binary files differ) From noreply at r-forge.r-project.org Thu May 21 21:45:04 2015 From: noreply at r-forge.r-project.org (noreply at r-forge.r-project.org) Date: Thu, 21 May 2015 21:45:04 +0200 (CEST) Subject: [Blotter-commits] r1687 - pkg/quantstrat Message-ID: <20150521194504.6EC27185F82@r-forge.r-project.org> Author: braverock Date: 2015-05-21 21:45:04 +0200 (Thu, 21 May 2015) New Revision: 1687 Modified: pkg/quantstrat/DESCRIPTION Log: - remove xtsExtra, since plot is in xts upstream now Modified: pkg/quantstrat/DESCRIPTION =================================================================== --- pkg/quantstrat/DESCRIPTION 2015-05-20 14:12:49 UTC (rev 1686) +++ pkg/quantstrat/DESCRIPTION 2015-05-21
19:45:04 UTC (rev 1687) @@ -1,7 +1,7 @@ Package: quantstrat Type: Package Title: Quantitative Strategy Model Framework -Version: 0.9.1669 +Version: 0.9.1687 Date: $Date$ Author: Peter Carl, Brian G. Peterson, Joshua Ulrich, Jan Humme Depends: @@ -14,7 +14,6 @@ PortfolioAnalytics, rgl, testthat, - xtsExtra, rCharts, gamlss.util, reshape2, From noreply at r-forge.r-project.org Thu May 28 17:58:53 2015 From: noreply at r-forge.r-project.org (noreply at r-forge.r-project.org) Date: Thu, 28 May 2015 17:58:53 +0200 (CEST) Subject: [Blotter-commits] r1688 - pkg/quantstrat/R Message-ID: <20150528155853.74DEE186229@r-forge.r-project.org> Author: bodanker Date: 2015-05-28 17:58:53 +0200 (Thu, 28 May 2015) New Revision: 1688 Modified: pkg/quantstrat/R/paramsets.R Log: Fix environment subsetting in clone.portfolio clone.portfolio() was originally written when blotter portfolio objects were lists. Now that they're environments, subsetting by xts.tables fails because grep() returns positions and you can't subset environments by integers. Use grep(..., value=TRUE) to get the values of the tables that match the pattern, and use those character strings to subset the symbols environment. Modified: pkg/quantstrat/R/paramsets.R =================================================================== --- pkg/quantstrat/R/paramsets.R 2015-05-21 19:45:04 UTC (rev 1687) +++ pkg/quantstrat/R/paramsets.R 2015-05-28 15:58:53 UTC (rev 1688) @@ -60,7 +60,7 @@ { portfolio$symbols[[symbol]]$txn <- portfolio$symbols[[symbol]]$txn[1,] - xts.tables <- grep('(^posPL|txn)',names(portfolio$symbols[[symbol]])) + xts.tables <- grep('(^posPL|txn)',names(portfolio$symbols[[symbol]]), value=TRUE) for(xts.table in xts.tables) portfolio$symbols[[symbol]][[xts.table]] <- portfolio$symbols[[symbol]][[xts.table]][1,] }
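
The behaviour described in the r1688 log can be reproduced with a minimal sketch. It is not taken from quantstrat: `sym` is a hypothetical environment standing in for portfolio$symbols[[symbol]], it holds plain matrices rather than the xts tables blotter stores, and it assumes R >= 3.2.0, where names() works on environments.

```r
# Hypothetical stand-in for portfolio$symbols[[symbol]] (assumption: plain
# matrices instead of the xts tables that blotter actually stores).
sym <- new.env()
sym$txn       <- matrix(1:6, nrow = 3)
sym$posPL     <- matrix(1:6, nrow = 3)
sym$posPL.USD <- matrix(1:6, nrow = 3)
sym$other     <- "untouched"

# Without value=TRUE, grep() returns integer positions ...
idx <- grep('(^posPL|txn)', names(sym))
# ... and an environment cannot be subset by an integer, so this is an error.
err <- tryCatch(sym[[idx[1]]], error = function(e) e)

# With value=TRUE, grep() returns the matching names themselves.
xts.tables <- grep('(^posPL|txn)', names(sym), value = TRUE)

# Character names work for both reading from and assigning into an environment.
# drop=FALSE is only needed here because the stand-ins are matrices, not xts.
for (xts.table in xts.tables)
    sym[[xts.table]] <- sym[[xts.table]][1, , drop = FALSE]
```

After the loop, each matching table is reduced to its first row while `sym$other` is left alone, which mirrors what clone.portfolio() intends for the posPL and txn tables.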