Investing through Economic Cycles with Ensemble Machine Learning Algorithms


Investing through Economic Cycles with Ensemble Machine Learning Algorithms Thomas Raffinot Silex Investment Partners Big Data in Finance Conference Thomas Raffinot (Silex-IP) Economic Cycles-Machine Learning Big Data in Finance 1 / 22

Turning points detection in real time: Ensemble ML algorithms
- In theory, investment strategies based on growth cycle turning points outperform not only passive buy-and-hold benchmarks, but also business cycle strategies
- Nowcasting growth cycle turning points in real time in the euro area and in the United States to time markets
- Non-parametric models to avoid local maxima in the likelihood
- Ensemble machine learning algorithms:
  - Random forest (Breiman (2001))
  - Boosting (Schapire (1990))

Ensemble Machine Learning Algorithms
- Machine learning adapts statistical methods to get better results in an environment with much more data and processing power
- Ensemble algorithms: making decisions based on the input of multiple people or experts
- Entertain a large number of predictors and perform estimation and variable selection simultaneously
- Random forest (Breiman (2001)): simple averaging of models
- Boosting (Schapire (1990)): iterative process in which the remaining errors are modelled at each step

Random forest
Each decision tree is built from a bootstrapped sample of the full dataset and then, at each node, only a random sample of the available variables is used.
Algorithm:
I. Given that a training set consists of $N$ observations and $M$ features, choose a number $m \le M$ of features to randomly select for each tree and a number $K$ that represents the number of trees to grow.
II. Take a bootstrap sample $Z$ of the $N$ observations, so that about two thirds of the cases are chosen. Then randomly select $m$ features.
III. Grow a CART using the bootstrap sample $Z$ and the $m$ randomly selected features.
IV. Repeat steps II and III $K$ times.
V. Output the ensemble of trees $\{T_i\}_{i=1}^{K}$.
VI. For regression, to make a prediction at a new point $x$:

$$\hat{y}_{RF}(x) = \frac{1}{K} \sum_{i=1}^{K} T_i(x)$$
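The steps above can be sketched in a few lines of standard-library Python. This is only an illustration under stated assumptions, not the implementation used in the talk: the CART is reduced to a depth-1 regression tree (a stump), the function names `fit_stump`, `fit_random_forest` and `predict_random_forest` are hypothetical, and the toy data are invented. With a stump there is a single node, so drawing features per tree and per node coincide.

```python
import random

def fit_stump(rows, targets, feature_ids):
    """Depth-1 regression tree: best least-squares split among the allowed features."""
    best = None
    for j in feature_ids:
        for threshold in sorted({row[j] for row in rows}):
            left = [t for row, t in zip(rows, targets) if row[j] <= threshold]
            right = [t for row, t in zip(rows, targets) if row[j] > threshold]
            if not right:  # splitting at the maximum leaves one side empty
                continue
            mean_l, mean_r = sum(left) / len(left), sum(right) / len(right)
            sse = (sum((t - mean_l) ** 2 for t in left)
                   + sum((t - mean_r) ** 2 for t in right))
            if best is None or sse < best[0]:
                best = (sse, j, threshold, mean_l, mean_r)
    if best is None:  # no valid split: fall back to the sample mean
        mean = sum(targets) / len(targets)
        return lambda x: mean
    _, j, threshold, mean_l, mean_r = best
    return lambda x: mean_l if x[j] <= threshold else mean_r

def fit_random_forest(rows, targets, m, K, seed=0):
    """Steps I-V: K trees, each on a bootstrap sample and m random features."""
    rng = random.Random(seed)
    n, M = len(rows), len(rows[0])
    trees = []
    for _ in range(K):
        idx = [rng.randrange(n) for _ in range(n)]   # step II: bootstrap sample Z
        feats = rng.sample(range(M), m)              # step II: m random features
        trees.append(fit_stump([rows[i] for i in idx],
                               [targets[i] for i in idx], feats))  # step III
    return trees

def predict_random_forest(trees, x):
    """Step VI: simple average of the individual tree predictions."""
    return sum(tree(x) for tree in trees) / len(trees)

# Toy data (invented): only the first feature drives the target.
rows = [(i / 40, (7 * i % 40) / 40, (13 * i % 40) / 40) for i in range(40)]
targets = [1.0 if row[0] > 0.5 else 0.0 for row in rows]
forest = fit_random_forest(rows, targets, m=2, K=50)
pred_high = predict_random_forest(forest, (0.9, 0.5, 0.5))
pred_low = predict_random_forest(forest, (0.1, 0.5, 0.5))
```

Trees that never see the informative feature return the same value for both query points, so the averaged prediction separates the two points only through the trees that were allowed to split on it.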

The gradient descent view of boosting (Friedman (2001))
- The task is to estimate the function $\hat{f}(x)$ that minimizes the expectation of some loss function $\Psi(y, f)$, i.e.,

$$\hat{f}(x) = \arg\min_{f(x)} E\left[\Psi(y, f(x))\right]$$

- One has to provide the choices of functional parameters: the loss function $\Psi(y, f)$ and the weak learner $h(x, \theta)$
- The function estimate $\hat{f}(x)$ is parameterized in the additive functional form:

$$\hat{f}(x) = \sum_{m=1}^{M_{stop}} \beta_m h(x, \theta_m)$$

- The original function optimization problem has thus been changed to a parameter optimization problem
- The size of the ensemble is determined by $M_{stop}$, which is chosen by cross-validation

Boosting: loss-functions
The most frequently used loss-functions for classification are the following. $y$ typically takes on binary values, $y \in \{0, 1\}$. To simplify the notation, let us assume the transformed labels $\bar{y} = 2y - 1$, making $\bar{y} \in \{-1, 1\}$.
- Adaboost loss function: $\Psi(y, f(x)) = \exp(-\bar{y} f(x))$
- Binomial loss function: $\Psi(y, f(x)) = \log(1 + \exp(-2\bar{y} f(x)))$

The most frequently used loss-functions for regression are the following:
- Squared error loss: $\Psi(y, f(x)) = (y - f(x))^2$
- Absolute loss: $\Psi(y, f(x)) = |y - f(x)|$
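The four loss functions listed above translate directly into code. The sketch below is a plain transcription for illustration; the function names are hypothetical, and `to_signed` implements the label transformation from {0, 1} to {-1, 1}.

```python
import math

def to_signed(y):
    """Map a binary label y in {0, 1} to y_bar = 2y - 1 in {-1, 1}."""
    return 2 * y - 1

def adaboost_loss(y, f):
    """Exponential (Adaboost) loss exp(-y_bar * f)."""
    return math.exp(-to_signed(y) * f)

def binomial_loss(y, f):
    """Binomial loss log(1 + exp(-2 * y_bar * f))."""
    return math.log(1 + math.exp(-2 * to_signed(y) * f))

def squared_error_loss(y, f):
    """Squared error loss (y - f)^2 for regression."""
    return (y - f) ** 2

def absolute_loss(y, f):
    """Absolute loss |y - f| for regression."""
    return abs(y - f)
```

Both classification losses shrink when the score `f` has the same sign as the transformed label and grow when the signs disagree, which is what lets boosting concentrate on misclassified observations.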

GBM algorithm with shrinkage
Step 1. Initialize $\hat{f}_0(x) = \arg\min_{\rho} \sum_{i=1}^{N} \Psi(y_i, \rho)$, $m = 0$.
Step 2. $m = m + 1$.
Step 3. Compute the negative gradient

$$z_i = -\left.\frac{\partial \Psi(y_i, f(x_i))}{\partial f(x_i)}\right|_{f(x_i) = \hat{f}_{m-1}(x_i)}, \quad i = 1, \dots, N$$

Step 4. Fit the base-learner function $h(x, \theta)$ to be the most correlated with the gradient vector:

$$\theta_m = \arg\min_{\beta, \theta} \sum_{i=1}^{N} \left[z_i - \beta h(x_i, \theta)\right]^2$$

Step 5. Find the best gradient descent step-size $\rho_m$:

$$\rho_m = \arg\min_{\rho} \sum_{i=1}^{N} \Psi\left(y_i, \hat{f}_{m-1}(x_i) + \rho h(x_i, \theta_m)\right)$$

Step 6. Update the estimate of $f_m(x)$ as

$$\hat{f}_m(x) \leftarrow \hat{f}_{m-1}(x) + \lambda \rho_m h(x, \theta_m)$$

Step 7. Iterate steps 2-6 until $m = M_{stop}$.
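The seven steps can be traced in a minimal standard-library sketch. It rests on several assumptions that are not in the slides: the loss is the squared error loss, there is a single feature, the weak learner is a depth-1 regression tree, and the names `fit_stump_1d`, `gbm_fit`, `gbm_predict` and `training_mse` are hypothetical. For squared error the step-5 line search has a closed form, which the code uses.

```python
def fit_stump_1d(xs, zs):
    """Least-squares depth-1 tree on one feature; assumes at least two distinct x values."""
    best = None
    for threshold in sorted(set(xs))[:-1]:  # skip the maximum so both sides are non-empty
        left = [z for x, z in zip(xs, zs) if x <= threshold]
        right = [z for x, z in zip(xs, zs) if x > threshold]
        mean_l, mean_r = sum(left) / len(left), sum(right) / len(right)
        sse = (sum((z - mean_l) ** 2 for z in left)
               + sum((z - mean_r) ** 2 for z in right))
        if best is None or sse < best[0]:
            best = (sse, threshold, mean_l, mean_r)
    _, threshold, mean_l, mean_r = best
    return lambda x: mean_l if x <= threshold else mean_r

def gbm_fit(xs, ys, m_stop, shrinkage):
    """Gradient boosting with shrinkage for the squared error loss (y - f)^2."""
    f0 = sum(ys) / len(ys)                 # step 1: the mean minimizes squared error
    preds = [f0] * len(xs)
    learners = []
    for _ in range(m_stop):                # steps 2 and 7
        z = [2 * (y - p) for y, p in zip(ys, preds)]   # step 3: negative gradient
        h = fit_stump_1d(xs, z)                        # step 4: fit weak learner to z
        hx = [h(x) for x in xs]
        denom = sum(v * v for v in hx)
        if denom == 0:                     # gradient already fitted exactly
            break
        # step 5: closed-form line search, valid for squared error only
        rho = sum((y - p) * v for y, p, v in zip(ys, preds, hx)) / denom
        preds = [p + shrinkage * rho * v for p, v in zip(preds, hx)]  # step 6
        learners.append((rho, h))
    return f0, shrinkage, learners

def gbm_predict(model, x):
    f0, shrinkage, learners = model
    return f0 + shrinkage * sum(rho * h(x) for rho, h in learners)

def training_mse(model, xs, ys):
    return sum((y - gbm_predict(model, x)) ** 2 for x, y in zip(xs, ys)) / len(xs)

# Toy data (invented): a smooth nonlinear target on ten points.
xs = [float(i) for i in range(10)]
ys = [x * x for x in xs]
mse_0 = training_mse(gbm_fit(xs, ys, m_stop=0, shrinkage=0.1), xs, ys)
mse_1 = training_mse(gbm_fit(xs, ys, m_stop=1, shrinkage=0.1), xs, ys)
mse_50 = training_mse(gbm_fit(xs, ys, m_stop=50, shrinkage=0.1), xs, ys)
```

With a small shrinkage value each iteration only removes part of the remaining error, so more iterations are needed; this is the trade-off that cross-validating the stopping iteration is meant to settle.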

Variables: almost non-revised series
- Financial series: government bonds, yield curves, investment-grade and high-yield corporate spreads, stock markets (large caps, large-cap sectors, small caps, mid caps, and the growth and value versions of those indexes), asset volatility, the VIX index and the VSTOXX index, commodities (crude oil, natural gas, gold, silver and CRB index), ...
- Economic surveys: European Commission, the Institute for Supply Management, the Conference Board and the National Association of Home Builders (NAHB)
- Real economic data: initial claims
- Different differencing lags were considered: 1 to 18 months
- More than 1000 variables
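One way to read the differencing lags of 1 to 18 months is as a family of features built from each monthly series. The sketch below assumes simple differences x_t - x_{t-k}; the slides do not say whether plain differences, log differences or growth rates were used, and the function name `lagged_differences` is hypothetical.

```python
def lagged_differences(series, max_lag=18):
    """Return {k: [x_t - x_(t-k)]} for k = 1..max_lag.

    Entries are None where the lag reaches back before the start of the sample.
    """
    return {
        k: [None if t < k else series[t] - series[t - k]
            for t in range(len(series))]
        for k in range(1, max_lag + 1)
    }

# Toy monthly series (invented): 24 observations.
series = [float(i * i) for i in range(24)]
features = lagged_differences(series)
```

Applying 18 lags to each raw series multiplies the number of candidate predictors, which is consistent with the count of more than 1000 variables.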

Different models
- Boosting:
  - Combination of a binomial loss function with decision trees (BTB), as in Ng (2014)
  - Combination of a squared error loss function with P-splines (SPB), as in Berge (2015) or Taieb et al. (2015)
- Random forest (RF)
- Competing models:
  - Acc classifies all data as acceleration
  - Slow classifies all data as slowdown
  - Random randomly assigns classes based on the proportions found in the training data
  - Prob refers to the probit model based on the term spread
  - MS refers to the Markov-switching dynamic factor model
  - EN refers to the elastic-net logistic model

Real time issues
- To implement the ensemble algorithms, a classification of economic regimes is needed. Applied to the context of nowcasting, it can be summarized as follows:

$$R_t = \begin{cases} 1, & \text{if in acceleration} \\ 0, & \text{otherwise} \end{cases}$$

- A recursive estimation is computed: the ensemble algorithms are trained each month on a sample that extends from the beginning of the sample through month $T - 12$, over which the turning point chronology is assumed known
- The estimation window thus expands as data accumulate, over the period from January 2002 to December 2013
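The recursive scheme can be made concrete by listing, for each nowcast month T, which months are available for training. The sketch below works on integer month indices; the function name `expanding_windows` and the parameter `label_delay` are hypothetical, and only the 12-month gap comes from the slide.

```python
def expanding_windows(n_months, first_nowcast, label_delay=12):
    """Yield (training_months, nowcast_month) pairs for an expanding window.

    Training uses months 0 .. T - label_delay, the span over which the
    turning point chronology is assumed known at nowcast month T.
    """
    for T in range(first_nowcast, n_months):
        train_end = T - label_delay
        if train_end < 0:  # not enough history yet
            continue
        yield range(0, train_end + 1), T

# Toy calendar (invented): 40 months, nowcasts start at month 24.
windows = list(expanding_windows(40, first_nowcast=24))
```

Each successive window keeps the same starting month and gains one observation, so the model is refitted every month without ever using regime labels from the most recent 12 months.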

Data snooping
- Data snooping occurs when a given set of data is used more than once for purposes of inference or model selection. It leads to the possibility that any successful results may be spurious because they could be due to chance (White (2000))
- Model Confidence Set (Hansen et al. (2011)): a model selection algorithm that filters a set of models from a given collection of models. The MCS aims at finding the best model and all models that are indistinguishable from the best

Classical criteria
- Brier's Quadratic Probability Score (QPS):

$$QPS = \frac{1}{F} \sum_{t=1}^{F} (\hat{y}_t - y_t)^2$$

- The Area Under the ROC curve (AUROC), defined by:

$$AUROC = \int_0^1 ROC(\alpha)\, d\alpha$$

where the Receiver Operating Characteristics (ROC) curve describes all possible combinations of true positive ($T_p(c)$) and false positive ($F_p(c)$) rates that arise as one varies the threshold $c$ used to make binomial forecasts from a real-valued classifier. As $c$ is varied from 0 to 1, the ROC curve is traced out in $(T_p(c), F_p(c))$ space that describes the classification ability of the model.
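Both criteria are short to compute by hand. The sketch below is an illustration with hypothetical function names and invented toy data; the AUROC is computed through its rank interpretation (the probability that a randomly chosen positive case receives a higher score than a randomly chosen negative one, ties counting one half), which is equal to the area under the ROC curve.

```python
def qps(probs, outcomes):
    """Brier's Quadratic Probability Score: mean squared forecast error."""
    return sum((p - y) ** 2 for p, y in zip(probs, outcomes)) / len(outcomes)

def auroc(scores, outcomes):
    """Area under the ROC curve via pairwise comparison of positives and negatives."""
    pos = [s for s, y in zip(scores, outcomes) if y == 1]
    neg = [s for s, y in zip(scores, outcomes) if y == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

# Toy regimes and forecast probabilities (invented).
outcomes = [1, 1, 0, 0, 1]
probs = [0.9, 0.8, 0.3, 0.6, 0.4]
score_qps = qps(probs, outcomes)
score_auroc = auroc(probs, outcomes)
always_acceleration = qps([1.0] * len(outcomes), outcomes)
```

A forecast that always announces acceleration has a QPS equal to the share of slowdown months in the sample, which is why the QPS values of two opposite constant classifiers add up to about one.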

Investment strategies
- Disconnection between econometric predictability and actual profitability (Cenesizoglu and Timmermann (2012))
- Very basic investment strategies:
  - Equity portfolio: if acceleration, 120% of the investor's wealth is invested in the asset and 20% of cash is borrowed; otherwise 80% of the wealth is invested in the asset and 20% is kept in cash
  - Asset allocation: if acceleration, 80% of the portfolio is allocated to equities and 20% to bonds; otherwise 40% of the portfolio is allocated to equities and 60% to bonds
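The two rules reduce to a regime-dependent weight applied to period returns. The sketch below is a simplified illustration with hypothetical function names: it assumes the same rate for borrowing and for holding cash, ignores transaction costs, and adds a maximum drawdown helper of the usual peak-to-trough form.

```python
def equity_strategy_return(signal, asset_ret, cash_ret=0.0):
    """120/80 rule: 120% in the asset if acceleration (signal == 1), else 80%."""
    weight = 1.2 if signal == 1 else 0.8
    # A weight above 1 means borrowing cash; below 1 means holding cash.
    return weight * asset_ret + (1 - weight) * cash_ret

def allocation_return(signal, equity_ret, bond_ret):
    """80/20 equities/bonds if acceleration, else 40/60."""
    weight = 0.8 if signal == 1 else 0.4
    return weight * equity_ret + (1 - weight) * bond_ret

def max_drawdown(returns):
    """Largest peak-to-trough loss of the cumulated wealth path (a negative number)."""
    wealth, peak, worst = 1.0, 1.0, 0.0
    for r in returns:
        wealth *= 1 + r
        peak = max(peak, wealth)
        worst = min(worst, wealth / peak - 1)
    return worst

# Toy month (invented): equities +10%, bonds +2%, cash at 2%.
r_equity_acc = equity_strategy_return(1, 0.10, cash_ret=0.02)
r_equity_slow = equity_strategy_return(0, 0.10, cash_ret=0.02)
r_alloc_acc = allocation_return(1, 0.10, 0.02)
r_alloc_slow = allocation_return(0, 0.10, 0.02)
```

Because the weights change only with the regime signal, any difference in performance between models comes entirely from how well and how early each model identifies the turning points.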

Classical evaluation criteria in the United States, January 2002 to December 2013

| Model | QPS | AUROC |
|---|---|---|
| SPB | 0.13 | |
| RF | 0.07 | 0.94 |
| BTB | 0.05 | 0.94 |
| Prob | 0.22 | |
| MS | 0.21 | |
| EN | 0.18 | |
| Acc | 0.21 | |
| Slow | 0.79 | |
| Random | 0.25 | |

Note: ** indicates the model is in the set of best models $M_{75\%}$.

Turning point signals of the reference cycle in the United States

| Turning point | SPB, RF, BTB |
|---|---|
| Trough: February 2003 | 0, -1, -2 |
| Peak: October 2007 | 1, -2, -1 |
| Trough: September 2009 | 1, 2, 3 |
| Peak: June 2011 | -3, 2 |
| Trough: December 2011 | 1, 1 |

Note: Value shown is the model-implied peak/trough calculated using a 0.5 threshold. A minus sign indicates a lead, i.e. the model anticipates the turning point date. A lone "-" indicates that the model did not generate any signal. SPB refers to a boosting model based on squared error loss with P-splines, RF refers to a random forest model, BTB refers to a boosting model based on binomial loss function with decision trees.

United States: 120/80 equity strategy, January 2002 to December 2013

| Model | Average returns | Volatility | SR | MDD |
|---|---|---|---|---|
| SPB | 0.110 | 0.149 | 0.74 | -0.43 |
| RF | 0.107 | 0.147 | 0.72 | -0.43 |
| BTB | 0.109 | 0.146 | 0.75 | -0.44 |
| Prob | 0.094 | 0.173 | 0.54 | -0.57 |
| MS | 0.101 | 0.171 | 0.59 | -0.56 |
| EN | 0.103 | 0.161 | 0.64 | -0.51 |
| Acc | 0.099 | 0.177 | 0.56 | -0.58 |
| Slow | 0.066 | 0.118 | 0.56 | -0.43 |
| Random | 0.092 | 0.155 | 0.59 | -0.51 |
| Benchmark | 0.083 | 0.147 | 0.56 | -0.51 |

Note: SR is the Sharpe ratio and MDD the maximum drawdown. ** indicates the model is in the set of best models $M_{75\%}$.

United States: dynamic asset allocation, January 2002 to December 2013

| Model | Average returns | Volatility | SR | MDD |
|---|---|---|---|---|
| SPB | 0.091 | 0.090 | 1.0 | -0.18 |
| RF | 0.088 | 0.088 | 0.98 | -0.18 |
| BTB | 0.091 | 0.087 | 1.0 | -0.20 |
| Prob | 0.074 | 0.113 | 0.66 | -0.39 |
| MS | 0.075 | 0.101 | 0.74 | -0.28 |
| EN | 0.077 | 0.098 | 0.79 | -0.25 |
| Acc | 0.075 | 0.116 | 0.65 | -0.42 |
| Slow | 0.060 | 0.058 | 1 | -0.18 |
| Random | 0.076 | 0.095 | 0.79 | -0.30 |
| Benchmark | 0.068 | 0.085 | 0.79 | -0.31 |

Note: SR is the Sharpe ratio and MDD the maximum drawdown. ** indicates the model is in the set of best models $M_{75\%}$.

Classical evaluation criteria in the euro area, January 2002 to December 2013

| Model | QPS | AUROC |
|---|---|---|
| SPB | 0.12 | 0.90 |
| RF | 0.11 | 0.91 |
| BTB | 0.12 | 0.90 |
| Prob | 0.25 | |
| MS | 0.20 | |
| EN | 0.15 | |
| Acc | 0.45 | |
| Slow | 0.54 | |
| Random | 0.48 | |

Note: ** indicates the model is in the set of best models $M_{75\%}$.

Turning point signals of the reference cycle in the euro area

| Turning point | SPB | RF | BTB |
|---|---|---|---|
| Trough: September 2003 | 1 | 1 | 0 |
| Peak: May 2004 | 11 | 9 | 10 |
| Trough: May 2005 | 4 | 3 | 4 |
| Peak: October 2007 | -1 | 1 | -2 |
| Trough: August 2009 | 1 | 3 | 2 |
| Peak: June 2011 | -1 | -2 | -2 |
| Trough: March 2013 | 2 | 2 | 3 |

Note: Value shown is the model-implied peak/trough calculated using a 0.5 threshold. A minus sign indicates a lead, i.e. the model anticipates the turning point date. A lone "-" indicates that the model did not generate any signal. SPB refers to a boosting model based on squared error loss with P-splines, RF refers to a random forest model, BTB refers to a boosting model based on binomial loss function with decision trees.

Euro area: 120/80 equity strategy, January 2002 to December 2013

| Model | Average returns | Volatility | SR | MDD |
|---|---|---|---|---|
| SPB | 0.085 | 0.161 | 0.53 | -0.46 |
| RF | 0.083 | 0.160 | 0.52 | -0.46 |
| BTB | 0.079 | 0.158 | 0.50 | -0.46 |
| Prob | 0.075 | 0.182 | 0.41 | -0.48 |
| MS | 0.076 | 0.178 | 0.43 | -0.47 |
| EN | 0.078 | 0.169 | 0.46 | -0.47 |
| Acc | 0.077 | 0.207 | 0.37 | -0.61 |
| Slow | 0.051 | 0.138 | 0.37 | -0.43 |
| Random | 0.076 | 0.182 | 0.42 | -0.53 |
| Benchmark | 0.064 | 0.173 | 0.37 | -0.54 |

Note: SR is the Sharpe ratio and MDD the maximum drawdown. ** indicates the model is in the set of best models $M_{75\%}$.

Euro area: dynamic asset allocation, January 2002 to December 2013

| Model | Average returns | Volatility | SR | MDD |
|---|---|---|---|---|
| SPB | 0.081 | 0.094 | 0.86 | -0.21 |
| RF | 0.080 | 0.093 | 0.86 | -0.22 |
| BTB | 0.075 | 0.091 | 0.83 | -0.22 |
| Prob | 0.064 | 0.114 | 0.56 | -0.25 |
| MS | 0.069 | 0.105 | 0.66 | -0.24 |
| EN | 0.071 | 0.098 | 0.72 | -0.23 |
| Acc | 0.060 | 0.137 | 0.44 | -0.44 |
| Slow | 0.052 | 0.070 | 0.75 | -0.21 |
| Random | 0.064 | 0.115 | 0.55 | -0.32 |
| Benchmark | 0.06 | 0.100 | 0.55 | -0.34 |

Note: SR is the Sharpe ratio and MDD the maximum drawdown. ** indicates the model is in the set of best models $M_{75\%}$.

Conclusion
- Timing the market based on the indicators is possible in real time
- Ensemble machine learning algorithms are effective
- Depending on the data and the objective, random forest sometimes performs better than boosting, sometimes not
- Further work:
  - Economic turning points forecasting (business cycles?)
  - New features (Google Trends, news-based sentiment values, ...)
  - Deep learning

Appendix: Correlations between lagged variables

References
Berge, T. (2015). Predicting Recessions with Leading Indicators: Model Averaging and Selection over the Business Cycle. Journal of Forecasting, 34(6):455-471.
Breiman, L. (2001). Random forests. Machine Learning, 45:5-32.
Cenesizoglu, T. and Timmermann, A. (2012). Do return prediction models add economic value? Journal of Banking and Finance, 36(11):2974-2987. International Corporate Finance Governance Conference.
Friedman, J. H. (2001). Greedy function approximation: A gradient boosting machine. The Annals of Statistics, 29:1189-1232.
Hansen, P., Lunde, A., and Nason, J. (2011). The model confidence set. Econometrica, 79(2):453-497.
Ng, S. (2014). Viewpoint: Boosting recessions. Canadian Journal of Economics, 47(1):1-34.
Schapire, R. E. (1990). The strength of weak learnability. In Machine Learning, pages 197-227.
Taieb, S. B., Huser, R., Hyndman, R. J., and Genton, M. G. (2015). Probabilistic time series forecasting with boosted additive models: an application to smart meter data. Technical report.
White, H. (2000). A Reality Check for Data Snooping. Econometrica, 68(5):1097-1126.