MS&E 448 Final Presentation High Frequency Algorithmic Trading


MS&E 448 Final Presentation
High Frequency Algorithmic Trading
Francis Choi, George Preudhomme, Nopphon Siranart, Roger Song, Daniel Wright
Stanford University
June 6, 2017
High-Frequency Trading MS&E448 June 6, 2017 1 / 29

Overview
- Review our strategy and progress from the midterm
- Changes in Data Processing
- Changes to Models
- Strategy and Simulations
- Results
- Evaluation and Next Steps

Recall from the Midterm
- Goal: Next-minute price movement prediction based on order book dynamics
- Data: Minute-by-minute consolidated book for S&P 500 ETF (IVV)
- Model: Random Forest three-way classifier
- Labels: Mid-price changes and spread-crossing
- Trading Strategy: Accumulating positions and closing them out at the end of the day
- Results: No profit generated yet

After the Midterm: Data Processing
- Changed the data from minute-by-minute to second-by-second
- Changed from three-way classification to binary classification (no longer using the spread-crossing label)
- Train and test on a rolling-window basis: 2 weeks of training data prior to each day
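
The rolling-window scheme on this slide can be sketched as follows. This is only an illustration, assuming the trading days come as an ordered list and that "2 weeks" means the 10 trading days immediately before each test day; the name `rolling_windows` and the parameter `train_days` are hypothetical and do not come from the presentation.

```python
from datetime import date, timedelta

def rolling_windows(trading_days, train_days=10):
    """Yield (training_days, test_day) pairs.

    Each test day is paired with the `train_days` trading days
    immediately before it (assumed: 10 days = 2 trading weeks).
    """
    for i in range(train_days, len(trading_days)):
        yield trading_days[i - train_days:i], trading_days[i]

# Example calendar: weekdays starting 2015-01-05 (holidays ignored for simplicity).
days = [date(2015, 1, 5) + timedelta(days=k) for k in range(28)]
days = [d for d in days if d.weekday() < 5]

for train, test in rolling_windows(days):
    print(test, "is predicted by a model trained on", train[0], "to", train[-1])
```

A model would be refit on each `train` slice before predicting its `test` day, so no test-day information leaks into training.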

Data (Example)

After the Midterm: New Labels
- AREA: Time-weighted PnL over the next period (area under the price movement curve)
- VWAP: Volume-weighted average price (VWAP) based on the inner bid and ask; the label is whether it goes up or down in the window.
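
One possible reading of the two labels is sketched below. The assumptions are ours, not the presenters': prices are sampled once per second, the AREA label is the sign of the summed price change over the next `horizon` samples, and the VWAP label compares the series at the end of the window with its current value. The function names and the 1/0 encoding are hypothetical.

```python
def area_label(mid, t, horizon):
    """Sign of the area under the price-movement curve after time t.

    With one sample per second (assumed), the area is the sum of
    (mid[t + k] - mid[t]) for k = 1..horizon.
    Returns 1 if the area is positive, otherwise 0.
    """
    area = sum(mid[t + k] - mid[t] for k in range(1, horizon + 1))
    return 1 if area > 0 else 0

def direction_label(series, t, horizon):
    """1 if the series (e.g. a VWAP series) is higher at t + horizon than at t, else 0."""
    return 1 if series[t + horizon] > series[t] else 0

# A path that spends most of the window above the start but ends below it:
prices = [100, 103, 103, 99]
print(area_label(prices, 0, 3))       # area = 3 + 3 - 1 = 5, so the label is 1
print(direction_label(prices, 0, 3))  # 99 < 100, so the label is 0
```

The example shows why the two labels can disagree: the area label rewards the whole path over the window, while an end-point comparison only looks at where the price finishes.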

After the Midterm: Adding New Features
- Bid-Ask Volume Imbalance: the number of shares at the bid minus the number of shares at the ask in the current order book.
- VWAP: a variation on mid-price where the average of the bid and ask prices is weighted according to their inverse volume.
- Second Order Derivatives: expand the feature universe to encompass multiple time periods.
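
The first two features can be written down directly from their descriptions. The sketch below assumes only the top-of-book quote (best bid and ask with their sizes) is used; the function names are hypothetical. Weighting the bid by 1/bid_size and the ask by 1/ask_size simplifies algebraically to cross-weighting each price by the opposite side's size.

```python
def volume_imbalance(bid_size, ask_size):
    """Shares at the bid minus shares at the ask."""
    return bid_size - ask_size

def inverse_volume_weighted_price(bid, ask, bid_size, ask_size):
    """Average of bid and ask, each weighted by the inverse of its own volume.

    (bid / bid_size + ask / ask_size) / (1 / bid_size + 1 / ask_size)
    simplifies to the expression below.
    """
    return (bid * ask_size + ask * bid_size) / (bid_size + ask_size)

# Heavy bid side: the weighted price moves from the mid (101.0) toward the ask.
print(volume_imbalance(300, 100))                           # 200
print(inverse_volume_weighted_price(100.0, 102.0, 300, 100))  # 101.5
```

With equal sizes on both sides the weighted price reduces to the plain mid-price.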

Model: Logistic Regression
- Outputs a probability (how confident we are) for each trade
- Advantages over random forest: it trains much faster, and the coefficients have an interpretation
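
To illustrate how a fitted logistic regression turns features into a probability, and why its coefficients are interpretable, here is a minimal sketch. The weights, bias and feature values are made up for illustration; they are not the fitted values from this project, and `predict_up_probability` is a hypothetical name.

```python
import math

def predict_up_probability(weights, bias, features):
    """Logistic model: P(up) = 1 / (1 + exp(-(bias + w . x)))."""
    z = bias + sum(w * x for w, x in zip(weights, features))
    return 1.0 / (1.0 + math.exp(-z))

def odds(p):
    """Convert a probability to odds p / (1 - p)."""
    return p / (1.0 - p)

# Hypothetical single-feature model (e.g. a scaled volume imbalance).
w = [0.4]
p_low = predict_up_probability(w, 0.0, [1.0])
p_high = predict_up_probability(w, 0.0, [2.0])

# Raising the feature by one unit multiplies the odds of "up" by exp(0.4).
print(odds(p_high) / odds(p_low), math.exp(0.4))
```

This odds-ratio reading of each coefficient is the sense in which the coefficients "have an interpretation".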

Model: Random Forest
- Again, outputs a probability (how confident we are) for each trade
- Key advantages over logistic regression: it doesn't assume any functional form, and it has slightly higher accuracy

Strategy
- Train the model on a rolling backwards window.
- At each second, use the model to arrive at a prediction with a probability estimate.
- If the probability estimate is above the threshold, make the predicted trade with the size weighted accordingly.
- Close out the trade at the end of the trading window.
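
The threshold-and-size step might look like the sketch below. The slides do not say how size is weighted, so the linear scaling between the threshold and full confidence, the default threshold of 0.6, and the `max_shares` cap are all assumptions for illustration; `decide_trade` is a hypothetical name.

```python
def decide_trade(p_up, threshold=0.6, max_shares=100):
    """Return a signed share quantity: positive buys, negative sells, 0 means no trade.

    The confidence is the probability of the predicted direction.
    Size grows linearly from 1 share just above the threshold to
    `max_shares` at full confidence (assumed weighting scheme).
    """
    confidence = max(p_up, 1.0 - p_up)
    if confidence <= threshold:
        return 0
    direction = 1 if p_up > 0.5 else -1
    scale = (confidence - threshold) / (1.0 - threshold)
    return direction * max(1, round(scale * max_shares))

for p in (0.55, 0.8, 0.2, 1.0):
    print(p, decide_trade(p))
```

Each non-zero quantity would then be closed out at the end of its trading window, as the last bullet describes.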

Thesys Simulator
Here is what we think it looks like

Thesys Simulator
Here is what it actually looks like

Thesys Simulator
- Very frustrating and very slow
- We decided to just pull the data from Thesys and do the simulations manually.

Results
- We chose 10 stocks and ETFs to test our trading strategies, selected based on liquidity
- These include XLF, CSCO, EEM, IVV, IWM, QQQ, UVXY, VXX, XLE, SPY
- Training Period: 2 weeks, 01/05/2015-01/16/2015
- Test Period: 2 weeks, 01/19/2015-01/30/2015
- We use PnL per trade as a performance metric
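
The performance metric can be computed as in the sketch below. It assumes each trade is recorded as a signed quantity with an entry and an exit price, that both fills happen at the mid-price, and that there are no transaction costs; the tuple layout and the name `pnl_per_trade` are hypothetical.

```python
def pnl_per_trade(trades):
    """Average profit and loss per trade.

    `trades` is a list of (quantity, entry_price, exit_price) tuples,
    where quantity is positive for a buy and negative for a sell.
    Returns 0.0 when there are no trades.
    """
    if not trades:
        return 0.0
    total = sum(qty * (exit_price - entry_price)
                for qty, entry_price, exit_price in trades)
    return total / len(trades)

# Prices in cents to keep the arithmetic exact.
example = [
    (100, 2000, 2002),   # long 100 shares, price rises 2 cents: +200
    (-50, 2002, 2004),   # short 50 shares, price rises 2 cents: -100
]
print(pnl_per_trade(example))  # 50.0 cents per trade
```

Dividing by the number of trades makes results comparable across thresholds, since a higher threshold produces fewer trades.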

Tuning Parameters
Figure: Heat map of accuracy for different decay and window length parameters. (Left) XLE, (Right) XLF.

Accuracy of Model: Logistic Regression
Figure: Prediction accuracy vs prediction threshold for the logistic regression model.
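
An accuracy-versus-threshold curve of this kind can be produced by scoring only the predictions whose confidence clears the threshold. The sketch below assumes binary labels (1 = up, 0 = down) and model outputs given as P(up); the function name and the sample numbers are made up for illustration.

```python
def accuracy_at_threshold(probs, labels, threshold):
    """Accuracy over the predictions whose confidence exceeds `threshold`.

    Confidence is max(p, 1 - p). Returns None when no prediction qualifies.
    """
    hits = 0
    total = 0
    for p, y in zip(probs, labels):
        if max(p, 1.0 - p) > threshold:
            total += 1
            predicted_up = p > 0.5
            hits += int(predicted_up == (y == 1))
    return hits / total if total else None

probs = [0.9, 0.55, 0.2, 0.7]
labels = [1, 0, 0, 1]
for th in (0.5, 0.75, 0.95):
    print(th, accuracy_at_threshold(probs, labels, th))
```

Raising the threshold keeps fewer but more confident predictions, which is why accuracy and the number of trades move in opposite directions along the curve.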

Accuracy of Model: Random Forest
Figure: Prediction accuracy vs prediction threshold for the random forest model.

Accuracy of Model: Difference
Overall, Random Forest has slightly better accuracy across threshold values.
Figure: Prediction accuracy difference (RF - LR) vs prediction threshold.

Cumulative PnL (XLF)
PnL increases steadily throughout the day: high Sharpe ratio!
Figure: Cumulative PnL within a day.

Trading PnL (XLF)
Logistic Regression with the VWAP label performs best in this case.
Figure: PnL per trade vs prediction threshold for each algorithm and label.

Trading PnL (XLF)
Tuning hyperparameters improves the model significantly.
Figure: PnL per trade vs prediction threshold for different hyperparameters.

Trading PnL (MSFT)
Random Forest with the AREA label performs best for MSFT.
Figure: PnL per trade vs prediction threshold for each algorithm and label.

Trading PnL (MSFT)
A combination of non-optimal hyperparameters, models and labels performs poorly.
Figure: PnL per trade vs prediction threshold for different hyperparameters.

Multiple Stocks
Random Forest with AREA labels. Window = 15, decay = 0.8.
Figure: PnL per trade vs prediction threshold for different stocks.

Multiple Stocks
Logistic Regression with AREA labels. Window = 15, decay = 0.8.
Figure: PnL per trade vs prediction threshold for different stocks.

Evaluating Our Strategy
Strengths:
- High accuracy rates: the model is doing a good job
- High PnL per trade with small variance, especially when training on a longer period of time
- The model can be generalized to multiple stocks/ETFs
- Performs well even in tumultuous historical periods and on hypothetical scenarios
Limitations:
- Hyperparameters have to be tuned for each stock
- High prediction accuracy does not always mean profit: the label isn't exactly a prediction of PnL
- Interpretability of the model

Future Work and Areas for Improvement
Within 10 weeks, we can't make the perfect trading strategy: there is still a lot we could improve. Some ideas for further work:
- Training on a longer period of time
- More sophisticated features: right now we only use the order book data; we could try including external features (such as an index like the VIX, or data on correlated securities, etc.)
- Converting to a strategy that trades at bid and ask (rather than mid-price)
- Modifying the strategy to handle scaled-up trade quantities
- Risk Management

Conclusion
- Idea: use machine learning techniques on the order book to make price movement predictions. Trade on these predictions to make $$$
- Models: Random forest, logistic regression
- Data: Second-by-second order book data from Thesys
- Calibrated trading frequency, prediction label, hyperparameters of models
- Performed simulations on historical data
- Promising results that can be built upon

Conclusion
The End
Questions?