HKUST CSE FYP 2017-18, TEAM RO4 OPTIMAL INVESTMENT STRATEGY USING SCALABLE MACHINE LEARNING AND DATA ANALYTICS FOR SMALL-CAP STOCKS

MOTIVATION: MACHINE LEARNING AND FINANCE

MOTIVATION
Category     Market capitalisation
Small-cap    < US$ 2B
Mid-cap      US$ 2B - US$ 10B
Large-cap    > US$ 10B
Market capitalisation = market value of a company's outstanding shares

MOTIVATION: SMALL CAPITALISATION STOCKS
- Higher risk and volatility
- Potentially higher returns
- Of most interest to retail investors
- Institutional investors not very active
- Listed on NASDAQ for at least 15 years

MOTIVATION: TARGET SEGMENT - RETAIL INVESTORS
- Lack sophistication and expert knowledge
- Have access to lower-quality research and resources
- Look for:
  - higher returns for lower risk
  - a diversified portfolio in a smaller investment

MOTIVATION: THE SMALL-CAP MARKET
- Little analyst coverage
- Less financial information published
- Market inefficiencies

OBJECTIVES
Machine learning models for prediction
+ Portfolio allocation using predictions
+ Web application for user interaction

OBJECTIVES
- Experiment with different machine learning algorithms for stock price forecasting
- Use time series predictions to allocate stocks within the user's risk threshold
- Develop a web application that lets users specify parameters and track their portfolio over time

DATA: DATA SOURCES
- Python scraper collects ticker symbols of NASDAQ small-cap stocks from the Zacks Stock Screener tool
- Ticker list cleaned for inconsistencies in preferred-stock symbols
- Historical stock prices extracted using the AlphaVantage API
- Prices filtered to the period between Oct 2001 and Feb 2018
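The date filter in the last step can be sketched as follows. This is only an illustration: the function name `filter_prices`, the dictionary layout and the exact first and last days of the range are assumptions, since the slides give only the months.

```python
from datetime import date

# Assumed inclusive bounds for "between Oct 2001 and Feb 2018";
# the exact start and end days are not stated in the slides.
START = date(2001, 10, 1)
END = date(2018, 2, 28)


def filter_prices(prices, start=START, end=END):
    """Keep (date, price) pairs inside [start, end], sorted by date."""
    return sorted((d, p) for d, p in prices.items() if start <= d <= end)


# Toy data for illustration only, not actual AlphaVantage output.
sample = {
    date(2001, 9, 28): 5.10,
    date(2001, 10, 1): 5.97,
    date(2018, 2, 28): 6.25,
    date(2018, 3, 1): 6.31,
}
filtered = filter_prices(sample)
```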

PRICE PREDICTION MODEL

PRICE PREDICTION MODEL
Leverages machine learning to predict stock prices for a month ahead

PRICE PREDICTION MODEL: PROBLEMS SOLVED BY ML
1. Classification
2. Regression

PRICE PREDICTION MODEL: PROBLEM WE ARE SOLVING
1. Classification
2. Regression: forecasting a stock price, which is a continuous value

PRICE PREDICTION MODEL: MACHINE LEARNING FOR STOCK PRICES
- Time series: a long list of decimal values (stock prices)
  5.9732, 5.9732, 5.9001, 5.9732, 6.0406, 5.9001, 6.2541, 6.0743, 6.0743, 5.8664, 5.8327, ...
- Features and targets?

FEATURE 1   FEATURE 2   ...   FEATURE M   TARGET VARIABLE
5.9732      5.9001      ...   6.0406      6.2541
5.9001      6.0406      ...   5.9001      5.8327
...

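PRICE PREDICTION MODEL: MACHINE LEARNING ALGORITHM - LONG SHORT-TERM MEMORY
- RNN (Recurrent Neural Network): a class of artificial neural network in which connections between units form a directed graph along a temporal sequence, so the network carries state from one time step to the next
- LSTM (Long Short-Term Memory): a type of RNN that can model long temporal sequences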

PRICE PREDICTION MODEL: MACHINE LEARNING ALGORITHM - LONG SHORT-TERM MEMORY
- Critical parameter to decide: the sequence length used to create the dataset for machine learning
- M = sequence length

FEATURE 1   FEATURE 2   ...   FEATURE M   TARGET
5.9732      5.9001      ...   6.0406      6.2541
5.9001      6.0406      ...   5.9001      5.8327
...

PRICE PREDICTION MODEL: MACHINE LEARNING ALGORITHM - LONG SHORT-TERM MEMORY
Multiple strategies for choosing the sequence length:
- Strategy 1: fix the sequence length for all stocks (e.g. 10)
  - May not give the best results
- Strategy 2: optimise the sequence length based on test RMSE
  - Unclear hypothesis space; exhaustive search is expensive

PRICE PREDICTION MODEL: MACHINE LEARNING ALGORITHM - LONG SHORT-TERM MEMORY
- Take the sequence length as 7
- Need a 30-day forecast
- Divide the time series 70/30 for training/testing
- Train using Root Mean Square Error as the loss function
- Create the dataset from the time series as follows:

Features (input)                        Target (output)
$p_t, p_{t+1}, \dots, p_{t+6}$          $p_{t+36}$
$p_{t+1}, p_{t+2}, \dots, p_{t+7}$      $p_{t+37}$

$p_t$: stock price on day t
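The windowing described above can be sketched in a few lines of Python. The function names `make_windows` and `chronological_split` are hypothetical, and splitting the generated samples (rather than the raw series) in time order is an assumption; the slides only state the 70/30 ratio.

```python
def make_windows(prices, seq_len=7, horizon=30):
    """Build (features, target) pairs from a price series.

    Features are seq_len consecutive prices p_t .. p_{t+seq_len-1};
    the target is the price `horizon` days after the last feature,
    i.e. p_{t+seq_len-1+horizon} (p_{t+36} for the defaults).
    """
    offset = seq_len - 1 + horizon
    samples = []
    for t in range(len(prices) - offset):
        features = prices[t:t + seq_len]
        target = prices[t + offset]
        samples.append((features, target))
    return samples


def chronological_split(samples, train_fraction=0.7):
    """Split samples in time order: earliest part for training, rest for testing."""
    cut = int(len(samples) * train_fraction)
    return samples[:cut], samples[cut:]


# Toy series for illustration: the "price" on day t is simply t.
prices = [float(day) for day in range(100)]
samples = make_windows(prices)
train, test = chronological_split(samples)
```

With the toy series, the first sample pairs days 0 to 6 with day 36, and the second pairs days 1 to 7 with day 37, matching the table above.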

PRICE PREDICTION MODEL: MACHINE LEARNING ALGORITHM - LONG SHORT-TERM MEMORY
[Figure: stock price (US$) against day]
- Unable to generalise on testing data
- Unreliable forecast

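PRICE PREDICTION MODEL: MACHINE LEARNING ALGORITHM - LINEAR REGRESSION
- Simpler model
- Fewer parameters
- $\text{StockPrice}_t = \beta_1 \cdot \text{StockPrice}_{t-30} + \beta_2 \cdot \text{StockPrice}_{t-60} + \beta_0$
- Train by maximising $R^2$ (the coefficient of determination), which is equivalent to minimising the squared error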
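A minimal sketch of fitting this two-lag regression, assuming an ordinary least-squares fit with NumPy; the slides do not name the library or solver the team used. The helper names `fit_lagged_regression` and `forecast` are hypothetical, and the series is synthetic.

```python
import numpy as np


def fit_lagged_regression(prices, lag1=30, lag2=60):
    """Least-squares fit of p_t = b0 + b1 * p_{t-lag1} + b2 * p_{t-lag2}."""
    p = np.asarray(prices, dtype=float)
    t = np.arange(lag2, len(p))  # days for which both lags exist
    X = np.column_stack([np.ones(len(t)), p[t - lag1], p[t - lag2]])
    coef, *_ = np.linalg.lstsq(X, p[t], rcond=None)
    return coef  # array [b0, b1, b2]


def forecast(prices, coef, lag1=30, lag2=60):
    """Predict the price lag1 days after the last observation (assumes lag2 > lag1)."""
    p = np.asarray(prices, dtype=float)
    # The target day is len(p) - 1 + lag1, so its two lags are
    # the last price and the price (lag2 - lag1) days before it.
    return coef[0] + coef[1] * p[-1] + coef[2] * p[-1 - (lag2 - lag1)]


# Synthetic series that follows the model exactly (illustration only).
rng = np.random.default_rng(0)
series = list(rng.uniform(5.0, 7.0, size=60))
for day in range(60, 300):
    series.append(1.0 + 0.6 * series[day - 30] + 0.3 * series[day - 60])

coef = fit_lagged_regression(series)
prediction = forecast(series, coef)
```

Because the synthetic series has no noise, the fit recovers the generating coefficients; on real prices the coefficients and the forecast quality would of course differ.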

PRICE PREDICTION MODEL: MACHINE LEARNING ALGORITHM - LINEAR REGRESSION
- Performs well on testing data
- Follows the general trend, unlike the LSTM
- 30-day forecast is reliable

ASSET ALLOCATION MODEL

ASSET ALLOCATION MODEL
Uses predictions to find the optimal set of stocks and the ratios in which to invest in them

ASSET ALLOCATION MODEL: MEAN-VARIANCE OPTIMISATION
- Proposed by Harry Markowitz in 1952
- Portfolio return is the weighted average of the individual stock returns:
  $R_w = w_1 R_1 + w_2 R_2 + \dots + w_n R_n$ (R: return, w: weight, n: number of stocks)
- Use the covariance matrix of returns to minimise the portfolio variance
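As a small worked example of these two quantities, the sketch below computes the weighted-average return and the portfolio variance (the sum over all i, j of w_i * cov_ij * w_j) for a two-stock portfolio. The function names and all numbers are illustrative assumptions, not project data.

```python
def portfolio_return(weights, returns):
    """Weighted average of individual returns: R_w = sum_i w_i * R_i."""
    return sum(w * r for w, r in zip(weights, returns))


def portfolio_variance(weights, cov):
    """Portfolio variance w' C w for a covariance matrix given as nested lists."""
    n = len(weights)
    return sum(weights[i] * cov[i][j] * weights[j]
               for i in range(n) for j in range(n))


# Toy inputs for illustration only.
weights = [0.5, 0.5]
expected_returns = [0.10, 0.20]
cov = [[0.04, 0.01],
       [0.01, 0.09]]

r_w = portfolio_return(weights, expected_returns)
var_w = portfolio_variance(weights, cov)
```

Here the return is 0.5 * 0.10 + 0.5 * 0.20 = 0.15 and the variance is 0.25 * 0.04 + 0.25 * 0.09 + 2 * 0.25 * 0.01 = 0.0375.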

ASSET ALLOCATION MODEL: MEAN-VARIANCE OPTIMISATION
[Figure: Markowitz bullet]

ASSET ALLOCATION MODEL: ALLOCATOR SCRIPT DESIGN
- User input: number of stocks, volatility threshold
- Modular design offers flexibility
- Sorting parameters:
  - Minimise risk (SD)
  - Maximise return (E[R])
  - Maximise risk efficiency (E[R]/SD)

Stock   E[R]   SD     E[R]/SD
A       5%     1.2%   4.17
E       7%     2.2%   3.18
C       10%    4%     2.50
D       2%     0.8%   2.50
B       8%     4.5%   1.78
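The sorting step can be sketched as below, using the five stocks from the table. The function `select_stocks`, the criterion names, and the choice to apply the volatility threshold as a per-stock filter on SD are all assumptions; the slides do not show how the allocator script applies the user's inputs.

```python
# ticker: (expected return in %, standard deviation in %), from the table above
STOCKS = {
    "A": (5.0, 1.2),
    "B": (8.0, 4.5),
    "C": (10.0, 4.0),
    "D": (2.0, 0.8),
    "E": (7.0, 2.2),
}

# Each criterion maps (E[R], SD) to a sort key where smaller means better.
SORT_KEYS = {
    "min_risk": lambda er, sd: sd,
    "max_return": lambda er, sd: -er,
    "max_efficiency": lambda er, sd: -er / sd,
}


def select_stocks(stocks, n, max_sd, criterion="max_efficiency"):
    """Drop stocks above the volatility threshold, rank the rest, keep the top n."""
    key = SORT_KEYS[criterion]
    eligible = [ticker for ticker, (er, sd) in stocks.items() if sd <= max_sd]
    ranked = sorted(eligible, key=lambda ticker: key(*stocks[ticker]))
    return ranked[:n]


top_two = select_stocks(STOCKS, n=2, max_sd=4.0)
safest_two = select_stocks(STOCKS, n=2, max_sd=10.0, criterion="min_risk")
```

Keeping the criteria in a dictionary is one way to get the modularity the slide mentions: a new sorting parameter is a single extra entry.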

ASSET ALLOCATION MODEL: ALLOCATOR SCRIPT IMPLEMENTATION
1. User provides input through the web application
2. Input is processed to obtain parameters
3. Covariance matrix is constructed and convex optimisation is done using the cvxopt library
4. Results are returned to the JavaScript application
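To illustrate what the optimisation in step 3 computes, here is a sketch of the simplest variant: the minimum-variance portfolio with weights summing to 1. It deliberately swaps the cvxopt solver for a closed-form NumPy solution, so it is not the project's implementation. It handles neither a return target nor a no-short-selling constraint, both of which a quadratic programming solver can express. The function name and covariance values are assumptions.

```python
import numpy as np


def min_variance_weights(cov):
    """Weights minimising w' C w subject to sum(w) = 1 (short positions allowed).

    Closed form: w is proportional to C^{-1} 1, rescaled to sum to 1.
    """
    cov = np.asarray(cov, dtype=float)
    ones = np.ones(cov.shape[0])
    raw = np.linalg.solve(cov, ones)  # solves C x = 1 without forming the inverse
    return raw / raw.sum()


# Toy covariance matrix: two uncorrelated stocks with variances 0.04 and 0.01.
cov = np.array([[0.04, 0.00],
                [0.00, 0.01]])
weights = min_variance_weights(cov)
```

With uncorrelated stocks the weights are proportional to the inverse variances (25 and 100), so the less volatile stock receives 80% of the allocation.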

WEB APPLICATION

WEB APPLICATION
Interactive user interface for managing the portfolio and tracking changes to it

WEB APPLICATION: FRAMEWORKS AND TOOLS

Component     Purpose
HTML5, CSS    Styling web pages
Bootstrap     Styling components of
AngularJS     Backend application logic
D3.js         Render charts and graphs using SVG components
jQuery        Application logic for front-end components' behaviours
Flask         Develop front-to-back-end applications in Python; used for running the allocation script
Firebase      Services like authentication, NoSQL user database

WEB APPLICATION: SERVICES OFFERED
1. Authentication using social network APIs - Google, Facebook
2. Stocks Analyser
   - Graphical representation of historical prices and the predicted price for the upcoming month, for all stocks
3. Portfolio Manager
   - View current portfolio constituents, ratios and growth
   - Optimise portfolio using custom parameters
4. Portfolio Growth Analyser
   - Evaluate growth over time
   - Compare growth with that of benchmarks

DEMO

TESTING AND EVALUATION

TESTING AND EVALUATION: PRICE PREDICTION MODEL TESTING
1. Debugging and testing
2. Loss function (during model training):
   - RMSE (Root Mean Square Error) - LSTM
   - $R^2$ (coefficient of determination) - Linear and Multiple Linear Regression
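For reference, the two measures named above can be written out directly from their standard definitions. The function names are hypothetical and the sample values are toy numbers.

```python
import math


def rmse(actual, predicted):
    """Root mean square error: sqrt(mean((a - p)^2)). Lower is better."""
    squared = [(a - p) ** 2 for a, p in zip(actual, predicted)]
    return math.sqrt(sum(squared) / len(squared))


def r_squared(actual, predicted):
    """Coefficient of determination: 1 - SS_res / SS_tot. Higher is better."""
    mean = sum(actual) / len(actual)
    ss_res = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    ss_tot = sum((a - mean) ** 2 for a in actual)
    return 1.0 - ss_res / ss_tot


# Toy values for illustration only.
actual = [1.0, 2.0, 3.0]
error = rmse(actual, [1.0, 2.0, 5.0])
perfect_fit = r_squared(actual, [1.0, 2.0, 3.0])
mean_only_fit = r_squared(actual, [2.0, 2.0, 2.0])
```

A model that always predicts the mean scores an R^2 of 0, and a perfect fit scores 1, which is why R^2 is maximised while RMSE is minimised.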

TESTING AND EVALUATION: PRICE PREDICTION MODEL EVALUATION
1. Evaluated using the Portfolio Growth Analyser feature of the web application
2. Multiple linear regression gave the best, most consistent results across all stocks

TESTING AND EVALUATION: ASSET ALLOCATION MODEL TESTING
1. White box testing - Pylint for syntax and coding errors
2. Black box testing - CPU usage, memory and context switching statistics to check for memory leaks in the convex optimisation component
3. Manual checks for formats and validation of value ranges

TESTING AND EVALUATION: ASSET ALLOCATION MODEL EVALUATION
1. Beat benchmarks in 35 out of 36 simulated months
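One way such a month-by-month comparison could be counted is sketched below. It assumes that "beating the benchmark" means a strictly higher monthly return, which the slides do not define. The function names and month-end values are toy assumptions, not the project's simulation data.

```python
def monthly_returns(values):
    """Simple returns between consecutive month-end values."""
    return [later / earlier - 1.0 for earlier, later in zip(values, values[1:])]


def months_beating(portfolio_values, benchmark_values):
    """Count months in which the portfolio return exceeds the benchmark return."""
    pairs = zip(monthly_returns(portfolio_values),
                monthly_returns(benchmark_values))
    return sum(1 for p, b in pairs if p > b)


# Toy month-end values for illustration only.
portfolio = [100.0, 104.0, 103.0, 108.0]
benchmark = [100.0, 102.0, 103.0, 105.0]
wins = months_beating(portfolio, benchmark)
```

In this toy run the portfolio wins the first and third months and loses the second, so `wins` is 2 out of 3.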

TESTING AND EVALUATION: WEB APPLICATION EVALUATION

Usability testing                                Average rating
Usability of Login Page                          4.2 / 5.0
Usability of Services Page                       4.7 / 5.0
Usability of Stocks Explorer Page                4.1 / 5.0
Usability of Portfolio Manager Page              4.4 / 5.0
Usability of Portfolio Growth Analyser Page      4.4 / 5.0

DISCUSSION AND CONCLUSION

DISCUSSION AND CONCLUSION: CHALLENGES FACED
1. Data collection and preprocessing for consistency
2. Accurate prediction of stock prices over time
3. Adaptation of portfolio allocation theories for price prediction models generated using machine learning techniques
4. Integration of the Flask application into the web application

DISCUSSION AND CONCLUSION: FINAL THOUGHTS
- Expected LSTM to perform better than multiple linear regression
  - Overfitting
- Limited resources, computation power and time
- Transaction fees not included in the calculation of portfolio growth
- Real-life limitations beyond the scope of our project

DISCUSSION AND CONCLUSION: FURTHER AREAS OF EXPANSION/IMPROVEMENT
- Try more machine learning algorithms
- Incorporate other portfolio theories
- Improve the current algorithm to increase prediction accuracy
- Include non-financial data such as tweets, weather data and Google Trends results

THANK YOU! QUESTIONS?