UNDERSTANDING ML/DL MODELS USING INTERACTIVE VISUALIZATION TECHNIQUES


Chakri Cherukuri
Senior Researcher, Quantitative Financial Research Group

OUTLINE
Introduction
Applied machine learning in finance
Case studies
  Twitter Sentiment Analysis
  Learning option prices using deep learning tools
  Yield Curve dimensionality reduction (PCA vs Autoencoder)
Conclusion

APPLIED MACHINE LEARNING IN FINANCE

STRUCTURED DATA SETS
Task                     | Features                        | Labels                | Machine Learning Technique
-------------------------|---------------------------------|-----------------------|-----------------------------
Time series prediction   | Past returns, market conditions | Future returns        | LSTM
Illiquid asset pricing   | Asset characteristics           | Market price          | Boosted Trees/Random Forests
Trading strategies       | Market conditions               | Strategy to invest in | Boosted Trees/Random Forests
Dimensionality reduction | Yield curve                     | Yield curve           | PCA/Autoencoder
Exotic option pricing    | Deal/market parameters          | Price                 | Neural nets

UNSTRUCTURED DATA SETS
Task                                                             | Deep Learning Model
-----------------------------------------------------------------|------------------------------------
Object detection from satellite images                           | Conv nets
Abstractive summarization of news articles for quick consumption | RNN, attention based models
News/Twitter sentiment for stocks, commodities etc.              | NLP models (Word embeddings + Nets)
Entity embeddings for companies, news, documents                 | LSTM/RNN

TWITTER SENTIMENT ANALYSIS

NEWS/TWITTER SENTIMENT
News & social sentiment from the raw news story or tweet
  Unstructured
  Highly time-sensitive
Story-level sentiment
Company-level sentiment
Sentiment score can be used as a trading signal

RUSSELL 2000 STOCKS

TWITTER SENTIMENT CLASSIFICATION
Problem statement: Predict the sentiment (negative, neutral, positive) of a tweet for a company
  Example: "$CTIC Rated strong buy by three WS analysts. Increased target from $5 to $8." : Positive
Three-way classification problem
  Input: raw tweets
  Output: sentiment label {negative, neutral, positive}

METHODOLOGY
We are given labeled train and test data sets
Train classifier on training data set
Predict labels on test data and evaluate performance

ONE-VS-REST LOGISTIC REGRESSION
Train three binary classifiers, one for each label
  Model 1: Negative vs. Not Negative
  Model 2: Neutral vs. Not Neutral
  Model 3: Positive vs. Not Positive
Get probabilities (measures of confidence) for each label
Output the label associated with the highest probability
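The one-vs-rest scheme can be sketched in a few lines of NumPy. This is a minimal illustration under stated assumptions, not the implementation used in the talk: the tweets are made up, the bag-of-words features and the plain gradient-descent fit (`fit_binary`) are hypothetical helpers, and in practice a library such as scikit-learn would normally do this work. The three binary scores are rescaled so that they sum to 1.

```python
import numpy as np

# Made-up toy tweets (assumption, not the talk's data).
# Labels: 0 = negative, 1 = neutral, 2 = positive.
train_tweets = [
    "downgrade to sell after weak earnings",
    "shares plunge on weak guidance",
    "company to report earnings on monday",
    "conference call scheduled for monday",
    "rated strong buy target increased",
    "strong earnings beat shares surge",
]
train_labels = np.array([0, 0, 1, 1, 2, 2])
LABELS = ["negative", "neutral", "positive"]


def build_vocab(texts):
    """Map every word seen in the training texts to a column index."""
    words = sorted({w for t in texts for w in t.split()})
    return {w: i for i, w in enumerate(words)}


def vectorize(texts, vocab):
    """Bag-of-words counts; words outside the vocabulary are ignored."""
    X = np.zeros((len(texts), len(vocab)))
    for row, text in enumerate(texts):
        for word in text.split():
            if word in vocab:
                X[row, vocab[word]] += 1.0
    return X


def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))


def fit_binary(X, y, lr=0.5, steps=2000, l2=1e-3):
    """Fit one binary logistic regression by gradient descent."""
    w, b = np.zeros(X.shape[1]), 0.0
    for _ in range(steps):
        p = sigmoid(X @ w + b)
        w -= lr * (X.T @ (p - y) / len(y) + l2 * w)
        b -= lr * np.mean(p - y)
    return w, b


def fit_one_vs_rest(X, labels, n_classes=3):
    """One binary model per label: label k vs. not label k."""
    return [fit_binary(X, (labels == k).astype(float)) for k in range(n_classes)]


def predict_proba(models, X):
    """Per-label confidences, rescaled so each row sums to 1."""
    scores = np.column_stack([sigmoid(X @ w + b) for w, b in models])
    return scores / scores.sum(axis=1, keepdims=True)


vocab = build_vocab(train_tweets)
X_train = vectorize(train_tweets, vocab)
models = fit_one_vs_rest(X_train, train_labels)

proba = predict_proba(models, X_train)
pred = proba.argmax(axis=1)  # label with the highest probability

# Score an unseen tweet (the example from the problem statement, lower-cased)
new_tweet = ["rated strong buy by three ws analysts increased target"]
new_proba = predict_proba(models, vectorize(new_tweet, vocab))
print(LABELS[int(new_proba.argmax())], new_proba.round(3))
```

Keeping the three probabilities, rather than only the predicted label, is what makes the later performance analysis possible.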

CLASSIFIER PERFORMANCE ANALYSIS
Look at misclassifications
  Confusion matrix
Understand model predicted probabilities
  Triangle visualization
Fix data issues
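A confusion matrix for the three-way problem can be computed directly from true and predicted labels. The sketch below uses made-up label arrays; the helper names (`confusion_matrix`, `misclassified`) are illustrative and not taken from the talk's dashboard.

```python
import numpy as np

LABELS = ["negative", "neutral", "positive"]


def confusion_matrix(y_true, y_pred, n_classes=3):
    """cm[i, j] counts samples with true label i that were predicted as j."""
    cm = np.zeros((n_classes, n_classes), dtype=int)
    for t, p in zip(y_true, y_pred):
        cm[t, p] += 1
    return cm


def misclassified(y_true, y_pred, true_label, pred_label):
    """Indices of samples falling in one cell of the confusion matrix."""
    y_true, y_pred = np.asarray(y_true), np.asarray(y_pred)
    return np.flatnonzero((y_true == true_label) & (y_pred == pred_label))


# Made-up labels: 0 = negative, 1 = neutral, 2 = positive
y_true = [0, 0, 1, 1, 2, 2, 2]
y_pred = [0, 1, 1, 1, 2, 0, 2]

cm = confusion_matrix(y_true, y_pred)
print(cm)

# Which positive tweets were predicted as negative?
pos_as_neg = misclassified(y_true, y_pred, true_label=2, pred_label=0)
print(pos_as_neg)
```

Returning the sample indices behind a cell is the step that lets an interactive view show the underlying tweets for a selected cell.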

TRIANGLE VISUALIZATION
Model returns 3 probabilities (which sum to 1)
How can we visualize these 3 numbers?
Points inside an equilateral triangle
[Figure: triangle with example points annotated "Negative / Neutral", "Not sure" and "Very positive"]
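Three probabilities that sum to 1 are barycentric coordinates, so each prediction maps to one point inside a triangle. The sketch below assumes a particular vertex layout (negative bottom-left, neutral top, positive bottom-right); the layout used in the talk's figure is not stated.

```python
import numpy as np

# Vertices of an equilateral triangle with unit side.
# Row order matches the probability columns: negative, neutral, positive.
# The placement of each label is an assumption made for illustration.
VERTICES = np.array([
    [0.0, 0.0],               # negative
    [0.5, np.sqrt(3) / 2.0],  # neutral
    [1.0, 0.0],               # positive
])


def to_triangle(proba):
    """Map rows of (p_neg, p_neu, p_pos) to 2D points inside the triangle."""
    proba = np.asarray(proba, dtype=float)
    proba = proba / proba.sum(axis=1, keepdims=True)  # guard against rounding
    return proba @ VERTICES


points = to_triangle([
    [1.0, 0.0, 0.0],        # certain negative: lands on the negative vertex
    [0.5, 0.5, 0.0],        # torn between negative and neutral: on that edge
    [1 / 3, 1 / 3, 1 / 3],  # not sure: lands on the centroid
    [0.02, 0.03, 0.95],     # very positive: close to the positive vertex
])
print(points)
```

A point's distance from a vertex reflects how confident the model is in that label, which is why uncertain predictions cluster near the center.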

PERFORMANCE ANALYSIS DASHBOARD
Use the dashboard to:
  Analyze misclassifications (using confusion matrix)
  Improve model by adding more features (by looking at model coefficients)
  Fix data issues (using triangle and lasso)

ANALYZE MISCLASSIFICATIONS

USE LASSO TO FIND DATA ISSUES

DEEP LEARNING TOOLS

NEURAL NETWORK WIZARD
Graphical tool to build, train and diagnose deep learning models
Real time plots during the training process:
  Loss/Accuracy curves
  Distributions of weights/biases/activations at each layer
Diagnostic plots:
  Analysis of residuals (for regression) / Confusion matrix (for classification)
  Partial dependencies
  Conditional residual plots/histograms

NETWORK PARAMETERS

NETWORK ARCHITECTURE

LOSS AND ACCURACY CURVES

DISTRIBUTIONS OF WEIGHTS/BIASES/ACTIVATIONS

PARTIAL AND CONDITIONAL DEPENDENCIES

Training dataset
S        | T      | sigma  | moneyness
---------|--------|--------|----------
103.1720 | 1.0001 | 0.2970 | 1.1086
106.8025 | 1.9337 | 0.0059 | 1.5291
 73.6899 | 1.0049 | 0.3483 | 1.2806
 96.9050 | 0.4798 | 0.2530 | 1.6489
129.9036 | 1.6109 | 0.0286 | 0.3932
 70.6674 | 0.6879 | 0.5089 | 1.6949
126.6076 | 1.1710 | 0.4051 | 1.3195
 95.6398 | 0.0855 | 0.4133 | 0.4837
114.1751 | 1.4486 | 0.2888 | 0.3599
 79.0308 | 0.5609 | 0.4857 | 1.1420
127.8912 | 0.3830 | 0.4798 | 0.3025

Conditioned on S=70, S=80 and S=120
Three copies of the training dataset in which every S value is replaced by 70, 80 and 120 respectively; the T, sigma and moneyness columns are shown unchanged. For example, the first rows conditioned on S=70:
S  | T      | sigma  | moneyness
---|--------|--------|----------
70 | 1.0001 | 0.2970 | 1.1086
70 | 1.9337 | 0.0059 | 1.5291
70 | 1.0049 | 0.3483 | 1.2806

Conditioned on T=1, T=.5 and T=2
Three copies of the training dataset in which every T value is replaced by 1, 0.5 and 2 respectively; the S, sigma and moneyness columns are shown unchanged. For example, the first rows conditioned on T=1:
S        | T | sigma  | moneyness
---------|---|--------|----------
103.1720 | 1 | 0.2970 | 1.1086
106.8025 | 1 | 0.0059 | 1.5291
 73.6899 | 1 | 0.3483 | 1.2806
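The conditioning shown in these tables is the core step of a partial dependence calculation: fix one feature at a chosen value for every row, run the model on the modified copy, and average the predictions. The sketch below uses the training rows from the slide. The model `toy_price` is a made-up stand-in (an assumption, not the network from the talk), chosen only so that the result is easy to verify by hand.

```python
import numpy as np

# Training rows from the slide; columns: S, T, sigma, moneyness
X_train = np.array([
    [103.1720, 1.0001, 0.2970, 1.1086],
    [106.8025, 1.9337, 0.0059, 1.5291],
    [73.6899, 1.0049, 0.3483, 1.2806],
    [96.9050, 0.4798, 0.2530, 1.6489],
    [129.9036, 1.6109, 0.0286, 0.3932],
    [70.6674, 0.6879, 0.5089, 1.6949],
    [126.6076, 1.1710, 0.4051, 1.3195],
    [95.6398, 0.0855, 0.4133, 0.4837],
    [114.1751, 1.4486, 0.2888, 0.3599],
    [79.0308, 0.5609, 0.4857, 1.1420],
    [127.8912, 0.3830, 0.4798, 0.3025],
])
S_COL, T_COL, SIGMA_COL = 0, 1, 2


def conditioned_copy(X, col, value):
    """Copy of X with one column fixed at `value` (the original is untouched)."""
    Xc = np.array(X, dtype=float, copy=True)
    Xc[:, col] = value
    return Xc


def partial_dependence(predict, X, col, grid):
    """Average prediction over the data with column `col` fixed at each grid value."""
    return np.array([predict(conditioned_copy(X, col, v)).mean() for v in grid])


def toy_price(X):
    """Hypothetical stand-in model, NOT an option pricing formula."""
    return 0.5 * X[:, S_COL] + 10.0 * X[:, T_COL] * X[:, SIGMA_COL]


pd_S = partial_dependence(toy_price, X_train, S_COL, grid=[70.0, 80.0, 120.0])
pd_T = partial_dependence(toy_price, X_train, T_COL, grid=[0.5, 1.0, 2.0])
print(pd_S, pd_T)
```

Any fitted model with a batch prediction function can be passed in place of `toy_price`; comparing the predictions on each conditioned copy against a reference price gives the conditional residual plots mentioned earlier.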

DIAGNOSTIC PLOTS

YIELD CURVE DIMENSIONALITY REDUCTION

YIELD CURVE PRIMER
Bonds have a fixed maturity (e.g. 1M, 3M, 10Y) and pay coupons
Examples of bonds: Treasury bonds, corporates, munis, etc.
Yield curve: plot of bond yields against maturities
Adjacent points on the yield curve move together (correlated)

U.S. TREASURY YIELD CURVE
11 tenors/maturities
Typically upward sloping
Different shapes
  Pre-crisis
  Post-crisis
  Current

YIELD CURVE DYNAMICS
Yield for each tenor (point on the yield curve) changes every day
Problem: How to model the changes in the yield curve driven by 11 correlated variables?
Is any parsimonious representation possible?

PRINCIPAL COMPONENT ANALYSIS (PCA)
PCA can be used to:
  Reduce dimensionality
  Retain as much variance in the dataset as possible
Typically the first few (3-5) PCA factors are enough to explain almost all the variance
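The variance claim can be checked on synthetic data. The sketch below builds 11-tenor yield changes from three made-up factors (level, slope, curvature) plus noise, then runs PCA through a singular value decomposition. The data generator and the `pca` helper are assumptions for illustration; real Treasury data would replace `changes`.

```python
import numpy as np

rng = np.random.default_rng(0)
n_days, n_tenors = 500, 11

# Synthetic daily yield changes (assumption, not real Treasury data):
# three smooth factors across the curve plus a little idiosyncratic noise.
x = np.linspace(0.0, 1.0, n_tenors)
level = np.ones(n_tenors)
slope = x - 0.5
curvature = (x - 0.5) ** 2 - np.mean((x - 0.5) ** 2)
loadings = np.vstack([level, slope, curvature])          # shape (3, 11)
factors = rng.normal(size=(n_days, 3)) * [5.0, 2.0, 1.0]  # factor volatilities
changes = factors @ loadings + rng.normal(scale=0.05, size=(n_days, n_tenors))


def pca(X, k):
    """PCA via SVD: returns top-k components, scores and all variance ratios."""
    Xc = X - X.mean(axis=0)                       # center each tenor
    _, s, Vt = np.linalg.svd(Xc, full_matrices=False)
    explained = s ** 2 / np.sum(s ** 2)           # variance share per factor
    components = Vt[:k]                           # rows are orthonormal factors
    scores = Xc @ components.T                    # reduced representation
    return components, scores, explained


components, scores, explained = pca(changes, k=3)
print("variance explained by first 3 factors:", explained[:3].sum())
print("reduced shape:", scores.shape)  # 11 correlated variables become 3
```

Because adjacent tenors move together, most of the variance concentrates in a handful of factors, which is what makes the reduced representation parsimonious.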

PCA OVER DIFFERENT TIME PERIODS
PCA factors vary with time periods
Interval Selector
  Quickly select different time intervals
  Perform stats on the selected time slices (using callbacks)

YIELD CURVE PCA: CRISIS

YIELD CURVE PCA: AFTER CRISIS

YIELD CURVE PCA: CURRENT

DIMENSION REDUCTION: AUTOENCODERS
[Figure: autoencoder network diagram; labels: tanh, relu, linear, compressed feature vector]
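An autoencoder learns the compressed feature vector by training a network to reproduce its own input through a narrow hidden layer. The NumPy sketch below is a simplification under stated assumptions: it uses one tanh encoder layer and a linear decoder (the slide's diagram also lists a relu layer, omitted here), synthetic two-factor data in place of real yield curves, and plain gradient descent. A framework such as Keras would normally be used instead.

```python
import numpy as np

rng = np.random.default_rng(0)
n_days, n_tenors, n_hidden = 400, 11, 2

# Synthetic 11-tenor data driven by two factors (assumption, not real yields)
x = np.linspace(0.0, 1.0, n_tenors)
z = rng.normal(size=(n_days, 2))
X = z[:, :1] * np.ones(n_tenors) + z[:, 1:] * (x - 0.5)
X += rng.normal(scale=0.01, size=X.shape)
X = (X - X.mean(axis=0)) / X.std(axis=0)  # standardize each tenor

# Encoder (11 -> 2, tanh) and decoder (2 -> 11, linear) parameters
W1 = rng.normal(scale=0.1, size=(n_tenors, n_hidden))
b1 = np.zeros(n_hidden)
W2 = rng.normal(scale=0.1, size=(n_hidden, n_tenors))
b2 = np.zeros(n_tenors)


def encode(X):
    """Compressed feature vector for each row of X."""
    return np.tanh(X @ W1 + b1)


def decode(H):
    """Reconstruct the 11 tenors from the compressed features."""
    return H @ W2 + b2


lr, losses = 0.01, []
for step in range(2000):
    H = encode(X)
    err = decode(H) - X
    losses.append(np.mean(err ** 2))      # mean squared reconstruction error

    # Backpropagation of the mean squared error
    dY = 2.0 * err / err.size
    dW2, db2 = H.T @ dY, dY.sum(axis=0)
    dA = (dY @ W2.T) * (1.0 - H ** 2)     # derivative of tanh is 1 - tanh^2
    dW1, db1 = X.T @ dA, dA.sum(axis=0)

    # In-place gradient descent updates
    W1 -= lr * dW1
    b1 -= lr * db1
    W2 -= lr * dW2
    b2 -= lr * db2

compressed = encode(X)
print("loss: first", losses[0], "last", losses[-1])
print("compressed shape:", compressed.shape)
```

Unlike PCA, the encoder here is nonlinear, so the compressed features need not be linear combinations of the tenors.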

PCA VS AUTOENCODER
[Figure panels: PCA, Autoencoder]

DIMENSION REDUCTION: AE VS PCA

CONCLUSION
Abundance of financial data
Abundance of already existing models/techniques
ML/DL techniques provide new ways of modeling financial data
Interactive visualization tools help us better understand and interpret these models

RESOURCES
Widget libraries used to build the applications:
  ipywidgets: https://github.com/jupyter-widgets/ipywidgets
  bqplot: https://github.com/bloomberg/bqplot
  (and other custom widgets)
ML/DL libraries:
  scikit-learn: http://scikit-learn.org
  tensorflow: https://www.tensorflow.org
  keras: https://keras.io
Tech at Bloomberg: www.techatbloomberg.com