Using Structured Events to Predict Stock Price Movement: An Empirical Investigation Yue Zhang
My research areas
This talk: Reading news from the Internet and predicting the stock market
Outline Introduction Method Experiments Conclusion
Introduction
Is it possible? Random walk theory; efficient market hypothesis; human and algorithmic trading
Examples:
"Shares of Apple Inc. fell as trading began in New York on Tuesday morning, the day after former CEO Steve Jobs passed away."
"Google's stock falls after grim earnings come out early."
Why events?
Previous work: bag-of-words, named entities, noun phrases
Examples:
"Oracle Corp would sue Google Inc., claiming Google's Android operating system"
"Microsoft agrees to buy Nokia's mobile phone business for $7.2 billion."
Method Event Representation Event Extraction Event Generalization Prediction model
Method: Event Representation
E = (O1, P, O2, T)
O1: actor; P: event (action); O2: object; T: time
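A minimal sketch of this representation as a data class (the class and field names here are my own, and the example timestamp is illustrative):

```python
from dataclasses import dataclass
from datetime import date

@dataclass(frozen=True)
class Event:
    """Structured event E = (O1, P, O2, T)."""
    actor: str    # O1, the subject performing the action
    action: str   # P, the event phrase (predicate)
    obj: str      # O2, the object of the action
    time: date    # T, the timestamp of the news item

e = Event("Microsoft", "buy", "Nokia's mobile phone business", date(2013, 9, 3))
print(e.actor, e.action)
```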
Method: Event Extraction
Syntactic parsing
Open information extraction
Method: Event Extraction
Event phrase extraction
Syntactic constraint: every multi-word event phrase must begin with a verb, end with a preposition, and be a contiguous sequence of words in the sentence.
Lexical constraint: an event phrase should appear with at least a minimal number of distinct argument pairs in a large corpus.
Argument extraction
For each event phrase Pv identified in the above step, we find the nearest noun phrase O1 to the left of Pv in the sentence; O1 should contain the subject of the sentence (if it does not contain the subject of Pv, we take the second nearest noun phrase).
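The left-argument heuristic above can be sketched in isolation, assuming noun phrases have already been identified by a parser (the tuple layout and function name here are illustrative, not the paper's implementation):

```python
def extract_left_argument(noun_phrases, verb_start):
    """Select O1 for an event phrase whose first token is at verb_start.

    noun_phrases: (start_index, text, contains_subject) tuples. Takes
    the nearest noun phrase to the left of the verb; if it does not
    contain the sentence subject, falls back to the second nearest.
    """
    left = sorted(
        (np for np in noun_phrases if np[0] < verb_start),
        key=lambda np: verb_start - np[0],  # nearest first
    )
    if not left:
        return None
    if left[0][2] or len(left) == 1:
        return left[0][1]
    return left[1][1]

# "Oracle Corp would sue Google Inc. ..." with event phrase "would sue"
nps = [(0, "Oracle Corp", True), (4, "Google Inc.", False)]
print(extract_left_argument(nps, 2))  # → Oracle Corp
```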
Method: Event Generalization
First, we construct a morphological analysis tool based on the WordNet stemmer to extract lemma forms of inflected words.
Second, we generalize each verb to its class name in VerbNet.
For example:
"Instant view: Private sector adds 114,000 jobs in July."
(Private sector, adds, 114,000 jobs) → (private sector, multiply_class, 114,000 job)
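The two generalization steps can be illustrated with toy lookup tables standing in for the WordNet stemmer and the VerbNet class index (the table entries below are illustrative, not real resource contents):

```python
# Toy stand-ins for WordNet lemmatization and VerbNet class lookup.
LEMMAS = {"adds": "add", "jobs": "job", "falls": "fall"}
VERB_CLASS = {"add": "multiply_class"}  # VerbNet class of the lemma

def generalize(event):
    """Lemmatize the words of an event tuple, then map the verb
    to its VerbNet class name."""
    actor, action, obj = event
    lemma = LEMMAS.get(action, action)
    verb = VERB_CLASS.get(lemma, lemma)
    obj = " ".join(LEMMAS.get(w, w) for w in obj.split())
    return (actor.lower(), verb, obj)

print(generalize(("Private sector", "adds", "114,000 jobs")))
# → ('private sector', 'multiply_class', '114,000 job')
```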
Method: Prediction Model
Linear model: most previous work uses linear models to predict the stock market. To make direct comparisons, this paper constructs a linear prediction model using an SVM with a linear kernel.
Nonlinear model: intuitively, the relationship between events and the stock market may be more complex than linear, due to hidden and indirect relationships. We exploit a deep neural network model, whose hidden layers are useful for learning such hidden relationships.
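The contrast between the two model families can be sketched with toy weights (numpy only; the sizes and weights below are random placeholders, not the paper's trained models):

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.normal(size=100)  # feature vector phi_1..phi_M for one trading day

# Linear model: a single weighted sum, as in an SVM with a linear kernel.
w = rng.normal(size=100)
linear_score = float(w @ x)

# Nonlinear model: a hidden layer with a nonlinearity lets the network
# capture indirect relationships between events and price movement.
W1, W2 = rng.normal(size=(40, 100)), rng.normal(size=(1, 40))
hidden = np.tanh(W1 @ x)
nonlinear_score = float(W2 @ hidden)

def polarity(score):
    """Map a real-valued score to Class +1 / Class -1."""
    return +1 if score >= 0 else -1
```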
Class +1: the polarity of the stock price movement is positive.
Class -1: the polarity of the stock price movement is negative.
[Figure: prediction network, with an input layer of features φ1, φ2, φ3, ..., φM extracted from news documents, hidden layers, and an output layer]
Method: Feature Representation
Bag-of-words: TF*IDF
Events: O1, P, O2, O1+P, P+O2, O1+P+O2
For example, (Microsoft, buy, Nokia's mobile phone business) yields:
#arg1=microsoft, #action=buy, #arg2=nokia's mobile phone business, #arg1_action=microsoft buy, #action_arg2=buy nokia's mobile phone business, #arg1_action_arg2=microsoft buy nokia's mobile phone business
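A small sketch of expanding one event into the six feature strings (the function name and dictionary layout are mine; the prefix names follow the example above):

```python
def event_features(o1, p, o2):
    """Expand a structured event (O1, P, O2) into the six feature
    strings used as model input: the three parts plus their bigram
    and trigram combinations."""
    return {
        "#arg1": o1,
        "#action": p,
        "#arg2": o2,
        "#arg1_action": f"{o1} {p}",
        "#action_arg2": f"{p} {o2}",
        "#arg1_action_arg2": f"{o1} {p} {o2}",
    }

feats = event_features("microsoft", "buy", "nokia's mobile phone business")
print(feats["#arg1_action"])  # → microsoft buy
```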
Experiments: Data Description
We use publicly available financial news from Reuters and Bloomberg over the period from October 2006 to November 2013. This time span witnesses a severe economic downturn in 2007-2010, followed by a modest recovery in 2011-2013. There are 106,521 documents in total from Reuters News and 447,145 from Bloomberg News. We mainly focus on predicting the Standard & Poor's 500 (S&P 500) index, obtaining indices and stock prices from Yahoo Finance.
Evaluation metrics: accuracy and MCC (Matthews correlation coefficient)
Experiments: Overall Results
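MCC, the Matthews correlation coefficient, can be computed directly from the confusion-matrix counts of up/down predictions; a small sketch with illustrative counts:

```python
import math

def mcc(tp, tn, fp, fn):
    """Matthews correlation coefficient for binary predictions.
    Ranges from -1 to +1 and, unlike accuracy, stays informative
    when the up/down classes are imbalanced."""
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return (tp * tn - fp * fn) / denom if denom else 0.0

def accuracy(tp, tn, fp, fn):
    return (tp + tn) / (tp + tn + fp + fn)

# Illustrative counts, not results from the paper.
print(round(mcc(50, 40, 10, 20), 3))  # → 0.507
print(accuracy(50, 40, 10, 20))       # → 0.75
```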
Experiments: Different Numbers of Hidden Layers in the Deep Neural Network Model
Experiments: Different Quantities of Data
Experiments: Individual Stock Prediction
Section One Conclusion
Events are useful: events are more useful representations than bags-of-words for the task of stock market prediction.
Hidden relations are useful: a deep neural network model can predict the stock market more accurately than the linear model.
Robust results obtained: our approach achieves stable experimental results on S&P 500 index prediction and individual stock prediction over a large amount of data (eight years of stock prices and more than 550,000 news documents).
Quality of information matters more than quantity: the most relevant information (news title vs. news content; individual company news vs. all news) is better than more, but less relevant, information.
Section Two Overview
Better event encodings
Long-term history
Event Embedding
Previous work: learning entity embeddings (Socher et al. 2013)
Neural Tensor Network
[Figure: a standard neural network vs. a neural tensor network]
Neural Tensor Network for Event Embedding
[Figure: tensor T1 combines (O1, P) into R1, tensor T2 combines (P, O2) into R2, and tensor T3 combines (R1, R2) into the event embedding U]
[Figure: detail of R1, computed from O1 and P via tensor T1]
Training
Minimize the margin loss; 500 iterations of standard back-propagation.
Corrupted examples: randomly replace the object of an event with another object.
Regularization weight set to 0.0001.
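The corrupted-event margin loss can be sketched as follows (the scoring function is left abstract, and the scores and parameter shapes below are illustrative):

```python
import numpy as np

LAMBDA = 1e-4  # regularization weight, as on the slide

def margin_loss(score_true, score_corrupt, params):
    """Hinge loss: the true event should outscore the corrupted event
    (object replaced at random) by a margin of 1, plus an L2 penalty
    on the model parameters."""
    reg = LAMBDA * sum(float(np.sum(p ** 2)) for p in params)
    return max(0.0, 1.0 - score_true + score_corrupt) + reg

# Zero parameters so the regularization term vanishes in this demo.
params = [np.zeros((2, 2))]
print(margin_loss(0.9, 0.3, params))  # margin violated: loss is positive
print(margin_loss(2.5, 0.1, params))  # margin satisfied: loss is zero
```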
Deep Prediction Model
We model long-, mid-, and short-term events:
Long-term events (last month)
Mid-term events (last week)
Short-term events (last day)
Architecture of the Deep Prediction Model
Deep Prediction Model: Convolution and Max-Pooling
A convolution layer obtains local features; max-pooling selects the globally most representative feature.
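A minimal numpy sketch of the convolution-plus-max-pooling idea over a sequence of daily event scores (all values below are toy numbers, not the model's learned filters):

```python
import numpy as np

def conv_max_pool(seq, filt):
    """1-D convolution over a sequence followed by max-pooling: the
    convolution extracts a local feature from each window, and the
    max over all positions keeps the most salient one."""
    n, w = len(seq), len(filt)
    local = np.array([np.dot(seq[i:i + w], filt) for i in range(n - w + 1)])
    return local.max()

seq = np.array([0.1, 0.9, -0.4, 0.7, 0.2])  # toy long-term event scores
filt = np.array([0.5, 0.5, 0.0])            # toy filter over a 3-day window
print(conv_max_pool(seq, filt))
```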
Experiment: Dataset
Experiment: Baselines
Method | Input | Model
Luss and d'Aspremont [2012] | Bag of words | NN
Ding et al. [2014] (E-NN) | Structured event | NN
WB-NN | Word embedding | NN
WB-CNN | Word embedding | CNN
E-CNN | Structured event | CNN
EB-NN | Event embedding | NN
EB-CNN | Event embedding | CNN
Experiment: Index Prediction
Events are better features than words.
Reducing sparsity is helpful in this task.
The CNN-based model is more powerful.
Experiment: Individual Stock Prediction
15 companies from the S&P 500, comprising high-, mid-, and low-ranking companies.
Evaluation metrics: accuracy and MCC.
Thanks!