Stock Market Prediction through Technical and Public Sentiment Analysis
Kien Wei Siah, Paul Myers

I. INTRODUCTION

Stock market price behavior has been studied extensively. It is influenced by a myriad of factors, including political and economic events, among others, and is a complex nonlinear time-series problem. Traditionally, stock price forecasting has been performed through technical analysis, which focuses on price action: the process of finding patterns in price history. More recently, research has shown that public sentiment is correlated with stock market events [1], [2], [3]. This project studies the potential of using both behavioral and technical features in stock price prediction models based on traditional classifiers and popular neural networks. We believe that behavioral data may offer insights into financial market dynamics beyond those captured by technical analysis. An improved price forecasting model can yield enormous rewards in stock market trading.

A. Problem Statement

For this project, we focus on the Nikkei 225 (N225) stock index. N225 is the stock market index for the Tokyo Stock Exchange: a price-weighted average of 225 top-rated Japanese companies listed on the Tokyo Stock Exchange. With Japan currently being the third largest economy in the world, and Tokyo being one of the largest global financial centers, the N225 price index is a critical financial indicator that is closely watched by traders and banks around the world.

We formulate the stock price prediction problem as a binary classification problem: whether the future daily return of N225 will be positive (1) or negative (0), i.e. whether N225's closing price tomorrow will be higher (1) or lower (0) than today's closing price. Daily return is defined in Equation 1:

R_i = (C_i - C_{i-1}) / C_{i-1}    (1)

where R_i is the daily return for the i-th day and C_i is the N225 closing price for the i-th day. The daily return for day i is essentially the percent change in closing price from day (i-1) to day i. The future daily return for day i is just R_{i+1}. Note that to obtain the classification target, we must take the sign of the future daily return R_{i+1} rather than its numerical value. As described in the introduction, we investigate the use of price histories and public sentiment indicators available up to day i to predict sign(R_{i+1}). Subsequent sections cover the data collection process. Since this is framed as a classification task, we may use classification accuracy as a metric for evaluating the performance of the various models.

II. DATA COLLECTION AND FEATURE GENERATION

A. Price History

We queried daily historical prices of N225, for all trading days spanning January 1, 2004 to December 31, 2014, from Yahoo! Finance. However, financial time series are well known to be non-stationary, with means, variances and covariances that change over time. Such non-stationary data are difficult to model and will likely give poor classification accuracy when used directly as features. Viewing the daily prices as random walks, we attempted to stationarize the price history (through differencing and lagging) before using it as predictors. To this end, we used three main types of conventional price technical indicators as features [13]:

1) n-day Returns

R_{i,n} = (C_i - C_{i-n}) / C_{i-n}    (2)

where R_{i,n} is the i-th day return with respect to the (i-n)-th day, or the percentage difference between the i-th day closing price C_i and the (i-n)-th day closing price C_{i-n}.
Positive values imply that the N225 index has risen over the n days. For n = 1, we recover the simple daily returns equation (Equation 1).

2) n-day Returns Moving Average

MA_{i,n} = (R_{i,1} + R_{(i-1),1} + ... + R_{(i-n),1}) / n    (3)

where MA_{i,n} is the average of the returns over the previous n days, and n > 1 because a one-day average is just that day's return.

3) n-time Lagged 1-Day Returns

R_{i,1}, R_{(i-1),1}, ..., R_{(i-n),1}    (4)

where R_{(i-n),1} is the (i-n)-th day's 1-day return.

By varying n, we obtain different numbers of features containing varying degrees of information about price trends and past prices. This is one of the multiple parameters we will vary and decide upon using cross validation.
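To make the feature generation concrete, the sketch below shows one way the three indicator families and the classification target could be computed with pandas from a series of daily closing prices. It is a minimal illustration only: the variable names and the exact handling of the boundary days are our own assumptions, not the project's actual code.

import pandas as pd

def make_price_features(close, n):
    """Build technical-indicator features from a pandas Series of daily closing prices.

    close : closing prices indexed by trading day, oldest first
    n     : look-back window, chosen by cross validation in the report
    """
    feats = pd.DataFrame(index=close.index)
    r1 = close.pct_change()                        # 1-day returns R_{i,1} (Equation 1)
    for k in range(1, n + 1):
        feats["ret_%d" % k] = close.pct_change(k)  # k-day returns R_{i,k} (Equation 2)
    for k in range(2, n + 1):
        feats["ma_%d" % k] = r1.rolling(k).mean()  # moving average of the last k 1-day returns (cf. Equation 3)
    for k in range(1, n + 1):
        feats["lag_%d" % k] = r1.shift(k)          # k-time lagged 1-day returns (Equation 4)
    target = (r1.shift(-1) > 0).astype(int)        # sign of tomorrow's return: 1 = up, 0 = down
    return feats, target

The rows for the first n days (and the final day, which has no next-day return) would then be dropped, as discussed in Section IV.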

B. Public Sentiment Indicators

In addition to conventional technical indicators, we also looked at public sentiment indicators. The theory of behavioral economics postulates that emotions play a significant role in influencing the economic decisions of individuals, and research has shown that this applies to societies at large as well. In fact, Bollen et al. used Twitter messages as indicators of public mood states and demonstrated that they were correlated with, and predictive of, the Dow Jones Industrial Average over time [2]. In another study, Preis et al. found patterns in Google query volumes, for search terms related to finance, that constitute early warning signs of stock market movements. They hypothesize that investors search for information online about the markets before eventually deciding whether to buy or sell stocks. This suggests that search query data from Google Trends may contain valuable predictive information about the information-gathering process that precedes trading decisions in the stock market [3].

This project takes inspiration from these two widely cited studies and attempts to integrate some aspects of public sentiment analysis into our features, in the hope that combining behavioral data with technical price indicators will lead to improved performance. To this end, we used behavioral data from two sources: Bloomberg Businessweek and Google Trends. We were unable to replicate Bollen et al.'s study using Twitter messages, as Twitter has restricted public access to very limited amounts of data and other Twitter data sources required paid subscriptions. Therefore, similar to [3], we used trends in Google query volume for finance-related search terms as a proxy for public sentiment. Further, we wrote a script to crawl a free online news archive, Bloomberg Businessweek, for articles published from 2004 to 2014: approximately 210,000 articles were gathered. It is hoped that the state of the economy and prevalent stock market conditions can be extracted from these articles through sentiment analysis.

For Google Trends, we focused on the daily search volumes of five finance-related search terms that showed the greatest predictive potential for stock market forecasting in [3], namely economics, debt, inflation, risk and stocks. Google Trends scores the daily query volumes on a scale of 0-100, normalized with respect to the peak within the date range (2004 to 2014 in our case).

Subsequently, we performed a relatively simple sentiment analysis on the news articles crawled from Bloomberg Businessweek to obtain daily sentiment scores. First, we obtained lists of positive and negative words that are both financial-specific and general. For the financial-specific words, we used the lists published by McDonald, originating from his research on sentiment analysis of financial texts [4]. This is particularly relevant in our case as words with positive meanings in the general context may actually be negative in the financial context. For the general case, we used the lists of positive and negative opinion words, or sentiment words, by Hu and Liu [5]. To compute the sentiment score for each article, we used the following equation:

Score = (POS - NEG) / (POS + NEG)    (5)

where POS is the number of positive words (from the lists obtained earlier) counted in the article and NEG is the number of negative words counted in the article. The positive and negative words were counted as many times as they appear. A score of +1 implies an entirely positive article, 0 (when no words are counted) implies a neutral article, and -1 implies an entirely negative article. Daily scores were obtained by averaging over all the articles in that day. Just computing this score for the 210,000 articles crawled took a few days and had to be done in batches. A more sophisticated sentiment analysis would likely have required more time and been infeasible within the time frame of this project.
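As a rough illustration of the scoring in Equation 5, the following sketch counts list words in a single article and averages the article scores for a day. The tokenization and function names are illustrative assumptions; this is not the exact code used on the 210,000 crawled articles.

import re

def article_score(text, pos_words, neg_words):
    """Score one article as (POS - NEG) / (POS + NEG); returns 0.0 when no listed words occur."""
    tokens = re.findall(r"[a-z']+", text.lower())      # simple lower-cased word tokenization
    pos = sum(1 for w in tokens if w in pos_words)     # every occurrence counts (Equation 5)
    neg = sum(1 for w in tokens if w in neg_words)
    return 0.0 if pos + neg == 0 else (pos - neg) / (pos + neg)

def daily_score(articles, pos_words, neg_words):
    """Average the scores of all articles published on a given day."""
    scores = [article_score(a, pos_words, neg_words) for a in articles]
    return sum(scores) / len(scores) if scores else 0.0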
C. Missing Data and Look-Ahead Bias

By crawling for our own data, we inevitably face the problem of missing data, e.g. price histories for some days are missing, and the Bloomberg Businessweek archive does not have articles for every trading day. In dealing with this issue, we have three options: mean imputation, interpolation based on the previous and next data points, or sample and hold. We opted for the last option (carrying forward the last observed valid data point), as we felt that mean imputation and interpolation would introduce some extent of look-ahead bias (using information that would not have been available at that time). For instance, the interpolation of prices or returns implicitly uses the future price, i.e. the interpolated point will be higher if the next price is high. This would lead to inaccurate results. While there are certainly more sophisticated and effective techniques for dealing with missing data, we considered only the simpler methods in view of time constraints.

III. RECURRENT NEURAL NETWORK

A. Vanilla Recurrent Neural Network

Recurrent Neural Networks (RNNs) have shown great potential in many natural language processing tasks (e.g. machine translation, language models, etc.) and are becoming increasingly popular. Unlike vanilla Neural Networks (NNs), the RNN's network topology allows it to make use of sequential information. This is a natural fit for stock market prediction, a time-series problem: knowing previous days' prices may help us predict tomorrow's price.

Fig. 1. Recurrent Neural Network topology [6].

As illustrated in Figure 1, the RNN performs the same operations, with the same weights, for each element of the sequence. It takes into account the previous step's state (s_{t-1}) while computing the output for the current step. This recurrent property gives it a memory, as mentioned earlier. The relevant equations are as follows:

s_t = tanh(U x_t + W s_{t-1})    (6)

o_t = sigmoid(V s_t)    (7)

where U, W and V are the weight matrices used across all time steps, x_t is the input at time step t, s_t is the hidden state at time step t and o_t is the output at time step t.
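A minimal NumPy sketch of the forward pass in Equations 6 and 7 is given below. The weight shapes and names are assumptions for illustration; the project's from-scratch implementation is not reproduced here.

import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def rnn_forward(xs, U, W, V):
    """Run the vanilla RNN over a sequence and return the final output.

    xs : sequence of input vectors x_1, ..., x_T (one feature vector per day)
    U  : input-to-hidden weights,  shape (hidden_dim, input_dim)
    W  : hidden-to-hidden weights, shape (hidden_dim, hidden_dim)
    V  : hidden-to-output weights, shape (1, hidden_dim)
    """
    s = np.zeros(U.shape[0])           # s_0 initialized with zeros
    for x in xs:
        s = np.tanh(U @ x + W @ s)     # Equation 6: new hidden state
    return sigmoid(V @ s)              # Equation 7: probability that the price rises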

We may think of s_t as the memory of the RNN, which contains information about the inputs and computations of all the previous time steps (subject to the vanishing gradient problem elaborated below). As described earlier, the hidden state is computed from the previous hidden state s_{t-1} and the current input x_t (Equation 6). The first hidden state s_0 is typically initialized with zeros. In our stock market prediction problem, we can think of x_t as the feature vector of each day (comprising the features from Section II). Figure 1 shows outputs at all time steps, but in our case we are really only concerned with the output at the final step, which is the prediction of whether the price will rise or fall. In other words, we input feature vectors from the previous t days into the RNN sequentially, and o_t (a sigmoid output, Equation 7) represents the probability of the price rising or falling on the (t+1)-th day. This allows the RNN to capture more temporal information than classifiers (e.g. Support Vector Machines, NNs, Logistic Regression) that only take input from one time step.

Training for RNNs is similar to that for vanilla NNs: backpropagation. However, for RNNs, we backpropagate through time to obtain the gradients of the loss with respect to the weights. The idea is to unfold the RNN across time (similar to Figure 1) and do backpropagation as if it were a normal NN. Since this is a classification problem, we can use the binary cross-entropy loss as the error function L. Because we are only looking at the final output, we can mask all other outputs and only consider the loss from the final output. From here, we may use stochastic gradient descent to minimize the error.

There is one caveat: the vanishing gradient problem. As we know from NN backpropagation in class, the gradients dL/dU, dL/dW and dL/dV are derived from the chain rule, meaning they are products of multiple derivatives. These chain-rule derivatives have upper bounds of 1 (apparent from the tanh and sigmoid activation functions used). This means that gradient values can shrink exponentially fast and vanish after a few time steps, particularly when the neurons are saturated. Because gradients vanish within a limited number of time steps, the vanilla RNN model typically has issues learning long-range dependencies, i.e. the RNN will not learn much from inputs more than a certain number of time steps before the final output. From this, we know that the number of time steps in the input sequence for this RNN model cannot be too large. We may determine this hyper-parameter from cross validation. Note that this is a problem in deep NNs as well. Exploding gradients may also be a problem, but this can be circumvented effectively by clipping the gradients. For this project, we implemented the RNN model described above from scratch in Python and tested its performance on the stock market prediction problem.

B. Gated Recurrent Unit

We also implemented from scratch in Python a more sophisticated RNN variant: the Gated Recurrent Unit (GRU). GRUs are identical to the vanilla RNN described above (taking sequential inputs) except in the way the hidden states s_t are calculated. They were designed to alleviate the vanishing gradient problem through the use of gates (Figure 2), as illustrated by the GRU equations 8 to 12.

Fig. 2. Gated Recurrent Unit topology [8], [9].
z = sigmoid(u z x t + W z s t 1 ) (8) r = sigmoid(u r x t + W r s t 1 ) (9) h = tanh(u h x t + W h (s t 1 r)) (10) s t = (1 z) h + z s t 1 (11) o t = sigmoid(v s t ) (12) where denotes element-wise multiplication. GRU has two gates, specifically a reset gate r and an update gate z. The reset gate r determines how to combine the new input x t with the previous hidden state s t 1, while the update gate z determines how much of the previous hidden state s t 1 to retain in the current hidden state s t. We obtain the vanilla RNN by setting r to all 1 s and z to all 0 s [8]. The GRU is a relatively new model published in recent years. They have fewer parameters than Long Short Term Memory (another RNN variant), rendering them faster to train and requiring less data to generalize. We tested our implementation of GRU on the stock market prediction problem as well. IV. METHODOLOGY A. Baseline and Other Models Since we have framed stock market prediction as a binary classification problem, Logistic Regression (LR) is a natural choice as a baseline model. Beyond LR, we also tested several other more sophisticated models (some of which were not covered in lectures) to gain exposure to common machine learning algorithms. They are Support Vector Machines RBF (SVM RBF), K-Nearest Neighbors (KNN) and AdaBoost (implemented in Scikit-Learn). B. Experiment Design The range of data (price history and sentiment scores) collected span 11 years from January 1, 2004 to December 31, In this project, we would like to predict whether tomorrow s price will be higher (1) or lower (0) than today s price. Thus, each day may be viewed as an observation from which a training example or testing example may be constructed. We created feature vectors based on the features described in Section II: each vector is essentially a concatenation of price technical indicators and public sentiment scores. The target variable is binary and is simply the sign of tomorrow s 1-day returns. We show an example feature vector x (i) and target variable y (i) pair for some arbitrary i-th day below:

IV. METHODOLOGY

A. Baseline and Other Models

Since we have framed stock market prediction as a binary classification problem, Logistic Regression (LR) is a natural choice as a baseline model. Beyond LR, we also tested several other more sophisticated models (some of which were not covered in lectures) to gain exposure to common machine learning algorithms: Support Vector Machines with an RBF kernel (SVM RBF), K-Nearest Neighbors (KNN) and AdaBoost (implemented in Scikit-Learn).

B. Experiment Design

The range of data (price history and sentiment scores) collected spans 11 years from January 1, 2004 to December 31, 2014. In this project, we would like to predict whether tomorrow's price will be higher (1) or lower (0) than today's price. Thus, each day may be viewed as an observation from which a training example or testing example may be constructed. We created feature vectors based on the features described in Section II: each vector is essentially a concatenation of price technical indicators and public sentiment scores. The target variable is binary and is simply the sign of tomorrow's 1-day return. We show an example feature vector x^{(i)} and target variable y^{(i)} pair for some arbitrary i-th day below:

x^{(i)} = [ R_{i,1}, R_{i,2}, ..., R_{i,n},
            MA_{i,2}, ..., MA_{i,n},
            R_{(i-1),1}, R_{(i-2),1}, ..., R_{(i-n),1},
            GT_{i,econ}, GT_{i,debt}, GT_{i,inflat}, GT_{i,risk}, GT_{i,stocks},
            Score_i ]

y^{(i)} = [ Sign(R_{(i+1),1}) ]

where the notation remains the same as introduced in Section II, and GT_{i,YYY} refers to the Google Trends query volume for the word YYY. It is important that the feature vector x^{(i)} does not contain any future information and only uses information available up to that point. n determines the amount of information about past prices and price trends incorporated into the feature vector; the dimension of the feature vector changes with n. Note that because we are predicting tomorrow's price change, we lose one day: no prediction can be made for the last day in the data set, December 31, 2014, because we do not know the true closing price on the following day. Also, depending on the n chosen, we have to drop the first n days' observations: to calculate the n-day returns, the n-day returns moving average and the n-time lagged 1-day returns, we need the previous n days' prices, which are unknown prior to the first day in the data set, January 1, 2004. We select n from cross validation.

Since this is a binary classification task, we may use the binary cross-entropy error function as the objective to minimize for LR and the RNN (Equation 13):

f(θ) = - Σ_{i=1}^{N} [ y^{(i)} log(q(x^{(i)})) + (1 - y^{(i)}) log(1 - q(x^{(i)})) ]    (13)

where N is the number of training examples.

TABLE I
TRAIN AND TEST SET SPLIT
Data Set:  January 2004 to December 2014
Train Set: January 2004 to December 2012
Test Set:  January 2013 to December 2014

Before we began training, we split the data set of observations into train and test sets, roughly 80% and 20% respectively (Table I). We train our models (RNN, GRU, LR, SVM, KNN and AdaBoost) on the train set and subsequently evaluate their performance on the untouched test set.

C. RNN Training

For conventional classifiers like LR, the training method is straightforward: for each prediction, we use x^{(i)} as the input and y^{(i)} as the target, and minimize the error function either stochastically (stochastic gradient descent) or collectively (batch gradient descent). This is not the case for RNNs. Recall that one of the properties of RNNs is that they can process sequential data. This means that we are not restricted to using one feature vector for each prediction; we may input feature vectors from some previous t days into the RNN sequentially and take the final output as the prediction (minimizing the cross-entropy error of the final-step prediction). Using t = 3 as a concrete example, the unrolled computation is:

s^{(0)} --[x^{(i-2)}]--> s^{(1)} --[x^{(i-1)}]--> s^{(2)} --[x^{(i)}]--> y^{(i)}

where s^{(t)} are the hidden state vectors at time step t of the RNN, and s^{(0)} is initialized with all zeros. For training the RNN, we used inputs that are sequences of feature vectors [x^{(i-t+1)}, ..., x^{(i-1)}, x^{(i)}]. We feed them into the RNN sequentially, from x^{(i-t+1)} to x^{(i)}, and the final output gives a probability for the target variable y^{(i)} since we use the sigmoid function (Equation 7). Again, similar to the previous section, depending on t we have to drop the first few days of training examples. This allows the RNN to capture some extent of temporal information that LR does not (e.g. a finer-grained view of how returns are changing from day to day). The larger t is, the more temporal information we feed into the RNN. However, as mentioned in Section III, t is intrinsically limited by the vanishing gradient problem.
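To illustrate how the sequential inputs could be assembled, the sketch below turns a feature matrix (one row per day, built as in Section II) into windows of t consecutive days paired with the final day's up/down label. The array names are our own assumptions; the clipping of the first t-1 days mirrors the description above.

import numpy as np

def make_sequences(X, y, t):
    """Build RNN/GRU training pairs from a (num_days, num_features) matrix X and labels y.

    Each example stacks the feature vectors [x^{(i-t+1)}, ..., x^{(i)}] and is paired
    with y^{(i)}, the sign of the (i+1)-th day's return. The first t-1 days have no
    full window and are dropped.
    """
    seqs, targets = [], []
    for i in range(t - 1, len(X)):
        seqs.append(X[i - t + 1 : i + 1])
        targets.append(y[i])
    return np.array(seqs), np.array(targets)

# Example with t = 3, as in the illustration above:
# X_seq, y_seq = make_sequences(X_train, y_train, t=3)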
t, together with the dimensions of the hidden state vectors s^{(t)}, are the hyper-parameters we can tune using cross validation. The above training method also applies to GRUs (a variant of the RNN). However, we may expect better results from GRUs as they should, in theory, suffer less from the vanishing gradient problem.

D. Cross Validation for Time Series

Cross validation is an important step in model selection and parameter tuning. It provides a measure of the generalization error of the trained classifier. To a certain extent, this technique allows us to avoid over-fitting the training data (and perhaps under-fitting), and consequently to do better on the test data. For independent data, we can typically use K-Folds cross validation, where the training data is randomly split into K ideally equal-sized folds. Each fold may then be used as a validation set while the remaining (K-1) folds become the new training set. We cycle through the K folds so that each fold is left out of training and used for validation once. By taking the average error over these K validation folds, we get an estimate of the generalization error (i.e. how well the classifier will likely perform on unseen test sets).

However, for this project, the data involved is a financial time series and the observations are not independent! Correlation between adjacent observations is often prevalent in time series data; the data has some intrinsic order. The K-Folds cross validation method described earlier breaks down because (assuming we randomly split the training data into K folds) the validation and training samples are no longer independent. Furthermore, the train set should not contain any information that occurs after the validation set; by splitting the data randomly, we cannot be sure of that.

A more principled approach for time series cross validation is forward chaining [7]. Using 5 years of training time series data from 2004 to 2008 as an example, we may split it into 4 folds and perform cross validation as in Table II. This is a more accurate reflection of the situation during testing, where we train on past data and predict future price changes. We adopted this approach for cross validation in this project.

TABLE II
CROSS VALIDATION FOR TIME SERIES
Fold    Train Set                   Validation Set
1       2004                        2005
2       2004, 2005                  2006
3       2004, 2005, 2006            2007
4       2004, 2005, 2006, 2007      2008
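A small sketch of the forward-chaining splits in Table II is shown below, assuming each observation carries a year label; scikit-learn's TimeSeriesSplit implements a similar expanding-window scheme, although the splits used in this project were year-based.

def forward_chaining_splits(years):
    """Yield (train_years, validation_year) pairs as in Table II.

    years : ordered list of years in the training data, e.g. [2004, 2005, 2006, 2007, 2008]
    """
    for k in range(1, len(years)):
        yield years[:k], years[k]

# Example with the 2004-2008 training period:
# for train_years, val_year in forward_chaining_splits([2004, 2005, 2006, 2007, 2008]):
#     ...train on observations from train_years, validate on val_year...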

In Table III, we summarize the hyper-parameters for each model we tested and the respective ranges over which we performed a grid search.

TABLE III
GRID SEARCH HYPER-PARAMETERS
Model       Hyper-Parameters                   Sweep Range
All         n (refer to Section II)            3, 4, 5, 6, 7, 8, 9
All         GT, Score                          with and without
LR          Regularization C                   10e-2, 10e-1, 10e-0, 10e1, 10e2
SVM RBF     Bandwidth γ                        10e-2, 10e-1, 10e-0, 10e1, 10e2
SVM RBF     C                                  10e-2, 10e-1, 10e-0, 10e1, 10e2
KNN         No. of neighbors                   5, 10, 25, 50, 75, 100
AdaBoost    No. of estimators                  5, 10, 25, 50, 75, 100
AdaBoost    Learning rate                      0.01, 0.05, 0.1, 0.5, 1
RNN         Time steps t                       2, 4, 6
RNN         Hidden state s^{(t)} dimensions    10, 30, 50
GRU         Time steps t                       2, 4, 6
GRU         Hidden state s^{(t)} dimensions    10, 30, 50

V. RESULTS AND DISCUSSION

A. Grid Search Cross Validation Results

We performed extensive grid searches for each model to choose the best hyper-parameters based on the resulting cross validation accuracy. Selected results are presented as heat maps in Figures 3, 4, 5, 6, 7 and 8.

Fig. 3. Grid search heat map for Logistic Regression. The optimal parameters from cross validation are n = 8 and regularization C = 0.1, without sentiment scores.

Fig. 4. Grid search heat map for K-Nearest Neighbors. The optimal parameters from cross validation are n = 8 and no. of neighbors = 5, without sentiment scores.

Fig. 5. Grid search heat map for AdaBoost. We swept n as mentioned in Table III; for easy visualization we only present the heat map for the best n here. The optimal parameters from cross validation are n = 3, no. of estimators = 5 and learning rate = 1, without sentiment scores.

Using the best hyper-parameter combination for each model, we trained fresh models (LR, KNN, AdaBoost, SVM RBF, RNN and GRU) on the entire train set (from January 2004 to December 2012) and tested them on the unseen test set (from January 2013 to December 2014). The results are summarized in Table IV.

B. Discussion

From our grid search experiments, we realized that including Google query volumes and sentiment scores did not necessarily lead to improved performance. In fact, for some models (like KNN and LR), including these sentiment scores caused a significant drop in test accuracy. The reason becomes apparent when we overlay the Google query volumes and sentiment scores with the N225 price index. From Figures 9 and 10, we can see that both scores do not seem to be consistently correlated with the N225 price.

TABLE IV. Best cross validation accuracy and test accuracy for each model: LR (baseline), KNN, AdaBoost, SVM RBF, RNN and GRU.

Fig. 6. Grid search heat map for Support Vector Machine RBF. We swept n as mentioned in Table III; for easy visualization we only present the heat map for the best n here. The optimal parameters from cross validation are n = 8, bandwidth γ = 0.1 and C = 1000, without sentiment scores.

Fig. 7. Grid search heat map for the Recurrent Neural Network. We swept n as mentioned in Table III; for easy visualization we only present the heat map for the best n here. The optimal parameters from cross validation are n = 5, hidden state s^{(t)} dimensions = 30 and time steps t = 4, without sentiment scores.

Fig. 8. Grid search heat map for the Gated Recurrent Unit. We swept n as mentioned in Table III; for easy visualization we only present the heat map for the best n here. The optimal parameters from cross validation are n = 5, hidden state s^{(t)} dimensions = 50 and time steps t = 4, without sentiment scores.

Fig. 9. Plot of Bloomberg Businessweek sentiment scores and the N225 price index over time, starting from 2007.

Fig. 10. Plot of the Google Trends query volume for the word debt and the N225 price index over time, starting from 2010.

They do not seem to be predictive of N225 price changes (and while the figures are plotted at the monthly level, the same holds true when we zoom in to the daily level). This likely explains why the sentiment score features do not improve the classifiers' performance: they do not provide useful additional information. It seems that our simple sentiment analysis (scoring by counting positive and negative words from pre-specified lists) is too coarse to extract useful information. Perhaps using more sophisticated sentiment analysis methods that go beyond the word level (such as OpinionFinder in [2], which looks at sentence-level subjectivity) would yield more informative scores. In addition, it may be useful to crawl articles from multiple news archives, rather than just Bloomberg Businessweek, to gain a more diverse corpus that may be more representative of the state of world affairs. Unlike the results reported in [3], Google search volume trends did not improve our results.

This could simply be due to the fact that we are analyzing the N225 in this project, and not the Dow Jones Industrial Average as in the original paper. In hindsight, perhaps using volume trends for search terms in the Japanese language would have been more appropriate, since English is not Japan's first language (but then again, with globalization, the N225 is tradable from almost anywhere in the world). Further, [3] used a greater set of search terms; we restricted ourselves to 5 finance-related terms to keep data collection and computation time reasonable.

Out of all the models tested, LR gave one of the poorest accuracies, only slightly better than random guessing (0.5). However, such a result is consistent with our understanding that LR is ultimately a linear classification model (we did not kernelize LR for this project). It is natural that stock market prediction, a non-linear problem, cannot be well modeled by a linear model. Nevertheless, it serves as a baseline benchmark against which to evaluate the other, more sophisticated algorithms.

Both the RNN and the GRU performed better than LR. Because these are non-linear models, it is natural that they can give better accuracy than LR. One observation is that the GRU (0.558) performs slightly better than the vanilla RNN (0.531), suggesting that the GRU gating architecture may indeed have helped to alleviate the vanishing gradient problem, allowing it to learn better. We also note that both the RNN and the GRU required significantly longer times to train than the other models. This posed an issue particularly for time series cross validation. As a result, we only managed to sweep 3 values each for the time steps t and the hidden state s^{(t)} dimensions (on top of n); sweeping these parameters took over a day for each of the two models.

Finally, we see that the GRU has comparable or slightly lower performance than the SVM RBF (0.565). In general, our SVM RBF accuracy is consistent with that reported in the literature and in other implementations online ([10], [11], [12] and [13]). However, we feel that the GRU has the potential to outperform the SVM RBF classifier. Firstly, as mentioned earlier, we only swept 3 values for each GRU parameter; given more time and resources, we could sweep the parameters at finer resolutions and over a larger range, which would likely give better performance. In addition, we used simple stochastic gradient descent in the GRU implementation; more sophisticated optimization methods (such as RMSprop) could potentially lead to improved accuracy. Lastly, we are currently looking at daily data, which gives us around 2000 training examples. This data set size may be insufficient to learn the reset and update gates' weights effectively. Perhaps if we looked at minute-scale data (which would vastly increase the number of training examples), the GRU would perform much better than the SVM RBF.

Lastly, we did not have sufficient time to thoroughly analyze the results for KNN and AdaBoost. As mentioned in Section IV, we tested these models mostly to gain exposure to a wider range of common machine learning algorithms.

VI. CONCLUSION

In this project we collected price history from Yahoo! Finance, crawled articles from Bloomberg Businessweek and obtained Google query volumes from Google Trends for the period 2004 to 2014. Using the data, we generated price technical indicators and sentiment scores to be used as features for predicting the direction of the future (tomorrow's) price change.
We implemented a vanilla RNN and a GRU from scratch in Python and tested them against LR as a baseline. Through grid searches and time series cross validation, we chose the optimal (according to cross validation error) hyper-parameters for each model. From our experiments, sentiment scores and Google query volumes did not improve the classifiers' performance. This is likely because our simple sentiment analysis does not extract useful information from the news articles. Consistent with our expectations, LR performed the poorest, below the SVM RBF, RNN and GRU. It is logical that a linear model cannot adequately describe a complex non-linear problem such as stock prices. The GRU performed slightly better than the vanilla RNN, indicating that the gating mechanism was effective to some extent in relieving the vanishing gradient issue. Finally, we observed that the GRU has comparable performance with the SVM RBF. However, we feel that the GRU has the potential to outperform the SVM RBF given more time and resources.

Moving forward, we may perform more advanced sentiment analysis, both by using more sophisticated sentence-level methods (such as OpinionFinder) and by crawling news articles from a wider range of websites (such as the Wall Street Journal) for a more diverse corpus. This should serve as a better proxy for public sentiment. We could also explore more specialized Google search terms that are predictive of the N225, perhaps in the Japanese language. For the RNN and GRU, we can certainly improve their performance by sweeping a wider range of parameters at finer resolutions and by using more advanced optimization methods like RMSprop. We also feel that their accuracies should improve given more data (working at the hourly or minute scale instead of the daily scale). Currently, we train the RNN and GRU on a fixed train set and test them on the test set. An alternative is to use a moving train set where we retrain the model every year based on the latest D years of prices, i.e. first train on 2004 and 2005 and test on 2006; then train a fresh model on 2005 and 2006 and test on 2007; and so on. This would allow us to capture short-term trends more effectively. Finally, we used simple sample and hold to deal with missing data in this project; there are certainly more robust methods of dealing with such cases that we did not have the time to explore here.

REFERENCES

[1] Ruiz, Eduardo J. et al. "Correlating Financial Time Series with Micro-Blogging Activity." Proceedings of the Fifth ACM International Conference on Web Search and Data Mining (WSDM '12), 2012.
[2] Bollen, Johan, Huina Mao, and Xiaojun Zeng. "Twitter Mood Predicts the Stock Market." Journal of Computational Science 2.1 (2011): 1-8.
[3] Preis, Tobias, Helen Susannah Moat, and H. Eugene Stanley. "Quantifying Trading Behavior in Financial Markets Using Google Trends." Scientific Reports 3 (2013).
[4] McDonald, Bill. "Bill McDonald's Word Lists Page." Nd.edu. Web.

[5] Hu, Minqing, and Bing Liu. "Mining and Summarizing Customer Reviews." Proceedings of the ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (KDD '04), 2004.
[6] LeCun, Yann, Yoshua Bengio, and Geoffrey Hinton. "Deep Learning." Nature (2015).
[7] Arlot, Sylvain, and Alain Celisse. "A Survey of Cross-Validation Procedures for Model Selection." Statistics Surveys 4 (2010).
[8] Britz, Denny. "Recurrent Neural Network." WildML. Web.
[9] Chung, Junyoung, Caglar Gulcehre, KyungHyun Cho, and Yoshua Bengio. "Empirical Evaluation of Gated Recurrent Neural Networks on Sequence Modeling." NIPS Deep Learning Workshop, 2014.
[10] Fu, Tong, Shou Chen, and Chuanqi Wei. "Hong Kong Stock Index Forecasting." Web.
[11] Dai, Yuqing, and Yuning Zhang. "Machine Learning in Stock Price Trend Forecasting." Web.
[12] Halls-Moore, Michael. "Forecasting Financial Time Series." QuantStart. Web.
[13] Pochetti, Francesco. "Stock Market Prediction: Scikit Classification Algorithms." Web.


More information

Deep Learning in Asset Pricing

Deep Learning in Asset Pricing Deep Learning in Asset Pricing Luyang Chen 1 Markus Pelger 1 Jason Zhu 1 1 Stanford University November 17th 2018 Western Mathematical Finance Conference 2018 Motivation Hype: Machine Learning in Investment

More information

Abstract Making good predictions for stock prices is an important task for the financial industry. The way these predictions are carried out is often

Abstract Making good predictions for stock prices is an important task for the financial industry. The way these predictions are carried out is often Abstract Making good predictions for stock prices is an important task for the financial industry. The way these predictions are carried out is often by using artificial intelligence that can learn from

More information

Forecasting Stock Market Movements using Google Trend Searches

Forecasting Stock Market Movements using Google Trend Searches Forecasting Stock Market Movements using Google Trend Searches Melody Y. Huang, Randall R. Rojas, Patrick D. Convery Department of Economics University of California, Los Angeles Los Angeles, CA 90095

More information

Algorithmic Trading using Reinforcement Learning augmented with Hidden Markov Model

Algorithmic Trading using Reinforcement Learning augmented with Hidden Markov Model Algorithmic Trading using Reinforcement Learning augmented with Hidden Markov Model Simerjot Kaur (sk3391) Stanford University Abstract This work presents a novel algorithmic trading system based on reinforcement

More information

OPENING RANGE BREAKOUT STOCK TRADING ALGORITHMIC MODEL

OPENING RANGE BREAKOUT STOCK TRADING ALGORITHMIC MODEL OPENING RANGE BREAKOUT STOCK TRADING ALGORITHMIC MODEL Mrs.S.Mahalakshmi 1 and Mr.Vignesh P 2 1 Assistant Professor, Department of ISE, BMSIT&M, Bengaluru, India 2 Student,Department of ISE, BMSIT&M, Bengaluru,

More information

Algorithmic Trading using Sentiment Analysis and Reinforcement Learning Simerjot Kaur (SUNetID: sk3391 and TeamID: 035)

Algorithmic Trading using Sentiment Analysis and Reinforcement Learning Simerjot Kaur (SUNetID: sk3391 and TeamID: 035) Algorithmic Trading using Sentiment Analysis and Reinforcement Learning Simerjot Kaur (SUNetID: sk3391 and TeamID: 035) Abstract This work presents a novel algorithmic trading system based on reinforcement

More information

LendingClub Loan Default and Profitability Prediction

LendingClub Loan Default and Profitability Prediction LendingClub Loan Default and Profitability Prediction Peiqian Li peiqian@stanford.edu Gao Han gh352@stanford.edu Abstract Credit risk is something all peer-to-peer (P2P) lending investors (and bond investors

More information

SURVEY OF MACHINE LEARNING TECHNIQUES FOR STOCK MARKET ANALYSIS

SURVEY OF MACHINE LEARNING TECHNIQUES FOR STOCK MARKET ANALYSIS International Journal of Computer Engineering and Applications, Volume XI, Special Issue, May 17, www.ijcea.com ISSN 2321-3469 SURVEY OF MACHINE LEARNING TECHNIQUES FOR STOCK MARKET ANALYSIS Sumeet Ghegade

More information

Machine Learning in Risk Forecasting and its Application in Low Volatility Strategies

Machine Learning in Risk Forecasting and its Application in Low Volatility Strategies NEW THINKING Machine Learning in Risk Forecasting and its Application in Strategies By Yuriy Bodjov Artificial intelligence and machine learning are two terms that have gained increased popularity within

More information

Performance analysis of Neural Network Algorithms on Stock Market Forecasting

Performance analysis of Neural Network Algorithms on Stock Market Forecasting www.ijecs.in International Journal Of Engineering And Computer Science ISSN:2319-7242 Volume 3 Issue 9 September, 2014 Page No. 8347-8351 Performance analysis of Neural Network Algorithms on Stock Market

More information

Investigating Algorithmic Stock Market Trading using Ensemble Machine Learning Methods

Investigating Algorithmic Stock Market Trading using Ensemble Machine Learning Methods Investigating Algorithmic Stock Market Trading using Ensemble Machine Learning Methods Khaled Sharif University of Jordan * kldsrf@gmail.com Mohammad Abu-Ghazaleh University of Jordan * mohd.ag@live.com

More information

Introducing GEMS a Novel Technique for Ensemble Creation

Introducing GEMS a Novel Technique for Ensemble Creation Introducing GEMS a Novel Technique for Ensemble Creation Ulf Johansson 1, Tuve Löfström 1, Rikard König 1, Lars Niklasson 2 1 School of Business and Informatics, University of Borås, Sweden 2 School of

More information

Application of Innovations Feedback Neural Networks in the Prediction of Ups and Downs Value of Stock Market *

Application of Innovations Feedback Neural Networks in the Prediction of Ups and Downs Value of Stock Market * Proceedings of the 6th World Congress on Intelligent Control and Automation, June - 3, 006, Dalian, China Application of Innovations Feedback Neural Networks in the Prediction of Ups and Downs Value of

More information

Predicting Stock Movements Using Market Correlation Networks

Predicting Stock Movements Using Market Correlation Networks Predicting Stock Movements Using Market Correlation Networks David Dindi, Alp Ozturk, and Keith Wyngarden {ddindi, aozturk, kwyngard}@stanford.edu 1 Introduction The goal for this project is to discern

More information

SELECTION BIAS REDUCTION IN CREDIT SCORING MODELS

SELECTION BIAS REDUCTION IN CREDIT SCORING MODELS SELECTION BIAS REDUCTION IN CREDIT SCORING MODELS Josef Ditrich Abstract Credit risk refers to the potential of the borrower to not be able to pay back to investors the amount of money that was loaned.

More information

Chapter IV. Forecasting Daily and Weekly Stock Returns

Chapter IV. Forecasting Daily and Weekly Stock Returns Forecasting Daily and Weekly Stock Returns An unsophisticated forecaster uses statistics as a drunken man uses lamp-posts -for support rather than for illumination.0 Introduction In the previous chapter,

More information

State Switching in US Equity Index Returns based on SETAR Model with Kalman Filter Tracking

State Switching in US Equity Index Returns based on SETAR Model with Kalman Filter Tracking State Switching in US Equity Index Returns based on SETAR Model with Kalman Filter Tracking Timothy Little, Xiao-Ping Zhang Dept. of Electrical and Computer Engineering Ryerson University 350 Victoria

More information

A Machine Learning Investigation of One-Month Momentum. Ben Gum

A Machine Learning Investigation of One-Month Momentum. Ben Gum A Machine Learning Investigation of One-Month Momentum Ben Gum Contents Problem Data Recent Literature Simple Improvements Neural Network Approach Conclusion Appendix : Some Background on Neural Networks

More information

Journal of Computational and Applied Mathematics. The mean-absolute deviation portfolio selection problem with interval-valued returns

Journal of Computational and Applied Mathematics. The mean-absolute deviation portfolio selection problem with interval-valued returns Journal of Computational and Applied Mathematics 235 (2011) 4149 4157 Contents lists available at ScienceDirect Journal of Computational and Applied Mathematics journal homepage: www.elsevier.com/locate/cam

More information

Forecasting Price Movements using Technical Indicators: Investigating the Impact of. Varying Input Window Length

Forecasting Price Movements using Technical Indicators: Investigating the Impact of. Varying Input Window Length Forecasting Price Movements using Technical Indicators: Investigating the Impact of Varying Input Window Length Yauheniya Shynkevich 1,*, T.M. McGinnity 1,2, Sonya Coleman 1, Ammar Belatreche 3, Yuhua

More information

Shynkevich, Y, McGinnity, M, Coleman, S, Belatreche, A and Li, Y

Shynkevich, Y, McGinnity, M, Coleman, S, Belatreche, A and Li, Y Forecasting price movements using technical indicators : investigating the impact of varying input window length Shynkevich, Y, McGinnity, M, Coleman, S, Belatreche, A and Li, Y http://dx.doi.org/10.1016/j.neucom.2016.11.095

More information

In physics and engineering education, Fermi problems

In physics and engineering education, Fermi problems A THOUGHT ON FERMI PROBLEMS FOR ACTUARIES By Runhuan Feng In physics and engineering education, Fermi problems are named after the physicist Enrico Fermi who was known for his ability to make good approximate

More information

Multistage risk-averse asset allocation with transaction costs

Multistage risk-averse asset allocation with transaction costs Multistage risk-averse asset allocation with transaction costs 1 Introduction Václav Kozmík 1 Abstract. This paper deals with asset allocation problems formulated as multistage stochastic programming models.

More information

International Journal of Computer Science Trends and Technology (IJCST) Volume 5 Issue 2, Mar Apr 2017

International Journal of Computer Science Trends and Technology (IJCST) Volume 5 Issue 2, Mar Apr 2017 RESEARCH ARTICLE Stock Selection using Principal Component Analysis with Differential Evolution Dr. Balamurugan.A [1], Arul Selvi. S [2], Syedhussian.A [3], Nithin.A [4] [3] & [4] Professor [1], Assistant

More information

JACOBS LEVY CONCEPTS FOR PROFITABLE EQUITY INVESTING

JACOBS LEVY CONCEPTS FOR PROFITABLE EQUITY INVESTING JACOBS LEVY CONCEPTS FOR PROFITABLE EQUITY INVESTING Our investment philosophy is built upon over 30 years of groundbreaking equity research. Many of the concepts derived from that research have now become

More information

Alpha-Beta Soup: Mixing Anomalies for Maximum Effect. Matthew Creme, Raphael Lenain, Jacob Perricone, Ian Shaw, Andrew Slottje MIRAJ Alpha MS&E 448

Alpha-Beta Soup: Mixing Anomalies for Maximum Effect. Matthew Creme, Raphael Lenain, Jacob Perricone, Ian Shaw, Andrew Slottje MIRAJ Alpha MS&E 448 Alpha-Beta Soup: Mixing Anomalies for Maximum Effect Matthew Creme, Raphael Lenain, Jacob Perricone, Ian Shaw, Andrew Slottje MIRAJ Alpha MS&E 448 Recap: Overnight and intraday returns Closet-1 Opent Closet

More information

An Introduction to the Mathematics of Finance. Basu, Goodman, Stampfli

An Introduction to the Mathematics of Finance. Basu, Goodman, Stampfli An Introduction to the Mathematics of Finance Basu, Goodman, Stampfli 1998 Click here to see Chapter One. Chapter 2 Binomial Trees, Replicating Portfolios, and Arbitrage 2.1 Pricing an Option A Special

More information

Draft. emerging market returns, it would seem difficult to uncover any predictability.

Draft. emerging market returns, it would seem difficult to uncover any predictability. Forecasting Emerging Market Returns Using works CAMPBELL R. HARVEY, KIRSTEN E. TRAVERS, AND MICHAEL J. COSTA CAMPBELL R. HARVEY is the J. Paul Sticht professor of international business at Duke University,

More information