Portfolio Recommendation System Stanford University CS 229 Project Report 2015

Berk Eserol

Introduction

Machine learning is one of the most important building blocks bringing machines closer to human capability and beyond. Given the progress in the field over the last twenty years, it is easy to foresee that machine learning will remain valuable in daily life. This progress can be attributed to four main factors: (1) technical achievements in the field, (2) growth in machine capacity and capability, (3) advances in connectivity and the spread of service technologies, and (4) a massive increase in the amount of data (Horvitz 2006).

By using machine learning to analyze historical data, predictions and projections can be made on many different subjects. Although stock prices are affected by many external events, this application project tries to predict them using only their historical data, and generates a portfolio recommendation from the output of the regression and a scoring step. The aim is to recommend a portfolio whose stocks return a profit with high accuracy. The system is designed to produce results for any market that can be represented with similar data. For this project, historical NASDAQ stock prices are used.

Related Work

The stock market can be associated with many different events and data sources. Earnings announcements are one of the key events that affect stock prices. Using company earnings data with a Gaussian-kernel SVM can yield predictions with 64% accuracy (Pouransari H., Chalabi H. 2014). Other financial assets such as currencies and underground resources can be another source of information and raise prediction accuracy up to 77.6% for some markets (Shen S., Jiang H., Zhang T. 2012). Additional features can improve predictions for 3-, 5- or 10-day horizons up to 70% accuracy (Di X. 2014). Algorithm selection is another factor in accuracy: 50% accuracy is achievable with neural networks (Lin H. 2013).
Increasing the prediction range, for example to 44 days with an SVM, can produce more accurate results, up to 79% (Dai Y., Zhang Y. 2013). Besides regression approaches, classifying each stock as positive or negative with an SVM, without predicting its price, and building an equally weighted portfolio can return nearly 3% more than the market average (Arık S., Eryılmaz B., Goldberg A. 2013).

Dataset and Features

All available historical price data is collected for most of the companies whose stocks are tradable through NASDAQ. The data contains daily values from each stock's first trading date to a recent date (ideally one day before the calculation). Each day's data is in the following format:

MSFT
Year  Month  Day  Open     High     Low      Close    Volume
2015  11     12   53.4800  53.9800  53.1900  53.3200  34485500

The Yahoo Finance service and the herval/yahoo-finance open source project (https://github.com/herval/yahoo-finance) are used for data retrieval.

Currently the system has 300 company histories from their first day on NASDAQ to 11/12/2015. The data can be updated with the most recent values before the calculation to increase accuracy.

The system expects a set of inputs and uses them either to fill the feature vectors or as parameters. The expected inputs are the following:

- Set of NASDAQ stock names: Instead of running on all 300 companies, the system considers only the given set of stocks and uses only them in the recommended portfolio.
- Budget: The maximum total price of the stocks in the recommendation. The sum of the recommended stock prices does not exceed the given budget.
- Keep Time (KTI): The maximum intended time to keep the purchased stock. The keep time interval affects the features as a parameter. It is given as a day count, from a minimum of one day to a maximum of one year.
- History (HI): The history interval is used to trim the historical data. Data dated outside this interval is not considered.

Using the data bounded by these inputs and the parameters, the feature vector is filled such that

x_0  x_1  x_2  x_3  x_4  x_5  x_6  x_7  y
Maximum  Minimum  Volume  Average  Day  Month  Year  Close

Here the interval refers to twice the keep time interval. The data set is created by sliding a window of interval length over every day of data.

Methods

As a preprocessing step, for every historical day of the given companies, a new data point in the form of the feature vector is created according to the given input parameters. The interval is taken as twice the keep time interval in order to make a better prediction. For example, if the given keep time interval is five days, the data points are generated from ten-day intervals. The last day is considered as a new point (half of the interval) and the gain is calculated accordingly.
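The sliding-window preprocessing can be sketched as follows. This is only one possible reading of the description, not the author's implementation. The assumptions are: each window spans 2 x KTI days; the features are taken from the first KTI days (Maximum as the highest High, Minimum as the lowest Low, Volume and Average as means of the daily volume and close); the date features come from the last observed day; and the target is the Close on the final day of the window. The function name `build_data_points` and the row layout are hypothetical.

```python
from datetime import date

def build_data_points(rows, keep_time):
    """Turn daily rows into (features, target) pairs with a sliding window.

    rows: list of (date, open, high, low, close, volume), oldest first.
    keep_time: keep time interval (KTI) in days; the window is 2 * KTI long.
    """
    interval = 2 * keep_time
    points = []
    for start in range(len(rows) - interval + 1):
        window = rows[start:start + interval]
        seen = window[:keep_time]      # assumption: features use the first half
        last_seen_day = seen[-1][0]
        features = [
            max(r[2] for r in seen),               # Maximum: highest High
            min(r[3] for r in seen),               # Minimum: lowest Low
            sum(r[5] for r in seen) / keep_time,   # Volume: mean daily volume
            sum(r[4] for r in seen) / keep_time,   # Average: mean Close
            last_seen_day.day,
            last_seen_day.month,
            last_seen_day.year,
        ]
        target = window[-1][4]         # assumption: Close at the end of the window
        points.append((features, target))
    return points

# Hypothetical six-day price history, used only to show the call.
example_rows = [
    (date(2015, 11, 2), 10.0, 11.0, 9.0, 10.0, 100),
    (date(2015, 11, 3), 10.0, 12.0, 10.0, 11.0, 200),
    (date(2015, 11, 4), 11.0, 13.0, 10.0, 12.0, 300),
    (date(2015, 11, 5), 12.0, 14.0, 11.0, 13.0, 400),
    (date(2015, 11, 6), 13.0, 15.0, 12.0, 14.0, 500),
    (date(2015, 11, 9), 14.0, 16.0, 13.0, 15.0, 600),
]
example_points = build_data_points(example_rows, keep_time=2)
```

With a keep time of two days the window is four days long, so six days of history yield three data points.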
As the learning method, locally weighted linear regression (LWR) is used, minimizing

$$J(\theta) = \frac{1}{2} \sum_{i=1}^{m} w^{(i)} \left(\theta^T x^{(i)} - y^{(i)}\right)^2$$

where the weights are calculated as

$$w^{(i)} = \exp\left(-\frac{\lVert x - x^{(i)} \rVert^2}{2\tau^2}\right)$$

and $\tau$ is the bandwidth parameter. The bandwidth parameter is used to prevent overfitting and underfitting. $\theta$ is calculated using the normal equations

$$\theta = (X^T W X)^{-1} X^T W y$$
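A minimal sketch of a single LWR query, solving the weighted normal equations in plain Python. It assumes the features are already on comparable scales and that an intercept term x_0 = 1 is prepended to every point; neither detail is stated in the report. The helper names `solve_linear_system` and `lwr_predict` are hypothetical.

```python
import math

def solve_linear_system(A, b):
    """Solve A z = b by Gaussian elimination with partial pivoting."""
    n = len(A)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]   # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= factor * M[col][c]
    z = [0.0] * n
    for r in range(n - 1, -1, -1):                     # back substitution
        tail = sum(M[r][c] * z[c] for c in range(r + 1, n))
        z[r] = (M[r][n] - tail) / M[r][r]
    return z

def lwr_predict(X, y, x_query, tau=0.8):
    """Predict the target at x_query with locally weighted linear regression.

    No regularization is applied, so a singular system raises ZeroDivisionError.
    """
    # w_i = exp(-||x_query - x_i||^2 / (2 tau^2))
    weights = [
        math.exp(-sum((q - v) ** 2 for q, v in zip(x_query, row)) / (2 * tau ** 2))
        for row in X
    ]
    Xb = [[1.0] + list(row) for row in X]              # assumption: intercept x_0 = 1
    n = len(Xb[0])
    # Build X^T W X and X^T W y, then solve for theta.
    A = [[sum(w * r[j] * r[k] for w, r in zip(weights, Xb)) for k in range(n)]
         for j in range(n)]
    b = [sum(w * r[j] * t for w, r, t in zip(weights, Xb, y)) for j in range(n)]
    theta = solve_linear_system(A, b)
    return sum(t * v for t, v in zip(theta, [1.0] + list(x_query)))

# Points on the exact line y = 2x + 1: any weighting recovers the same line.
prediction = lwr_predict([[0.0], [1.0], [2.0], [3.0]], [1.0, 3.0, 5.0, 7.0], [1.5])
```

Because the weights depend on the query point, the system is rebuilt and solved for every query, which is why no model is fitted ahead of time.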

Training is performed at query time for the new point, and the predicted close price at the end of the interval is determined. The initial score of a stock is the predicted interval close price minus its last close price. If the initial score of a stock is negative, it is removed from the recommended bundle (RB). The stocks with a positive initial score are rescored into [0, 1] in proportion to their initial scores, such that the second scores of the stocks in the recommended bundle sum to 1. The budget is distributed in proportion to the second scores of the stocks in the recommended bundle.

Results

The training error is calculated on the same data by moving the latest date of the system to an earlier date and comparing the recommendations with the actual outcomes. Using leave-one-out cross-validation, error values are calculated for meaningful subsets of the feature set; the best result is achieved using all the predefined features. The bandwidth parameter is treated as another variable for minimizing the training error (best at 0.8).

The titles of the graphs give the inputs to the system in the order KTI/Budget/HI. Each graph shows the recommendation results for four dates (executions of the system), each indicating the day on which the market buy order is placed. The stocks are assumed to be sold KTI days later, and the charts show the virtual profit in dollars KTI days after the given date.

The system training errors (profit errors) are calculated from the virtual profit of each stock that appears in the recommended bundle (RB).
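The path from predicted prices to a virtual profit can be sketched as below: score each stock, drop the negative ones, split the budget in proportion to the normalized scores, and value the position KTI days later. Buying only whole shares is an assumption made here, as are the function names and the three example tickers and prices, which are invented for illustration.

```python
import math

def allocate_budget(last_close, predicted_close, budget):
    """Split the budget over stocks with a positive predicted gain."""
    initial = {s: predicted_close[s] - last_close[s] for s in last_close}
    positive = {s: gain for s, gain in initial.items() if gain > 0}
    total = sum(positive.values())
    if total == 0:
        return {}                                  # nothing worth recommending
    # Second score = gain / total (sums to 1); allocation = budget * second score.
    return {s: budget * gain / total for s, gain in positive.items()}

def virtual_profit(allocation, buy_price, sell_price):
    """Profit per stock if bought at buy_price and sold at sell_price."""
    profit = {}
    for stock, dollars in allocation.items():
        shares = math.floor(dollars / buy_price[stock])   # assumption: whole shares
        profit[stock] = shares * (sell_price[stock] - buy_price[stock])
    return profit

# Hypothetical tickers and prices, for illustration only.
last_close = {"AAA": 50.0, "BBB": 20.0, "CCC": 10.0}
predicted = {"AAA": 53.0, "BBB": 21.0, "CCC": 9.0}
allocation = allocate_budget(last_close, predicted, budget=1000.0)
actual_after_kti = {"AAA": 52.0, "BBB": 19.0, "CCC": 11.0}
profits = virtual_profit(allocation, last_close, actual_after_kti)
```

In this example the stock with a predicted loss receives no budget, and the remaining two split it 3:1 according to their predicted gains.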
A recommendation is counted as an error if the stock appears in the recommended bundle and its virtual profit is negative after KTI days, such that

$$\varepsilon(p) = \frac{1}{m} \sum_{i=1}^{m} 1\{p(s^{(i)}) < 0\}, \quad s \in RB$$

ε(p)            10/22/2015  10/29/2015  11/05/2015  11/12/2015
5/50000/3000    0.00        0.40        0.80        0.00
8/50000/4800    0.00        0.00        0.60        0.00
10/50000/6000   0.00        0.00        0.33        0.00
13/50000/7800   0.00        0.00        0.00        0.00

Over these 16 executions of the system with the predefined set of trending stocks, the average profit error is 0.1331 (profit accuracy is 86.69%). The total virtual profit, combining the results of the four executions of the system for each set of inputs, is the following.

Total Virtual Profit
5/50000/3000  8/50000/4800  10/50000/6000  13/50000/7800
7639.448      744.044       3654.637       6835.232

A set of 12 trending stocks {MSFT, GOOGL, AAPL, TXN, GT, INTC, ADP, ATRI, CASY, COST, CINF, CLBH} is selected as an example user input. Four executions of the system with different market buy order dates are shown in each chart. Positive results are considered correct and negative results are considered errors. Detailed virtual profit results after KTI days are shown separately.

5/50000/3000  10/22/2015  10/29/2015  11/05/2015  11/12/2015
MSFT          0           0           0           0.53998
GOOGL         4587.7      0           0           54.55957
AAPL          0           0           0           575.2796
TXN           0           02.9202     0           55.0403
GT            0           68.99985    0           84.080
INTC          0           0           -289.6      260.2996
ADP           0           -40.76      -89.6802    2.209974
ATRI          .39004      -6.0803     026.02      0
CASY          0           234.80      0           2.85
COST          0           0           -5.202      0
CINF          0           0           -8.600      33.8004
CLBH          0           0           0           0
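The profit error defined above is the fraction of recommended stocks with a negative virtual profit. The sketch below implements that indicator average and then averages the sixteen per-execution errors reported in the error table; the result, about 0.133, is consistent with the stated profit accuracy of 86.69%. The function name `profit_error`, the example profits, and the choice to return 0.0 for an empty bundle are assumptions.

```python
def profit_error(profits):
    """Fraction of recommended stocks whose virtual profit is negative."""
    if not profits:
        return 0.0          # assumption: an empty bundle counts as no error
    return sum(1 for p in profits if p < 0) / len(profits)

# Hypothetical bundle of five stocks, two of which lose money.
example_error = profit_error([50.0, 20.0, -40.0, -6.0, 30.0])

# Per-execution errors as reported in the table: 4 input sets x 4 buy dates.
reported_errors = [
    0.00, 0.40, 0.80, 0.00,   # 5/50000/3000
    0.00, 0.00, 0.60, 0.00,   # 8/50000/4800
    0.00, 0.00, 0.33, 0.00,   # 10/50000/6000
    0.00, 0.00, 0.00, 0.00,   # 13/50000/7800
]
average_error = sum(reported_errors) / len(reported_errors)
profit_accuracy = 1.0 - average_error
```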

8/50000/4800  10/22/2015  10/29/2015  11/05/2015  11/12/2015
MSFT          0           0           0           49.73
GOOGL         3587.04     0           0           0
AAPL          0           0           0           445.5594
TXN           0           0           -20.460     4.6
GT            0           994.4969    0           0
INTC          0           0           -64.56      468.4398
ADP           0           0           -67.86      36.25985
ATRI          0           6.22        460.599     0
COST          0           0           60.25988    0
CINF          0           0           0           80.52
CLBH          32.3        25.6        0           6.3

10/50000/6000  10/22/2015  10/29/2015  11/05/2015  11/12/2015
MSFT           0           0           0           0
GOOGL          0           0           0           0
AAPL           0           0           0           242.439652
TXN            0           0           00.64987    75.680086
GT             0           34.779257   39.600072   0
INTC           0           0           0           556.999
ADP            0           0           -259.075    7.44052
ATRI           0           0           0           0
COST           0           0           0           30.780082
CINF           0           0           0           59.36059
CLBH           398.92      76.8        0           0

13/50000/7800  10/22/2015  10/29/2015  11/05/2015  11/12/2015
MSFT           0           0           0           0
GOOGL          0           0           0           0
AAPL           0           0           0           76.7973
TXN            0           0           0           9.520
GT             0           0           0           48.999
INTC           0           0           0           478.7
ADP            0           0           0           29.7206
ATRI           2394.962    0           0           0
COST           0           0           0.9799
CINF           0           0           0           0
CLBH           0           208.62      294.8       0

Conclusion and Future Work

The Portfolio Recommendation System shows that adding a portfolio layer on top of the stock regression results increases the success rate (profit accuracy) up to 86.69%, where success is measured by the profitability of the recommendations. Moreover, it helps reduce risk by distributing the budget over a set of stocks, and it tries to minimize the effect of regression errors on the profit.

The project can be enriched with additional features and alternative algorithms. Another suitable algorithm for the problem would be SVR instead of LWR. Hybrid solutions can also be applied, such as determining the positive stocks with an SVM and rating them with a regression algorithm.

References

Horvitz, E. 2006. Machine learning, reasoning, and intelligence in daily life: Directions and challenges. Technical Report TR-2006-185, Microsoft Research.
Pouransari H., Chalabi H. 2014. Event-based stock market prediction, Stanford University.
Di X. 2014. Stock Trend Prediction with Technical Indicators using SVM, Stanford University.
Lin H. 2013. Feature Investigation for Stock market Prediction, Department of Aeronautics and Astronautics, Stanford University.
Dai Y., Zhang Y. 2013. Machine Learning in Stock Trend Forecasting, Stanford University.
Arık S., Eryılmaz B., Goldberg A. 2013. Supervised classification-based stock prediction and portfolio optimization.
Shen S., Jiang H., Zhang T. 2012. Stock Market Forecasting Using Machine Learning Algorithms.