Predicting First Day Returns for Japanese IPOs

Executive Summary

Goal: To predict the first day returns on Japanese IPOs (based on the first day closing price), using public information available prior to the offer.

Purpose: To use the model to predict whether a new IPO coming on the market will make first day gains or not, and to use the result to decide whether to invest in the IPO.

Type of Problem: Predictive

Data Gathering

We used a publicly available dataset on Japanese IPOs from 1997 to 2009, Kaneko and Pettway's Japanese IPO Database (KP-JIPO): http://www.fbc.keio.ac.jp/~kaneko/kp-jipo/top.htm

This dataset has 1561 records, for all IPOs in Japan from 1997 to 2009.

Data Cleaning

The raw data had multiple issues:

1) The dates were not in a format recognized by Excel, so we had to convert all of them to an Excel-recognizable format.
2) Records from 1997-1999 were missing information about the Lead Manager's fees and the percentage of allocation to the Lead Manager, which we considered to be important predictors. This led to the removal of the 128 records within this time period.
3) The Industry column had 2 spelling mistakes, which we corrected.

Data Preparation

We created some columns (combining some of the columns from the data) which we thought would be better predictors of first day returns:

1) Minimum bid size: The minimum bid a retail investor has to make to participate in the IPO. The rationale for using this was that retail investors would be more averse to investing in IPOs if the minimum upfront commitment required was high.
2) Secondary Offering %age: The percentage of secondary shares being offered in the IPO, i.e. the number of shares of existing shareholders that were exiting the company via the IPO. The rationale behind this variable was that IPOs in which all shares are new issues give all the money to the company, whereas if current shareholders are seen to be using the IPO as an exit route, the public may see that as a negative signal about the quality of the IPO.
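The report does not say which raw columns were combined to build these two variables. As a minimal sketch, assuming the minimum bid is the offer price times the minimum trading unit and the secondary offering percentage is secondary shares over all shares offered, the derivation could look like this. All field names are hypothetical and are not the KP-JIPO column names.

```python
from dataclasses import dataclass


@dataclass
class IpoRecord:
    # Hypothetical field names; the KP-JIPO database uses its own columns.
    offer_price: float      # offer price per share, in yen
    min_trading_unit: int   # smallest number of shares an investor can buy
    primary_shares: int     # newly issued shares sold in the IPO
    secondary_shares: int   # existing shareholders' shares sold in the IPO


def minimum_bid_size(rec: IpoRecord) -> float:
    """Assumed definition: upfront commitment for one trading unit."""
    return rec.offer_price * rec.min_trading_unit


def secondary_offering_pct(rec: IpoRecord) -> float:
    """Assumed definition: secondary shares as a percentage of all shares offered."""
    total = rec.primary_shares + rec.secondary_shares
    if total == 0:
        raise ValueError("record offers no shares")
    return 100.0 * rec.secondary_shares / total


# Made-up record, for illustration only.
example = IpoRecord(offer_price=1500.0, min_trading_unit=100,
                    primary_shares=800_000, secondary_shares=200_000)
print(minimum_bid_size(example))        # 150000.0
print(secondary_offering_pct(example))  # 20.0
```

An IPO made up entirely of new shares gets a secondary offering percentage of 0, which is the case the rationale above treats as the most favourable signal.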

We also created binned variables from categorical data:

1) Industry: We binned the 33 industries in the cleaned data into 4 categories, representing the 4 types of patterns observed in the first day returns as a function of industry.
2) BRLMs: We binned the 56 Lead Managers in the data into 3 categories, depending on their effect on the first day return.
3) Market: There were 11 types of markets on which the IPOs were listed. However, we differentiated only between OTC (over the counter) and exchange traded IPOs, i.e. we used a dummy variable for whether the issue was OTC or not.

After these steps, we partitioned the data into the standard 50-30-20 split for running prediction models.

Predictors Used:

After data preparation and some exploration of the data, we narrowed down to the following 9 predictors:

1) Age of company at time of IPO
2) Gross Proceeds (size of IPO)
3) Minimum Bid Amount
4) Underwriter's Gross Spread (fees as %age of size of IPO)
5) Percentage shares allocated to Lead Manager 1
6) Secondary offering as %age of total
7) IS_OTC listing
8) Industry_Type (binned categorical variable, 4 categories)
9) Lead_Manager (binned categorical variable, 3 categories)

Exploration

1) One of the things we noticed was that the y-variable, the first day IPO return, was clustered around a small area, with decreasing density away from the cluster. This led us to think that the y-variable may be better represented on a logarithmic scale. So we converted our y-variable to log(y) and used that for all predictions.
2) We noticed that the first day return was higher when the allocation to the lead manager was higher. This is somewhat intuitive: the lead manager would have an incentive to underprice the issue and ensure a greater chance of full subscription if he had to bear a larger share of the issue in case it wasn't fully subscribed.

3) The underpricing of the IPO decreased further if the company was mature at the time of IPO, i.e. younger companies were much more likely to be underpriced than mature ones.
4) Our newly created variable, IS_OTC, was also helpful in assessing underpricing.

Analysis

We used the Multiple Linear Regression, K-Nearest Neighbours and Regression Tree algorithms to attempt to predict the first day return (on log scale). We note that using the naïve rule, the average of first day returns was 67%, with a standard error of 106%.

We used Spotfire Miner for Linear Regression and Regression Trees (full and pruned), and XLMiner for K-Nearest Neighbours. The Spotfire Miner setup for our procedure is shown below.
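Separately from the Spotfire Miner workflow, the partition, log transform and naïve-rule benchmark described in the text can be illustrated in a few lines of plain Python. This is only a sketch under stated assumptions: the 50-30-20 split is taken to mean training, validation and test sets; the log transform is assumed to be log(1 + r), since the report says only log(y) and a plain logarithm is undefined for negative returns; and the naïve rule is taken to predict the training-set mean for every record.

```python
import math
import random


def partition(records, fractions=(0.5, 0.3, 0.2), seed=0):
    """Shuffle and split into (training, validation, test); roles are assumed."""
    shuffled = list(records)
    random.Random(seed).shuffle(shuffled)
    n_train = int(len(shuffled) * fractions[0])
    n_valid = int(len(shuffled) * fractions[1])
    return (shuffled[:n_train],
            shuffled[n_train:n_train + n_valid],
            shuffled[n_train + n_valid:])


def log_return(r):
    """Assumed transform: log(1 + r), with r a fraction (0.67 means 67%)."""
    return math.log1p(r)


def rmse(actual, predicted):
    """Root mean squared error between two equal-length sequences."""
    return math.sqrt(sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual))


def naive_rule_rmse(train_y, valid_y):
    """Predict the training mean for every validation record and score it."""
    mean = sum(train_y) / len(train_y)
    return rmse(valid_y, [mean] * len(valid_y))


# 1561 records minus the 128 removed during cleaning.
train, valid, test = partition(range(1561 - 128))
print(len(train), len(valid), len(test))
```

Any model's validation RMSE can then be compared against `naive_rule_rmse` on the same partition, which is the comparison the Conclusion relies on.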

The plots of the residuals for the different methods are presented as Exhibit 1 at the end of the report.

1) Multiple Linear Regression: Multiple linear regression gave poor results, with a high RMSE of 110% and a mean absolute error of 67%.
2) Full Regression Tree (Exhibit 2): In terms of the relative importance of the predictors, Age, Gross Proceeds (size of issue) and Minimum Bid Amount ranked as the top 3 factors affecting first day IPO returns. The regression tree, however, did not improve much on the prediction.
3) Pruned Regression Tree:

The pruned tree gave marginally better results than the full tree.

4) K-Nearest Neighbours: We ran KNN using a k of 5. When we allowed k to vary up to large numbers, the algorithm kept choosing the largest number possible (it went up to 20, the maximum that XLMiner can handle), with marginal improvements in error rates with each incremental increase in k (Exhibit 4). So we decided to choose k at the point from which the improvements seemed to taper off.
5) Ensemble: We took the average prediction of the above 4 methods to predict the value for a new record. The results from the ensemble are not significantly better than those of any of our individual methods. However, the dispersion of the residuals (Exhibit 1) seems to be smaller than for any of the methods individually.

Conclusion

On average, the prediction algorithms do not seem to do significantly better than the naïve rule, with standard errors close to 100%. An average expected return of 67% with a standard error of about 100% is not good enough to use for investment purposes. Ideally, we would want the mean minus 1 standard deviation of the first day returns to be higher than 0 for us to have any confidence in the model.

Submitted by:
Vivek Kumar 61210021
Rohan Mahadar 61210055
Gaurav Jain 61210585
Tejas Pahlajani 61210606
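As a supplementary illustration of the ensemble step and of the criterion stated in the Conclusion, the sketch below averages per-record predictions across models and checks whether the mean minus one standard deviation is above zero. The function names are hypothetical, and the predictions passed to `ensemble_average` are placeholders, not results from the report.

```python
def ensemble_average(predictions_by_model):
    """Average the per-record predictions of several models (one list per model)."""
    n_models = len(predictions_by_model)
    return [sum(per_record) / n_models for per_record in zip(*predictions_by_model)]


def lower_band(mean_return, std_error):
    """Mean minus one standard deviation, in the same units as the inputs."""
    return mean_return - std_error


def passes_criterion(mean_return, std_error):
    """The report's stated test: the lower band should be higher than 0."""
    return lower_band(mean_return, std_error) > 0


# Placeholder predictions from two models for two records.
print(ensemble_average([[1.0, 2.0], [3.0, 4.0]]))  # [2.0, 3.0]

# Figures quoted in the report, in percent.
print(lower_band(67, 100))        # -33
print(passes_criterion(67, 100))  # False
```

With the report's figures the lower band is 67 - 100 = -33 percentage points, so the criterion fails, which is consistent with the conclusion that the models are not usable for investment decisions.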

Exhibit 1: Residuals using the different methods

Exhibit 2: Full Regression Tree

Exhibit 3: Pruned Tree

Value of k    Training RMS Error    Validation RMS Error
1             0                     0.722727291
2             0                     0.650958645
3             0                     0.622172251
4             0                     0.617716972
5             0                     0.612118456
6             0                     0.608404501
7             0                     0.607777356
8             0                     0.606935697
9             0                     0.605849004
10            0                     0.604014927
11            0                     0.603725082
12            0                     0.603623558
13            0                     0.602380018
14            0                     0.602627935
15            0                     0.601277304
16            0                     0.600809471
17            0                     0.601014934
18            0                     0.600638226
19            0                     0.600669147
20            0                     0.600285097   <--- Best k

Exhibit 4: Choice of (k) for KNN
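The choice of k = 5 was made by eye, at the point where improvements taper off. One way to make such a choice reproducible, sketched here as an assumption and not as the method the authors used, is to take the smallest k whose validation error is within a fixed tolerance of the best error. The 2% tolerance below is a hypothetical value that happens to reproduce k = 5 on the Exhibit 4 figures.

```python
# Validation RMS errors copied from Exhibit 4, keyed by k.
VALIDATION_RMSE = {
    1: 0.722727291, 2: 0.650958645, 3: 0.622172251, 4: 0.617716972,
    5: 0.612118456, 6: 0.608404501, 7: 0.607777356, 8: 0.606935697,
    9: 0.605849004, 10: 0.604014927, 11: 0.603725082, 12: 0.603623558,
    13: 0.602380018, 14: 0.602627935, 15: 0.601277304, 16: 0.600809471,
    17: 0.601014934, 18: 0.600638226, 19: 0.600669147, 20: 0.600285097,
}


def best_k(errors):
    """k with the lowest validation error (what the tool reports as best)."""
    return min(errors, key=errors.get)


def smallest_k_within(errors, tolerance):
    """Smallest k whose error is within `tolerance` (a fraction) of the minimum."""
    threshold = min(errors.values()) * (1 + tolerance)
    return min(k for k, err in errors.items() if err <= threshold)


print(best_k(VALIDATION_RMSE))                   # 20
print(smallest_k_within(VALIDATION_RMSE, 0.02))  # 5
```

A tighter tolerance pushes the choice toward larger k, so the tolerance itself is a judgment call in the same way the visual elbow is.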