Logistic Regression & Industry Modeling: Framing Financial Problems as Probabilities. Russ Koesterich, CFA, Chief North American Strategist
Logistic Regression & Probability: "So far as the laws of mathematics refer to reality, they are not certain. And so far as they are certain, they do not refer to reality." - Albert Einstein
Key Topics
1. Introduction to Logistic Regression
2. Methodology
3. Rationales for its use
4. Applications in Sector/Industry Models
Introduction
- Methodology for modeling a dichotomous event
- Output suited to less quantitative professionals, who can intuitively appreciate a probability
- Well suited to industry, sector, and style modeling
- Can also be adapted to absolute return strategies by modeling positive versus negative absolute returns
Methodology: What is Logistic Regression?
- "A mathematical modeling approach that can be used to describe the relationship of several X's to a dichotomous dependent variable" - Dr. David Kleinbaum, Logistic Regression
- Uses a maximum likelihood algorithm to estimate the regression coefficients
- The technique is common in biostatistics, particularly in epidemiology
- Easily adapted to dichotomous events: Outperform/Underperform, Growth/Value, Large Cap/Small Cap, etc.
Logistic Model
[Figure: plot of the logistic function $F(z) = 1/(1+e^{-z})$, an S-shaped curve with the vertical axis running from 0 to 1]
- Logistic regression uses the logit link to describe a probability with an S-shaped function: $F(z) = \frac{1}{1+e^{-z}}$
- z is the traditional regression equation: $z = \alpha + \beta_1 X_1 + \beta_2 X_2 + \dots + \beta_k X_k$
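As an illustration of the S-shaped function on this slide, here is a minimal Python sketch. The function names `logistic` and `linear_score`, and all coefficient and factor values, are hypothetical and chosen only to show the mechanics.

```python
import math

def logistic(z):
    """Map a real-valued score z to a probability between 0 and 1."""
    # Branch on the sign of z so math.exp() never overflows.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)

def linear_score(intercept, betas, xs):
    """z = alpha + beta_1*x_1 + ... + beta_k*x_k."""
    return intercept + sum(b * x for b, x in zip(betas, xs))

# Hypothetical intercept, coefficients, and factor values (illustration only).
z = linear_score(0.1, [0.8, -0.5], [1.0, 2.0])
p = logistic(z)
print(f"z = {z:.2f}, probability = {p:.3f}")
```

A score of zero maps to a probability of exactly 0.5, and large positive or negative scores flatten out toward 1 and 0, which is what gives the curve its S shape.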
Logistic Formula
- The model describes the expected value of Y, i.e. E(Y), in terms of the formula: $E(Y) = \frac{1}{1 + \exp\left[-\left(\beta_0 + \sum_j \beta_j X_j\right)\right]}$
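A short derivation, offered as a sketch, shows why this formula lends itself to odds. Here P stands for E(Y) and z for the linear score; both are shorthand introduced for this note rather than notation from the slides.

```latex
\begin{align*}
P &= \frac{1}{1 + e^{-z}}, \qquad z = \beta_0 + \sum_j \beta_j X_j \\
1 - P &= \frac{e^{-z}}{1 + e^{-z}} \\
\frac{P}{1 - P} &= e^{z} \\
\ln\frac{P}{1 - P} &= \beta_0 + \sum_j \beta_j X_j
\end{align*}
```

The log of the odds is therefore linear in the factors, so each coefficient can be read as the change in log-odds for a one-unit change in its variable.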
Maximum Likelihood (ML) Estimation
- If the dependent variable is assumed to be normally distributed, ML estimation gives the same estimates as OLS
- Because logistic regression is a non-linear model, ML estimation is the preferred method
- ML estimation places no restrictions on the characteristics of the independent variables
- Variables can be nominal, ordinal, and/or interval
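To make the estimation step concrete, the following is a minimal sketch of maximum likelihood fitting for a single predictor, using Newton-Raphson iterations on the log-likelihood. It is not the procedure used for the models in this deck, which is not described; statistical packages normally handle this step. The function name `fit_logistic_ml` and the data are made up for illustration.

```python
import math

def fit_logistic_ml(xs, ys, iterations=25):
    """Fit P(y=1) = 1 / (1 + exp(-(b0 + b1*x))) by maximum likelihood.

    Newton-Raphson on the log-likelihood, one predictor plus an intercept.
    """
    b0, b1 = 0.0, 0.0
    for _ in range(iterations):
        # Gradient (g0, g1) and entries of the 2x2 information matrix.
        g0 = g1 = h00 = h01 = h11 = 0.0
        for x, y in zip(xs, ys):
            p = 1.0 / (1.0 + math.exp(-(b0 + b1 * x)))
            w = p * (1.0 - p)
            g0 += y - p
            g1 += (y - p) * x
            h00 += w
            h01 += w * x
            h11 += w * x * x
        # Newton step: solve the 2x2 system H * delta = g by hand.
        det = h00 * h11 - h01 * h01
        d0 = (h11 * g0 - h01 * g1) / det
        d1 = (h00 * g1 - h01 * g0) / det
        b0 += d0
        b1 += d1
        if abs(d0) + abs(d1) < 1e-10:
            break
    return b0, b1

# Made-up data: x = 6-month rate of change in yields, y = 1 if the group outperformed.
xs = [-0.30, -0.20, -0.15, -0.10, -0.05, 0.00, 0.05, 0.10, 0.15, 0.20, 0.30]
ys = [1, 1, 0, 1, 1, 0, 1, 0, 0, 1, 0]
b0_hat, b1_hat = fit_logistic_ml(xs, ys)
print(f"intercept = {b0_hat:.3f}, slope = {b1_hat:.3f}")
```

The made-up outcomes overlap across the range of x on purpose: if the two classes were perfectly separated, the likelihood would have no finite maximum and the iterations would not settle.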
Features & Benefits of Logistic Regression
- Different perspective: view a financial problem as a set of possible outcomes and ask how likely each outcome is
- Probability output provides an intuitive framework for evaluating future scenarios
- Can be used to forecast the probability of more than two outcomes (nominal or ordinal logistic regression)
- Odds ratio
Odds Ratio Calculation
- $\text{Odds} = \frac{P}{1-P}$
- $\text{Odds Ratio} = \frac{\text{Odds}(X_1)}{\text{Odds}(X_2)}$, or $OR(1,0) = \frac{P(X_1)/(1-P(X_1))}{P(X_0)/(1-P(X_0))}$
- The odds ratio also equals the exponential of the product of the coefficient and the change in the variable: $\text{Odds Ratio} = e^{\beta_i (x_{i1} - x_{i2})}$
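The two routes to the odds ratio on this slide can be checked against each other numerically. The sketch below assumes a one-factor logistic model with a hypothetical intercept and coefficient; the helper names are illustrative only.

```python
import math

def odds(p):
    """Convert a probability into odds, P / (1 - P)."""
    return p / (1.0 - p)

def logistic(z):
    """Probability implied by a linear score z."""
    return 1.0 / (1.0 + math.exp(-z))

def odds_ratio_from_coef(beta, x1, x2):
    """Odds ratio implied by moving one variable from x2 to x1."""
    return math.exp(beta * (x1 - x2))

# Hypothetical model: intercept 0.2, coefficient 1.5 on a single factor.
alpha, beta = 0.2, 1.5
x1, x2 = 0.7, 0.1

# Route 1: ratio of the odds computed from the model probabilities.
or_from_probs = odds(logistic(alpha + beta * x1)) / odds(logistic(alpha + beta * x2))
# Route 2: exponentiate the coefficient times the change in the variable.
or_from_beta = odds_ratio_from_coef(beta, x1, x2)

print(or_from_probs, or_from_beta)  # the two routes agree up to rounding
```

The intercept cancels in the ratio, which is why the shortcut needs only the coefficient and the change in the variable.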
Example Odds Ratio
1. Changes in rates impact retail stocks. Specifically, the 6-month rate of change in the 10-year yield impacts the probability of outperformance.
2. The coefficient for the relationship is -2.451.
3. Compare the likelihood of outperformance when rates are down 20% in 6 months (5% to 4%) vs. when rates are up 20% (5% to 6%).
Example Odds Ratio (continued)
Calculation: with all other factors held constant, odds ratio = $e^{-2.451 \times (-0.2 - 0.2)} = e^{-2.451 \times (-0.4)} = e^{0.9804} \approx 2.67$
Conclusion: The odds of retail stocks outperforming are about 2.7x higher when interest rates have dropped 20% over the past 6 months than in periods following a 20% rise in rates.
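The arithmetic in this example can be reproduced in a few lines. The sketch takes the coefficient and the two rate scenarios from the calculation above; the variable names are illustrative.

```python
import math

beta = -2.451        # coefficient on the 6-month rate of change, as used in the calculation
rates_down = -0.20   # 10-year yield falls 20% (5% to 4%)
rates_up = 0.20      # 10-year yield rises 20% (5% to 6%)

# Odds of outperformance after falling rates relative to after rising rates.
odds_ratio = math.exp(beta * (rates_down - rates_up))
print(f"odds ratio (rates down vs. rates up): {odds_ratio:.2f}")

# Swapping the two scenarios gives the reciprocal.
inverse = math.exp(beta * (rates_up - rates_down))
print(f"odds ratio (rates up vs. rates down): {inverse:.2f}")
```

Because only the difference between the two scenarios enters the exponent, the starting level of rates does not change the ratio as long as the other factors are held constant.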
Example Odds Ratio: Dichotomous Variable
1. Seasonality impacts Consumer Discretionary stocks; the sector is more likely to outperform in Q1.
2. Code seasonality as a dummy variable: 1 = Q1, 0 = all other quarters.
3. Coefficient = 0.6606
4. Odds ratio = $e^{0.6606 \times (1 - 0)} = e^{0.6606} \approx 1.94$
Conclusion: The odds of the Consumer Discretionary sector beating the market are about 1.9x higher in Q1 than in all other quarters.
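An odds ratio multiplies odds, not probabilities, so turning it into a probability requires a baseline. The sketch below assumes, purely for illustration, a 50% chance of beating the market outside Q1; that baseline is not from the slides, and `apply_odds_ratio` is a hypothetical helper.

```python
import math

def apply_odds_ratio(p_base, odds_ratio):
    """Scale the baseline odds by the odds ratio, then convert back to a probability."""
    base_odds = p_base / (1.0 - p_base)
    new_odds = base_odds * odds_ratio
    return new_odds / (1.0 + new_odds)

coef = 0.6606                              # Q1 dummy coefficient from the slide
q1_odds_ratio = math.exp(coef * (1 - 0))   # dummy moves from 0 to 1

# ASSUMPTION: 50% chance of outperformance in the other quarters (not from the source).
p_other_quarters = 0.50
p_q1 = apply_odds_ratio(p_other_quarters, q1_odds_ratio)
print(f"odds ratio = {q1_odds_ratio:.2f}, P(outperform | Q1) = {p_q1:.3f}")
```

The same odds ratio implies a smaller change in probability when the baseline is already close to 0 or 1, which is worth keeping in mind when quoting "x times more likely".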
Industry Sector Model Objectives
- Provide a framework for intermediate-term (1-6 month) sector and group recommendations
- Isolate the relevant factors that demonstrate a consistent and leading relationship to a sector's future relative performance
- Combine factors in a systematic and controlled interaction framework
- Deliver output that indicates which sectors to overweight or underweight
3 Examples of Group/Sector Specific Factors
Healthcare Sector
- Sector-specific input: Medicare payments
- Rule: Are quarterly changes above or below the recent median?
- Impact: If above the median, the group is 2.7x more likely to outperform.
Utilities Sector
- Sector-specific input: Electric power use
- Rule: Are annual changes high (top quartile) or low (bottom quartile)?
- Impact: If changes are high, the group is 2.4x more likely to outperform.
Retail Industry
- Industry-specific input: CPI Apparel
- Rule: Is apparel inflation above its recent median?
- Impact: If apparel inflation is above the median, the group is 2x more likely to outperform.
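Rules of the "above or below the recent median" type can be coded as dummy variables before they enter the regression. The sketch below is one possible encoding; the 8-observation window, the function name, and the data are assumptions, since the slides do not define "recent".

```python
from statistics import median

def above_recent_median(series, window=8):
    """Return 1/0 flags: is each observation above the median of the prior `window` values?

    Flags start once a full window of history is available.
    """
    flags = []
    for i in range(window, len(series)):
        recent = series[i - window:i]  # history only, so there is no look-ahead
        flags.append(1 if series[i] > median(recent) else 0)
    return flags

# Made-up quarterly changes in a sector input (illustration only).
changes = [0.8, 1.1, 0.9, 1.4, 1.0, 1.2, 0.7, 1.3, 1.5, 0.6, 1.25]
print(above_recent_median(changes))
```

Comparing each observation only with earlier values keeps the flag usable as a leading input, since it relies on nothing that was unknown at the time.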
Model Example: Communications Equipment Industry
Factors:
(1) New investment in fixed technology
(2) Changes in technology capacity utilization
(3) ISM New Orders Index
(4) Risk appetite (measured by the VIX Index)
Sample Model & Returns
In-Sample Backtests
Probability Score    Average   Median   Count   Win Rate   1st Quartile   3rd Quartile
0.00% - 50.00%       -1.71%    -1.43%   56      42.86%     -6.37%         4.72%
50.00% - 100.00%     1.69%     1.56%    114     64.04%     -2.52%         5.44%
Out-of-Sample Backtests
Probability Score    Average   Median   Count   Win Rate   1st Quartile   3rd Quartile
0.00% - 50.00%       -3.54%    -2.06%   18      38.89%     -9.08%         2.09%
50.00% - 100.00%     1.01%     1.55%    24      70.83%     -4.71%         5.58%
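A summary of this kind can be produced by splitting observations at a 50% probability score and computing statistics for each bucket. The sketch below assumes that "Win" means the share of periods with a positive relative return, which the slides do not state. It leaves out the quartile columns for brevity, and the data are made up rather than taken from the backtests.

```python
from statistics import mean, median

def summarize_bucket(returns):
    """Average, median, count, and win rate (share of positive returns)."""
    wins = sum(1 for r in returns if r > 0)
    return {
        "average": mean(returns),
        "median": median(returns),
        "count": len(returns),
        "win_rate": wins / len(returns),
    }

def backtest_by_probability(scores, forward_returns, cutoff=0.5):
    """Split observations at `cutoff` and summarize each bucket."""
    low = [r for s, r in zip(scores, forward_returns) if s < cutoff]
    high = [r for s, r in zip(scores, forward_returns) if s >= cutoff]
    return {
        "below_cutoff": summarize_bucket(low),
        "at_or_above_cutoff": summarize_bucket(high),
    }

# Made-up probability scores and forward relative returns (not the slide's data).
scores = [0.35, 0.62, 0.48, 0.71, 0.55, 0.29, 0.66, 0.44]
rel_returns = [-0.021, 0.015, 0.004, 0.032, -0.008, -0.030, 0.019, -0.012]
results = backtest_by_probability(scores, rel_returns)
for bucket, stats in results.items():
    print(bucket, stats)
```

Running the same summary on in-sample and out-of-sample periods separately gives the two-panel comparison shown on the slide.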
Model Probability & Returns: Communication Equipment Model
[Figure: 1-month forward relative return plotted with the model's probability of outperformance, Mar-90 to Jul-03. Return axis: -0.4 to 0.3; probability axis: 0 to 0.9.]
Conclusions
- Logistic regression provides a different perspective on many financial problems
- The methodology provides an intuitive output
- An added benefit of the methodology is the odds ratio, which can be easily extracted from the model
- Finally, it is well suited to industry and relative return analysis