Does Judgement Improve the Accuracy of Macroeconomic Forecasting?

Ana Galvão, Anthony Garratt and James Mitchell
University of Warwick, WBS
March 26th, 2018

Introduction

- The aim of this paper is to assess the role of judgement in macroeconomic forecasting.
- The role of judgement is a much-debated question: different types of forecasters impose and use judgement to differing degrees.
- Here we define judgement, in the context of the UK, as the collective view of the 9 members of the MPC regarding point and density (fan chart) forecasts of inflation and output growth.
  - Their "judgement" may or may not be informed by a model; if it is, we do not know which model or what weight it receives.
  - The fan charts are two-piece normal distributions; we use the published parameters (the OBR also uses a two-piece normal).
  - Here we focus on output growth and CPI inflation in the UK.
- We also consider judgement as captured by the Bank of England Survey of External Forecasters.
- To do this, we compare Bank and Survey point and density forecasts for UK output growth and inflation with a sophisticated statistical "vehicle": the combined (average) model used at WBS to forecast.
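As a rough illustration of the two-piece normal form mentioned above, the sketch below evaluates the density, CDF and mean of a two-piece normal with mode `mode` and separate lower and upper scale parameters `s1` and `s2`. This (mode, s1, s2) parameterisation and the function names are assumptions chosen for illustration; the mapping from the Bank's published fan chart parameters to `s1` and `s2` is not shown here.

```python
import math


def _std_normal_cdf(z):
    """Standard normal CDF via the error function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def tpn_pdf(x, mode, s1, s2):
    """Two-piece normal density: scale s1 below the mode, s2 above it."""
    s = s1 if x <= mode else s2
    norm_const = math.sqrt(2.0 / math.pi) / (s1 + s2)
    return norm_const * math.exp(-((x - mode) ** 2) / (2.0 * s * s))


def tpn_cdf(x, mode, s1, s2):
    """Two-piece normal CDF; the mass below the mode is s1 / (s1 + s2)."""
    if x <= mode:
        return 2.0 * s1 / (s1 + s2) * _std_normal_cdf((x - mode) / s1)
    upper = _std_normal_cdf((x - mode) / s2) - 0.5
    return s1 / (s1 + s2) + 2.0 * s2 / (s1 + s2) * upper


def tpn_mean(mode, s1, s2):
    """Mean of the two-piece normal; exceeds the mode when s2 > s1."""
    return mode + math.sqrt(2.0 / math.pi) * (s2 - s1)
```

With `s1 == s2` the functions collapse to the ordinary normal distribution, which is a convenient sanity check; with `s2 > s1` the density is skewed to the upside and the mean lies above the mode.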

Outline

Three types of forecaster:
- Judgement based (B of E)
- Model based (combined models)
- Survey (B of E External Forecasters)
- Future work to include: NIESR, Treasury panel, others?

Three dimensions:
- Does judgement matter for point and/or density forecasts?
- How does the role of judgement vary over time and in relation to business cycles?
- Do alternative metrics (loss functions), other than statistical ones, matter for evaluation? E.g. probability events.

Model Based Combination

- Log pool combination, using recursive (time-varying) log score weights, of 24 commonly used atheoretical econometric models.
- Empirical modelling also involves judgement: in selection, evaluation, combination etc.
- Model averaging tends to deliver robust density forecasts: no single model is likely to be correct, certainly not across time and forecast horizons, given temporal instabilities in model performance.
- Models:
  - AR(2), plus 13 single-indicator MIDAS models to exploit monthly information, e.g. industrial production, retail sales, consumer confidence, oil prices, stocks, exchange rates etc.;
  - Macro BVAR model of real GDP and CPI + 5 quarterly macro series (Consumption, Investment, Hours, Real Wages, Bank Rate), variables in levels, p=4;
  - Medium-sized BVAR model of real GDP and CPI + 13 indicators (as in the MIDAS models), 8 specifications: p=1,...,4 x Levels/Diff;
  - Medium-sized BVAR (with 15 variables) with p=4, in differences, with stochastic volatility (Carriero, Clark and Marcellino (2017)).
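The combination step can be sketched as follows. This is a minimal illustration, not the authors' implementation: it assumes each model's predictive density is normal (which need not hold for the 24 models listed), takes the weights to be proportional to the exponential of each model's cumulated past log scores, and uses hypothetical function names throughout.

```python
import math


def normal_logpdf(y, mu, sigma):
    """Log density of N(mu, sigma^2) at y, i.e. the log score of a normal forecast."""
    return -0.5 * math.log(2.0 * math.pi * sigma ** 2) - (y - mu) ** 2 / (2.0 * sigma ** 2)


def recursive_log_score_weights(past_log_scores):
    """Weights from cumulated log scores, using only outcomes realised so far.

    past_log_scores[i] is the list of log scores earned by model i to date.
    """
    totals = [sum(scores) for scores in past_log_scores]
    shift = max(totals)  # subtract the max before exponentiating, for stability
    unnormalised = [math.exp(t - shift) for t in totals]
    norm = sum(unnormalised)
    return [u / norm for u in unnormalised]


def log_pool_of_normals(mus, sigmas, weights):
    """Logarithmic pool of normal densities with weights summing to one.

    The pooled density is proportional to prod_i p_i(y)^{w_i}, which is again
    normal with precision sum_i w_i / sigma_i^2.
    """
    precision = sum(w / s ** 2 for w, s in zip(weights, sigmas))
    mean = sum(w * m / s ** 2 for w, m, s in zip(weights, mus, sigmas)) / precision
    return mean, math.sqrt(1.0 / precision)
```

Re-computing the weights each period from the log scores available at that date is what makes them recursive (time-varying): a model that has recently forecast well receives more weight in the next pooled density.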

Survey: Bank of England External Forecasters

- Quarterly survey, begun in 1996, which asks a panel of forecasters for their probabilities that the variable of interest lies in each of a number of preassigned intervals, at the end quarter of the current and next year.
- Covers City firms, academic institutions and private consultancies (see Boero, Smith and Wallis, JAE, 2014).
- We use the survey from 2006q2, when the questions switched from fixed event to fixed horizon.
- Annual output growth and CPI inflation at 1-year (h=4) and 2-year (h=8) horizons.
- Aggregate data: we fit a normal distribution to the histogram to compute the mean and variance; see Clements (2012).
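One way to fit a normal distribution to a survey histogram is to choose the mean and standard deviation whose CDF best matches the cumulative histogram probabilities at the bin boundaries. The sketch below does this by a simple least-squares grid search. The fitting criterion, the grid search, the function names and the example bins are all assumptions for illustration; the slides do not specify the procedure beyond the reference given.

```python
import math


def normal_cdf(x, mu, sigma):
    """CDF of N(mu, sigma^2) at x."""
    return 0.5 * (1.0 + math.erf((x - mu) / (sigma * math.sqrt(2.0))))


def fit_normal_to_histogram(edges, probs, mu_grid, sigma_grid):
    """Least-squares fit of a normal CDF to cumulative histogram probabilities.

    edges: interior bin boundaries in ascending order.
    probs: len(edges) + 1 probabilities, including both open-ended end bins.
    """
    cumulative = []
    running = 0.0
    for p in probs[:-1]:
        running += p
        cumulative.append(running)
    best = None
    for mu in mu_grid:
        for sigma in sigma_grid:
            sse = sum((normal_cdf(e, mu, sigma) - c) ** 2
                      for e, c in zip(edges, cumulative))
            if best is None or sse < best[0]:
                best = (sse, mu, sigma)
    return best[1], best[2]


# Hypothetical histogram: bins (<1%), (1-2%), (2-3%), (>3%).
example_edges = [1.0, 2.0, 3.0]
example_probs = [0.158655, 0.341345, 0.341345, 0.158655]
mu_grid = [i * 0.05 for i in range(81)]          # 0.00 to 4.00
sigma_grid = [0.1 + i * 0.05 for i in range(59)]  # 0.10 to 3.00
mu_hat, sigma_hat = fit_normal_to_histogram(
    example_edges, example_probs, mu_grid, sigma_grid)
```

The example probabilities were generated from a normal with mean 2 and standard deviation 1, so the fitted values should land close to those. In practice a numerical optimiser would replace the grid search.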


Does Judgement Affect Point and Density Forecasts?

Point forecasts: judgement improves point forecasts relative to combined models.
- The B of E is the best point forecaster for h=1 and h=4, particularly for inflation but also for output growth: judgement matters (information set?).
- Surveys are the preferred h=8 point forecaster relative to the Bank and the combined model, but AR(2) is hard to beat.

Density forecasts: combined models mostly improve forecast densities relative to judgement, but results are mixed.
- At h=1, combined model densities are preferred to AR(2)/Bank (no survey) for inflation and output growth.
- At h=4, surveys are preferred for output growth, but both the Bank and the combined model are preferred for inflation; the Bank and the combined model are similar for both.
- At h=8, the survey is preferred to the Bank and the combined model for output growth, but combined models are preferred to the Bank and surveys for inflation.
- Some evidence of this changing over time.

[Figure: Inflation, h=1, Combined vs B of E: cumulative squared prediction error difference, 2006-2016]
[Figure: Inflation, h=1, Combined versus B of E: cumulative log score difference, 2006-2016]
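Cumulative differences in squared prediction errors or log scores between two forecasters can be computed along the following lines. This is a small sketch with hypothetical function names; the sign convention (first forecaster minus second) is an assumption, since the figures do not state which way the difference is taken.

```python
from itertools import accumulate


def squared_errors(forecasts, outcomes):
    """Period-by-period squared prediction errors of a point forecast."""
    return [(f - y) ** 2 for f, y in zip(forecasts, outcomes)]


def cumulative_difference(loss_a, loss_b):
    """Running sum of (loss_a - loss_b) over the evaluation sample."""
    return list(accumulate(a - b for a, b in zip(loss_a, loss_b)))
```

For squared errors, a rising line under this convention means forecaster A is accumulating larger errors than forecaster B. The same function applies to log scores, where higher values are better, so a rising line then favours forecaster A.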

Probability Event Evaluation

- An alternative way of evaluating a density is through a probability event, i.e. a different loss function.
- Here, focussing on inflation, we define the event as the MPC having to write a letter, with probability Pr(π < 1% or π > 3%).
- As one diagnostic we can compute the Brier score (also its decomposition, the Kuipers score etc.; see Galbraith and van Norden (2012)):

$$ BS = \frac{1}{N} \sum_{i=1}^{N} (p_i - o_i)^2 $$

  where p_i is the probability of the event occurring and o_i takes the value 1 if the event actually occurred in period i and zero otherwise.
- Smaller values are preferred, with a perfect score being zero.
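The Brier score defined above is simple to compute once each density forecast has been turned into an event probability. The sketch below shows both steps under the assumption that the forecast density is normal; the function names are hypothetical, and the Bank's two-piece normal or a combined density would need its own CDF in place of the normal one.

```python
import math


def normal_cdf(x, mu, sigma):
    """CDF of N(mu, sigma^2) at x."""
    return 0.5 * (1.0 + math.erf((x - mu) / (sigma * math.sqrt(2.0))))


def letter_probability_normal(mu, sigma, lower=1.0, upper=3.0):
    """Pr(inflation < lower or inflation > upper) under a normal forecast density."""
    return normal_cdf(lower, mu, sigma) + 1.0 - normal_cdf(upper, mu, sigma)


def brier_score(probs, outcomes):
    """Mean squared difference between event probabilities and 0/1 outcomes."""
    if len(probs) != len(outcomes) or not probs:
        raise ValueError("probs and outcomes must be non-empty and of equal length")
    return sum((p - o) ** 2 for p, o in zip(probs, outcomes)) / len(probs)
```

For example, `brier_score([0.5, 0.5], [1, 0])` gives 0.25, the score of an uninformative coin-flip forecast, while a forecaster who always assigns probability 1 to events that occur and 0 to those that do not scores zero.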

[Figure: Probability of a letter at forecast horizon h=1 (uncond = 0.367). Series: Letter; BofE (BS=0.052); log RLS (BS=0.051). Quarterly, 2004 Q1 to 2017 Q3.]

[Figure: Probability of a letter at forecast horizon h=8 (uncond = 0.429). Series: Letter; BofE (BS=0.219); Log RLS (BS=0.308); Survey (BS=0.358 vs 0.219 and 0.308). Quarterly, 2004 Q1 to 2017 Q3.]

Summary of Results

        Point Forecast              Density Forecast
        Output Growth   Inflation   Output Growth   Inflation
h = 1   B of E          B of E      Comb            Comb
h = 4   B of E          B of E      Survey          Comb/B of E
h = 8   Survey          Survey      Survey          Comb/Survey