Agricultural and Applied Economics 637 Applied Econometrics II

Assignment I: Using Search Algorithms to Determine Optimal Parameter Values in Nonlinear Regression Models (Due: February 3, 2015)

(Note: Make sure you hand in the computer code and output files used in answering the following questions. If your output contains a listing of iterative results, edit the output file so only the first few and last few iterations are shown. There is no need to waste paper.)

(60 Total Points)

1. (40 pts) As you will discover throughout the semester, unlike estimating the parameters of the classical regression model (CRM), finding the optimal parameters of a nonlinear (in parameters) regression model usually requires an iterative search process. What constitutes "optimal" obviously depends on the objective function used to guide the choice of preferred parameter values. For example, are you trying to find parameter values that minimize the sum of squared differences between predicted and actual dependent variable values (i.e., the sum of squared errors, SSE), or are you trying to find parameter values that maximize the joint probability of obtaining the endogenous variable values you have in your dataset? Unlike the CRM, a nonlinear regression model requires you to make an initial guess as to parameter values and then check whether these parameter values are indeed optimal. If they are not, the estimation algorithm you are using should have specific procedures for obtaining new, updated parameter estimates. This parameter updating is an iterative process in which each change to the parameter vector improves the parameters, where improvement is defined relative to the algorithm's objective function.

Over the next few weeks you will be learning alternative methods for conducting the above iterative parameter-selection process. We will undertake this estimation assuming alternative algorithm objective functions. One method we will not review in any depth is the Grid Search method, given its severe limitations as to the number of optimal parameters that can be identified. Under the Grid Search method you divide the feasible range of parameter values into a finite grid of discrete values and then evaluate the impact of all parameter combinations on the defined objective function (e.g., the SSE value, the log-likelihood function (LLF) value, etc.).

a) (10 pts) Let's assume you want to estimate the following relationship between annual per capita U.S. gasoline quantity (not expenditures) (PC_Gas_Qt), lagged consumption (PC_Gas_Qt-1) and the gasoline price index (Gas_Pt):

PC_Gas_Qt = Gas_Pt^βP · PC_Gas_Qt-1^βQ + εt   (1.1)

where t = 1953-2004, the β's are unknown parameters whose values you are trying to estimate, and εt is the error term for the t-th year with εt ~ (0, σ²). Note the lag structure depicted in (1.1). Also note that (1.1) cannot be linearized with respect to the parameters given the assumed error structure. I would like you to develop MATLAB code that determines the values of βP and βQ that minimize the SSE from predicting PC_Gas_Qt via the Grid Search method. The code you develop should display (and write to an output file) your estimate of σ² conditional on these final estimates. Remember, under the Grid Search method you use a finite number of pre-defined grid points and compare the SSEs across this finite number of candidate parameter combinations. How do you evaluate whether this pair of parameters generates a global versus a local minimum SSE value? In contrast to the CRM, for many regression models that are nonlinear in parameters the SSE function may not be globally convex.
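As a concrete illustration of the Grid Search logic described above, here is a minimal sketch, written in Python purely for illustration (the assignment itself asks for MATLAB). It evaluates the SSE of a model of the form (1.1) over a fixed grid of candidate (βP, βQ) pairs and keeps the pair with the smallest SSE. The series is simulated from known parameters so the sketch is self-contained; the actual data come from gas_market_1_15.xlsx.

```python
# Hypothetical stand-in for the assignment's data: a short noiseless series
# generated from known parameters, so the sketch is self-contained.
def simulate_series(beta_p, beta_q, price, y0):
    y = [y0]
    for p in price:
        y.append(p ** beta_p * y[-1] ** beta_q)  # noise omitted for clarity
    return y

def sse(beta_p, beta_q, price, y):
    # Sum of squared errors for y_t = Gas_P_t^betaP * y_{t-1}^betaQ + e_t
    total = 0.0
    for t in range(1, len(y)):
        pred = price[t - 1] ** beta_p * y[t - 1] ** beta_q
        total += (y[t] - pred) ** 2
    return total

def grid_search(price, y, p_grid, q_grid):
    # Evaluate every (betaP, betaQ) combination; keep the smallest SSE.
    best = None
    for bp in p_grid:
        for bq in q_grid:
            val = sse(bp, bq, price, y)
            if best is None or val < best[2]:
                best = (bp, bq, val)
    return best

price = [1.1, 1.3, 1.2, 1.5, 1.4, 1.6]
y = simulate_series(-0.4, 0.8, price, 2.0)
grid = [i / 20 for i in range(-20, 21)]   # -1.00 to 1.00 in steps of 0.05
bp, bq, best_sse = grid_search(price, y, grid, grid)
```

Because this simulated series is noiseless and the true parameters happen to lie on the grid, the search recovers them exactly; with real data the answer is only as good as the grid's resolution, which is precisely the limitation the General Search refinement in 1(b) addresses.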

For the first 10 and last 10 iterations, have your software print out:

i. The iteration number;
ii. The current parameter pair values; and
iii. The resulting SSE values.

In which direction should the SSE values move as more iterations are undertaken, assuming a correctly working estimation system and reasonable starting values? Does the behavior of your iterations follow this pattern?

The data you will be using for this question is a dataset containing annual total U.S. gasoline expenditures and other aggregate U.S. data encompassing the period 1953-2004. These data are contained in the file gas_market_1_15.xlsx, which can be obtained from the class website. You will have to create some of the variables used in (1.1), given that you only have the raw market data.

b) (15 pts) To further refine your parameter estimates, I would like you to modify the code you developed in 1(a) so that once you obtain your parameter estimates via the Grid Search method, you use these values as starting points for a more refined General Search algorithm. Under the General Search algorithm you take the optimal Grid Search parameter values and then examine relative SSE values within the neighborhood of these values. This refined General Search algorithm for a single parameter (ρ) can be illustrated via the diagram shown to the right. In words, the iterative General Search algorithm proceeds as follows:

i. Use the Grid Search estimates of βP and βQ as starting values;
ii. Given the above estimate of βQ, compare the current SSE value with the SSE values obtained when βP is slightly larger and slightly smaller than the Grid Search parameter estimate;
iii. Of the two new candidate values of βP (i.e., larger and smaller), identify the value that generates a smaller SSE than the Grid Search value used in (i), and adopt it as your new updated βP estimate;
iv. Continue to change your βP value in the direction identified in (iii) until the SSE starts to increase;
v. When the SSE starts to increase, reverse the direction of the change in the parameter value and continue generating new estimates of βP until the SSE starts to increase again. In this iteration, make the absolute value of the change in the parameter value smaller than that used in (iv) (or the previous iteration);
vi. Repeat step (v) until you feel you are close enough to the true but unknown value of βP conditional on the fixed value of βQ;
vii. Given the βP value obtained in (vi), undertake the same iterative process starting with step (ii), but instead of changing βP, change βQ;
viii. Repeat the iterative process starting with step (vii), but this time changing βP given the current value of βQ obtained in (vii), etc.

In developing this new General Search algorithm you will need to address several questions:

i. What is the magnitude of the parameter steps (changes) that I should use to move from one parameter value to another? (Note: the parameter step is the absolute value of the parameter value change from one iteration to the next.) Specifically, how do I determine a parameter-specific step length, given that the parameters can vary significantly in size?
ii. What new step size should I use whenever I reverse the search direction?
iii. What criteria do I use to determine whether my current parameter estimates are close enough to the true but unknown optimal parameter values?
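The steps above amount to a coordinate-wise search: optimize one parameter at a time, reversing and shrinking the step when the objective starts to rise. The following sketch (again in Python for illustration; the function name and the shrink-by-half rule are my own choices, not prescribed by the assignment) implements that idea for an arbitrary objective f and an arbitrary number of parameters, which also anticipates the "any number of parameters" requirement in 1(c).

```python
def coordinate_search(f, params, step=0.01, shrink=0.5, tol=1e-8, max_iter=10000):
    # One-parameter-at-a-time search following steps (ii)-(viii) above:
    # move a single parameter in the direction that lowers the objective,
    # shrink the step when neither direction helps, then rotate to the
    # next parameter. f is the objective (e.g., the SSE) on a parameter list.
    params = list(params)
    best = f(params)
    for _ in range(max_iter):
        improved = False
        for i in range(len(params)):
            s = step
            while s > tol:
                for direction in (+1, -1):
                    trial = params[:]
                    trial[i] += direction * s
                    val = f(trial)
                    if val < best:          # keep moving while the objective falls
                        params, best = trial, val
                        improved = True
                        break
                else:
                    s *= shrink             # both directions worse: shrink the step
        if not improved:                    # no step >= tol helps: converged
            break
    return params, best

# Usage on a simple convex objective with known minimum at (1, -2):
f = lambda p: (p[0] - 1) ** 2 + (p[1] + 2) ** 2
params, best = coordinate_search(f, [0.0, 0.0])
```

The tolerance `tol` plays the role of the convergence criterion asked about in (iii): the search stops when no step of at least that size improves the objective for any parameter.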

Present in words a summary of the algorithm you developed. How did you address the issues raised in (i)-(iii) above? Similar to 1(a), your program should present your first five iterations as well as the last five, with the final estimated parameters, SSE and your final estimate of σ² generated from the final updated parameter vector. What are your results? How many iterations did it take before you could say you had obtained parameter values that minimize the SSE function?

c) (15 pts) Finally, let's extend the methodologies you developed in sections (a) and (b) above to estimate the following:

PC_Gas_Qt = Gas_Pt^βP · PC_Gas_Qt-1^βQ · NC_Pt^βN + εt   (1.2)

where NC_Pt is the price of new cars. We now have four parameters to be estimated: βP, βQ, βN, and σ². Present your final estimates of these parameters. Your program should be designed to handle any number of parameters without changing the iteration code: the model size should be determined dynamically so that the same code can estimate optimal parameter vectors regardless of size. What were your starting parameter, SSE and σ² values? What are your final parameter, SSE and σ² values? What was your convergence criterion? How many iterations did it take to generate your final parameter estimates?

2. (20 pts) When attempting to determine optimal parameter values in (1) we did not make any assumption concerning the shape of the distribution of the error term, εt, other than that E(εt) = 0 and that its variance (i.e., σ²) is homoscedastic and non-autocorrelated. Another method that can be used to obtain parameter estimates is to make an assumption concerning the data generating process of our observed dependent variable, PC_Gas_Qt (and therefore εt). Once an assumption is made concerning the dependent variable's probability distribution, one can choose as the preferred parameter values those that maximize the joint probability of observing our T dependent variable values, PC_Gas_Qt (t = 1,...,T).

The typical assumption is that the dependent variable values are independently and identically distributed (i.e., iid). This implies that, given the relationships represented in (1.1), we have via the Markov property:

f(y1, y2,...,yT) = p(y1) p(y2|y1) ··· p(yT|yT-1) = p(ε1) p(ε2) ··· p(εT)   (2.1)

where the yt's are our dependent variable values (t = 1,...,T). Let's assume that our error terms, the ε's, are iid normally distributed. This implies that our dependent variable (PC_Gas_Qt) is also normally distributed. Incorporating this additional information, we can restate (1.1) as:

PC_Gas_Qt = Gas_Pt^βP · PC_Gas_Qt-1^βQ + εt   (2.2)

where for this application we have εt ~ N(0, σ²). Given the above normality assumption, we can represent the natural logarithm of the joint PDF of the T observations of our dependent variable as:

ln f(y1,...,yT | β̂) = −0.5 Σt=1..T [ ln(2π) + ln(σ̂²) + ε̂t²/σ̂² ]   (2.3)

where β̂ is the current estimate of β. Given that we are treating our data as given, our objective is to choose the values of βP and βQ that maximize the logarithm of the joint probability (i.e., eq. 2.3) of observing the data we actually have. We can thereby define the data sample's log-likelihood function, L(β̂), where:

L(β̂ | y1,...,yT) = ln f(y1,...,yT | β̂) = −0.5 Σt=1..T [ ln(2π) + ln(σ̂²) + ε̂t²/σ̂² ]   (2.4)

(Hint: Remember the formula for an unbiased estimate of σ² under the CRM and how one identifies the error term vector given current parameter estimates.) A depiction of a log-likelihood function for a single parameter, θ, is shown in the figure to the right, which also displays the general search procedure for identifying the optimal parameter value.
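To make equation (2.4) concrete (and to reflect the hint about estimating σ²), the sketch below evaluates the sample log-likelihood for candidate (βP, βQ) values by forming the residuals of (2.2) and replacing σ̂² with its maximum likelihood estimate SSE/T given those βs. Python is used for illustration; the function name is my own.

```python
import math

def log_likelihood(beta_p, beta_q, price, y):
    # Residuals of (2.2): e_t = y_t - Gas_P_t^betaP * y_{t-1}^betaQ
    resid = [y[t] - price[t - 1] ** beta_p * y[t - 1] ** beta_q
             for t in range(1, len(y))]
    T = len(resid)
    sigma2 = sum(e * e for e in resid) / T   # ML estimate of sigma^2 given the betas
    # Equation (2.4): -0.5 * sum_t [ln(2*pi) + ln(sigma2) + e_t^2 / sigma2]
    return -0.5 * sum(math.log(2 * math.pi) + math.log(sigma2) + e * e / sigma2
                      for e in resid)
```

Because σ̂² is concentrated out, (2.4) collapses to −(T/2)[ln(2π) + ln(SSE/T) + 1], so maximizing the log-likelihood over (βP, βQ) is equivalent to minimizing the SSE under the normality assumption, and the search machinery from question 1 can be reused with this function as the objective.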

a) (15 pts) Modify the General Search method you developed for 1(a) to estimate the two parameters using the maximum likelihood approach instead. Your task is to find the values of these two parameters that maximize the value of (2.4). The following figure depicts a similar problem, but with a different log-likelihood function and two parameters, Beta_1 and Beta_2. Besides the final maximum likelihood parameter estimates, you should also display the final total sample log-likelihood function value.

b) (5 pts) Generate a graph similar to the above for parameter values surrounding the final parameter estimates. Graphically identify the optimal values of βP and βQ.

Extra Credit (Due Feb. 10, 2015): The above questions have been devoted to using search methods to obtain estimates of a limited number of unknown regression parameters. Obviously, to examine the properties of these point estimates we need to know the distribution of these estimates. I would like you to propose a method by which you can obtain parameter estimate standard errors. There is no single method for doing this. Develop MATLAB code to implement your proposed algorithm and apply it to the regression model analyzed in question #2 above. Modify your output to include these standard errors in a final results table.
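One common proposal for the extra credit question (by no means the only one) is the asymptotic ML approach: approximate the Hessian of the log-likelihood at the final estimates by finite differences, then take standard errors as the square roots of the diagonal of the inverse of the negative Hessian. The sketch below (Python for illustration; the helper names and step size h are my own choices) does this for the two-parameter case; resampling the residuals (a bootstrap) would be a reasonable alternative proposal.

```python
import math

def hessian_2x2(f, x, h=1e-5):
    # Central finite-difference Hessian of a two-parameter function f at x.
    def f2(a, b):
        return f([a, b])
    a, b = x
    faa = (f2(a + h, b) - 2 * f2(a, b) + f2(a - h, b)) / h**2
    fbb = (f2(a, b + h) - 2 * f2(a, b) + f2(a, b - h)) / h**2
    fab = (f2(a + h, b + h) - f2(a + h, b - h)
           - f2(a - h, b + h) + f2(a - h, b - h)) / (4 * h**2)
    return [[faa, fab], [fab, fbb]]

def std_errors(loglik, mle):
    # Asymptotic standard errors: sqrt of the diagonal of the inverse of the
    # negative Hessian of the log-likelihood, evaluated at the ML estimates.
    H = hessian_2x2(loglik, mle)
    neg = [[-H[0][0], -H[0][1]], [-H[1][0], -H[1][1]]]
    det = neg[0][0] * neg[1][1] - neg[0][1] * neg[1][0]
    inv = [[neg[1][1] / det, -neg[0][1] / det],
           [-neg[1][0] / det, neg[0][0] / det]]
    return [math.sqrt(inv[0][0]), math.sqrt(inv[1][1])]

# Check against a log-likelihood with known curvature: for
# L(theta) = -0.5*((theta1/2)^2 + (theta2/3)^2), maximized at (0, 0),
# the implied standard errors are 2 and 3.
se = std_errors(lambda p: -0.5 * ((p[0] / 2) ** 2 + (p[1] / 3) ** 2), [0.0, 0.0])
```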
