Econometric Methods for Valuation Analysis

Margarita Genius
Dept. of Economics, Univ. of Crete
Cagliari, 2017

Outline
- We will consider econometric models for dependent variables that can take only discrete values.
- We will focus on binary dependent variables.
- We will discuss maximum likelihood estimation of binary choice models.
- The methods will be applied, using Stata, to the estimation of willingness to pay for the preservation of a natural park.

Motivation
Very often the dependent variable in our model is discrete.

Example 1: Binary choice: the choice between commuting to work with public transportation or private transportation.
$$y_i = \begin{cases} 1 & \text{if } i \text{ goes to work with public transportation} \\ 0 & \text{if } i \text{ goes to work with private transportation} \end{cases}$$

Example 2: Ordered choices: the choice between not working, working part-time or working full-time.
$$y_i = \begin{cases} 0 & \text{if } i \text{ does not work} \\ 1 & \text{if } i \text{ works part-time} \\ 2 & \text{if } i \text{ works full-time} \end{cases}$$

Example 3: Unordered choices: the choice of commuting to work by car, by bus, by train or by other means.
$$y_i = \begin{cases} 0 & \text{if } i \text{ goes by car} \\ 1 & \text{if } i \text{ goes by bus} \\ 2 & \text{if } i \text{ goes by train} \\ 3 & \text{if } i \text{ goes by other means} \end{cases}$$

We will focus on the binary choice case only.

Binary Choice Model
If we want to estimate how different factors (explanatory variables) affect the choice between public and private transportation, the linear regression model is not well suited. Why?

The linear model is given by
$$y_i = \beta_0 + \beta_1 x_{i1} + \dots + \beta_k x_{ik} + \varepsilon_i = \beta' x_i + \varepsilon_i$$
From the regression model,
$$E(y_i \mid x_i) = \beta' x_i$$
but also, because $y_i$ is a Bernoulli variable,
$$E(y_i \mid x_i) = 1 \cdot P(y_i = 1 \mid x_i) + 0 \cdot P(y_i = 0 \mid x_i) = P(y_i = 1 \mid x_i)$$
Implication: the fitted value $\hat{\beta}' x_i$ estimates a probability, yet it need not lie between 0 and 1.
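The implication above can be illustrated with a small simulation. This is only a sketch on made-up data (the regressor grid, the error distribution and the seed are all assumptions, not part of the course material): a binary outcome is regressed on one variable by ordinary least squares, and the fitted "probabilities" are inspected.

```python
import numpy as np

# Hypothetical data: one regressor on a grid, binary outcome from a latent index.
rng = np.random.default_rng(42)
n = 500
x = np.linspace(-4.0, 4.0, n)
y = (x + rng.standard_normal(n) > 0).astype(float)

# Linear probability model: OLS of y on a constant and x.
X = np.column_stack([np.ones(n), x])
beta_hat, *_ = np.linalg.lstsq(X, y, rcond=None)
fitted = X @ beta_hat

# Fitted values are meant to estimate P(y = 1 | x), but they are unbounded.
print("smallest fitted value:", fitted.min())
print("largest fitted value: ", fitted.max())
```

For observations at the ends of the grid, the fitted values fall below 0 and above 1, which cannot be probabilities.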

Binary Choice Model: The Latent Variable Model
We define a new variable, $y_i^*$, that is not observed; we observe only its sign.

The latent variable model is
$$y_i^* = \beta_0 + \beta_1 x_{i1} + \dots + \beta_k x_{ik} + \varepsilon_i = \beta' x_i + \varepsilon_i$$
$y_i^*$ is not observed, but we observe
$$y_i = \begin{cases} 1 & \text{if } y_i^* > 0 \text{ (public transportation)} \\ 0 & \text{if } y_i^* \le 0 \text{ (private transportation)} \end{cases}$$
$y_i^*$ can be interpreted as the difference in utility between the two choices. Therefore,
$$y_i = \begin{cases} 1 & \text{if } \varepsilon_i > -\beta' x_i \\ 0 & \text{if } \varepsilon_i \le -\beta' x_i \end{cases}$$

Binary Choice Model: The Latent Variable Model
From the above we have
$$P(y_i = 1 \mid x_i) = P(\varepsilon_i > -\beta' x_i)$$
Denoting by $G$ the cumulative distribution function of $\varepsilon$ and assuming the distribution is symmetric around zero, we have
$$P(y_i = 1 \mid x_i) = P(\varepsilon_i < \beta' x_i) = G(\beta' x_i) \qquad (1)$$
Estimating the unknown parameters $\beta$ requires assuming a functional form for $G$ in (1) and using maximum likelihood estimation.

Note that
$$P(\varepsilon_i < \beta' x_i) = P\left(\frac{\varepsilon_i}{\sigma} < \frac{\beta'}{\sigma} x_i\right)$$
This implies that it is not possible to estimate both $\sigma$ (the standard deviation of the error term) and $\beta$: only the ratio $\beta / \sigma$ is identified, so we have to make an assumption about the value of $\sigma$.
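A quick numerical check of the identification point, assuming normal errors: multiplying both the coefficients and the standard deviation by the same constant leaves the choice probability unchanged. The coefficient values and the regressor vector below are hypothetical.

```python
import math

def normal_cdf(z):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def prob_y_equals_one(beta, sigma, x):
    """P(y = 1 | x) = G(beta'x / sigma) when the error is N(0, sigma^2)."""
    index = sum(b * v for b, v in zip(beta, x))
    return normal_cdf(index / sigma)

x = [1.0, 2.5]  # constant term and one regressor (hypothetical values)

p_original = prob_y_equals_one([0.4, -0.3], 1.0, x)
p_rescaled = prob_y_equals_one([4.0, -3.0], 10.0, x)  # beta and sigma both times 10

print(p_original, p_rescaled)
```

Because the two parameter sets give the same probability for every x, the data cannot distinguish between them, which is why the probit model fixes sigma at 1.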

Binary Choice Model: Maximum Likelihood Estimation
Intuition behind the maximum likelihood estimation method

The maximum likelihood method is based on the idea that
- different populations give rise to different samples;
- a given sample is more likely to come from some populations than from others.

If we assume that the population distribution is known except for some unknown parameters, then the maximum likelihood estimator of an unknown parameter, based on a sample from that population, is the value of the parameter that maximizes the probability of observing that specific sample.

Binary Choice Model: Maximum Likelihood Estimation
The Likelihood Function and the Maximum Likelihood Estimator

Let $y_1, \dots, y_n$ be a random sample from a population with density $f(y, \theta)$. The form of the density is known but the parameter $\theta$ is unknown. Then the likelihood function is given by
$$L(\theta; y_1, \dots, y_n) = f(y_1, \theta) f(y_2, \theta) \cdots f(y_n, \theta) = \prod_{i=1}^{n} f(y_i, \theta)$$
The maximum likelihood estimator of $\theta$, denoted by $\hat{\theta}$, is defined as the value of $\theta$ that maximizes $L(\theta; y_1, \dots, y_n)$. Maximizing the likelihood function is equivalent to maximizing the log-likelihood function
$$\ell(\theta; y_1, \dots, y_n) = \ln L(\theta; y_1, \dots, y_n) = \sum_{i=1}^{n} \ln f(y_i, \theta) \qquad (2)$$

The log-likelihood function is easier to maximize than the original likelihood function. However, in most cases the first-order conditions are nonlinear and require the use of optimization algorithms. Many computer programs used by economists (Stata, EViews and R, among others) perform maximum likelihood estimation.
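As a minimal illustration of (2), consider a Bernoulli sample, where f(y, p) = p if y = 1 and 1 - p if y = 0. The sketch below uses a made-up sample of ten observations and a simple grid search rather than a proper optimizer; both choices are assumptions made for readability.

```python
import math

# Hypothetical sample: seven ones and three zeros.
sample = [1, 0, 1, 1, 0, 1, 1, 0, 1, 1]

def log_likelihood(p, ys):
    """Bernoulli log-likelihood: sum of ln f(y_i, p)."""
    return sum(math.log(p) if yi == 1 else math.log(1.0 - p) for yi in ys)

# Evaluate the log-likelihood on a grid and keep the maximizer.
grid = [i / 100 for i in range(1, 100)]
p_hat = max(grid, key=lambda p: log_likelihood(p, sample))

print("ML estimate:", p_hat)
print("sample mean:", sum(sample) / len(sample))
```

In this simple case the maximizer coincides with the sample proportion of ones, which is the known closed-form ML estimator for a Bernoulli parameter.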

Binary Choice Model: Estimation by ML
In the case of the binary choice model:
- The dependent variable takes only two values: 0 and 1.
- $P(y_i = 1 \mid x_i) = G(\beta' x_i)$
- $P(y_i = 0 \mid x_i) = 1 - G(\beta' x_i)$
- The log-likelihood will be a function of $G(\beta' x_i)$.

If $G$ is assumed to be N(0,1), then we have the probit model and $\hat{\beta}$ is called the PROBIT ESTIMATOR. Recall that the density function of a N(0,1) is given by
$$g(x) = \frac{1}{\sqrt{2\pi}} e^{-x^2/2}$$
If $G$ is assumed to be logistic, then we have the logit model and $\hat{\beta}$ is called the LOGIT ESTIMATOR. The density function of the logistic with mean 0 is given by
$$g(x) = \frac{e^{x}}{(1 + e^{x})^2}$$
The N(0,1) and the logistic are both symmetric around 0, but the logistic has slightly fatter tails. The N(0,1) has variance 1 while the logistic has variance $\pi^2 / 3$.
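Combining the two probabilities above with (2), the log-likelihood of observation i is y_i ln G(beta'x_i) + (1 - y_i) ln(1 - G(beta'x_i)). The sketch below evaluates it for both choices of G on a tiny made-up dataset; the data and the trial value of beta are hypothetical, and no maximization is attempted here.

```python
import math

def normal_cdf(z):
    """G for the probit model: standard normal cdf."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def logistic_cdf(z):
    """G for the logit model: standard logistic cdf."""
    return 1.0 / (1.0 + math.exp(-z))

def log_likelihood(beta, X, y, cdf):
    """Sum over i of y_i*ln G(beta'x_i) + (1 - y_i)*ln(1 - G(beta'x_i))."""
    total = 0.0
    for xi, yi in zip(X, y):
        p = cdf(sum(b * v for b, v in zip(beta, xi)))
        total += yi * math.log(p) + (1 - yi) * math.log(1.0 - p)
    return total

# Hypothetical data: constant term plus one regressor.
X = [[1.0, 0.5], [1.0, -1.2], [1.0, 2.0], [1.0, 0.1]]
y = [1, 0, 1, 0]

for name, cdf in [("probit", normal_cdf), ("logit", logistic_cdf)]:
    at_zero = log_likelihood([0.0, 0.0], X, y, cdf)
    at_trial = log_likelihood([0.0, 1.0], X, y, cdf)
    print(name, at_zero, at_trial)
```

At beta = 0 both models assign probability 0.5 to every outcome, so the log-likelihood equals n ln(0.5); a beta that fits the data better gives a larger (less negative) value, and the ML estimator is the beta with the largest value.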

Binary Choice Model: Interpretation of estimates
In a linear regression model where
$$y_i = \beta_0 + \beta_1 x_{i1} + \dots + \beta_k x_{ik} + \varepsilon_i$$
we have
$$\beta_j = \frac{\partial E(y \mid x)}{\partial x_j}$$
and this is estimated by $\hat{\beta}_j$.

In a probit or logit model, $\hat{\beta}_j$ does not give us an estimate of the magnitude of the effect of $x_j$ on $y$; it gives us information only about the direction of the effect on $P(y = 1 \mid x)$. If $\hat{\beta}_j$ is positive, then an increase in $x_j$ will lead to an increase in $P(y = 1 \mid x)$, while if it is negative it will have the opposite effect. The exact effect of $x_j$ on $P(y = 1 \mid x)$ is called the marginal effect and can be easily computed using statistical programs.
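For a continuous regressor, the marginal effect follows from differentiating $G(\beta' x)$ with the chain rule, where $g$ denotes the density corresponding to $G$. This is the standard textbook expression rather than something stated on the slide:

```latex
\[
\frac{\partial P(y = 1 \mid x)}{\partial x_j}
  = \frac{\partial G(\beta' x)}{\partial x_j}
  = g(\beta' x)\,\beta_j
\]
```

Since $g(\beta' x) > 0$, the marginal effect has the same sign as $\beta_j$, but its size depends on the point $x$ at which it is evaluated.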

Binary Choice Model: Hypothesis Tests
For tests on an individual $\beta_j$ we use the usual t-tests, assuming we have a large sample.

As in the case of the linear regression, we can test that all the $\beta$s except $\beta_0$ are equal to zero. That is, we test whether the fit of our model is significantly better than the fit of a model that has only a constant term.
$$H_0: \beta_1 = \dots = \beta_k = 0 \qquad H_1: \text{at least one is different from zero}$$
To test this hypothesis we use the LIKELIHOOD RATIO (LR) TEST, in the following steps:
- Estimate the model with all the independent variables. Keep the value of the log-likelihood function ($\ell_1$).
- Estimate the model with a constant term only. Keep the value of the log-likelihood function ($\ell_0$).
- Compute the LR statistic, $LR = 2(\ell_1 - \ell_0)$, which for large samples has approximately a chi-squared distribution with $k$ degrees of freedom (the number of independent variables).
- If the LR statistic is larger than the critical value from the chi-squared tables, we reject the null hypothesis.
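The last step can also be carried out with a p-value instead of a table lookup. The sketch below uses the closed-form upper-tail probability of a chi-squared variable, which exists only when the degrees of freedom are even; that restriction and the helper name are choices made here to avoid external libraries. The statistic 46.746 with 4 degrees of freedom is the value reported in the probit results table later in these slides.

```python
import math

def chi2_upper_tail_even_df(x, df):
    """P(X > x) for X ~ chi-squared(df), valid for even df only.

    Uses exp(-x/2) * sum_{j=0}^{df/2 - 1} (x/2)^j / j!.
    """
    if df <= 0 or df % 2 != 0:
        raise ValueError("this closed form requires a positive even df")
    half = x / 2.0
    term, total = 1.0, 1.0
    for j in range(1, df // 2):
        term *= half / j
        total += term
    return math.exp(-half) * total

lr_stat = 46.746  # LR = 2 * (l1 - l0), as reported with the probit results
k = 4             # number of independent variables

p_value = chi2_upper_tail_even_df(lr_stat, k)
reject_h0 = p_value < 0.05

print("p-value:", p_value)
print("reject H0 at the 5% level:", reject_h0)
```

The p-value is far below any conventional significance level, so the null hypothesis that all slope coefficients are zero is rejected.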

Binary Choice Model: Goodness of Fit Measures
McFadden $R^2$ (pseudo-$R^2$):
$$R^2 = 1 - \frac{\ell(\hat{\beta})}{\ell_0}$$
where $\ell(\hat{\beta})$ is the log-likelihood evaluated at the estimates and $\ell_0$ is the log-likelihood of a model that has a constant term but no regressors.

The chi-squared test we saw before can also be used as a goodness of fit test.

Comparing probit and logit estimates: as we saw before, the normal and logistic distributions do not have the same variance, so in order to compare estimates from the two models we use the following approximation:
$$1.6\, \hat{\beta}_{probit} \approx \hat{\beta}_{logit}$$
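As a worked example, the McFadden measure can be computed for the probit model estimated later in these slides. Only the full-model log-likelihood (-191.444) and the LR statistic (46.746) are reported there; the constant-only log-likelihood below is backed out from LR = 2(l1 - l0) and is therefore a derived value, not a reported one.

```python
log_lik_full = -191.444  # reported log-likelihood of the probit model
lr_stat = 46.746         # reported chi-squared (LR) statistic

# Back out the constant-only log-likelihood from LR = 2 * (l1 - l0).
log_lik_null = log_lik_full - lr_stat / 2.0

mcfadden_r2 = 1.0 - log_lik_full / log_lik_null

print("constant-only log-likelihood:", round(log_lik_null, 3))
print("McFadden pseudo-R2:", round(mcfadden_r2, 4))
```

The measure lies between 0 and 1, with larger values indicating a larger improvement over the constant-only model.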

Binary Choice Model: Application to Valuation of Natural Park
We are going to see how the binary choice model can be used to estimate willingness to pay for an environmental good.

The data come from a study conducted by Paulo Nunes (Nunes, Paulo (2000), Contingent Valuation of the Benefits of Natural Areas and its Warm Glow Component, PhD thesis 133, FETEW, KULeuven) and discussed in Marno Verbeek's book (Marno Verbeek (2004), A Guide to Modern Econometrics).

The survey was conducted in 1997 in Portugal, and its objective was to estimate how much individuals were willing to pay to avoid the commercial and tourism development of the Alentejo Natural Park in southwest Portugal.

The survey respondents were presented with an amount (called a bid, which varies across respondents) and asked whether they would be willing to pay this amount in order to preserve the park. This is known as the SINGLE BOUND MODEL, and it can be estimated using the binary choice model described above.

Binary Choice Model: Application to Valuation of Natural Park
The table below shows the bids presented to respondents.

Table: Bids presented to respondents
Case   Bid
1      6
2      12
3      24
4      48

Application to Valuation of Natural Park: Single Bound Model
Given the above setting, the only information we have is whether the respondent is willing to pay the bid he or she is presented with. We do not observe the respondent's willingness to pay (WTP), and therefore WTP is a latent variable. We will assume that WTP depends on some characteristics of the respondent according to a linear model. Following the definition of the latent variable model, we have
$$WTP_i = \beta_0 + \beta_1 x_{i1} + \dots + \beta_k x_{ik} + \varepsilon_i = \beta' x_i + \varepsilon_i$$
$WTP_i$ is not observed, but we observe
$$y_i = \begin{cases} 1 & \text{if } WTP_i > bid_i \text{ (answers yes: is willing to pay at least } bid_i\text{)} \\ 0 & \text{if } WTP_i \le bid_i \text{ (answers no: is willing to pay less than } bid_i\text{)} \end{cases}$$
Therefore,
$$y_i = \begin{cases} 1 & \text{if } \varepsilon_i > bid_i - \beta' x_i \\ 0 & \text{if } \varepsilon_i \le bid_i - \beta' x_i \end{cases}$$

Application to Valuation of Natural Park: Single Bound Model
From the above we have
$$P(y_i = 1 \mid x_i) = P(\varepsilon_i > bid_i - \beta' x_i) \qquad (3)$$
Note that, compared to (1), (3) has $bid_i$ on the right-hand side, and this allows us to estimate both $\sigma$ and $\beta$, since
$$P(\varepsilon_i > bid_i - \beta' x_i) = P\left(\frac{\varepsilon_i}{\sigma} > \frac{bid_i}{\sigma} - \frac{\beta'}{\sigma} x_i\right) = P\left(\frac{\varepsilon_i}{\sigma} > \alpha\, bid_i - \tilde{\beta}' x_i\right)$$
where $\alpha = 1/\sigma$ and $\tilde{\beta} = \beta/\sigma$. Denoting by $G$ the cumulative distribution function of $\varepsilon/\sigma$ and assuming the distribution is symmetric, we have

Application to Valuation of Natural Park: Single Bound Model
$$P\left(\frac{\varepsilon_i}{\sigma} > \alpha\, bid_i - \tilde{\beta}' x_i\right) = G\left(\tilde{\beta}' x_i - \alpha\, bid_i\right)$$
After we estimate the model by maximum likelihood, by probit ($G$ is N(0,1)) or logit ($G$ is logistic), we can obtain estimates of $\sigma$ and $\beta$ as
$$\hat{\sigma} = \frac{1}{\hat{\alpha}}, \qquad \hat{\beta} = \hat{\tilde{\beta}}\, \hat{\sigma}$$
In the next slides we are going to look at the description of the data and the results from the probit estimator. The underlying model for WTP is given by
$$WTP = \beta_0 + \beta_1\, age + \beta_2\, sex + \beta_3\, income + \varepsilon$$
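The recovery of sigma and beta can be checked on simulated data. The sketch below is not the Alentejo survey: it draws hypothetical respondents with a single regressor (an income class), logistic errors, and bids from the four amounts in the bid table, then fits a logit by Newton's method written out by hand. Note that the estimated coefficient on the bid is minus alpha, and that with a logistic G the quantity 1/alpha is the scale parameter of the error rather than its standard deviation.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 5000

# Hypothetical data-generating process (all values are assumptions).
income = rng.integers(1, 9, size=n).astype(float)  # income class 1..8
bid = rng.choice([6.0, 12.0, 24.0, 48.0], size=n)  # the four bid amounts
beta_true = np.array([10.0, 8.0])                  # intercept, income
scale_true = 20.0                                  # logistic scale of the error
wtp = beta_true[0] + beta_true[1] * income + rng.logistic(0.0, scale_true, size=n)
y = (wtp > bid).astype(float)                      # 1 = accepts the bid

# Regressors: constant, income, bid.
X = np.column_stack([np.ones(n), income, bid])

# Newton's method for the logit log-likelihood.
theta = np.zeros(X.shape[1])
for _ in range(50):
    p = 1.0 / (1.0 + np.exp(-X @ theta))
    score = X.T @ (y - p)                           # gradient
    hessian = -(X * (p * (1.0 - p))[:, None]).T @ X
    step = np.linalg.solve(hessian, score)
    theta = theta - step
    if np.max(np.abs(step)) < 1e-10:
        break

alpha_hat = -theta[2]            # the bid coefficient estimates -alpha
scale_hat = 1.0 / alpha_hat      # estimate of the error scale
beta_hat = theta[:2] * scale_hat # estimates of intercept and income effect in WTP

print("estimated scale:", scale_hat, "(true value 20)")
print("estimated beta: ", beta_hat, "(true values 10 and 8)")
```

The rescaled estimates should be close to the values used to generate the data, up to sampling error.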

Application to Valuation of Natural Park: Single Bound Model
Table: Frequency for answers to the two bids
answers   Freq   Percent
no        141    45.19
yes       171    54.81
N         312

Application to Valuation of Natural Park: Single Bound Model
Table: Summary statistics
Variable   Mean     Std. Dev.   Min.   Max.
bid        22.577   15.982      6      48
y          0.548    0.498       0      1
age        3.029    1.564       1      6
sex        0.442    0.497       0      1
income     2.516    1.33        1      8
answers    1.34     1.179       0      3
N          312

Here bid and y are the bid and the response to the valuation question. We can see that 54.8% of the respondents accepted the bid and that 44.2% are male. Age has been grouped into six classes and income into eight classes.

Application to Valuation of Natural Park: Single Bound Model
To estimate a probit model in Stata, you can either use menu-driven commands or type commands directly in the command window.

Menu-driven commands: click the following selections in the top menu: Statistics, Binary Outcome, Probit Regression. Then select the dependent variable (y) and the independent variables (bid, age, sex, income). In addition to getting the results, you can see the exact command that you would need to use if working in the command-driven environment.

Command-driven estimation: type
    probit y bid age sex income

The next slide shows a screenshot of the estimation results for the probit model. The slide after that shows a shorter version of the results, produced with a Stata facility for exporting tables to slides.

Application to Valuation of Natural Park: Single Bound Model
All coefficients except the constant are statistically significant at the 5% level (see the t-statistics, 95% confidence intervals and p-values). The LR test has a p-value of essentially 0; therefore the model with the four variables fits significantly better than a model with just a constant.

Application to Valuation of Natural Park: Estimation
Table: Probit Estimation Results: Dep = y1
Variable         Coefficient   (Std. Err.)
bid              -0.012        (0.005)
age              -0.223        (0.051)
sex               0.364        (0.151)
income            0.147        (0.061)
Intercept         0.540        (0.281)
N                 312
Log-likelihood   -191.444
χ²(4)             46.746
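As an illustration of the marginal effects mentioned earlier, the sketch below evaluates g(b'x) times b_j at the sample means, using the rounded probit coefficients from the table above and the means from the summary statistics table. Evaluating at the means is one common convention and an assumption here; for the dummy variable sex the derivative is only an approximation to the discrete change.

```python
import math

# Rounded probit coefficients from the estimation table.
coef = {"bid": -0.012, "age": -0.223, "sex": 0.364, "income": 0.147}
intercept = 0.540

# Sample means from the summary statistics table.
means = {"bid": 22.577, "age": 3.029, "sex": 0.442, "income": 2.516}

# Probit index evaluated at the sample means.
index = intercept + sum(coef[name] * means[name] for name in coef)

# Standard normal density at the index.
density = math.exp(-0.5 * index ** 2) / math.sqrt(2.0 * math.pi)

# Marginal effect of each variable on P(y = 1 | x) at the means.
marginal_effects = {name: density * b for name, b in coef.items()}

for name, effect in marginal_effects.items():
    print(f"{name:>6}: {effect:+.4f}")
```

Each marginal effect has the same sign as the corresponding coefficient and is roughly 0.4 times its size, because the normal density is close to its maximum when the index is near zero.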

Application to Valuation of Natural Park: Estimation
Recall that the estimates we get from the probit model correspond to $\alpha = 1/\sigma$ and $\tilde{\beta} = \beta/\sigma$. The estimates of $\sigma$ and $\beta$ are then given by

Table: Probit Estimates of σ and β
Variable    Coefficient
sigma       85.282
age         -19.059
sex         31.068
income      12.570
Intercept   46.076
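The table above can be approximately reproduced from the probit output. The sketch below starts from the coefficients as printed to three decimals, so the results differ from the table by roughly 2 to 3 percent; the table itself was presumably computed from unrounded estimates. The coefficient on bid estimates minus alpha, hence the sign change.

```python
# Probit coefficients as printed (three decimals).
probit = {"bid": -0.012, "age": -0.223, "sex": 0.364, "income": 0.147, "Intercept": 0.540}

# Values reported in the table of sigma and beta, for comparison.
reported = {"sigma": 85.282, "age": -19.059, "sex": 31.068, "income": 12.570, "Intercept": 46.076}

alpha_hat = -probit["bid"]    # the bid enters the index as -alpha * bid
sigma_hat = 1.0 / alpha_hat

recovered = {"sigma": sigma_hat}
for name in ("age", "sex", "income", "Intercept"):
    recovered[name] = probit[name] * sigma_hat   # beta = beta_tilde * sigma

for name, value in recovered.items():
    print(f"{name:>9}: recovered {value:8.3f}   reported {reported[name]:8.3f}")
```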

Application to Valuation of Natural Park: Interpretation
From the specification of our model,
$$WTP = \beta_0 + \beta_1\, age + \beta_2\, sex + \beta_3\, income + \varepsilon$$
- Older people are less willing to pay for the preservation of the natural park than younger people of the same sex and income class (ceteris paribus). As we move from one age class to the next (leaving the other variables unchanged), average willingness to pay decreases by about 19 euros.
- Males are willing to pay on average about 31 euros more than females with the same characteristics (age and income).
- Willingness to pay increases with income. As we move from one income class to the next (leaving the other variables unchanged), average willingness to pay increases by 12.57 euros.

The mean WTP, given by
$$E(WTP) = \beta_0 + \beta_1\, age + \beta_2\, sex + \beta_3\, income$$
can be estimated using the coefficient estimates and the mean values of the independent variables:
$$\widehat{E(WTP)} = 46.076 - 19.059 \times 3.03 + 31.068 \times 0.44 + 12.57 \times 2.52 = 33.67 \text{ euros}$$
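The mean WTP arithmetic can be verified directly. The sketch below uses the estimates from the table of sigma and beta and the sample means from the summary statistics table rounded to two decimals (3.03, 0.44 and 2.52); using the unrounded means instead changes the result only in the first decimal place.

```python
# Estimates of the WTP equation (from the table of sigma and beta).
beta = {"Intercept": 46.076, "age": -19.059, "sex": 31.068, "income": 12.570}

# Sample means rounded to two decimals (3.029, 0.442 and 2.516 in the summary table).
means = {"age": 3.03, "sex": 0.44, "income": 2.52}

mean_wtp = beta["Intercept"] + sum(beta[name] * means[name] for name in means)

print("estimated mean WTP:", round(mean_wtp, 2), "euros")
```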
