Part I: Discrete Choice Models (Theory and Applications)


Mauricio Sarrias
Universidad Católica del Norte
Workshop SOCHER 2017
Fondecyt Project N , Individual-specific inference for choice models
October 5, 2017

1 Binary outcomes
   Introduction to Linear Probability Model (LPM)
   WLS
   Example
   Non-Linear Probability Model
   Probit and Logit
   Estimation
   Marginal Effects
   Goodness-of-Fit
   Stata Example
2 Ordinal outcomes
   Motivation
   Latent Approach
   ML Estimation
   Interpretation of Parameters
   Stata Example
3 Multinomial Logit Model
   Introduction
   RUM
   Probabilities and Estimation
   Formulas using gmnl package
   An example

Outline of this workshop
Part I: Discrete choice models (Theory and Applications):
- Binary outcomes.
- Ordered outcomes.
- Multinomial Logit Model.
Part II: Discrete choice models and individual-heterogeneity (Theory and Applications):
- Binary outcomes.
- Multinomial Logit Model.


Binary Dependent Variable
We will study methods for estimating models with a binary dependent variable:
$$y_i = \begin{cases} 1 & \text{if some event occurs} \\ 0 & \text{if the event does not occur} \end{cases}$$
Some examples:
- Is an adult a member of the labor force?
- Did a citizen vote in the last election?
- Does a high school student decide to go to college?
- Is a consumer more likely to buy the same brand or try a new brand?
- Does the individual migrate?

Binary Dependent Variable
So, the question is: how do we estimate a model when the dependent variable is binary?
The first approach is to apply OLS as if the dependent variable were continuous.

What if OLS...?
The linear probability model is the regression model applied to a binary dependent variable. The structural model is:
$$y_i = x_i'\beta + \varepsilon_i,$$
where
$$y_i = \begin{cases} 1 & \text{if some event occurs} \\ 0 & \text{if the event does not occur} \end{cases}$$
When $y$ is a binary random variable, then:
$$E(y_i \mid x_i) = [1 \cdot \Pr(y_i = 1 \mid x_i)] + [0 \cdot \Pr(y_i = 0 \mid x_i)] = \Pr(y_i = 1 \mid x_i) = x_i'\beta$$

Figure: Linear Probability Model for a Single Independent Variable

Problems with the LPM
- Heteroskedasticity: If a binary random variable has mean $\mu$, then its variance is $\mu(1 - \mu)$. Then:
  $$\operatorname{Var}(y_i \mid x_i) = \Pr(y_i = 1 \mid x_i)\,[1 - \Pr(y_i = 1 \mid x_i)] = x_i'\beta\,(1 - x_i'\beta),$$
  which implies that the variance of the errors depends on the $x$'s and is not constant.
- Nonsensical Predictions: The LPM can predict values of $y$ that are negative or greater than 1.
- Functional Form: Since the model is linear, a unit increase in $x_k$ results in a change of $\beta_k$ in the probability of an event. The increase is the same regardless of the current value of $x$.
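The first two problems can be made concrete with a small Python sketch. The coefficients `alpha = -0.2` and `beta = 0.15` and the grid of x values are purely hypothetical, chosen only so that some fitted values leave the unit interval.

```python
def lpm_prob(x, alpha=-0.2, beta=0.15):
    """Fitted 'probability' from a linear probability model (hypothetical coefficients)."""
    return alpha + beta * x


def lpm_variance(p):
    """Conditional variance of a binary outcome with mean p: p * (1 - p)."""
    return p * (1.0 - p)


xs = [0, 2, 4, 6, 9]                      # hypothetical values of the regressor
probs = [lpm_prob(x) for x in xs]
variances = [lpm_variance(p) for p in probs]

for x, p, v in zip(xs, probs, variances):
    flag = "" if 0.0 <= p <= 1.0 else "  <- outside [0, 1]"
    print(f"x={x:2d}  p_hat={p:5.2f}  implied var={v:6.3f}{flag}")
```

The implied variance changes with x, and it even turns negative whenever the fitted value falls outside [0, 1], which is one reason weights of the form 1/[p(1 - p)] can break down in practice.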

Problems with the LPM: More on Functional Form
Consider the following two specifications:
$$y_i = \alpha + \beta x_i + \delta d_i + \varepsilon_i$$
$$y_i = F(\alpha + \beta x_i + \delta d_i)$$
where $d_i$ is a dummy variable.
The discrete change in $y$ as $d$ changes from 0 to 1, holding $x$ constant, is:
$$\frac{\Delta y}{\Delta d} = (\alpha + \beta x_i + \delta \cdot 1) - (\alpha + \beta x_i + \delta \cdot 0) = \delta$$
For our second function, the discrete change is?
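One way to work out the answer to the question on this slide, applying the same definition of a discrete change to the second specification (a sketch, with $F$ left as a generic non-linear function):

```latex
\frac{\Delta y}{\Delta d}
  = F(\alpha + \beta x_i + \delta \cdot 1) - F(\alpha + \beta x_i + \delta \cdot 0)
  = F(\alpha + \beta x_i + \delta) - F(\alpha + \beta x_i)
```

Unlike the linear case, this difference does not reduce to $\delta$: it depends on the level of $x_i$ at which it is evaluated.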



WLS
Since we know the exact form of the heteroskedasticity function, we can use Weighted Least Squares (WLS). In this case:
$$E(\varepsilon_i^2 \mid X) = \operatorname{Var}(\varepsilon_i \mid X) = \sigma_0^2\, h_i(X) \tag{1}$$
where
$$h_i(X) = x_i'\beta\,(1 - x_i'\beta) \tag{2}$$
Then we can compute the WLS estimator:
$$\hat\beta_{WLS} = \left[\sum_{i=1}^{n} w_i x_i x_i'\right]^{-1} \left[\sum_{i=1}^{n} w_i x_i y_i\right], \qquad w_i = 1/h_i \tag{3}$$


Determinants of Personal Computer Ownership
Consider the following model:
$$PC_i = \beta_0 + \beta_1\, hsgpa_i + \beta_2\, ACT_i + \beta_3\, parcoll_i + \varepsilon_i \tag{4}$$
where:
- PC: binary indicator equal to unity if the student owns a computer, zero otherwise.
- hsgpa: high school GPA.
- ACT: achievement test score.
- parcoll: binary indicator equal to unity if at least one parent attended college. [1]
We use the data in GPA1.dta to estimate the probability of owning a computer.
[1] Separate college indicators for the mother and father do not yield individually significant results, as these are pretty highly correlated.

    * Open Data
    cd "/Users/mauriciosarrias/Documents/Clases/Discrete Choice Mode
    use "GPA1.DTA", clear
    * Gen parcoll: equal to 1 if at least one parent attended college
    quietly {
        gen parcoll = 0
        replace parcoll = 1 if fathcoll == 1 | mothcoll == 1
        reg PC hsgpa ACT parcoll
    }
    * Plot Predicted Values
    quietly margins, at(hsgpa = (16(0.5)33)) atmeans
    marginsplot, yline(0, lcolor(red)) yline(1, lcolor(red))


    * Predicted value
    qui reg PC hsgpa ACT parcoll
    predict yhat, xb
    sum yhat

    Variable Obs Mean Std. Dev. Min M
    yhat

    * Weights
    gen h_hat = yhat * (1 - yhat)

    * Models
    quietly eststo ols : reg PC hsgpa ACT parcoll
    quietly eststo olsr: reg PC hsgpa ACT parcoll, robust
    quietly eststo wls : reg PC hsgpa ACT parcoll [aweight = 1/h_hat]
    esttab ols olsr wls, b se ///
        mtitle("OLS" "OLS Rob" "WLS")

                     (1)         (2)         (3)
                     OLS     OLS Rob         WLS
    hsgpa
                 (0.137)     (0.141)     (0.130)
    ACT
                (0.0155)    (0.0161)    (0.0155)
    parcoll       0.221*      0.221*      0.215*
                (0.0930)    (0.0880)    (0.0863)
    _cons
                 (0.491)     (0.496)     (0.477)
    N
    Standard errors in parentheses

WLS: Final Remarks
WLS and robust standard errors help us deal with heteroskedasticity, but they do not solve the problems related to interpretation.


Latent Variable Approach
The latent $y^*$ is assumed to be linearly related to the observed $x$'s through the structural model:
$$y_i^* = x_i'\beta + \varepsilon_i$$
The latent variable $y^*$ is linked to the observed binary variable $y$ by the measurement equation:
$$y_i = \begin{cases} 1, & \text{if } y_i^* > \tau \\ 0, & \text{if } y_i^* \le \tau \end{cases} \tag{5}$$


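Latent Variable Approach
Setting $\tau = 0$:
$$\begin{aligned}
\Pr(y_i = 1 \mid x_i) &= \Pr(y_i^* > 0 \mid x_i) \\
&= \Pr(x_i'\beta + \varepsilon_i > 0 \mid x_i) \\
&= \Pr(\varepsilon_i > -x_i'\beta \mid x_i) \\
&= 1 - \Pr(\varepsilon_i \le -x_i'\beta \mid x_i) && \text{since } \Pr(X > x) = 1 - \Pr(X \le x) \\
&= \Pr(\varepsilon_i \le x_i'\beta \mid x_i) && \text{by symmetry} \\
&= F(x_i'\beta) = \int_{-\infty}^{x_i'\beta} f(\varepsilon_i)\, d\varepsilon_i
\end{aligned}$$
As usual, we assume that $E(\varepsilon \mid X) = 0$.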



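Probit: Error term distributed as Normal
When $\varepsilon$ is normal with $E(\varepsilon \mid X) = 0$ and $\operatorname{Var}(\varepsilon \mid X) = 1$, the pdf is
$$\phi(\varepsilon) = \frac{1}{\sqrt{2\pi}} \exp\left(-\frac{\varepsilon^2}{2}\right)$$
and the cumulative distribution function (cdf) is:
$$\Phi(\varepsilon) = \int_{-\infty}^{\varepsilon} \frac{1}{\sqrt{2\pi}} \exp\left(-\frac{t^2}{2}\right) dt$$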

Logit: Error term distributed as Logistic
In the logistic model, the errors are assumed to have a standard logistic distribution with mean 0 and variance $\pi^2/3$. This unusual variance is chosen because it results in a particularly simple equation for the pdf:
$$\lambda(\varepsilon) = \frac{\exp(\varepsilon)}{[1 + \exp(\varepsilon)]^2}$$
and an even simpler equation for the cdf:
$$\Lambda(\varepsilon) = \frac{\exp(\varepsilon)}{1 + \exp(\varepsilon)}$$

Figure: Normal and Logistic Distribution

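Probit
If the $\varepsilon_i$'s are independently and normally distributed, $\varepsilon_i \sim N(0, \sigma^2)$, then
$$\begin{aligned}
\Pr(y_i = 1 \mid x_i) &= \Pr\left(\frac{\varepsilon_i}{\sigma} > -\frac{x_i'\beta}{\sigma} \,\Big|\, x_i\right) \\
&= 1 - \Phi\left(-\frac{x_i'\beta}{\sigma}\right) \\
&= \Phi\left(\frac{x_i'\beta}{\sigma}\right) && \text{by symmetry of the standard normal distribution}
\end{aligned}$$
The probability depends on $\beta$ and $\sigma$, but only the ratio $\beta/\sigma$ is identified, not the individual parameters $\beta$ and $\sigma$. For instance, if $\beta$ and $\sigma$ are each multiplied by a constant $c$, the probability remains unchanged. Typically, we set $\sigma = 1$ as a normalization.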
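The identification argument can be checked numerically. The Python sketch below evaluates the probit probability before and after rescaling both the coefficients and the error standard deviation by the same constant; the regressor values, coefficients, and the constant `c` are hypothetical.

```python
import math


def std_normal_cdf(z):
    """Standard normal cdf, written in terms of the error function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def probit_prob(x, beta, sigma=1.0):
    """Pr(y = 1 | x) = Phi(x'beta / sigma) for a list of regressors x."""
    index = sum(xk * bk for xk, bk in zip(x, beta))
    return std_normal_cdf(index / sigma)


x = [1.0, 0.5, -2.0]      # hypothetical regressors (first entry is the constant)
beta = [0.3, 1.2, 0.4]    # hypothetical coefficients
c = 5.0                   # arbitrary positive rescaling constant

p_original = probit_prob(x, beta, sigma=1.0)
p_rescaled = probit_prob(x, [c * b for b in beta], sigma=c)
print(p_original, p_rescaled)
```

Both calls return the same probability, so the data cannot distinguish $(\beta, \sigma)$ from $(c\beta, c\sigma)$.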


ML Estimation
The outcome is Bernoulli distributed, that is, binomial with just one trial. A very convenient compact notation for the density of $y_i$, or more formally its probability mass function, is:
$$f(y_i \mid x_i) = P_i^{y_i} (1 - P_i)^{1 - y_i}, \qquad y_i = 0, 1$$
where $P_i = F(x_i'\beta)$. This yields probabilities $P_i$ and $(1 - P_i)$, since $f(1) = P^1 (1 - P)^0 = P$ and $f(0) = P^0 (1 - P)^1 = 1 - P$.
Assuming that the observations are independent of each other, the joint probability function (or likelihood function) is:
$$\Pr(Y_1 = y_1, Y_2 = y_2, \ldots, Y_n = y_n \mid X) = \prod_{i=1}^{n} f(y_i \mid x_i)$$
The likelihood function for a sample of $n$ observations can be written as
$$L(\beta \mid \underbrace{y, X}_{\text{data}}) = \prod_{i=1}^{n} [F(x_i'\beta)]^{y_i} [1 - F(x_i'\beta)]^{1 - y_i}$$

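Log-Likelihood Function
Taking logs, we obtain the log-likelihood function, which must be maximized:
$$\begin{aligned}
\log L(\beta \mid \text{data}) &= \log\left(\prod_{i=1}^{n} [F(x_i'\beta)]^{y_i} [1 - F(x_i'\beta)]^{1 - y_i}\right) \\
&= \sum_{i=1}^{n} \left\{ \log\left([F(x_i'\beta)]^{y_i}\right) + \log\left([1 - F(x_i'\beta)]^{1 - y_i}\right) \right\} \\
&= \sum_{i=1}^{n} \left\{ y_i \log[F(x_i'\beta)] + (1 - y_i) \log[1 - F(x_i'\beta)] \right\}
\end{aligned}$$
Useful Trick
If the distribution is symmetric, as the normal and logistic are, then $1 - F(x_i'\beta) = F(-x_i'\beta)$. Let $q_i = 2y_i - 1$. Then:
$$\log L(\beta \mid \text{data}) = \sum_{i=1}^{n} \log F(q_i\, x_i'\beta)$$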
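As a quick check of the useful trick, the following Python sketch evaluates the logit log-likelihood in both forms and confirms that they coincide. The tiny data set and the coefficient vector are hypothetical, and the function names are introduced here only for illustration.

```python
import math


def logistic_cdf(z):
    """Standard logistic cdf, Lambda(z)."""
    return 1.0 / (1.0 + math.exp(-z))


def loglik_standard(beta, X, y):
    """Sum of y*log F + (1 - y)*log(1 - F) with F the logistic cdf."""
    total = 0.0
    for xi, yi in zip(X, y):
        p = logistic_cdf(sum(a * b for a, b in zip(xi, beta)))
        total += yi * math.log(p) + (1 - yi) * math.log(1.0 - p)
    return total


def loglik_trick(beta, X, y):
    """Same log-likelihood written as the sum of log F(q * x'beta), q = 2y - 1."""
    total = 0.0
    for xi, yi in zip(X, y):
        q = 2 * yi - 1
        index = sum(a * b for a, b in zip(xi, beta))
        total += math.log(logistic_cdf(q * index))
    return total


# Hypothetical data: a constant plus one regressor, four observations.
X = [[1.0, 0.5], [1.0, -1.2], [1.0, 2.0], [1.0, 0.3]]
y = [1, 0, 1, 0]
beta = [0.2, 0.8]

ll_standard = loglik_standard(beta, X, y)
ll_trick = loglik_trick(beta, X, y)
print(ll_standard, ll_trick)
```

The compact form needs only one cdf evaluation per observation, which is why it is convenient when coding the estimator by hand.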

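Asymptotic Distribution of Probit and Logit Model
Recall that
$$\sqrt{n}\,(\hat\beta - \beta_0) \xrightarrow{d} N(0, I^{-1})$$
where $I$ is the Fisher information, defined as:
$$I = -E\left[\frac{\partial^2}{\partial\beta\,\partial\beta'} \log f(y_i \mid x_i; \beta)\right]$$
Then an estimator of the information matrix (the negative of the expected Hessian) is given by:
$$\hat I(\hat\beta) = -E\left[\frac{\partial^2 \log L}{\partial\beta\,\partial\beta'}\right] = \frac{1}{n} \sum_{i=1}^{n} \frac{f^2(x_i'\hat\beta)}{F(x_i'\hat\beta)\,[1 - F(x_i'\hat\beta)]}\, x_i x_i'$$
We can also use the following estimators:
$$\left(-\sum_{i=1}^{n} H(w_i, \hat\beta)\right)^{-1} \tag{6}$$
or the outer product of the gradients.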

What about small samples? Griffiths et al. (1987)
We have three estimators:
- NR: the inverse of the negative of the Hessian matrix from the log-likelihood function.
- Scoring: the inverse of the information matrix.
- BHHH: the inverse of the outer product of the first derivatives of the log-likelihood function.
They are asymptotically equivalent, but their performance can vary in finite samples.
Griffiths et al. find that, on average, the Hessian matrix and the information matrix give almost identical results and lead to more accurate estimates of the asymptotic covariance matrix than the estimator based on first derivatives.


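Marginal Effects
Recall that:
$$E(y_i \mid x_i) = [1 \cdot \Pr(y_i = 1 \mid x_i)] + [0 \cdot \Pr(y_i = 0 \mid x_i)] = \Pr(y_i = 1 \mid x_i) = F(x_i'\beta)$$
The marginal effect is given by:
$$\underbrace{\frac{\partial E(y_i \mid x_i)}{\partial x_i}}_{(K \times 1)} = \left[\frac{dF(x_i'\beta)}{d(x_i'\beta)}\right] \beta = \underbrace{f(x_i'\beta)}_{(1 \times 1)}\, \underbrace{\beta}_{(K \times 1)}$$
where $f(\cdot)$ is the probability density function. So:
$$\text{Probit:} \quad \frac{\partial E(y_i \mid x_i)}{\partial x_i} = \phi(x_i'\beta)\,\beta$$
$$\text{Logit:} \quad \frac{\partial E(y_i \mid x_i)}{\partial x_i} = \Lambda(x_i'\beta)\,[1 - \Lambda(x_i'\beta)]\,\beta$$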

Marginal Effects: Comments
Some aspects are important to note:
- MEs will vary with the values of $x$.
- Current practices:
  - Calculate MEs at the means of the variables.
  - Calculate MEs at specific values.
  - Evaluate MEs at every observation and use the sample average of the individual MEs.

Marginal Effects: Dummy variable
The appropriate ME for a binary independent variable, say $d$, would be:
$$ME = \Pr\left(y_i = 1 \mid \bar{x}_{(d)}, d_i = 1\right) - \Pr\left(y_i = 1 \mid \bar{x}_{(d)}, d_i = 0\right)$$
where $\bar{x}_{(d)}$ denotes the means of all the other variables in the model.

Average Partial Effects
Current practice: average partial effects. The quantity of interest is:
$$APE = E_x\left[\frac{\partial E(y \mid x)}{\partial x}\right]$$
In practical terms, this suggests the computation:
$$\widehat{APE} = \hat\gamma = \frac{1}{n} \sum_{i=1}^{n} f(x_i'\hat\beta)\, \hat\beta$$
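A minimal Python sketch of this computation for the logit case, compared with the effect evaluated at the sample means. The data and the "estimated" coefficients are hypothetical, and the helper names are introduced here for illustration only.

```python
import math


def logistic_cdf(z):
    return 1.0 / (1.0 + math.exp(-z))


def logistic_pdf(z):
    """Logistic density written as Lambda(z) * (1 - Lambda(z))."""
    cdf = logistic_cdf(z)
    return cdf * (1.0 - cdf)


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def average_partial_effects(beta, X):
    """(1/n) * sum_i f(x_i'beta) * beta: average the density, then scale beta."""
    mean_density = sum(logistic_pdf(dot(xi, beta)) for xi in X) / len(X)
    return [mean_density * b for b in beta]


def partial_effects_at_means(beta, X):
    """f(x_bar'beta) * beta, with x_bar the vector of column means."""
    x_bar = [sum(col) / len(X) for col in zip(*X)]
    density = logistic_pdf(dot(x_bar, beta))
    return [density * b for b in beta]


# Hypothetical sample: constant, x1, x2; beta_hat stands in for an ML estimate.
X = [[1.0, 0.2, 1.5], [1.0, -0.7, 0.3], [1.0, 1.9, 2.2], [1.0, 0.4, -1.0]]
beta_hat = [-0.5, 1.0, -0.3]

ape = average_partial_effects(beta_hat, X)
pea = partial_effects_at_means(beta_hat, X)
print(ape)
print(pea)
```

The entry for the constant has no marginal-effect interpretation and is kept only to preserve the vector layout. The two approaches generally give different numbers because the density is non-linear in the index.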


Goodness-of-Fit
A number of suggestions have been made for how to evaluate the overall quality of a binary response model.
- First approach: mimic the $R^2$ measure.
- Assess the predictive performance of the model.
$R^2$ measures are not directly applicable in non-linear models such as binary response models, since we do not have a proper variance decomposition result.
The so-called pseudo $R^2$ measures have been suggested.

Goodness-of-Fit
Let:
- $\log L(\hat\beta_r)$: the value of the maximized log-likelihood function in the constant-only model.
- $\log L(\hat\beta_u)$: the maximized log-likelihood value in the full model.
Note that the value of the log-likelihood function is always negative, so:
$$\frac{|\log L(\hat\beta_u)|}{|\log L(\hat\beta_r)|} = \frac{\log L(\hat\beta_u)}{\log L(\hat\beta_r)}$$
So that:
$$0 \le 1 - \frac{\log L(\hat\beta_u)}{\log L(\hat\beta_r)} = R^2_{McFadden} \le 1$$
The McFadden $R^2$ will be zero if the full model has no explanatory power.

Information Measures
Definition (Akaike's Information Criterion (AIC))
Akaike's (1973) information criterion is defined as
$$AIC = \frac{-2 \log L + 2K}{n} \tag{7}$$
where $\log L$ is the log-likelihood of the model and $K$ is the number of parameters in the model.
- Larger values of $\log L$ indicate a better fit.
- $-2 \log L$ ranges from 0 to $+\infty$, with smaller values indicating a better fit.
- As $K$ increases, $-2 \log L$ becomes smaller, since more parameters make what is observed more likely.
- $2K$ is added as a penalty.
- All else being equal, smaller values suggest a better fitting model.
- Use it to compare models across different samples or to compare non-nested models.

Information Measures
Definition (Bayes Information Criterion (BIC))
The BIC is defined as
$$BIC = \frac{-2 \log L + K \log n}{n} \tag{8}$$
where $\log L$ is the log-likelihood of the model, $K$ is the number of parameters in the model, and $n$ is the number of individuals.
It is possible to increase the likelihood by adding parameters, but doing so may result in overfitting. Both BIC and AIC address this problem by introducing a penalty term for the number of parameters in the model; the penalty term is larger in BIC than in AIC.
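The three fit measures are simple functions of the maximized log-likelihoods. The Python sketch below follows the per-observation scaling used in (7) and (8); note that many software packages report the unscaled versions instead. The log-likelihood values, `k`, and `n` are hypothetical.

```python
import math


def mcfadden_r2(loglik_full, loglik_null):
    """1 - logL(full) / logL(constant-only)."""
    return 1.0 - loglik_full / loglik_null


def aic(loglik, k, n):
    """AIC scaled by the sample size, as in equation (7)."""
    return (-2.0 * loglik + 2.0 * k) / n


def bic(loglik, k, n):
    """BIC scaled by the sample size, as in equation (8)."""
    return (-2.0 * loglik + k * math.log(n)) / n


# Hypothetical values for a model with k = 4 parameters and n = 200 individuals.
loglik_null = -120.0
loglik_full = -95.0
k, n = 4, 200

r2 = mcfadden_r2(loglik_full, loglik_null)
aic_value = aic(loglik_full, k, n)
bic_value = bic(loglik_full, k, n)
print(r2, aic_value, bic_value)
```

With these inputs the BIC exceeds the AIC, which reflects the heavier penalty: $\log n > 2$ whenever $n \ge 8$.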


Stata Example
Open BinaryStata.do


Ordinal Outcomes
When a variable is ordinal, its categories can be ranked from low to high, but the distances between adjacent categories are unknown.
Examples:
- Kind of Job: low skilled, medium skilled, and highly skilled.
- Subjective Health Status: very bad, bad, good, very good.
- Likert Scales: strongly agree, agree, have no opinion, disagree, or strongly disagree with a statement.

Ordinal Outcomes
Researchers often treat ordinal dependent variables as if they were interval. The dependent categories are numbered sequentially and the linear regression model (LRM) is used.
This involves the implicit assumption that the intervals between adjacent categories are equal. For example, the distance between strongly agreeing and agreeing is assumed to be the same as the distance between agreeing and being neutral on a Likert scale.
This is not correct. See for example McKelvey and Zavoina (1975) and Winship and Mare (1984).


Latent Approach
Consider the latent variable $y_i^*$, ranging from $-\infty$ to $\infty$, determined by the following structural model:
$$y_i^* = x_i'\beta + \varepsilon_i \tag{9}$$
However, we do not observe $y_i^*$. We only observe a discrete variable $y_i$, which is thought of as providing incomplete information about an underlying $y_i^*$ according to the following rule:
$$y_i = \begin{cases} 1 & \text{if } y_i^* < \kappa_1 \\ 2 & \text{if } \kappa_1 < y_i^* < \kappa_2 \\ 3 & \text{if } \kappa_2 < y_i^* \end{cases} \tag{10}$$
A more general mechanism that accounts for the ordered nature of $y_i$ is
$$y_i = j \iff \kappa_{j-1} < y_i^* \le \kappa_j, \qquad j = 1, \ldots, J \tag{11}$$
The $\kappa$'s are called thresholds or cutpoints.
The extreme categories 1 and $J$ are defined by open-ended intervals with $\kappa_0 = -\infty$ and $\kappa_J = \infty$.
When $J = 2$, we have the binary response model (BRM).

Latent Approach
Consider the following question: How do you grade your current health status?
The answers are: 1 = Poor, 2 = Fair, 3 = Good, and 4 = Very Good.
Assume that this ordinal variable is related to a continuous, latent variable $y^*$ that indicates an individual's underlying health status. The observed $y$ is related to $y^*$ according to:
$$y_i = \begin{cases} 1 = \text{Poor} & \text{if } \kappa_0 = -\infty \le y_i^* < \kappa_1 \\ 2 = \text{Fair} & \text{if } \kappa_1 \le y_i^* < \kappa_2 \\ 3 = \text{Good} & \text{if } \kappa_2 \le y_i^* < \kappa_3 \\ 4 = \text{Very Good} & \text{if } \kappa_3 \le y_i^* < \kappa_4 = \infty \end{cases} \tag{12}$$


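The probabilities
First, consider the probability that $y = 1$. This implies that:
$$\begin{aligned}
\Pr(y_i = 1 \mid x_i) &= \Pr(\kappa_0 \le y_i^* < \kappa_1 \mid x_i) \\
&= \Pr(\kappa_0 \le x_i'\beta + \varepsilon_i < \kappa_1 \mid x_i) \\
&= \Pr(\kappa_0 - x_i'\beta \le \varepsilon_i < \kappa_1 - x_i'\beta \mid x_i) \\
&= \Pr(\varepsilon_i < \kappa_1 - x_i'\beta \mid x_i) - \Pr(\varepsilon_i < \kappa_0 - x_i'\beta \mid x_i) \\
&= F(\kappa_1 - x_i'\beta) - F(\kappa_0 - x_i'\beta)
\end{aligned} \tag{13}$$
So, generalizing, we obtain:
$$\Pr(y_i = j \mid x_i) = F(\kappa_j - x_i'\beta) - F(\kappa_{j-1} - x_i'\beta) \tag{14}$$
Note that:
$$F(\kappa_0 - x_i'\beta) = F(-\infty - x_i'\beta) = 0 \tag{15}$$
and
$$F(\kappa_J - x_i'\beta) = F(\infty - x_i'\beta) = 1 \tag{16}$$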
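Formula (14) translates directly into code. The Python sketch below computes the category probabilities for an ordered probit with the normalization $\sigma_\varepsilon = 1$; the index value and the cutpoints are hypothetical.

```python
import math


def std_normal_cdf(z):
    """Standard normal cdf; math.erf returns -1 and 1 at minus and plus infinity."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def ordered_probabilities(index, cutpoints):
    """Pr(y = j | x) = F(kappa_j - x'beta) - F(kappa_{j-1} - x'beta), j = 1..J.

    `index` is x'beta and `cutpoints` holds kappa_1 < ... < kappa_{J-1};
    kappa_0 = -inf and kappa_J = +inf are added here.
    """
    kappas = [-math.inf] + list(cutpoints) + [math.inf]
    return [
        std_normal_cdf(kappas[j] - index) - std_normal_cdf(kappas[j - 1] - index)
        for j in range(1, len(kappas))
    ]


cutpoints = [-1.0, 0.0, 1.0]                       # hypothetical kappa_1..kappa_3
probs = ordered_probabilities(0.2, cutpoints)      # x'beta = 0.2
probs_higher = ordered_probabilities(1.2, cutpoints)
print(probs, sum(probs))
```

Raising the index shifts probability mass toward the highest category and away from the lowest one, while the probabilities still sum to one.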

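Ordered Probit Model
In the ordered probit model, we assume that the error term follows a standard normal distribution, $F(\varepsilon) = \Phi(\varepsilon)$. In this case, the probabilities are:
$$\Pr(y_i = j \mid x_i) = \Phi\left(\frac{\kappa_j - x_i'\beta}{\sigma_\varepsilon}\right) - \Phi\left(\frac{\kappa_{j-1} - x_i'\beta}{\sigma_\varepsilon}\right) \tag{17}$$
For identification we need $\sigma_\varepsilon = 1$.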


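Ordered Logit Model
If the $\varepsilon_i$ are iid logistically distributed, then
$$\Pr(y_i = j \mid x_i) = \Lambda\left(\frac{\kappa_j - x_i'\beta}{\sigma_\varepsilon}\right) - \Lambda\left(\frac{\kappa_{j-1} - x_i'\beta}{\sigma_\varepsilon}\right) \tag{18}$$
where $\sigma_\varepsilon^2 = \pi^2/3$.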

Identification
Since $y^*$ is latent, its mean and variance cannot be estimated.
- For logit, $\operatorname{Var}(\varepsilon \mid x) = \pi^2/3$.
- For probit, $\operatorname{Var}(\varepsilon \mid x) = 1$.
Can we estimate a constant in this model?


Estimation
The conditional probabilities are:
$$f(y_i \mid x_i; \beta, \kappa_1, \ldots, \kappa_{J-1}) = P_{i1}^{d_{i1}} \cdots P_{iJ}^{d_{iJ}} = \prod_{j=1}^{J} P_{ij}^{d_{ij}} \tag{19}$$
where $d_{ij}$ is a binary indicator equal to one if $y_i = j$ and 0 otherwise.
For a sample of $n$ independent pairs of observations $(y_i, x_i)$, the likelihood function is given by
$$L(\theta; y, X) = \prod_{i=1}^{n} \prod_{j=1}^{J} P_{ij}^{d_{ij}} \tag{20}$$

Estimation
Taking logarithms converts products into sums, and we can write the log-likelihood function as
$$\log L(\theta; y, X) = \sum_{i=1}^{n} \sum_{j=1}^{J} d_{ij} \log P_{ij} \tag{21}$$
ML estimation of the parameters yields consistent, asymptotically efficient, and asymptotically normally distributed estimators.
We can use LR, Wald, and Score tests to test general restrictions, and the invariance property together with the Delta Method for estimation and inference on predicted probabilities or marginal probability effects.


Compensating Variation
Definition (CV): the variation in two regressors such that the latent variable does not change.
Let $x_{il}$ denote the $l$-th element of $x_i$ and $\beta_l$ the corresponding parameter, and let $m$ index the $m$-th elements of both vectors $x_i$ and $\beta$.
Now consider a change in $x_{il}$ and $x_{im}$ at the same time, such that $\Delta y_i^* = 0$ (and therefore all probabilities are unchanged). This requires:
$$0 = \beta_l \Delta x_{il} + \beta_m \Delta x_{im} \quad \text{or} \quad \frac{\Delta x_{il}}{\Delta x_{im}} = -\frac{\beta_m}{\beta_l} \tag{22}$$
If, for example, $x_{il}$ is logarithmic income and $x_{im}$ is a dummy variable indicating unemployment, then the above ratio gives a trade-off ratio: the relative increase in income required to compensate for the negative effect of unemployment.

Compensating Variation
In some applications, it could be interesting to know how much an explanatory variable should change to reach the next higher response category.
For this purpose, one could consider the ratio of the interval length to the parameter:
$$\frac{\kappa_j - \kappa_{j-1}}{\beta_l} \tag{23}$$
The smaller this ratio in absolute terms, the smaller the maximum change in $x_{il}$ required to move the response from $y_i = j$ to $y_i = j + 1$.

Discrete Probability Effect
The ceteris paribus effect of a discrete change in one explanatory variable, say the $l$-th element of $x_i$, is simply the difference between the probabilities before and after the change, given the values of the other variables:
$$\begin{aligned}
\Delta P_{ij} &= \Pr(y_i = j \mid x_i + \Delta x_{il}) - \Pr(y_i = j \mid x_i) \\
&= \left[F(\kappa_j - x_i'\beta - \Delta x_{il}\beta_l) - F(\kappa_{j-1} - x_i'\beta - \Delta x_{il}\beta_l)\right] - \left[F(\kappa_j - x_i'\beta) - F(\kappa_{j-1} - x_i'\beta)\right]
\end{aligned} \tag{24}$$

Figure: Shift in Density Due to a Change $\Delta x_i > 0$ ($\beta > 0$)

Marginal Effects
The marginal probability effect (MPE) is given by:
$$MPE_{ijl} = \frac{\partial P_{ij}}{\partial x_{il}} = \left[f(\kappa_{j-1} - x_i'\beta) - f(\kappa_j - x_i'\beta)\right] \beta_l \tag{25}$$
Neither the signs nor the magnitudes of the coefficients are directly interpretable in ordered choice models.
MPEs are functions of the covariates and therefore vary across individuals.
To calculate average marginal probability effects (AMPEs), we have to take expectations with respect to $x$, which are estimated consistently by replacing $\beta$ with its ML estimate $\hat\beta$ and averaging over the sample.

Marginal Effects
Furthermore, in the computation of the MPE, the only certainties about the signs of the partial effects in this model are as follows, where we consider a variable with a positive coefficient:
- Increases in that variable will increase the probability in the highest cell and decrease the probability in the lowest cell.
- The sum of all the changes will be zero, since the new probabilities must still sum to one.
- The effects will begin at $\Pr(y_i = 1)$ with one or more negative values, then change to a set of positive values; there will be exactly one sign change (Single Crossing).
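These sign properties can be verified numerically. The Python sketch below evaluates the marginal probability effects $[f(\kappa_{j-1} - x_i'\beta) - f(\kappa_j - x_i'\beta)]\beta_l$ for an ordered probit; the index, cutpoints, and the positive coefficient `beta_l` are hypothetical.

```python
import math


def std_normal_pdf(z):
    """Standard normal density; evaluates to 0.0 at plus or minus infinity."""
    return math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)


def marginal_probability_effects(index, cutpoints, beta_l):
    """MPE for each category j = 1..J at a given value of x'beta."""
    kappas = [-math.inf] + list(cutpoints) + [math.inf]
    return [
        (std_normal_pdf(kappas[j - 1] - index) - std_normal_pdf(kappas[j] - index)) * beta_l
        for j in range(1, len(kappas))
    ]


mpes = marginal_probability_effects(index=0.2, cutpoints=[-1.0, 0.0, 1.0], beta_l=0.5)

# Count how often the sign flips when moving from one category to the next.
sign_changes = sum(1 for a, b in zip(mpes, mpes[1:]) if (a < 0) != (b < 0))
print(mpes, sum(mpes), sign_changes)
```

For this configuration the effects are negative for the two lowest categories and positive for the two highest, and they sum to zero up to rounding error.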

Marginal Effects
One might also be interested in cumulative values of the partial effects, such as
$$\frac{\partial \Pr(y_i \le j \mid x_i)}{\partial x_{il}} = \sum_{m=1}^{j} \left[f(\kappa_{m-1} - x_i'\beta) - f(\kappa_m - x_i'\beta)\right] \beta_l \tag{26}$$


Stata Example
Open Ordered Lab.do


Introduction
Type of dependent variable
- Discrete choices: choice of one among several mutually exclusive alternatives:
  - Commuting mode: train, car, bicycle, walking.
  - Type of car: standard car, hybrid car, electric car.
Type of data
- Revealed preference data: the data are the observed choices of individuals for, say, a transport mode.
- Stated preference data: individuals face a virtual choice situation, for example the choice between different types of automated cars.

Design of the choice experiment
Figure: Sample of a Choice Situation Presented to Respondents


Random Utility Model
The utility of individual $i$ for alternative $j$:
$$U_{ij} = V_{ij} + \varepsilon_{ij}$$
where $V_{ij}$ is the deterministic component and $\varepsilon_{ij}$ is the random component.
This random component is intended to capture the various unpredictable choices people make.
Deterministic Part
It is usually assumed that $V_{ij}$ is a linear combination of the observed explanatory variables:
$$V_{ij} = \alpha_j + x_{ij}'\beta + z_i'\gamma_j + w_{ij}'\delta_j$$

Random Utility Model
Three kinds of variables:
- alternative-specific variables $x_{ij}$ with a generic coefficient $\beta$,
- individual-specific variables $z_i$ with alternative-specific coefficients $\gamma_j$,
- alternative-specific variables $w_{ij}$ with an alternative-specific coefficient $\delta_j$.

Random Utility Model
Only differences between utilities matter. Take two alternatives $j$ and $k$:
$$V_{ij} - V_{ik} = (\alpha_j - \alpha_k) + \beta(x_{ij} - x_{ik}) + (\gamma_j - \gamma_k) z_i + (\delta_j w_{ij} - \delta_k w_{ik})$$
- Coefficients for individual-specific variables (including the intercept) should be alternative specific. A normalization is required: $\gamma_1 = 0$.
- Coefficients for alternative-specific variables may (or may not) be alternative specific. Cost is alternative specific, but the cost of a standard vehicle may not have the same impact on utility as the cost of an electric car.


Probabilities
If the error terms are EV type I, then:
$$P_{ij} = \frac{e^{V_{ij}}}{\sum_{k} e^{V_{ik}}} = \frac{e^{x_{ij}'\beta}}{\sum_{k} e^{x_{ik}'\beta}} \tag{27}$$
For a sample of $n$ independent pairs of observations $(y_i, x_{ij})$, the log-likelihood function is given by
$$\log L(\theta; y, X) = \sum_{i} \sum_{j} y_{ij} \log P_{ij} \tag{28}$$
where $y_{ij}$ is a dummy variable equal to 1 if individual $i$ made choice $j$ and 0 otherwise.
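A minimal Python sketch of (27) and (28) for a model with generic coefficients only. The attribute values, choices, and coefficients are hypothetical; subtracting the maximum utility before exponentiating is a numerical-stability choice made here, not part of the slides, and it leaves the probabilities unchanged.

```python
import math


def mnl_probabilities(utilities):
    """Logit choice probabilities exp(V_j) / sum_k exp(V_k) for one individual."""
    v_max = max(utilities)
    exps = [math.exp(v - v_max) for v in utilities]
    total = sum(exps)
    return [e / total for e in exps]


def mnl_loglik(beta, X, choices):
    """Sum over individuals of the log probability of the chosen alternative.

    X[i][j] is the attribute vector of alternative j for individual i and
    choices[i] is the (0-based) index of the alternative individual i chose.
    """
    loglik = 0.0
    for alternatives, chosen in zip(X, choices):
        utilities = [sum(a * b for a, b in zip(x_ij, beta)) for x_ij in alternatives]
        loglik += math.log(mnl_probabilities(utilities)[chosen])
    return loglik


# Hypothetical data: 2 individuals, 3 alternatives, 2 attributes (cost, time).
X = [
    [[1.0, 0.5], [2.0, 0.3], [0.5, 1.0]],
    [[1.5, 0.4], [1.0, 0.9], [2.5, 0.2]],
]
choices = [2, 1]
beta = [-0.8, -1.5]

probs = mnl_probabilities([1.0, 0.5, -0.2])
probs_shifted = mnl_probabilities([11.0, 10.5, 9.8])   # same utilities plus 10
ll = mnl_loglik(beta, X, choices)
ll_zero = mnl_loglik([0.0, 0.0], X, choices)
print(probs, ll)
```

Adding the same constant to every utility leaves the probabilities untouched, which is the numerical counterpart of the statement that only differences in utility matter.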

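Marginal effects
The marginal effects with respect to individual-specific variables $z_i$ or alternative-specific variables $x_{ij}$ are:
$$\frac{\partial P_{ij}}{\partial z_i} = P_{ij}\left(\beta_j - \sum_{l} P_{il}\beta_l\right)$$
$$\frac{\partial P_{ij}}{\partial x_{ij}} = \gamma P_{ij}(1 - P_{ij})$$
$$\frac{\partial P_{ij}}{\partial x_{il}} = -\gamma P_{ij} P_{il}$$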


gmnl package in R

Formulas in gmnl package
    # load packages and data
    library("gmnl")
    library("mlogit")
    data("TravelMode", package = "AER")
    with(TravelMode, prop.table(table(mode[choice == "yes"])))
    ##
    ##   air train   bus   car
    ##
    head(TravelMode)
    ##   individual  mode choice wait vcost travel gcost income size
    ## 1          1   air     no
    ## 2          1 train     no
    ## 3          1   bus     no
    ## 4          1   car    yes
    ## 5          2   air     no
    ## 6          2 train     no

Formulas in gmnl package
The data are in long format (one row per available mode). We transform the data into the mlogit format:
    # transform data
    TM <- mlogit.data(TravelMode, choice = "choice", shape = "long",
                      alt.levels = c("air", "train", "bus", "car"))

Formulas in gmnl package
Suppose we want to estimate a multinomial logit model where:
- wait and vcost are alternative-specific variables with a generic coefficient $\beta$,
- income is an individual-specific variable with an alternative-specific coefficient $\gamma_j$,
- travel is an alternative-specific variable with an alternative-specific coefficient $\delta_j$.
The three parts of the formula are separated by `|`:
    f1 <- choice ~ wait + vcost | income | travel

Formulas in gmnl package
By default, the alternative-specific constants (ASC) for each alternative are included. They can be omitted by adding +0 or -1 in the second part of the formula:
    f2 <- choice ~ wait + vcost | income + 0 | travel
    f2 <- choice ~ wait + vcost | income - 1 | travel

Formulas in gmnl package
Some parts may be omitted when there is no ambiguity. For instance, a model with only individual-specific variables can be specified as follows:
    f2 <- choice ~ 0 | income + size | 0
    f2 <- choice ~ 0 | income + size | 1


Example
The dataset Train contains data from a stated preference survey in the Netherlands.
Users are asked to choose between two train trips characterized by four attributes:
- price: the price in cents of guilders,
- time: travel time in minutes,
- change: the number of changes,
- comfort: the class of comfort, 0, 1 or 2; 0 being the most comfortable class.
    data("Train", package = "mlogit")
    Tr <- mlogit.data(Train, shape = "wide", choice = "choice",
                      varying = 4:11, sep = "", alt.levels = c(1, 2), id = "id")

Example
We first convert price and time into more meaningful units, euros and hours (1 guilder is euros):
    Tr$price <- Tr$price / 100 *
    Tr$time <- Tr$time / 60

Example
Now we estimate the model. Since both alternatives are virtually identical (two train trips), it is appropriate to use only generic coefficients and to remove the ASC:
    mnl <- gmnl(choice ~ price + time + change + comfort | -1,
                data = Tr, model = "mnl")

Example
    summary(mnl)
    ##
    ## Model estimated on: Thu Oct 05 18:16:
    ##
    ## Call:
    ## gmnl(formula = choice ~ price + time + change + comfort | -1,
    ##     data = Tr, model = "mnl", method = "nr")
    ##
    ## Frequencies of categories:
    ##
    ## 1 2
    ##
    ##
    ## The estimation took: 0h:0m:0s
    ##
    ## Coefficients:
    ##         Estimate Std. Error z-value Pr(>|z|)
    ## price                               < 2.2e-16 ***
    ## time                                < 2.2e-16 ***
    ## change                              e-08 ***
    ## comfort                             < 2.2e-16 ***
    ## ---
    ## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
    ##
    ## Optimization of log-likelihood by Newton-Raphson maximisation
    ## Log Likelihood:
    ## Number of observations: 2929

95 Example Coefficients are not directly interpretable, but dividing them by the price coefficient gives the willingness to pay (WTP) for each attribute:
coef(mnl)[-1] / coef(mnl)[1]
## time change comfort
##
Alternatively, we can use:
wtp.gmnl(mnl, wrt = "price")
##
## Willigness-to-pay respect to: price
##
## Estimate Std. Error t-value Pr(>|t|)
## time < 2.2e-16 ***
## change e-09 ***
## comfort < 2.2e-16 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
We obtain a value of 26 euros for an hour of traveling, 5 euros for a change and 14 euros for access to a more comfortable class.
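The ratio-of-coefficients calculation, together with one common way of attaching standard errors to it (the delta method), can be sketched in base R as follows. The coefficients and the diagonal covariance matrix are invented for illustration, and it is an assumption here that this mirrors what a WTP routine such as wtp.gmnl does internally.

```r
# Hypothetical estimates and covariance matrix (illustrative values only).
b <- c(price = -0.07, time = -1.7, change = -0.3, comfort = -0.9)
Sigma <- diag(c(0.004, 0.16, 0.06, 0.07)^2)
dimnames(Sigma) <- list(names(b), names(b))

# Point estimates: ratio of each coefficient to the price coefficient.
wtp <- b[-1] / b["price"]

# Delta method: for g_k(b) = b_k / b_price the gradient has two nonzero entries,
#   d g_k / d b_price = -b_k / b_price^2   and   d g_k / d b_k = 1 / b_price.
wtp_se <- sapply(names(wtp), function(k) {
  grad <- setNames(numeric(length(b)), names(b))
  grad["price"] <- -b[k] / b["price"]^2
  grad[k] <- 1 / b["price"]
  sqrt(drop(t(grad) %*% Sigma %*% grad))
})

print(cbind(Estimate = wtp, Std.Error = wtp_se))
```

With a real fitted model, `b` and `Sigma` would be replaced by the estimated coefficient vector and its estimated covariance matrix.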

96 Example Now we use the Fishing data set.
data("Fishing", package = "mlogit")
head(Fishing, 3)
## mode price.beach price.pier price.boat price.charter catch.beach
## 1 charter
## 2 charter
## 3 boat
## catch.pier catch.boat catch.charter income
##
##
##
Fish <- mlogit.data(Fishing, shape = "wide", varying = 2:9, choice = "mode")
mnl.fish <- gmnl(mode ~ price | income | catch, data = Fish, model = "mnl")
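The wide-to-long conversion requested with shape = "wide" and varying = 2:9 can be illustrated with base R's `reshape()`. This is only a rough analogue under stated assumptions: the two-row data frame is invented, and the real mlogit.data function builds additional index information that is not reproduced here.

```r
# Made-up wide data with the same column layout as the fishing data
# (all values are invented for illustration).
wide <- data.frame(
  mode          = c("charter", "boat"),
  price.beach   = c(150, 15),
  price.pier    = c(150, 15),
  price.boat    = c(160, 10),
  price.charter = c(180, 35),
  catch.beach   = c(0.07, 0.11),
  catch.pier    = c(0.05, 0.05),
  catch.boat    = c(0.26, 0.16),
  catch.charter = c(0.54, 0.47),
  income        = c(7000, 1250)
)

# Columns 2:9 vary across alternatives; reshape() splits "price.beach" at the
# dot into the variable "price" and the alternative "beach".
long <- reshape(wide, direction = "long", varying = 2:9, sep = ".",
                timevar = "alt", idvar = "id", ids = seq_len(nrow(wide)))

# One row per individual-alternative pair; flag the chosen alternative.
long$choice <- long$mode == long$alt
long <- long[order(long$id, long$alt), ]
print(long[, c("id", "alt", "choice", "price", "catch", "income")])
```

Each individual contributes four rows, one per alternative; income is repeated within an individual because it does not vary across alternatives.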

97 Example
summary(mnl.fish)
##
## Model estimated on: Thu Oct 05 18:16:
##
## Call:
## gmnl(formula = mode ~ price | income | catch, data = Fish, model = "mnl",
## method = "nr")
##
## Frequencies of categories:
##
## beach boat charter pier
##
##
## The estimation took: 0h:0m:1s
##
## Coefficients:
## Estimate Std. Error z-value Pr(>|z|)
## boat:(intercept) e e **
## charter:(intercept) e e e-13 ***
## pier:(intercept) e e ***
## price e e < 2.2e-16 ***
## boat:income e e
## charter:income e e
## pier:income e e **
## beach:catch e e e-05 ***
## boat:catch e e e-06 ***
## charter:catch e e e-07 ***
## pier:catch e e ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Optimization of log-likelihood by Newton-Raphson maximisation
## Log Likelihood:
## Number of observations: 1182
## Number of iterations: 6
## Exit of MLE: successive function values within tolerance limit

98 Example Predicted probabilities for the chosen outcome, or for all alternatives if outcome = FALSE:
head(fitted(mnl.fish))
## 1.beach 2.beach 3.beach 4.beach 5.beach 6.beach
##
head(fitted(mnl.fish, outcome = FALSE))
## beach boat charter pier
## [1,]
## [2,]
## [3,]
## [4,]
## [5,]
## [6,]
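The difference between the two kinds of fitted values can be sketched in base R: the full matrix has one probability per alternative, while the per-outcome version keeps only the probability of the alternative each individual actually chose. The probabilities and choices below are invented for illustration and do not come from the fishing model.

```r
# Made-up matrix of predicted probabilities: one row per individual,
# one column per alternative (each row sums to one).
P <- rbind(
  c(beach = 0.10, boat = 0.30, charter = 0.45, pier = 0.15),
  c(beach = 0.05, boat = 0.50, charter = 0.40, pier = 0.05),
  c(beach = 0.25, boat = 0.20, charter = 0.20, pier = 0.35)
)

# Observed choices for the same three individuals (also made up).
chosen <- c("charter", "charter", "pier")

# Analogue of outcome = FALSE: the full probability matrix P.
# Analogue of the default: the probability of the chosen alternative,
# picked with matrix indexing by (row, column) pairs.
p_chosen <- P[cbind(seq_len(nrow(P)), match(chosen, colnames(P)))]
print(p_chosen)

# These are the individual likelihood contributions, so their logs sum to
# the log-likelihood evaluated at the fitted parameters.
loglik <- sum(log(p_chosen))
print(loglik)
```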


More information

Week 2 Quantitative Analysis of Financial Markets Hypothesis Testing and Confidence Intervals

Week 2 Quantitative Analysis of Financial Markets Hypothesis Testing and Confidence Intervals Week 2 Quantitative Analysis of Financial Markets Hypothesis Testing and Confidence Intervals Christopher Ting http://www.mysmu.edu/faculty/christophert/ Christopher Ting : christopherting@smu.edu.sg :

More information

Web-based Supplementary Materials for. A space-time conditional intensity model. for invasive meningococcal disease occurence

Web-based Supplementary Materials for. A space-time conditional intensity model. for invasive meningococcal disease occurence Web-based Supplementary Materials for A space-time conditional intensity model for invasive meningococcal disease occurence by Sebastian Meyer 1,2, Johannes Elias 3, and Michael Höhle 4,2 1 Department

More information

Strategies for High Frequency FX Trading

Strategies for High Frequency FX Trading Strategies for High Frequency FX Trading - The choice of bucket size Malin Lunsjö and Malin Riddarström Department of Mathematical Statistics Faculty of Engineering at Lund University June 2017 Abstract

More information

Maximum Likelihood Estimation Richard Williams, University of Notre Dame, https://www3.nd.edu/~rwilliam/ Last revised January 10, 2017

Maximum Likelihood Estimation Richard Williams, University of Notre Dame, https://www3.nd.edu/~rwilliam/ Last revised January 10, 2017 Maximum Likelihood Estimation Richard Williams, University of otre Dame, https://www3.nd.edu/~rwilliam/ Last revised January 0, 207 [This handout draws very heavily from Regression Models for Categorical

More information

Contents Part I Descriptive Statistics 1 Introduction and Framework Population, Sample, and Observations Variables Quali

Contents Part I Descriptive Statistics 1 Introduction and Framework Population, Sample, and Observations Variables Quali Part I Descriptive Statistics 1 Introduction and Framework... 3 1.1 Population, Sample, and Observations... 3 1.2 Variables.... 4 1.2.1 Qualitative and Quantitative Variables.... 5 1.2.2 Discrete and Continuous

More information

Phd Program in Transportation. Transport Demand Modeling. Session 11

Phd Program in Transportation. Transport Demand Modeling. Session 11 Phd Program in Transportation Transport Demand Modeling João de Abreu e Silva Session 11 Binary and Ordered Choice Models Phd in Transportation / Transport Demand Modelling 1/26 Heterocedasticity Homoscedasticity

More information

One period models Method II For working persons Labor Supply Optimal Wage-Hours Fixed Cost Models. Labor Supply. James Heckman University of Chicago

One period models Method II For working persons Labor Supply Optimal Wage-Hours Fixed Cost Models. Labor Supply. James Heckman University of Chicago Labor Supply James Heckman University of Chicago April 23, 2007 1 / 77 One period models: (L < 1) U (C, L) = C α 1 α b = taste for leisure increases ( ) L ϕ 1 + b ϕ α, ϕ < 1 2 / 77 MRS at zero hours of

More information

Chapter 5. Continuous Random Variables and Probability Distributions. 5.1 Continuous Random Variables

Chapter 5. Continuous Random Variables and Probability Distributions. 5.1 Continuous Random Variables Chapter 5 Continuous Random Variables and Probability Distributions 5.1 Continuous Random Variables 1 2CHAPTER 5. CONTINUOUS RANDOM VARIABLES AND PROBABILITY DISTRIBUTIONS Probability Distributions Probability

More information

Martingale Pricing Theory in Discrete-Time and Discrete-Space Models

Martingale Pricing Theory in Discrete-Time and Discrete-Space Models IEOR E4707: Foundations of Financial Engineering c 206 by Martin Haugh Martingale Pricing Theory in Discrete-Time and Discrete-Space Models These notes develop the theory of martingale pricing in a discrete-time,

More information

Multinomial Logit Models - Overview Richard Williams, University of Notre Dame, https://www3.nd.edu/~rwilliam/ Last revised February 13, 2017

Multinomial Logit Models - Overview Richard Williams, University of Notre Dame, https://www3.nd.edu/~rwilliam/ Last revised February 13, 2017 Multinomial Logit Models - Overview Richard Williams, University of Notre Dame, https://www3.nd.edu/~rwilliam/ Last revised February 13, 2017 This is adapted heavily from Menard s Applied Logistic Regression

More information