Predicting the Probability of Being a Smoker: A Probit Analysis


Department of Economics
Florida State University
Tallahassee, FL 32306-2180

Abstract

This paper models the probability of being a smoker as a function of 23 variables using a probit model. Specifically, age, gender, marital status, location, race, risky behavior, health insurance coverage, routine medical care and highest degree obtained form the basis of the model, and each is hypothesized to be a significant factor. Fourteen variables are individually significant at the 5% level, and thirteen of these remain significant at the 1% level. The regressors are jointly significant at both levels. In addition, the probit marginal effects demonstrate that race and possessing a high school degree can affect the probability of smoking by -10% to 34%.

1 INTRODUCTION

Tobacco use in the United States is a behavior that has been studied intensely due to its perceived benefits to users and its extreme externalities. A problem of interest with tobacco use, primarily smoking, is the ability to predict who is a current smoker. Annually, the United States Department of Health and Human Services (DHHS) conducts the Medical Expenditure Panel Survey (MEPS). MEPS is a comprehensive examination of individual health and medical expenditures. There are approximately 1,099 variables within MEPS and 33,691 observations. Smoking and tobacco use do not comprise a majority of this data set. However, several variables, along with a binary indicator for individuals who are current smokers, should enable an econometrician to explore this aspect of human behavior. By utilizing MEPS, which is compiled by the Agency for Healthcare Research and Quality (AHRQ), and applying limited dependent variable regression analysis, namely a probit model, the goal of this analysis is to predict the probability of a person being a smoker.

1.1 Problem of Interest

The ability to predict the behavior of certain individuals is of paramount importance to statisticians, econometricians and economists. Smoking is unique in that it is a form of behavior that is extremely restricted. Almost every aspect of smoking is restricted by government, groups and individuals, via the use of cultural rules. Giving someone the power to foretell who is a smoker based on a particular set of characteristics would create enormous benefits in the form of reduced transaction costs and greater efficiency. Improvements in the provision of healthcare and insurance could be realized. Healthcare providers could make informed and optimal decisions as opposed to decisions in the face of uncertainty. Another interesting result would be lower transaction costs in the search for personal relationships.
Individuals could lower their search costs as well as make informed decisions. Hence, the power of a model that predicts the likelihood that someone is a smoker would be invaluable.

1.2 Application of Limited Dependent Variable Methods

Chester Ittner Bliss (1934), a biologist, first introduced the notion of a probit. Bliss was concerned with the treatment of a particular type of data. Specifically, Bliss (1934) sought to express the percentage of organisms killed by pesticides. Maddala (1983, p. 22) notes that Goldberger (1964) developed the probit analysis model. Theoretically, an indicator variable, $J_i$, is observed instead of $Y_i$, which is an unobserved (latent) dependent variable. Recall, an econometrician is faced with a classical regression model subject to qualitative observation of the dependent variable. In the context of classical regression, $Y_i$ is observable in the following model: $Y_i = \alpha_i + X_i\beta + \epsilon_i$. This is not the case with the current problem of interest. Here, $Y_i$ is not observable. Only the indicator $J_i$ is observed, where $J_i = 1$ if the individual currently smokes and $J_i = 0$ otherwise. The binary choice model, $Y_i = X_i\beta - \sigma\epsilon_i$, is needed to analyze this problem. Estimation of the binary choice model requires the establishment of the relationship between $J_i$ and $X_i$. Recall, $J_i = 1$ if and only if $Y_i > 0$. This implies that $X_i\beta - \sigma\epsilon_i > 0$. Solving for $\epsilon_i$ yields $\epsilon_i < X_i\beta/\sigma$. Therefore,

$$P(J_i = 1) = P\left(\epsilon_i < \frac{X_i\beta}{\sigma}\right) = F\left(\frac{X_i\beta}{\sigma}\right) \tag{1}$$

which implies that

$$P(J_i = 0) = 1 - F\left(\frac{X_i\beta}{\sigma}\right). \tag{2}$$

The indicator $J_i$ takes the value of 1 or 0. Thus, the density function for $J_i$ is:

$$f(J_i) = \left[F\left(\frac{X_i\beta}{\sigma}\right)\right]^{J_i} \left[1 - F\left(\frac{X_i\beta}{\sigma}\right)\right]^{1 - J_i}. \tag{3}$$

The parameters $\beta$ and $\sigma$ are not separately identified; however, $\delta = \sigma^{-1}\beta$ is identified. Using this fact, the log-likelihood function for the binary choice model is:

$$\ln L(\delta) = \sum_{i=1}^{n} \left\{ J_i \ln F(X_i\delta) + (1 - J_i) \ln\left[1 - F(X_i\delta)\right] \right\}. \tag{4}$$

The dependent variable, $smoke_i$, takes on discrete values; it is an indicator for individuals who currently smoke. One can infer from equations (1) and (2) that a binary choice model allows for a clear statement of the relationship between the observed indicator, $smoke_i$, and the regressors. This does not occur in the context of the classical regression model. Hence, limited dependent variable methods must be used to predict the likelihood of an individual being a smoker.

2 DISCUSSION OF MODEL

A binary choice model, specifically a probit model, is employed to derive the probability that someone smokes. Using the data, a probit model is constructed:

$$
\begin{aligned}
P(smoke_i = 1) = \Phi\big(&\alpha + \beta_{sex}\,sex + \beta_{age}\,age + \beta_{race_1}\,race_1 + \beta_{race_2}\,race_2 + \beta_{race_3}\,race_3 \\
&+ \beta_{race_4}\,race_4 + \beta_{race_5}\,race_5 + \beta_{married}\,married + \beta_{ged}\,ged + \beta_{hidipl}\,hidipl \\
&+ \beta_{bach}\,bach + \beta_{mastr}\,mastr + \beta_{medcare}\,medcare + \beta_{hrwg}\,hrwg + \beta_{hourwk}\,hourwk \\
&+ \beta_{inscov}\,inscov + \beta_{risk_1}\,risk_1 + \beta_{risk_2}\,risk_2 + \beta_{risk_3}\,risk_3 + \beta_{risk_4}\,risk_4 \\
&+ \beta_{region1}\,region1 + \beta_{region2}\,region2 + \beta_{region3}\,region3\big)
\end{aligned}
\tag{5}
$$

The following variables, which were extracted from the MEPS panel, are purported to have explanatory power for the decision to smoke: sex (gender), age, race, marital status, education (in terms of highest degree obtained), routine medical care, hourly wage, hours worked per week, health insurance coverage, willingness to take risks and location in the U.S. Descriptive statistics are provided in Table 1.
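As a minimal sketch of how the log-likelihood in equation (4) can be evaluated once F is taken to be the standard normal distribution function, as in the probit specification of equation (5), the following uses only the Python standard library. The function names `norm_cdf` and `probit_loglik` and the four-observation toy data set are assumptions made for illustration; they are not taken from the paper or from MEPS.

```python
import math

def norm_cdf(z):
    """Standard normal distribution function, Phi(z), via the error function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def probit_loglik(delta, X, J):
    """Equation (4) with F = Phi.

    delta: list of coefficients; X: list of regressor rows; J: list of 0/1 outcomes.
    """
    total = 0.0
    for x_i, j_i in zip(X, J):
        index = sum(d * x for d, x in zip(delta, x_i))  # X_i * delta
        p = norm_cdf(index)
        p = min(max(p, 1e-12), 1.0 - 1e-12)  # guard against log(0)
        total += j_i * math.log(p) + (1 - j_i) * math.log(1.0 - p)
    return total

# Toy data, made up for illustration: a constant and one regressor.
X = [[1.0, 0.0], [1.0, 1.0], [1.0, 2.0], [1.0, 3.0]]
J = [0, 0, 1, 1]

ll_zero = probit_loglik([0.0, 0.0], X, J)   # all coefficients set to zero
ll_fit = probit_loglik([-1.5, 1.0], X, J)   # a hand-picked alternative
print(ll_zero, ll_fit)
```

A maximum likelihood routine would search over `delta` for the value that makes this function as large as possible; the hand-picked coefficients here only show that a vector fitting the toy data better than zero yields a higher log-likelihood.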

2.1 Regressors

A discussion of the regressors and their implications for an individual's choice to smoke is in order. Gender and age are believed to play an ambiguous role. This is due to the fact that men and women of all ages smoke. Race 1, race 2, race 3, race 4 and race 5 are dummy variables that indicate whether persons are White, Black, American Indian/Alaska Native, Asian or Native Hawaiian/Pacific Islander, respectively. These are intriguing variables in the sense that different races and cultures accept smoking, or at least perceive it, differently. An indicator is included in the model for marital status. Marriage is thought to be an important factor when an individual decides to smoke. Spouses can influence their significant other, especially with respect to decisions regarding health. The variable for highest degree obtained was decomposed into five binary variables. A higher degree should be associated with an individual who is more health conscious. Whether or not an individual obtains routine medical care and currently maintains health insurance coverage are important determinants. These determinants are represented by the variables medcare and inscov, respectively. An individual's employment environment, work week schedule and income can create undue stress and frustration. Hourwk and hrwg are variables that attempt to capture these byproducts of employment and may help explain an individual's decision to smoke. Within this panel of data, AHRQ includes a variable that describes an individual's willingness to take risks. If an individual is willing to take risks, then she should be willing, to some degree, to smoke or be open to smoking. (This statement rests heavily on the assumption that smoking is a risk.) Analogous to the reasoning for the race binaries, there are binary variables for the individual's location within the U.S. A more detailed examination of these variables is conducted in the Data Appendix.
The probit model is a special case where the error terms are independent and identically distributed with mean 0 and variance 1, $\epsilon_i \sim \text{iid } N(0, 1)$. Regarding the binary choice model, this assumption about the error terms implies $F(X_i\delta) = \Phi(X_i\delta)$, where $\Phi(\cdot)$ is the standard normal distribution function. The log-likelihood function is now simply:

$$\ln L(\delta) = \sum_{i=1}^{n} \left\{ J_i \ln \Phi(X_i\delta) + (1 - J_i) \ln\left[1 - \Phi(X_i\delta)\right] \right\} \tag{6}$$

where $J_i$ is the observed indicator $smoke_i$, $X_i$ are the regressors in equation (5) and $\delta$ is the ratio $\sigma^{-1}\beta$.

3 RESULTS

The probit model, equation (5), was estimated. Results from this analysis can be found in Table 2. Coefficients, standard errors, t-statistics and p-values are reported for the twenty-four regressors. The value of the log-likelihood function is -3534.0952. Probit model estimates can be used to test the joint significance of the regression and the individual significance of the estimates. The following is the null-alternative pair for testing the significance of each coefficient estimate:

$$H_0: \hat{\beta}_i = 0 \qquad H_A: \hat{\beta}_i \neq 0 \qquad \text{for } i = sex, age, \ldots, region3 \tag{7}$$

The significance of each regressor is tested at $\alpha = 0.05$ and $\alpha = 0.01$. Because the alternative hypothesis is two-sided, a two-tailed test is used. Based on the number of observations and number of regressors, $n = 7628$ and $k = 24$, the degrees of freedom (d.f.) are $n - k = 7604$. A level of significance of $\alpha = 0.05$ yields a two-tailed critical value of $\kappa = \pm 1.96$. Based on this $\kappa$, 14 of the 24 regressors are individually significant. These are: sex, race 1, race 2, race 4, married, ged, hidipl, medcare, hrwg, hourwk, inscov, region1, region2 and region3. Choosing $\alpha = 0.01$ yields a two-tailed critical value of $\kappa = \pm 2.57$. At this level, all of the variables mentioned are still significant with the exception of race 1.

To test the joint significance of the regressors, the log-likelihood ratio is employed. The null-alternative hypothesis pair is:

$$H_0: \hat{\beta}_{sex} = \hat{\beta}_{age} = \cdots = \hat{\beta}_{region3} = 0 \qquad H_A: \text{at least one } \hat{\beta}_i \neq 0. \tag{8}$$

Essentially, the null hypothesis states that none of the regressors has explanatory power for the variation of the dependent variable, $smoke_i$. The test is based on the log-likelihood ratio,

$$-2\left[\ln L(\tilde{\delta}) - \ln L(\hat{\delta})\right] \overset{A}{\sim} \chi^2_{k-1}, \tag{9}$$

where $\ln L(\tilde{\delta})$ is the value of the constrained log-likelihood function and $\ln L(\hat{\delta})$ is the value of the unconstrained log-likelihood function, which have respective values of -3817.6241 and -3534.0952. This yields a value of 567.06; hence $\hat{t} = 567.06 \overset{A}{\sim} \chi^2_{k-1}$, where $k - 1 = 23$. The critical value for $\chi^2_{23}$ at $\alpha = 0.05$ is 35.17 ($\kappa_1$) and at $\alpha = 0.01$ it is 41.64 ($\kappa_2$). Thus, since $\hat{t} > \kappa_1$ and $\hat{t} > \kappa_2$, the null hypothesis is rejected. This result implies the regressors are jointly significant at the 5% and 1% levels.

Marginal effects for the probit model were then calculated. Results can be found in Table 3. Recall that marginal effects, in the context of the probit model, are the coefficient vector scaled by the standard normal density. That is,

$$\frac{\partial P(smoke_i = 1)}{\partial X_i^T} = \frac{\partial \Phi(X_i\delta)}{\partial X_i^T} = \phi(X_i\delta)\,\delta \tag{10}$$

For different $i$, $\phi(\cdot)$ varies, so the coefficients are scaled differently across individuals, but always proportionately. The marginal effects allow for a more appropriate analysis when determining the specific effect of a one-unit change in $X_i$ on the probability that $smoke_i = 1$. Table 3 indicates that race 2, race 4, ged and hidipl have a -10%, -15%, 31% and 19% effect on this probability. Specifically, if the value of race 4 changes from 0 to 1, there is a 15% decrease in the probability that the individual is a smoker. Analogously, if race 2 were to change in value from 0 to 1, there would be a 10% decrease in the probability of a person being a smoker. On the other hand, possessing a general equivalence diploma (ged) or a high school diploma (hidipl) causes the probability of a person being a smoker to increase by 31% and 19%, respectively. The remaining regressors have a marginal effect of -3% to 8% on the probability of $smoke_i$ being 1. In terms of demographics, behavior and smoking, these four variables are of primary interest since they have an effect of 10% or greater on $P(smoke_i = 1)$.

After having compiled all of the results, one should note that the binaries for willingness to take risks were not statistically significant at the 5% or 1% level. Furthermore, the marginal effects for risk 1, risk 2 and risk 3 are -3% while that for risk 4 is -1%. This is an interesting result in the sense that one could reasonably assume that an individual's willingness to take risks would be an integral part of predicting whether they are a smoker. The disparity, though, may lie in the fact that the individual does not perceive smoking as risky behavior.

4 CONCLUSION

Predicting the probability of an individual being a smoker could be an invaluable tool for an econometrician's, as well as an economist's and policy analyst's, toolbox. The purpose of this paper was to use a probit model to predict this probability based on several variables that were thought to explain the decision to smoke. The probit model estimation and probit marginal effects yield interesting results. The regressors were jointly significant at both the 5% and 1% levels. Fourteen regressors were individually significant at the 5% level, and all of these except race 1 remained significant at the 1% level. One was able to infer from the probit marginal effects that race 2, race 4, ged and hidipl have the greatest impact on the probability that $smoke_i = 1$.
Having only a high school education substantially raises the probability of an individual being a smoker. In regard to policy analysis, health professionals seeking to curb smoking rates would then know where to direct their efforts. Other explanatory variables may provide better results in the sense of predicting the probability of an individual being a smoker. Based on the results contained in this paper, one might be better off constructing a model using variables that describe an individual's ethnicity and education. However, the model used in this paper may serve as a benchmark for others who wish to pursue an invaluable tool to add to their toolbox.

5 APPENDIX: MEPS DATA

Data for this project are taken from the MEPS HC-090: 2005 Full Year Population Characteristics. According to AHRQ, the data set comprises a nationally representative sample of the civilian non-institutionalized population of the United States. It is compiled annually in rounds and coded by the AHRQ. MEPS consists of two panels of data which were collected in 2005. This model is built using one of these panels, which contains 33,691 observations, or persons, and 1,099 variables. The following variables were extracted from the data: ADSMOK42, SEX, AGE05X, RACE05X, MARRY31X, HIDEG, ADRTCR42, HRWG31X, HOUR31, INSCOV05, ADRISK42 and REGION05. Missing and inapplicable observations were then dropped from this original sample. The process of elimination yielded a total of 7,628 observations.

ADSMOK42 is a binary variable for individuals who currently smoke, which is denoted by $smoke_i$. The variable SEX is a binary for male. INSCOV05 is an indicator for health insurance coverage, both public and private. Important transformations were performed on the already coded variables RACE05X, MARRY31X, HIDEG, ADRTCR42, ADRISK42 and REGION05. It was necessary to decompose these variables into binaries. Specifically, RACE05X was used to create a total of six dummy variables, namely race 1, race 2, race 3, race 4, race 5 and race 6. Within the MEPS panel, MARRY31X is a variable that indicates the marital status of the individual. The categories for this variable are married, widowed, single, divorced, separated, don't know, inapplicable and refused.
To construct the variable married, all observations that are married are coded as one and the other categories are assigned zero. Note that the AHRQ uses metropolitan statistical areas (MSAs) from the U.S. Census to classify individuals in regard to their location within the U.S. Similar procedures were performed to construct the remaining variables from HIDEG, ADRTCR42, ADRISK42 and REGION05. In all, 23 variables are constructed using solely the MEPS panel. Descriptive statistics, including the mean and standard deviation, for these variables are contained in Table 1.
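The recoding steps described in this appendix (dropping missing or inapplicable observations, then decomposing coded variables into binaries) can be sketched as follows. This is an illustration under stated assumptions rather than the procedure actually used for the paper: the helper names `drop_missing` and `dummies`, the four example rows, the convention that negative codes mark missing or inapplicable answers, and the choice of code 1 for "married" are all hypothetical and should be checked against the MEPS codebook.

```python
def drop_missing(rows):
    """Keep only rows in which every coded value is non-negative.

    Assumption: negative codes mark missing, refused or inapplicable answers.
    """
    return [row for row in rows if all(v >= 0 for v in row.values())]

def dummies(values, categories, prefix):
    """Build one 0/1 column per category, e.g. race1, race2, ..."""
    return {
        f"{prefix}{c}": [1 if v == c else 0 for v in values]
        for c in categories
    }

# Hypothetical respondents; the codes are made up for illustration.
rows = [
    {"MARRY31X": 1, "RACE05X": 1},
    {"MARRY31X": 5, "RACE05X": 2},
    {"MARRY31X": -1, "RACE05X": 4},  # dropped: negative code
    {"MARRY31X": 3, "RACE05X": 4},
]

clean = drop_missing(rows)

MARRIED_CODE = 1  # assumption: the code that denotes "married"
married = [1 if r["MARRY31X"] == MARRIED_CODE else 0 for r in clean]

# Six race binaries, mirroring race 1 through race 6 in the text.
race = dummies([r["RACE05X"] for r in clean], range(1, 7), "race")
print(married, race["race4"])
```

Each retained respondent has exactly one race binary equal to one, which is why one category must be left out of the regression when a constant is included.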

6 TABLES

Table 1: Descriptive Statistics

Variable          Mean     Std. Deviation
smoke (d.v.)      0.20     0.40
age               40.80    12.77
sex (d.v.)        0.48     0.50
married (d.v.)    0.58     0.49
race 1 (d.v.)     0.77     0.42
race 2 (d.v.)     0.16     0.37
race 3 (d.v.)     0.01     0.09
race 4 (d.v.)     0.05     0.21
race 5 (d.v.)     0.01     0.07
region1 (d.v.)    0.16     0.37
region2 (d.v.)    0.23     0.42
region3 (d.v.)    0.38     0.49
ged (d.v.)        0.06     0.23
hidipl (d.v.)     0.63     0.48
bach (d.v.)       0.22     0.41
mastr (d.v.)      0.08     0.28
hourwk            38.94    11.06
hrwg              17.24    10.88
inscov (d.v.)     0.87     0.34
medcare (d.v.)    0.63     0.48
risk 1 (d.v.)     0.38     0.49
risk 2 (d.v.)     0.24     0.43
risk 3 (d.v.)     0.15     0.35
risk 4 (d.v.)     0.18     0.38

Table 2: Probit Model Estimation

Regressor    Coefficient    Std. Error    t-stat    Prob > |t|
Con          -1.42          0.26          -5.47     0.00
sex           0.14          0.04           3.84     0.00
age           0.002         0.002          1.47     0.14
race 1       -0.28          0.13          -2.13     0.03
race 2       -0.41          0.14          -3.03     0.00
race 3       -0.09          0.21          -0.43     0.66
race 4       -0.58          0.16          -3.55     0.00
race 5        0.16          0.25           0.62     0.53
married      -0.15          0.04          -4.08     0.00
ged           1.21          0.20           6.17     0.00
hidipl        0.74          0.19           3.94     0.00
bach          0.29          0.19           1.53     0.13
mastr         0.12          0.20           0.61     0.55
medcare      -0.14          0.04          -3.84     0.00
hrwg         -0.008         0.002         -3.51     0.00
hourwk        0.01          0.001          6.10     0.00
inscov       -0.15          0.05          -3.00     0.00
risk 1       -0.11          0.08          -1.40     0.16
risk 2       -0.11          0.08          -1.30     0.19
risk 3       -0.10          0.08          -1.15     0.25
risk 4       -0.05          0.08          -0.55     0.58
region1       0.23          0.06           3.95     0.00
region2       0.30          0.05           5.78     0.00
region3       0.16          0.05           3.32     0.00
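The individual and joint significance tests of Section 3 can be rechecked from the t-statistics in Table 2 and the two log-likelihood values quoted in the text. The sketch below hard-codes those reported numbers, together with the critical values stated in the paper (1.96, 2.57, 35.17 and 41.64), instead of computing them from a distribution. The variable and function names are illustrative, and the constant is left out of the count, which is an assumption about how the paper arrives at its list of significant regressors.

```python
# Reported t-statistics from Table 2 (constant excluded).
t_stats = {
    "sex": 3.84, "age": 1.47, "race1": -2.13, "race2": -3.03, "race3": -0.43,
    "race4": -3.55, "race5": 0.62, "married": -4.08, "ged": 6.17,
    "hidipl": 3.94, "bach": 1.53, "mastr": 0.61, "medcare": -3.84,
    "hrwg": -3.51, "hourwk": 6.10, "inscov": -3.00, "risk1": -1.40,
    "risk2": -1.30, "risk3": -1.15, "risk4": -0.55, "region1": 3.95,
    "region2": 5.78, "region3": 3.32,
}

def significant(stats, critical):
    """Names whose |t| exceeds the two-tailed critical value."""
    return sorted(name for name, t in stats.items() if abs(t) > critical)

sig_5 = significant(t_stats, 1.96)  # alpha = 0.05
sig_1 = significant(t_stats, 2.57)  # alpha = 0.01, critical value as in the text

# Likelihood-ratio statistic, equation (9).
lnL_constrained = -3817.6241
lnL_unconstrained = -3534.0952
lr_stat = -2.0 * (lnL_constrained - lnL_unconstrained)

# Chi-square critical values with 23 d.f., as quoted in the text.
CHI2_23_5PCT = 35.17
CHI2_23_1PCT = 41.64
reject_5 = lr_stat > CHI2_23_5PCT
reject_1 = lr_stat > CHI2_23_1PCT

print(len(sig_5), len(sig_1), round(lr_stat, 2), reject_5, reject_1)
```

Comparing `sig_5` with `sig_1` shows which regressors lose significance when the level is tightened from 5% to 1%.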

Table 3: Probit Marginal Effects

Regressor    Marginal    Std. Error    t-stat    Prob > |t|
Con          -0.37       0.07          -5.50     0.00
sex           0.04       0.01           3.84     0.00
age           0.0006     0.0004         1.47     0.14
race 1       -0.07       0.03          -2.13     0.03
race 2       -0.11       0.04          -3.03     0.00
race 3       -0.02       0.06          -0.43     0.66
race 4       -0.15       0.04          -3.56     0.00
race 5        0.04       0.07           0.62     0.53
married      -0.04       0.01          -4.08     0.00
ged           0.32       0.05           6.20     0.00
hidipl        0.19       0.05           3.96     0.00
bach          0.08       0.05           1.54     0.12
mastr         0.03       0.05           0.61     0.55
medcare      -0.04       0.01          -3.85     0.00
hrwg         -0.002      0.001         -3.52     0.00
hourwk        0.003      0.0004         6.10     0.00
inscov       -0.04       0.01          -2.99     0.00
risk 1       -0.03       0.02          -1.40     0.16
risk 2       -0.03       0.02          -1.30     0.19
risk 3       -0.03       0.02          -1.15     0.25
risk 4       -0.01       0.02          -0.55     0.58
region1       0.06       0.01           3.96     0.00
region2       0.08       0.01           5.78     0.00
region3       0.04       0.01           3.32     0.00
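Equation (10) states that the marginal effects are the coefficients scaled by the standard normal density evaluated at the index. The paper does not say at which point the density is evaluated, so the sketch below assumes evaluation at the sample means from Table 1, a common convention, and uses the rounded coefficients printed in Table 2. The dictionary names are illustrative, and because the inputs are rounded the output should only be expected to land near the entries of Table 3, not to reproduce them exactly.

```python
import math

def norm_pdf(z):
    """Standard normal density, phi(z)."""
    return math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)

# Coefficients from Table 2 (rounded as printed).
coef = {
    "con": -1.42, "sex": 0.14, "age": 0.002, "race1": -0.28, "race2": -0.41,
    "race3": -0.09, "race4": -0.58, "race5": 0.16, "married": -0.15,
    "ged": 1.21, "hidipl": 0.74, "bach": 0.29, "mastr": 0.12,
    "medcare": -0.14, "hrwg": -0.008, "hourwk": 0.01, "inscov": -0.15,
    "risk1": -0.11, "risk2": -0.11, "risk3": -0.10, "risk4": -0.05,
    "region1": 0.23, "region2": 0.30, "region3": 0.16,
}

# Sample means from Table 1; the constant enters with the value 1.
mean = {
    "con": 1.0, "sex": 0.48, "age": 40.80, "race1": 0.77, "race2": 0.16,
    "race3": 0.01, "race4": 0.05, "race5": 0.01, "married": 0.58,
    "ged": 0.06, "hidipl": 0.63, "bach": 0.22, "mastr": 0.08,
    "medcare": 0.63, "hrwg": 17.24, "hourwk": 38.94, "inscov": 0.87,
    "risk1": 0.38, "risk2": 0.24, "risk3": 0.15, "risk4": 0.18,
    "region1": 0.16, "region2": 0.23, "region3": 0.38,
}

# Index X*delta at the sample means (assumed evaluation point).
index_at_means = sum(coef[name] * mean[name] for name in coef)

# Equation (10): phi(X*delta) * delta, constant omitted from the output.
scale = norm_pdf(index_at_means)
marginal = {name: scale * b for name, b in coef.items() if name != "con"}

for name in ("race2", "race4", "ged", "hidipl"):
    print(f"{name}: {marginal[name]:+.3f}")
```

For binary regressors, an alternative is to compute the change in the predicted probability when the variable switches from 0 to 1; the density-scaling rule above is the approximation implied by equation (10).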

REFERENCES

Bliss, C. I. (1934). The Method of Probits. Science, 79, 38-39.

Goldberger, A. S. (1964). Econometric Theory. New York: Wiley.

Maddala, G. S. (1983). Limited-Dependent and Qualitative Variables in Econometrics. New York: Cambridge University Press.