Quasi-Experimental Methods. Technical Track


Quasi-Experimental Methods. Technical Track. East Asia Regional Impact Evaluation Workshop, Seoul, South Korea. Joost de Laat, World Bank

IE Methods Toolbox: Randomized Assignment, Discontinuity Design, Difference-in-Differences, Matching

Discontinuity Design. Many social programs (anti-poverty programs, pensions, education, agriculture) select beneficiaries using an index or score: targeted to households below a given poverty index/income; targeted to population above a certain age; scholarships targeted to students with high scores on standardized tests; fertilizer programs targeted to small farms (less than a given number of hectares).

Example: Effect of a scholarship program on school attendance. Goal: improve school attendance for poor students. Method: households with an asset score (Pa) ≤ 50 are poor; households with an asset score (Pa) > 50 are not poor. Intervention: poor households receive scholarships to send children to school.

[Figure: enrollment rate (0 to 1) plotted against the asset score (0 to 100), poor vs. non-poor households]

Regression Discontinuity Design - Baseline: eligible vs. not eligible households around the cut-off.

Regression Discontinuity Design - Post-Intervention: the jump in the outcome at the cut-off is the IMPACT.

For a Discontinuity Design you need: 1) a continuous eligibility index (e.g. income), and 2) a clearly defined cut-off. Households with a score ≤ cutoff are eligible; households with a score > cutoff are not eligible (or vice versa).

Example: Progresa CCT. Eligibility for Progresa is based on a national poverty index. A household is poor if its score ≤ 750. Eligibility for Progresa: eligible if score ≤ 750; not eligible if score > 750.

Example of Progresa: score vs. consumption at baseline (no treatment). [Figure: consumption and fitted values plotted against the poverty index (puntaje estimado en focalización), scores 276 to 1294]

Example of Progresa: score vs. consumption, post-intervention period (treatment). [Figure: consumption and fitted values plotted against the poverty index (puntaje estimado en focalización), scores 276 to 1294; estimated impact on consumption (Y) = 30.58**] (**) Significant at 1%.

Example: Cambodia CCT. Eligibility is based on an index of the likelihood of dropping out of school, with 2 cut-off points within each school: applicants with the highest dropout risk were offered a US$60 per year scholarship; applicants with intermediate dropout risk were offered a US$45 per year scholarship; applicants with low dropout risk were not offered a scholarship by the program. [Figure: no scholarship / US$45 scholarship / US$60 scholarship along the likelihood of dropping out of school, separated by Cutoff 1 and Cutoff 2]

Large impact of the US$45 scholarship. [Figure: two panels, "No scholarship versus $45" and "$60 versus $45 scholarship"; probability (0 to 1) against relative ranking (-25 to 25), recipients vs. non-recipients, with the estimate of impact at the cut-off in each panel] Source: Filmer and Schady. 2011. "Does More Cash in Conditional Cash Transfer Programs Always Lead to Larger Impacts on School Attendance?" Journal of Development Economics.

Sharp and Fuzzy Discontinuity. Sharp discontinuity: the discontinuity precisely determines treatment; equivalent to random assignment in a neighborhood of the cut-off. Fuzzy discontinuity: the discontinuity is highly correlated with treatment but does not determine it perfectly; use the assignment rule to estimate the probability of enrollment in the program.

Identification for sharp discontinuity:
y_i = β_0 + β_1·D_i + δ(score_i) + ε_i
D_i = 1 if household i receives the transfer, 0 if household i does not receive the transfer.
δ(score_i) = a function of the score that is continuous around the cut-off point.
Assignment rule under sharp discontinuity: D_i = 1 if score_i ≤ 50, D_i = 0 if score_i > 50.
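To make this concrete, here is a minimal sketch of how the sharp-RD regression above could be run, assuming a pandas DataFrame df with an outcome column "y" and the eligibility score in "score"; these names and the cut-off of 50 are illustrative, not the workshop's own code. δ(score_i) is approximated by a linear trend allowed to differ on each side of the cut-off.

```python
# Sharp-RD sketch (illustrative column names; df assumed to exist)
import statsmodels.formula.api as smf

CUTOFF = 50
df["D"] = (df["score"] <= CUTOFF).astype(int)   # treatment dummy D_i
df["score_c"] = df["score"] - CUTOFF            # center the score at the cut-off

# delta(score_i) approximated by a linear trend, different on each side
rd = smf.ols("y ~ D + score_c + D:score_c", data=df).fit(cov_type="HC1")
print(rd.params["D"])                            # beta_1: the jump in y at the cut-off
```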

Identification for fuzzy discontinuity:
y_i = β_0 + β_1·D_i + δ(score_i) + ε_i
D_i = 1 if household i receives the transfer, 0 if household i does not receive the transfer.
However, some who are not eligible take up the program, and some who are eligible do not. The reasons why they do so could be correlated with the outcome of interest.

Estimation for fuzzy discontinuity. In a first regression, use the score to predict whether the individual takes up the program or not:
D_i = γ_0 + γ_1·1(score_i > 50) + η_i
where 1(score_i > 50) is a dummy variable. In the second equation, use the predicted value of enrollment, D̂_i, rather than actual enrollment:
y_i = β_0 + β_1·D̂_i + δ(score_i) + ε_i
where δ(score_i) is a continuous function of the score.
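A rough two-stage sketch of this fuzzy-RD logic follows, assuming the same illustrative DataFrame df plus an observed take-up dummy "D". In practice a proper 2SLS routine should be used so that the second-stage standard errors are corrected; this only shows the mechanics of predicting take-up from the assignment rule and then using the prediction.

```python
# Fuzzy-RD two-stage sketch (illustrative names; standard errors not corrected)
import statsmodels.formula.api as smf

CUTOFF = 50
df["above"] = (df["score"] > CUTOFF).astype(int)   # instrument: 1(score_i > 50)
df["score_c"] = df["score"] - CUTOFF

# First stage: predict take-up from the assignment rule
first = smf.ols("D ~ above + score_c", data=df).fit()
df["D_hat"] = first.fittedvalues

# Second stage: regress the outcome on predicted take-up
second = smf.ols("y ~ D_hat + score_c", data=df).fit()
print(second.params["D_hat"])                       # beta_1 for compliers near the cut-off
```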

Example: Social assistance to the unemployed. Low social assistance payments to individuals under 30; higher payments for individuals over 30. What is the effect of increased social assistance on employment? (Lemieux & Milligan, 2008)

Advantages of RDD for evaluation: Yields an unbiased estimate of the treatment effect at the discontinuity. Can take advantage of a known rule for assigning the benefit; this is common in the design of social interventions. No need to exclude a group of eligible households/individuals from treatment.

Potential disadvantages of RDD. Local average treatment effects: we estimate the effect of the program around the cut-off point; this is not always generalizable. Power: the effect is estimated at the discontinuity, so we generally have fewer observations than in a randomized experiment with the same sample size. Specification can be sensitive to functional form: make sure the relationship between the assignment variable and the outcome variable is correctly modeled, including (1) nonlinear relationships and (2) interactions.

False RDD

Keep in Mind: Discontinuity Design. Requires continuous eligibility criteria with a clear cut-off. Gives an unbiased estimate of the treatment effect: observations just across the cut-off are good comparisons. No need to exclude a group of eligible households/individuals from treatment. Can sometimes be used for programs that are already ongoing.

IE Methods Toolbox: Randomized Assignment, Discontinuity Design, Difference-in-Differences, Matching

Difference-in-Differences - Outline: 1. What is Difference-in-Differences (diff-in-diff)? 2. Weaknesses. 3. Test for strength of internal validity. 4. When to use.

What is Difference-in-Differences (diff-in-diff)? Compare the change in outcomes for those that enrolled in the program with the change in outcomes for those that did not enroll in the program. If we cannot randomize, can we try to mimic randomization? Natural experiments: unexpected change in policy, natural disasters. Exploit variation in policies across time and space.

Group affected by the policy change (treatment): After the program start = Y_1|D=1, Before the program start = Y_0|D=1, Difference = (Y_1|D=1) - (Y_0|D=1)
Group not affected by the policy change (comparison): After = Y_1|D=0, Before = Y_0|D=0, Difference = (Y_1|D=0) - (Y_0|D=0)
DD = [(Y_1|D=1) - (Y_0|D=1)] - [(Y_1|D=0) - (Y_0|D=0)]

Difference-in-differences (Diff-in-diff). Y = school attendance, P = girls' scholarship program.
Enrolled (T): After (1) = 0.74, Before (0) = 0.60, Difference = +0.14
Not Enrolled (C): After (1) = 0.81, Before (0) = 0.78, Difference = +0.03
Diff-in-Diff: Impact = (Y_T1 - Y_T0) - (Y_C1 - Y_C0) = 0.14 - 0.03 = 0.11

Impact = (A-B) - (C-D) = (A-C) - (B-D). [Figure: school attendance over time from t=0 to t=1; enrolled rises from B=0.60 to A=0.74, not enrolled from D=0.78 to C=0.81; similar trends before the program; Impact = 0.11]

Example of Progresa: consumption (Y). (Columns: Enrolled, Not Enrolled, Difference)
Follow-up (t=1) consumption (Y): 268.75, 290, -21.25
Baseline (t=0) consumption (Y): 233.47, 281.74, -48.27
Difference: 35.28, 8.26, 27.02
Estimated impact on consumption (Y): Linear Regression 27.06**; Multivariate Linear Regression 25.53**
Note: If the effect is statistically significant at the 1% significance level, we label the estimated impact with 2 stars (**).

Progresa Policy Recommendation? Impact of Progresa on Consumption (Y):
Case 1: Before & After 34.28**
Case 2: Enrolled & Not Enrolled -4.15
Case 3: Randomized Assignment 29.75**
Case 4: Randomized Promotion 30.4**
Case 5: Discontinuity Design 30.58**
Case 6: Difference-in-Differences 25.53**
Note: If the effect is statistically significant at the 1% significance level, we label the estimated impact with 2 stars (**).

Regression (for 2 time periods):
Y_it = α + β·post_t + γ·D_i + δ·(post_t × D_i) + ε_it
Y_it: outcome; post_t: dummy equal to 1 in the second (post-program) period; D_i: treatment dummy; post_t × D_i: equal to 1 for the treatment group in the post period.
DD = [(Y_1|D=1) - (Y_0|D=1)] - [(Y_1|D=0) - (Y_0|D=0)] = δ

Conditional Expectations:
Y_it = α + β·post_t + γ·D_i + δ·(post_t × D_i) + ε_it
Treatment group: E(Y_i1 | D_i = 1) = α + β + γ + δ; E(Y_i0 | D_i = 1) = α + γ
Control group: E(Y_i1 | D_i = 0) = α + β; E(Y_i0 | D_i = 0) = α

Conditional Expectations (continued):
E(Y_i1 - Y_i0 | D_i = 1) - E(Y_i1 - Y_i0 | D_i = 0) = (change in Y for treatment) - (change in Y for control) = (β + δ) - β = δ
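A minimal sketch of this two-period regression, under the assumption of a long-format DataFrame df with illustrative columns "y", "post" (0 before, 1 after), "D" (0 comparison, 1 treatment) and a unit identifier "unit_id", could look as follows; the coefficient on the interaction is the DD estimate δ.

```python
# Two-period difference-in-differences sketch (illustrative column names)
import statsmodels.formula.api as smf

dd = smf.ols("y ~ post + D + post:D", data=df).fit(
    cov_type="cluster", cov_kwds={"groups": df["unit_id"]}  # cluster by unit
)
print(dd.params["post:D"])   # delta: the difference-in-differences estimate
```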

If we have more than 2 time periods/groups: regression with fixed effects for time and group.
Y_it = λ_t + θ_i + δ·D_it + ε_it
D_it indicates treatment in group i and period t; λ_t: year (time) dummies; θ_i: group dummies.
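For the multi-period case, a hedged sketch with group and time fixed effects (again with illustrative column names "y", "D", "group" and "year") might be:

```python
# Generalized DD with group and time fixed effects (illustrative names)
import statsmodels.formula.api as smf

twfe = smf.ols("y ~ D + C(group) + C(year)", data=df).fit(
    cov_type="cluster", cov_kwds={"groups": df["group"]}  # cluster by group (Bertrand et al., 2004)
)
print(twfe.params["D"])   # delta: the DD estimate with fixed effects
```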

Difference-in-Differences - Outline: 1. What is Difference-in-Differences (diff-in-diff)? 2. Weaknesses. 3. Test for strength of internal validity. 4. When to use.

Problem 1: Common trends or shocks across groups. Diff-in-diff is only valid if both groups would have had similar trends without the program. Then the change in observed outcomes for those not enrolled is a good counterfactual. What if attendance for those enrolled would have increased by more than for those not enrolled in any case? VIOLATION OF EQUAL TRENDS!

Same Trend. [Figure: school attendance over time (T=0 to T=1); enrolled B=0.60 to A=0.74, not enrolled D=0.78 to C=0.81; similar trends before the program]

Different Trend. [Figure: school attendance over time (T=0 to T=1) with different trends before the program; diff-in-diff cannot measure the impact of the program]

What if an event affects only one group? Case 1: Training program. Only highly motivated people participated in the program. A new company opens in the village and only the more motivated people apply for a job. Job prospects for those in the treatment group would have improved even in the absence of the training program: DD overestimates the effect of the program. Case 2: Subsidies on fertilizer (weather shocks). Treatment group: subsidized farmers. Control group: unsubsidized farmers. A drought severely affects farmers that use fertilizer: DD underestimates the effect of the program.

Problem 2: Changes in group composition over time. Diff-in-diff requires that we follow the same types of people over time. For example, if all the healthy people drop out of a health-care program because they don't need the treatment, average health outcomes for those in the program are lower at the end of the program: DD underestimates the effect of the program. If all the sick people drop out of a health-care program because they cannot walk to the clinic: DD overestimates the effect of the program.

Considerations. If program impact is heterogeneous across individual characteristics, pre-treatment differences in observed characteristics can create non-parallel outcome dynamics (Abadie, 2005). Similarly, bias would occur when the size of the response depends in a non-linear way on the size of the intervention, and we compare a group with high treatment intensity with a group with low treatment intensity. When outcomes within the unit of time/group are correlated, OLS standard errors understate the standard deviation of the DD estimator (Bertrand et al., 2004).

Difference-in-Differences - Outline: 1. What is Difference-in-Differences (diff-in-diff)? 2. Weaknesses. 3. Test for strength of internal validity. 4. When to use.

Test for Trend. To test this, at least 3 observations in time are needed: 2 observations before and 1 observation after. [Figure: school attendance at t=-1 (before treatment), t=0 (before treatment) and t=1 (after treatment)]

Sensitivity Analysis - Placebo Tests: Use a fake treatment group; it should show no impact. Use a different comparison group; the estimate should still show an impact. Use a different outcome which should not be affected by the program; it should show no impact.
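One of these checks, the fake treatment timing, can be sketched as follows, assuming two pre-program periods (year = -1 and year = 0) are available in the illustrative DataFrame df; if trends were equal before the program, the interaction should be close to zero.

```python
# Placebo / pre-trend sketch using only pre-program years (illustrative names)
import statsmodels.formula.api as smf

pre = df[df["year"].isin([-1, 0])].copy()
pre["fake_post"] = (pre["year"] == 0).astype(int)   # "fake" post dummy in the pre-period

placebo = smf.ols("y ~ fake_post + D + fake_post:D", data=pre).fit()
print(placebo.params["fake_post:D"])   # should be statistically indistinguishable from zero
```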

Example: "Schooling and Labor Market Consequences of School Construction in Indonesia: Evidence from an Unusual Policy Experiment." Esther Duflo, MIT. American Economic Review, September 2001.

Research questions: School infrastructure → educational achievement? Educational achievement → salary level? What is the economic return on schooling?

Program description. 1973-1978: the Indonesian government built 61,000 schools, equivalent to one school per 500 children between 5 and 14 years old. The enrollment rate increased from 69% to 85% between 1973 and 1978. Assignment rule: the number of schools built in each region depended on the number of children out of school in those regions in 1972, before the start of the program.

Identification of the treatment effect. There are 2 sources of variation in the intensity of the program for a given individual. By region: there is variation in the number of schools received in each region. By age: children who were older than 12 years in 1972 did not benefit from the program; the younger a child was in 1972, the more she benefited from the program, because she spent more time in the new schools.

Sources of data: 1995 population census. Individual-level data on birth date, 1995 salary level, and 1995 level of education. The intensity of the building program in the birth region of each person in the sample. Sample: men born between 1950 and 1972.

A first estimation of the impact. Step 1: Let's simplify the problem and estimate the impact of the program. We simplify the intensity of the program: high or low. We simplify the groups of children affected by the program: a young cohort of children who benefited, and an older cohort of children who did not benefit.

Let's look at the average of the outcome variable, years of schooling. (Columns: intensity of the building program, High / Low)
Age 2-6 in 1974 (young cohort): 8.49 / 9.76
Age 12-17 in 1974 (older cohort): 8.02 / 9.40
Difference (young - older): 0.47 / 0.36
DD: 0.12 (0.089)

Let's look at the average of the outcome variable, years of schooling. (Columns: intensity of the building program, High / Low / Difference)
Age 2-6 in 1974 (young cohort): 8.49 / 9.76 / -1.27
Age 12-17 in 1974 (older cohort): 8.02 / 9.40 / -1.39
DD: 0.12 (0.089)

Placebo DD (cf. p. 798, Table 3, panel B). Idea: look for 2 groups whom you know did not benefit, compute a DD, and check whether the estimated effect is 0. If it is NOT 0, we're in trouble.
(Columns: intensity of the building program, High / Low)
Age 12-17 in 1974: 8.02 / 9.40
Age 18-24 in 1974: 7.70 / 9.12
Difference: 0.32 / 0.28
DD: 0.034 (0.098)

Step 2: Let's estimate this with a regression.
S_ijk = c + α_j + β_k + γ·(P_j · T_i) + δ·(C_j · T_i) + ε_ijk
with:
S_ijk = education level of person i in region j in cohort k
P_j = 1 if the person was born in a region with a high program intensity
T_i = 1 if the person belongs to the "young" cohort
C_j = dummy variable for region j
β_k = cohort fixed effect
α_j = district-of-birth fixed effect
ε_ijk = error term for person i in region j in cohort k

Step 3: Let's use additional information. We will use the intensity of the program in each region:
S_ijk = c + α_j + β_k + γ·(P_j · T_i) + δ·(C_j · T_i) + ε_ijk
where P_j = the intensity of building activity in region j and C_j = a vector of regional characteristics.
We estimate the effect of the program for each cohort separately:
S_ijk = c + α_j + β_k + Σ_{l=2}^{23} γ_l·(P_j · d_il) + Σ_{l=2}^{23} δ_l·(C_j · d_il) + ε_ijk
where d_il = a dummy variable for person i belonging to cohort l.
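A rough sketch of this cohort-by-intensity regression, assuming an illustrative DataFrame df with columns "S" (years of schooling), "P" (program intensity in the region of birth), "region" and "cohort", and omitting the regional controls C_j for brevity, could be:

```python
# Cohort-specific program effects with region and cohort fixed effects (illustrative names)
import statsmodels.formula.api as smf

cohort_fx = smf.ols("S ~ C(cohort):P + C(region) + C(cohort)", data=df).fit(
    cov_type="cluster", cov_kwds={"groups": df["region"]}
)
# The C(cohort)[l]:P coefficients trace out the program effect gamma_l per cohort
print(cohort_fx.params.filter(like=":P"))
```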

Program effect per cohort. [Figure: estimated coefficients γ_l plotted against age in 1974]

For y = dependent variable = salary. [Figure: estimated program effect per cohort against age in 1974, with salary as the outcome]

Conclusion. Results: for each school built per 1,000 students, the average educational achievement increased by 0.12-0.19 years and average salaries increased by 2.6-5.4%. Making sure the DD estimation is accurate: a placebo DD gave a 0 estimated effect; use various alternative specifications; check that the impact estimates for each age cohort make sense.

Keep in Mind! Difference-in-Differences combines Enrolled & Not Enrolled with Before & After. Slope: generates the counterfactual for the change in outcomes. FUNDAMENTAL ASSUMPTION: trends (slopes) are the same in treatment and comparison groups. To test this, at least 3 observations in time are needed: 2 observations before and 1 observation after.

IE Methods Toolbox: Randomized Assignment, Discontinuity Design, Difference-in-Differences, Matching

Matching. The group that enrolled is, on average, different from the group that did not enroll. However, some individuals are similar, so we can match similar individuals with each other.

[Figure: enrolled and not-enrolled individuals ranging from very poor to very rich]

Compare Outcomes for Similar People. [Figure: outcomes Y for matched enrolled and not-enrolled individuals across the very poor to very rich range]

More Complicated in Practice. Match on all observable characteristics (e.g. income, gender, education). Comparison group: non-participants with similar characteristics. Create one aggregate propensity score to match on: compute everyone's probability of participating, based on their observable characteristics, and choose matches that have the same probability of participation as the treated.

Density of propensity scores. [Figure: density of the propensity score (0 to 1) for participants and non-participants, with the region of common support]

Estimation strategy. Predict the propensity scores for participants and nonparticipants: since participation status is binary, run a limited dependent variable regression and predict participation status for all units. Common support: restrict the analysis to participants whose P(X)'s overlap with the P(X)'s of nonparticipants.

Estimation strategy. Estimate the treatment effect for each participant by finding the set of nonparticipants with P(X)'s similar to that of the participant, and take the difference between the outcome for the participant and the mean outcome for the similar nonparticipants. Repeat the exercise for all participants. Take the weighted average of the outcome differences across all matched participants to obtain the average treatment effect on the matched treated. Estimate the standard error around the treatment effect for statistical inference.

Finding similar nonparticipants. Different weighting functions can be used to match nonparticipants with P(X)'s similar to the P(X) of the participant: stratification, nearest neighbor, radius, kernel.
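A minimal nearest-neighbor matching sketch on the propensity score (one-to-one, with replacement) is shown below; the participation dummy "D", outcome "y" and the covariate list are illustrative assumptions, not the workshop's own variables.

```python
# Nearest-neighbor propensity score matching sketch (illustrative names)
import numpy as np
import statsmodels.api as sm

X_COLS = ["income", "education", "hh_size"]          # assumed observable characteristics

# 1. Predict everyone's propensity score with a probit on observables
X = sm.add_constant(df[X_COLS])
df["pscore"] = sm.Probit(df["D"], X).fit(disp=0).predict(X)

treated = df[df["D"] == 1]
control = df[df["D"] == 0]

# 2. For each participant, find the nonparticipant with the closest score
dist = np.abs(treated["pscore"].values[:, None] - control["pscore"].values[None, :])
nn = dist.argmin(axis=1)

# 3. Average the matched outcome differences: the ATT on the matched treated
att = (treated["y"].values - control["y"].values[nn]).mean()
print(att)
```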

Main Problems

Problem One: Need Similar People. [Figure: enrolled and not-enrolled individuals across the very poor to very rich range]

Problem Two: Can Only Match on Observables. MATCHING DOES NOT OVERCOME THE SELECTION PROBLEM! What if we can't collect data on people's characteristics that are relevant for program participation and outcomes?

Summary. Requirements for successful matching implementation: data on variables that matter for participation; common support; no selection based on unobservables. Matching can be combined with DID: matching is performed on baseline X's, and DD controls for time-invariant unobservables.

Looking for a Volunteer

Case 7: Progresa Matching (P-Score). Probit regression, Prob(Enrolled=1). Estimated coefficients on baseline characteristics:
Head's age (years): -0.022**
Spouse's age (years): -0.017**
Head's education (years): -0.059**
Spouse's education (years): -0.03**
Head is female=1: -0.067
Indigenous=1: 0.345**
Number of household members: 0.216**
Dirt floor=1: 0.676**
Bathroom=1: -0.197**
Hectares of land: -0.042**
Distance to hospital (km): 0.001*
Constant: 0.664**
Note: If the effect is statistically significant at the 1% significance level, we label the estimated impact with 2 stars (**).

Case 7: Progresa Common Support. [Figure: densities of Pr(Enrolled) for enrolled and non-enrolled households, showing the region of common support]

Case 7: Progresa Matching (P-Score). Estimated impact on consumption (Y): Multivariate Linear Regression 7.06+. Note: If the effect is statistically significant at the 1% significance level, we label the estimated impact with 2 stars (**). If significant at the 10% level, we label the impact with +.

When to use. Use when selection into program participation is based on observable variables. Requirements: understand which variables matter for participation in the program (e.g., program rules); data available on these variables from before units became participants or nonparticipants (baseline data); common support. Best when combined with diff-in-diff. Key assumption: there are no remaining unobservable differences between participants and nonparticipants.

Keep in Mind! Matching requires large samples and good quality data. Matching at baseline can be very useful: know the assignment rule and match based on it; combine with other techniques (e.g. diff-in-diff). Ex-post matching is risky: if there is no baseline, be careful! Matching on endogenous ex-post variables gives bad results.

Progresa Policy Recommendation? Impact of Progresa on Consumption (Y):
Case 1: Before & After 34.28**
Case 2: Enrolled & Not Enrolled -4.15
Case 3: Randomized Assignment 29.75**
Case 4: Randomized Promotion 30.4**
Case 5: Discontinuity Design 30.58**
Case 6: Difference-in-Differences 25.53**
Case 7: Matching 7.06+
Note: If the effect is statistically significant at the 1% significance level, we label the estimated impact with 2 stars (**). If significant at the 10% level, we label the impact with +.


IE Methods Toolbox - Choose Your Method: Randomized Assignment, Discontinuity Design, Difference-in-Differences, Matching + Diff-in-Diff

Where Do Comparison Groups come from? The rules of program operation determine the evaluation strategy. We can almost always find a valid comparison group if: the operational rules for selecting beneficiaries are equitable, transparent and accountable; the evaluation is designed prospectively.

Choosing your IE method(s). The choice depends on money (excess demand vs. no excess demand), targeting (targeted vs. universal), and timing (phased vs. immediate roll-out):
Phased roll-out, excess demand, targeted: (1) Randomized assignment; (4) RDD
Phased roll-out, excess demand, universal: (1) Randomized assignment; (2) Randomized promotion; (3) DD with (5) Matching
Phased roll-out, no excess demand, targeted: (1) Randomized assignment; (4) RDD
Phased roll-out, no excess demand, universal: (1) Randomized assignment to phases; (2) Randomized promotion to early take-up; (3) DD with (5) Matching
Immediate roll-out, excess demand, targeted: (1) Randomized assignment; (4) RDD
Immediate roll-out, excess demand, universal: (1) Randomized assignment; (2) Randomized promotion; (3) DD with (5) Matching
Immediate roll-out, no excess demand, targeted: (4) RDD
Immediate roll-out, no excess demand, universal: if less than full take-up, (2) Randomized promotion; (3) DD with (5) Matching

Test

Q1: What is the shortcoming(s) of difference-in-differences?
A. Those enrolled in the program might have a different trend over time than those not enrolled.
B. It does not have a counterfactual.
C. Sample size might be too small.
D. People who are different from the comparison group might drop out of the program.
E. Both A and C.
F. Both A and D.

Q2: You are evaluating a school management reform program that targets poor schools. You decide to perform a diff-in-diff, comparing target schools with schools that did not receive the program. Over the same period, the government deployed more teachers to poor areas. Would this over-estimate or under-estimate the impact of the program?
A. Over-estimate
B. Under-estimate
C. Neither

Q3: What is the biggest shortcoming of propensity score matching?
A. Cannot match on observable characteristics.
B. Cannot match on unobservable characteristics.
C. Different trends between treatment and comparison groups.

Q4: When is it possible to do a regression discontinuity design?
A. When there is a continuous eligibility criterion with a clear cut-off.
B. When there is a comparison group of people who do not receive the program.
C. When the government randomly assigns some to receive the program and some not.