RANDOMIZED TRIALS Technical Track Session II Sergio Urzua University of Maryland


Randomized trials
o Evidence about counterfactuals often generated by randomized trials or experiments
  o Medical trials
o Eliminates common biases (or confounders) when done properly
  o Selection bias
  o Trends concurrent with intervention
o Therefore, often considered the gold standard of estimating causal impacts

Randomized trials
o Not magic
o Still subject to basic constraints of statistics
  o Need large samples
  o Drop out, non-compliance a problem
o Though not biased, estimated parameters might differ from desired parameters
o Sometimes not politically feasible

Outline
1. Randomization solves selection bias
2. What should be the unit of randomization?
   i. Bias
   ii. Statistical power
   iii. Externalities
3. How do you actually randomize?
4. Stratification (what is it, why do we need it)
5. Difference between random sampling and randomization
6. Other issues
   i. Attrition
   ii. Compliance (both for subjects and implementers)
   iii. Estimated parameters
7. Non-randomized methods

Randomized trials overcome potential confounders
o Let's return to earlier examples:
  o Health insurance
  o Conditional cash transfers
o Bias 1: Selection bias
  o Participants might be innately different from nonparticipants
o Consider a simple lottery
  o Take all eligible people in population of interest
  o Place all names on slips of paper in a jar
  o Pick half of the slips of paper out of the jar
  o Chosen names get intervention, those not chosen do not
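The jar lottery described above can be sketched in a few lines. This is only an illustration: the list of names, the function name `simple_lottery`, and the seed are assumptions, not part of the original slides.

```python
import random

def simple_lottery(eligible, seed=None):
    """Pick half of the eligible units at random for the intervention."""
    rng = random.Random(seed)
    slips = list(eligible)          # one slip of paper per eligible unit
    rng.shuffle(slips)              # mix the slips in the jar
    half = len(slips) // 2
    treatment = set(slips[:half])   # names drawn from the jar
    comparison = set(slips[half:])  # names left in the jar
    return treatment, comparison

# Hypothetical eligible population of ten people.
people = [f"person_{i}" for i in range(10)]
treatment, comparison = simple_lottery(people, seed=2024)
print(sorted(treatment))
print(sorted(comparison))
```

Every eligible unit has the same chance of being drawn, which is what makes the two groups comparable on average.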

Bias 1: Selection bias
[Figure: eligible population shown as a grid of units]
o Green = treatment (with intervention)
o Pink = comparison (without intervention)
o Assume this array represents geographical spread of sample population
o Should average characteristics differ across treatment and comparison groups prior to the intervention?
  o No.

Bias 1: Selection bias
o Average characteristics should be the same (in expectation) for treatment and comparison groups prior to the intervention
  o Expenditure
  o Health status
  o Motivation to send children to school
  o Fear of dogs
  o Everything!
o So prior to a health insurance intervention, average expenditure (ē) should be identical, up to sampling noise, in treatment and comparison groups
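A quick balance check makes this concrete. The sketch below uses simulated baseline expenditure (the mean of 100 and standard deviation of 20 are made-up values) and compares group averages after a lottery.

```python
import random
from statistics import mean

rng = random.Random(0)

# Hypothetical baseline data: one pre-intervention expenditure value per unit.
expenditure = [rng.gauss(100, 20) for _ in range(10_000)]

# Lottery: shuffle the unit indices and treat the first half.
ids = list(range(len(expenditure)))
rng.shuffle(ids)
treated = set(ids[: len(ids) // 2])

mean_t = mean(e for i, e in enumerate(expenditure) if i in treated)
mean_c = mean(e for i, e in enumerate(expenditure) if i not in treated)
baseline_gap = mean_t - mean_c
print(f"treatment {mean_t:.2f}, comparison {mean_c:.2f}, gap {baseline_gap:.2f}")
```

The gap is not exactly zero in any single draw. It is zero in expectation and shrinks as the number of randomized units grows.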

Bias 2: Common trends
[Figure: eligible population shown as a grid of units, all exposed to heavy rains or another program]
o Green = treatment (with intervention)
o Pink = comparison (without intervention)

Bias 2: Common trends
o When treated units selected randomly, rain shock common to both treatment and comparison groups
o What happens when we look at health expenditures of both groups after the intervention?
  o Average outcome for treatment group = ē + impact of health insurance + impact of rains
  o Average outcome for comparison group = ē + impact of rains
  o Difference between treatment and comparison
    = [ē + impact of health insurance + impact of rains] - [ē + impact of rains]
    = impact of health insurance
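The cancellation of the common shock can be checked numerically. In this sketch the insurance impact (15) and the rain impact (-8) are assumed values chosen only for illustration.

```python
import random
from statistics import mean

rng = random.Random(1)
IMPACT_INSURANCE = 15.0   # assumed true effect of health insurance
IMPACT_RAINS = -8.0       # assumed shock that hits every unit

n = 10_000
baseline = [rng.gauss(100, 20) for _ in range(n)]   # pre-intervention expenditure
assigned = [i < n // 2 for i in range(n)]
rng.shuffle(assigned)                               # random assignment to treatment

# Everyone gets the rain shock; only treated units get the insurance impact.
outcome = [
    e + IMPACT_RAINS + (IMPACT_INSURANCE if t else 0.0)
    for e, t in zip(baseline, assigned)
]

mean_t = mean(y for y, t in zip(outcome, assigned) if t)
mean_c = mean(y for y, t in zip(outcome, assigned) if not t)
estimate = mean_t - mean_c
print(f"estimated impact of insurance: {estimate:.2f}")
```

The rain shock lowers both group averages by the same amount, so it drops out of the difference and the estimate lands near the assumed insurance impact.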

Randomization and selection bias more generally
$$
\begin{aligned}
& E_U[Y_1(u) \mid D=1] - E_U[Y_0(u) \mid D=0] \\
&= E_U[Y_1(u) \mid D=1] - E_U[Y_0(u) \mid D=0] + E_U[Y_0(u) \mid D=1] - E_U[Y_0(u) \mid D=1] \\
&= E_U[Y_1(u) - Y_0(u) \mid D=1] + \big( E_U[Y_0(u) \mid D=1] - E_U[Y_0(u) \mid D=0] \big)
\end{aligned}
$$
o Selection bias (the term in parentheses): Difference in average untreated outcomes between treatment and comparison groups

Randomization solves selection bias
o Randomization ensures that
  o Treatment and comparison groups differ in expectation only through exposure to treatment
  o Therefore, in absence of treatment, outcomes should have been the same for both groups
  o Therefore, $E_U[Y_0(u) \mid D=1] - E_U[Y_0(u) \mid D=0] = 0$
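The decomposition of the observed difference into an effect on the treated plus a selection-bias term can be illustrated with simulated potential outcomes. Everything here is assumed for illustration: the outcome distribution, a constant effect of 10, and a self-selection rule under which units with high untreated outcomes are more likely to enrol.

```python
import random
from statistics import mean

def decompose(y0, y1, d):
    """Return (observed difference, effect on the treated, selection bias)."""
    observed = (mean(b for b, t in zip(y1, d) if t)
                - mean(a for a, t in zip(y0, d) if not t))
    effect_on_treated = mean(b - a for a, b, t in zip(y0, y1, d) if t)
    selection_bias = (mean(a for a, t in zip(y0, d) if t)
                      - mean(a for a, t in zip(y0, d) if not t))
    return observed, effect_on_treated, selection_bias

rng = random.Random(7)
n = 20_000
y0 = [rng.gauss(50, 10) for _ in range(n)]   # untreated potential outcome Y_0(u)
y1 = [a + 10 for a in y0]                    # treated potential outcome Y_1(u)

# Self-selection: units with high Y_0 are more likely to enrol (D = 1).
self_selected = [a + rng.gauss(0, 5) > 50 for a in y0]
# Randomization: a coin flip that ignores Y_0.
randomized = [rng.random() < 0.5 for _ in range(n)]

obs_s, att_s, bias_s = decompose(y0, y1, self_selected)
obs_r, att_r, bias_r = decompose(y0, y1, randomized)
print(f"self-selected: observed {obs_s:.2f} = effect {att_s:.2f} + bias {bias_s:.2f}")
print(f"randomized:    observed {obs_r:.2f} = effect {att_r:.2f} + bias {bias_r:.2f}")
```

Under self-selection the bias term is large and the observed difference overstates the effect. Under the coin flip the bias term is close to zero.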

Randomization solves selection bias
o Since selection bias is equal to zero, the coefficient $\beta$ on T (an indicator for D=1) is an unbiased estimator of treatment impact in the regression
  $y_u = \alpha + \beta T_u + \varepsilon_u$
o Control variables
  o Should not affect bias since in expectation treatment and comparison groups should be balanced on controls
  o Can increase precision of estimated impact
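With a single binary regressor, the least-squares slope on the treatment indicator is the same number as the difference in group means. The sketch below checks this on simulated data. The intercept of 20, the impact of 4, and the noise level are assumptions for illustration.

```python
import random
from statistics import mean

rng = random.Random(3)
n = 2_000
T = [1 if rng.random() < 0.5 else 0 for _ in range(n)]   # random assignment indicator
y = [20 + 4 * t + rng.gauss(0, 5) for t in T]            # assumed true impact of 4

# Ordinary least squares of y on a constant and T, computed by hand.
t_bar, y_bar = mean(T), mean(y)
slope = (sum((t - t_bar) * (v - y_bar) for t, v in zip(T, y))
         / sum((t - t_bar) ** 2 for t in T))
intercept = y_bar - slope * t_bar

# Simple difference in means between treatment and comparison.
diff_in_means = (mean(v for t, v in zip(T, y) if t == 1)
                 - mean(v for t, v in zip(T, y) if t == 0))

print(f"OLS slope {slope:.3f}, difference in means {diff_in_means:.3f}")
```

Adding baseline controls to such a regression would not be needed for unbiasedness, but it can tighten the standard error of the slope.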

Can this be done in practice?
o A few examples implemented in developing countries
  o Textbooks, deworming drugs, contract teachers, performance pay for teachers, merit-based scholarships, HIV/AIDS education, school uniforms, health insurance, conditional cash transfers, vouchers to learn HIV results, vouchers for private school, iron supplementation, information about returns to schooling, gender/caste of village leader, fertilizer, micro-credit, school report cards, community score cards, school-based management, school meals, savings products, computers in the classroom, interest rates, prices for malaria medicines, prices for mosquito nets, ...
o See websites of SIEF, Poverty Action Lab, Innovations for Poverty Action and Development Impact for more information on studies

The unit of randomization: Why it matters so much

Unit of randomization
o Determines
  1. Extent to which randomization solves selection bias
  2. Statistical power
  3. Ability to measure externalities

Unit of randomization and bias
o Extreme example
  o 1 treatment district and 1 comparison district
  o What happens if only 1 district suffers a shock (positive or negative)?
    o Cannot disentangle treatment effect and effect of shock
  o Treatment and comparison district unlikely to be balanced on average traits (law of large numbers cannot apply)
o These concerns still apply when N_Treatment = 5 and N_Comparison = 5

Unit of randomization and statistical power
o When do we have enough units?
o Depends on
  o Underlying variance of outcome of interest both across units and within units
  o If underlying variance is high, will need a large sample to separate signal (treatment impact) from noise
  o The more correlated units are within the unit of randomization (e.g. households within a village), the more the number of randomization units becomes the effective sample size
o Too few units can lead to low statistical power
  o Perhaps the true treatment impact is non-zero, but your estimates are so noisy (imprecise) that you cannot distinguish them from zero
  o Will not learn anything useful from impact evaluation
  o "Impact could be a 50% improvement or it could be zero. I can't really tell."
o Therefore, large geographical units not ideal candidates for unit of randomization
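One common way to quantify the point about within-unit correlation is the design effect for equal-sized clusters, 1 + (m - 1) * rho, where m is the number of observations per cluster and rho is the intra-cluster correlation. This formula is not given in the slides. It is a standard approximation, used here as an assumption, and the village and household counts are made up.

```python
def design_effect(cluster_size, icc):
    """Variance inflation from clustering, assuming equal-sized clusters."""
    return 1 + (cluster_size - 1) * icc

def effective_sample_size(n_clusters, cluster_size, icc):
    """Number of independent observations the clustered sample is worth."""
    return n_clusters * cluster_size / design_effect(cluster_size, icc)

# Hypothetical design: 50 villages with 40 surveyed households each.
for icc in (0.0, 0.05, 0.5, 1.0):
    ess = effective_sample_size(50, 40, icc)
    print(f"intra-cluster correlation {icc:.2f}: effective sample size {ess:.0f}")
```

With no correlation all 2,000 households count. With perfect correlation the sample is worth only the 50 villages, which is why the number of randomized units matters more than the number of observations inside them.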

Unit of randomization and externalities
o What if we believe that our treatment causes externalities? I.e. controls may be impacted by treatment of others
o Examples
  o Deworming medicine
  o Information campaign
o We might underestimate true treatment impact if individuals are randomly selected to receive treatment, since comparison group also indirectly benefits
o What can we do?

Unit of randomization and externalities
o What can we do?
  o Randomize at a more aggregate level, and
  o Make sure to measure degree of connectedness among units within treatment and comparison group
o Deworming example
  o Randomize at level of school, not individual, so everyone in treated school can receive medicine
  o Compare average outcomes across T and C schools
  o Measure comparison schools' physical distance from treatment schools
    o Since worms spread through contact with contaminated fecal matter and since open defecation is common, schools closer to treated schools should be more likely to experience positive externalities
  o Measure social networks
    o Since intervention is randomized, percentage of network that is treated may also be random. Those with more treated networks should also experience more externalities

How do you actually randomize?

How to randomize?
o Randomize participation
  o Units are either in treatment or comparison group
o Randomize order of participation
  o All units eventually treated, but in the interim, later treatment units serve as comparison for early treatment units
o Randomize inducement for participation
  o More on this in later presentations
  o Also called an encouragement design

How to randomize?
o But how do we actually do this?
o Many options
  o Flip a coin
  o Public or private lottery (pull names from a jar)
  o Roll dice

How do you actually randomize?
o Software that allows you to generate a random number
  o Faster than above options
  o Can later prove that randomization was legitimate
o Example: A unit can be in 1 of 4 experimental groups
  o Assign random number to all units
  o First quartile of random number distribution in comparison group, and other quartiles correspond to other 3 experimental groups
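The four-group example can be sketched as follows: draw one random number per unit, rank the units by their draw, and cut the ranking into quartiles. The group labels, the seed, and the function name are illustrative assumptions.

```python
import random
from collections import Counter

def assign_four_groups(unit_ids, seed):
    """Assign units to 4 experimental groups by quartile of a random draw."""
    rng = random.Random(seed)                  # recording the seed lets others replay the draw
    draws = {u: rng.random() for u in unit_ids}
    ranked = sorted(unit_ids, key=draws.get)   # order units by their random number
    labels = ["comparison", "treatment_A", "treatment_B", "treatment_C"]
    n = len(ranked)
    # Ranks in the first quarter get labels[0], the second quarter labels[1], and so on.
    return {u: labels[rank * 4 // n] for rank, u in enumerate(ranked)}

units = list(range(100))                       # hypothetical unit identifiers
groups = assign_four_groups(units, seed=12345)
print(Counter(groups.values()))
```

Keeping the seed and the script is one way to show later that the assignment came from a legitimate random draw, since rerunning it reproduces the same groups.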

Stratification and randomization

What is stratification?
o Separate units into sub-populations
  o Geographic areas
  o Gender or ethnicity
  o Income level
o Within each stratum, randomize treatment
  o Example: Half of women in sample are treated, half are in the comparison group

Why do we need strata?
[Figure: geography example, map of units marked T (treatment) and C (comparison)]
o What's the impact in a particular region?
  o Sometimes hard to say with any confidence
o Random assignment to treatment within geographical units
  o Within each unit, ½ will be treatment, ½ will be comparison
o Similar logic for any other sub-population

Why do we need strata?
o Also allows us to cleanly measure heterogeneous treatment impacts
  o Separate impacts for each group
o Also guarantees balance of stratified variables between treatment and control and improves power
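Stratified assignment amounts to running a separate lottery inside each stratum. A minimal sketch, assuming a hypothetical sample of (id, gender) pairs and a half-and-half split within each stratum:

```python
import random
from collections import defaultdict

def stratified_assignment(units, stratum_of, seed):
    """Randomize treatment separately within each stratum."""
    rng = random.Random(seed)
    by_stratum = defaultdict(list)
    for u in units:
        by_stratum[stratum_of(u)].append(u)

    assignment = {}
    for stratum in sorted(by_stratum):         # fixed order keeps the draw reproducible
        members = by_stratum[stratum]
        rng.shuffle(members)
        half = len(members) // 2
        for i, u in enumerate(members):
            assignment[u] = "T" if i < half else "C"
    return assignment

# Hypothetical sample: 20 women and 40 men.
units = [(i, "F" if i % 3 == 0 else "M") for i in range(60)]
assignment = stratified_assignment(units, stratum_of=lambda u: u[1], seed=99)

for gender in ("F", "M"):
    n_treated = sum(1 for u, g in assignment.items() if u[1] == gender and g == "T")
    print(gender, "treated:", n_treated)
```

Because the split is enforced inside each stratum, balance on the stratifying variable holds by construction rather than only in expectation.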

Random sampling and randomization: They are not the same, but both are important

Randomization
o Random assignment of units to treatment and comparison groups
o Treatment impact will be unbiased for that sample

Random sampling
o Randomly choosing units from overall study population to observe
o Could occur before or after assignment of treatment
  o Would occur after if intervention is large and we do not need to survey everyone to estimate treatment impact

Typical sequencing
o First stage: A random sample of units is selected from a defined population.
o Second stage: This sample of units is randomly assigned to treatment and comparison groups.

[Diagram: Eligible population -> (random sample) -> Sample -> (randomized assignment) -> Treatment group / Comparison group]

Why two stages?
o First stage: Random sampling from population
  o For external validity
  o Ensures that the results in the sample will represent the results in the population within a defined level of sampling error
o Second stage: Randomized assignment of treatment
  o For internal validity
  o Ensures that the observed effect on the dependent variable is due to the treatment rather than to other confounding factors
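The two stages map onto two separate random draws. In this sketch the population size, sample size, and seed are arbitrary assumptions.

```python
import random

rng = random.Random(2009)

# Hypothetical eligible population, identified by integers.
population = list(range(5_000))

# First stage: random sampling from the population (external validity).
sample = rng.sample(population, 400)

# Second stage: randomized assignment within the sample (internal validity).
rng.shuffle(sample)
treatment = sample[:200]
comparison = sample[200:]

print(len(treatment), "treatment units,", len(comparison), "comparison units")
```

The first draw decides who is studied. The second decides who is treated. Skipping the first still gives an unbiased impact for the units in hand, but says less about the wider population.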

Other issues: Attrition, compliance, estimated parameters

Attrition
o Drop out from intervention or survey sample
o Why this matters
  o What if only treatment units experiencing high returns remain in intervention?
    o Will over-estimate impact of intervention
  o What if most desperate members of comparison group migrate to another area?
    o Will under-estimate impact of intervention
o Need to be concerned about
  o Differential attrition across T and C groups
  o Differential attrition across types within an experimental group
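The first attrition scenario can be simulated directly. The sketch assumes heterogeneous individual returns with an average of 5, and assumes that treated units with negative returns leave the sample. Both are illustrative assumptions, not figures from the slides.

```python
import random
from statistics import mean

rng = random.Random(11)
n = 10_000
y0 = [rng.gauss(100, 10) for _ in range(n)]    # outcome without the intervention
gain = [rng.gauss(5, 10) for _ in range(n)]    # individual return to treatment, average 5
treated = [i % 2 == 0 for i in range(n)]
rng.shuffle(treated)                           # random assignment

y = [a + g if t else a for a, g, t in zip(y0, gain, treated)]

# Differential attrition: treated units with negative returns drop out.
stays = [(not t) or g > 0 for g, t in zip(gain, treated)]

def diff_in_means(keep):
    """Treatment minus comparison mean among units that remain observed."""
    t_mean = mean(v for v, t, k in zip(y, treated, keep) if t and k)
    c_mean = mean(v for v, t, k in zip(y, treated, keep) if not t and k)
    return t_mean - c_mean

full_sample = diff_in_means([True] * n)
after_attrition = diff_in_means(stays)
print(f"full sample {full_sample:.2f}, after attrition {after_attrition:.2f}")
```

Randomization was carried out correctly in both cases. The overestimate comes only from who remains observable at follow-up.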

(Non)compliance
o Some members of treatment group do not take up the treatment
o Some members of comparison group get the treatment
o Could occur through actions of either experimental units or implementers
o Non-compliance usually not random
  o Interferes with causal inference
o Often difficult to avoid
o Methods to address this if extent of non-compliance is not large (discussed in later presentation)
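A small simulation shows why non-random compliance interferes with inference. It assumes an unobserved "motivation" trait that raises outcomes and also drives take-up. The effect size of 10 and the take-up thresholds are made up for illustration, and the correction methods mentioned above are not attempted here.

```python
import random
from statistics import mean

rng = random.Random(5)
n = 20_000
EFFECT = 10.0                                      # assumed effect of receiving treatment
motivation = [rng.gauss(0, 1) for _ in range(n)]   # unobserved trait that raises outcomes
assigned = [rng.random() < 0.5 for _ in range(n)]  # random assignment

# Non-random compliance: less motivated units assigned to treatment do not take it up,
# and the most motivated comparison units obtain the treatment anyway.
received = [(a and m > -0.5) or ((not a) and m > 1.5)
            for a, m in zip(assigned, motivation)]

y = [50 + 5 * m + (EFFECT if r else 0.0) + rng.gauss(0, 5)
     for m, r in zip(motivation, received)]

def gap(flag):
    """Mean outcome where flag is True minus mean outcome where it is False."""
    return (mean(v for v, f in zip(y, flag) if f)
            - mean(v for v, f in zip(y, flag) if not f))

as_treated = gap(received)      # compares by treatment actually received
by_assignment = gap(assigned)   # compares by the original random assignment
print(f"as treated {as_treated:.2f}, by assignment {by_assignment:.2f}")
```

Comparing by treatment received reintroduces selection bias and overstates the effect. Comparing by original assignment stays free of selection bias but measures the impact of being assigned, which is smaller than the impact of receiving treatment when take-up is incomplete. That is one instance of an estimated parameter differing from the desired one.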

Estimated parameters
o Still need to think about what these are even when randomizing!
o Randomization can remove selection bias but we can still estimate something that is
  o Irrelevant
  o Different from what we were intending to estimate

Estimated parameters
o Are we measuring partial or total derivative?
o Example 1: School meals offered in randomly selected schools
  o We are interested in impact of school meals on school attendance
  o What if schools offering school meals raise their (effective) prices after they observe everyone wants to go to their school?
    o Can induce some children to drop out of school
  o We will end up measuring the sum of direct impact on attendance and indirect impact on attendance operating through prices (total derivative)
  o But price variation occurs because some schools do not offer meals
    o Would not occur during scale-up
  o Therefore, we might be more interested in partial derivative

Estimated parameters
o Example 2: Mandated provision of health insurance in formal sector
  o We are interested in impact on service utilization
  o Immediate impact
    o Formal sector firms must provide insurance
    o Increase in insurance coverage and utilization
    o Partial derivative
  o Potential impact over time
    o Reform decreases incentive to be a formal firm
    o Decrease in insurance coverage and utilization
    o Total derivative
  o In this case, we might be more interested in the total derivative
o Should be incorporated into evaluation design
  o Timing of measurement
  o Units to measure (e.g. firms and households)
  o Variables to measure (e.g. formal sector status, insurance offer by firm)

Estimated parameters
o Hawthorne effects
  o Act of observation or demonstrated interest makes units behave differently
  o Estimated treatment impact = true treatment impact + observation effect
  o Experiments on productivity effects of lighting from 1924-1932 at the Hawthorne Works factory
  o Productivity effects disappeared when study concluded even though intervention remained
o John Henry effects
  o Comparison group alters behavior because they know they are in the comparison group
  o May try to compensate (Folklore: John Henry tries to lay railroad faster than a machine)
  o May become disgruntled
o The effects might not occur during scale-up
  o Problem if effect observed in pilots results from Hawthorne or John Henry effects rather than treatment

Randomization and non-randomized methods
o Randomization solves selection bias problem
o All other methods (even quasi-experimental) will always try to approximate randomization
o Randomization does not solve every problem
  o Statistical power
  o Attrition and compliance
  o Potential divergence between estimated parameters and parameters of interest

References
o Esther Duflo, Rachel Glennerster, and Michael Kremer (2007), "Using Randomization in Development Economics Research: A Toolkit," in T. Paul Schultz and John Strauss (eds.), Handbook of Development Economics, Vol. 4.
o Edward Miguel and Michael Kremer (2004), "Worms: Identifying Impacts on Education and Health in the Presence of Treatment Externalities," Econometrica, 72(1).
o Michael Kremer and Edward Miguel (2007), "The Illusion of Sustainability," Quarterly Journal of Economics, 122(3).
o Michael Kremer and Alaka Holla (2009), "Pricing and Access: Lessons from Randomized Evaluations in Education and Health," in Jessica Cohen and William Easterly (eds.), What Works in Development? Thinking Big and Thinking Small, Brookings Institution Press.

See also websites of
o SIEF [Spanish Impact Evaluation Fund]
o J-PAL [Abdul Latif Jameel Poverty Action Lab]
o IPA [Innovations for Poverty Action]