Regression Discontinuity Design


Aniceto Orbeta, Jr., Philippine Institute for Development Studies
Stream 2: Impact Evaluation Methods (Intermediate)
Making Impact Evaluation Matter: Better Evidence for Effective Policies and Programs
September 1-3, 2014, ADB Headquarters

Outline
- Basic characteristics
- Validity tests
- Specification decisions
- Estimation methods
- Classic examples

Basic Characteristics
There is a continuous indicator (forcing variable) that can order observation units in some manner. Examples:
- Poverty index (e.g. PMT): impact of development projects on communities/households above a poverty incidence threshold (e.g. Pantawid Pamilya, the Philippine CCT)
- Age: impact on access to public goods of discounts for senior citizens (above 60 years old)
- Exam scores: impact of a remedial school program that is mandatory for children who score below some cutoff on a test; impact of migration when eligibility is based on qualifying exam scores

Basic Characteristics: RDD before and after intervention
[Figure: two panels, "Before Intervention" and "After Intervention", the latter labelled "Impact". Source: Gertler et al. (2011), Impact Evaluation in Practice, World Bank.]

Validity Tests
Shown mostly by scatter plots.
- No jump in the outcome before treatment. Without the intervention, there is a smooth relationship between the outcome and the predictor of treatment assignment (plot of outcomes against the running variable).
- No jumps in relevant covariates at the cutoff. Only the treatment variable should cause a jump in the outcome variable; any other covariate that exhibits a similar jump at the threshold invalidates the design (plot of covariates against the running variable).
- Eligibility cannot be manipulated. McCrary (2008) density test (density plot of the running variable around the cutoff).

Decision to Make: Choice of Functional Form
RDD requires a break in the outcome at the threshold point.
- Panel A, linear: will require a linear model
- Panel B, non-linear: will require a non-linear model
- Panel C, not an RD: there is no break, only a non-linear relationship

Decision to Make: Choice of Bandwidth (BW)
- How wide a range around the threshold should be included?
- There is a trade-off between bias and variance: the wider the BW, the larger the bias but the smaller the variance, and vice versa.
- Optimum BW: optimality is based on mean squared error (MSE).
- Common practice: provide several estimates at different BWs to reveal bias.

Data-Determined Optimum Bandwidth
Approaches to the trade-off between bias and variance:
- Imbens and Kalyanaraman (2012)
- Calonico, Cattaneo, and Titiunik (2014)
- Cross-validation (Ludwig and Miller, 2007)

Two Types of RD
When treatment is determined completely by the predictor (forcing variable), the design is called sharp; if only partly, it is called fuzzy.

Observed outcome, where $W_i$ is the treatment assignment:
$$Y_i = \begin{cases} Y_i(0) & \text{if } W_i = 0 \\ Y_i(1) & \text{if } W_i = 1 \end{cases}$$

Let $X$ be the predictor and $c$ the threshold value.
- Sharp ("deterministic assignment"): $W_i = 1\{X_i \ge c\}$
- Fuzzy ("probabilistic assignment"): $\lim_{x \uparrow c} P(W_i = 1 \mid X_i = x) \ne \lim_{x \downarrow c} P(W_i = 1 \mid X_i = x)$. Rather than treatment itself jumping, it is the probability of getting treatment that jumps.

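Impact Estimates
- Sharp:
$$\tau_{\text{sharp}} = E[\,Y(1) - Y(0) \mid X = c\,]$$
- Fuzzy:
$$\tau_{\text{fuzzy}} = \frac{E[\,Y(1) \mid X = c\,] - E[\,Y(0) \mid X = c\,]}{E[\,W(1) \mid X = c\,] - E[\,W(0) \mid X = c\,]}$$
Here (1) denotes quantities measured from above and (0) quantities measured from below the threshold $c$.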

Importance of Graphical Analysis
- Validity tests are shown through graphs.
- The appropriate specification is often revealed by a graph.
- If there is no visible jump in the graph, chances are there is no significant impact.
- Of course, a graph cannot give us a precise numerical estimate of the impact.

Estimation
- Global polynomial
- Local linear
- Local randomization

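Global Polynomial
- Naive (assumes a constant treatment effect with equal slopes on both sides of the cutoff):
$$Y_i = \alpha + \tau_{\text{sharp}} W_i + \beta_1 (X_i - c) + \dots + \beta_p (X_i - c)^p + \varepsilon_i$$
- Flexible (slopes may differ on each side of the cutoff):
$$Y_i = \alpha + \tau_{\text{sharp}} W_i + \beta_1 (X_i - c) + \dots + \beta_p (X_i - c)^p + \gamma_1 W_i (X_i - c) + \dots + \gamma_p W_i (X_i - c)^p + \varepsilon_i$$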
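As an illustration of the two global polynomial specifications, here is a minimal Python sketch on simulated data. The data-generating process, the sample size, the true effect of 2.0, and the helper name `global_poly_rd` are all assumptions made for this example; none of them come from the slides.

```python
import numpy as np

# Simulated sharp RD (all values hypothetical, chosen for illustration only).
rng = np.random.default_rng(0)
n, c, tau_true = 4000, 0.0, 2.0
x = rng.uniform(-1.0, 1.0, n)               # running (forcing) variable
w = (x >= c).astype(float)                  # sharp assignment: W = 1{X >= c}
y = 1.0 + tau_true * w + 0.8 * (x - c) + 0.5 * (x - c) ** 2 + rng.normal(0.0, 1.0, n)


def global_poly_rd(y, x, w, c, p, flexible):
    """OLS on the full sample; returns the coefficient on W (the estimated jump)."""
    xc = x - c
    cols = [np.ones_like(xc), w] + [xc ** k for k in range(1, p + 1)]
    if flexible:
        # Interactions let every polynomial term differ across the cutoff.
        cols += [w * xc ** k for k in range(1, p + 1)]
    X = np.column_stack(cols)
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    return beta[1]


tau_naive = global_poly_rd(y, x, w, c, p=2, flexible=False)
tau_flexible = global_poly_rd(y, x, w, c, p=2, flexible=True)
print(f"naive: {tau_naive:.3f}  flexible: {tau_flexible:.3f}  (true effect {tau_true})")
```

Because the simulated slopes are the same on both sides of the cutoff, both specifications should land near the true effect here; with real data the two can differ, which is one reason to report both.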

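Local Polynomial
Estimate separate weighted regressions (using kernel weights) locally, within a bandwidth $h$ of the cutoff $c$:
- Left of the cutoff, $X_i \in [c - h, c)$:
$$Y_i = \alpha_- + \beta_- (X_i - c) + \varepsilon_i$$
- Right of the cutoff, $X_i \in [c, c + h]$:
$$Y_i = \alpha_+ + \beta_+ (X_i - c) + \varepsilon_i$$
- The impact estimate is the difference in intercepts:
$$\hat{\tau}_{\text{sharp}}(h) = \hat{\alpha}_+ - \hat{\alpha}_-$$
- Or, combined in one regression:
$$Y_i = \alpha + \tau_{\text{sharp}} W_i + \beta_1 (X_i - c) + \gamma_1 W_i (X_i - c) + \varepsilon_i$$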
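The local approach can be sketched in Python as a kernel-weighted regression with separate intercepts and slopes on each side of the cutoff. This is a minimal sketch on simulated data: the triangular kernel, the bandwidth grid, the data-generating process, and the helper name `local_linear_rd` are assumptions for illustration, not choices taken from the slides.

```python
import numpy as np

# Simulated sharp RD (hypothetical values, for illustration only).
rng = np.random.default_rng(1)
n, c, tau_true = 8000, 0.0, 2.0
x = rng.uniform(-1.0, 1.0, n)
w = (x >= c).astype(float)
y = 1.0 + tau_true * w + 0.8 * (x - c) + 0.5 * (x - c) ** 2 + rng.normal(0.0, 1.0, n)


def local_linear_rd(y, x, c, h):
    """Triangular-kernel weighted least squares within bandwidth h of the cutoff."""
    xc = x - c
    kernel = np.clip(1.0 - np.abs(xc) / h, 0.0, None)   # zero weight outside the bandwidth
    keep = kernel > 0
    d = (xc >= 0).astype(float)
    # Intercept, jump at the cutoff, slope, and slope change across the cutoff.
    X = np.column_stack([np.ones_like(xc), d, xc, d * xc])[keep]
    sw = np.sqrt(kernel[keep])                           # WLS via rescaled rows
    beta, *_ = np.linalg.lstsq(X * sw[:, None], y[keep] * sw, rcond=None)
    return beta[1]


# Report estimates at several bandwidths to see how sensitive the result is.
estimates = {h: local_linear_rd(y, x, c, h) for h in (0.2, 0.4, 0.8)}
for h, tau_hat in estimates.items():
    print(f"h = {h:.1f}: tau_hat = {tau_hat:.3f}")
```

The loop over bandwidths mirrors the common practice of reporting several estimates. It gives point estimates only; standard errors and a data-driven bandwidth would need additional work.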

Local Randomization
- Idea: near the cutoff, treat units as if they were randomly assigned.
- Find a window $c - h < X_i < c + h$ such that, for all $X_i$ in it, $W_i$ is independent of the potential outcomes $Y(0)$ and $Y(1)$.
- Employ RCT methods in the window, i.e. test the difference in means below and above the cutoff.
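A minimal Python sketch of the local randomization idea follows, using a difference in means inside a narrow window and a permutation test as one possible "RCT method". The window half-width, the number of permutations, and the simulated data are assumptions chosen for this example.

```python
import numpy as np

# Simulated sharp RD (hypothetical values, for illustration only).
rng = np.random.default_rng(2)
n, c, tau_true, h = 8000, 0.0, 2.0, 0.05
x = rng.uniform(-1.0, 1.0, n)
w = x >= c                                   # boolean treatment indicator
y = 1.0 + tau_true * w + 0.8 * (x - c) + rng.normal(0.0, 1.0, n)

# Keep only units inside the window around the cutoff.
in_window = np.abs(x - c) < h
y_win, w_win = y[in_window], w[in_window]

# Difference in mean outcomes above vs. below the cutoff.
diff = y_win[w_win].mean() - y_win[~w_win].mean()

# Permutation test: reshuffle treatment labels inside the window.
n_perm = 2000
perm_diffs = np.empty(n_perm)
for b in range(n_perm):
    shuffled = rng.permutation(w_win)
    perm_diffs[b] = y_win[shuffled].mean() - y_win[~shuffled].mean()
p_value = (1 + np.sum(np.abs(perm_diffs) >= abs(diff))) / (1 + n_perm)

print(f"units in window: {in_window.sum()}, diff in means: {diff:.3f}, p-value: {p_value:.4f}")
```

The sketch takes the window as given. In practice the window has to be justified, for example by checking that covariates are balanced inside it.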

Classic Examples
- Impact of social assistance on labor market outcomes (Lemieux and Milligan, 2008) (sharp RD)
- Impact of class size on achievement (Angrist and Lavy, 1999) (sharp and fuzzy RD)

Example 1: Impact of social assistance on labor market outcomes
Lemieux, T. and K. Milligan (2008), "Incentive effects of social assistance: A regression discontinuity approach", Journal of Econometrics (also NBER Working Paper 10541)

Evaluation Issue: What is the incentive effect of social assistance? What is the impact of social assistance on labor market behavior?
Evaluation Model:
- Output: social assistance rule
- Intermediate outcome: amount of social assistance
- Final outcome: labor market outcome

Background
- Before 1989, childless social assistance recipients in Quebec under age 30 received much lower benefits than recipients over age 30.
- The authors used this policy rule to estimate the effects of social assistance on labor market outcomes.

[Figure: Social assistance benefits for those under 30 and over 30, 1980-1993]
Before 1989, benefits are indeed very different between those under 30 and those aged 30 and above.

Data
- Canadian census: a detailed questionnaire (long form) is assigned to approximately 20% of households, with questions on labor market characteristics and participation, education, income, and the demographics of respondents.
- From the 20% sample, the authors obtained samples of around 3,000 male high school dropouts for each age group, keeping only men without children, for each age group around the discontinuity at age 30.
- Employment data: employment status during the reference period, hours worked.
- Labor Force Survey: provides too few observations per age group; used to provide data on the labor market context.

[Figure: 3-year moving average of the employment rate by age group (25-29; 30-34), 1976-1997]
Observations:
- Employment rates are cyclical, so a control group is needed to separate business cycle effects from the policy effects of interest.
- The employment patterns for those aged 25-29 and those aged 30-34 are similar, so labor market conditions are similar for the two groups.
- Quebec has a lower employment rate than the rest of Canada.

Empirical Model
$$Y_{ia} = \beta_0 + \beta_1 \, TREAT_{ia} + \delta(a) + \varepsilon_{ia}$$
where
- $Y_{ia}$ = outcome for individual $i$ of age $a$
- $\delta(a)$ = effect of age on the outcome variable
- $TREAT$ = treatment dummy: 0 if $a < 30$ and 1 if $a \ge 30$
- $\beta_1$ is the parameter of interest

Key Identifying Assumption
- δ(.) is a smooth (continuous) function.
- This is a sharp RD design, since the treatment variable is a deterministic function of the regression variable (age).
- The assumption that δ(.) is continuous means that differential benefits are the only source of discontinuity in outcomes around age 30.

Threats to the Validity of the Assumption
- The employment rate by age is a well-known profile.
- Violations can happen when:
  - Some people find ways to cheat on their age, for example by falsifying their birth certificates. This is difficult to do because age can be easily verified.
  - Differential benefits apply only to individuals without dependent children. To the extent that fertility and living arrangement decisions (whether or not to live with your children) are endogenous, this generates a problem of non-random selection.

Outcomes
- Employment rate during the reference week
- Employment rate based on the fraction of weeks worked in the previous year [the paper did not define how many weeks of work constitute employment]

Estimation
- Estimated a variety of specifications for the regression function δ(a): linear, quadratic, cubic, linear spline, quadratic spline
- Used age-specific cell averages in the estimation
- Used weighted OLS with the inverse of the sampling variance as weights, which is similar to weighting by the number of observations per cell

Graphical Evidence: Employment at census week, Quebec, 1986
- Very strong evidence that employment drops abruptly once the individual is eligible for higher social assistance
- Employment also tends to trend down faster (more steeply) as a function of age, especially after age 30

Graphical Evidence: Employment rate in previous year, Quebec, 1986
- Similar patterns are observable using the employment rate based on weeks worked in the previous year

Estimation Results: RD estimates of the impact of social assistance on labor market outcomes, Quebec, 1986
- Employment impacts are more precisely estimated (lower standard errors) by the first four models than by the quadratic spline.
- The impact is even stronger, and more precisely estimated, using the employment rate based on status during the reference week than using the rate based on weeks worked during the past year.
- Estimates are similar across models except for the quadratic spline; they range from -0.038 to -0.056.
- Goodness-of-fit tests show that even the simpler models (linear and linear spline) fit the data very well: fitted values are not significantly different from actual values.

Example 2: Impact of Class Size on Achievement
Angrist and Lavy (1999), "Using Maimonides' Rule to Estimate the Effect of Class Size on Scholastic Achievement", Quarterly Journal of Economics, 114(2), 533-575

Evaluation Issue: Impact of school output on school outcomes, specifically the impact of class size on test scores
Logic Model:
- Output: Maimonides' rule
- Intermediate outcome: class size
- Final outcome: test score

Background
- Class size in Israeli schools is capped at 40. Students in a grade with up to 40 students can expect to be in classes as large as 40, but a grade with 41 students is split into two classes, a grade with 81 students is split into three classes, and so on.
- This is called Maimonides' Rule because the rule was proposed by the medieval Talmudic scholar Maimonides.

Implications of Maimonides' Rule for class size
Maimonides' Rule implies that the predicted class size (in a given grade) assigned to class $c$ in school $s$, denoted $m_{sc}$, is
$$m_{sc} = \frac{e_s}{\operatorname{int}\left(\dfrac{e_s - 1}{40}\right) + 1}$$
where
- $e_s$ = enrollment in the grade
- $\operatorname{int}(a)$ = integer part of $a$

An enrollment of 1-40 is kept in a single class; an enrollment of 41-80 is split into two classes; an enrollment of 81-120 is split into three classes, and so on. Because $m_{sc}$ is a function of $e_s$ (increasing within each enrollment segment), enrollment is an important control.
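To make the rule concrete, the short Python sketch below evaluates the predicted class size on either side of the first three cutoffs. The function name `maimonides_class_size` and the `cap` parameter are hypothetical names introduced for this illustration.

```python
def maimonides_class_size(enrollment: int, cap: int = 40) -> float:
    """Predicted class size when a grade is split evenly into the fewest classes of at most `cap`."""
    n_classes = (enrollment - 1) // cap + 1   # int((e - 1) / cap) + 1 for positive enrollment
    return enrollment / n_classes


# Predicted class size just below and just above each cutoff.
for e in (40, 41, 80, 81, 120, 121):
    print(e, maimonides_class_size(e))
```

Predicted class size falls from 40 to 20.5 as enrollment moves from 40 to 41, from 40 to 27 between 80 and 81, and from 40 to 30.25 between 120 and 121. These drops are the discontinuities the design exploits.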

[Figure: Maimonides' Rule and actual class size, 5th grade]
- Actual data on class size reveal that Maimonides' rule was not strictly followed (or that other factors also determine class size), but class size is clearly largely determined by the rule.
- While class sizes are not exactly as predicted, class size increases with enrollment and drops sharply just after integer multiples of 40.

[Figure: Class size and (reading) test scores]
- Test scores are generally higher in schools with larger enrollments (positive relationship).
- Test scores show, in part, a mirror image of the up-and-down pattern in class size.
- The apparent positive correlation is partly attributable to the fact that larger schools are more likely to be located in relatively prosperous cities, while smaller schools are more likely to be located in relatively poor towns outside of major urban centers.

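Model
$$Y_{isc} = \beta_0 + X_s \beta + \alpha \, n_{sc} + \mu_c + \eta_s + \varepsilon_{isc}$$
where
- $Y_{isc}$ = test score of student $i$ in school $s$ and class $c$
- $X_s$ = vector of school characteristics
- $n_{sc}$ = size of class $c$ in school $s$

Class-level estimating equations are used.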

Fuzzy RD Implementation
Fuzzy version: $m_{sc}$ is an instrument for $n_{sc}$, with first stage
$$n_{sc} = X_s \pi_0 + m_{sc} \pi_1 + \xi_{sc}$$
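The instrumental-variables logic can be sketched in Python with a hand-rolled two-stage least squares on simulated data. Everything in the block is an assumption made for illustration: the data-generating process, the true class-size effect of -0.3, and the helper name `two_sls`. It does not use or reproduce the Angrist and Lavy data.

```python
import numpy as np

# Simulated class-level data (hypothetical values, for illustration only).
rng = np.random.default_rng(3)
n_obs, alpha_true = 5000, -0.3
e = rng.integers(10, 201, n_obs).astype(float)     # enrollment
m = e / (np.floor((e - 1) / 40) + 1)               # class size predicted by the rule
u = rng.normal(0.0, 1.0, n_obs)                    # unobserved factor driving both
n_sc = 0.9 * m + 3.0 * u + rng.normal(0.0, 1.0, n_obs)                        # actual class size
y = 70.0 + alpha_true * n_sc + 0.02 * e + 2.0 * u + rng.normal(0.0, 2.0, n_obs)  # test score


def two_sls(y, endog, exog, instr):
    """2SLS point estimate of the coefficient on the endogenous regressor."""
    X = np.column_stack([exog, endog])
    Z = np.column_stack([exog, instr])
    first_stage, *_ = np.linalg.lstsq(Z, X, rcond=None)    # project X on the instruments
    X_hat = Z @ first_stage
    beta, *_ = np.linalg.lstsq(X_hat, y, rcond=None)       # second stage on fitted values
    return beta[-1]


exog = np.column_stack([np.ones(n_obs), e])        # constant and enrollment control
alpha_ols = np.linalg.lstsq(np.column_stack([exog, n_sc]), y, rcond=None)[0][-1]
alpha_iv = two_sls(y, n_sc, exog, m)
print(f"OLS: {alpha_ols:.3f}  2SLS: {alpha_iv:.3f}  (true effect {alpha_true})")
```

In this simulation OLS is biased upward because the unobserved factor raises both class size and scores, while the instrumented estimate should sit close to the true value. The sketch returns point estimates only; valid standard errors need the 2SLS residuals and, for grouped data, a clustering correction.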

OLS Estimates, 1991
- With no controls, the estimates show a strong positive correlation between class size and achievement test scores (0.221 for reading; 0.322 for math).
- When the percentage of disadvantaged children is added as a control, the reading estimate falls to -0.031 and is insignificant; the coefficient for math remains positive.
- Adding enrollment does not significantly affect the estimates either.

IV Estimates, 5th grade
- With IV estimates, larger class sizes are associated with lower test scores.
- The impact of class size on reading without enrollment controls is -0.16 (0.04); with linear and quadratic controls for enrollment it ranges from -0.26 (0.08) to -0.28 (0.07).
- The impact on math scores is virtually zero without enrollment controls; with linear and quadratic controls, it ranges from -0.23 (0.09) to -0.261 (0.11).
- The impact is even bigger in the RD (discontinuity) sample.
Note: The full sample includes everyone; the discontinuity sample includes only observations in the vicinity of the thresholds (+/- 5).

Strengths of RDD
- Strong internal validity: as we get near the threshold, the treatment and comparison groups are as if chosen by randomized assignment to treatment; strongest among the quasi-experimental methods
- Fewer ethical issues: no need to exclude eligible units from receiving treatment

Issues with RDD
- External validity: the impact is valid only around the threshold and not for the whole population; RDD should not be used if the policy issue is the impact on the whole population.
- Statistical power: requires a larger sample size than other methods (e.g. approx. 3 times that of an RCT*); there may not be enough observations around the threshold, and expanding the band around the threshold weakens internal validity.
- Functional form dependence: may require additional functional form assumptions to obtain credible impact estimates as one gets farther from the threshold.

*Lee, H. & Munk, T. (2008), 'Using Regression Discontinuity Design in Program Evaluation', Survey Research Methods; Schochet, P. Z. (2009), 'Statistical Power for Regression Discontinuity Designs in Education Evaluations', Journal of Educational and Behavioral Statistics 34(2), 238-266.

Basic References
- Lee, D. S. & Lemieux, T. (2010), 'Regression Discontinuity Designs in Economics', Journal of Economic Literature 48(2), 281-355.
- Imbens, G. W. & Lemieux, T. (2008), 'Regression discontinuity designs: A guide to practice', Journal of Econometrics 142(2), 615-635.

Thank You

Reproducing Angrist and Lavy (1999)

Estimate to reproduce

Data
Available at http://economics.mit.edu/faculty/angrist/data1/data/anglavy99

Preliminaries

use AL1999_final5, clear
lab var c_size "Enrollment"
lab var tipu "Percent disadvantaged"
lab var classize "Class size"
lab var avgverb "Score, Reading"
lab var avgmath "Score, Math"

** Variables generation
replace avgverb = avgverb-100 if avgverb>100
replace avgmath = avgmath-100 if avgmath>100
g func1 = c_size/(int((c_size-1)/40)+1)
g func2 = cohsize/(int(cohsize/40)+1)
replace avgverb = . if verbsize==0
replace passverb = . if verbsize==0
replace avgmath = . if mathsize==0
replace passmath = . if mathsize==0
keep if 1<classize & classize<45 & c_size>5
keep if avgverb~=.

* RD sample indicator: enrollment within the +/- 5 bands around 40, 80, and 120
g byte disc = (c_size>=36 & c_size<=45) | (c_size>=76 & c_size<=85) | (c_size>=116 & c_size<=125)
g c_size2 = (c_size^2)/100

* GENERATE TREND
g trend = c_size if c_size>=0 & c_size<=40
replace trend = 20+(c_size/2) if c_size>=41 & c_size<=80
replace trend = (100/3)+(c_size/3) if c_size>=81 & c_size<=120
replace trend = (130/3)+(c_size/4) if c_size>=121 & c_size<=160

Descriptive stats

. summ avgverb avgmath classize func1 tipu c_size schlcode, sep(0)

    Variable |   Obs       Mean    Std. Dev.      Min       Max
-------------+-------------------------------------------------
     avgverb |  2019   74.38641    7.684038      34.8     93.86
     avgmath |  2018   67.29267    9.598066     27.69     93.93
    classize |  2019   29.93512    6.545885         8        44
       func1 |  2019   30.95594    6.107924         8        40
     tipuach |  2019   14.10203    13.49887         0        76
      c_size |  2019   77.74195    38.81073         8       226
    schlcode |  2019   39637.98    15266.16     11005     61365

Full sample (col 1)

. ivregress 2sls avgverb (classize=func1) tipu, vce(cl schlcode) // col 1

Instrumental variables (2SLS) regression
Number of obs = 2019
Wald chi2(2)  = 595.02
Prob > chi2   = 0.0000
R-squared     = 0.3568
Root MSE      = 6.1612
(Std. Err. adjusted for 1002 clusters in schlcode)

             |               Robust
     avgverb |      Coef.   Std. Err.       z    P>|z|    [95% Conf. Interval]
-------------+----------------------------------------------------------------
    classize |  -.1584777   .0416256    -3.81   0.000   -.2400624   -.0768929
     tipuach |  -.3714599   .0159817   -23.24   0.000   -.4027836   -.3401363
       _cons |   84.36879   1.344373    62.76   0.000    81.73387    87.00371

Instrumented: classize
Instruments: tipuach func1

Note: Angrist and Lavy (1999) used the Moulton (1986) correction; here we used the cluster option in ivregress.
Moulton (1986), "Random group effects and the precision of regression estimates", J. of Econometrics, 32, 385-97

Full sample (col 2)

. ivregress 2sls avgverb (classize=func1) tipu c_size, vce(cl schlcode) // col 2

Instrumental variables (2SLS) regression
Number of obs = 2019
Wald chi2(3)  = 582.76
Prob > chi2   = 0.0000
R-squared     = 0.3397
Root MSE      = 6.2424
(Std. Err. adjusted for 1002 clusters in schlcode)

             |               Robust
     avgverb |      Coef.   Std. Err.       z    P>|z|    [95% Conf. Interval]
-------------+----------------------------------------------------------------
    classize |  -.2770197   .0758487    -3.65   0.000   -.4256804    -.128359
     tipuach |  -.3687071   .0160188   -23.02   0.000   -.4001034   -.3373107
      c_size |   .0222903    .009124     2.44   0.015    .0044076     .040173
       _cons |   86.14565   1.785436    48.25   0.000    82.64626    89.64504

Instrumented: classize
Instruments: tipuach c_size func1

Full sample (col 3)

. ivregress 2sls avgverb (classize=func1) tipu c_size c_size2, vce(cl schlcode) // col 3

Instrumental variables (2SLS) regression
Number of obs = 2019
Wald chi2(4)  = 606.12
Prob > chi2   = 0.0000
R-squared     = 0.3428
Root MSE      = 6.2279
(Std. Err. adjusted for 1002 clusters in schlcode)

             |               Robust
     avgverb |      Coef.   Std. Err.       z    P>|z|    [95% Conf. Interval]
-------------+----------------------------------------------------------------
    classize |  -.2631278   .0937161    -2.81   0.005   -.4468079   -.0794477
     tipuach |  -.3687087   .0159984   -23.05   0.000    -.400065   -.3373524
      c_size |   .0131031   .0261633     0.50   0.616   -.0381759    .0643822
     c_size2 |   .0041682   .0099564     0.42   0.675    -.015346    .0236823
       _cons |   86.12938   1.797086    47.93   0.000    82.60715     89.6516

Instrumented: classize
Instruments: tipuach c_size c_size2 func1

Full sample (col 4)

. ivregress 2sls avgverb (classize=func1) trend, vce(cl schlcode) // col 4

Instrumental variables (2SLS) regression
Number of obs = 1961
Wald chi2(2)  = 42.25
Prob > chi2   = 0.0000
R-squared     = .
Root MSE      = 7.7144
(Std. Err. adjusted for 990 clusters in schlcode)

             |               Robust
     avgverb |      Coef.   Std. Err.       z    P>|z|    [95% Conf. Interval]
-------------+----------------------------------------------------------------
    classize |  -.1898637   .1216194    -1.56   0.118   -.4282333    .0485059
       trend |   .1369107    .035901     3.81   0.000    .0665461    .2072753
       _cons |   72.55187   1.977625    36.69   0.000     68.6758    76.42795

Instrumented: classize
Instruments: trend func1

Discontinuity Sample (+/- 5) (col 5)

. ivregress 2sls avgverb (classize=func1) tipu if disc==1, vce(cl schlcode) // col 5

Instrumental variables (2SLS) regression
Number of obs = 471
Wald chi2(2)  = 111.51
Prob > chi2   = 0.0000
R-squared     = 0.3139
Root MSE      = 6.7689
(Std. Err. adjusted for 224 clusters in schlcode)

             |               Robust
     avgverb |      Coef.   Std. Err.       z    P>|z|    [95% Conf. Interval]
-------------+----------------------------------------------------------------
    classize |   -.410168   .1176445    -3.49   0.000   -.6407469   -.1795891
     tipuach |  -.4772855   .0484219    -9.86   0.000   -.5721907   -.3823803
       _cons |      93.62   4.001931    23.39   0.000    85.77636    101.4636

Instrumented: classize
Instruments: tipuach func1

Discontinuity Sample (+/- 5) (col 6)

. ivregress 2sls avgverb (classize=func1) tipu c_size if disc==1, vce(cl schlcode) // col 6

Instrumental variables (2SLS) regression
Number of obs = 471
Wald chi2(3)  = 108.98
Prob > chi2   = 0.0000
R-squared     = 0.2401
Root MSE      = 7.124
(Std. Err. adjusted for 224 clusters in schlcode)

             |               Robust
     avgverb |      Coef.   Std. Err.       z    P>|z|    [95% Conf. Interval]
-------------+----------------------------------------------------------------
    classize |  -.5823683    .205291    -2.84   0.005   -.9847313   -.1800053
     tipuach |  -.4611878   .0464016    -9.94   0.000   -.5521333   -.3702424
      c_size |   .0529979   .0316929     1.67   0.094    -.009119    .1151149
       _cons |   94.66023   4.621571    20.48   0.000    85.60212    103.7183

Instrumented: classize
Instruments: tipuach c_size func1