class: center, middle, inverse, title-slide

.title[
# Regression Discontinuity
]
.subtitle[
## EDLD 650: Week 4
]
.author[
### David D. Liebowitz
]

---

<style type="text/css">
.inverse {
  background-color : #2293bf;
}
</style>

# Agenda

#### 1. Roadmap and Goals (9:00-9:10)
- Thoughts on DARE #1

#### 2. Discussion Questions (9:10-10:20)
- Murnane and Willett
- Angrist and Lavy
- Dee and Penner

#### 3. Break (10:20-10:30)

#### 4. Applied regression discontinuity (10:30-11:40)

#### 5. Wrap-up (11:40-11:50)
- DARE #2 prep

---
# DARE #1: Last words

- You *all* did a good job; many of you have stellar skills in writing functions and/or familiarity with the `tidyverse`
- All DARE exemplars will be substantively consistent in sign/magnitude with the paper. Sometimes identical.
- If you see your results are different, you know that misalignment exists; .red[**that's okay!**]
- Try to solve it, but if you can't, write up what you have and note and interpret the differences

--

- Need to make the transition to drafting for public audiences
- Non-causal and causal estimates shouldn't appear side-by-side in tables (w/o very good reason) -- beware the [Table 2 fallacy](https://arxiv.org/abs/2005.10314)!!!
- Prepare for the challenge of an assignment without model answers -- think hard about model development

---
# Roadmap

<img src="causal_id.jpg" width="1707" style="display: block; margin: auto;" />

---
# Goals

### 1. Describe conceptual approach to regression discontinuity analysis

### 2. Assess validity of RD assumptions in applied context

### 3. Conduct and interpret RD analysis in simplified data

---
class: middle, inverse
# So random...

---
class: middle, inverse
# Break

---
# Recall the basic set up of RD

<img src="EDLD_650_4_RD_2_files/figure-html/unnamed-chunk-3-1.png" style="display: block; margin: auto;" />

---
# Failing a graduation test

<img src="EDLD_650_4_RD_2_files/figure-html/unnamed-chunk-4-1.png" style="display: block; margin: auto;" />

---
# Failing a graduation test

<img src="EDLD_650_4_RD_2_files/figure-html/unnamed-chunk-5-1.png" style="display: block; margin: auto;" />

---
# Failing a graduation test

<img src="EDLD_650_4_RD_2_files/figure-html/unnamed-chunk-6-1.png" style="display: block; margin: auto;" />

---
## The basic set up in regression

Given a continuous forcing variable `\(S_{i}\)` such that individuals receive a treatment `\((D_{i})\)` when `\(S_{i} \geq\)` a cutoff `\((C)\)`:

`$$Y_i=\beta_{0} + \beta_{1} S_{i} + \mathbb{1}(S_{i} \geq C)\beta_{2} + \varepsilon_{i}$$`

--

.blue[**Can you explain what is happening in this regression?**]

--

.blue[**What about applied in a specific context?**]

`$$p(COLL_{i}=1)= \beta_{0} + \beta_{1} TESTSCORE_{i} + 1(TESTSCORE_{i} \geq 60)\beta_{2} + \varepsilon_{i}$$`

--

> This equation estimates a linear probability model, in which whether or not individuals attend college (expressed as a dichotomous indicator taking on the values 0 or 1) is regressed on a linear measure of individual *i*'s test score `\((TESTSCORE_{i})\)` and an indicator variable that takes the value of 1 if individual *i* scored 60 or higher on the test. `\(\beta_{2}\)` is the causal parameter of interest and represents the discontinuous jump in the probability (in percentage points) of attending college, adjusting for test score, associated with scoring at or above the pass score.

---
# Let's practice!
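Before we turn to real data, here is a minimal sketch of fitting the linear probability RD above with `lm()`. The data are simulated, and the variable names, sample size, and data-generating values (including the 15 p.p. jump at a cutoff of 60) are invented purely for illustration:

```r
# Simulate a sharp RD: scoring at or above 60 raises the probability of college by 15 p.p.
set.seed(650)
n <- 1000
sim <- data.frame(testscore = runif(n, 20, 100))
sim$pass <- as.integer(sim$testscore >= 60)
p_true   <- 0.2 + 0.005 * sim$testscore + 0.15 * sim$pass
sim$coll <- rbinom(n, 1, p_true)

# Linear probability model: the coefficient on `pass` estimates the jump at the cutoff
lpm <- lm(coll ~ testscore + pass, data = sim)
coef(summary(lpm))
```

With a sample this size, the estimated coefficient on `pass` should land near the simulated 0.15 discontinuity.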
Read in the modified Angrist & Lavy data and look at its characteristics:

```r
maimonides <- read_dta(here("data/ch9_angrist.dta"))
d <- select(maimonides, read, size, intended_classize, observed_classize)
summary(d)
```

```
#>       read            size        intended_classize observed_classize
#>  Min.   :34.80   Min.   :  8.00   Min.   : 8.00     Min.   : 8.00
#>  1st Qu.:69.86   1st Qu.: 50.00   1st Qu.:27.00     1st Qu.:26.00
#>  Median :75.38   Median : 72.00   Median :31.67     Median :31.00
#>  Mean   :74.38   Mean   : 77.74   Mean   :30.96     Mean   :29.94
#>  3rd Qu.:79.84   3rd Qu.:100.00   3rd Qu.:35.67     3rd Qu.:35.00
#>  Max.   :93.86   Max.   :226.00   Max.   :40.00     Max.   :44.00
```

```r
sapply(d, sd, na.rm=TRUE)
```

```
#>              read              size intended_classize observed_classize 
#>          7.678460         38.810731          6.107924          6.545885
```

---
# Variation in the treatment?

<img src="EDLD_650_4_RD_2_files/figure-html/unnamed-chunk-8-1.png" style="display: block; margin: auto;" />

--

.blue[*Note that we are plotting the receipt of treatment (actual class size) against the forcing variable (cohort size). What assumption are we testing?*]

---
# Maimonides' Rule Redux

Angrist, J., Lavy, V., Leder-Luis, J., & Shany, A. (2019). Maimonides' rule redux. *American Economic Review: Insights, 1*(3), 1-16.

.pull-left[
<img src="angrist_sort.jpg" width="1479" style="display: block; margin: auto;" />
.blue[What does the picture on the left tell you about class size in Israel from 2002-2011?]
]

--

.pull-right[
<img src="angrist_2019_results.jpg" width="1388" style="display: block; margin: auto;" />
.blue[What does the picture on the right tell you about the effects of class size in Israel from 2002-2011?]
]

---
# Are RD assumptions met?

```r
bunch <- ggplot() +
  geom_histogram(data=d, aes(size), fill=blue, binwidth = 1)
```

<img src="EDLD_650_4_RD_2_files/figure-html/unnamed-chunk-13-1.png" style="display: block; margin: auto;" />

---
# Are RD assumptions met?

```r
sort <- ggplot() +
  geom_boxplot(data=d, aes(x=as.factor(size), y=ses), fill=red_pink, alpha=0.4)
```

<img src="EDLD_650_4_RD_2_files/figure-html/unnamed-chunk-15-1.png" style="display: block; margin: auto;" />

---
# Are RD assumptions met?

.small[
```r
quantile <- ggplot() +
  geom_quantile(data=filter(d, size<41), aes(size, ses), quantiles=0.5, color=purple) +
  geom_quantile(data=filter(d, size>=41), aes(size, ses), quantiles=0.5, color=red_pink)
```
]

<img src="EDLD_650_4_RD_2_files/figure-html/unnamed-chunk-17-1.png" style="display: block; margin: auto;" />

---
# Let's see if there's an effect

<img src="EDLD_650_4_RD_2_files/figure-html/unnamed-chunk-18-1.png" style="display: block; margin: auto;" />

---
# Refresh on how RD works

`$$READSCORE_{i} = \beta_{0} + \beta_{1} COHORTSIZE_{i} + 1(COHORTSIZE_{i} \geq 41)\beta_{2} + \varepsilon_{i}$$`

--

Could also write this as:

`$$READSCORE_{i} = \beta_{0} + \beta_{1} COHORTSIZE_{i} + \beta_{2} SMALLCLASS_{i} + \varepsilon_{i}$$`

Can you explain the identification strategy as you would in your methods section (using *secular trend, forcing variable, equal in expectation, projecting across the discontinuity, ITT*)?

---
# Refresh on how RD works

> We estimate the effects of class size on individual *i*'s reading score. Specifically, we regress their test score outcome on whether the size of their grade cohort predicts that they will be assigned to a small class. We account for the secular relationship between test scores and cohort size by adjusting our estimates for the linear relationship between cohort size and test scores.

> Our identification strategy relies on the assumption that cohorts that differ in size by only a few students are equal in expectation prior to the exogenous assignment to a small class size `\((D_{i}=1)\)`. Our modeling approach depends on our ability to project a smooth relationship between reading scores and cohort size across the discontinuity and then estimate the discontinuous effect of being quasi-randomly assigned to learn in smaller classes. Given that compliance with Maimonides' Rule is imperfect, our approach yields intent-to-treat (ITT) estimates: specifically, the effect on reading scores of being *assigned by rule* to a smaller class.

---
# Let's see if there's an effect

```r
d <- d %>% mutate(small = ifelse(size >= 41, TRUE, FALSE))
fx2 <- ggplot() +
  geom_point(data=d, aes(x=size, y=read, color=small), alpha=0.8, shape=16)
```

<img src="EDLD_650_4_RD_2_files/figure-html/unnamed-chunk-20-1.png" style="display: block; margin: auto;" />

--

.small[The raw scatter prevents visual detection of any discontinuity...thus, the value of [.red-pink[**bin scatter**]](https://arxiv.org/abs/1902.09608).]

---
# Let's see if there's an effect

```r
bin <- d %>% group_by(size) %>% summarise(across(c("read", "small"), mean))
binned_plot <- ggplot() +
  geom_point(data=bin, aes(x=size, y=read, color=as.factor(small)), alpha=0.8, shape=16, size=3)
```

<img src="EDLD_650_4_RD_2_files/figure-html/unnamed-chunk-22-1.png" style="display: block; margin: auto;" />

---
# Let's see if there's an effect
### Fitted lines:

<img src="EDLD_650_4_RD_2_files/figure-html/unnamed-chunk-23-1.png" style="display: block; margin: auto;" />

---
# Let's see if there's an effect
### Different slopes:

<img src="EDLD_650_4_RD_2_files/figure-html/unnamed-chunk-24-1.png" style="display: block; margin: auto;" />

---
# Let's see if there's an effect
### Change the bandwidth:

<img src="EDLD_650_4_RD_2_files/figure-html/unnamed-chunk-25-1.png" style="display: block; margin: auto;" />

---
# Let's see if there's an effect
### Formal-ish:

<img src="EDLD_650_4_RD_2_files/figure-html/unnamed-chunk-26-1.png" style="display: block; margin: auto;" />

---
# But it could be non-linear
### Formal-ish:

<img src="EDLD_650_4_RD_2_files/figure-html/unnamed-chunk-27-1.png" style="display: block; margin: auto;" />

---
# Regression RD

Let's fit three different intent-to-treat (ITT) RD models, each of which assumes a different functional form for the forcing variable:

**Linear trend, same slope**
$$
`\begin{aligned}
(1) READSCORE_i=\beta_0 + \beta_1 COHORTSIZE_i + \beta_2 SMALLCLASS_i + \epsilon_i
\end{aligned}`
$$

**Linear trend, different slope**
$$
`\begin{aligned}
(2) READSCORE_i= &\beta_0 + \beta_1 COHORTSIZE_i + \beta_2 SMALLCLASS_i + \\
&\beta_3 COHORTSIZE_i \times SMALLCLASS_i + \epsilon_i
\end{aligned}`
$$

**Quadratic trend, same slope**
$$
`\begin{aligned}
(3) READSCORE_i= &\beta_0 + \beta_1 COHORTSIZE_i + \beta_2 \text{COHORTSIZE}_{i}^2 + \\
& \beta_3 SMALLCLASS_i + \epsilon_i
\end{aligned}`
$$

---
# Results

.small[
<table style="NAborder-bottom: 0; width: auto !important; margin-left: auto; margin-right: auto;" class="table">
 <thead>
  <tr>
   <th style="text-align:left;"> </th>
   <th style="text-align:center;"> (1) </th>
   <th style="text-align:center;"> (2) </th>
   <th style="text-align:center;"> (3) </th>
  </tr>
 </thead>
<tbody>
  <tr>
   <td style="text-align:left;"> Intercept </td>
   <td style="text-align:center;"> 75.825*** </td>
   <td style="text-align:center;"> 96.046*** </td>
   <td style="text-align:center;"> 68.334*** </td>
  </tr>
  <tr>
   <td style="text-align:left;"> </td>
   <td style="text-align:center;"> (4.202) </td>
   <td style="text-align:center;"> (8.621) </td>
   <td style="text-align:center;"> (1.516) </td>
  </tr>
  <tr>
   <td style="text-align:left;"> Intended size </td>
   <td style="text-align:center;"> -0.139 </td>
   <td style="text-align:center;"> -0.725** </td>
   <td style="text-align:center;"> -33.987+ </td>
  </tr>
  <tr>
   <td style="text-align:left;"> </td>
   <td style="text-align:center;"> (0.119) </td>
   <td style="text-align:center;"> (0.248) </td>
   <td style="text-align:center;"> (17.666) </td>
  </tr>
  <tr>
   <td style="text-align:left;"> .red[**Intended small class**] </td>
   <td style="text-align:center;"> 3.953* </td>
   <td style="text-align:center;"> -24.346* </td>
   <td style="text-align:center;"> 5.894** </td>
  </tr>
  <tr>
   <td style="text-align:left;"> </td>
   <td style="text-align:center;"> (1.800) </td>
   <td style="text-align:center;"> (10.708) </td>
   <td style="text-align:center;"> (1.990) </td>
  </tr>
  <tr>
   <td style="text-align:left;"> Size x Small </td>
   <td style="text-align:center;"> </td>
   <td style="text-align:center;"> 0.757** </td>
   <td style="text-align:center;"> </td>
  </tr>
  <tr>
   <td style="text-align:left;"> </td>
   <td style="text-align:center;"> </td>
   <td style="text-align:center;"> (0.282) </td>
   <td style="text-align:center;"> </td>
  </tr>
  <tr>
   <td style="text-align:left;"> (Intended size)^2 </td>
   <td style="text-align:center;"> </td>
   <td style="text-align:center;"> </td>
   <td style="text-align:center;"> 22.716* </td>
  </tr>
  <tr>
   <td style="text-align:left;box-shadow: 0px 1.5px"> </td>
   <td style="text-align:center;box-shadow: 0px 1.5px"> </td>
   <td style="text-align:center;box-shadow: 0px 1.5px"> </td>
   <td style="text-align:center;box-shadow: 0px 1.5px"> (10.154) </td>
  </tr>
  <tr>
   <td style="text-align:left;"> Num.Obs. </td>
   <td style="text-align:center;"> 423 </td>
   <td style="text-align:center;"> 423 </td>
   <td style="text-align:center;"> 423 </td>
  </tr>
  <tr>
   <td style="text-align:left;"> R2 </td>
   <td style="text-align:center;"> 0.015 </td>
   <td style="text-align:center;"> 0.031 </td>
   <td style="text-align:center;"> 0.026 </td>
  </tr>
</tbody>
<tfoot><tr><td style="padding: 0; " colspan="100%">
<sup></sup> + p < 0.1, * p < 0.05, ** p < 0.01, *** p < 0.001</td></tr></tfoot>
</table>
]

---
## Can you explain these results?

Models 1 and 3 seem sensibly connected to the graphical evidence, but Model 2 suggests that the effect of an offer of a small class is negative: a whopping estimated .red[**24.3-point drop**] in reading scores. What gives?

--

Recall the three models we fit:

$$
`\begin{aligned}
(1) READSCORE_i=\beta_0 + \beta_1 COHORTSIZE_i + \beta_2 SMALLCLASS_i + \epsilon_i
\end{aligned}`
$$

$$
`\begin{aligned}
(2) READSCORE_i= &\beta_0 + \beta_1 COHORTSIZE_i + \beta_2 SMALLCLASS_i + \\
&\beta_3 COHORTSIZE_i \times SMALLCLASS_i + \epsilon_i
\end{aligned}`
$$

$$
`\begin{aligned}
(3) READSCORE_i= &\beta_0 + \beta_1 COHORTSIZE_i + \beta_2 \text{COHORTSIZE}_{i}^2 + \\
& \beta_3 SMALLCLASS_i + \epsilon_i
\end{aligned}`
$$

.small[We need to project the fitted values that our regression results predict *at the discontinuity*. The most straightforward way is to plug in the values for grade cohorts that are just under and just over the threshold for being split in two by Maimonides' Rule, using the estimated coefficients from the table on the previous slide.]

.blue[Take Eq. 2 and try doing this for cohorts of 40 and 41, respectively.]

---
## Can you explain these results?
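One way to check this plug-in calculation is in code. Below is a short sketch (assuming `fit2` is the `lm()` fit of Eq. 2, with coefficients ordered intercept, cohort size, small-class indicator, interaction), followed by the same arithmetic worked out by hand.

```r
# Sketch: plug cohorts of 40 and 41 into the Model 2 (different-slopes) coefficients
b <- coef(fit2)                                  # assumed order: intercept, size, small, size:small
yhat_big   <- b[1] + b[2]*40                     # cohort of 40: stays in one big class
yhat_small <- b[1] + b[2]*41 + b[3] + b[4]*41    # cohort of 41: split into small classes
yhat_small - yhat_big                            # implied jump at the discontinuity
```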
**Big class, grade cohort = 40**

$$
`\begin{aligned}
\hat{READSCORE_i} = & 96.046 + (-0.725)(40) + (-24.346)(0) + (0.757)(40)(0) \\
& 96.046 + (-29) + 0 + 0 \\
& 67.046
\end{aligned}`
$$

--

**Small class, grade cohort = 41**

$$
`\begin{aligned}
\hat{READSCORE_i} = & 96.046 + (-0.725)(41) + (-24.346)(1) + (0.757)(41)(1) \\
& 96.046 + (-29.725) + (-24.346) + 31.037 \\
& 73.012
\end{aligned}`
$$

--

So the predicted effect of being assigned to receive a smaller class when we allow the slopes to vary around the discontinuity is `\(73.012 - 67.046 = 5.966\)`, larger than the linear, constant-slope estimate and slightly larger than the quadratic specification.

--

.purple[*Note: this is all implicitly solved for when you re-center the forcing variable at 0.*]

---
## Can you explain these results?

Now that we've harmonized our findings, can you explain these results in technically accurate and substantively clear ways?

--

> We estimate an effect of being assigned to a small class of between roughly 4 and 6 scale score points, depending on our assumptions about the nature of the underlying secular relationship between cohort size and reading performance. At the lower bound, these represent effects of around one-half of a standard deviation (*SD*) unit. At the upper bound, these effects are as large as three-quarters of a standard deviation in the full sample. These estimates are Local Average Treatment Effects (LATE), specific to being a member of a cohort whose size is just above or below the threshold for being divided into a smaller class.

---
# Extensions

1. Bandwidth variation (bias v. variance tradeoff)
  - Manual (see the short sketch on the appendix slide at the end of the deck)
  - Cross-validation (leave-one-out)
  - Imbens-Kalyanaraman (2009) Optimal Bandwidth Calculation
2. Higher-order polynomials
3. Non-parametric estimates
  - Local-linear approaches (LOESS)
  - Kernel (how to value points closest to cutoff)
  - Machine learning
4. Binning for visualizations
5. Diff-in-RD
6. Packages
  - R: `rddapp`, `rdd`, `rddtools`, `rdrobust`

---
# Just for fun...

<img src="EDLD_650_4_RD_2_files/figure-html/unnamed-chunk-29-1.png" style="display: block; margin: auto;" />

---
class: middle, inverse
# Wrap-up

---
# Goals

### 1. Describe conceptual approach to regression discontinuity analysis

### 2. Assess validity of RD assumptions in applied context

### 3. Conduct RD analysis in simplified data

---
# To-Dos

### Week 5: Regression Discontinuity II

### Readings:
- Holden (2016)

### DARE #2
- Due 2/4, 11:59pm

### Project proposal
- Due 2/2, 11:59pm

---
# Feedback

.large[.red[**Midterm Student Experience Survey**]]
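---
# Appendix: bandwidth sensitivity sketch

A minimal sketch of the "manual" bandwidth variation listed on the Extensions slide, re-using the `d` data frame (`read`, `size`, `small`) from earlier. The candidate bandwidths are illustrative, and the coefficient name `smallTRUE` assumes `small` is the logical indicator defined above.

```r
# Re-estimate a linear ITT model within windows of varying width around the cutoff of 41
rd_by_bw <- function(bw) {
  sub <- dplyr::filter(d, abs(size - 41) <= bw)
  fit <- lm(read ~ I(size - 41) + small, data = sub)
  coef(fit)[["smallTRUE"]]             # estimated jump at the cutoff for this bandwidth
}

sapply(c(5, 10, 20, 40), rd_by_bw)     # narrower windows: less bias, more variance
```

If the estimated jump is reasonably stable across bandwidths, we can be more confident that functional-form assumptions about the secular trend are not driving the result.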