class: center, middle, inverse, title-slide

.title[
# Regression Discontinuity
]
.subtitle[
## EDLD 650: Week 4
]
.author[
### David D. Liebowitz
]

---

<style type="text/css">
.inverse {
  background-color : #2293bf;
}
</style>

# Agenda

#### 1. Roadmap and Goals (9:00-9:10)
- Thoughts on DARE #1

#### 2. Discussion Questions (9:10-10:20)
- Murnane and Willett
- Angrist and Lavy
- Dee and Penner

#### 3. Break (10:20-10:30)

#### 4. Applied regression discontinuity (10:30-11:40)

#### 5. Wrap-up (11:40-11:50)
- DARE #2 prep

---
# DARE #1: Last words

- You *all* did a good job; many of you have stellar skills in writing functions and/or familiarity with the `tidyverse`
- All DARE exemplars will be substantively consistent in sign/magnitude with the paper. Sometimes identical.
- If you see your results are different, you know that misalignment exists; .red[**that's okay!**]
- Try to solve it, but if you can't, write up what you have and note and interpret the differences

--

- Need to make the transition to drafting for public audiences
- Non-causal and causal estimates shouldn't appear side-by-side in tables (w/o very good reason) -- beware the [Table 2 fallacy](https://arxiv.org/abs/2005.10314)!!!
- Prepare for the challenge of an assignment without model answers -- think hard about model development

---
# Roadmap

<img src="causal_id.jpg" width="1707" style="display: block; margin: auto;" />

---
# Goals

### 1. Describe conceptual approach to regression discontinuity analysis

### 2. Assess validity of RD assumptions in applied context

### 3. Conduct and interpret RD analysis in simplified data

---
class: middle, inverse
# So random...

---
class: middle, inverse
# Break

---
# Recall the basic set up of RD

<img src="EDLD_650_4_RD_2_files/figure-html/unnamed-chunk-3-1.png" style="display: block; margin: auto;" />

---
# Failing a graduation test

<img src="EDLD_650_4_RD_2_files/figure-html/unnamed-chunk-4-1.png" style="display: block; margin: auto;" />

---
# Failing a graduation test

<img src="EDLD_650_4_RD_2_files/figure-html/unnamed-chunk-5-1.png" style="display: block; margin: auto;" />

---
# Failing a graduation test

<img src="EDLD_650_4_RD_2_files/figure-html/unnamed-chunk-6-1.png" style="display: block; margin: auto;" />

---
## The basic set up in regression

Given a continuous forcing variable `\(S_{i}\)` such that individuals receive a treatment `\((D_{i})\)` when `\(S_{i} \geq\)` a cutoff `\((C)\)`:

`$$Y_i=\beta_{0} + \beta_{1} S_{i} + \mathbb{1}(S_{i} \geq C)\beta_{2} + \varepsilon_{i}$$`

--

.blue[**Can you explain what is happening in this regression?**]

--

.blue[**What about applied in a specific context?**]

`$$p(COLL_{i}=1)= \beta_{0} + \beta_{1} TESTSCORE_{i} + 1(TESTSCORE_{i} \geq 60)\beta_{2} + \varepsilon_{i}$$`

--

> This equation estimates a linear probability model, in which whether or not individuals attend college (expressed as a dichotomous indicator taking on the values 0 or 1) is regressed on a linear measure of individual *i*'s test score `\((TESTSCORE_{i})\)` and an indicator variable that takes the value of 1 if individual *i* scored 60 or higher on the test. `\(\beta_{2}\)` is the causal parameter of interest and represents the discontinuous jump in the probability (in percentage points) of attending college, adjusting for test score, associated with scoring at or above the pass score.

---
# Let's practice!
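Before we turn to real data, here is a minimal sketch of fitting the linear probability RD above with `lm()`. The data are simulated, and the variable names, sample size, and data-generating values (including the 15 p.p. jump at a cutoff of 60) are invented purely for illustration:

```r
# Simulate a sharp RD: scoring at or above 60 raises the probability of college by 15 p.p.
set.seed(650)
n <- 1000
sim <- data.frame(testscore = runif(n, 20, 100))
sim$pass <- as.integer(sim$testscore >= 60)
p_true   <- 0.2 + 0.005 * sim$testscore + 0.15 * sim$pass
sim$coll <- rbinom(n, 1, p_true)

# Linear probability model: the coefficient on `pass` estimates the jump at the cutoff
lpm <- lm(coll ~ testscore + pass, data = sim)
coef(summary(lpm))
```

With a sample this size, the estimated coefficient on `pass` should land near the simulated 0.15 discontinuity.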
Read in the modified Angrist & Lavy data and look at its characteristics:

```r
maimonides <- read_dta(here("data/ch9_angrist.dta"))
d <- select(maimonides, read, size, intended_classize, observed_classize)
summary(d)
```

```
#>       read            size        intended_classize observed_classize
#>  Min.   :34.80   Min.   :  8.00   Min.   : 8.00     Min.   : 8.00
#>  1st Qu.:69.86   1st Qu.: 50.00   1st Qu.:27.00     1st Qu.:26.00
#>  Median :75.38   Median : 72.00   Median :31.67     Median :31.00
#>  Mean   :74.38   Mean   : 77.74   Mean   :30.96     Mean   :29.94
#>  3rd Qu.:79.84   3rd Qu.:100.00   3rd Qu.:35.67     3rd Qu.:35.00
#>  Max.   :93.86   Max.   :226.00   Max.   :40.00     Max.   :44.00
```

```r
sapply(d, sd, na.rm=TRUE)
```

```
#>              read              size intended_classize observed_classize 
#>          7.678460         38.810731          6.107924          6.545885
```

---
# Variation in the treatment?

<img src="EDLD_650_4_RD_2_files/figure-html/unnamed-chunk-8-1.png" style="display: block; margin: auto;" />

--

.blue[*Note that we are plotting the receipt of treatment (actual class size) against the forcing variable (cohort size). What assumption are we testing?*]

---
# Maimonides' Rule Redux

Angrist, J., Lavy, V., Leder-Luis, J., & Shany, A. (2019). Maimonides' rule redux. *American Economic Review: Insights, 1*(3), 1-16.

.pull-left[
<img src="angrist_sort.jpg" width="1479" style="display: block; margin: auto;" />
.blue[What does the picture on the left tell you about class size in Israel from 2002-2011?]
]

--

.pull-right[
<img src="angrist_2019_results.jpg" width="1388" style="display: block; margin: auto;" />
.blue[What does the picture on the right tell you about the effects of class size in Israel from 2002-2011?]
]

---
# Are RD assumptions met?

```r
bunch <- ggplot() +
  geom_histogram(data=d, aes(size), fill=blue, binwidth = 1)
```

<img src="EDLD_650_4_RD_2_files/figure-html/unnamed-chunk-13-1.png" style="display: block; margin: auto;" />

---
# Are RD assumptions met?

```r
sort <- ggplot() +
  geom_boxplot(data=d, aes(x=as.factor(size), y=ses), fill=red_pink, alpha=0.4)
```

<img src="EDLD_650_4_RD_2_files/figure-html/unnamed-chunk-15-1.png" style="display: block; margin: auto;" />

---
# Are RD assumptions met?

.small[
```r
quantile <- ggplot() +
  geom_quantile(data=filter(d, size<41), aes(size, ses), quantiles=0.5, color=purple) +
  geom_quantile(data=filter(d, size>=41), aes(size, ses), quantiles=0.5, color=red_pink)
```
]

<img src="EDLD_650_4_RD_2_files/figure-html/unnamed-chunk-17-1.png" style="display: block; margin: auto;" />

---
# Let's see if there's an effect

<img src="EDLD_650_4_RD_2_files/figure-html/unnamed-chunk-18-1.png" style="display: block; margin: auto;" />

---
# Refresh on how RD works

`$$READSCORE_{i} = \beta_{0} + \beta_{1} COHORTSIZE_{i} + 1(COHORTSIZE_{i} \geq 41)\beta_{2} + \varepsilon_{i}$$`

--

Could also write this as:

`$$READSCORE_{i} = \beta_{0} + \beta_{1} COHORTSIZE_{i} + \beta_{2} SMALLCLASS_{i} + \varepsilon_{i}$$`

Can you explain the identification strategy as you would in your methods section (using *secular trend, forcing variable, equal in expectation, projecting across the discontinuity, ITT*)?

---
# Refresh on how RD works

> We estimate the effects of class size on individual *i*'s reading score. Specifically, we regress their test score outcome on whether the size of their grade cohort predicts that they will be assigned to a small class. We account for the secular relationship between test scores and cohort size by adjusting our estimates for the linear relationship between cohort size and test scores.

> Our identification strategy relies on the assumption that cohorts that differ in size by only a few students are equal in expectation prior to the exogenous assignment to a small class size `\((D_{i}=1)\)`. Our modeling approach depends on our ability to project a smooth relationship between reading scores and cohort size across the discontinuity and then estimate the discontinuous effect of being quasi-randomly assigned to learn in smaller classes. Given that compliance with Maimonides' Rule is imperfect, our approach yields intent-to-treat (ITT) estimates: specifically, the effect on reading scores of being *assigned by rule* to a smaller class.

---
# Let's see if there's an effect

```r
d <- d %>% mutate(small = ifelse(size >= 41, TRUE, FALSE))
fx2 <- ggplot() +
  geom_point(data=d, aes(x=size, y=read, color=small), alpha=0.8, shape=16)
```

<img src="EDLD_650_4_RD_2_files/figure-html/unnamed-chunk-20-1.png" style="display: block; margin: auto;" />

--

.small[The raw scatter prevents visual detection of any discontinuity...thus, the value of [.red-pink[**bin scatter**]](https://arxiv.org/abs/1902.09608).]

---
# Let's see if there's an effect

```r
bin <- d %>% group_by(size) %>% summarise(across(c("read", "small"), mean))
binned_plot <- ggplot() +
  geom_point(data=bin, aes(x=size, y=read, color=as.factor(small)), alpha=0.8, shape=16, size=3)
```

<img src="EDLD_650_4_RD_2_files/figure-html/unnamed-chunk-22-1.png" style="display: block; margin: auto;" />

---
# Let's see if there's an effect
### Fitted lines:

<img src="EDLD_650_4_RD_2_files/figure-html/unnamed-chunk-23-1.png" style="display: block; margin: auto;" />

---
# Let's see if there's an effect
### Different slopes:

<img src="EDLD_650_4_RD_2_files/figure-html/unnamed-chunk-24-1.png" style="display: block; margin: auto;" />

---
# Let's see if there's an effect
### Change the bandwidth:

<img src="EDLD_650_4_RD_2_files/figure-html/unnamed-chunk-25-1.png" style="display: block; margin: auto;" />

---
# Let's see if there's an effect
### Formal-ish:

<img src="EDLD_650_4_RD_2_files/figure-html/unnamed-chunk-26-1.png" style="display: block; margin: auto;" />

---
# But it could be non-linear
### Formal-ish:

<img src="EDLD_650_4_RD_2_files/figure-html/unnamed-chunk-27-1.png" style="display: block; margin: auto;" />

---
# Regression RD

Let's fit three different intent-to-treat (ITT) RD models, each of which assumes a different functional form for the forcing variable:

**Linear trend, same slope**
$$
`\begin{aligned}
(1) READSCORE_i=\beta_0 + \beta_1 COHORTSIZE_i + \beta_2 SMALLCLASS_i + \epsilon_i
\end{aligned}`
$$

**Linear trend, different slope**
$$
`\begin{aligned}
(2) READSCORE_i= &\beta_0 + \beta_1 COHORTSIZE_i + \beta_2 SMALLCLASS_i + \\
&\beta_3 COHORTSIZE_i \times SMALLCLASS_i + \epsilon_i
\end{aligned}`
$$

**Quadratic trend, same slope**
$$
`\begin{aligned}
(3) READSCORE_i= &\beta_0 + \beta_1 COHORTSIZE_i + \beta_2 \text{COHORTSIZE}_{i}^2 + \\
& \beta_3 SMALLCLASS_i + \epsilon_i
\end{aligned}`
$$

---
# Results

.small[
<table style="NAborder-bottom: 0; width: auto !important; margin-left: auto; margin-right: auto;" class="table">
 <thead>
  <tr>
   <th style="text-align:left;"> </th>
   <th style="text-align:center;"> (1) </th>
   <th style="text-align:center;"> (2) </th>
   <th style="text-align:center;"> (3) </th>
  </tr>
 </thead>
<tbody>
  <tr>
   <td style="text-align:left;"> Intercept </td>
   <td style="text-align:center;"> 75.825*** </td>
   <td style="text-align:center;"> 96.046*** </td>
   <td style="text-align:center;"> 68.334*** </td>
  </tr>
  <tr>
   <td style="text-align:left;"> </td>
   <td style="text-align:center;"> (4.202) </td>
   <td style="text-align:center;"> (8.621) </td>
   <td style="text-align:center;"> (1.516) </td>
  </tr>
  <tr>
   <td style="text-align:left;"> Intended size </td>
   <td style="text-align:center;"> -0.139 </td>
   <td style="text-align:center;"> -0.725** </td>
   <td style="text-align:center;"> -33.987+ </td>
  </tr>
  <tr>
   <td style="text-align:left;"> </td>
   <td style="text-align:center;"> (0.119) </td>
   <td style="text-align:center;"> (0.248) </td>
   <td style="text-align:center;"> (17.666) </td>
  </tr>
  <tr>
   <td style="text-align:left;"> .red[**Intended small class**] </td>
   <td style="text-align:center;"> 3.953* </td>
   <td style="text-align:center;"> -24.346* </td>
   <td style="text-align:center;"> 5.894** </td>
  </tr>
  <tr>
   <td style="text-align:left;"> </td>
   <td style="text-align:center;"> (1.800) </td>
   <td style="text-align:center;"> (10.708) </td>
   <td style="text-align:center;"> (1.990) </td>
  </tr>
  <tr>
   <td style="text-align:left;"> Size x Small </td>
   <td style="text-align:center;"> </td>
   <td style="text-align:center;"> 0.757** </td>
   <td style="text-align:center;"> </td>
  </tr>
  <tr>
   <td style="text-align:left;"> </td>
   <td style="text-align:center;"> </td>
   <td style="text-align:center;"> (0.282) </td>
   <td style="text-align:center;"> </td>
  </tr>
  <tr>
   <td style="text-align:left;"> (Intended size)^2 </td>
   <td style="text-align:center;"> </td>
   <td style="text-align:center;"> </td>
   <td style="text-align:center;"> 22.716* </td>
  </tr>
  <tr>
   <td style="text-align:left;box-shadow: 0px 1.5px"> </td>
   <td style="text-align:center;box-shadow: 0px 1.5px"> </td>
   <td style="text-align:center;box-shadow: 0px 1.5px"> </td>
   <td style="text-align:center;box-shadow: 0px 1.5px"> (10.154) </td>
  </tr>
  <tr>
   <td style="text-align:left;"> Num.Obs. </td>
   <td style="text-align:center;"> 423 </td>
   <td style="text-align:center;"> 423 </td>
   <td style="text-align:center;"> 423 </td>
  </tr>
  <tr>
   <td style="text-align:left;"> R2 </td>
   <td style="text-align:center;"> 0.015 </td>
   <td style="text-align:center;"> 0.031 </td>
   <td style="text-align:center;"> 0.026 </td>
  </tr>
</tbody>
<tfoot><tr><td style="padding: 0; " colspan="100%">
<sup></sup> + p < 0.1, * p < 0.05, ** p < 0.01, *** p < 0.001</td></tr></tfoot>
</table>
]

---
## Can you explain these results?

Models 1 and 3 seem sensibly connected to the graphical evidence, but Model 2 suggests that the effect of an offer of a small class is negative: a whopping estimated .red[**24.3-point drop**] in reading scores. What gives?

--

Recall the three models we fit:

$$
`\begin{aligned}
(1) READSCORE_i=\beta_0 + \beta_1 COHORTSIZE_i + \beta_2 SMALLCLASS_i + \epsilon_i
\end{aligned}`
$$

$$
`\begin{aligned}
(2) READSCORE_i= &\beta_0 + \beta_1 COHORTSIZE_i + \beta_2 SMALLCLASS_i + \\
&\beta_3 COHORTSIZE_i \times SMALLCLASS_i + \epsilon_i
\end{aligned}`
$$

$$
`\begin{aligned}
(3) READSCORE_i= &\beta_0 + \beta_1 COHORTSIZE_i + \beta_2 \text{COHORTSIZE}_{i}^2 + \\
& \beta_3 SMALLCLASS_i + \epsilon_i
\end{aligned}`
$$

.small[We need to project the fitted values that our regression results predict *at the discontinuity*. The most straightforward way is to plug in the values for grade cohorts that are just under and just over the threshold for being split in two by Maimonides' Rule, using the estimated coefficients from the table on the previous slide.]

.blue[Take Eq. 2 and try doing this for cohorts of 40 and 41, respectively.]

---
## Can you explain these results?
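One way to check this plug-in calculation is in code. Below is a short sketch (assuming `fit2` is the `lm()` fit of Eq. 2, with coefficients ordered intercept, cohort size, small-class indicator, interaction), followed by the same arithmetic worked out by hand.

```r
# Sketch: plug cohorts of 40 and 41 into the Model 2 (different-slopes) coefficients
b <- coef(fit2)                                  # assumed order: intercept, size, small, size:small
yhat_big   <- b[1] + b[2]*40                     # cohort of 40: stays in one big class
yhat_small <- b[1] + b[2]*41 + b[3] + b[4]*41    # cohort of 41: split into small classes
yhat_small - yhat_big                            # implied jump at the discontinuity
```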
**Big class, grade cohort = 40**

$$
`\begin{aligned}
\hat{READSCORE_i} = & 96.046 + (-0.725)(40) + (-24.346)(0) + (0.757)(40)(0) \\
& 96.046 + (-29) + 0 + 0 \\
& 67.046
\end{aligned}`
$$

--

**Small class, grade cohort = 41**

$$
`\begin{aligned}
\hat{READSCORE_i} = & 96.046 + (-0.725)(41) + (-24.346)(1) + (0.757)(41)(1) \\
& 96.046 + (-29.725) + (-24.346) + 31.037 \\
& 73.012
\end{aligned}`
$$

--

So the predicted effect of being assigned to receive a smaller class when we allow the slopes to vary around the discontinuity is `\(73.012 - 67.046 = 5.966\)`, larger than the linear, constant-slope estimate and slightly larger than the quadratic specification.

--

.purple[*Note: this is all implicitly solved for when you re-center the forcing variable at 0.*]

---
## Can you explain these results?

Now that we've harmonized our findings, can you explain these results in technically accurate and substantively clear ways?

--

> We estimate an effect of being assigned to a small class of between roughly 4 and 6 scale score points, depending on our assumptions about the nature of the underlying secular relationship between cohort size and reading performance. At the lower bound, these represent effects of around one-half of a standard deviation (*SD*) unit. At the upper bound, these effects are as large as three-quarters of a standard deviation in the full sample. These estimates are Local Average Treatment Effects (LATE), specific to being a member of a cohort whose size is just above or below the threshold for being divided into a smaller class.

---
# Extensions

1. Bandwidth variation (bias v. variance tradeoff)
  - Manual (see the short sketch on the appendix slide at the end of the deck)
  - Cross-validation (leave-one-out)
  - Imbens-Kalyanaraman (2009) Optimal Bandwidth Calculation
2. Higher-order polynomials
3. Non-parametric estimates
  - Local-linear approaches (LOESS)
  - Kernel (how to value points closest to cutoff)
  - Machine learning
4. Binning for visualizations
5. Diff-in-RD
6. Packages
  - R: `rddapp`, `rdd`, `rddtools`, `rdrobust`

---
# Just for fun...

<img src="EDLD_650_4_RD_2_files/figure-html/unnamed-chunk-29-1.png" style="display: block; margin: auto;" />

---
class: middle, inverse
# Wrap-up

---
# Goals

### 1. Describe conceptual approach to regression discontinuity analysis

### 2. Assess validity of RD assumptions in applied context

### 3. Conduct RD analysis in simplified data

---
# To-Dos

### Week 5: Regression Discontinuity II

### Readings:
- Holden (2016)

### DARE #2
- Due 2/4, 11:59pm

### Project proposal
- Due 2/2, 11:59pm

---
# Feedback

.large[.red[**Midterm Student Experience Survey**]]
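---
# Appendix: bandwidth sensitivity sketch

A minimal sketch of the "manual" bandwidth variation listed on the Extensions slide, re-using the `d` data frame (`read`, `size`, `small`) from earlier. The candidate bandwidths are illustrative, and the coefficient name `smallTRUE` assumes `small` is the logical indicator defined above.

```r
# Re-estimate a linear ITT model within windows of varying width around the cutoff of 41
rd_by_bw <- function(bw) {
  sub <- dplyr::filter(d, abs(size - 41) <= bw)
  fit <- lm(read ~ I(size - 41) + small, data = sub)
  coef(fit)[["smallTRUE"]]             # estimated jump at the cutoff for this bandwidth
}

sapply(c(5, 10, 20, 40), rd_by_bw)     # narrower windows: less bias, more variance
```

If the estimated jump is reasonably stable across bandwidths, we can be more confident that functional-form assumptions about the secular trend are not driving the result.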