class: center, middle, inverse, title-slide .title[ # Difference-in-Differences ] .subtitle[ ## EDLD 650: Week 2 ] .author[ ### David D. Liebowitz ] --- <style type="text/css"> .inverse { background-color : #2293bf; } </style> # Agenda ### 1. Roadmap and Goals ### 2. Estimating DD effects in data ### 3. Wrap-up - DARE #1 --- # Roadmap <img src="causal_id.jpg" width="1707" style="display: block; margin: auto;" /> --- # Goals .large[ 1. Describe threats to validity in difference-in-differences (DD) identification strategy and multiple approaches to address these threats + *from responding to Discussion Questions* 2. Using a cleaned dataset, estimate multiple DD specifications in R and interpret these results + *from this lecture and accompanying [script](./code/EDLD_650_2_DD_script.R)* ] --- # Programming in EDLD 650 ## What you won't get 🙁 - A heavy dose of data management and visualization strategies - The most efficient code with extensive use of functions -- ## What you will get 😄 - A review of the programming steps you should take as part of the **actual** research process - *Some* model code for data management and visualization - Programming strategies and packages that can be used to estimate the causal inference techniques we will study - A community of skilled programmers who will expand our collective knowledge base! --- class: middle, inverse # Estimating a classic, two-period difference-in-differences (DD) model --- # Replicating Dynarski (2003) Recall Dynarski's primary model (Eq. 2): $$ `\begin{align} y_i=\alpha + \beta(\text{FATHERDEC}_i \times \text{BEFORE}_i) + \delta \text{FATHERDEC}_i + \theta \text{BEFORE}_i + \upsilon_i \end{align}` $$ -- <br> .large[**Let's try to fit this in our data!**] --- # Reading in the data I'm using the `haven` package to import a data file that is in the Stata .dta format. Lotsa options for importing file formats other than .csv ( `foreign` and `rio` are two such ones)! ```r dynarski <- haven::read_dta(here("data/ch8_dynarski.dta")) head(dynarski) ``` ``` #> # A tibble: 6 x 8 #> id hhid wt88 coll hgc23 yearsr fatherdec offer #> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl+lbl> <dbl> #> 1 9 9 691916 1 13 81 0 [Father not deceased] 1 #> 2 14 13 784204 1 16 81 0 [Father not deceased] 1 #> 3 15 15 811032 1 16 82 0 [Father not deceased] 0 #> 4 21 20 644853 1 16 79 0 [Father not deceased] 1 #> 5 22 22 728189 1 16 80 0 [Father not deceased] 1 #> 6 24 23 776590 0 12 79 0 [Father not deceased] 1 ``` --- # Viewing the data
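The interactive table from the live slides doesn't render in this static copy. As a placeholder, here is a minimal sketch of one way to browse the data, assuming the `DT` package is available (any viewer, e.g. `View()`, works just as well):

```r
# Minimal sketch (assumes the DT package): browse the imported data interactively
DT::datatable(dynarski, options = list(pageLength = 5))
```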
--- # Understanding the data (1) ```r d <- select(dynarski, coll, hgc23, fatherdec, offer) summary(d) ``` ``` #> coll hgc23 fatherdec offer #> Min. :0.0000 Min. :10.00 Min. :0.00000 Min. :0.000 #> 1st Qu.:0.0000 1st Qu.:12.00 1st Qu.:0.00000 1st Qu.:0.000 #> Median :0.0000 Median :12.00 Median :0.00000 Median :1.000 #> Mean :0.4579 Mean :13.14 Mean :0.04792 Mean :0.723 #> 3rd Qu.:1.0000 3rd Qu.:14.00 3rd Qu.:0.00000 3rd Qu.:1.000 #> Max. :1.0000 Max. :19.00 Max. :1.00000 Max. :1.000 ``` -- ```r sum(is.na(dynarski$coll)) ``` ``` #> [1] 0 ``` --- # Understanding the data (2) ```r college <- table(dynarski$fac_fatherdec, dynarski$fac_coll) college ``` ``` #> #> No College College #> Father not deceased 2059 1736 #> Father deceased 102 89 ``` --- # Plot outcome data ```r hg <- ggplot(dynarski, aes(hgc23)) + geom_histogram(binwidth=1) hg + scale_x_continuous(name="Highest-grade completed at 23", breaks=c(10, 12, 14, 16, 18, 20)) + theme_pander(base_size=18) ``` <img src="EDLD_650_2_DD_1_files/figure-html/unnamed-chunk-9-1.png" style="display: block; margin: auto;" /> --- # Summary statistics table <table style="text-align:center"><caption><strong>Table 1. Descriptive Statistics</strong></caption> <tr><td colspan="4" style="border-bottom: 1px solid black"></td></tr><tr><td style="text-align:left">Statistic</td><td>N</td><td>Mean</td><td>St. Dev.</td></tr> <tr><td colspan="4" style="border-bottom: 1px solid black"></td></tr><tr><td style="text-align:left">Attend college at 23</td><td>3,986</td><td>0.46</td><td>0.50</td></tr> <tr><td style="text-align:left">Years schooling at 23</td><td>3,986</td><td>13.14</td><td>1.63</td></tr> <tr><td style="text-align:left">Father deceased</td><td>3,986</td><td>0.05</td><td>0.21</td></tr> <tr><td style="text-align:left">Offer</td><td>3,986</td><td>0.72</td><td>0.45</td></tr> <tr><td colspan="4" style="border-bottom: 1px solid black"></td></tr><tr><td colspan="4" style="text-align:left">Notes: This table presents unweighted means and standard deviations from the NLSY poverty and random samples used in the Dynarski (2003) paper.</td></tr> </table> --- # Graphical DD <img src="EDLD_650_2_DD_1_files/figure-html/unnamed-chunk-11-1.png" style="display: block; margin: auto;" /> -- .blue[**What is the treatment effect?**] -- **What is the core .blue[identifying assumption] underlying the DD framework?** How do we know whether we've satisfied it?
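If you want to see the numbers behind a plot like this, one option is to compute the four weighted group means and difference them by hand. This is only a sketch (not the code that produced the figure), but the resulting 2x2 difference-in-differences should match the interaction coefficient from the weighted regression we fit below:

```r
# Sketch: weighted mean college attendance in each of the four cells,
# then the difference-in-differences of those means
library(dplyr)

dd_means <- dynarski %>% 
  group_by(fatherdec, offer) %>% 
  summarise(mean_coll = weighted.mean(coll, w = wt88), .groups = "drop")

dd_means

# DD = (eligible - not eligible | father deceased) -
#      (eligible - not eligible | father not deceased)
with(dd_means,
     (mean_coll[fatherdec == 1 & offer == 1] - mean_coll[fatherdec == 1 & offer == 0]) -
     (mean_coll[fatherdec == 0 & offer == 1] - mean_coll[fatherdec == 0 & offer == 0]))
```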
--- # Graphical DD <img src="EDLD_650_2_DD_1_files/figure-html/unnamed-chunk-12-1.png" style="display: block; margin: auto;" /> .blue[**What would you think if you "knew" this was the pattern?**] --- ## Estimate classic two-period DD Dynarski's original model: $$ `\begin{align} y_i=\alpha + \beta(\text{FATHERDEC}_i \times \text{BEFORE}_i) + \delta \text{FATHERDEC}_i + \theta \text{BEFORE}_i + \upsilon_i \end{align}` $$ -- Murnane and Willett have renamed the variable to make clear that a value of 1 means individuals are eligible for aid, so we'll do the same: $$ `\begin{align} y_i=\alpha + \beta(\text{FATHERDEC}_i \times \text{OFFER}_i) + \delta \text{FATHERDEC}_i + \theta \text{OFFER}_i + \upsilon_i \end{align}` $$ --- ## Estimate classic two-period DD $$ `\begin{align} y_i=\alpha + \beta(\text{FATHERDEC}_i \times \text{OFFER}_i) + \delta \text{FATHERDEC}_i + \theta \text{OFFER}_i + \upsilon_i \end{align}` $$ ```r lm(coll ~ fatherdec*offer, data=dynarski) ``` ``` #> #> Call: #> lm(formula = coll ~ fatherdec * offer, data = dynarski) #> #> Coefficients: #> (Intercept) fatherdec offer fatherdec:offer #> 0.42571 -0.07386 0.04387 0.11523 ``` -- This doesn't quite match; let's add the weights in... --- ## Estimate classic two-period DD $$ `\begin{align} y_i=\alpha + \beta(\text{FATHERDEC}_i \times \text{OFFER}_i) + \delta \text{FATHERDEC}_i + \theta \text{OFFER}_i + \upsilon_i \end{align}` $$ ```r lm(coll ~ fatherdec*offer, data=dynarski, weights=dynarski$wt88) ``` ``` #> #> Call: #> lm(formula = coll ~ fatherdec * offer, data = dynarski, weights = dynarski$wt88) #> #> Coefficients: #> (Intercept) fatherdec offer fatherdec:offer #> 0.47569 -0.12348 0.02601 0.18223 ``` -- Pretty underwhelming output? --- # Under the hood ```r est_dynarski <- lm(coll ~ fatherdec*offer, data=dynarski, weights=dynarski$wt88) est_dynarski %>% names() ``` ``` #> [1] "coefficients" "residuals" "fitted.values" "effects" #> [5] "weights" "rank" "assign" "qr" #> [9] "df.residual" "xlevels" "call" "terms" #> [13] "model" ``` -- ```r est_dynarski %>% tidy() ``` ``` #> # A tibble: 4 x 5 #> term estimate std.error statistic p.value #> <chr> <dbl> <dbl> <dbl> <dbl> #> 1 (Intercept) 0.476 0.0150 31.8 7.12e-198 #> 2 fatherdec -0.123 0.0752 -1.64 1.01e- 1 #> 3 offer 0.0260 0.0178 1.46 1.43e- 1 #> 4 fatherdec:offer 0.182 0.0893 2.04 4.14e- 2 ``` --- # Further under the hood ```r summary(est_dynarski) ``` ``` ... #> Min 1Q Median 3Q Max #> -490.9 -230.3 -138.6 247.7 554.0 #> #> Coefficients: *#> Estimate Std. Error t value Pr(>|t|) #> (Intercept) 0.47569 0.01496 31.793 <2e-16 *** *#> fatherdec -0.12348 0.07520 -1.642 0.1007 *#> offer 0.02601 0.01777 1.463 0.1435 *#> fatherdec:offer 0.18223 0.08931 2.041 0.0414 * #> --- #> Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1 #> #> Residual standard error: 285.7 on 3982 degrees of freedom #> Multiple R-squared: 0.001961, Adjusted R-squared: 0.001209 #> F-statistic: 2.607 on 3 and 3982 DF, p-value: 0.04998 ...
``` --- # Making a no-fuss table .small[ ```r stargazer(est_dynarski, type='html', single.row = T) ``` <table style="text-align:center"><tr><td colspan="2" style="border-bottom: 1px solid black"></td></tr><tr><td style="text-align:left"></td><td><em>Dependent variable:</em></td></tr> <tr><td></td><td colspan="1" style="border-bottom: 1px solid black"></td></tr> <tr><td style="text-align:left"></td><td>coll</td></tr> <tr><td colspan="2" style="border-bottom: 1px solid black"></td></tr><tr><td style="text-align:left">fatherdec</td><td>-0.123 (0.075)</td></tr> <tr><td style="text-align:left">offer</td><td>0.026 (0.018)</td></tr> <tr><td style="text-align:left">fatherdec:offer</td><td>0.182<sup>**</sup> (0.089)</td></tr> <tr><td style="text-align:left">Constant</td><td>0.476<sup>***</sup> (0.015)</td></tr> <tr><td colspan="2" style="border-bottom: 1px solid black"></td></tr><tr><td style="text-align:left">Observations</td><td>3,986</td></tr> <tr><td style="text-align:left">R<sup>2</sup></td><td>0.002</td></tr> <tr><td style="text-align:left">Adjusted R<sup>2</sup></td><td>0.001</td></tr> <tr><td style="text-align:left">Residual Std. Error</td><td>285.711 (df = 3982)</td></tr> <tr><td style="text-align:left">F Statistic</td><td>2.607<sup>**</sup> (df = 3; 3982)</td></tr> <tr><td colspan="2" style="border-bottom: 1px solid black"></td></tr><tr><td style="text-align:left"><em>Note:</em></td><td style="text-align:right"><sup>*</sup>p<0.1; <sup>**</sup>p<0.05; <sup>***</sup>p<0.01</td></tr> </table> ] --- # Central DD assumptions In order to fully trust that the estimates produced by a DD analysis are unbiased by endogeneity, we need to make (and defend) the following two assumptions: 1. Not-treated (or not-yet-treated) units are .blue[**valid counterfactuals**] - Parallel trends? - Selection into treatment? (non-exogeneity) -- 2. There are no .blue[**simultaneous shocks**] or unobserved .blue[**secular trends**] - Other observed and unobserved events or patterns? -- We'll look at how to address some of these in the next section of the lecture, and you'll read more about how to do so in the readings and DARE for next week! --- class: middle, inverse # DD in panel data #### A. The two-way fixed effect (TWFE) estimator for staggered implementation #### B. Appropriate statistical inference #### C. Assessing the parallel trends assumption (PTA) #### D. The modern event-study approach --- # End of desegregation .small[ - In 1991, 480 school districts were under court desegregation orders - In the following two decades, nearly half (215) were released and returned to neighborhood assignment patterns - Timing of release was arguably .blue[**exogenous**] and .blue[**quasi-random**] - This provides strong support for the claim that the districts which were not (or *not yet*) released from court orders were on .blue[**parallel trends**] in their outcomes with districts that were released and, thus, serve as .blue[**valid counterfactuals**]<sup>1</sup> ] .pull-left[ <img src="state_map.png" width="667" style="display: block; margin: auto;" /> ] .pull-right[ <img src="release_timing.png" width="667" style="display: block; margin: auto;" /> ] .footnote[[1] [Liebowitz (2018)](https://journals.sagepub.com/doi/10.3102/0162373717725804)] --- # End of desegregation data
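The interactive preview of these data doesn't render in this static copy. Below is a sketch of the kind of peek you might take at the panel, assuming it has already been read in as `desegregation` (the variable names are the ones used in the models that follow):

```r
# Sketch: peek at the district-by-year panel used in the models below
# (assumes `desegregation` is already loaded)
library(dplyr)

desegregation %>% 
  select(leaid, year, yrdiss, unitary, sd_dropout_prop_b, sd_t_1619_b) %>% 
  arrange(leaid, year) %>% 
  head(9)
```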
--- # Estimate DD in panel data (1) $$ `\begin{align} \text{DROPOUT_BLACK} _{jt} = \beta_1 \text{UNITARY} _{jt} + \Gamma_j + \Pi_t + \epsilon _{j} \end{align}` $$ -- Take a minute to write down what this model does in words. Use the terms .blue[**mean effect**], .blue[**time series**], .blue[**fixed effects**] and .blue[**causal parameter of interest**]. -- .small[ > The model takes advantage of **time series (or panel or repeated measure)** data in which the Black dropout rate in each district is observed at three points in time. The model regresses the Black dropout rate in a **fixed effect** model in which observations are clustered in two dimensions: within district `\((\Gamma_j)\)` and also within time `\((\Pi_t)\)`. Note: `\(\Gamma_j\)` represents a vector of dummy indicators that take the value of one if an observation comes from district *j* and zero otherwise. `\(\Pi_t\)` represents a vector of dummy indicators that take the value of one if an observation comes from time *t* (1990, 2000 or 2010). `\(\beta_{1}\)` estimates the **average treatment effect** of being observed after being declared unitary and is the **causal parameter of interest** reflecting the effect of being released from a desegregation order `\(UNITARY_{jt}\)` on the Black dropout rate. ] -- .small[In this case, the estimates rely on .blue[**repeated cross-sectional**] panel data. We could also implement the same framework in .blue[**longitudinal**] panel data.] --- # Estimate DD in panel data (2) We are going to shift to using the `fixest` [package](https://cran.r-project.org/web/packages/fixest/index.html), an incredibly versatile and robust tool for regression analysis in R from Laurent Bergé. ```r ols_unitary1 <- feols(sd_dropout_prop_b ~ unitary | leaid + year, data=desegregation, vcov = "iid", weights=desegregation$sd_t_1619_b) summary(ols_unitary1) ``` ``` #> OLS estimation, Dep. Var.: sd_dropout_prop_b *#> Observations: 1,403 *#> Weights: desegregation$sd_t_1619_b *#> Fixed-effects: leaid: 476, year: 3 #> Standard-errors: IID *#> Estimate Std. Error t value Pr(>|t|) *#> unitary 0.018185 0.003121 5.82642 7.8155e-09 *** #> --- #> Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1 #> RMSE: 1.17345 Adj. R2: 0.547167 #> Within R2: 0.035437 ``` -- .blue[**Can you interpret this output?**] .small[(*ignore un-highlighted line for now*)] --- ### Addressing serial correlation **The worry**: within-unit correlation of outcomes (e.g., within-state, across state-years) means our errors are correlated and our estimated standard errors are therefore too small. As a result, our .blue[**statistical inference**] will be incorrect. -- **The solution**: .blue[**cluster-robust standard errors**]<sup>1</sup>. Clustering standard errors by the *k<sup>th</sup>* regressor inflates iid OLS standard errors by: `$$\tau_{k} \simeq 1 + \rho_{x_{k}} \rho_{\mu} (\bar{N}_{g} - 1)$$` where `\(\rho_{x_{k}}\)` is the within-cluster correlation of regressor `\(x_{igk}\)`, `\(\rho_{\mu}\)` is the within-cluster error correlation and `\(\bar{N}_{g}\)` is the average cluster size. -- `\(\tau_{k}\)` is **asymptotically** correct as the number of clusters increases. Current consensus: this estimate of `\(\tau_{k}\)` is accurate with .blue[**~45 clusters**]. Fewer than 40, and this approach can dramatically under-estimate SEs (consider bootstrapping).
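To get a feel for the size of the correction the formula above implies, here is a small illustration with entirely made-up values for the three quantities:

```r
# Purely hypothetical values, just to illustrate the inflation formula above
rho_x <- 0.9   # within-cluster correlation of the regressor (made up)
rho_u <- 0.3   # within-cluster error correlation (made up)
n_bar <- 3     # average cluster size (made up)

tau_k <- 1 + rho_x * rho_u * (n_bar - 1)
tau_k  # inflation factor implied by the formula: 1.54
```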
**Best practice**: cluster at the unit of treatment (or consider two-way clustering).<sup>2</sup> .footnote[.small[[1] Read all about cluster-robust standard errors in [Cameron & Miller's (2015)](http://jhr.uwpress.org/content/50/2/317.refs) accessible practitioner's guide to standard errors. <br> [2] [Bertrand, Mullainathan & Duflo (2004)](https://academic.oup.com/qje/article/119/1/249/1876068) and [Abadie et al. (2017)](https://www.nber.org/papers/w24003).]] --- # Clustered standard errors (1) ```r ols_unitary2 <- feols(sd_dropout_prop_b ~ unitary | leaid + year, data=desegregation, weights=desegregation$sd_t_1619_b) summary(ols_unitary2) ``` ``` #> OLS estimation, Dep. Var.: sd_dropout_prop_b #> Observations: 1,403 #> Weights: desegregation$sd_t_1619_b *#> Fixed-effects: leaid: 476, year: 3 *#> Standard-errors: Clustered (leaid) *#> Estimate Std. Error t value Pr(>|t|) *#> unitary 0.018185 0.004851 3.74879 0.00019958 *** #> --- #> Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1 #> RMSE: 1.17345 Adj. R2: 0.547167 #> Within R2: 0.035437 ``` -- Default behavior in `fixest` is to cluster standard errors on the first fixed effect. --- # Clustered standard errors (2) ```r ols_unitary3 <- feols(sd_dropout_prop_b ~ unitary | leaid + year, data=desegregation, vcov = ~ leaid^year, weights=desegregation$sd_t_1619_b) summary(ols_unitary3) ``` ``` #> OLS estimation, Dep. Var.: sd_dropout_prop_b #> Observations: 1,403 #> Weights: desegregation$sd_t_1619_b *#> Fixed-effects: leaid: 476, year: 3 *#> Standard-errors: Clustered (leaid^year) *#> Estimate Std. Error t value Pr(>|t|) *#> unitary 0.018185 0.004816 3.77557 0.00016631 *** #> --- #> Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1 #> RMSE: 1.17345 Adj. R2: 0.547167 #> Within R2: 0.035437 ``` -- .small[We are going to cluster our standard errors .blue[**at the level of assignment to treatment**]: the district-year.] --- # Addressing serial correlation .small[ <table style="NAborder-bottom: 0; width: auto !important; margin-left: auto; margin-right: auto;" class="table"> <caption>A taxonomy of models estimating the end of school desegregation on the black dropout rate, by std. error clustering approach</caption> <thead> <tr> <th style="text-align:left;"> </th> <th style="text-align:center;"> Unclustered </th> <th style="text-align:center;"> Clustered (Unit) </th> <th style="text-align:center;"> Clustered (Unit*Period) </th> </tr> </thead> <tbody> <tr> <td style="text-align:left;"> unitary </td> <td style="text-align:center;"> 0.018*** </td> <td style="text-align:center;"> 0.018*** </td> <td style="text-align:center;"> 0.018*** </td> </tr> <tr> <td style="text-align:left;box-shadow: 0px 1.5px"> </td> <td style="text-align:center;box-shadow: 0px 1.5px"> (0.003) </td> <td style="text-align:center;box-shadow: 0px 1.5px"> (0.005) </td> <td style="text-align:center;box-shadow: 0px 1.5px"> (0.005) </td> </tr> <tr> <td style="text-align:left;"> Num.Obs. 
</td> <td style="text-align:center;"> 1403 </td> <td style="text-align:center;"> 1403 </td> <td style="text-align:center;"> 1403 </td> </tr> <tr> <td style="text-align:left;"> R2 </td> <td style="text-align:center;"> 0.702 </td> <td style="text-align:center;"> 0.702 </td> <td style="text-align:center;"> 0.702 </td> </tr> <tr> <td style="text-align:left;"> Std.Errors </td> <td style="text-align:center;"> IID </td> <td style="text-align:center;"> by: leaid </td> <td style="text-align:center;"> by: leaid^year </td> </tr> <tr> <td style="text-align:left;"> FE: leaid </td> <td style="text-align:center;"> X </td> <td style="text-align:center;"> X </td> <td style="text-align:center;"> X </td> </tr> <tr> <td style="text-align:left;"> FE: year </td> <td style="text-align:center;"> X </td> <td style="text-align:center;"> X </td> <td style="text-align:center;"> X </td> </tr> </tbody> <tfoot> <tr><td style="padding: 0; " colspan="100%"> <sup></sup> + p < 0.1, * p < 0.05, ** p < 0.01, *** p < 0.001</td></tr> <tr><td style="padding: 0; " colspan="100%"> <sup></sup> Notes: The table displays coefficients from Equation X with standard errors in parentheses.</td></tr> </tfoot> </table> ] -- .small[Doesn't make too much of a difference here...] -- .small[*Note*: Using `modelsummary` package, but `fixest` comes with the powerful `etable` function.] --- # Addressing parallel trends ### A parametric approach $$ `\begin{aligned} \text{DROPOUT_BLACK}_{jt} = & \beta_1 \text{UNITARY} _{jt} + \beta_2 (\text{UNITARY} \times \text{REL_YEAR})_{jt} + \\ & \beta_3 \text{REL_YEAR}_{jt} + \Gamma_j + \Pi_t + \epsilon _{j} \end{aligned}` $$ -- What is this `\(\text{REL_YEAR}_{jt}\)` and how do we code it? -- ```r desegregation <- desegregation %>% mutate(rel_yr = case_when( !is.na(yrdiss) ~ (year - yrdiss), is.na(yrdiss) ~ -1 ## <-- this is funky, let's talk about it )) summary(desegregation$rel_yr) ``` ``` #> Min. 1st Qu. Median Mean 3rd Qu. Max. #> -19.00 -1.00 -1.00 -1.51 -1.00 19.00 ``` --- # Peek at REL_YEAR .small[
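(The interactive preview doesn't render in this static copy; below is a sketch of the kind of spot-check you might run on the new variable.)

```r
# Sketch: spot-check how rel_yr lines up with year and yrdiss
# (illustrative only; the live slide shows an interactive table)
library(dplyr)

desegregation %>% 
  filter(!is.na(yrdiss)) %>% 
  select(leaid, year, yrdiss, rel_yr) %>% 
  head(6)
```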
] --- # Map coefficients to graph $$ `\begin{aligned} \text{DROPOUT_BLACK}_{jt} = & \color{red}{\beta_1} \text{UNITARY} _{jt} + \color{orange}{\beta_2} (\text{UNITARY} \times \text{REL_YEAR})_{jt} + \\ & \color{blue}{\beta_3} \text{REL_YEAR}_{jt} + \Gamma_j + \Pi_t + \epsilon _{j} \end{aligned}` $$ <img src="EDLD_650_2_DD_1_files/figure-html/unnamed-chunk-29-1.png" style="display: block; margin: auto;" /> -- **Remember**: given the structure of our model, these parameters are estimated *relative to untreated and not-yet-treated districts*. --- # Parallel trends? ```r ols_unitary_run <- feols(sd_dropout_prop_b ~ unitary*rel_yr | leaid + year, data=desegregation, vcov = ~leaid^year, weights=desegregation$sd_t_1619_b) summary(ols_unitary_run) ``` ``` #> OLS estimation, Dep. Var.: sd_dropout_prop_b #> Observations: 1,403 #> Weights: desegregation$sd_t_1619_b #> Fixed-effects: leaid: 476, year: 3 *#> Standard-errors: Clustered (leaid^year) *#> Estimate Std. Error t value Pr(>|t|) *#> unitary 0.014584 0.005860 2.48893 0.012928 * *#> rel_yr 0.001027 0.000579 1.77312 0.076426 . *#> unitary:rel_yr -0.001367 0.000689 -1.98458 0.047386 * #> --- #> Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1 #> RMSE: 1.16896 Adj. R2: 0.54965 #> Within R2: 0.042803 ``` -- .blue[How would this graph look different than the one on previous slide?] --- # A complete table! <table style="NAborder-bottom: 0; width: auto !important; margin-left: auto; margin-right: auto;" class="table"> <caption>Table 2. Effects of end of school desegregation on black dropout rate</caption> <thead> <tr> <th style="text-align:left;"> </th> <th style="text-align:center;"> 1 </th> <th style="text-align:center;"> 2 </th> <th style="text-align:center;"> 3 </th> </tr> </thead> <tbody> <tr> <td style="text-align:left;"> Unitary status </td> <td style="text-align:center;"> 0.018*** </td> <td style="text-align:center;"> 0.018*** </td> <td style="text-align:center;"> 0.015* </td> </tr> <tr> <td style="text-align:left;"> </td> <td style="text-align:center;"> (0.005) </td> <td style="text-align:center;"> (0.005) </td> <td style="text-align:center;"> (0.006) </td> </tr> <tr> <td style="text-align:left;"> Pre-trend </td> <td style="text-align:center;"> </td> <td style="text-align:center;"> </td> <td style="text-align:center;"> 0.001+ </td> </tr> <tr> <td style="text-align:left;"> </td> <td style="text-align:center;"> </td> <td style="text-align:center;"> </td> <td style="text-align:center;"> (0.001) </td> </tr> <tr> <td style="text-align:left;"> Unitary x Relative-Year </td> <td style="text-align:center;"> </td> <td style="text-align:center;"> </td> <td style="text-align:center;"> -0.001* </td> </tr> <tr> <td style="text-align:left;box-shadow: 0px 1.5px"> </td> <td style="text-align:center;box-shadow: 0px 1.5px"> </td> <td style="text-align:center;box-shadow: 0px 1.5px"> </td> <td style="text-align:center;box-shadow: 0px 1.5px"> (0.001) </td> </tr> <tr> <td style="text-align:left;"> Covariates? </td> <td style="text-align:center;"> </td> <td style="text-align:center;"> X </td> <td style="text-align:center;"> X </td> </tr> <tr> <td style="text-align:left;"> Num.Obs. 
</td> <td style="text-align:center;"> 1403 </td> <td style="text-align:center;"> 1403 </td> <td style="text-align:center;"> 1403 </td> </tr> <tr> <td style="text-align:left;"> R2 </td> <td style="text-align:center;"> 0.702 </td> <td style="text-align:center;"> 0.702 </td> <td style="text-align:center;"> 0.704 </td> </tr> </tbody> <tfoot><tr><td style="padding: 0; " colspan="100%"> <sup></sup> Notes: +p<0.1, *p<0.05, **p<0.01, ***p<0.001. Table displays coefficients and district-by-year clustered standard errors in parentheses. All models include fixed effects for year and district. Models 2 and 3 adjust for proportion of 16-19 year-olds residing in district in 1990 who were Black, interacted with year.</td></tr></tfoot> </table> --- # A flexible approach What if, instead of assigning a particular functional form to our treatment effects over time (either mean, linear or higher-order polynomial), we specified an entirely flexible model? $$ `\begin{aligned} \text{DROPOUT_BLACK} _{jt} = & \beta_1 \text{pre}^{-n}_{jt} + \beta_2 \text{pre8}_{jt} + \beta_3 \text{pre7} _{jt} +... \\ & +\beta_m \text{post0} _{jt} + ...+ \beta_n \text{post}^{n}_{jt} + \Gamma_j + \Pi_t + \epsilon _{j} \end{aligned}` $$ -- Could also write as: $$ `\begin{align} \text{DROPOUT_BLACK} _{jt} = \sum_{t=-10}^n 1(\text{t}=\text{t}_{j}^*)\beta_t+ \Gamma_j + \Pi_t + \epsilon _{j} \end{align}` $$ -- Think for a moment: what does this model do? -- >The model adjusts its estimates of the mean rate of Black dropout in district *j* by the mean rate of Black dropout in year *t* across all districts. Then, it estimates the effect of being *t* years pre- or post-unitary. The comparison for each of these `\(\beta\)`s is to districts that are never or not yet *UNITARY*. --- # Event study This .blue[**fully flexible specification**] permits us both to evaluate .blue[**violations of the PTA**] and to assess potential .blue[**dynamic effects**] of the treatment: ``` ... #> Observations: 1,403 #> Weights: desegregation$sd_t_1619_b #> Fixed-effects: year: 3, leaid: 476 #> Standard-errors: Clustered (leaid^year) #> Estimate Std. Error t value Pr(>|t|) #> cat_yr::-10+ -0.004020 0.009034 -0.444977 0.656405 #> cat_yr::-7to-9 -0.003755 0.011329 -0.331417 0.740379 #> cat_yr::-6to-4 0.009199 0.010449 0.880378 0.378806 #> cat_yr::-3to-2 0.005798 0.009318 0.622229 0.533892 #> cat_yr::Unitaryto+2 0.020860 0.010611 1.965823 0.049516 * #> cat_yr::3to5 0.022258 0.010005 2.224797 0.026254 * #> cat_yr::7to9 0.019450 0.008586 2.265370 0.023642 * #> cat_yr::10+ 0.018580 0.010454 1.777409 0.075718 . #> --- ... ``` -- .small[*What has happened to our standard errors?* (think about .blue[**bias v. variance tradeoff**])] --- # Event study visualized <img src="EDLD_650_2_DD_1_files/figure-html/unnamed-chunk-33-1.png" style="display: block; margin: auto;" /> -- >The end of desegregation efforts had a causal effect on the Black dropout rate, resulting in a discontinuous and persistent increase of between 1 and 2 percentage points (*caveats, caveats*). --- # C-ITS .large[**An aside on the related Comparative-Interrupted Time Series approach:**] <img src="EDLD_650_2_DD_1_files/figure-html/unnamed-chunk-34-1.png" style="display: block; margin: auto;" /> --- # C-ITS considered ### Strengths - Takes advantage of full range of data - Compared to mean-effect-only DD, allows differentiation of discontinuous jump vs. post-trend - Permits modeling of fully flexible functional form (can include quadratic, cubic, quartic relationships, interactions and more!)
- Data-responsive approach -- ### Weaknesses - Encourages over-fitting - Functional-form dependent - Risks generating unstable models -- **Note that a fully-saturated C-ITS model (i.e., a model that estimates a coefficient on an indicator for each time period) is identical to an event study.** --- class: middle, inverse #Wrap-up --- # Goals 1. Describe threats to validity in difference-in-differences (DD) identification strategy and multiple approaches to address these threats. 2. Using a cleaned dataset, estimate multiple DD specifications in R and interpret these results --- # To-Dos #### Reading: Liebowitz, Porter & Bragg (2022) - Critical to read the paper and answer a small set of questions as preparation for DARE - *Further*: MHE: Ch. 5, 'Metrics: Ch. 5, Mixtape: #### DARE #1 - Let's look at assignment - Submit code and memo in response to questions - Indicate partners (or not) - I am available for support! #### Research Project Proposal due 11:59pm, 1/28 - Talk to me!