class: center, middle, inverse, title-slide

.title[
# Instrumental Variables
]
.subtitle[
## EDLD 650: Week 6
]
.author[
### David D. Liebowitz
]

---
<style type="text/css">
.inverse {
  background-color : #2293bf;
}
</style>

# Agenda

#### 1. Roadmap and goals (9:00-9:10)
#### 2. Discussion Questions (9:10-10:20)
- Murnane and Willett
- Angrist et al. (x2)
- Dee & Penner
- Dee

#### 3. Break (10:20-10:30)
#### 4. Applied instrumental variables (10:30-11:40)
#### 5. Wrap-up (11:40-11:50)
- DARE #3 prep
- Plus/deltas

---
# Roadmap

<img src="causal_id.jpg" width="1707" style="display: block; margin: auto;" />

---
# Goals

### 1. Describe conceptual approach to instrumental variables (IV) analysis
### 2. Assess validity of IV assumptions in applied context
### 3. Conduct IV analysis in simplified data and interpret results

---
class: middle, inverse

# So random...

---
class: middle, inverse

# Break

---
# The PACES experiment

.pull-left[
- Recall the PACES school voucher experiment ([Angrist et al. 2002](https://www.aeaweb.org/articles?id=10.1257/000282802762024629)) from *Methods Matter*, Chapter 11
- Lottery assignment for vouchers to attend private school in Colombia
- What is the .blue[**main outcome**]?
- What is the .blue[**endogenous regressor**]?
]

.pull-right[
<img src="PACES.jpg" width="321" style="display: block; margin: auto;" />
]

--

.red-pink[**Parameter of interest**]: *effect of using financial aid to attend private school*

---
# Let's replicate!

```r
library(here)  # builds file paths relative to the project root

# Load the PACES data and preview the first seven columns
paces <- read.csv(here("./data/ch11_PACES.csv"))
DT::datatable(paces[,c(1:7)], fillContainer = FALSE, options = list(pageLength = 5))
```
---
## First post-randomization task?

--

<img src="causal_id.jpg" width="1707" style="display: block; margin: auto;" />

---
# Balance checks

### Examine by covariates:
`$$\bar{X}_{D=1} \approxeq \bar{X}_{D=0}$$`

```r
random <- arsenal::tableby(won_lottry ~ male + base_age, paces)
summary(random)
```

|                | 0 (N=579)      | 1 (N=592)      | Total (N=1171) | p value|
|:---------------|:--------------:|:--------------:|:--------------:|-------:|
|**male**        |                |                |                |   0.980|
| Mean (SD)      | 0.504 (0.500)  | 0.505 (0.500)  | 0.505 (0.500)  |        |
| Range          | 0.000 - 1.000  | 0.000 - 1.000  | 0.000 - 1.000  |        |
|**base_age**    |                |                |                |   0.422|
| Mean (SD)      | 12.036 (1.352) | 11.973 (1.343) | 12.004 (1.347) |        |
| Range          | 7.000 - 16.000 | 9.000 - 17.000 | 7.000 - 17.000 |        |

---
# Balance checks

### Omnibus `\(F\)`-test approach:

```r
summary(lm(won_lottry ~ male + base_age, data=paces))
```

```
...
#> Coefficients:
#>              Estimate Std. Error t value Pr(>|t|)
#> (Intercept)  0.609897   0.131294   4.645 3.78e-06 ***
#> male         0.002568   0.029338   0.088    0.930
#> base_age    -0.008800   0.010894  -0.808    0.419
#> ---
#> Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
#>
#> Residual standard error: 0.5005 on 1168 degrees of freedom
#> Multiple R-squared:  0.000559, Adjusted R-squared:  -0.001152
*#> F-statistic: 0.3266 on 2 and 1168 DF,  p-value: 0.7214
...
```

---
## A naïve estimate of financial aid

```r
# Naive OLS: eighth-grade completion on observed use of financial aid
ols1 <- lm(finish8th ~ use_fin_aid, data=paces)
# Same model, adjusting for baseline age and sex
ols2 <- lm(finish8th ~ use_fin_aid + base_age + male, data=paces)
```

.small[
<table style="text-align:center"><tr><td colspan="3" style="border-bottom: 1px solid black"></td></tr><tr><td style="text-align:left"></td><td>(1)</td><td>(2)</td></tr>
<tr><td colspan="3" style="border-bottom: 1px solid black"></td></tr><tr><td style="text-align:left">use_fin_aid</td><td>0.133<sup>***</sup></td><td>0.121<sup>***</sup></td></tr>
<tr><td style="text-align:left"></td><td>(0.027)</td><td>(0.027)</td></tr>
<tr><td style="text-align:left"></td><td></td><td></td></tr>
<tr><td style="text-align:left">base_age</td><td></td><td>-0.063<sup>***</sup></td></tr>
<tr><td style="text-align:left"></td><td></td><td>(0.010)</td></tr>
<tr><td style="text-align:left"></td><td></td><td></td></tr>
<tr><td style="text-align:left">male</td><td></td><td>-0.086<sup>**</sup></td></tr>
<tr><td style="text-align:left"></td><td></td><td>(0.026)</td></tr>
<tr><td style="text-align:left"></td><td></td><td></td></tr>
<tr><td colspan="3" style="border-bottom: 1px solid black"></td></tr><tr><td style="text-align:left">Observations</td><td>1,171</td><td>1,171</td></tr>
<tr><td style="text-align:left">R<sup>2</sup></td><td>0.020</td><td>0.064</td></tr>
<tr><td colspan="3" style="border-bottom: 1px solid black"></td></tr><tr><td style="text-align:left"><em>Note:</em></td><td colspan="2" style="text-align:left"><sup>*</sup>p<0.05; <sup>**</sup>p<0.01; <sup>***</sup>p<0.001</td></tr>
</table>
]

---
## What's wrong with the naïve approach?

--

> Only about 90 percent of lottery winners used the private school voucher to pay for private school, and 24 percent of lottery losers found other scholarship sources to pay for private school.
> There are endogenous differences in the expected outcomes of children from families who chose to use the voucher and of children from families who secured scholarship funding from sources outside the voucher lottery. The policy-relevant question is how a public subsidy of private school might affect educational attainment for children from low-income families in Bogota, Colombia. The naïve approach does not identify this effect; it identifies the combination of the voucher subsidy and endogenous unobservables across families and individuals.

---
# Some differences

```r
not_random <- arsenal::tableby(use_fin_aid ~ male + base_age, paces)
summary(not_random)
```

|                | 0 (N=490)      | 1 (N=681)      | Total (N=1171) | p value|
|:---------------|:--------------:|:--------------:|:--------------:|-------:|
|**male**        |                |                |                |   0.428|
| Mean (SD)      | 0.518 (0.500)  | 0.495 (0.500)  | 0.505 (0.500)  |        |
| Range          | 0.000 - 1.000  | 0.000 - 1.000  | 0.000 - 1.000  |        |
|**base_age**    |                |                |                |   0.043|
| Mean (SD)      | 12.098 (1.389) | 11.937 (1.313) | 12.004 (1.347) |        |
| Range          | 7.000 - 17.000 | 9.000 - 16.000 | 7.000 - 17.000 |        |

---
# How could IV address this?

<img src="iv2.jpg" width="1707" style="display: block; margin: auto;" />

--

.pull-left[
**IV estimate**: ratio of area of *overlap of `\(Y\)` and `\(Z\)`* to area of *overlap of `\(D\)` and `\(Z\)`*. Depends entirely on variation in `\(Z\)` that predicts variation in `\(Y\)` and `\(D\)`:
]

.pull-right[
`$$\hat{\beta}_{1}^{IVE} = \frac{S_{YZ}}{S_{DZ}}$$`
a .blue[**Local Average Treatment Effect**]
]

---
# Recall 2SLS set-up

### 1<sup>st</sup> stage:
Regress the endogenous treatment `\((D_{i})\)` on instrumental variable `\((Z_{i})\)`:
`$$D_{i} = \alpha_{0} + \alpha_{1}Z_{i} + \nu_{i}$$`

Obtain the *predicted values* of the treatment `\((\hat{D_{i}})\)` from this fit.
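
As a hedged illustration of the ratio form of the IV estimate and of the first-stage step, the sketch below uses simulated data rather than the PACES file. Every name and value in it (`n`, `u`, `z`, `d`, `y`, the assumed effect of 0.15) is made up for illustration. It computes the ratio of the outcome-instrument covariance to the treatment-instrument covariance, then fits the first stage and shows that a regression of the outcome on the predicted treatment returns the same slope.

```r
# Simulated illustration only; none of these values come from PACES.
set.seed(650)
n <- 5000
u <- rnorm(n)                                        # unobserved confounder
z <- rbinom(n, 1, 0.5)                               # randomized instrument (e.g., a lottery)
d <- rbinom(n, 1, plogis(-1 + 3 * z + u))            # endogenous take-up depends on z and u
y <- 0.6 + 0.15 * d + 0.2 * u + rnorm(n, sd = 0.3)   # outcome; assumed effect of d is 0.15

# Ratio form: cov(outcome, instrument) / cov(treatment, instrument)
iv_ratio <- cov(y, z) / cov(d, z)

# 1st stage: regress treatment on instrument and keep the predicted values
first_stage <- lm(d ~ z)
d_hat <- fitted(first_stage)

# Regressing the outcome on the predicted treatment reproduces the ratio
iv_2sls <- unname(coef(lm(y ~ d_hat))["d_hat"])

# Naive OLS for comparison; it also picks up the confounder u
ols_naive <- unname(coef(lm(y ~ d))["d"])

c(iv_ratio = iv_ratio, iv_2sls = iv_2sls, ols_naive = ols_naive)
```

The two IV numbers agree because, with a single instrument and no covariates, 2SLS reduces to the covariance ratio. The standard errors from the hand-run second regression are not valid, which is one reason to use a dedicated IV routine in practice.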
--

### 2<sup>nd</sup> stage:
Regress the outcome `\((Y_{i})\)` on the predicted values of the treatment `\((\hat{D_{i}})\)`:
`$$Y_{i} = \beta_{0} + \beta_{1}\hat{D_{i}} + \varepsilon_{i}$$`

--

.blue[Think about this in the Colombia PACES experiment context. What is the **main outcome**? What is the **endogenous regressor**? What is the **instrument**? Can you write the two-stage equation without consulting the next slide or book?]

---
# The PACES Scholarship

### 1<sup>st</sup> stage:
$$ `\begin{align} USEFINAID_{i}=\alpha_{0} + \alpha_{1} WONLOTTERY_{i} + \nu_{i} \end{align}` $$

--

### 2<sup>nd</sup> stage:
$$ `\begin{align} FINISH8TH_{i}=\beta_{0} + \beta_1 \hat{USEFINAID}_{i} + \varepsilon_{i} \end{align}` $$

--

.blue[What is the main outcome? What is the endogenous regressor? What is the instrument? What are the assumptions?]

---
# Outcome by lottery status

<img src="EDLD_650_6_IV_2_files/figure-html/unnamed-chunk-12-1.png" style="display: block; margin: auto;" />

--

This represents an important substantive finding... .blue[**can you interpret what it is?**]

---
# A simple `\(t\)`-test

```r
ttest <- t.test(finish8th ~ won_lottry, data=paces)
ttest
```

```
#>
#>  Welch Two Sample t-test
#>
#> data:  finish8th by won_lottry
*#> t = -4.1077, df = 1153.5, p-value = 4.279e-05
#> alternative hypothesis: true difference in means between group 0 and group 1 is not equal to 0
#> 95 percent confidence interval:
#>  -0.16441869 -0.05812251
#> sample estimates:
*#> mean in group 0 mean in group 1
*#>       0.6252159       0.7364865
```

--

.blue[**Can you interpret what this means?**]

---
# Intent-to-Treat Estimates

```r
itt1 <- lm(finish8th ~ won_lottry, data=paces)
itt2 <- lm(finish8th ~ won_lottry + base_age + male, data=paces)
itt3 <- lm(finish8th ~ won_lottry + base_age + male +
             as.factor(school), data=paces)
```

.small[
<table style="NAborder-bottom: 0; width: auto !important; margin-left: auto; margin-right: auto;" class="table">
<caption>Table 1.
Intent-to-Treat Estimates of Winning the PACES lottery on 8th Grade Completion</caption> <thead> <tr> <th style="text-align:left;"> </th> <th style="text-align:center;"> (1) </th> <th style="text-align:center;"> (2) </th> <th style="text-align:center;"> (3) </th> </tr> </thead> <tbody> <tr> <td style="text-align:left;"> Won Lottery </td> <td style="text-align:center;"> 0.111*** </td> <td style="text-align:center;"> 0.107*** </td> <td style="text-align:center;"> 0.108*** </td> </tr> <tr> <td style="text-align:left;"> </td> <td style="text-align:center;"> (0.027) </td> <td style="text-align:center;"> (0.026) </td> <td style="text-align:center;"> (0.027) </td> </tr> <tr> <td style="text-align:left;"> Starting Age </td> <td style="text-align:center;"> </td> <td style="text-align:center;"> -0.065*** </td> <td style="text-align:center;"> -0.064*** </td> </tr> <tr> <td style="text-align:left;"> </td> <td style="text-align:center;"> </td> <td style="text-align:center;"> (0.010) </td> <td style="text-align:center;"> (0.010) </td> </tr> <tr> <td style="text-align:left;"> Male </td> <td style="text-align:center;"> </td> <td style="text-align:center;"> -0.088*** </td> <td style="text-align:center;"> -0.089*** </td> </tr> <tr> <td style="text-align:left;box-shadow: 0px 1.5px"> </td> <td style="text-align:center;box-shadow: 0px 1.5px"> </td> <td style="text-align:center;box-shadow: 0px 1.5px"> (0.027) </td> <td style="text-align:center;box-shadow: 0px 1.5px"> (0.027) </td> </tr> <tr> <td style="text-align:left;"> School Fixed Effects </td> <td style="text-align:center;"> No </td> <td style="text-align:center;"> No </td> <td style="text-align:center;"> Yes </td> </tr> <tr> <td style="text-align:left;"> Num.Obs. 
</td> <td style="text-align:center;"> 1171 </td> <td style="text-align:center;"> 1171 </td> <td style="text-align:center;"> 1171 </td> </tr>
<tr> <td style="text-align:left;"> RMSE </td> <td style="text-align:center;"> 0.46 </td> <td style="text-align:center;"> 0.45 </td> <td style="text-align:center;"> 0.45 </td> </tr>
</tbody>
<tfoot><tr><td style="padding: 0; " colspan="100%"> <sup></sup> Notes: The table displays coefficients from Equation X and standard errors in parentheses.</td></tr></tfoot>
</table>
]

---
# Intent-to-Treat Estimates

### .blue[What is our parameter of interest? Do these estimates represent that?]

.small[
<table style="NAborder-bottom: 0; width: auto !important; margin-left: auto; margin-right: auto;" class="table">
<caption>Table 1. Intent-to-Treat Estimates of Winning the PACES lottery on 8th Grade Completion</caption>
<thead>
<tr> <th style="text-align:left;"> </th> <th style="text-align:center;"> (1) </th> <th style="text-align:center;"> (2) </th> <th style="text-align:center;"> (3) </th> </tr>
</thead>
<tbody>
<tr> <td style="text-align:left;"> Won Lottery </td> <td style="text-align:center;"> 0.111*** </td> <td style="text-align:center;"> 0.107*** </td> <td style="text-align:center;"> 0.108*** </td> </tr>
<tr> <td style="text-align:left;"> </td> <td style="text-align:center;"> (0.027) </td> <td style="text-align:center;"> (0.026) </td> <td style="text-align:center;"> (0.027) </td> </tr>
<tr> <td style="text-align:left;"> Starting Age </td> <td style="text-align:center;"> </td> <td style="text-align:center;"> -0.065*** </td> <td style="text-align:center;"> -0.064*** </td> </tr>
<tr> <td style="text-align:left;"> </td> <td style="text-align:center;"> </td> <td style="text-align:center;"> (0.010) </td> <td style="text-align:center;"> (0.010) </td> </tr>
<tr> <td style="text-align:left;"> Male </td> <td style="text-align:center;"> </td> <td style="text-align:center;"> -0.088*** </td> <td style="text-align:center;"> -0.089*** </td> </tr>
<tr> <td style="text-align:left;box-shadow: 0px 1.5px"> </td> <td style="text-align:center;box-shadow: 0px 1.5px"> </td> <td style="text-align:center;box-shadow: 0px 1.5px"> (0.027) </td> <td style="text-align:center;box-shadow: 0px 1.5px"> (0.027) </td> </tr>
<tr> <td style="text-align:left;"> School Fixed Effects </td> <td style="text-align:center;"> No </td> <td style="text-align:center;"> No </td> <td style="text-align:center;"> Yes </td> </tr>
<tr> <td style="text-align:left;"> Num.Obs. </td> <td style="text-align:center;"> 1171 </td> <td style="text-align:center;"> 1171 </td> <td style="text-align:center;"> 1171 </td> </tr>
<tr> <td style="text-align:left;"> RMSE </td> <td style="text-align:center;"> 0.46 </td> <td style="text-align:center;"> 0.45 </td> <td style="text-align:center;"> 0.45 </td> </tr>
</tbody>
<tfoot><tr><td style="padding: 0; " colspan="100%"> <sup></sup> Notes: The table displays coefficients from Equation X and standard errors in parentheses.</td></tr></tfoot>
</table>
]

---
## Implementing IV in regression

### Reminder of key assumptions:
1. Instrument correlated with endogenous predictor (.red-pink[no **"weak" instruments**])
2. Instrument not correlated with 1<sup>st</sup> stage residuals `\((\sigma_{Z\nu} = 0)\)`
3. Instrument not correlated with 2<sup>nd</sup> stage residuals `\((\sigma_{Z\varepsilon} = 0)\)` and correlated with outcome only via predictor<sup>[1]</sup>
    + .red-pink[Exclusion restriction means **NO THIRD PATH!**]

--

### Practical considerations:
This can be implemented in various ways. For teaching purposes, we'll implement 2SLS using the `fixest` package because it allows straightforward presentation of 1<sup>st</sup> stage results. It can also be done in R via `ivreg` (from the `ivreg` or `AER` packages) and `iv_robust` (from the `estimatr` package).

.footnote[[1] Don't forget, .red-pink[**no defiers**] too.]
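
The first assumption (instrument relevance) can be examined directly in the data. A minimal sketch, using simulated data with hypothetical names (`z`, `x`, `d`) rather than the PACES variables: fit the first stage with and without the instrument and compute the F-statistic for the excluded instrument. With a single instrument, this F equals the squared first-stage t-statistic on the instrument.

```r
# Simulated illustration only; variable names and values are hypothetical.
set.seed(6)
n <- 2000
z <- rbinom(n, 1, 0.5)                               # instrument
x <- rnorm(n)                                        # exogenous covariate
d <- 0.2 + 0.6 * z + 0.1 * x + rnorm(n, sd = 0.4)    # treatment intensity

# First stage with and without the excluded instrument
fs_full <- lm(d ~ z + x)
fs_restricted <- lm(d ~ x)

# F-statistic for the excluded instrument (nested-model comparison)
f_excluded <- anova(fs_restricted, fs_full)$F[2]

# With one instrument, this equals the squared t-statistic on z
t_z <- summary(fs_full)$coefficients["z", "t value"]

c(F_excluded = f_excluded, t_squared = t_z^2)
```

The remaining assumptions concern unobserved quantities, so they have to be argued from the design (here, the lottery) rather than computed this way.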
---
# IV Estimation

```r
library(fixest)

# Instrument with no covariates
# With only instrumented predictor and no covariates,
# need to include a "1" in 2nd stage
tot1 <- feols(finish8th ~ 1 | use_fin_aid ~ won_lottry, data=paces)

# Instrument with covariates
# Note that these are automatically included in 1st stage
# Can include multiple instruments and multiple
# endogenous predictors
tot2 <- feols(finish8th ~ base_age + male | use_fin_aid ~ won_lottry, data=paces)
```

---
# IV results - First Stage

```r
summary(tot2, stage = 1)
```

```
#> TSLS estimation, Dep. Var.: use_fin_aid, Endo.: use_fin_aid, Instr.: won_lottry
#> First stage: Dep. Var.: use_fin_aid
#> Observations: 1,171
#> Standard-errors: IID
#>              Estimate Std. Error   t value   Pr(>|t|)
#> (Intercept)  0.432760   0.095159  4.547738 5.9880e-06 ***
#> won_lottry   0.674527   0.021014 32.098773  < 2.2e-16 ***
#> base_age    -0.015160   0.007826 -1.937178 5.2965e-02 .
#> male        -0.020257   0.021070 -0.961417 3.3654e-01
#> ---
#> Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
#> RMSE: 0.358813   Adj. R2: 0.469577
*#> F-test (1st stage): stat = 1,030.3, p < 2.2e-16, on 1 and 1,167 DoF.
```

--

.small[
You will see some common rules of thumb about what makes for a strong instrument (e.g., a first-stage `\(F>10\)`), but recent work has found that when the first-stage `\(F\)`-statistic is lower than about 100, one should adjust the critical values used for inference on the IV estimate ([Lee et al., 2021](https://www.nber.org/papers/w29124)).
]

---
# IV results - Second Stage

```r
summary(tot2)
```

```
#> TSLS estimation, Dep. Var.: finish8th, Endo.: use_fin_aid, Instr.: won_lottry
#> Second stage: Dep. Var.: finish8th
#> Observations: 1,171
#> Standard-errors: IID
#>                  Estimate Std. Error  t value   Pr(>|t|)
#> (Intercept)      1.378128   0.123090 11.19614  < 2.2e-16 ***
#> fit_use_fin_aid  0.159000   0.039173  4.05890 5.2589e-05 ***
#> base_age        -0.062157   0.009872 -6.29603 4.3146e-10 ***
#> male            -0.085145   0.026504 -3.21258 1.3515e-03 **
#> ---
#> Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
#> RMSE: 0.451177   Adj. R2: 0.059822
#> F-test (1st stage), use_fin_aid: stat = 1,030.3    , p < 2.2e-16 , on 1 and 1,167 DoF.
#>                      Wu-Hausman: stat =     1.78464, p = 0.181841, on 1 and 1,166 DoF.
```

.blue[**Can you interpret what this means?**]

---
# A taxonomy of IV estimates

```r
# Include school fixed effects
tot3 <- feols(finish8th ~ base_age + male | as.factor(school) |
                use_fin_aid ~ won_lottry, vcov = "iid", data=paces)

# Cluster-robust standard errors
tot4 <- feols(finish8th ~ base_age + male | as.factor(school) |
                use_fin_aid ~ won_lottry, vcov = ~ school, data=paces)
```

---
# Estimate voucher use effects

.small[
<table style="NAborder-bottom: 0; width: auto !important; margin-left: auto; margin-right: auto;" class="table">
<caption>Table 2. Instrumental variable estimates of using financial aid to attend private school due to winning the PACES lottery on 8th grade completion</caption>
<thead>
<tr> <th style="text-align:left;"> </th> <th style="text-align:center;"> (1) </th> <th style="text-align:center;"> (2) </th> <th style="text-align:center;"> (3) </th> <th style="text-align:center;"> (4) </th> </tr>
</thead>
<tbody>
<tr> <td style="text-align:left;"> Use Fin.
Aid </td> <td style="text-align:center;"> 0.165*** </td> <td style="text-align:center;"> 0.159*** </td> <td style="text-align:center;"> 0.161*** </td> <td style="text-align:center;"> 0.161* </td> </tr> <tr> <td style="text-align:left;"> </td> <td style="text-align:center;"> (0.040) </td> <td style="text-align:center;"> (0.039) </td> <td style="text-align:center;"> (0.039) </td> <td style="text-align:center;"> (0.052) </td> </tr> <tr> <td style="text-align:left;"> Starting Age </td> <td style="text-align:center;"> </td> <td style="text-align:center;"> -0.062*** </td> <td style="text-align:center;"> -0.062*** </td> <td style="text-align:center;"> -0.062** </td> </tr> <tr> <td style="text-align:left;"> </td> <td style="text-align:center;"> </td> <td style="text-align:center;"> (0.010) </td> <td style="text-align:center;"> (0.010) </td> <td style="text-align:center;"> (0.009) </td> </tr> <tr> <td style="text-align:left;"> Male </td> <td style="text-align:center;"> </td> <td style="text-align:center;"> -0.085** </td> <td style="text-align:center;"> -0.086** </td> <td style="text-align:center;"> -0.086 </td> </tr> <tr> <td style="text-align:left;box-shadow: 0px 1.5px"> </td> <td style="text-align:center;box-shadow: 0px 1.5px"> </td> <td style="text-align:center;box-shadow: 0px 1.5px"> (0.027) </td> <td style="text-align:center;box-shadow: 0px 1.5px"> (0.027) </td> <td style="text-align:center;box-shadow: 0px 1.5px"> (0.037) </td> </tr> <tr> <td style="text-align:left;"> School FE </td> <td style="text-align:center;"> No </td> <td style="text-align:center;"> No </td> <td style="text-align:center;"> Yes </td> <td style="text-align:center;"> Yes </td> </tr> <tr> <td style="text-align:left;"> Num.Obs. 
</td> <td style="text-align:center;"> 1171 </td> <td style="text-align:center;"> 1171 </td> <td style="text-align:center;"> 1171 </td> <td style="text-align:center;"> 1171 </td> </tr>
<tr> <td style="text-align:left;"> RMSE </td> <td style="text-align:center;"> 0.46 </td> <td style="text-align:center;"> 0.45 </td> <td style="text-align:center;"> 0.45 </td> <td style="text-align:center;"> 0.45 </td> </tr>
</tbody>
<tfoot>
<tr><td style="padding: 0; " colspan="100%"> <sup></sup> * p < 0.05, ** p < 0.01, *** p < 0.001</td></tr>
<tr><td style="padding: 0; " colspan="100%"> <sup></sup> The table displays coefficients from Equation X and standard errors in parentheses. Model 4 uses cluster-robust standard errors at the school level.</td></tr>
</tfoot>
</table>
]

---
# OLS, ITT and TOT estimates

.small[
<table style="NAborder-bottom: 0; width: auto !important; margin-left: auto; margin-right: auto;" class="table">
<caption>Table 3. Comparison of OLS, ITT and IV estimates of using financial aid to attend private school due to winning the PACES lottery</caption>
<thead>
<tr> <th style="text-align:left;"> </th> <th style="text-align:center;"> (1) </th> <th style="text-align:center;"> (2) </th> <th style="text-align:center;"> (3) </th> <th style="text-align:center;"> (4) </th> <th style="text-align:center;"> (5) </th> </tr>
</thead>
<tbody>
<tr> <td style="text-align:left;"> </td> <td style="text-align:center;"> OLS </td> <td style="text-align:center;"> ITT </td> <td style="text-align:center;"> TOT </td> <td style="text-align:center;"> TOT </td> <td style="text-align:center;"> TOT </td> </tr>
<tr> <td style="text-align:left;"> Use Fin.
Aid </td> <td style="text-align:center;"> 0.121*** </td> <td style="text-align:center;"> </td> <td style="text-align:center;"> 0.165*** </td> <td style="text-align:center;"> 0.159*** </td> <td style="text-align:center;"> 0.161* </td> </tr> <tr> <td style="text-align:left;"> </td> <td style="text-align:center;"> (0.027) </td> <td style="text-align:center;"> </td> <td style="text-align:center;"> (0.040) </td> <td style="text-align:center;"> (0.039) </td> <td style="text-align:center;"> (0.052) </td> </tr> <tr> <td style="text-align:left;"> Win Lottery </td> <td style="text-align:center;"> </td> <td style="text-align:center;"> 0.107*** </td> <td style="text-align:center;"> </td> <td style="text-align:center;"> </td> <td style="text-align:center;"> </td> </tr> <tr> <td style="text-align:left;box-shadow: 0px 1.5px"> </td> <td style="text-align:center;box-shadow: 0px 1.5px"> </td> <td style="text-align:center;box-shadow: 0px 1.5px"> (0.026) </td> <td style="text-align:center;box-shadow: 0px 1.5px"> </td> <td style="text-align:center;box-shadow: 0px 1.5px"> </td> <td style="text-align:center;box-shadow: 0px 1.5px"> </td> </tr> <tr> <td style="text-align:left;"> School FE </td> <td style="text-align:center;"> No </td> <td style="text-align:center;"> No </td> <td style="text-align:center;"> No </td> <td style="text-align:center;"> No </td> <td style="text-align:center;"> Yes </td> </tr> <tr> <td style="text-align:left;"> Student Chars. </td> <td style="text-align:center;"> Yes </td> <td style="text-align:center;"> Yes </td> <td style="text-align:center;"> No </td> <td style="text-align:center;"> Yes </td> <td style="text-align:center;"> Yes </td> </tr> <tr> <td style="text-align:left;"> Clust. SEs </td> <td style="text-align:center;"> No </td> <td style="text-align:center;"> No </td> <td style="text-align:center;"> No </td> <td style="text-align:center;"> No </td> <td style="text-align:center;"> Yes </td> </tr> <tr> <td style="text-align:left;"> Num.Obs. 
</td> <td style="text-align:center;"> 1171 </td> <td style="text-align:center;"> 1171 </td> <td style="text-align:center;"> 1171 </td> <td style="text-align:center;"> 1171 </td> <td style="text-align:center;"> 1171 </td> </tr>
<tr> <td style="text-align:left;"> RMSE </td> <td style="text-align:center;"> 0.45 </td> <td style="text-align:center;"> 0.45 </td> <td style="text-align:center;"> 0.46 </td> <td style="text-align:center;"> 0.45 </td> <td style="text-align:center;"> 0.45 </td> </tr>
</tbody>
<tfoot>
<tr><td style="padding: 0; " colspan="100%"> <sup></sup> * p < 0.05, ** p < 0.01, *** p < 0.001</td></tr>
<tr><td style="padding: 0; " colspan="100%"> <sup></sup> The table displays coefficients from Equation X and standard errors in parentheses.</td></tr>
</tfoot>
</table>
]

---
# Interpretation of results

.small[The naïve OLS estimates .large[.red-pink[understate]] the effects of a public voucher subsidy for private school attendance for over 125,000 children from low-income families in Bogota, Colombia. Our preferred estimates of the effect of voucher use on eighth-grade completion imply an increase in the on-time completion rate of 16 percentage points.]

--

.small[The estimates of the endogenous relationship between the use of financial aid to attend private school and school attainment (Model 1) imply that students who use any form of external scholarship are 12 percentage points more likely to complete eighth grade. In Model 2, we present the effect of winning an unbiased lottery to receive vouchers covering slightly more than half the average cost of private school attendance. We find that the offer of the voucher increased eighth-grade completion rates by just under 11 percentage points. Finally, Models 3-5 present a taxonomy of Treatment-on-the-Treated estimates in which we use the randomized lottery as an instrument for the use of financial aid to attend private school. We find consistent effects of about 16 percentage points, roughly 50 percent larger than the Intent-to-Treat estimates.
These estimates are robust to the inclusion of baseline student characteristics, school fixed effects, and the clustering of standard errors at the level of randomization (within school).]

---
class: middle, inverse

# Synthesis and wrap-up

---
# Goals

### 1. Describe conceptual approach to instrumental variables (IV) analysis
### 2. Assess validity of IV assumptions in applied context
### 3. Conduct IV analysis in simplified data and interpret results

---
# Can you explain this figure?

<img src="causal_id.jpg" width="1707" style="display: block; margin: auto;" />

---
# To-Dos

### Week 7: Instrumental Variables

### Readings:
- Kim, Capotosto, Hartry & Fitzgerald (2011)

### Assignments Due
**DARE 3**
- Due 11:59pm, Feb. 18

---
# Feedback

## Plus/Deltas

Front side of index card

## Clear/Murky

On back