The purpose of this assignment is to practice the concepts and vocabulary we have been modeling in class and to implement some of the techniques we have learned. You may work on your own or collaborate with one partner. If you choose to collaborate, please make sure that you engage in a full, fair and mutually agreeable collaboration: you should plan, execute and write up your analyses together, not simply divide the work. Please indicate clearly in your responses when your work is joint and name any other individual or resource (outside of class material) you consulted.
Please upload the following two files to Canvas:
Please submit a complete memo that integrates all figures and tables into the text, uses full sentences and does not include any interspersed code chunks (you will be graded on this). As a challenge, try to format your tables and figures following APA 7 guidelines (you will not be graded on this).
Fit, visualize and interpret bivariate and multiple regression models including categorical predictors
Use ANOVA as a mechanism to conduct an omnibus test of between-group variation
Use categorical predictors as covariates and use them to plot prototypical values of continuous relationships
As a country, the United States spends approximately $650 billion per year on its K-12 education system. While district-level expenditure data has been available for decades, considerable funding variation exists within districts, as a result of the different experience levels (and therefore salaries) of teachers employed in a school, the specialized services particular schools offer, and targeted efforts to make schools more equitable through differentiated funding. Starting in 2015, the Every Student Succeeds Act began requiring that states report school-level expenditures. Even so, differences in how states report finance data, costs of living and more make cross-state comparisons challenging. The Edunomics Lab at Georgetown University recently launched the National Education Resource Database on Schools (NERD$), which allows for the first time a fully comparable, cross-state window into school-level spending. It contains rich variables including measures of academic achievement and achievement gaps for school districts and counties, as well as district-level measures of racial and socioeconomic composition, racial and socioeconomic segregation patterns, and other features of the schooling system. More details on the NERD$ data and ways it has been used to inform policy and practice are available here.
Analytic Sample. Our data set includes school-level expenditure data for SY 2018-19 for 1,193 Oregon public schools. This represents nearly the full universe of Oregon public schools, with some restrictions due to data availability. Observations with missing and/or unreliable values on any of the key variables were dropped to simplify the analysis.
Key variables. The data set contains 14 variables, detailed below.
Data preparation: Open RStudio, create a project and save it. In the root directory of the project, create folders named “Code”, “Data”, “Figures” and “Tables.” Download the nerds.csv dataset and store it in the “Data” folder. Create an R script (or .Rmd file) in the “Code” folder and read the data into your R environment. You do not need to include this part of the response in your memo; only in your code.
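The setup steps above could be scripted roughly as follows. This is a minimal sketch in base R, assuming the working directory is the project root (which is the case when the RStudio project is open). The object name nerds is only an illustrative choice, and the read step is guarded so the script still runs before the file has been downloaded.

```r
# Create the four project folders named in the instructions.
# Assumes the working directory is the project root.
folders <- c("Code", "Data", "Figures", "Tables")
for (f in folders) {
  dir.create(f, showWarnings = FALSE)  # no warning if the folder already exists
}

# Read the data once nerds.csv has been saved in Data/.
data_path <- file.path("Data", "nerds.csv")
if (file.exists(data_path)) {
  # "nerds" is an illustrative object name, not one the assignment requires
  nerds <- read.csv(data_path, stringsAsFactors = TRUE)
  str(nerds)  # check variable names and types
} else {
  message("nerds.csv not found in Data/; download it first.")
}
```

Building the path with file.path() keeps the script portable across operating systems.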
1.1 Summarize the two categorical variables in your data set (\(level\) and \(locale\)). (1 point)
1.2 Create one figure describing how per-pupil expenditures differ across levels of schooling and another figure showing how per-pupil expenditures differ by school locale. Interpret these figures. (2 points)
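One possible way to approach tasks 1.1 and 1.2 in base R is sketched here on simulated data. The variable names level, locale and ppe come from the assignment. The category labels ("Elementary", "City" and so on) and all values are made up for illustration and may not match the coding in nerds.csv.

```r
# Toy data standing in for nerds.csv; labels and values are hypothetical.
set.seed(42)
toy <- data.frame(
  level  = factor(sample(c("Elementary", "Middle", "High"), 200, replace = TRUE)),
  locale = factor(sample(c("City", "Suburb", "Town", "Rural"), 200, replace = TRUE)),
  ppe    = round(rnorm(200, mean = 12000, sd = 2000))
)

# 1.1: counts and proportions for each categorical variable
level_counts  <- table(toy$level)
locale_counts <- table(toy$locale)
print(level_counts)
print(round(prop.table(level_counts), 2))
print(locale_counts)
print(round(prop.table(locale_counts), 2))

# 1.2: one figure per grouping variable
boxplot(ppe ~ level, data = toy,
        xlab = "School level", ylab = "Per-pupil expenditure ($)")
boxplot(ppe ~ locale, data = toy,
        xlab = "Locale", ylab = "Per-pupil expenditure ($)")
```

Boxplots are only one option; any figure that makes the between-group comparison visible would serve the same purpose.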
2.1 State your null hypothesis regarding the above research question. (1 point)
2.2 Test your null hypothesis using the statistical test that most directly and efficiently answers the above research question. Interpret the results of this test. (2 points)
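If the test chosen here is a one-way ANOVA (the omnibus test named in the learning objectives), it could be run as in the sketch below. The data are simulated and the level labels are hypothetical; only the names ppe and level are taken from the assignment.

```r
# Simulated data: three hypothetical school levels with different mean spending.
set.seed(1)
toy <- data.frame(
  level = factor(rep(c("Elementary", "Middle", "High"), each = 50)),
  ppe   = c(rnorm(50, 12000, 1500),
            rnorm(50, 12500, 1500),
            rnorm(50, 13500, 1500))
)

# One-way ANOVA: does mean ppe differ across levels?
fit_aov <- aov(ppe ~ level, data = toy)
anova_table <- summary(fit_aov)[[1]]
print(anova_table)

# Pull out the pieces usually reported in text.
f_stat  <- anova_table[["F value"]][1]
p_value <- anova_table[["Pr(>F)"]][1]
df_between <- anova_table[["Df"]][1]
df_within  <- anova_table[["Df"]][2]
```

The omnibus F-test only indicates whether at least one group mean differs from the others. It does not say which groups differ or by how much.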
2.3 State the magnitude by which schools in the different grade bands differ from one another in their per-pupil spending, and assess whether each difference is statistically different from zero. Select (and justify) the most sensible group to serve as the comparison type of school. (2 points)
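A regression with a categorical predictor is one way to obtain these group differences. The sketch below uses simulated data with hypothetical level labels, and the reference category shown ("Elementary") is only an example, not a recommendation for the real data.

```r
# Simulated data with hypothetical level labels.
set.seed(2)
toy <- data.frame(
  level = factor(rep(c("Elementary", "Middle", "High"), each = 60)),
  ppe   = c(rnorm(60, 12000, 1500),
            rnorm(60, 12400, 1500),
            rnorm(60, 13500, 1500))
)

# Set the reference (comparison) category; "Elementary" is an example choice.
toy$level <- relevel(toy$level, ref = "Elementary")

# Each slope is the mean difference between that level and the reference level.
fit_ref <- lm(ppe ~ level, data = toy)
coef_table <- summary(fit_ref)$coefficients
print(round(coef_table, 3))
```

The intercept is the mean of the reference group, and each remaining coefficient is that group's mean difference from it, with its own t-test. The difference between the two non-reference groups is not in this table; refitting with a different reference category recovers it.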
2.4 High schools are generally understood to have greater resource needs than other K-12 schools due to funding items such as science laboratories, extra-curriculars and specialized classes. Test whether high schools spend more per-pupil than all other schools and interpret your results. (1 point)
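One way to set up this comparison is to collapse the level variable into a single indicator for high schools. The sketch below does so on simulated data. The indicator name high and the level labels are hypothetical, and whether a one-sided or two-sided test is the better choice is a judgment call.

```r
# Simulated data with hypothetical level labels.
set.seed(3)
toy <- data.frame(
  level = factor(rep(c("Elementary", "Middle", "High"), each = 60)),
  ppe   = c(rnorm(60, 12000, 1500),
            rnorm(60, 12400, 1500),
            rnorm(60, 13500, 1500))
)

# Indicator: 1 for high schools, 0 for all other schools ("high" is a made-up name).
toy$high <- as.integer(toy$level == "High")

# The slope on "high" is the mean difference: high schools minus all others.
fit_high <- lm(ppe ~ high, data = toy)
est <- summary(fit_high)$coefficients["high", ]
print(round(est, 3))  # the p-value shown here is two-sided

# One-sided p-value for the directional alternative "high schools spend more".
p_one_sided <- unname(pt(est["t value"], df = fit_high$df.residual,
                         lower.tail = FALSE))
print(p_one_sided)
```

This regression is equivalent to a two-sample t-test that assumes equal variances in the two groups.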
3.1 Review the variables you have at your disposal and select a set of substantively sensible (and non-multicollinear) continuous and categorical covariates that might explain schools’ per-pupil expenditure and help clarify the relationship between \(ppe\) and \(frpl\). (1 point)
3.2 Present a table and plot characterizing the regression-adjusted relationship between \(frpl\) and \(ppe\). Substantively interpret these results as you would for an academic journal article. (3 points)
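A regression-adjusted relationship is often displayed by plotting fitted lines at prototypical values of a covariate. The sketch below illustrates the mechanics on simulated data with a single categorical covariate. It assumes frpl is a percentage between 0 and 100, which may not match the scale in nerds.csv, and the level labels and all coefficients are made up.

```r
# Simulated data; the frpl scale (0-100) and the level labels are assumptions.
set.seed(4)
n <- 300
toy <- data.frame(
  frpl  = runif(n, 0, 100),
  level = factor(sample(c("Elementary", "Middle", "High"), n, replace = TRUE))
)
toy$ppe <- 11000 + 20 * toy$frpl + 1200 * (toy$level == "High") +
  rnorm(n, 0, 1500)

# Adjusted model: slope on frpl, holding school level constant.
fit_adj <- lm(ppe ~ frpl + level, data = toy)
print(round(summary(fit_adj)$coefficients, 3))

# Fitted values over a grid of frpl for each prototypical level.
grid <- expand.grid(frpl = seq(0, 100, by = 5), level = levels(toy$level))
grid$fit <- predict(fit_adj, newdata = grid)

# Scatter of the raw data with one fitted line per level.
plot(toy$frpl, toy$ppe, col = "grey70", pch = 16,
     xlab = "Percent FRPL", ylab = "Per-pupil expenditure ($)")
lv <- levels(toy$level)
for (i in seq_along(lv)) {
  sub <- grid[grid$level == lv[i], ]
  lines(sub$frpl, sub$fit, lty = i, lwd = 2)
}
legend("topleft", legend = lv, lty = seq_along(lv), lwd = 2, bty = "n")
```

Because this model has no interaction term, the fitted lines are parallel: the adjusted slope on frpl is the same at every level, and only the intercepts differ.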