General Guidelines

The purpose of this assignment is to practice the concepts and vocabulary we have been modeling in class and to implement some of the techniques we have learned. You may work on your own or collaborate with one partner. If you choose to collaborate, please make sure that you engage in a full, fair and mutually agreeable collaboration: you should plan, execute and write up your analyses together, not simply divide the work. Please indicate clearly when your work is joint, and name any other individual or resource (outside of class material) you consulted in preparing your responses.

Submission Requirements

Please upload the following two files to Canvas:

  1. An .html, .doc(x) or .pdf file that includes your typed responses to the problems (in your own words and not identical to anybody else’s except your partner’s), along with any tables and/or figures
  2. The .Rmd or .R file that you used to render the tables and figures in the file above.

Please submit a complete memo that integrates all figures and tables into the text, uses full sentences and does not include any interspersed code chunks (you will be graded on this). As a challenge, try to format your tables and figures following APA 7 guidelines (you will not be graded on this).

Objectives of this assignment

Data Background

National Education Resource Database on Schools (NERD$)

As a country, the United States spends approximately $650 billion per year on its K-12 education system. While district-level expenditure data has been available for decades, considerable funding variation exists within districts as a result of the different experience levels (and therefore salaries) of teachers employed in a school, the specialized services particular schools offer, and targeted efforts to make schools more equitable through differentiated funding. The Every Student Succeeds Act of 2015 requires states to report school-level expenditures. Even so, differences in how states report finance data, costs of living and other factors make cross-state comparisons challenging. The Edunomics Lab at Georgetown University recently launched the National Education Resource Database on Schools (NERD$), which offers, for the first time, a fully comparable, cross-state window into school-level spending. It contains rich variables including measures of academic achievement and achievement gaps for school districts and counties, as well as district-level measures of racial and socioeconomic composition, racial and socioeconomic segregation patterns, and other features of the schooling system. More details on the NERD$ data and ways it has been used to inform policy and practice are available here.

Analytic Sample. Our data set includes school-level expenditure data for SY 2018-19 for 1,193 Oregon public schools. This represents nearly the full universe of Oregon public schools, with some restrictions due to data availability. Observations with missing and/or unreliable values on any of the key variables were dropped for simplicity.

Key variables. The data set contains 14 variables, detailed below.

  • schoolname, school name
  • ncesid, school identifier assigned by the National Center for Education Statistics (NCES, a center within the Institute of Education Sciences at the U.S. Department of Education)
  • ncesdistid_geo, district identifier, assigned by NCES
  • distname, district name
  • level, level of schooling (0 = Other; 1 = Early Education; 2 = Elementary; 3 = Middle School; 4 = High School)
  • ppe, per-pupil expenditure
  • enroll, total school enrollment
  • frpl, proportion (0-1) of students in school receiving free- or reduced-price meals
  • sesavgall, Census-based SES composite index for all families residing within school district
  • baplusavgall, Census-based Bachelor’s degree-plus rate for all adults in families residing within district
  • unempavgall, Census-based unemployment rate for all families residing within district
  • snapavgall, Census-based SNAP receipt rate for all families residing within the district
  • locale, Census-based location of schools attended by majority/plurality of students (Urban, Suburb, Town, Rural)
  • inc50avgall, Census-based median income for all families residing within school district
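Because level is stored as numeric codes, it may help to convert it to a labeled factor before summarizing or modeling. The following is a minimal sketch in base R, assuming the codes match the list above; the vector level_codes is made up for illustration and is not drawn from nerds.csv.

```r
# Made-up codes standing in for the `level` column (not real data).
level_codes <- c(2, 2, 3, 4, 0, 1, 2, 4)

# Attach the labels from the variable list to the numeric codes.
level_f <- factor(level_codes,
                  levels = 0:4,
                  labels = c("Other", "Early Education", "Elementary",
                             "Middle School", "High School"))

# Frequency of each labeled category.
table(level_f)
```

Treating level as a factor keeps R from interpreting the codes as a continuous quantity in plots and models.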

Assignment Details

Data preparation: Open RStudio, create a project and save it. In the root directory of the project, create folders named “Code”, “Data”, “Figures” and “Tables”. Download the nerds.csv dataset and store it in the “Data” folder. Create an R script (or .Rmd file) in the “Code” folder. Read the data into your R environment. You do not need to describe this step in your memo; include it only in your code.
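One possible way to script the folder set-up and data import is sketched below. It assumes a temporary directory as a stand-in for your project root and writes a tiny, made-up nerds.csv so that the example runs on its own; in your project you would use the downloaded file instead.

```r
# A temporary directory stands in for the RStudio project root (assumption).
root <- file.path(tempdir(), "nerds_project")
folders <- c("Code", "Data", "Figures", "Tables")
for (f in folders) {
  dir.create(file.path(root, f), recursive = TRUE, showWarnings = FALSE)
}

# Hypothetical two-row stand-in for nerds.csv (made-up values, not real data).
toy <- data.frame(schoolname = c("School A", "School B"),
                  level = c(2, 4),
                  ppe = c(11000, 13000))
csv_path <- file.path(root, "Data", "nerds.csv")
write.csv(toy, csv_path, row.names = FALSE)

# Read the file into the R environment and inspect its structure.
nerds <- read.csv(csv_path)
str(nerds)
```

Inside an RStudio project, relative paths such as "Data/nerds.csv" resolve from the project root, so the file.path(root, ...) prefix would not be needed.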

1. Descriptive statistics

1.1 Summarize the two categorical variables in your data set (\(level\) and \(locale\)). (1 point)
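A categorical variable is commonly summarized with counts and percentages. The sketch below shows one way to build such a table in base R; the data frame toy is simulated and the helper summarize_cat() is a hypothetical name introduced only for this illustration.

```r
set.seed(1)
n <- 200

# Simulated stand-in for the two categorical variables (not real data).
toy <- data.frame(
  level = factor(sample(c("Elementary", "Middle School", "High School"),
                        n, replace = TRUE),
                 levels = c("Elementary", "Middle School", "High School")),
  locale = factor(sample(c("Urban", "Suburb", "Town", "Rural"),
                         n, replace = TRUE),
                  levels = c("Urban", "Suburb", "Town", "Rural"))
)

# Hypothetical helper: counts and percentages for one categorical variable.
summarize_cat <- function(x) {
  counts <- table(x)
  data.frame(category = names(counts),
             n = as.integer(counts),
             percent = round(100 * as.numeric(prop.table(counts)), 1))
}

level_tab <- summarize_cat(toy$level)
locale_tab <- summarize_cat(toy$locale)
print(level_tab)
print(locale_tab)
```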

1.2 Create one figure describing how per-pupil expenditures differ across levels of schooling and another figure showing how per-pupil expenditures differ by school locale. Interpret these figures. (2 points)
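A boxplot is one of several reasonable figure types for comparing a continuous outcome across groups. The sketch below uses simulated values of ppe and level and writes the figure to a temporary file; the group means, the file name and the output location are assumptions for illustration only.

```r
set.seed(2)

# Simulated stand-in data (made-up group means, not real NERD$ values).
toy <- data.frame(
  level = factor(rep(c("Elementary", "Middle School", "High School"), each = 50),
                 levels = c("Elementary", "Middle School", "High School")),
  ppe = c(rnorm(50, 12000, 1500), rnorm(50, 12500, 1500), rnorm(50, 13500, 1500))
)

# Write the figure to a file; in a project this could be "Figures/ppe_by_level.pdf".
fig_path <- file.path(tempdir(), "ppe_by_level.pdf")
pdf(fig_path, width = 7, height = 5)
bp <- boxplot(ppe ~ level, data = toy,
              xlab = "Level of schooling",
              ylab = "Per-pupil expenditure ($)",
              main = "Per-pupil expenditure by level (simulated data)")
dev.off()
```

The same call with locale in place of level would give the second figure.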

2. Do schools that serve different educational grade-band levels spend differing amounts on their students?

2.1 State your null hypothesis regarding the above research question. (1 point)

2.2 Test your null hypothesis using the statistical test that most directly and efficiently answers the above research question. Interpret the results of this test. (2 points)
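As an illustration of how an omnibus test of equal group means can be run in R, the sketch below fits a one-way ANOVA F-test on simulated data. It is one candidate approach rather than a prescribed answer, and the group means and sample sizes are made up.

```r
set.seed(3)

# Simulated stand-in data (made-up group means, not real NERD$ values).
toy <- data.frame(
  level = factor(rep(c("Elementary", "Middle School", "High School"), each = 50),
                 levels = c("Elementary", "Middle School", "High School")),
  ppe = c(rnorm(50, 11000, 1500), rnorm(50, 12000, 1500), rnorm(50, 14000, 1500))
)

# One-way ANOVA: does mean ppe differ across any of the levels?
anova_tab <- anova(lm(ppe ~ level, data = toy))
f_stat <- anova_tab[["F value"]][1]
p_val <- anova_tab[["Pr(>F)"]][1]
print(anova_tab)
```

The F-test indicates only whether at least one group mean differs from the others; it does not say which groups differ or by how much.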

2.3 State the magnitude by which schools at each of the different grade-bands differ from each other in their per-pupil spending and assess whether this magnitude is statistically different from zero. Select (and justify) the most sensible group to serve as the comparison (reference) group. (2 points)
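Group differences relative to a reference category can be read from a regression on a factor. The sketch below uses simulated data and sets "Elementary" as the reference purely for illustration; which group is most sensible in the real data is for you to decide and justify.

```r
set.seed(4)

# Simulated stand-in data (made-up group means, not real NERD$ values).
toy <- data.frame(
  level = factor(rep(c("Elementary", "Middle School", "High School"), each = 50)),
  ppe = c(rnorm(50, 11000, 1500), rnorm(50, 12000, 1500), rnorm(50, 14000, 1500))
)

# Make the reference category explicit (illustrative choice only).
toy$level <- relevel(toy$level, ref = "Elementary")

# Each slope is the mean difference in ppe between that level and the reference;
# the accompanying t-test assesses whether the difference is zero.
fit <- lm(ppe ~ level, data = toy)
coef_tab <- summary(fit)$coefficients
print(round(coef_tab, 3))
```

The intercept equals the mean ppe of the reference group, so each coefficient can be interpreted directly in dollars per pupil.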

2.4 High schools are generally understood to have greater resource needs than other K-12 schools because they fund items such as science laboratories, extracurriculars and specialized classes. Test whether high schools spend more per pupil than all other schools and interpret your results. (1 point)
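A directional comparison of one group against all others can be sketched as a one-sided two-sample test. The example below uses simulated data and a Welch t-test (R's default, which does not assume equal variances); this is one possible approach, and a planned contrast within a regression model would be another.

```r
set.seed(5)

# Simulated stand-in data (made-up group means, not real NERD$ values).
toy <- data.frame(
  level = factor(rep(c("Elementary", "Middle School", "High School"), each = 50)),
  ppe = c(rnorm(50, 11000, 1500), rnorm(50, 12000, 1500), rnorm(50, 14000, 1500))
)

# Split spending into high schools versus every other school.
hs_ppe <- toy$ppe[toy$level == "High School"]
other_ppe <- toy$ppe[toy$level != "High School"]

# One-sided test: is mean ppe in high schools greater than in other schools?
tt <- t.test(x = hs_ppe, y = other_ppe, alternative = "greater")
diff_hs <- unname(tt$estimate[1] - tt$estimate[2])
print(tt)
```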

3. Do schools that educate a larger proportion of students with financial need spend more per student on their education, after accounting for other school-level variation?

3.1 Review the variables you have at your disposal and select a set of substantively sensible (and non-multicollinear) continuous and categorical covariates that might explain schools’ per-pupil expenditure and help clarify the relationship between \(ppe\) and \(frpl\). (1 point)
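One simple screen for multicollinearity among continuous candidates is a pairwise correlation matrix. The sketch below uses simulated covariates that are deliberately built to be highly correlated; the 0.8 cutoff is an arbitrary illustrative threshold, not a rule from the course.

```r
set.seed(6)
n <- 300
ses <- rnorm(n)

# Simulated covariates; three are constructed to move together (not real data).
toy <- data.frame(
  sesavgall = ses,
  baplusavgall = 0.3 + 0.10 * ses + rnorm(n, sd = 0.02),
  snapavgall = 0.2 - 0.05 * ses + rnorm(n, sd = 0.01),
  enroll = round(runif(n, 100, 1500))
)

# Pairwise Pearson correlations among the candidate covariates.
r <- cor(toy)

# Flag each pair once (upper triangle) whose absolute correlation exceeds 0.8.
high <- which(abs(r) > 0.8 & upper.tri(r), arr.ind = TRUE)
flagged <- data.frame(var1 = rownames(r)[high[, "row"]],
                      var2 = colnames(r)[high[, "col"]],
                      r = round(r[high], 2))
print(round(r, 2))
print(flagged)
```

When two candidates are flagged, a common choice is to keep the one with the clearer substantive interpretation rather than including both.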

3.2 Present a table and plot characterizing the regression-adjusted relationship between \(frpl\) and \(ppe\). Substantively interpret these results as you would for an academic journal article. (3 points)
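A regression-adjusted relationship can be displayed by plotting model predictions across the predictor of interest while holding the covariates at fixed values. The sketch below uses simulated data; the covariate set (enroll and level), the coefficients used to generate ppe, and the choice to hold level at "Elementary" are all assumptions for illustration.

```r
set.seed(7)
n <- 400

# Simulated stand-in data (made-up relationships, not real NERD$ values).
toy <- data.frame(
  frpl = runif(n, 0.05, 0.95),
  enroll = round(runif(n, 100, 1500)),
  level = factor(sample(c("Elementary", "Middle School", "High School"),
                        n, replace = TRUE),
                 levels = c("Elementary", "Middle School", "High School"))
)
toy$ppe <- 11000 + 3000 * toy$frpl - 1.5 * toy$enroll +
  1000 * (toy$level == "High School") + rnorm(n, sd = 1200)

# Adjusted model and its coefficient table.
fit <- lm(ppe ~ frpl + enroll + level, data = toy)
coef_tab <- round(summary(fit)$coefficients, 3)
print(coef_tab)

# Predicted ppe across frpl, with enroll at its mean and level held fixed.
grid <- data.frame(frpl = seq(0, 1, by = 0.05),
                   enroll = mean(toy$enroll),
                   level = factor("Elementary", levels = levels(toy$level)))
pred <- predict(fit, newdata = grid, interval = "confidence")

# Fitted line with dashed 95% confidence bounds, written to a temporary file.
fig_path <- file.path(tempdir(), "ppe_by_frpl_adjusted.pdf")
pdf(fig_path, width = 7, height = 5)
plot(grid$frpl, pred[, "fit"], type = "l", ylim = range(pred),
     xlab = "Proportion of students receiving free or reduced-price meals",
     ylab = "Predicted per-pupil expenditure ($)")
lines(grid$frpl, pred[, "lwr"], lty = 2)
lines(grid$frpl, pred[, "upr"], lty = 2)
dev.off()
```

Because the model is additive, changing the values at which the covariates are held shifts the line up or down but leaves its slope unchanged.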