Unit 1: GLM and regression

Topic: Introduction to the General Linear Model and bivariate regression refresher

Objectives:

  • Characterize a bivariate relationship along five dimensions (direction, linearity, outliers, strength and magnitude)
  • Describe how statistical models differ from deterministic models
  • Mathematically represent the population model and interpret its deterministic and stochastic components
  • Formulate a linear regression model to hypothesize a population relationship
  • Estimate a fitted regression line using Ordinary Least Squares (OLS) regression
  • Describe residuals and how they indicate the degree of our OLS model fit
  • Explain \(R^{2}\), both in terms of what it tells us and what it does not
  • Conduct an inference test for a single regression coefficient and for the regression model as a whole
  • Calculate a correlation coefficient \((r)\) and describe its relationship to \(R^{2}\)
  • Distinguish between research designs that permit correlational associations and those that permit causal inferences
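
As a preview of the estimation objectives above, here is a minimal sketch of a bivariate OLS fit. It assumes Python with NumPy, which may not be the software used in this course, and the data and the function name fit_ols are made up for illustration.

```python
import numpy as np

# Hypothetical toy data (not from the course): hours studied and test score.
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([52.0, 55.0, 61.0, 64.0, 68.0])

def fit_ols(x, y):
    """Return (intercept, slope) that minimize the sum of squared residuals."""
    x_bar, y_bar = x.mean(), y.mean()
    slope = np.sum((x - x_bar) * (y - y_bar)) / np.sum((x - x_bar) ** 2)
    intercept = y_bar - slope * x_bar
    return intercept, slope

b0, b1 = fit_ols(x, y)

# Residuals: observed outcome minus the value predicted by the fitted line.
residuals = y - (b0 + b1 * x)

# R^2: share of the outcome's variance accounted for by the fitted line.
r_squared = 1 - np.sum(residuals ** 2) / np.sum((y - y.mean()) ** 2)

# Pearson correlation; in the bivariate case its square equals R^2.
r = np.corrcoef(x, y)[0, 1]
```

In this bivariate setting, r ** 2 and r_squared agree up to rounding, which is the relationship between \(r\) and \(R^{2}\) named in the objectives.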

Readings:

Lectures:

Assignment 1: get started!!! (see below)

Unit 2: Assumptions & diagnostics

Topic: Regression assumptions and diagnostics

Objectives:

  • Articulate the assumptions of the General Linear Model broadly, and of least squares estimation and inference in particular
  • Describe sources of assumption violation in the regression model, including measurement error, non-linearity, heteroscedasticity, non-normally distributed residuals, correlated errors, and outliers
  • Articulate properties of residuals and describe their centrality in understanding the regression model assumptions
  • Conduct diagnostic tests on regression model assumption violations
  • Implement a consistent screening protocol to identify regression model assumption violations
  • Implement solutions to regression model assumption violations, when appropriate
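
To make the residual-based objectives above concrete, here is a minimal sketch, assuming Python with NumPy and deliberately curved, made-up data. It illustrates two properties of OLS residuals and one informal screen for non-linearity; it is not a complete diagnostic protocol.

```python
import numpy as np

# Hypothetical data with a curved relationship (assumption for illustration).
x = np.arange(1, 11, dtype=float)
y = x ** 2

# Fit a straight line by least squares; polyfit returns [slope, intercept].
slope, intercept = np.polyfit(x, y, 1)
fitted = intercept + slope * x
residuals = y - fitted

# Properties that hold for any OLS fit with an intercept:
# residuals sum to zero and are uncorrelated with the predictor.
residual_sum = residuals.sum()
corr_with_x = np.corrcoef(x, residuals)[0, 1]

# Informal screen for non-linearity: residuals that track the squared,
# centered predictor suggest the straight line is missing curvature.
curvature = (x - x.mean()) ** 2
corr_with_curvature = np.corrcoef(curvature, residuals)[0, 1]
```

Because the residuals are uncorrelated with the predictor by construction, violations show up as patterns, such as curvature or changing spread, rather than as a linear trend in a residual plot.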

Readings:

Lecture:

Assignment 1:

Unit 3: Multiple regression

Topic: Multiple Regression

Objectives:

  • Articulate the concepts of multiple regression and “statistical adjustment”
  • Distinguish between the substantive implications of the terms “statistical control” and “statistical adjustment”
  • Estimate the parameters of a multiple regression model
  • Visually display the results of multiple regression models
  • State the main effects assumption and what the implication would be if it is violated
  • Conduct statistical inference tests of single predictors (\(t\)-test) and full model (\(F\)-test) in multiple regression
  • Decompose the total variance into its component parts (model and residual) and use the \(R^2\) statistic to describe this decomposition
  • Describe problems for regression associated with the phenomenon of multicollinearity
  • Use visual schema (e.g., Venn diagrams) to assess regression models for the potential of multicollinearity
  • Use statistical results (e.g., correlation matrices or heat maps) to assess regression models for the potential of multicollinearity
  • Describe and implement some solutions to multicollinearity
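
The multicollinearity checks named above can be sketched as follows, assuming Python with NumPy and made-up predictors. The variance inflation factor (VIF) is added here as one common diagnostic; it is not named in the objectives, and the helper vif is hypothetical.

```python
import numpy as np

# Hypothetical predictors (assumption): x2 is almost a copy of x1, x3 is not.
x1 = np.array([1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0])
x2 = x1 + np.array([0.1, -0.1, 0.2, -0.2, 0.1, -0.1, 0.2, -0.2])
x3 = np.array([3.0, 1.0, 4.0, 1.0, 5.0, 9.0, 2.0, 6.0])
X = np.column_stack([x1, x2, x3])

# Correlation matrix of the predictors (columns are treated as variables).
corr = np.corrcoef(X, rowvar=False)

def vif(X):
    """Variance inflation factor for each column: 1 / (1 - R^2_j), where
    R^2_j comes from regressing column j on the remaining columns."""
    n, k = X.shape
    out = []
    for j in range(k):
        target = X[:, j]
        others = np.delete(X, j, axis=1)
        design = np.column_stack([np.ones(n), others])
        coef, *_ = np.linalg.lstsq(design, target, rcond=None)
        resid = target - design @ coef
        r2 = 1 - (resid @ resid) / np.sum((target - target.mean()) ** 2)
        out.append(1.0 / (1.0 - r2))
    return np.array(out)

vifs = vif(X)
```

With these data, the near-duplicate pair x1 and x2 shows a correlation close to 1 and very large VIFs, while x3 stays close to 1.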

Readings:

Lecture:

Assignment 2:

Unit 4: Categorical predictors

Topic: Categorical predictors and ANOVA

Objectives:

  • Describe the relationship between dichotomous and polychotomous variables and convert variables between these forms, as necessary
  • Conduct a two-sample \(t\)-test
  • Describe the relationship between a two-sample \(t\)-test and regressing a continuous outcome on a dichotomous predictor
  • Estimate a regression with one dummy variable as a predictor and interpret the results (including when the reference category changes)
  • Estimate a multiple regression model with several continuous and dummy variables and interpret the results
  • Estimate an ANOVA model and interpret the within- and between-group variance
  • Do the same for an ANCOVA model, adjusting for additional continuous predictors
  • Describe the similarities and differences between Ordinary Least Squares regression analysis and ANOVA/ANCOVA, and when one would prefer one approach to the other
  • Describe potential Type I error problems that arise from multiple group comparisons and potential solutions to these problems, including theory, pre-registration, ANOVA and post-hoc corrections
  • Describe the relationship between different modeling approaches within the General Linear Model family
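
The link between a two-sample \(t\)-test and regression on a dichotomous predictor can be sketched as follows, assuming Python with NumPy, made-up group data, and the equal-variance (pooled) form of the \(t\)-test. The function names are hypothetical.

```python
import numpy as np

# Hypothetical outcomes for two groups (assumption for illustration).
control = np.array([4.0, 5.0, 6.0, 5.0])
treated = np.array([7.0, 8.0, 6.0, 9.0])

def pooled_t(a, b):
    """Two-sample t statistic with a pooled (equal-variance) estimate."""
    n_a, n_b = len(a), len(b)
    pooled_var = ((n_a - 1) * a.var(ddof=1) + (n_b - 1) * b.var(ddof=1)) / (n_a + n_b - 2)
    se = np.sqrt(pooled_var * (1 / n_a + 1 / n_b))
    return (b.mean() - a.mean()) / se

def dummy_regression_t(a, b):
    """Regress the outcome on a 0/1 dummy; return coefficients and slope t."""
    y = np.concatenate([a, b])
    d = np.concatenate([np.zeros(len(a)), np.ones(len(b))])
    X = np.column_stack([np.ones_like(d), d])
    coef = np.linalg.lstsq(X, y, rcond=None)[0]
    resid = y - X @ coef
    sigma2 = (resid @ resid) / (len(y) - 2)   # residual variance
    cov = sigma2 * np.linalg.inv(X.T @ X)     # coefficient covariance matrix
    return coef, coef[1] / np.sqrt(cov[1, 1])

t_test = pooled_t(control, treated)
coef, t_reg = dummy_regression_t(control, treated)
```

The intercept equals the mean of the reference group (the group coded 0), the slope equals the difference in group means, and the two \(t\) statistics coincide.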

Readings:

Lecture:

Assignment 3:

  • Due: Feb. 24, 11:59pm

Unit 5: Interactions and non-linearity

Topic: Interactions and non-linearity

Objectives:

  • Describe in writing and verbally the assumptions we violate when we fit a non-linear relationship in a linear model
  • Transform non-linear relationships into linear ones by using logarithmic scales
  • Estimate regression models using logarithmic scales and interpret the results
  • Describe in writing and verbally the concept of statistical interaction
  • Estimate and interpret regression models with interactions between categorical and continuous predictors
  • Visualize interaction effects graphically
  • Describe statistical power and Type II error challenges resulting from interactions
  • Estimate models with quadratic and higher-order polynomial terms
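
An interaction between a categorical and a continuous predictor can be sketched as follows, assuming Python with NumPy and made-up, noise-free data in which the two groups have different slopes by construction.

```python
import numpy as np

# Hypothetical data (assumption): group 0 follows y = 1 + 2x,
# group 1 follows y = 3 + 5x, with no noise so the fit is exact.
x = np.array([0.0, 1.0, 2.0, 3.0, 0.0, 1.0, 2.0, 3.0])
group = np.array([0.0, 0.0, 0.0, 0.0, 1.0, 1.0, 1.0, 1.0])
y = np.where(group == 0, 1 + 2 * x, 3 + 5 * x)

# Design matrix: intercept, x, group dummy, and the product term x * group.
X = np.column_stack([np.ones_like(x), x, group, x * group])
b0, b1, b2, b3 = np.linalg.lstsq(X, y, rcond=None)[0]

# b1 is the slope for the reference group (group 0);
# b3 is how much the slope differs for group 1.
slope_group0 = b1
slope_group1 = b1 + b3
```

The coefficient on the product term is the difference between the two group slopes, which is what a plot of the two fitted lines shows as non-parallel lines.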

Readings:

Lecture:

Assignment 4:

Unit 6: Model building

Topic: Model building

Objectives:

  • Translate research questions into question predictors, covariates, outcomes and rival hypothesis predictors
  • Develop work processes to address real-life data that contain a large number of predictors
  • Build a logical and sequential taxonomy of fitted regression models
  • Distinguish between model building and reporting, including best practices for research transparency, replicability and integrity
  • Present results in publication-ready tables and figures
  • Write compelling and scientifically accurate interpretations of results
  • Describe the power and limits of quantitative research

Readings:

Lecture:

Class cancelled on March 13 (David in Baltimore for AEFP Conference)

Final:

  • Due: Mar. 20, 12:01pm