This repository contains a Jupyter Notebook, Math Assessment.ipynb, that provides a comprehensive psychometric analysis of a mathematics diagnostic assessment for incoming undergraduate students. The project uses Item Response Theory (IRT) models to evaluate item characteristics and student abilities, leveraging the py-irt library and the Pyro probabilistic programming framework.
This IRT analysis also serves as a foundational step for investigating Differential Item Functioning (DIF). The ability estimates (θ) produced by these models provide the basis for that follow-up analysis.
Comprehensive IRT Modeling: Defines, trains, and evaluates a suite of IRT models, progressing from the baseline Rasch (1PL) model to the more complex 2PL model with multiple covariates.
Covariate Analysis: Incorporates item domain (e.g., geometry, statistics) and pre-determined difficulty levels ("EASY," "MEDIUM," "HARD") as covariates to build more robust and interpretable models.
Bayesian Inference: Models are trained using Stochastic Variational Inference (SVI), a powerful method for approximating posterior distributions in complex probabilistic models.
Model Evaluation: Model fit is assessed visually using Posterior Predictive Checks (PPC), which compare the distribution of observed test scores to scores generated from the fitted models.
Detailed Parameter Extraction: After training, the notebook extracts and analyzes key item parameters, including discrimination (a) and difficulty (b) estimates for each item.
Rich Visualization: The notebook generates both theoretical Item Characteristic Curves (ICCs) and empirical ICCs that overlay the model's predictions on actual student response data.
The analysis uses two primary data files:
math.items_AnSamp1.csv: Contains the wide-format student response data.
mapping_unique_math.items_umgc1ua2.csv: Contains item metadata, including the domain and pre-assigned difficulty level for each item.
The final processed dataset used for modeling consists of responses from 4,460 individuals to 174 distinct items across 6 mathematical domains.
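The reshaping described next might look like the following minimal pandas sketch. The column names (student_id, item_id, response) and the shared item identifier are assumptions for illustration, not the notebook's actual schema.

```python
import pandas as pd

# Wide-format responses: one row per student, one column per item (assumed layout).
responses_wide = pd.read_csv("math.items_AnSamp1.csv")

# Item metadata: domain and pre-assigned difficulty level for each item (assumed columns).
item_meta = pd.read_csv("mapping_unique_math.items_umgc1ua2.csv")

# Melt to long format: one row per (student, item) response, as expected by IRT tooling.
responses_long = responses_wide.melt(
    id_vars=["student_id"],     # assumed identifier column
    var_name="item_id",
    value_name="response",      # 1 = correct, 0 = incorrect
)

# Attach the domain and pre-assigned difficulty level to each response row.
responses_long = responses_long.merge(item_meta, on="item_id", how="left")
```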
The analysis begins by reshaping the raw data into a long format suitable for IRT modeling. A series of progressively complex IRT models are then specified and trained:
Rasch Model: A 1PL model estimating a single difficulty parameter (b) for each item.
2PL Model: An extension that adds a discrimination parameter (a) for each item.
Rasch + Covariates: Incorporates the effects of item domain and pre-assigned difficulty level on item difficulty.
2PL + Covariates: The most comprehensive model, accounting for item discrimination, base difficulty, and the effects of both domain and pre-assigned difficulty.
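The notebook builds these models with py-irt; as a self-contained illustration of the underlying idea, the sketch below defines a simplified 2PL model in Pyro and trains it with SVI. Variable names, priors, and hyperparameters are assumptions for illustration, not the notebook's actual code, and the covariate effects of domain and difficulty level are omitted for brevity.

```python
import torch
import pyro
import pyro.distributions as dist
from pyro.infer import SVI, Trace_ELBO
from pyro.infer.autoguide import AutoNormal
from pyro.optim import Adam

def two_pl_model(subject_idx, item_idx, responses, n_subjects, n_items):
    # Student abilities (theta) and item difficulties (b) get standard normal priors.
    with pyro.plate("subjects", n_subjects):
        theta = pyro.sample("theta", dist.Normal(0.0, 1.0))
    with pyro.plate("items", n_items):
        b = pyro.sample("b", dist.Normal(0.0, 1.0))
        log_a = pyro.sample("log_a", dist.Normal(0.0, 0.5))  # log keeps discrimination positive

    # 2PL response probability: sigmoid(a_i * (theta_j - b_i)).
    logits = torch.exp(log_a)[item_idx] * (theta[subject_idx] - b[item_idx])
    with pyro.plate("data", len(responses)):
        pyro.sample("obs", dist.Bernoulli(logits=logits), obs=responses)

# Stochastic Variational Inference with an automatic mean-field guide.
guide = AutoNormal(two_pl_model)
svi = SVI(two_pl_model, guide, Adam({"lr": 0.01}), loss=Trace_ELBO())

# subject_idx, item_idx, and responses would come from the long-format data built above,
# encoded as integer index tensors and a float tensor of 0/1 outcomes, e.g.:
# for step in range(2000):
#     loss = svi.step(subject_idx, item_idx, responses, n_subjects, n_items)
```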
A notable finding from the model comparison was that adding the pre-determined item difficulty ("EASY," "MEDIUM," "HARD") as a covariate worsened the model fit. This suggests that the empirically derived difficulty parameters from the IRT models were more effective at explaining student response patterns than the pre-assigned labels.
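A comparison like this rests on the Posterior Predictive Checks mentioned above: simulate test scores from a fitted model and compare their distribution to the observed score distribution. The sketch below is a simplified, point-estimate version of such a check; a_hat, b_hat, and theta_hat are assumed arrays of estimated item and ability parameters, and a full PPC would instead draw from the posterior.

```python
import numpy as np
import matplotlib.pyplot as plt

def simulate_total_scores(theta_hat, a_hat, b_hat, n_reps=200, seed=0):
    """Replicate total test scores from a fitted 2PL model (point-estimate approximation)."""
    rng = np.random.default_rng(seed)
    # P(correct) for every (student, item) pair under the 2PL model.
    p = 1.0 / (1.0 + np.exp(-a_hat[None, :] * (theta_hat[:, None] - b_hat[None, :])))
    sims = rng.binomial(1, p, size=(n_reps, *p.shape))  # shape: (reps, students, items)
    return sims.sum(axis=2)                             # total score per student per replicate

# observed_scores = responses_wide[item_columns].sum(axis=1)   # assumed column selection
# simulated_scores = simulate_total_scores(theta_hat, a_hat, b_hat)
# plt.hist(observed_scores, bins=30, density=True, alpha=0.6, label="observed")
# plt.hist(simulated_scores.ravel(), bins=30, density=True, alpha=0.6, label="simulated")
# plt.legend(); plt.xlabel("Total score"); plt.show()
```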
The primary outputs of this project are:
Item Parameter Estimates: A final CSV file, 2pl_domain_difficulty_item_parameters.csv, containing the estimated parameters for all 174 items.
ICC_Plots_math Folder: Contains the theoretical Item Characteristic Curve (ICC) for each item. These curves show the relationship between a student's ability (θ) and the probability of answering the item correctly.
Empirical_Plots_math Folder: Contains empirical plots that show the actual student responses (0 for incorrect, 1 for correct) against their estimated abilities, with the model-predicted ICC overlaid for visual fit assessment.
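For reference, the theoretical 2PL ICC plotted for each item is the logistic curve P(correct | theta) = 1 / (1 + exp(-a * (theta - b))). A minimal sketch of drawing one such curve (parameter values are placeholders):

```python
import numpy as np
import matplotlib.pyplot as plt

def plot_icc(a, b, item_label):
    """Plot the theoretical 2PL Item Characteristic Curve for a single item."""
    ability = np.linspace(-4, 4, 200)
    p_correct = 1.0 / (1.0 + np.exp(-a * (ability - b)))
    plt.plot(ability, p_correct)
    plt.xlabel("Ability (theta)")
    plt.ylabel("P(correct)")
    plt.ylim(0, 1)
    plt.title(f"ICC: {item_label} (a={a:.2f}, b={b:.2f})")
    plt.show()

# Example: a moderately discriminating item of average difficulty.
plot_icc(a=1.2, b=0.0, item_label="example_item")
```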
Installation & Usage: To run the analysis, first install the required dependencies:
pip install --quiet py-irt torch pyro-ppl pandas jsonlines

Then, ensure the input CSV files are accessible in your environment (e.g., in your Google Drive) and run the cells sequentially in the Math Assessment.ipynb notebook.