This project implements a neural network from scratch using NumPy to predict the probability that the home team wins an NFL game, given a snapshot of the game state.
No PyTorch, TensorFlow, or autograd is used; all computation, including backpropagation, is implemented directly with NumPy.
- Implementing a neural network from scratch
- Understanding forward and backward propagation
- Implementing gradient descent
- Preventing data leakage in datasets
- Building a complete machine learning pipeline
- How predictions connect to loss
- How gradients are computed via the chain rule
- How gradient descent actually updates parameters
- How data leakage happens and how to prevent it
- How to validate backprop with numerical gradient checking
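The last point above, numerical gradient checking, is simple enough to sketch up front: perturb each parameter by a small `eps` in both directions and compare the central-difference slope of the loss to the analytic gradient. A minimal helper (the name `numerical_grad` is ours, not from this codebase):

```python
import numpy as np

def numerical_grad(f, W, eps=1e-5):
    # central differences: (f(w + eps) - f(w - eps)) / (2 * eps) per entry
    g = np.zeros_like(W)
    for i in np.ndindex(W.shape):
        old = W[i]
        W[i] = old + eps
        loss_plus = f()
        W[i] = old - eps
        loss_minus = f()
        W[i] = old  # restore before moving on
        g[i] = (loss_plus - loss_minus) / (2 * eps)
    return g

# sanity check on a function with a known gradient: d/dW sum(W^2) = 2W
W = np.arange(6, dtype=float).reshape(2, 3)
g = numerical_grad(lambda: np.sum(W**2), W)
```

If the analytic gradients from backprop agree with this (up to roughly `eps**2` error), the chain-rule code is almost certainly correct.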
- Task: Predict the probability that the home team wins the game.
- Input: A snapshot of the game at a single moment in time.
- Output: A probability in the range [0, 1].
- Label: home_win = 1 if the home team eventually wins the game, otherwise 0.
The dataset is derived from NFL play-by-play data sourced from nflfastR via nflreadr.
Each row in the final dataset represents a game state snapshot from which we predict.
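Because every game contributes many snapshots, rows from the same game are strongly correlated, and a naive row-level random split would put snapshots of the same game in both train and test, leaking the outcome. One safeguard, sketched here with a hypothetical `game_ids` array, is to split at the game level:

```python
import numpy as np

def split_by_game(game_ids, test_frac=0.2, seed=0):
    # hold out whole games so no game's snapshots span train and test
    rng = np.random.default_rng(seed)
    games = np.unique(game_ids)
    rng.shuffle(games)
    n_test = max(1, int(len(games) * test_frac))
    test_games = games[:n_test]
    test_mask = np.isin(game_ids, test_games)
    return ~test_mask, test_mask

# toy ids: five games, two snapshots each
game_ids = np.array([1, 1, 2, 2, 3, 3, 4, 4, 5, 5])
train_mask, test_mask = split_by_game(game_ids)
```

The same idea applies to a validation split: always partition by game identifier, never by individual rows.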
| Feature | Description |
|---|---|
| score_diff | home_score minus away_score |
| seconds_remaining | Seconds remaining in the game |
| quarter | Game quarter (1 to 4) |
| down | Down (1 to 4) |
| yards_to_go | Yards needed for a first down |
| yardline_100 | Distance to the opponent's end zone, in yards |
| possession_is_home | 1 if home team has possession, else 0 |
| Label | Description |
|---|---|
| home_win | 1 if the home team wins the game, else 0 |
This project uses a single hidden layer neural network.
```
X (N, D)
  -> z1 = X @ W1 + b1
  -> h = tanh(z1)
  -> z2 = h @ W2 + b2
  -> yhat = sigmoid(z2)
  -> loss = binary_cross_entropy(yhat, y)
```
Where:
- N = number of samples
- D = number of input features
- W1, b1 = weights and biases for hidden layer
- W2, b2 = weights and biases for output layer
- tanh = hyperbolic tangent activation function
- sigmoid = logistic sigmoid activation function
- binary_cross_entropy = loss function for binary classification
- tanh introduces nonlinearity and has a simple derivative
- sigmoid maps logits to probabilities
- binary cross entropy pairs naturally with sigmoid
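The forward pass above translates almost line for line into NumPy. A minimal sketch with toy shapes (D = 7 matches the feature table; the hidden width H = 16 and the weight scale are arbitrary choices, not taken from this project):

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def forward(X, W1, b1, W2, b2):
    z1 = X @ W1 + b1    # hidden pre-activation, shape (N, H)
    h = np.tanh(z1)     # hidden activation in (-1, 1)
    z2 = h @ W2 + b2    # output logit, shape (N, 1)
    yhat = sigmoid(z2)  # win probability in (0, 1)
    return h, yhat

rng = np.random.default_rng(0)
N, D, H = 5, 7, 16
X = rng.normal(size=(N, D))
W1 = 0.1 * rng.normal(size=(D, H)); b1 = np.zeros(H)
W2 = 0.1 * rng.normal(size=(H, 1)); b2 = np.zeros(1)
h, yhat = forward(X, W1, b1, W2, b2)
```

Returning `h` alongside `yhat` matters: the backward pass reuses the hidden activations rather than recomputing them.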
Binary cross entropy:
L = -mean(y * log(yhat) + (1 - y) * log(1 - yhat))
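A direct implementation of this formula; the only subtlety is clipping `yhat` away from exactly 0 and 1 so the logs stay finite (the `eps` value is an arbitrary choice):

```python
import numpy as np

def binary_cross_entropy(yhat, y, eps=1e-12):
    # clip so log never sees exactly 0 or 1
    yhat = np.clip(yhat, eps, 1.0 - eps)
    return -np.mean(y * np.log(yhat) + (1 - y) * np.log(1 - yhat))

# example: confident correct predictions give a small loss
y = np.array([1.0, 0.0, 1.0])
loss = binary_cross_entropy(np.array([0.9, 0.1, 0.8]), y)
```

A useful reference point: predicting 0.5 for every sample gives a loss of log(2) ≈ 0.693, so any trained model should do better than that.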
Backpropagation is implemented manually using the chain rule.
Key identity used: dL/dz2 = (yhat - y) / N
From there:
- gradients flow backward to W2 and b2
- then through tanh using (1 - h²)
- then to W1 and b1
Every gradient is computed explicitly.
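The chain above can be sketched end to end: one forward pass, every gradient written out explicitly, then a single gradient-descent update. The shapes (H = 8), learning rate, and random toy data are assumptions for illustration, not values from this project:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def bce(yhat, y, eps=1e-12):
    yhat = np.clip(yhat, eps, 1.0 - eps)
    return -np.mean(y * np.log(yhat) + (1 - y) * np.log(1 - yhat))

rng = np.random.default_rng(0)
N, D, H = 32, 7, 8
X = rng.normal(size=(N, D))
y = rng.integers(0, 2, size=(N, 1)).astype(float)
W1 = 0.1 * rng.normal(size=(D, H)); b1 = np.zeros(H)
W2 = 0.1 * rng.normal(size=(H, 1)); b2 = np.zeros(1)

# forward
h = np.tanh(X @ W1 + b1)
yhat = sigmoid(h @ W2 + b2)
loss_before = bce(yhat, y)

# backward: every gradient via the chain rule
dz2 = (yhat - y) / N        # the sigmoid + BCE identity
dW2 = h.T @ dz2
db2 = dz2.sum(axis=0)
dh = dz2 @ W2.T
dz1 = dh * (1 - h**2)       # tanh'(z1) = 1 - tanh(z1)^2
dW1 = X.T @ dz1
db1 = dz1.sum(axis=0)

# one gradient-descent step (lr is an arbitrary choice)
lr = 0.1
W1 -= lr * dW1; b1 -= lr * db1
W2 -= lr * dW2; b2 -= lr * db2

h = np.tanh(X @ W1 + b1)
loss_after = bce(sigmoid(h @ W2 + b2), y)
```

Each gradient has the same shape as the parameter it updates, which makes the `-= lr * grad` step a plain elementwise operation.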
Here is the learning curve showing training and validation loss over epochs:
