
NFL Win Probability Neural Network

This project implements a neural network from scratch using NumPy to predict the probability that the home team wins an NFL game, given a snapshot of the game state.

It deliberately avoids PyTorch, TensorFlow, and any autograd library.

What I learned

  • Implementing a neural network from scratch, end to end
  • How predictions connect to loss, and how forward and backward propagation fit together
  • How gradients are computed via the chain rule
  • How gradient descent actually updates parameters
  • How data leakage happens and how to prevent it
  • How to validate backprop with numerical gradient checking
  • Building a complete machine learning pipeline
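The last item, numerical gradient checking, compares analytic gradients against central finite differences. A minimal sketch (the function name and the toy loss are illustrative, not from this repository):

```python
import numpy as np

def numerical_grad(loss_fn, W, eps=1e-6):
    """Central finite differences: (L(w+eps) - L(w-eps)) / (2*eps) per entry."""
    g = np.zeros_like(W)
    it = np.nditer(W, flags=["multi_index"])
    while not it.finished:
        i = it.multi_index
        old = W[i]
        W[i] = old + eps
        lp = loss_fn()
        W[i] = old - eps
        lm = loss_fn()
        W[i] = old  # restore the original weight
        g[i] = (lp - lm) / (2 * eps)
        it.iternext()
    return g

# Check a toy loss L = mean((X @ W)**2) against its analytic gradient
rng = np.random.default_rng(0)
X, W = rng.normal(size=(5, 3)), rng.normal(size=(3, 1))
loss = lambda: np.mean((X @ W) ** 2)
analytic = 2 * X.T @ (X @ W) / (X @ W).size
assert np.allclose(numerical_grad(loss, W), analytic, atol=1e-6)
```

The same check applies to each of W1, b1, W2, b2: if the relative error between analytic and numerical gradients is tiny, the backprop code is almost certainly correct.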

Problem definition

Task

Predict the probability that the home team wins the game.

Input

A snapshot of the game at a single moment in time.

Output

A probability in the range [0, 1].

Label

home_win = 1 if the home team eventually wins the game, otherwise 0.


Dataset

The dataset is derived from NFL play-by-play data sourced from nflfastR via nflreadr.

Each row in the final dataset represents a game state snapshot from which we predict.

Final feature set

Feature             Description
score_diff          home_score minus away_score
seconds_remaining   Seconds remaining in the game
quarter             Game quarter (1 to 4)
down                Down (1 to 4)
yards_to_go         Yards needed for a first down
yardline_100        Distance to the opponent's end zone
possession_is_home  1 if the home team has possession, else 0
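Each snapshot becomes one numeric row in the feature order listed above. A hedged sketch of that step (the helper function and the dict layout are assumptions, not the repository's exact code):

```python
import numpy as np

# Feature order taken from the table above; it must stay fixed across the pipeline.
FEATURES = ["score_diff", "seconds_remaining", "quarter", "down",
            "yards_to_go", "yardline_100", "possession_is_home"]

def snapshot_to_row(play):
    """Turn one game-state snapshot (a dict) into a feature vector."""
    return np.array([float(play[f]) for f in FEATURES])

# Example: home team up 3, 2nd & 7 at midfield, 5:00 left in Q4, home has the ball
play = {"score_diff": 3, "seconds_remaining": 300, "quarter": 4,
        "down": 2, "yards_to_go": 7, "yardline_100": 50,
        "possession_is_home": 1}
row = snapshot_to_row(play)
assert row.shape == (7,) and row[0] == 3.0
```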

Label

Label     Description
home_win  1 if the home team wins the game, else 0

Neural network architecture

This project uses a single hidden layer neural network.

Forward pass

X (N, D)
  -> z1 = X @ W1 + b1
  -> h = tanh(z1)
  -> z2 = h @ W2 + b2
  -> yhat = sigmoid(z2)
  -> loss = binary_cross_entropy(yhat, y)

Where:

  • N = number of samples
  • D = number of input features
  • W1, b1 = weights and biases for hidden layer
  • W2, b2 = weights and biases for output layer
  • tanh = hyperbolic tangent activation function
  • sigmoid = logistic sigmoid activation function
  • binary_cross_entropy = loss function for binary classification
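The chain above can be sketched directly in NumPy (variable names follow the diagram; the hidden size H and the initialization are illustrative, not the repository's exact choices):

```python
import numpy as np

def forward(X, W1, b1, W2, b2):
    """Forward pass for the single-hidden-layer network described above."""
    z1 = X @ W1 + b1                   # hidden pre-activation, shape (N, H)
    h = np.tanh(z1)                    # hidden activation
    z2 = h @ W2 + b2                   # output logit, shape (N, 1)
    yhat = 1.0 / (1.0 + np.exp(-z2))   # sigmoid -> win probability
    return z1, h, z2, yhat

# Tiny example: N = 2 samples, D = 7 features, H = 4 hidden units
rng = np.random.default_rng(0)
X = rng.normal(size=(2, 7))
W1, b1 = 0.1 * rng.normal(size=(7, 4)), np.zeros(4)
W2, b2 = 0.1 * rng.normal(size=(4, 1)), np.zeros(1)
_, _, _, yhat = forward(X, W1, b1, W2, b2)
assert yhat.shape == (2, 1) and np.all((yhat > 0) & (yhat < 1))
```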

Why tanh and sigmoid

  • tanh introduces nonlinearity and has a simple derivative
  • sigmoid maps logits to probabilities
  • binary cross entropy pairs naturally with sigmoid

Loss function

Binary cross entropy:

L = -mean(y * log(yhat) + (1 - y) * log(1 - yhat))
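A direct NumPy translation of that formula (the eps clipping is a common safeguard against log(0), assumed here rather than taken from the repository):

```python
import numpy as np

def binary_cross_entropy(yhat, y, eps=1e-12):
    """Mean binary cross entropy: -mean(y*log(yhat) + (1-y)*log(1-yhat))."""
    yhat = np.clip(yhat, eps, 1 - eps)  # keep log() finite at 0 and 1
    return -np.mean(y * np.log(yhat) + (1 - y) * np.log(1 - yhat))

# Confident correct predictions give low loss; confident wrong ones give high loss
y = np.array([1.0, 0.0])
good = binary_cross_entropy(np.array([0.9, 0.1]), y)
bad = binary_cross_entropy(np.array([0.1, 0.9]), y)
assert good < bad
```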

Backpropagation

Backpropagation is implemented manually using the chain rule.

Key identity used (it follows from pairing a sigmoid output with mean binary cross entropy): dL/dz2 = (yhat - y) / N

From there:

  • gradients flow backward to W2 and b2
  • then through tanh using (1 - h²)
  • then to W1 and b1

Every gradient is computed explicitly.
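The steps above can be sketched as follows (variable names follow the forward-pass diagram; this is a minimal sketch, not the repository's exact code):

```python
import numpy as np

def backward(X, y, h, yhat, W2):
    """Manual backprop for the single-hidden-layer network.
    Starts from dL/dz2 = (yhat - y) / N, then applies the chain rule."""
    N = X.shape[0]
    dz2 = (yhat - y) / N        # (N, 1) gradient at the output logit
    dW2 = h.T @ dz2             # (H, 1)
    db2 = dz2.sum(axis=0)       # (1,)
    dh = dz2 @ W2.T             # (N, H) gradient flowing into the hidden layer
    dz1 = dh * (1 - h**2)       # tanh derivative: d tanh(z)/dz = 1 - tanh(z)^2
    dW1 = X.T @ dz1             # (D, H)
    db1 = dz1.sum(axis=0)       # (H,)
    return dW1, db1, dW2, db2
```

A gradient descent step then updates each parameter in place, e.g. W1 -= lr * dW1 and likewise for b1, W2, b2.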


Results

Here is the learning curve showing training and validation loss over epochs:

[Learning curve figure: training and validation loss per epoch]
