Training a neural net to predict betting odds for a baseball game at any stage in the game.

TimHanewich/Baseball-Betting-NN

Predicting MLB Betting Lines with Neural Networks

This project leverages TensorFlow's Keras API to compile, train, and use a neural network to predict betting lines in a baseball game. Training data is gathered from ESPN (the state, or inputs) and DraftKings (the prediction, the outputs).

Simply put, for any point in a hypothetical game of baseball, this model predicts what the standard betting lines for the game should be, and can therefore predict a winner at any point before or during a game.

Baseball Betting Line Prediction Engine

How to run the model

  1. Download and unzip one of the pre-trained models in the Model Downloads section below.
  2. In ui.py, set the nn_model_path variable to the path of the model folder inside the unzipped folder.
  3. Install the required dependencies:
    1. tensorflow: python -m pip install tensorflow
  4. Run ui.py to start the model and its GUI.

Model Inputs (the State)

The model considers the following 16 inputs when predicting betting lines, in this order:

  • Away team record, as a winning fraction (e.g. 0.8 if the team is 8-2, meaning it has won 8 of its 10 games)
  • Home team record, in the same format
  • Number of runs the away team has
  • Number of runs the home team has
  • Number of hits the away team has
  • Number of hits the home team has
  • Number of errors the away team has
  • Number of errors the home team has
  • The current inning, e.g. 1.0 for the first inning, 2.0 for the second, and so on. A value of 0.0 means the game has not started yet and the betting lines are pre-game lines.
  • Top or bottom of inning? Top = 0.0, Bottom = 1.0
  • Number of outs
  • Number of balls in the batter's count
  • Number of strikes in the batter's count
  • Is there a runner on first base? No = 0.0, Yes = 1.0
  • Is there a runner on second base? No = 0.0, Yes = 1.0
  • Is there a runner on third base? No = 0.0, Yes = 1.0
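
The ordering above can be sketched as a small Python container that turns a game situation into the 16-value input vector. This is only an illustration: the `GameState` class, its field names, and the `winning_fraction` helper are hypothetical and do not come from this repo, and returning 0.0 for a team with no games played is an assumption.

```python
from dataclasses import dataclass, astuple


@dataclass
class GameState:
    """Hypothetical container; field order mirrors the 16 inputs listed above."""
    away_record: float        # winning fraction, e.g. 0.8 for an 8-2 team
    home_record: float
    away_runs: float = 0.0
    home_runs: float = 0.0
    away_hits: float = 0.0
    home_hits: float = 0.0
    away_errors: float = 0.0
    home_errors: float = 0.0
    inning: float = 0.0       # 0.0 means the game has not started
    bottom: float = 0.0       # top = 0.0, bottom = 1.0
    outs: float = 0.0
    balls: float = 0.0
    strikes: float = 0.0
    on_first: float = 0.0     # no = 0.0, yes = 1.0
    on_second: float = 0.0
    on_third: float = 0.0

    def to_vector(self) -> list:
        """Return the fields as floats, in declaration order."""
        return [float(v) for v in astuple(self)]


def winning_fraction(wins: int, losses: int) -> float:
    """Wins divided by games played; 0.0 with no games is an assumption."""
    games = wins + losses
    return wins / games if games else 0.0


# Bottom of the 3rd, away team leads 2-1, one out, 2-1 count, runner on first.
state = GameState(
    away_record=winning_fraction(8, 2),
    home_record=winning_fraction(5, 5),
    away_runs=2, home_runs=1,
    away_hits=4, home_hits=3,
    inning=3, bottom=1,
    outs=1, balls=2, strikes=1,
    on_first=1,
)
vector = state.to_vector()
print(len(vector))  # 16
```

Keeping the order in one place, here the dataclass declaration, makes it harder to swap two inputs by accident when building a vector by hand.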

Model Outputs (the Prediction)

The model predicts four distinct betting lines, in the following order. You can read more about what each of these lines means here.

  • The Run Line - the "point spread" between the two teams.
  • The Total Line - the over/under for the combined number of runs scored in the game.
  • The Away Team's Money Line
  • The Home Team's Money Line

Using the odds above, particularly the money lines, we can calculate the implied win probability for either team.
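
As a rough sketch of that calculation, the snippet below converts money lines to implied win probabilities. It assumes the money lines are in American odds format (e.g. -150 or +130), which this README does not state explicitly. The function names are hypothetical, and the normalization step that makes the two probabilities sum to 1 is one common way to strip out the bookmaker's margin, not necessarily what this project does.

```python
def implied_probability(moneyline: float) -> float:
    """Implied win probability for an American-format money line (assumed format)."""
    if moneyline < 0:
        # Favorite: risk |moneyline| to win 100.
        return -moneyline / (-moneyline + 100.0)
    # Underdog: risk 100 to win moneyline.
    return 100.0 / (moneyline + 100.0)


def normalized_probabilities(away_ml: float, home_ml: float) -> tuple:
    """Scale the two implied probabilities so they sum to 1."""
    away = implied_probability(away_ml)
    home = implied_probability(home_ml)
    total = away + home
    return away / total, home / total


away_p, home_p = normalized_probabilities(130, -150)
print(round(implied_probability(-150), 3))  # 0.6
print(round(away_p, 3), round(home_p, 3))
```

The raw implied probabilities of both teams usually add up to slightly more than 1 because of the bookmaker's margin, which is why the second function rescales them.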

Model Downloads

| Name | Parameters | Description |
|---|---|---|
| model4 | 7,469 | Trained on ~5,300 examples. Warning: trained on data that likely contained errors. |
| model5 | 378,274 | Trained on 5,473 examples. Warning: trained on data that likely contained errors. |
| model8 | 378,274 | Trained on 14,594 examples. |

Training Data Downloads

These are .jsonl files. Each line is a self-contained JSON object containing both the state (game scenario) and the betting lines observed in the real world.

| Number of Examples | Size | Description |
|---|---|---|
| 5,473 | | Warning: likely contains errors. |
| 16,515 | 2 MB | Warning: likely contains errors. |
| 14,594 | 1.7 MB | |
| 41,037 | 4.8 MB | |
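
Because each line of a .jsonl file is its own JSON object, the training data can be read one line at a time. The sketch below shows a minimal reader. The `read_jsonl` name is hypothetical, and the `"state"` and `"lines"` keys in the sample are placeholders for illustration only, not the actual schema of the files above.

```python
import io
import json


def read_jsonl(stream):
    """Yield one parsed JSON object per non-empty line of a text stream."""
    for line in stream:
        line = line.strip()
        if line:
            yield json.loads(line)


# Hypothetical two-record sample; the key names are placeholders, not the real schema.
sample = io.StringIO(
    '{"state": [0.8, 0.5], "lines": [-1.5, 8.5]}\n'
    '\n'
    '{"state": [0.6, 0.4], "lines": [1.5, 9.0]}\n'
)
records = list(read_jsonl(sample))
print(len(records))  # 2
```

For a real download, the same function can be given a file opened with `open(path, encoding="utf-8")` in place of the in-memory sample.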

In this Repo

This repo contains the following programs:

Future Areas of Improvement

  • When a batter walks, ESPN temporarily reports 4 balls in the count AND a man on second. If the count shows 4 balls and a man is on base, it should be counted as 0 balls.
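
A minimal sketch of that proposed fix is shown below. The function name and signature are hypothetical, and it treats "a man is on" as a runner on any base, which is one interpretation of the rule above.

```python
def normalize_ball_count(balls: int, on_first: bool, on_second: bool, on_third: bool) -> int:
    """Reset a transient 4-ball count to 0 when a runner is on base.

    Assumption: "a man is on" means a runner on any of the three bases.
    """
    runner_on = on_first or on_second or on_third
    if balls >= 4 and runner_on:
        return 0
    return balls


print(normalize_ball_count(4, False, True, False))  # 0
print(normalize_ball_count(2, False, True, False))  # 2
```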
