
Data science analyses delving into National Hockey League (NHL) ice hockey statistics


justinjjlee/NHL-Analytics


NHL Analytics: An Ice Hockey Sports Analytics Platform Based on National Hockey League (NHL) Data

If you find my work to be useful, please star this repository!


This is a collection of methods for collecting, compiling, cleaning, analyzing, modeling, and predicting team and player (skaters and goalies) performances and strategies. This repository does not claim ownership of the data and reflects the perspectives of the organizations or entities mentioned. All original code (including generic and model algorithms) may be used freely, provided proper citation and credit are given to this repository.

Analysis & Insights

All of my analyses and deep-dive insights are written up and presented on my Medium blog.

Goals & Capabilities

I use publicly available data to build analytics capabilities and generate insights that go beyond easily measured headline statistics.

  • Capture the complex strategic, behavioral, and performance trends that fans of the sport ask about
  • Integrate different data sources (e.g. college hockey rosters, to build performance trends that extend beyond players' professional careers)

I hope the work saved in this repository enables replication, exploration, and the development of new measurements and insights.

Applied Tools

These are the capabilities I use for data collection, processing, and analysis to derive insights, data visualizations, and predictive models.

Capability Tools used
General
Data Collection & Processing
ML Model Build
Interactive Data Visualization

Data Pull & Process Automation with GitHub Actions

GitHub Actions is used to update the data saved in this repository's ./latest/ folder. The data collection runs every day.

  • Team-level rank
  • Game-level stats
  • Game-level betting odds
  • Play-by-play records
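
As a rough illustration of the daily update described above, the sketch below overwrites one CSV file per dataset in a `latest` folder. It is not the repository's actual collection code: the function name `write_latest`, the dataset names in `DATASETS`, and the CSV layout are all hypothetical, and fetching the data itself is left out.

```python
import csv
from pathlib import Path

# Hypothetical dataset names, one per item in the list above;
# the repository's real file names may differ.
DATASETS = ("team_rank", "game_stats", "game_odds", "play_by_play")


def write_latest(rows_by_dataset, out_dir="latest"):
    """Overwrite one CSV per dataset in out_dir and return the written paths."""
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)
    written = []
    for name in DATASETS:
        # A dataset with no new rows still gets an (empty) file,
        # so the folder always holds the same set of files.
        rows = rows_by_dataset.get(name, [])
        path = out / f"{name}.csv"
        with path.open("w", newline="") as fh:
            csv.writer(fh).writerows(rows)
        written.append(path)
    return written
```

A scheduled job could call `write_latest` once per day after collecting the rows, so that the folder always reflects the most recent pull.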

The required package versions are saved in ./src/requirement through a .sh command. Note that Python function imports are resolved relative to where the script is located, whereas data file references are resolved relative to the GitHub repository's head (top-level) directory.
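
The two path conventions in the note above can be sketched as follows. This is a minimal illustration under assumptions, not the repository's own code: `resolve_paths` is a hypothetical helper, and the example script path is made up.

```python
from pathlib import Path


def resolve_paths(script_file, repo_root):
    """Return (module_dir, data_dir) under the two conventions.

    module_dir: where helper functions are imported from,
                i.e. the folder that contains the running script.
    data_dir:   where data files are read and written,
                i.e. the `latest` folder under the repository root.
    """
    module_dir = Path(script_file).resolve().parent
    data_dir = Path(repo_root).resolve() / "latest"
    return module_dir, data_dir


# Typical use inside a script started from the repository root
# (assumption: the working directory is the repository root):
#   module_dir, data_dir = resolve_paths(__file__, Path.cwd())
#   sys.path.insert(0, str(module_dir))  # makes sibling modules importable
```

Keeping the two base directories separate means a script can be moved within the source tree without breaking its data paths, as long as it is still launched from the repository root.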

   ,
    -   \O                                     ,  .-.___
  -     /\                                   O/  /xx\XXX\
 -   __/\ `\                                 /\  |xx|XXX|
    `    \, \_ =                          _/` << |xx|XXX|
""""""""""""""""""""""""""""""""""""""""""""""""""""""""""