<a id="readme-top"></a>

OpenDota Data Warehouse

An all-in-one data warehouse built on the API provided by OpenDota.com.

Table of Contents
  1. About The Project
  2. Getting Started
  3. Usage
  4. Roadmap
  5. Contributing
  6. License
  7. Contact
  8. Acknowledgments

About The Project

This project aims to provide an all-in-one ETL process to deliver a data warehouse for use in analytics and reporting on recent Dota matches involving professional players.


An example of the data structure within the Matches fact table.

Getting Started

When executed, the script runs a series of GET requests against the OpenDota API, stages the JSON responses locally, then transforms them into a Kimball dimensional model.
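As a rough illustration of the fetch-and-stage step, here is a minimal sketch using only the standard library. The endpoint name (`proMatches`), the helper names `fetch_json` and `stage_json`, and the timestamped file naming are assumptions made for this example; the project's own code may be organised differently.

```python
import json
import time
import urllib.request
from pathlib import Path

# Assumed base URL of the public OpenDota API; the routes used by the project are not listed here.
API_BASE = "https://api.opendota.com/api"


def fetch_json(endpoint: str, timeout: float = 30.0):
    """GET one OpenDota endpoint and return the decoded JSON payload."""
    with urllib.request.urlopen(f"{API_BASE}/{endpoint}", timeout=timeout) as resp:
        return json.load(resp)


def stage_json(payload, staging_folder: str, name: str) -> Path:
    """Write a payload to <staging_folder>/<name>_<unix timestamp>.json (hypothetical naming)."""
    folder = Path(staging_folder)
    folder.mkdir(parents=True, exist_ok=True)  # create the staging folder on first run
    target = folder / f"{name}_{int(time.time())}.json"
    target.write_text(json.dumps(payload), encoding="utf-8")
    return target
```

Under these assumptions, `stage_json(fetch_json("proMatches"), "Staging/", "pro_matches")` would land one raw file in the staging folder, ready for the transform step.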

The model currently contains three dimensions and one fact table: Dim_Player, Dim_Item, Dim_Hero, and Fact_Matches. Dim_Item is not yet connected to the star schema; this is a planned addition.
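To make the star schema concrete, the sketch below builds a toy version in an in-memory SQLite database and joins the fact table to its two connected dimensions. The table names come from the model; every column name (`player_key`, `hero_key`, `win`, and so on) and every row is made up for illustration and may not match the real output.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE Dim_Player (player_key INTEGER PRIMARY KEY, player_name TEXT);
    CREATE TABLE Dim_Hero   (hero_key   INTEGER PRIMARY KEY, hero_name   TEXT);
    CREATE TABLE Fact_Matches (
        match_id   INTEGER,
        player_key INTEGER REFERENCES Dim_Player (player_key),
        hero_key   INTEGER REFERENCES Dim_Hero (hero_key),
        win        INTEGER  -- 1 if the player won the match, else 0
    );
    INSERT INTO Dim_Player VALUES (1, 'player_a'), (2, 'player_b');
    INSERT INTO Dim_Hero   VALUES (10, 'hero_x'), (20, 'hero_y');
    INSERT INTO Fact_Matches VALUES (100, 1, 10, 1), (100, 2, 20, 0), (101, 1, 20, 1);
    """
)

# Resolve the surrogate keys in the fact table to readable names from the dimensions.
rows = conn.execute(
    """
    SELECT f.match_id, p.player_name, h.hero_name, f.win
    FROM Fact_Matches AS f
    JOIN Dim_Player AS p ON p.player_key = f.player_key
    JOIN Dim_Hero   AS h ON h.hero_key   = f.hero_key
    ORDER BY f.match_id, p.player_name
    """
).fetchall()
```

Each fact row stores only keys and measures; descriptive attributes live in the dimensions, which is what keeps the fact table narrow.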

The project includes a config file, config.py, which can be edited to adjust the output folder structure and the preferred output file type. The parameters are as follows:

Staging Folder

The name of the folder that will be created to store the raw .json data. The default file path is Staging/. It can be altered by replacing Staging/ with your desired staging file path within config.py:

# the folder in which the staging json files will land
staging_folder = base_file_path + 'Staging/'

Tables Folder

The name of the folder that will be created to store the dimension and fact tables. The default file path is Tables/. It can be altered by replacing Tables/ with your desired tables file path within config.py:

# the folder in which the star schema model will land
tables_folder = base_file_path + 'Tables/'

Output file type

The output file type of the dimension and fact tables. The current options are xlsx, parquet (using gzip compression), and csv. To change the output file type, replace xlsx with one of the other options in the following code within config.py:

# the desired output file type of the star schema
output_file_type = 'xlsx'

Output into single excel file

If the output file type is set to xlsx, this option determines whether the tables are written as sheets within a single workbook or as separate workbooks, one per table. To write separate workbooks, replace True with False in the following code within config.py:

# if xlsx is the desired output type, this will determine whether the tables are loaded
# to individual files or whether they will be loaded as sheets into the same file
output_file_excel_single_file = True
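The way these settings interact can be sketched as a small planning function that maps each output file to the tables it would hold. This is an illustration only: `plan_outputs`, the workbook name `star_schema.xlsx`, and the per-table file naming are hypothetical and not taken from the project's code.

```python
from pathlib import Path

SUPPORTED_TYPES = {"xlsx", "parquet", "csv"}


def plan_outputs(table_names, tables_folder, output_file_type, excel_single_file):
    """Map each output file path to the list of tables it would contain."""
    if output_file_type not in SUPPORTED_TYPES:
        raise ValueError(f"unsupported output_file_type: {output_file_type!r}")
    folder = Path(tables_folder)
    if output_file_type == "xlsx" and excel_single_file:
        # One workbook with one sheet per table (workbook name is an assumption).
        return {folder / "star_schema.xlsx": list(table_names)}
    # Otherwise one file per table, named after the table.
    return {folder / f"{name}.{output_file_type}": [name] for name in table_names}
```

Note that the single-file flag only has an effect when the type is xlsx; for csv or parquet the sketch always produces one file per table.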

Prerequisites

A basic knowledge of running Python code, and optionally of changing parameters before execution, is required. This is best done in an IDE such as VS Code.

Ensure that pip is installed and up to date:

Windows:
  py -m ensurepip --upgrade
Linux:
  python -m ensurepip --upgrade
macOS:
  python -m ensurepip --upgrade

Installation

  1. Clone the main branch from the repo into your desired IDE.
   git clone https://github.com/chasimm3/dota_project.git
  2. Open config.py and make any desired changes to the optional parameters.
  3. Save and close config.py.
  4. Execute the main.py file.
  5. Once complete, the files will be available in the structure specified in config.py.

(back to top)

Usage

The primary use of this project is analysing trends among professional Dota players. The data can be connected to any data analysis tool (e.g. Power BI, Jupyter) to enable in-depth analysis of how hero choice affects win probability.
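As one example of the kind of analysis this enables, the sketch below computes a win rate per hero from CSV rows. It assumes an export of Fact_Matches that already carries a `hero_name` and a `win` column; both column names and the sample rows are invented for the example.

```python
import csv
import io
from collections import defaultdict


def win_rate_by_hero(fact_rows):
    """Return {hero: wins / games} from rows that carry `hero_name` and `win` fields."""
    games = defaultdict(int)
    wins = defaultdict(int)
    for row in fact_rows:
        games[row["hero_name"]] += 1
        wins[row["hero_name"]] += int(row["win"])
    return {hero: wins[hero] / games[hero] for hero in games}


# Stand-in for a CSV export of the fact table joined to the hero dimension (made-up rows).
sample_csv = io.StringIO(
    "match_id,hero_name,win\n"
    "100,hero_x,1\n"
    "100,hero_y,0\n"
    "101,hero_x,0\n"
    "102,hero_x,1\n"
)
rates = win_rate_by_hero(csv.DictReader(sample_csv))
```

With a real export, replacing `sample_csv` with an open file handle on the csv output would be enough to reuse the function.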

(back to top)

API Usage Limits

Because the code uses the free API provided by OpenDota.com, it is limited to:

  • 60 requests per min
  • 2,000 requests per day
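A client can stay within these limits by throttling itself before each request. The following is a minimal sketch of a sliding-window limiter; the class name `RateLimiter` is hypothetical, and treating both limits as rolling windows is an assumption, since the way OpenDota counts and resets usage may differ. The counter also lives only in memory, so the daily cap is not tracked across separate runs.

```python
import time
from collections import deque


class RateLimiter:
    """Allow at most `per_minute` calls per 60 s and `per_day` calls per 24 h."""

    def __init__(self, per_minute=60, per_day=2000, clock=time.monotonic, sleep=time.sleep):
        self.per_minute = per_minute
        self.per_day = per_day
        self.clock = clock
        self.sleep = sleep
        self.calls = deque()  # timestamps of calls made within the last 24 hours

    def acquire(self):
        """Block until one more request is allowed, then record it."""
        while True:
            now = self.clock()
            # Forget calls that are at least one day old.
            while self.calls and now - self.calls[0] >= 86400:
                self.calls.popleft()
            if len(self.calls) >= self.per_day:
                # Daily budget spent: wait until the oldest call leaves the window.
                wait = 86400 - (now - self.calls[0])
            else:
                recent = [t for t in self.calls if now - t < 60]
                if len(recent) < self.per_minute:
                    self.calls.append(now)
                    return
                # Minute budget spent: wait until the oldest recent call expires.
                wait = 60 - (now - recent[0])
            self.sleep(wait)
```

Calling `limiter.acquire()` immediately before each GET request would pace a long run automatically. The `clock` and `sleep` parameters exist so the behaviour can be checked without real waiting.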

There are currently no plans to support the premium API tier. If you would like this functionality added, feel free to raise a feature request at: https://github.com/chasimm3/dota_project/issues/new?labels=enhancement&template=feature-request---.md

Roadmap

  • Enable choice of output file type; currently the only available format is .csv.
    • Parquet
    • JSON
    • xlsx
  • Import additional data from the API, including in-depth match stats.
  • Build up a suite of Power BI reports to give users a starting point.
  • Update Fact_Matches to pull from all staging files rather than the latest only.

See the open issues for a full list of proposed features (and known issues).

(back to top)

Contributing

Contributions are what make the open source community such an amazing place to learn, inspire, and create. Any contributions you make are greatly appreciated.

If you have a suggestion that would make this better, please fork the repo and create a pull request. You can also simply open an issue with the tag "enhancement".

  1. Fork the Project
  2. Create your Feature Branch (git checkout -b feature/AmazingFeature)
  3. Commit your Changes (git commit -m 'Add some AmazingFeature')
  4. Push to the Branch (git push origin feature/AmazingFeature)
  5. Open a Pull Request

(back to top)

License

Distributed under the MIT License. See LICENSE.txt for more information.

(back to top)

Contact

Charlie Simmons - charlie.simmons92@gmail.com

Project Link: https://github.com/chasimm3/dota_project

(back to top)

Acknowledgments

(back to top)