- Dota 2 Matches from Kaggle
- Public match data fetched using the OpenDota API
Predicting the outcome of a Dota 2 match at an arbitrary point in time is notoriously difficult, even with rich in-game data such as net worth advantage, team kills, creep scores, hero levels, hero picks and tower scores. Valve's Dota Plus win probability during professional games illustrates this well: the probability often fluctuates heavily around the 50% mark for most of the game, so the calculated outcome stays uncertain even with a state-of-the-art model.
In lower-ranked public games there are more consistent trends for a model to pick up on. This is reflected in per-hero win rates over a specific major patch on a source like Dotabuff.
However, lower rank also means lower skill: a hero's countering potential is sometimes lost to weaker mechanics and subpar or ineffective item builds.
This suggests that the picked heroes alone would not suffice for an accurate prediction of the result; adding the items purchased by each player to the feature set should yield more accurate predictions on a test set.
To quantify the difference between hero features and hero-plus-item features, the same models were trained and benchmarked on both feature sets.
I decided to use two separate sources to verify that the inference holds irrespective of source. The Kaggle dataset contains older matches from 2019 across a wide skill bracket, while the OpenDota dataset contains more recent matches of above-average skill.
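For reference, a minimal sketch of how the OpenDota matches might be pulled, assuming the public `/publicMatches` endpoint and its `less_than_match_id` pagination parameter (verify the exact endpoint and field names against the current API docs):

```python
import time
import requests

# Sketch only: endpoint and field names (match_id, radiant_team, dire_team,
# radiant_win) are assumptions about the public /publicMatches endpoint.
API_URL = "https://api.opendota.com/api/publicMatches"

def fetch_public_matches(n_pages=5, pause=1.0):
    matches, last_id = [], None
    for _ in range(n_pages):
        params = {"less_than_match_id": last_id} if last_id else {}
        resp = requests.get(API_URL, params=params, timeout=10)
        resp.raise_for_status()
        page = resp.json()
        if not page:
            break
        matches.extend(page)
        last_id = min(m["match_id"] for m in page)  # paginate backwards in time
        time.sleep(pause)  # stay within the free-tier rate limit
    return matches
```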
The hero features were one-hot encoded, with each hero getting separate Radiant and Dire slots to account for side bias and keep the teams separated.
The Kaggle heroes-only dataset contains 50,000 rows and 222 features, while the heroes-and-items dataset contains 50,000 rows and 612 features; both use a 10% test split.
The OpenDota dataset contains 50,000 rows and 242 features (more heroes, since it is more recent) with a 5% test split.
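A sketch of the encoding described above, assuming each match is stored as lists of Radiant and Dire hero IDs plus a win label (these variable names are hypothetical):

```python
import numpy as np
from sklearn.model_selection import train_test_split

N_HEROES = 130  # upper bound on hero IDs; adjust for the patch in question

def encode_match(radiant_ids, dire_ids):
    # One slot per hero per side: indices [0, N_HEROES) are Radiant picks,
    # [N_HEROES, 2 * N_HEROES) are Dire picks.
    x = np.zeros(2 * N_HEROES, dtype=np.float32)
    for h in radiant_ids:
        x[h] = 1.0
    for h in dire_ids:
        x[N_HEROES + h] = 1.0
    return x

# matches: list of (radiant_ids, dire_ids, radiant_win) tuples (hypothetical)
def build_dataset(matches, test_size=0.1):
    X = np.stack([encode_match(r, d) for r, d, _ in matches])
    y = np.array([win for _, _, win in matches], dtype=np.int64)
    return train_test_split(X, y, test_size=test_size, random_state=42)
```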
All machine learning algorithms were trained after hyperparameter tuning with scikit-learn's GridSearchCV (a sketch follows the list below):
- Decision Tree Classifier
- Logistic Regression
- Stochastic Gradient Descent SVM
- Linear Support Vector Machine
- Gaussian Naive Bayes
- XGBClassifier
- Random Forest Classifier
- Multi-Layer Perceptron
- Soft-Voting Ensemble (LR, GNB, XGB, RFC)
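As an illustration, the tuning step might look like the following for one of the estimators; the parameter grid values here are placeholders, not the grids actually used:

```python
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import GridSearchCV

# Hypothetical grid for the Logistic Regression model; the other estimators
# were tuned the same way with their own parameter grids.
param_grid = {
    "C": [0.01, 0.1, 1.0, 10.0],
    "penalty": ["l2"],
    "solver": ["lbfgs", "liblinear"],
}

search = GridSearchCV(
    LogisticRegression(max_iter=1000),
    param_grid,
    scoring="accuracy",
    cv=5,
    n_jobs=-1,
)
search.fit(X_train, y_train)       # X_train / y_train from the split above
best_lr = search.best_estimator_   # tuned model used for evaluation
```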
Results with hero features only:

Algorithm | Accuracy | Precision | Recall |
---|---|---|---|
Decision Tree | 55% | 57% | 67% |
Logistic Regression | 59% | 61% | 65% |
Stochastic Gradient Descent SVM | 59% | 62% | 62% |
Linear SVM | 59% | 61% | 64% |
Gaussian Naive Bayes | 60% | 60% | 70% |
XGB Classifier | 59% | 61% | 64% |
Random Forest | 59% | 60% | 69% |
Multi-layer Perceptron | 59-60% | - | - |
Soft-Voting Ensemble | 62% | - | - |
Results with hero and item features:

Algorithm | Accuracy | Precision | Recall |
---|---|---|---|
Decision Tree | 83% | 85% | 84% |
Logistic Regression | 97% | 97% | 97% |
Stochastic Gradient Descent SVM | 97% | 98% | 96% |
Linear SVM | 97% | 97% | 97% |
Gaussian Naive Bayes | 86% | 86% | 88% |
XGB Classifier | 95% | 96% | 94% |
Random Forest | 95% | 96% | 94% |
Multi-layer Perceptron | 97-99% | - | - |
The results make it clear that heroes alone are not enough to reliably predict the outcome of a match; the factors that come into play during the match itself carry far more weight. One of these factors is the items purchased, and adding them produced much better models across the board. The best model on the heroes-only dataset was the soft-voting ensemble of Logistic Regression, Gaussian Naive Bayes, XGBClassifier and Random Forest, which reached 62% accuracy on the test set, respectable given the lack of strongly deciding features. Meanwhile, the models trained on heroes and items ranged from a worst case of 83% to roughly 99% test accuracy with the multi-layer perceptron.
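A sketch of how the soft-voting ensemble mentioned above can be assembled with scikit-learn; the individual estimators would use their tuned hyperparameters rather than these defaults:

```python
from sklearn.ensemble import RandomForestClassifier, VotingClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.naive_bayes import GaussianNB
from xgboost import XGBClassifier

# Soft voting averages the predicted class probabilities of the four models.
ensemble = VotingClassifier(
    estimators=[
        ("lr", LogisticRegression(max_iter=1000)),
        ("gnb", GaussianNB()),
        ("xgb", XGBClassifier(eval_metric="logloss")),
        ("rfc", RandomForestClassifier(n_estimators=300)),
    ],
    voting="soft",
)
ensemble.fit(X_train, y_train)
print(ensemble.score(X_test, y_test))  # accuracy on the held-out split
```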