This project analyzes the transfer market for women's and men's football, using machine learning models to predict player market values. As the popularity of women's football grows, the transfer market has become more complex: transfer fees are rising, and data analytics plays a crucial role in decision-making.
- Significant rise in both transfer numbers and fees.
- Example: Transfer fees reached $3 million in summer 2022, marking a 140.8% increase.
Data-driven decision-making can enhance transfer success compared to intuition-based approaches.
Alternative data sources, such as video games (e.g., FIFA), are used to predict player performance because access to comprehensive player data is limited.
The objective is to evaluate machine learning models for predicting the market value of women footballers, giving stakeholders actionable insights for strategic decisions in the transfer market.
- Source: Data collected from sofifa.com.
- Process: A Python script scrapes the data, including player ratings, positions, skills, and other attributes.
- Output: The data is stored in `womendataset.csv`.
- Data Split:
  - Training: 67.4%
  - Testing: 15%
  - Validation: 17.6%
- Preprocessing: Filling missing values and applying Min-Max normalization.
- Models: Gradient Boosting, Random Forest, K-Nearest Neighbors (KNN), Decision Tree, AdaBoost, and Linear Regression (included to show that a linear model is not well suited to this problem).
- Optimization: Bayesian Optimization and Grid Search for hyperparameter tuning.
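The split, preprocessing, and tuning steps listed above could look roughly like the following. This is a minimal sketch, not the project's actual code: the data is synthetic, the column names (`overall`, `potential`, `pace`, `shooting`) are hypothetical stand-ins for the scraped features, median imputation is an assumed way of filling missing values, and the parameter grid is illustrative. Only Grid Search is shown; Bayesian Optimization needs an additional library and is left out.

```python
import numpy as np
import pandas as pd
from sklearn.ensemble import GradientBoostingRegressor
from sklearn.impute import SimpleImputer
from sklearn.metrics import mean_absolute_error
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import MinMaxScaler


def split_dataset(X, y, test_frac=0.15, val_frac=0.176, seed=0):
    """Split into train / validation / test parts (67.4% / 17.6% / 15%)."""
    n = len(X)
    n_test = round(test_frac * n)
    n_val = round(val_frac * n)
    # First carve out the test set, then the validation set from the rest.
    X_rest, X_test, y_rest, y_test = train_test_split(
        X, y, test_size=n_test, random_state=seed
    )
    X_train, X_val, y_train, y_val = train_test_split(
        X_rest, y_rest, test_size=n_val, random_state=seed
    )
    return X_train, X_val, X_test, y_train, y_val, y_test


# Synthetic stand-in for womendataset.csv (hypothetical columns).
rng = np.random.default_rng(0)
X = pd.DataFrame(
    rng.uniform(40, 95, size=(1000, 4)),
    columns=["overall", "potential", "pace", "shooting"],
)
y = 1000 * np.exp(0.08 * X["overall"]) + rng.normal(0, 500, 1000)
X = X.mask(rng.random(X.shape) < 0.02)  # blank out ~2% of cells

X_train, X_val, X_test, y_train, y_val, y_test = split_dataset(X, y)

# Imputation and Min-Max scaling live inside the pipeline, so they are
# fitted on training folds only and never see validation or test rows.
pipeline = Pipeline(
    [
        ("impute", SimpleImputer(strategy="median")),  # assumed strategy
        ("scale", MinMaxScaler()),
        ("model", GradientBoostingRegressor(random_state=0)),
    ]
)

# Illustrative grid; the project's real search space is not documented here.
param_grid = {"model__n_estimators": [50, 100], "model__max_depth": [2, 3]}
search = GridSearchCV(
    pipeline, param_grid, cv=3, scoring="neg_mean_absolute_error"
)
search.fit(X_train, y_train)

val_mae = mean_absolute_error(y_val, search.predict(X_val))
print("best params:", search.best_params_)
print("validation MAE:", round(val_mae, 1))
```

Passing integer counts to `train_test_split` avoids rounding surprises when the second split is expressed as a fraction of the remaining rows.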
The following Python libraries were utilized in this project:
- sklearn: For machine learning algorithms and model evaluation. It includes a range of supervised and unsupervised learning algorithms, as well as tools for model evaluation and selection.
- seaborn: For creating attractive and informative statistical graphics. It simplifies data visualization and exploration.
- matplotlib.pyplot: For plotting graphs and visualizing data. It is a versatile library used for creating a wide range of static, animated, and interactive plots.
- pandas: For data manipulation and analysis. It provides data structures and functions needed to work on structured data seamlessly.
- numpy: For numerical computations and array operations. It supports large, multi-dimensional arrays and matrices, along with a large collection of mathematical functions to operate on these arrays.
- Best Model: Gradient Boosting achieved the highest accuracy.
- Fastest Model: KNN was the fastest in training.
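A comparison of this kind can be set up by fitting each model on the same split and recording both an error metric and the training time. The sketch below runs on synthetic data from `make_regression`, so its numbers say nothing about the project's results. The choice of R² and mean absolute error as metrics is an assumption, as are the default hyperparameters.

```python
import time

from sklearn.datasets import make_regression
from sklearn.ensemble import (
    AdaBoostRegressor,
    GradientBoostingRegressor,
    RandomForestRegressor,
)
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_absolute_error, r2_score
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsRegressor
from sklearn.tree import DecisionTreeRegressor

# Synthetic regression problem standing in for the player dataset.
X, y = make_regression(n_samples=600, n_features=8, noise=10.0, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.15, random_state=0
)

models = {
    "Gradient Boosting": GradientBoostingRegressor(random_state=0),
    "Random Forest": RandomForestRegressor(random_state=0),
    "KNN": KNeighborsRegressor(),
    "Decision Tree": DecisionTreeRegressor(random_state=0),
    "AdaBoost": AdaBoostRegressor(random_state=0),
    "Linear Regression": LinearRegression(),
}

results = {}
for name, model in models.items():
    start = time.perf_counter()
    model.fit(X_train, y_train)
    fit_seconds = time.perf_counter() - start  # training time only
    pred = model.predict(X_test)
    results[name] = {
        "fit_seconds": fit_seconds,
        "r2": r2_score(y_test, pred),
        "mae": mean_absolute_error(y_test, pred),
    }

# Print models from best to worst R^2 on the held-out split.
for name, row in sorted(results.items(), key=lambda kv: kv[1]["r2"], reverse=True):
    print(
        f"{name:<18} R2={row['r2']:.3f}  MAE={row['mae']:.1f}  "
        f"fit={row['fit_seconds'] * 1000:.1f} ms"
    )
```

KNN does almost no work at fit time because it mostly stores the training data, which is consistent with it training quickly; its cost shifts to prediction.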
Machine learning models, especially Gradient Boosting, effectively predict the market values of female football players. This provides clubs and agents with valuable insights for making informed transfer decisions.
The study highlights the effectiveness of machine learning models in predicting football player market values. Further research is necessary to explore the broader applications of these findings in the football transfer market.
In the first part of our project, we present the results of our ML model to users through a UI that displays the predicted values of women's football players. We tried to make the interface as user-friendly as possible.
The functionality of the search button in our project interface is demonstrated.
Players can be sorted by features such as Position, Age, Club, Nationality, and Value. Here, sorting by Age is demonstrated.
Clicking the "View Stats" button displays a player's statistics.
To view men's football players' statistics and predicted values, click the "Men" button.
Searching and filtering are available on this tab too.
You can send an e-mail at any time to get more detailed information about Data Mining, Machine Learning and User Interface in the project and to ask your questions: =)