8th place solution for the Data Fusion 2022 Contest.
Task | Public rank | Private rank |
---|---|---|
Matching | 6 | 8 |
Before analyzing the transactional data, we build features from all of the available data. These features let us look at the data along different dimensions (time of day, day of the week, and so on), and they also serve as inputs for training the machine learning models.
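As an illustration of the kind of time-based features described above, here is a minimal pandas sketch. The column names `user_id` and `transaction_dttm`, the helper `time_features`, and the choice of per-user activity shares are assumptions made for this example; they are not taken from the solution's actual feature set.

```python
import pandas as pd


def time_features(df, user_col, time_col, prefix):
    """Per-user event count plus hour-of-day and day-of-week activity shares."""
    ts = pd.to_datetime(df[time_col])
    tmp = pd.DataFrame({
        "user": df[user_col].to_numpy(),
        "hour": ts.dt.hour.to_numpy(),
        "dow": ts.dt.dayofweek.to_numpy(),  # Monday == 0
    })
    # Share of each user's events that falls into every hour of the day.
    hour_share = pd.crosstab(tmp["user"], tmp["hour"], normalize="index")
    hour_share = hour_share.reindex(columns=range(24), fill_value=0.0)
    hour_share = hour_share.add_prefix(f"{prefix}_hour_")
    # Share of each user's events that falls on every day of the week.
    dow_share = pd.crosstab(tmp["user"], tmp["dow"], normalize="index")
    dow_share = dow_share.reindex(columns=range(7), fill_value=0.0)
    dow_share = dow_share.add_prefix(f"{prefix}_dow_")
    counts = tmp.groupby("user").size().rename(f"{prefix}_count")
    return pd.concat([counts, hour_share, dow_share], axis=1)


# Toy data; the column names are assumed, not read from the contest files.
transactions = pd.DataFrame({
    "user_id": ["a", "a", "b"],
    "transaction_dttm": [
        "2021-01-04 09:15:00",  # Monday
        "2021-01-04 21:30:00",  # Monday
        "2021-01-09 09:00:00",  # Saturday
    ],
})
features = time_features(transactions, "user_id", "transaction_dttm", "tx")
```

The same helper could be applied to the clickstream table with a different prefix, so that both sources end up with comparable per-user activity profiles.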
Training:
- CatBoostRanker with the YetiRank loss, trained for 9000 iterations.
- An ensemble of 2 CatBoost models with different parameters.
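The training setup above could be sketched roughly as follows. Only the YetiRank loss and the 9000 iterations come from this write-up; the helper names `train_ranker` and `blend_scores`, the values in `PARAMS_A` and `PARAMS_B`, and the use of per-group rank averaging as the ensembling method are assumptions made for illustration.

```python
import numpy as np


def train_ranker(features, labels, group_ids, **params):
    """Fit one CatBoostRanker with the YetiRank loss on grouped candidates."""
    # Imported lazily so the blending helper below works without catboost.
    from catboost import CatBoostRanker, Pool

    # Rows that belong to the same query group must be contiguous.
    pool = Pool(data=features, label=labels, group_id=group_ids)
    model = CatBoostRanker(loss_function="YetiRank", verbose=False, **params)
    model.fit(pool)
    return model


def blend_scores(score_lists, group_ids):
    """Average the per-group ranks of several models (higher is better)."""
    group_ids = np.asarray(group_ids)
    blended = np.zeros(len(group_ids), dtype=float)
    for scores in score_lists:
        scores = np.asarray(scores, dtype=float)
        for g in np.unique(group_ids):
            idx = np.flatnonzero(group_ids == g)
            # argsort of argsort yields 0-based ranks; scale them to [0, 1].
            ranks = scores[idx].argsort().argsort()
            blended[idx] += ranks / max(len(idx) - 1, 1)
    return blended / len(score_lists)


# Hypothetical settings for the two models; only iterations=9000 is sourced.
PARAMS_A = {"iterations": 9000, "depth": 6, "learning_rate": 0.05, "random_seed": 0}
PARAMS_B = {"iterations": 9000, "depth": 8, "learning_rate": 0.03, "random_seed": 1}

# Toy scores standing in for model_a.predict(X) and model_b.predict(X).
groups = [0, 0, 0, 1, 1]
scores_a = [0.2, 0.9, 0.5, -1.0, 3.0]
scores_b = [0.1, 0.4, 0.8, 0.0, 2.0]
blended = blend_scores([scores_a, scores_b], groups)
```

Ranks are averaged instead of raw scores because YetiRank outputs are only meaningful as an ordering within a group, so the raw values of two differently configured models are not necessarily on the same scale.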
Data:
- General data for all tasks in tabular `.csv` format: `transactions.zip`, `clickstream.zip`, and the target variable `train_matching.csv`
- Common accompanying data for all tasks in tabular `.csv` format: `mcc_codes.csv`, `click_categories.csv`, and `currency_rk.csv`
- Baselines and example solutions for the container-based Matching task: a random solution, `sample_submission.zip`, and `baseline_catboost.zip`, an example solution based on the catboost library that uses a GPU