This is the dataset used in "Boosting ModSecurity with Machine Learning".
If you use this dataset, please cite the paper.
Because GitHub does not allow files larger than 25 MB, we split the dataset files into chunks.
To rebuild the whole dataset, run the merge.py script in each of the
two folders (legitimate and malicious data):
:~$ cd legitimates
:~$ python3 merge.py
:~$ cd ../malicious
:~$ python3 merge.py
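
For readers curious about what a merge step like this involves, here is a minimal sketch of one way to rebuild a file from its chunks. It is not the repository's merge.py: the function name `merge_chunks`, the chunk naming pattern, and the output file name are all hypothetical, and it assumes the chunks were produced by a plain byte-level split whose file names sort in the original order.

```python
from pathlib import Path
import shutil


def merge_chunks(chunk_dir, pattern, output_name):
    """Concatenate all files in chunk_dir matching pattern into output_name.

    Hypothetical helper, not the repository's merge.py. Chunks are joined
    in sorted filename order, so the names must sort in the order they
    were split (for example, zero-padded numeric suffixes).
    """
    chunk_dir = Path(chunk_dir)
    output_path = chunk_dir / output_name
    # Skip the output file itself in case it also matches the pattern.
    chunks = sorted(p for p in chunk_dir.glob(pattern) if p != output_path)
    if not chunks:
        raise FileNotFoundError(f"no chunks matching {pattern!r} in {chunk_dir}")
    with open(output_path, "wb") as out:
        for chunk in chunks:
            with open(chunk, "rb") as src:
                # Stream each chunk so large files never sit fully in memory.
                shutil.copyfileobj(src, out)
    return output_path
```

With illustrative names, a call such as `merge_chunks("malicious", "data.part*", "data.json")` would write the concatenated result next to the chunks. If the chunks are instead self-contained files (for example, one JSON list per chunk), byte concatenation would not produce a valid file, so the merge.py scripts shipped with the dataset remain the reference way to rebuild it.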