Open-source code for "On the Feasibility of Parser-based Log Compression in Large-Scale Cloud Systems" (USENIX FAST 2021)
python >= 3.8.5
pandas >= 1.1.1
six >= 1.15
numpy >= 1.19
gcc >= 7.4.0
7z >= 16.02
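The requirements above can be checked before building. The following is a minimal sketch and not part of this repository: it only reports what is installed and leaves the version comparison for the packages to the reader. The executable names gcc and 7z are assumptions (7-Zip is sometimes installed as 7za or 7zr), and the helper names installed_version and environment_report are hypothetical.

```python
import shutil
import sys
from importlib import metadata

# Minimum Python version copied from the requirement list above.
MIN_PYTHON = (3, 8, 5)
PACKAGES = ["pandas", "six", "numpy"]
# Assumed executable names; adjust if 7-Zip is installed as 7za or 7zr.
TOOLS = ["gcc", "7z"]


def installed_version(package):
    """Return the installed version string of a package, or None if absent."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


def environment_report():
    """Collect what is installed without enforcing any version policy."""
    return {
        "python_ok": sys.version_info[:3] >= MIN_PYTHON,
        "packages": {name: installed_version(name) for name in PACKAGES},
        "tools": {name: shutil.which(name) for name in TOOLS},
    }


if __name__ == "__main__":
    report = environment_report()
    print("python >= 3.8.5:", report["python_ok"])
    for name, version in report["packages"].items():
        print(f"{name}: {version or 'not installed'}")
    for name, path in report["tools"].items():
        print(f"{name}: {path or 'not found on PATH'}")
```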
Samples of large-scale cloud logs are available at:
https://github.com/THUBear-wjy/openSample
Compile the code:
make
Assume the target log file is /path/xx.log.
Step 1: Training (generates templates in ./template/)
python3 training.py -I /path/xx.log -T ./template/
Step 2: Compression (uses the templates in ./template/ and writes the result to ./out/)
python3 LogReducer.py -I /path/xx.log -T ./template/ -O ./out/
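After Step 2 it can be useful to see how much space was saved. The sketch below is not part of this repository; it simply divides the size of the original log by the total size of everything under the output directory. The function names total_size and compression_ratio are hypothetical, and whether the contents of ./template/ should also count toward the compressed size is a measurement choice left to the reader.

```python
import sys
from pathlib import Path


def total_size(path):
    """Size in bytes of a file, or of all regular files under a directory."""
    p = Path(path)
    if p.is_file():
        return p.stat().st_size
    return sum(f.stat().st_size for f in p.rglob("*") if f.is_file())


def compression_ratio(original, compressed_dir):
    """Original size divided by compressed size (higher means smaller output)."""
    compressed = total_size(compressed_dir)
    if compressed == 0:
        raise ValueError(f"no compressed data found under {compressed_dir}")
    return total_size(original) / compressed


# Example invocation (the script name is an assumption):
#   python3 ratio.py /path/xx.log ./out/
if __name__ == "__main__" and len(sys.argv) == 3:
    print(f"{compression_ratio(sys.argv[1], sys.argv[2]):.2f}x")
```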
Assume the compressed log is in ./out/, the original file was /path/xx.log, and the templates used for compression are in ./template/. To restore the log to ./xx.log:
python3 LogRestore.py -I ./out/ -T ./template/ -O ./xx.log
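To confirm that the round trip preserved the log, the restored file can be compared against the original. The sketch below is not part of this repository and assumes the restored file is expected to be byte-identical to the original; it hashes both files in chunks so that large logs do not need to fit in memory. The function names sha256_of and same_content are hypothetical.

```python
import hashlib
import sys


def sha256_of(path, chunk_size=1 << 20):
    """Return the SHA-256 hex digest of a file, read in fixed-size chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def same_content(original, restored):
    """True if both files have identical bytes (compared via SHA-256)."""
    return sha256_of(original) == sha256_of(restored)


# Example invocation (the script name is an assumption):
#   python3 verify.py /path/xx.log ./xx.log
if __name__ == "__main__" and len(sys.argv) == 3:
    ok = same_content(sys.argv[1], sys.argv[2])
    print("identical" if ok else "DIFFERENT")
```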