Note: (1) some of the code is adapted from other projects; (2) the project is still under development.
This is a classification model with five classes (normal, DOS, R2L, U2R, PROBING). The content features of a TCP connection (columns 10-22 of the KDD Cup 99 dataset) are ignored when training the model, so that the model can be used together with a companion project, a kdd99 feature extractor.
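As a rough illustration of the two ideas above, the sketch below drops the content features from a record and maps a raw label to one of the five classes. It assumes the raw comma-separated layout (41 features followed by a label that ends in a period) and the attack names that occur in the training data. `drop_content_features`, `label_to_class`, and `ATTACK_CLASS` are hypothetical names, not taken from this repository.

```python
# Hypothetical helpers; the names are illustrative, not from this repository.

# Columns 10-22 (1-indexed) are the content features, i.e. 0-based indices 9..21.
CONTENT_START, CONTENT_END = 9, 22  # half-open range [9, 22)

# Attack names in the KDD Cup 99 training data, grouped into the five classes.
ATTACK_CLASS = {
    "normal": "normal",
    **dict.fromkeys(["back", "land", "neptune", "pod", "smurf", "teardrop"], "DOS"),
    **dict.fromkeys(["ftp_write", "guess_passwd", "imap", "multihop", "phf",
                     "spy", "warezclient", "warezmaster"], "R2L"),
    **dict.fromkeys(["buffer_overflow", "loadmodule", "perl", "rootkit"], "U2R"),
    **dict.fromkeys(["ipsweep", "nmap", "portsweep", "satan"], "PROBING"),
}


def drop_content_features(fields):
    """Return the feature list without columns 10-22 (the content features)."""
    return fields[:CONTENT_START] + fields[CONTENT_END:]


def label_to_class(label):
    """Map a raw label such as 'smurf.' to one of the five classes."""
    return ATTACK_CLASS[label.rstrip(".")]


# A made-up record: 41 features followed by the label.
record = ["0", "tcp", "http", "SF"] + ["0"] * 37 + ["smurf."]
features = drop_content_features(record[:-1])
print(len(features), label_to_class(record[-1]))  # 28 DOS
```

The test file contains attack names that do not appear in the training data, so a real mapping would need extra entries; this sketch raises a `KeyError` for unknown labels.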
(Conv2d => ReLU) * 2 => (MaxPool2d) => (Linear => [Dropout] => ReLU) * 2 => (Linear)
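The input size of the first Linear layer depends on the shape left after the convolution and pooling stages. This README does not state the input shape, kernel sizes, or channel counts, so the sketch below only shows the standard output-size arithmetic for Conv2d and MaxPool2d (dilation 1) under made-up values: a 1x4x7 input (28 features), 3x3 convolutions with padding 1, 32 output channels, and 2x2 pooling.

```python
def out_size(n, kernel, stride=1, padding=0):
    """Output length along one dimension of a conv or pooling layer (dilation 1)."""
    return (n + 2 * padding - kernel) // stride + 1


# Assumed values, not taken from the repository.
height, width, channels = 4, 7, 32

# (Conv2d => ReLU) * 2 with 3x3 kernels and padding 1 keeps height and width.
for _ in range(2):
    height = out_size(height, kernel=3, padding=1)
    width = out_size(width, kernel=3, padding=1)

# MaxPool2d with a 2x2 window and stride 2.
height = out_size(height, kernel=2, stride=2)
width = out_size(width, kernel=2, stride=2)

# in_features of the first Linear layer after flattening.
flat_features = channels * height * width
print(height, width, flat_features)  # 2 3 192
```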
Requirements:
- PyTorch 1.4+

To do:
- Optimize data initialization (standardization, handling of erroneous data, etc.)
- predict.py
- More evaluation methods (confusion matrix, recall, etc.)
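For the evaluation item in the list above, a confusion matrix and per-class recall can be computed without extra dependencies. The following is only a sketch of the idea, not code from this repository; the function names, the class order in `CLASSES`, and the sample labels are all assumptions.

```python
# Assumed class order; the repository may number the classes differently.
CLASSES = ["normal", "DOS", "R2L", "U2R", "PROBING"]


def confusion_matrix(y_true, y_pred, num_classes):
    """matrix[t][p] counts samples whose true class is t and predicted class is p."""
    matrix = [[0] * num_classes for _ in range(num_classes)]
    for t, p in zip(y_true, y_pred):
        matrix[t][p] += 1
    return matrix


def recall_per_class(matrix):
    """Recall = correct predictions of a class / all samples of that class."""
    recalls = []
    for i, row in enumerate(matrix):
        total = sum(row)
        recalls.append(row[i] / total if total else 0.0)
    return recalls


# Made-up labels, for illustration only.
y_true = [0, 0, 1, 1, 1, 2, 3, 4]
y_pred = [0, 1, 1, 1, 0, 2, 0, 4]

cm = confusion_matrix(y_true, y_pred, len(CLASSES))
for name, r in zip(CLASSES, recall_per_class(cm)):
    print(f"{name}: recall={r:.2f}")
```

Per-class recall is worth reporting here because R2L and U2R are rare in this dataset, so overall accuracy alone can hide poor performance on them.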
First, unzip the data files in the /data_pre_processing folder, or download them from the official website:
- Train_data: kddcup.data.gz
- Train_data_10%: kddcup.data_10_percent.gz
- Test_data: corrected.gz
Second, convert the string fields in the datasets to discrete numbers:
# using the 10% training data
python data_pre_processing_10%.py
# or, using all of the training data
python data_pre_processing_all.py
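The conversion step can be pictured as assigning each distinct string a small integer. The sketch below is not the repository's script: `encode_column` is a hypothetical helper, and numbering values in order of first appearance is an assumption (the actual scripts may use fixed lookup lists instead).

```python
def encode_column(values, mapping=None):
    """Replace each string with an integer index, assigned in order of first appearance."""
    mapping = {} if mapping is None else mapping
    encoded = []
    for v in values:
        if v not in mapping:
            mapping[v] = len(mapping)
        encoded.append(mapping[v])
    return encoded, mapping


# Example with the protocol_type column.
protocols = ["tcp", "udp", "tcp", "icmp"]
codes, protocol_map = encode_column(protocols)
print(codes)         # [0, 1, 0, 2]
print(protocol_map)  # {'tcp': 0, 'udp': 1, 'icmp': 2}
```

Passing the mapping built on the training data back in when encoding the test data keeps the same string on the same number in both files.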
Third, copy train_data* and test.csv to ./dataset, and update the data path in train.py.
> python train.py -h
usage: train.py [-h] [-e E] [-b [B]] [-l [LR]] [-f LOAD]
Train the DNN on KDD Cup 1999. Note: the default parameters are not the
best!!!
optional arguments:
-h, --help show this help message and exit
-e E, --epochs E Number of epochs (default: 5)
-b [B], --batch-size [B]
Batch size (default: 512)
-l [LR], --learning-rate [LR]
Learning rate (default: 0.0001)
-f LOAD, --load LOAD Load model from a .pth file (default: False)
Example (this configuration works):
python train.py -e 20 -b 512 -l 0.0001
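The help text above is consistent with an argparse setup along the following lines. This is a reconstruction from the printed usage, not the actual contents of train.py; the `get_args` name and the `dest` and `metavar` choices are assumptions.

```python
import argparse


def get_args(argv=None):
    """Parse the training options; argv=None falls back to sys.argv."""
    parser = argparse.ArgumentParser(
        description="Train the DNN on KDD Cup 1999. "
                    "Note: the default parameters are not the best!!!",
        formatter_class=argparse.ArgumentDefaultsHelpFormatter,
    )
    parser.add_argument("-e", "--epochs", metavar="E", type=int, default=5,
                        help="Number of epochs")
    parser.add_argument("-b", "--batch-size", metavar="B", type=int, nargs="?",
                        default=512, help="Batch size", dest="batch_size")
    parser.add_argument("-l", "--learning-rate", metavar="LR", type=float, nargs="?",
                        default=0.0001, help="Learning rate", dest="lr")
    parser.add_argument("-f", "--load", type=str, default=False,
                        help="Load model from a .pth file")
    return parser.parse_args(argv)


# Equivalent to: python train.py -e 20 -b 512 -l 0.0001
args = get_args(["-e", "20", "-b", "512", "-l", "0.0001"])
print(args.epochs, args.batch_size, args.lr, args.load)  # 20 512 0.0001 False
```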
Visualize the training and test losses, the accuracy, and the weights and gradients in real time:
tensorboard --logdir=runs
Note: the accuracy reported below is the best value observed during training, whereas the provided model (.pth) was saved at the end of an epoch, so the two may not match exactly.
training_dataset | accuracy |
---|---|
10% | 0.9395 |
All | 0.9393 |
To be done.
[1] Data pre-processing: https://blog.csdn.net/qq_35733521/article/details/87889480
[2] Blog posts consulted: https://blog.csdn.net/jbfsdzpp/article/details/44099849 and https://blog.csdn.net/asialee_bird/article/details/80491256?utm_medium=distribute.pc_relevant.none-task-blog-BlogCommendFromMachineLearnPai2-1.nonecase