High accuracy without pre-training #92

Open
mob2125 opened this issue Nov 5, 2024 · 2 comments
Labels
question Further information is requested

Comments

@mob2125

mob2125 commented Nov 5, 2024

Hello, we did not pass the pre-trained_model_path parameter during fine-tuning, so the code (fine-tuning/run_classifier.py) initialized the model parameters randomly. We fine-tuned this model on the provided .tsv datasets (datasets/CSTNET-TLS 1.3) and achieved a high accuracy of 96%. Is this expected? If not, what are we doing wrong?
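
For readers following along, here is a minimal sketch of the behavior described above: when no checkpoint path is given, every parameter keeps its random initialization. It is not the code from fine-tuning/run_classifier.py. The flag spelling `--pretrained_model_path`, the helper `load_or_initialize`, and the use of plain floats in place of weight tensors are all assumptions made for illustration.

```python
import argparse
import random


def load_or_initialize(param_names, pretrained_params=None, seed=0):
    """Return a name -> value mapping standing in for model parameters.

    pretrained_params is a hypothetical stand-in for a loaded checkpoint,
    or None when no checkpoint path was supplied.
    """
    rng = random.Random(seed)
    # Every parameter starts from a random initialization.
    params = {name: rng.gauss(0.0, 0.02) for name in param_names}
    if pretrained_params is not None:
        # Overwrite only the parameters that the checkpoint contains.
        for name in param_names:
            if name in pretrained_params:
                params[name] = pretrained_params[name]
    return params


def build_parser():
    parser = argparse.ArgumentParser()
    # Assumed flag name; it defaults to None, so omitting it means
    # "no checkpoint" and the random initialization is kept.
    parser.add_argument("--pretrained_model_path", default=None)
    return parser


args = build_parser().parse_args([])  # simulate a run without the flag
used_pretrained = args.pretrained_model_path is not None
print("pre-trained weights loaded:", used_pretrained)
```

Printing or logging whether a checkpoint was loaded makes it easy to confirm that a run really started from random weights before comparing accuracies.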

@mob2125 mob2125 changed the title High accuracy without retraining High accuracy without pre-training Nov 5, 2024
@linwhitehat linwhitehat added the question Further information is requested label Nov 6, 2024
@linwhitehat
Owner

Hi mob2125,

Thanks for using our code!

If you get the desired results without using a pre-trained model, we think it may depend on the difficulty of the traffic task and the training settings. In our ablation test on the ISCX-VPN-App dataset, however, we found that removing pre-training caused a significant drop in performance.

We hope this answer helps.

@mob2125
Author

mob2125 commented Nov 11, 2024

Hello,
Thanks for the reply. We tried the same thing with the ISCX-VPN-App dataset provided in the datasets folder. We first converted it into .tsv files using data_process/main.py and then fine-tuned a model without pre-training. It still achieved an accuracy of around 98-99%. Do you know what we are doing wrong?
