Problem encountered when generating TSV files from pcap #75

Open
plfnico opened this issue Feb 7, 2024 · 3 comments
Labels
help wanted Extra attention is needed question Further information is requested

Comments

plfnico commented Feb 7, 2024

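Hello, I followed ./data_process/main.py in your project to convert pcap files to TSV. In the resulting dataset.json, samples from some classes cannot be found in your published TSV files, while samples from other classes can. Because of memory limits on my machine, when running data_generation.generation I modified the for packet in packets loop in get_feature_packet so that it returns after visiting only the first ten packets. Could this cause the problem?
In addition, I tried training with the TSV files produced by this modified code, but I got only about 40,000 samples, far fewer than the roughly 400,000 in your published TSV files, and the model trained this way has very low accuracy. Do you have any ideas on how to resolve this odd problem? Thanks.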
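The thread does not settle whether the ten-packet cutoff explains the mismatch. As a minimal sketch of the modification described above, the cap can be written with `itertools.islice`, which stops pulling from the source once the limit is reached. The helper names `iter_first_packets` and `count_consumed` are hypothetical and not part of the project. If the memory pressure comes from loading a whole capture at once (for example with scapy's `rdpcap`), a lazy reader such as scapy's `PcapReader` could be passed in as the iterable instead; that is an assumption about the setup, not something confirmed in this issue.

```python
from itertools import islice


def iter_first_packets(packets, limit=10):
    """Return an iterator over at most `limit` items of `packets`.

    `packets` can be any iterable. With a lazy source, nothing beyond
    the first `limit` items is ever read.
    """
    return islice(packets, limit)


def count_consumed(limit=10, total=1000):
    """Show how many items are pulled from a lazy source (illustration only)."""
    pulled = []

    def fake_reader():
        # Stand-in for a streaming pcap reader (hypothetical).
        for i in range(total):
            pulled.append(i)
            yield f"packet-{i}"

    kept = list(iter_first_packets(fake_reader(), limit))
    return kept, len(pulled)


if __name__ == "__main__":
    kept, pulled = count_consumed()
    print(len(kept), pulled)
```

Any cap of this kind changes which packets contribute to each sample, so samples generated this way are not guaranteed to match ones generated without the cap.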

plfnico (Author) commented Feb 7, 2024

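For example, my label 26, squarespace.com, can be found in your TSV, where its label is 92. But the data I generated for label 25 (ampproject.org), label 0 (yy.com), and others cannot be found in your TSV.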
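Because the label IDs differ between the two files (26 locally, 92 in the published data), a comparison has to match on sample content rather than on label. A rough sketch of such a check follows. It assumes both TSV files have a header row with `label` and `text_a` columns, which is an assumption about the file layout, and the function names are hypothetical.

```python
import csv
import io
from collections import defaultdict


def load_samples(handle):
    """Map each label to the set of text_a strings found under it."""
    by_label = defaultdict(set)
    reader = csv.DictReader(handle, delimiter="\t", quoting=csv.QUOTE_NONE)
    for row in reader:
        by_label[row["label"]].add(row["text_a"])
    return by_label


def overlap_by_label(local, published):
    """For each local label, return (fraction of samples found in the
    published data, sorted published labels those samples appear under)."""
    # Index published samples by content so label IDs do not matter.
    index = defaultdict(set)
    for label, texts in published.items():
        for text in texts:
            index[text].add(label)

    report = {}
    for label, texts in local.items():
        hits = [t for t in texts if t in index]
        matched = set()
        for t in hits:
            matched |= index[t]
        report[label] = (len(hits) / len(texts), sorted(matched))
    return report


if __name__ == "__main__":
    # Tiny in-memory stand-ins for the two TSV files.
    local = load_samples(io.StringIO("label\ttext_a\n26\taa bb\n25\tee ff\n"))
    published = load_samples(io.StringIO("label\ttext_a\n92\taa bb\n"))
    print(overlap_by_label(local, published))
```

With real files, each `io.StringIO(...)` would be replaced by `open(path, newline="", encoding="utf-8")`. A local label whose fraction is 0.0 corresponds to the "cannot be found" case described above.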

@linwhitehat linwhitehat added help wanted Extra attention is needed question Further information is requested labels Aug 30, 2024
linwhitehat (Owner) commented

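Hello, which TSV file in fine-tuning are you referring to? Normal sampling draws from all classes. I suggest debugging with a reduced number of classes to see whether a similar problem still occurs.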
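One way to act on this suggestion is to count the samples per label in a generated TSV, then keep only a few labels for a smaller debugging run. The sketch below assumes a tab-separated file with a header row containing a `label` column. The helper names are hypothetical and not part of the project.

```python
from collections import Counter


def label_counts(lines):
    """Count samples per label in TSV lines that start with a header row."""
    rows = iter(lines)
    header = next(rows).rstrip("\n").split("\t")
    col = header.index("label")
    return Counter(
        line.rstrip("\n").split("\t")[col] for line in rows if line.strip()
    )


def keep_labels(lines, wanted):
    """Return the header plus only the rows whose label is in `wanted`."""
    rows = iter(lines)
    header = next(rows)
    col = header.rstrip("\n").split("\t").index("label")
    kept = [header]
    for line in rows:
        if line.strip() and line.rstrip("\n").split("\t")[col] in wanted:
            kept.append(line)
    return kept


if __name__ == "__main__":
    data = ["label\ttext_a\n", "0\taa\n", "0\tbb\n", "25\tcc\n", "26\tdd\n"]
    print(label_counts(data))
    print(keep_labels(data, {"0", "26"}))
```

A label that is missing from the counts, or that has far fewer samples than the others, points at a sampling problem for that class. If the training script expects label IDs to be contiguous from 0, the kept rows would also need relabeling, which this sketch does not do.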

cbryant0 commented

Where can I get the I:/SplitCap.exe executable?

3 participants