Skip to content

finallyupper/pre-augment

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

3 Commits
 
 
 
 

Repository files navigation

Introduction

Purpose of this code is to preprocess the given dataset to fit the given paper. The input data is already splitted into superior/inferior dataset that includes instances (no ack, empty files)(Total 200 sites with 50000 instances). It supports to preprocess dataset into two configurations. Details can be found in the paper [1]

Usage Examples

DATA_PATH = "/scratch/DA/dataset/tcp5"  
OUTPUT_PATH = "/scratch/DA/dataset/tcp5cfg1_fine_tuning_data.npz"

Data preprocessing

python split.py -r "${DATA_PATH}" -o "${OUTPUT_PATH}" --cfg 1 --onlydir False

python split.py -r "${DATA_PATH}" -o "${OUTPUT_PATH}" --cfg 2 --onlydir False

Directory Structure

src
    ├─Augment
    │      common.py
    │      split.py
    │
    └─others
        ├─tcp5
        │      pre_training_data.py
        │
        └─tcp5_filtered
                pt2.py

References

[1] https://people.cs.umass.edu/~amir/papers/CCS23-SSL-Web-Fingerprint.pdf
[2] https://github.com/notem/reWeFDE/tree/master

About

Codes used in DA project for data processing.

Topics

Resources

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published

Languages