GitHub - finallyupper/pre-augment: Codes used in DA project for data processing.

Introduction

This code preprocesses the given dataset into the form required by the referenced paper. The input data is already split into superior and inferior datasets containing the instances (no ack, empty files); in total there are 200 sites and 50000 instances. Two preprocessing configurations are supported, selected with --cfg 1 or --cfg 2. Details can be found in the paper [1].

Usage Examples

DATA_PATH="/scratch/DA/dataset/tcp5"
OUTPUT_PATH="/scratch/DA/dataset/tcp5cfg1_fine_tuning_data.npz"

Data preprocessing

python split.py -r "${DATA_PATH}" -o "${OUTPUT_PATH}" --cfg 1 --onlydir False

python split.py -r "${DATA_PATH}" -o "${OUTPUT_PATH}" --cfg 2 --onlydir False
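
After either command finishes, it can be useful to check what was written to the output file. The following is a minimal sketch, assuming only that the output is a standard NumPy .npz archive, as the file extension suggests. The helper name summarize_npz is hypothetical and not part of this repository. The array names stored by split.py are not documented here, so the sketch lists whatever it finds rather than assuming specific keys.

```python
import numpy as np


def summarize_npz(path):
    """Return {array_name: (shape, dtype_string)} for each array in an .npz file.

    Hypothetical helper for inspection only; it is not part of split.py.
    """
    # allow_pickle=False is the safe default; object arrays will raise ValueError.
    with np.load(path, allow_pickle=False) as archive:
        return {
            name: (archive[name].shape, str(archive[name].dtype))
            for name in archive.files
        }


def print_summary(path):
    """Print one line per stored array: its name, shape, and dtype."""
    for name, (shape, dtype) in summarize_npz(path).items():
        print(f"{name}: shape={shape}, dtype={dtype}")
```

Calling print_summary with the value of OUTPUT_PATH prints one line per stored array. If the archive contains object arrays, np.load raises a ValueError under allow_pickle=False; in that case pass allow_pickle=True, but only for files you trust.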

Directory Structure

src
    ├─Augment
    │      common.py
    │      split.py
    │
    └─others
        ├─tcp5
        │      pre_training_data.py
        │
        └─tcp5_filtered
                pt2.py

References

[1] https://people.cs.umass.edu/~amir/papers/CCS23-SSL-Web-Fingerprint.pdf
[2] https://github.com/notem/reWeFDE/tree/master
