Recovering from Poisoning through Active Learning (RPAL) Framework

This repository is the official code release for the paper 'The Impact of Active Learning on Availability Data Poisoning for Android Malware Classifiers', published at the Workshop on Recent Advances in Resilient and Trustworthy Machine Learning (ARTMAN) 2024, co-located with ACSAC.

If you plan to use this repository in your projects, please cite the following paper:

@inproceedings{mcfadden2024recovery,
  title = {The Impact of Active Learning on Availability Data Poisoning for Android Malware Classifiers},
  author = {McFadden, Shae and Kan, Zeliang and Cavallaro, Lorenzo and Pierazzi, Fabio},
  booktitle = {Proc. of the Annual Computer Security Applications Conference Workshops (ACSAC Workshops)},
  year = {2024},
}

Disclaimer

Please note that the code in this repository is only a research prototype. This code is released under a "Modified (Non-Commercial) BSD License": see the terms here.

Installation

Please note that this project requires tesseract-ml, which can be found here and installed as follows.

pushd ${PATH_TO}/tesseract-ml
python setup.py install  # install tesseract-ml
popd

Once tesseract-ml has been installed, RPAL can be set up as follows.

pip install NumpyEncoder
cd RPAL
pip install -r requirements.txt
pip install .
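
After both installs, a quick import check can confirm that the packages are visible to the interpreter. The snippet below is a minimal sketch, not part of the repository: it assumes the top-level import names are `tesseract` and `RPAL`, which this README does not state, so adjust them if they differ. Run it from outside the repository directory so that a local `RPAL` folder is not mistaken for the installed package.

```python
from importlib.util import find_spec


def check_installed(packages):
    """Return a dict mapping each top-level package name to whether it is importable."""
    return {name: find_spec(name) is not None for name in packages}


if __name__ == "__main__":
    # "tesseract" and "RPAL" are assumed import names, not confirmed by the README.
    for name, ok in check_installed(["tesseract", "RPAL"]).items():
        print(f"{name}: {'found' if ok else 'MISSING'}")
```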

Repository Contents

RPAL

RPAL/classification.py: This code handles training & testing of the classifier and returns the results.
RPAL/constraints.py: This code enables easy checking of spatial and temporal bias in the data.
RPAL/data.py: This code handles the various data manipulations required.
RPAL/grapher.py: This code generates the experiment and results plots.
RPAL/loader.py: This code handles loading the dataset.
RPAL/poison.py: This code performs all the data poisoning.
RPAL/recovery.py: This code generates all the recovery data.
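
To illustrate what checking for temporal and spatial bias can look like, here is a minimal sketch. It is not the API of RPAL/constraints.py. It assumes the checks resemble two of the constraints described for TESSERACT: training samples must strictly precede test samples in time, and the test set should have a realistic malware ratio. The function names and the toy data are hypothetical.

```python
from datetime import date


def temporally_consistent(train_dates, test_dates):
    """True if every training sample is strictly older than every test sample."""
    return max(train_dates) < min(test_dates)


def malware_ratio(labels):
    """Fraction of samples labelled malware (1) in a list of 0/1 labels."""
    return sum(labels) / len(labels)


# Toy data, for illustration only.
train_dates = [date(2015, 1, 10), date(2015, 6, 3)]
test_dates = [date(2016, 2, 1), date(2016, 7, 15)]
test_labels = [0, 0, 0, 0, 0, 0, 0, 0, 0, 1]

print(temporally_consistent(train_dates, test_dates))
print(malware_ratio(test_labels))
```

A split that fails the first check lets the classifier learn from samples newer than those it is tested on, which inflates the reported performance.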

Results

Results/Data/: Contains the data presented in the paper.
Results/Scripts/: Contains the scripts used to generate the plots and table data in the paper.

Experiments

Drebin-Label-Flip-Deep-Tesseract.py: Runs all DNN experiments shown in the paper.
Drebin-Label-Flip-RF-Tesseract.py: Runs all RF experiments shown in the paper.
Drebin-Label-Flip-SVM-Tesseract.py: Runs all SVM experiments shown in the paper.
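
The script names refer to a label-flip attack. As a generic illustration of that idea, the sketch below flips a fixed fraction of binary labels chosen uniformly at random. It is not the code in RPAL/poison.py, and the paper's attack may select samples differently; the function name `flip_labels` and its parameters are hypothetical.

```python
import random


def flip_labels(labels, rate, seed=0):
    """Return a copy of binary labels with a `rate` fraction flipped (0 <-> 1)."""
    if not 0.0 <= rate <= 1.0:
        raise ValueError("rate must be in [0, 1]")
    rng = random.Random(seed)
    n_flip = round(rate * len(labels))
    # Choose distinct indices so exactly n_flip labels change.
    flipped_idx = rng.sample(range(len(labels)), n_flip)
    poisoned = list(labels)
    for i in flipped_idx:
        poisoned[i] = 1 - poisoned[i]
    return poisoned


clean = [0, 1] * 10
poisoned = flip_labels(clean, rate=0.25)
changed = sum(a != b for a, b in zip(clean, poisoned))
print(changed)  # 5 of the 20 labels differ
```

The input list is copied rather than modified, so the clean labels remain available for comparison against a classifier trained on the poisoned ones.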

Other

deepdrebin.py: A scikit-learn-compatible implementation of the architecture used in 'Adversarial Examples for Malware Detection' by Grosse et al.
Clean_Label_Poisoning_Mapping.py: Generates the feature-flip mappings used to mimic the label-flip attack.
