Repository for the paper: Arcolezi, H.H., Makhlouf, K., Palamidessi, C. (2023). (Local) Differential Privacy has NO Disparate Impact on Fairness. In: Atluri, V., Ferrara, A.L. (eds) Data and Applications Security and Privacy XXXVII. DBSec 2023. Lecture Notes in Computer Science, vol 13942. Springer, Cham. https://doi.org/10.1007/978-3-031-37586-6_1.
If our code and work are useful to you, we would appreciate a citation:
@incollection{Arcolezi2023,
doi = {10.1007/978-3-031-37586-6_1},
url = {https://doi.org/10.1007/978-3-031-37586-6_1},
year = {2023},
publisher = {Springer Nature Switzerland},
pages = {3--21},
author = {H{\'{e}}ber H. Arcolezi and Karima Makhlouf and Catuscia Palamidessi},
title = {(Local) Differential Privacy has {NO} Disparate Impact on~Fairness},
booktitle = {Data and Applications Security and Privacy {XXXVII}}
}
Our code was developed in Python 3, mainly with the numpy, pandas, lightgbm, and multi-freq-ldpy libraries. The versions we used are listed below:
- Python 3.9.13
- Numpy 1.21.5
- Pandas 1.4.4
- Lightgbm 3.3.5
- Multi-freq-ldpy 0.2.4
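To check whether a local environment matches the versions listed above, a small script along these lines may help. It is only a sketch and not part of the repository: the helper names `installed_version` and `compare_versions` are hypothetical, and the dictionary keys assume the usual PyPI distribution names.

```python
from importlib import metadata

# Versions listed in this README; keys are assumed PyPI distribution names.
EXPECTED = {
    "numpy": "1.21.5",
    "pandas": "1.4.4",
    "lightgbm": "3.3.5",
    "multi-freq-ldpy": "0.2.4",
}


def installed_version(package):
    """Return the installed version string, or None if the package is missing."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


def compare_versions(expected, lookup=installed_version):
    """Map each package name to a tuple (expected, found, matches)."""
    report = {}
    for package, wanted in expected.items():
        found = lookup(package)
        report[package] = (wanted, found, found == wanted)
    return report


if __name__ == "__main__":
    for package, (wanted, found, ok) in compare_versions(EXPECTED).items():
        status = "OK" if ok else "differs"
        print(f"{package}: expected {wanted}, found {found} ({status})")
```

A differing version does not necessarily break the notebooks; the list above only records what we used.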
This repository is organized as an ordered sequence of Jupyter Notebook files:
- 0_Preprocess_Datasets.ipynb: Notebook for pre-processing (cleaning, encoding) the three original datasets (i.e., Adult, ACSCoverage, LSAC). The pre-processed datasets are saved in the datasets folder.
- 1_BO_NonDP.ipynb: Notebook for conducting the Bayesian Optimization (BO) to find locally optimal LGBM hyperparameters using the original data (i.e., no LDP). Results are saved in CSV format in the results folder.
- 2_Exp_XXX: Notebooks for carrying out all experiments of the paper (repeated over nb_seed=20 iterations) assuming the mechanism XXX (e.g., NonDP as the baseline, GRR, OUE, etc.). Results are saved in CSV and numpy formats in the corresponding results folder. The notebook 2_Exp_All_LDP is generic and runs all LDP protocols.
- 3_Final_Results.ipynb: Notebook with code to plot the final results illustrated in the main paper.
- 4_Appendix_Experiments.ipynb: Notebook for carrying out the additional set of experiments (repeated over nb_seed=20 iterations) with all LDP protocols. Results are also saved in CSV format in the results folder.
- 4_Appendix_Results.ipynb: Notebook with code to plot the final results illustrated in the appendix of the full paper (on arXiv).
Some functions used in the Jupyter Notebooks are imported from functions.py.
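As a rough illustration of what one seeded repetition of a GRR run involves, the sketch below sanitizes a categorical attribute with Generalized Randomized Response and averages the resulting frequency estimates over nb_seed runs. It is not the notebooks' code: the function names are hypothetical, it uses only the Python standard library rather than multi-freq-ldpy, and it omits the LGBM training step entirely.

```python
import math
import random


def grr_client(value, k, epsilon, rng):
    """Sanitize one value in {0, ..., k-1} with Generalized Randomized Response."""
    p = math.exp(epsilon) / (math.exp(epsilon) + k - 1)
    if rng.random() < p:
        return value  # report the true value with probability p
    # Otherwise report one of the k-1 other values uniformly at random.
    other = rng.randrange(k - 1)
    return other if other < value else other + 1


def grr_estimate(reports, k, epsilon):
    """Standard unbiased GRR frequency estimator applied to the reports."""
    n = len(reports)
    p = math.exp(epsilon) / (math.exp(epsilon) + k - 1)
    q = (1 - p) / (k - 1)
    counts = [0] * k
    for report in reports:
        counts[report] += 1
    return [(count / n - q) / (p - q) for count in counts]


def run_experiment(data, k, epsilon, nb_seed=20):
    """Repeat sanitization over nb_seed seeds and average the estimates."""
    totals = [0.0] * k
    for seed in range(nb_seed):
        rng = random.Random(seed)  # one independent, reproducible run per seed
        reports = [grr_client(v, k, epsilon, rng) for v in data]
        estimate = grr_estimate(reports, k, epsilon)
        totals = [t + e for t, e in zip(totals, estimate)]
    return [t / nb_seed for t in totals]


if __name__ == "__main__":
    # Toy attribute with k=3 categories; the data here is made up for illustration.
    toy_data = [0] * 600 + [1] * 300 + [2] * 100
    print(run_experiment(toy_data, k=3, epsilon=2.0))
```

Seeding each repetition separately keeps every run reproducible while still averaging over the randomness of the mechanism.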
We are gradually cleaning up and generalizing the code and documentation.
- We use the LDP protocols implemented in our multi-freq-ldpy library.
- We use the reconstructed Adult and ACSCoverage datasets from the folktables library.
- We use the LSAC dataset.
For any questions, please contact:
- Héber H. Arcolezi: heber.hwang-arcolezi [at] inria.fr
- Karima Makhlouf: karima.makhlouf [at] lix.polytechnique.fr