feat: preparations, training, attacks, evaluation and helpers
update README with instructions (tested on RWTH HPC) and links
1 parent 26818ef, commit d39b62d. Showing 18 changed files with 31,233 additions and 1 deletion.
@@ -1,2 +1,38 @@
# Adversarial-Training-for-Jet-Tagging
Code for "Improving robustness of jet tagging algorithms with adversarial training" (arXiv:2203.13890) | ||
Code for:
> <b><a href="https://arxiv.org/abs/2203.13890" target="_blank">Improving robustness of jet tagging algorithms with adversarial training</a></b>
> A. Stein, X. Coubez, S. Mondal, A. Novak, A. Schmidt
> 2022.
<i>Jet Flavor dataset</i>

Obtained from http://mlphysics.ics.uci.edu/ and originally created for
> <b><a href="https://arxiv.org/abs/1607.08633" target="_blank">Jet Flavor Classification in High-Energy Physics with Deep Neural Networks</a></b>
> D. Guest, J. Collado, P. Baldi, S. Hsu, G. Urban, and D. Whiteson
> Physical Review D, 2016.
## Get and prepare dataset
### Download
Log in to a copy18 node of the HPC with high bandwidth (the download is about 2.2 GB):
```
wget http://mlphysics.ics.uci.edu/data/hb_jet_flavor_2016/dataset.json.gz
mkdir -p /hpcwork/<your-account>/jet_flavor_MLPhysics/dataset
mv dataset.json.gz /hpcwork/<your-account>/jet_flavor_MLPhysics/dataset
```
### Extracting the data via awkward arrays
Reading the file is not entirely straightforward: at some point the data has to be decompressed and parsed. Although the file ends in a plain ".json", it does not hold a single JSON document but rather a series of JSON-like entries spread over the lines of the file. Consult the notebook `preparations/read_dataset.ipynb` for further details and possible alternatives for using the dataset. In the end, I used awkward arrays, which make the next steps a bit easier.
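For a quick impression of how such a file could be read, here is a minimal sketch, not the repo's actual code: it assumes one JSON-like entry per line (which may not hold exactly, the notebook is authoritative), and the path follows the placeholder used in the download step above.
```
import gzip
import json

import awkward as ak

# Placeholder path, matching the download step above
path = "/hpcwork/<your-account>/jet_flavor_MLPhysics/dataset/dataset.json.gz"

records = []
with gzip.open(path, "rt") as f:
    for line in f:
        line = line.strip().rstrip(",")        # tolerate trailing commas between entries
        if line and line not in ("[", "]"):    # skip enclosing brackets, if present
            records.append(json.loads(line))

# Jagged per-jet content (e.g. variable-length track lists) is handled naturally by awkward
events = ak.Array(records)
print(len(events), events.fields)
```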
### A first look at the data
Some initial investigations, before proceeding to the actual framework, are carried out in `preparations/explore_dataset.ipynb`.
### Calculate defaults
To use custom default values that fit the bulk of each distribution well, preliminary studies are done in `preparations/defaults.ipynb`. It is also the first notebook that makes use of `helpers/variables.py`.
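As a rough illustration of the idea only (the actual per-variable choices are derived in the notebook), a default could be placed just outside the bulk of the physical distribution, so defaulted entries end up in their own bin:
```
import numpy as np

def bulk_default(physical_values, margin=0.1):
    # Illustrative helper, not from the repo: place the default just below the
    # bulk of the physical values so it does not distort the distribution.
    low, high = np.percentile(physical_values, [1, 99])
    return low - margin * (high - low)

feature = np.random.exponential(scale=10.0, size=10_000)   # toy physical values
print(bulk_default(feature))
```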
### Clean samples
To avoid storing too many versions of the same data, cleaning the samples is not done as a separate step but happens later, during preprocessing (scaling). There, the arrays are also flattened into their final shape, so that the result is a set of usable PyTorch tensors. During cleaning, I do not cut on any variables; I only modify certain unphysical values and place them in dedicated default bins. In other words, the fractions of jets of a given flavor in given pt and eta bins are not changed by cleaning (and preprocessing) the data.
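Schematically, the cleaning amounts to something like the following sketch (illustrative only; the real implementation is part of the preprocessing notebook and uses the defaults derived above):
```
import numpy as np

def clean(features, unphysical_mask, defaults):
    # features:        (n_jets, n_features) array
    # unphysical_mask: boolean mask of the same shape marking unphysical entries
    # defaults:        (n_features,) custom default value per feature
    cleaned = features.copy()
    for i in range(features.shape[1]):
        # move unphysical entries to the default bin; no jets are removed,
        # so flavor fractions per pt/eta bin stay unchanged
        cleaned[unphysical_mask[:, i], i] = defaults[i]
    return cleaned
```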
### Calculate sample weights
Sample weights are calculated in `preparations/reweighting.ipynb`.
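The details live in the notebook; conceptually, one possible construction reweights each flavor's (pt, eta) distribution to that of a reference flavor. All names below are illustrative, not the repo's API:
```
import numpy as np

def pt_eta_weights(pt, eta, flavor, pt_bins, eta_bins, reference_flavor=0):
    # Illustrative reweighting: per-jet weights such that every flavor's
    # (pt, eta) distribution matches the reference flavor's.
    ref_hist, _, _ = np.histogram2d(pt[flavor == reference_flavor],
                                    eta[flavor == reference_flavor],
                                    bins=[pt_bins, eta_bins], density=True)
    weights = np.ones(len(pt))
    for fl in np.unique(flavor):
        sel = flavor == fl
        hist, _, _ = np.histogram2d(pt[sel], eta[sel],
                                    bins=[pt_bins, eta_bins], density=True)
        ipt = np.clip(np.digitize(pt[sel], pt_bins) - 1, 0, len(pt_bins) - 2)
        ieta = np.clip(np.digitize(eta[sel], eta_bins) - 1, 0, len(eta_bins) - 2)
        ratio = np.divide(ref_hist, hist, out=np.ones_like(ref_hist), where=hist > 0)
        weights[sel] = ratio[ipt, ieta]
    return weights
```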
### Preprocessing
The chain consists of: calculating scalers (from the train set only, ignoring defaults), applying scalers (do _not_ ignore defaults when applying the scaler; alternative: set them to zero), train/val/test splitting & shuffling, and building sample weights and bins. See `preparations/clean_preprocess.ipynb` for a first working example of the entire preprocessing chain. In addition, `evaluate/tools.py` can later be used to pass information from the preprocessing step to the training and evaluation scripts.
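A condensed sketch of that chain could look like this (illustrative only; the split fractions and function names are assumptions, the notebook is the working example):
```
import numpy as np
import torch
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler

def preprocess(X, y, defaults, seed=0):
    # Shuffle and split first so the scalers only ever see the training set
    X_train, X_tmp, y_train, y_tmp = train_test_split(X, y, test_size=0.4, random_state=seed, shuffle=True)
    X_val, X_test, y_val, y_test = train_test_split(X_tmp, y_tmp, test_size=0.5, random_state=seed)

    scalers = []
    for i in range(X.shape[1]):
        sc = StandardScaler()
        # fit on the train set only, ignoring entries at their default value
        sc.fit(X_train[X_train[:, i] != defaults[i], i].reshape(-1, 1))
        for split in (X_train, X_val, X_test):
            # apply to everything, defaults included (alternative: set them to zero)
            split[:, i] = sc.transform(split[:, i].reshape(-1, 1)).ravel()
        scalers.append(sc)

    tensors = tuple(torch.from_numpy(a).float() for a in (X_train, X_val, X_test))
    return scalers, tensors, (y_train, y_val, y_test)
```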
## Run framework (training, evaluation)
### Training
All relevant scripts are placed inside `training`: standalone training on the current node is done with `training.py`, and for submission to the batch system there are `training.sh` and `submit_training.py`. Both nominal and adversarial training are supported.
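As a rough sketch of how the adversarial variant might be wired into a training loop with the `fgsm_attack` helper shown further below (not the repo's actual training script; model, loader, optimizer and a per-sample criterion with `reduction="none"` are placeholders):
```
import torch

def train_one_epoch(model, criterion, optimizer, loader, adversarial=True, epsilon=0.01, dev="cpu"):
    # fgsm_attack is assumed to be importable from the attack helpers below
    model.train()
    for inputs, targets, weights in loader:
        inputs, targets, weights = inputs.to(dev), targets.to(dev), weights.to(dev)
        if adversarial:
            # perturb the batch with FGSM before the weight update; integer features
            # and default values stay untouched inside fgsm_attack
            inputs = fgsm_attack(epsilon=epsilon, sample=inputs, targets=targets,
                                 thismodel=model, thiscriterion=criterion, dev=dev)
        optimizer.zero_grad()
        loss = (criterion(model(inputs), targets) * weights).mean()  # sample-weighted loss
        loss.backward()
        optimizer.step()
```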
### Evaluation
ROC curves: `evaluate/eval_roc_new.py`. Training history (loss): `evaluate/plot_loss.py`. Tagger outputs and discriminator shapes: `evaluate/eval_discriminator_shapes.py`. Plotting of input variables: `evaluate/eval_inputs.py`.
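For orientation, a ROC curve for one output score can be drawn along these lines (illustrative; `y_true` and `scores` are placeholders for the labels and tagger outputs produced by the evaluation scripts):
```
import matplotlib.pyplot as plt
from sklearn.metrics import auc, roc_curve

def plot_roc(y_true, scores, label):
    # y_true: 1 for the signal flavor, 0 otherwise; scores: tagger output for the signal class
    fpr, tpr, _ = roc_curve(y_true, scores)
    plt.plot(tpr, fpr, label=f"{label} (AUC = {auc(fpr, tpr):.3f})")
    plt.xlabel("signal efficiency")
    plt.ylabel("misidentification rate")
    plt.yscale("log")
    plt.legend()
```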
@@ -0,0 +1,159 @@
```
import numpy as np
import torch

import sys

sys.path.append("/home/um106329/aisafety/jet_flavor_MLPhysics/helpers/")
from tools import defaults_path, preprocessed_path, get_all_scalers, get_all_defaults
from variables import integer_indices, n_input_features, get_wanted_full_indices, all_factor_epsilons

all_scalers = np.array(get_all_scalers())
all_defaults_scaled = np.array(get_all_defaults(scaled=True))
all_defaults = np.array(get_all_defaults(scaled=False))

def apply_noise(sample, magn=1e-2, offset=[0], dev="cpu", filtered_indices=[i for i in range(n_input_features)], restrict_impact=-1):
    # Add Gaussian noise to the (scaled) inputs; integer features and entries at their
    # default value are left untouched, and the impact can be restricted relative to
    # the unscaled original value.
    seed = 0
    np.random.seed(seed)

    if magn == 0:
        return sample
    n_Vars = len(filtered_indices)

    wanted_full_indices = get_wanted_full_indices(filtered_indices)

    scalers = all_scalers[wanted_full_indices]

    defaults_per_variable = all_defaults[wanted_full_indices]
    scaled_defaults_per_variable = all_defaults_scaled[wanted_full_indices]

    device = torch.device(dev)

    with torch.no_grad():
        noise = torch.Tensor(np.random.normal(offset, magn, (len(sample), n_Vars))).to(device)
        xadv = sample + noise

        # use full indices and check if in int. vars. or defaults
        for i in range(n_Vars):
            if wanted_full_indices[i] in integer_indices:
                xadv[:, i] = sample[:, i]
            else:  # non-integer, but might have defaults that should be excluded from the shift
                defaults = sample[:, i].cpu() == scaled_defaults_per_variable[i]
                if torch.sum(defaults) != 0:
                    xadv[:, i][defaults] = sample[:, i][defaults]

                if restrict_impact > 0:
                    original_back = scalers[i].inverse_transform(sample[:, i])
                    difference_back = scalers[i].inverse_transform(xadv[:, i]) - original_back
                    allowed_perturbation = restrict_impact * np.abs(original_back)
                    high_impact = np.abs(difference_back) > allowed_perturbation
                    if np.sum(high_impact) != 0:
                        scaled_back_max_perturbed = torch.from_numpy(original_back[high_impact]) + torch.from_numpy(allowed_perturbation[high_impact]) * torch.sign(noise[high_impact, i])
                        xadv[high_impact, i] = torch.Tensor(scalers[i].transform(scaled_back_max_perturbed.reshape(-1, 1)).flatten())

        return xadv

def fgsm_attack(epsilon=1e-2, sample=None, targets=None, thismodel=None, thiscriterion=None, reduced=True, dev="cpu", filtered_indices=[i for i in range(n_input_features)], restrict_impact=-1):
    # Fast Gradient Sign Method: shift the inputs by epsilon in the direction that
    # increases the loss; integer features and default values can be excluded, and
    # the impact can be restricted relative to the unscaled original value.
    if epsilon == 0:
        return sample
    n_Vars = len(filtered_indices)

    wanted_full_indices = get_wanted_full_indices(filtered_indices)
    scalers = all_scalers[wanted_full_indices]
    defaults_per_variable = all_defaults[wanted_full_indices]
    scaled_defaults_per_variable = all_defaults_scaled[wanted_full_indices]

    device = torch.device(dev)

    xadv = sample.clone().detach()

    # inputs need to be included when calculating gradients
    xadv.requires_grad = True

    # from the undisturbed predictions, both the model and the criterion are already available and can be used here again;
    # it's just that they were each part of a function, so not automatically in the global scope
    if thismodel == None and thiscriterion == None:
        global model
        global criterion

    # forward
    preds = thismodel(xadv)

    loss = thiscriterion(preds, targets).mean()

    thismodel.zero_grad()
    loss.backward()

    with torch.no_grad():
        # get sign of gradient
        dx = torch.sign(xadv.grad.detach())

        # add to sample
        xadv += epsilon * dx

        # remove the impact on selected variables (exclude integers, default values)
        # and limit perturbation based on original value
        if reduced:
            for i in range(n_Vars):
                if wanted_full_indices[i] in integer_indices:
                    xadv[:, i] = sample[:, i]
                    #print('integer index:', wanted_full_indices[i])
                else:  # non-integer, but might have defaults that should be excluded from the shift
                    defaults = sample[:, i].cpu() == scaled_defaults_per_variable[i]
                    if torch.sum(defaults) != 0:
                        xadv[:, i][defaults] = sample[:, i][defaults]

                    if restrict_impact > 0:
                        original_back = scalers[i].inverse_transform(sample[:, i])
                        difference_back = scalers[i].inverse_transform(xadv.detach()[:, i]) - original_back
                        allowed_perturbation = restrict_impact * np.abs(original_back)
                        high_impact = np.abs(difference_back) > allowed_perturbation
                        if np.sum(high_impact) != 0:
                            scaled_back_max_perturbed = torch.from_numpy(original_back) + torch.from_numpy(allowed_perturbation) * dx[:, i]
                            xadv[high_impact, i] = torch.Tensor(scalers[i].transform(scaled_back_max_perturbed[high_impact].reshape(-1, 1)).flatten())

    return xadv.detach()

def syst_var(epsilon=1e-2, sample=None, reduced=True, dev="cpu", filtered_indices=[i for i in range(n_input_features)], restrict_impact=-1, up=True):
    # Coherent systematic-like variation: shift all (non-integer, non-default) features
    # by epsilon times a per-feature factor, in a common up or down direction.
    if epsilon == 0:
        return sample
    n_Vars = len(filtered_indices)

    wanted_full_indices = get_wanted_full_indices(filtered_indices)

    scalers = all_scalers[wanted_full_indices]

    defaults_per_variable = all_defaults[wanted_full_indices]
    scaled_defaults_per_variable = all_defaults_scaled[wanted_full_indices]

    device = torch.device(dev)

    with torch.no_grad():
        # variation in common direction, default is upwards
        systvar = epsilon * torch.Tensor(np.ones((len(sample), n_Vars))).to(device)
        if up == False:
            systvar *= -1.
        # scale by a factor for individual feature
        for i in range(n_Vars):
            systvar[:, i] *= all_factor_epsilons[wanted_full_indices[i]]
        xadv = sample + systvar

        # use full indices and check if in int. vars. or defaults
        for i in range(n_Vars):
            if wanted_full_indices[i] in integer_indices:
                xadv[:, i] = sample[:, i]
            else:  # non-integer, but might have defaults that should be excluded from the shift
                defaults = sample[:, i].cpu() == scaled_defaults_per_variable[i]
                if torch.sum(defaults) != 0:
                    xadv[:, i][defaults] = sample[:, i][defaults]

                if restrict_impact > 0:
                    original_back = scalers[i].inverse_transform(sample[:, i])
                    difference_back = scalers[i].inverse_transform(xadv[:, i]) - original_back
                    allowed_perturbation = restrict_impact * np.abs(original_back)
                    high_impact = np.abs(difference_back) > allowed_perturbation
                    if np.sum(high_impact) != 0:
                        scaled_back_max_perturbed = torch.from_numpy(original_back[high_impact]) + torch.from_numpy(allowed_perturbation[high_impact]) * torch.sign(systvar[high_impact, i])
                        xadv[high_impact, i] = torch.Tensor(scalers[i].transform(scaled_back_max_perturbed.reshape(-1, 1)).flatten())

        return xadv
```