Repository of the Causal Invariant Bayesian Neural Network (CIBNN).
Run the following command to install the required packages:
pip install -r requirements.txt
You can install the packages in a virtual environment by running the following commands:
python -m venv venv
source venv/bin/activate
pip install -r requirements.txt
The datasets used in the experiments are not included in the repository due to their size. They are either downloaded automatically when the corresponding dataset is used or must be downloaded separately, as indicated in the following table:
Dataset | Split | Requires download | Link | File ID |
---|---|---|---|---|
MNIST | | auto | torchvision documentation | |
CIFAR10 | | auto | torchvision documentation | |
OFFICEHOME | | auto | project homepage | |
ACRE | IID | yes | project homepage | 1P0WBnnjWolGsrATUQtx4ictiYlOGc-OT |
ACRE | Comp | yes | | 1-LZMt08a1v-KSuaQTS1lqD6BCEw47LEY |
ACRE | Sys | yes | | 1Sn_tKbe6mMv7Tc_y6hJZnm7lSenjwIys |
CONCEPTARC | | auto | GitHub page | |
RAVEN | IID | yes | project homepage | 111swnEzAY2NfZgeyAhVwQujMjRUfeyuY |
RAVEN | OOD | no (requires IID) | | |
Download the data using the following command:
gdown <file_id>
`<file_id>` is the ID of the zip file on Google Drive, indicated in the last column of the table. The zip must be placed in the corresponding `data/<dataset_name>` folder and unzipped:
cd data/<dataset_name>
unzip <file_name>.zip
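For example, to fetch the ACRE IID split (the file ID comes from the table above; the folder name `data/acre` and the output file name `acre_iid.zip` are assumptions, so adapt them to the actual layout of the repository):

gdown 1P0WBnnjWolGsrATUQtx4ictiYlOGc-OT -O data/acre/acre_iid.zip
cd data/acre
unzip acre_iid.zip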
To run the tests, use the following command:
pytest
Make sure the data is properly downloaded before running the tests. The test suite should take a couple of minutes to run, depending on your configuration.
To run the experiments from a config file, use the following command:
python run.py --load_config 'config/<config_file>.yaml' '<model_name>'
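For instance, assuming a config file named `config/cifar10.yaml` and a model registered under the name `cibnn` (both names are illustrative; check the `config` folder and `python run.py -h` for the actual ones):

python run.py --load_config 'config/cifar10.yaml' 'cibnn'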
You can also pass arguments directly to the script:
python run.py --<parameter> '<value>' '<model_name>' --<model_parameter> '<value>'
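For example, a run that sets the dataset and the epoch budget directly on the command line might look like this (the model name `cibnn` and the exact value expected by `--data` are placeholders):

python run.py --data 'CIFAR10' --max_epochs '100' 'cibnn'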
Use the `-h` flag to see the available options:
python run.py -h
To see the model options, use the following command:
python run.py '<model_name>' -h
Common arguments:
- `--load_config`: Load the configuration file.
- `--save_config`: Save the configuration file.
- `--data`: Dataset to use.
- `--save`: Path to save the model.
- `--train`, `--test`, `--train_and_test`: Train, test, or train and test the model. The default option is `train_and_test`.
- `--max_epochs`: Maximum number of epochs. You may want to set this explicitly, as the default value in PyTorch Lightning is 1000.
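As an illustration of the common arguments, a training-only run that caps the number of epochs and sets a save path could look like the following (this assumes `--train` is a boolean flag; the model name `cibnn` and the path format are placeholders):

python run.py --train --data 'MNIST' --max_epochs '50' --save 'checkpoints/mnist_run' 'cibnn'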
To run the experiments from the paper, use the provided config files in the `config` folder. When running an OOD config, replace the `save` parameter in the config file with your own save path. Here are some additional dataset-specific parameters that you can tweak:
Dataset | Parameter | Description |
---|---|---|
CIFAR10 | `--perturbation` | Level of image perturbation. Float between 0.0 and 1.0. |
OFFICEHOME | `--split` | The split to use. IID options: `RealWorld`, `Product`, `Art`, `Clipart`. In OOD settings, write the test split next to the train split, e.g. to test a model on `Product` after training on `RealWorld`, write `realWorld_Product`. |
ACRE | `--split` | The split to use. Options: `IID`, `Comp`, `Sys`. |
RAVEN | `--split` | The split to use. Options: `IID`, `IID_SMALL`, `IID_TRANSFER`, `OOD`, `OOD_SMALL`, `OOD_TRANSFER`. |
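For example, an OOD evaluation on RAVEN could be launched as follows (the model name `cibnn` is a placeholder and the exact dataset identifier accepted by `--data` may differ; check `python run.py '<model_name>' -h`):

python run.py --data 'RAVEN' --split 'OOD' 'cibnn'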
To run a hyperparameter search, use the following command:
python hyp_search.py
You can use the same arguments as in the `run.py` script. Be aware that the Ray Tune trials run in a different working directory: path arguments must be absolute, and the logging files are saved in a different directory (check the Ray Tune documentation for more information).
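For example (the paths and the model name `cibnn` are placeholders; note that both the config path and the save path are absolute, as required by Ray Tune's working-directory behaviour):

python hyp_search.py --load_config '/home/<user>/cibnn/config/cifar10.yaml' --save '/home/<user>/cibnn/results/hyp_search' 'cibnn'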