This repository contains the code and experiments for the paper ["Training objective drives the consistency of representational similarity across datasets"](https://arxiv.org/abs/2411.05561) (arXiv).
The Platonic Representation Hypothesis claims that recent foundation models are converging to a shared representation space as a function of their downstream task performance, irrespective of the objectives and data modalities used to train these models. Representational similarity is generally measured for individual datasets and is not necessarily consistent across datasets. Thus, one may wonder whether this convergence of model representations is confounded by the datasets commonly used in machine learning. Here, we propose a systematic way to measure how representational similarity between models varies with the set of stimuli used to construct the representations. We find that the objective function is the most crucial factor in determining the consistency of representational similarities across datasets. Specifically, self-supervised vision models learn representations whose relative pairwise similarities generalize better from one dataset to another compared to those of image classification or image-text models. Moreover, the correspondence between representational similarities and the models' task behavior is dataset-dependent, being most strongly pronounced for single-domain datasets. Our work provides a framework for systematically measuring similarities of model representations across datasets and linking those similarities to differences in task behavior.
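To make the measurement concrete: a representational similarity metric takes the features of two models on the same set of stimuli and returns a scalar similarity. The sketch below shows linear CKA, one widely used such metric, purely as an illustration; the metrics actually used in our experiments are configured in `scripts/configs/similarity_metric_config_local_global.json`.

```python
# Illustrative implementation of linear CKA (Centered Kernel Alignment),
# one common representational similarity metric; not necessarily the exact
# metric used in the paper's experiments.
import numpy as np

def linear_cka(X, Y):
    """X: (n_samples, d1), Y: (n_samples, d2); rows must correspond to the
    same stimuli for both models."""
    X = X - X.mean(axis=0)  # center the features of each model
    Y = Y - Y.mean(axis=0)
    hsic = np.linalg.norm(Y.T @ X, ord="fro") ** 2
    return hsic / (np.linalg.norm(X.T @ X, "fro") * np.linalg.norm(Y.T @ Y, "fro"))
```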
- 🔧 `sim_consistency/` - Core library code.
- 📜 `scripts/`
  - `configs/`: Configuration files for models and datasets
  - `download_ds/`: Dataset download scripts
  - Scripts for feature extraction, model similarity computation, and linear probe evaluation (see How to Run).
- 📓 `notebooks/` - Jupyter notebooks for analysis and visualization. Each notebook is named after its corresponding paper section and can be used to reproduce our findings 🧪.
The code relies on a specific directory structure for data organization. All paths are configured and created in `scripts/project_location.py`.
```
project_root/
├── datasets/             # Raw and subsetted datasets
├── features/             # Extracted model features
├── model_similarities/   # Representational similarity matrices per dataset, model set, and similarity metric
├── models/               # Trained linear probe models
└── results/              # Evaluation results and experiments
```
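For reference, the setup performed by `scripts/project_location.py` amounts to creating these directories under a configurable root. The snippet below is a hypothetical approximation, with `PROJECT_ROOT` as a placeholder you would adapt:

```python
# Hypothetical sketch of the directory setup done by scripts/project_location.py;
# PROJECT_ROOT is a placeholder for your own storage location.
from pathlib import Path

PROJECT_ROOT = Path("/path/to/project_root")

for subdir in ("datasets", "features", "model_similarities", "models", "results"):
    # Create each top-level directory if it does not exist yet.
    (PROJECT_ROOT / subdir).mkdir(parents=True, exist_ok=True)
```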
- Navigate to the repository root directory.
- Install the package:

  ```
  pip install .
  ```
- Configure the project location as described in the Project structure section: define the paths in `scripts/project_location.py` and run it. This creates the necessary directories.
- Create the webdatasets directory `[PROJECT_ROOT]/datasets/wds`.
- Download the datasets using the script `scripts/download_ds/download_webdatasets.sh`:

  ```
  cd scripts/download_ds
  bash download_webdatasets.sh [PROJECT_ROOT]/datasets/wds
  ```
NOTE: The scripts download datasets from huggingface.co. A Hugging Face account may be required; see the download guide.
Running the script `scripts/feature_extraction.py` extracts features from the models specified in the `models_config` file for the datasets specified in the `datasets` file. The script launches a separate SLURM job for each model and saves the extracted features in the `[PROJECT_ROOT]/features` directory.
```
cd scripts
python feature_extraction.py \
    --models_config ./configs/models_config_wo_alignment.json \
    --datasets ./configs/webdatasets_w_in1k.txt
```
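Conceptually, the per-model extraction step looks like the sketch below; the function and file names are illustrative, and the actual implementation builds on thingsvision and SLURM:

```python
# Illustrative sketch only: the real script uses thingsvision and SLURM.
import numpy as np
import torch

@torch.no_grad()
def extract_features(model, dataloader, device="cuda"):
    """Run a frozen model over a dataset and stack its embeddings."""
    model.eval().to(device)
    chunks = []
    for images, _ in dataloader:
        chunks.append(model(images.to(device)).cpu().numpy())
    return np.concatenate(chunks, axis=0)

# Hypothetical output layout, one array per (model, dataset) pair:
# np.save(f"{PROJECT_ROOT}/features/{dataset}/{model_name}.npy", features)
```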
The computation of the model similarities is a crucial step in our analysis. It consists of two parts: dataset subsampling and model similarity computation. The first part ensures that each dataset has at most 10k samples (see the paper for justification), while the second computes the representational similarities between the models.
- ImageNet-1k:
  - Generate the subset indices:

    ```
    cd scripts
    python generate_imagenet_subset_indices.py
    ```

    This generates the indices for the ImageNet-1k subset datasets and stores them in `[PROJECT_ROOT]/datasets/subset`.
  - Create the new ImageNet-1k subsets by slicing the precomputed features:

    ```
    cd scripts
    python in_subset_extraction.py
    ```

    This creates the ImageNet-1k subsets and stores them in `[PROJECT_ROOT]/datasets/imagenet-subset-{X}k` (`X` indicates the total number of samples).
- Remaining datasets: Run the Jupyter notebook `notebooks/check_wds_sizes_n_get_subsets.ipynb` to check the dataset sizes and create subsets of the datasets if needed. The indices for the subsets are stored in `[PROJECT_ROOT]/datasets/subset`.
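For intuition, subsampling to at most 10k stimuli can be done in a class-balanced way, as in this hypothetical sketch; the scripts and notebook above implement the actual index generation:

```python
# Hypothetical sketch of class-balanced subsampling to at most 10k samples.
import numpy as np

def balanced_subset_indices(labels, max_samples=10_000, seed=0):
    """Pick an equal number of indices per class, up to max_samples total."""
    rng = np.random.default_rng(seed)
    labels = np.asarray(labels)
    classes = np.unique(labels)
    per_class = max_samples // len(classes)
    picked = []
    for c in classes:
        idx = np.flatnonzero(labels == c)
        take = min(per_class, len(idx))  # classes may have fewer samples
        picked.append(rng.choice(idx, size=take, replace=False))
    return np.sort(np.concatenate(picked))
```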
Running the script `scripts/distance_matrix_computation.py` computes the representational similarities between the models for each dataset and similarity metric specified in `scripts/configs/similarity_metric_config_local_global.json`. It saves the computed similarity matrices in the `[PROJECT_ROOT]/model_similarities` directory.
```
cd scripts
python distance_matrix_computation.py \
    --models_config ./configs/models_config_wo_alignment.json \
    --datasets ./configs/webdatasets_w_insub10k.txt
```
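In essence, each stored matrix holds the pairwise similarities of all models on one dataset under one metric. A minimal sketch of how such a matrix can be assembled (names are illustrative; `metric` would be a similarity function such as the `linear_cka` sketch above):

```python
# Illustrative assembly of a symmetric model-similarity matrix.
import numpy as np

def similarity_matrix(feature_dict, metric):
    """feature_dict: model name -> (n_samples, dim) features, rows aligned
    across models; metric: callable mapping two feature arrays to a scalar."""
    names = sorted(feature_dict)
    sim = np.eye(len(names))  # self-similarity on the diagonal
    for i, a in enumerate(names):
        for j in range(i + 1, len(names)):
            sim[i, j] = sim[j, i] = metric(feature_dict[a], feature_dict[names[j]])
    return names, sim
```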
Running the script `scripts/single_model_evaluation.py` trains a linear probe on the extracted features for each model and dataset specified in the `models_config` and `datasets` files, respectively. It saves the trained models in `[PROJECT_ROOT]/models` and the evaluation results in the `[PROJECT_ROOT]` directory.
```
cd scripts
python single_model_evaluation.py \
    --models_config ./configs/models_config_wo_alignment.json \
    --datasets ./configs/webdatasets_w_in1k.txt
```
Note: The script automatically launches separate SLURM jobs for each model to enable parallel processing.
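A linear probe is simply a linear classifier fit on the frozen features. A minimal sketch, assuming feature and label arrays for a train and a test split (the actual script additionally handles hyperparameter selection and job submission):

```python
# Minimal linear-probe sketch on frozen features; inputs are numpy arrays
# of features and integer labels for a train and a test split.
import numpy as np
from sklearn.linear_model import LogisticRegression

def evaluate_linear_probe(train_x, train_y, test_x, test_y):
    probe = LogisticRegression(max_iter=1000)  # multinomial logistic regression
    probe.fit(train_x, train_y)
    return probe.score(test_x, test_y)  # top-1 accuracy
```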
After having extracted the features, computed the model similarities, and trained the linear probes, you can reproduce our results with the following steps:
- Run the aggregation notebooks:
  - All `notebooks/aggregate_*` notebooks store their results in `[PROJECT_ROOT]/results/aggregated/`.
  - 🔥 For consistency computation of specific model set pairs 🔥: `notebooks/aggregate_consistencies_for_specific_model_set_pairs.ipynb`
- Run the section-specific notebooks to generate the figures. The results are saved in `results/plots/`.
- This repository is built using components from thingsvision and the CLIP benchmark.
If you find this work interesting or useful in your research, please cite our paper:
```bibtex
@misc{ciernik2024trainingobjectivedrivesconsistency,
      title={Training objective drives the consistency of representational similarity across datasets},
      author={Laure Ciernik and Lorenz Linhardt and Marco Morik and Jonas Dippel and Simon Kornblith and Lukas Muttenthaler},
      year={2024},
      eprint={2411.05561},
      archivePrefix={arXiv},
      primaryClass={cs.CV},
      url={https://arxiv.org/abs/2411.05561},
}
```
If you have any feedback, questions, or ideas, please feel free to raise an issue in this repository. Alternatively, you can reach out to us directly via email for more in-depth discussions or suggestions.
📧 Contact us: ciernik[at]tu-berlin.de
Thank you for your interest and support!