
FedPylot: Navigating Federated Learning for Real-Time Object Detection in Internet of Vehicles


This README is currently under construction 🚧.

Table of Contents

🔥 What's New

This is the initial release! 🤗

📖 Introduction

Official repository of the paper "FedPylot: Navigating Federated Learning for Real-Time Object Detection in Internet of Vehicles" (Quéméneur & Cherkaoui, 2024, arXiv:2406.03611).

For questions or inquiries about this program, please contact cyprien.quemeneur@protonmail.com.

🐍 Installation

We encourage installing FedPylot both locally and on your compute cluster, as a local environment is better suited for preparing the data and can help with prototyping.

git clone https://github.com/CyprienQuemeneur/fedpylot.git

To install the necessary packages in your local virtual environment run:

pip install -r requirements.txt

Installing all the packages on your cluster can come with some subtleties; we advise referring to your cluster's documentation for package installation and loading.

⚙️ Data Preparation

We used two publicly available autonomous driving datasets in our paper: the 2D object detection subset of the KITTI Vision Benchmark Suite and nuImages, an extension of nuScenes dedicated to 2D object detection.

Preparing the data involves both converting the annotations to the YOLO format and splitting the samples among the federated participants. In our experiments, we assume that the server holds a separate validation set and is responsible for evaluating the global model.

Data preparation should ideally be run locally. Splitting the original dataset will create a folder for each federated participant (server and clients), which will contain that participant's samples and labels. Archiving the folders before sending them to the cluster is recommended and can be performed automatically by the preparation scripts (intermediate folders are not deleted, so ensure you have enough disk space). A good way to securely and reliably transfer a large volume of data to the cloud is to use a tool such as Globus.
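For reference, a YOLO label file contains one line per object in the form `class x_center y_center width height`, with coordinates normalized by the image dimensions. The snippet below is only a minimal illustration of that conversion for a single pixel-space box; the preparation scripts already handle this for you.

```python
def bbox_to_yolo(class_id, left, top, right, bottom, img_w, img_h):
    """Convert a pixel-space box (left, top, right, bottom) into a normalized YOLO label line."""
    x_center = (left + right) / 2 / img_w
    y_center = (top + bottom) / 2 / img_h
    width = (right - left) / img_w
    height = (bottom - top) / img_h
    return f"{class_id} {x_center:.6f} {y_center:.6f} {width:.6f} {height:.6f}"

# Example: a box spanning pixels (200, 150) to (500, 350) in a 1242x375 KITTI image
print(bbox_to_yolo(0, 200, 150, 500, 350, 1242, 375))
```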

KITTI

First, go to https://www.cvlibs.net/datasets/kitti/eval_object.php?obj_benchmark=2d and create an account to download the 2D object detection subset of KITTI. You will need the data samples ("left color images of object data set", 12 GB) and the labels ("training labels of object data set", 5 MB); unzip the files in the datasets subfolder of this program.

By default, 25% of the training data is sent to the central server, as KITTI does not feature a predefined validation set. For the remaining data, we perform a balanced and IID split among 5 clients. The DontCare attribute is ignored. The random seed is fixed so that splitting is reproducible. To perform both the split and the annotation conversion, run the following:

python datasets/prepare_kitti.py --tar

If you wish to modify our splitting strategy, simply edit prepare_kitti.py.
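For instance, the default strategy described above boils down to a seeded shuffle, a 25% hold-out for the server, and an even split of the remainder among the clients. The sketch below illustrates the idea; it is not the actual code of prepare_kitti.py.

```python
import random

def split_kitti(image_ids, n_clients=5, server_frac=0.25, seed=0):
    """Deterministically shuffle, hold out a server share, then split the rest evenly (IID)."""
    rng = random.Random(seed)
    ids = list(image_ids)
    rng.shuffle(ids)
    n_server = int(len(ids) * server_frac)
    server_ids, pool = ids[:n_server], ids[n_server:]
    clients = [pool[i::n_clients] for i in range(n_clients)]
    return server_ids, clients

# KITTI's 2D object detection training set contains 7481 images
server_ids, clients = split_kitti([f"{i:06d}" for i in range(7481)])
print(len(server_ids), [len(c) for c in clients])
```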

nuImages

Go to https://nuscenes.org/nuimages and create an account, then download the samples and metadata (sweeps are not needed), and unzip the files in the datasets subfolder of this program. Unlike KITTI, nuImages is organized as a relational database, and we will use the nuscenes-devkit to manipulate the files. For the devkit to work properly, you need to create a nuimages folder and move the folders corresponding to the samples and labels to it. The folder structure should then be the following:

/datasets/nuimages
    samples     -  Sensor data for keyframes (annotated images).
    v1.0-train  -  JSON tables that include all the metadata and annotations for the training set.
    v1.0-val    -  JSON tables that include all the metadata and annotations for the validation set.
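Once the folders are in place, a quick sanity check with the devkit (assuming nuscenes-devkit is installed, e.g. `pip install nuscenes-devkit`) might look like this:

```python
from nuimages import NuImages

# Load the training split metadata from datasets/nuimages
nuim = NuImages(dataroot='datasets/nuimages', version='v1.0-train', lazy=True, verbose=True)
print(f"{len(nuim.sample)} keyframe samples, {len(nuim.object_ann)} object annotations")
```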

The predefined validation set of nuImages is stored on the server, while the training data is split non-IID among 10 clients based on the locations and timeframes at which the data samples were captured.
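As an illustration of the idea behind this non-IID partitioning (not the exact logic of prepare_nuimages.py, and assuming the nuImages log table exposes location and date_captured fields as in nuScenes), samples could be grouped by the log they originate from before the groups are assigned to clients:

```python
from collections import defaultdict
from nuimages import NuImages

nuim = NuImages(dataroot='datasets/nuimages', version='v1.0-train', lazy=True, verbose=False)

# Group keyframe samples by the location and capture date of their source log
groups = defaultdict(list)
for sample in nuim.sample:
    log = nuim.get('log', sample['log_token'])
    groups[(log['location'], log['date_captured'])].append(sample['token'])

print(f"{len(groups)} location/date groups to distribute among the clients")
```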

Run the following to create the split which retains only 10 classes based on the nuScenes competition:

python datasets/prepare_nuimages.py --class-map 10 --tar

And the following to retain the full long-tail distribution with 23 classes:

python datasets/prepare_nuimages.py --class-map 23 --tar

If you wish to modify our splitting strategy, simply edit prepare_nuimages.py.

🚀 Running a Job

We provide template job scripts for the centralized and the federated settings, assuming the cluster supports the Slurm Workload Manager. We use official YOLOv7 weights pre-trained on MS COCO to initialize an experiment. Downloading the appropriate weights is normally performed by the script that launches the job, but you need to do it manually if Internet connections are not available on the compute nodes of your cluster. FedPylot supports all YOLOv7 variants. For example, to download pre-trained weights for YOLOv7-tiny, run the following:

bash weights/get_weights.sh yolov7-tiny

To launch a federated experiment, you will need to modify run_federated.sh to fit your cluster's requirements and choose the experimental settings, then run the command:

sbatch run_federated.sh

Similarly, to perform centralized learning, edit run_centralized.sh and then execute:

sbatch run_centralized.sh

In all cases, the data are copied to the local storage of the node(s) before training begins. For the federated setting, this is performed with a separate MPI script scatter_data.py, which ensures that the local datasets are dispatched to the appropriate federated participants.
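For reference, the core pattern behind such a dispatch is a standard MPI scatter from rank 0 (the server). The sketch below uses mpi4py with hypothetical archive names and is not the actual scatter_data.py:

```python
from mpi4py import MPI

comm = MPI.COMM_WORLD
rank = comm.Get_rank()

# Rank 0 (the server) prepares one archive name per participant, including itself
if rank == 0:
    archives = ['server.tar'] + [f'client{i}.tar' for i in range(1, comm.Get_size())]
else:
    archives = None

# Each rank receives the name of the archive it should extract to node-local storage
my_archive = comm.scatter(archives, root=0)
print(f"Rank {rank} handles {my_archive}")
```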

🎓 Citation

If you find FedPylot useful in your research or applications, please consider giving us a star 🌟 and citing our paper.

@article{fedpylot2024,
      title = {{FedPylot}: Navigating Federated Learning for Real-Time Object Detection in {Internet} of {Vehicles}}, 
      author = {Quéméneur, Cyprien and Cherkaoui, Soumaya},
      journal = {arXiv preprint arXiv:2406.03611},
      year = {2024}
}

🤝 Acknowledgements

We sincerely thank the authors of YOLOv7 for providing their code to the community!

📜 License

FedPylot is released under the GPL-3.0 License.