Created by Mariona Carós, Santi Seguí and Jordi Vitrià from the University of Barcelona and Ariadna Just from the Cartographic Institute of Catalonia.
This work is based on our paper (arXiv here), published in the proceedings of the International Conference on Machine Vision Applications (MVA 2023).
Airborne LiDAR systems capture the Earth's surface by generating extensive point clouds whose points are defined mainly by 3D coordinates. However, labeling these points for supervised learning tasks is time-consuming, so there is a need for techniques that can learn from unlabeled data and significantly reduce the number of annotated samples required. In this work, we propose training a self-supervised encoder with Barlow Twins and using it as a pre-trained network for semantic scene segmentation. The experimental results show that our unsupervised pre-training boosts performance once the network is fine-tuned on the supervised task, especially for under-represented categories.
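For readers unfamiliar with Barlow Twins, the following is a minimal NumPy sketch of the objective as described in the original Barlow Twins paper (Zbontar et al., 2021), not the training code of this repository. The function name `barlow_twins_loss`, the weight `lambd=5e-3`, and the `eps` stabilizer are illustrative assumptions; the encoder, the point cloud augmentations, and the hyperparameters used in our experiments are not shown here.

```python
import numpy as np


def barlow_twins_loss(z_a, z_b, lambd=5e-3, eps=1e-12):
    """Barlow Twins objective for two batches of embeddings of shape (N, D).

    z_a and z_b are assumed to be encoder outputs for two augmented views
    of the same N samples. lambd and eps are illustrative defaults.
    """
    n = z_a.shape[0]
    # Standardize each embedding dimension along the batch axis.
    z_a = (z_a - z_a.mean(axis=0)) / (z_a.std(axis=0) + eps)
    z_b = (z_b - z_b.mean(axis=0)) / (z_b.std(axis=0) + eps)
    # Cross-correlation matrix between the two views, shape (D, D).
    c = z_a.T @ z_b / n
    diag = np.diag(c)
    # Invariance term: push diagonal entries towards 1.
    on_diag = np.sum((diag - 1.0) ** 2)
    # Redundancy-reduction term: push off-diagonal entries towards 0.
    off_diag = np.sum(c ** 2) - np.sum(diag ** 2)
    return float(on_diag + lambd * off_diag)
```

The loss approaches zero when the two views produce identical embeddings whose dimensions are mutually decorrelated, which is the behavior the pre-training stage aims for before the encoder is fine-tuned on segmentation.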
Training samples: 45,915
Training samples after semantic deduplication: 20,512
Test samples: 6,710
Link here