Challenge Summary

Participants in this challenge are asked to develop a multi-modal model that estimates the construction year of a given building from three input modalities: street-view imagery, very-high-resolution (VHR) top-view imagery, and medium-resolution Sentinel-2 (S2) imagery. Street-view imagery is missing for half of the test set, so the solution must handle a missing modality.

My solution consists of two classification models: type I, which is trained and run on all modalities, and type II, which is trained on all modalities but run only on top-view imagery at inference time. The second model is inspired by the Shared-Specific Feature Modelling approach in [3].
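
As a rough illustration of how the two models could be combined at test time, the sketch below routes each building to the type I model when street-view imagery is available and to the type II model otherwise. The function name `predict_building`, the `Predictor` alias, and the argument order are hypothetical and not taken from this repository. The sketch also assumes that the "top-view" inputs of the type II model cover both the VHR and the Sentinel-2 images.

```python
from typing import Callable, Optional, Sequence

# Hypothetical aliases: any array-like image representation would work here.
Image = Sequence[float]
Predictor = Callable[..., int]


def predict_building(
    top_view: Image,
    sentinel2: Image,
    street_view: Optional[Image],
    type1_model: Predictor,
    type2_model: Predictor,
) -> int:
    """Return a predicted construction-period class index for one building."""
    if street_view is not None:
        # All three modalities are present: use the type I model.
        return type1_model(street_view, top_view, sentinel2)
    # Street view is missing: fall back to the type II model,
    # which (by assumption) needs only the top-view inputs at inference.
    return type2_model(top_view, sentinel2)
```

Keeping the routing outside the models means each one can be trained and evaluated separately, and the half of the test set without street-view imagery never reaches the type I model.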

🎛 Development environment


mamba env create --file environment.yml
mamba activate ai4eo

💎 References


  1. Model Fusion for Building Type Classification from Aerial and Street View Images
  2. Multi-modal fusion of satellite and street-view images for urban village classification based on a dual-branch deep neural network
  3. Multi-Modal Learning With Missing Modality via Shared-Specific Feature Modelling