Augmented Box Replay: Overcoming Foreground Shift for Incremental Object Detection (ICCV 2023)
In incremental learning, replaying stored samples from previous tasks together with current task samples is one of the most efficient approaches to address catastrophic forgetting. However, unlike incremental classification, image replay has not been successfully applied to incremental object detection (IOD). In this paper, we identify the overlooked problem of foreground shift as the main reason for this. Foreground shift only occurs when replaying images of previous tasks and refers to the fact that their background might contain foreground objects of the current task. To overcome this problem, a novel and efficient Augmented Box Replay (ABR) method is developed that only stores and replays foreground objects and thereby circumvents the foreground shift problem. In addition, we propose an innovative Attentive RoI Distillation loss that uses spatial attention from region-of-interest (RoI) features to constrain the current model to focus on the most important information from the old model. ABR significantly reduces forgetting of previous classes while maintaining high plasticity in current classes. Moreover, it considerably reduces the storage requirements when compared to standard image replay. Comprehensive experiments on the Pascal-VOC and COCO datasets support the state-of-the-art performance of our model.
Illustration of our proposed framework, which highlights the key novelties of Augmented Box Replay (ABR) and Attentive RoI Distillation. ABR fuses prototype object
Please follow the installation instructions provided by Detectron 1, found in install.md.
You can find the Pascal-VOC dataset already in Detectron.
We provide scripts to run the experiments in the paper (JT, FT, ABR and ablations).
You can find three scripts in the scripts/ folder: run_JT.sh, run_MI.sh, and run_SI.sh. run_JT.sh is for the JT runs, run_SI.sh runs the single-incremental-step detection settings (19-1, 15-5, 10-10, 5-15), and run_MI.sh runs the multi-incremental-step detection settings (10-5, 10-2, 15-1, 10-1, 5-5).
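As a quick reference for the split described above, here is a minimal sketch that maps a setting name to the script that covers it. The helper name script_for_setting is hypothetical and not part of this repository; the setting lists are copied from the text above.

```shell
# Hypothetical helper (not part of the repository): print the script that
# covers a given detection setting, based on the lists in this README.
script_for_setting() {
    case "$1" in
        19-1|15-5|10-10|5-15)
            echo "scripts/run_SI.sh" ;;   # single-incremental-step settings
        10-5|10-2|15-1|10-1|5-5)
            echo "scripts/run_MI.sh" ;;   # multi-incremental-step settings
        *)
            echo "unknown setting: $1" >&2
            return 1 ;;
    esac
}

script_for_setting 15-5
script_for_setting 10-2
```

The function only prints a path; it does not launch any training.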
Without specifying any option, the defaults load the Finetune method using Faster-RCNN. You can play with the following parameters to obtain all the results in the paper:
- --feat with options [no, std, ard]: no disables feature distillation, std is the feature distillation employed in Faster-ILOD, and ard is the attentive RoI distillation used in ABR (default: no).
- --inc enables the incremental setting (default: not used).
- --dist_type with options [l2, id, none]: l2 is the distillation used in ILOD, id is the Inclusive Distillation Loss used in our method ABR, and none disables it (default: l2).
- --alpha_inclusive_distillation is a float giving the weight of the inclusive distillation loss (default: 1.). In ABR we vary it over the values [0.1, 0.2, 0.5, 1].
- --beta_attentive_roi_distillation is a float giving the weight of the attentive RoI distillation loss.
- --gamma is a hyperparameter that controls the strength of the regularization of the overall ARD loss. We used 1.0 for ABR.
- --memory_buffer is an int giving the box rehearsal memory size.
- --memory_type with options [mean, random, herding]: mean is the replay strategy in ABR, random chooses box images at random, and herding selects boxes with the herding strategy from iCaRL (default: None).
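To illustrate how these options combine, the sketch below checks a few choices against the allowed values listed above and assembles an ABR-style flag string. The helper is_allowed and the variable abr_flags are hypothetical names introduced for illustration; the values mirror the ABR example later in this README, and the string is only printed, not passed to any training entry point.

```shell
# Hypothetical helper: succeeds if the first argument equals one of the rest.
is_allowed() {
    value=$1
    shift
    for choice in "$@"; do
        [ "$value" = "$choice" ] && return 0
    done
    return 1
}

feat=ard
dist_type=id
memory_type=mean

# Validate against the option lists documented above.
is_allowed "$feat" no std ard || echo "bad --feat value: $feat" >&2
is_allowed "$dist_type" l2 id none || echo "bad --dist_type value: $dist_type" >&2
is_allowed "$memory_type" mean random herding || echo "bad --memory_type value: $memory_type" >&2

# Assemble the flags as a plain string (illustrative only).
abr_flags="--feat $feat --inc --dist_type $dist_type"
abr_flags="$abr_flags --alpha_inclusive_distillation 0.5"
abr_flags="$abr_flags --beta_attentive_roi_distillation 1.0 --gamma 1.0"
abr_flags="$abr_flags --memory_buffer 2000 --memory_type $memory_type"
echo "$abr_flags"
```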
For the Pascal-VOC dataset, the first task usually contains 5, 10, 15, or 19 categories.
So across the different settings, there are only four distinct first tasks to train.
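The count of four can be checked from the setting names themselves. This sketch assumes the usual reading of the notation, in which the number before the dash is the number of first-task categories; the variable names are illustrative.

```shell
# All single- and multi-incremental-step settings listed in this README.
settings="19-1 15-5 10-10 5-15 10-5 10-2 15-1 10-1 5-5"

# Keep the part before the dash (assumed: first-task category count),
# then sort numerically and drop duplicates.
first_task_sizes=$(
    for s in $settings; do
        echo "${s%%-*}"
    done | sort -un | paste -sd ' ' -
)

echo "$first_task_sizes"   # prints: 5 10 15 19
```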
First, choose the corresponding task setting (e.g., task=10-10) in the scripts/run_firststep.sh file according to your needs. To avoid repeated computation for settings that share the same first task, we place the box rehearsal in the output root of the first task. Then run:
bash scripts/run_firststep.sh
Modify scripts/run_SI.sh or scripts/run_MI.sh.
For example, if you want to train ABR under the single-incremental-step setting task=15-5, keep the following lines uncommented:
task=15-5
name=ABR_LR001_BS4_ALPHA05_BETA1_GAMMA1_INIT
exp -t ${task} -n ${name} -s ${step} --feat ard -gamma 1.0 --uce --dist_type id -alpha 0.5 -beta 1.0 -mb 2000 -mt mean -cvd 0,1
Then run:
bash scripts/run_SI.sh
Otherwise, if you want to train the Finetune method, keep the following lines uncommented:
task=15-5
name=Finetune
exp -t ${task} -n ${name} -s ${step} -cvd 0,1
Then run:
bash scripts/run_SI.sh
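To sweep the inclusive distillation weight over the values 0.1, 0.2, 0.5, and 1 mentioned in the parameter list, one option is to generate one exp line per value and paste the lines into scripts/run_SI.sh. The sketch below only prints text and runs nothing. The run-name scheme (alpha with the dot removed) is an assumption modeled on the example name ABR_LR001_BS4_ALPHA05_BETA1_GAMMA1_INIT, and sweep_lines is an illustrative variable name.

```shell
# Sketch: build one exp line per alpha value, modeled on the ABR example above.
task=15-5
sweep_lines=""
for alpha in 0.1 0.2 0.5 1; do
    # Assumed naming scheme: 0.5 -> ALPHA05, 1 -> ALPHA1.
    tag=$(printf '%s' "$alpha" | tr -d '.')
    name="ABR_LR001_BS4_ALPHA${tag}_BETA1_GAMMA1_INIT"
    # \${step} is kept literal so the run script can expand it itself.
    line="exp -t ${task} -n ${name} -s \${step} --feat ard -gamma 1.0 --uce --dist_type id -alpha ${alpha} -beta 1.0 -mb 2000 -mt mean -cvd 0,1"
    sweep_lines="${sweep_lines}${line}
"
done
printf '%s' "$sweep_lines"
```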
@InProceedings{yuyang2023augmented,
title={Augmented Box Replay: Overcoming Foreground Shift for Incremental Object Detection},
author={Liu, Yuyang and Cong, Yang and Goswami, Dipam and Liu, Xialei and van de Weijer, Joost},
booktitle={Proceedings of the IEEE International Conference on Computer Vision (ICCV)},
month={October},
year={2023}
}
Our repository is based on the amazing work of @fcdl94 (MMA) and @CanPeng123 (FasterILOD), and on the maskrcnn-benchmark library. We thank the authors and the contributors of these projects for releasing their code.