[Feature] Add MaskFormer #2789

Merged
merged 35 commits into from
Mar 1, 2023
Changes from 27 commits
Commits
4790c5d
scratch_1
shiyutang Nov 8, 2022
41fcd07
scratch_2
shiyutang Nov 10, 2022
35ba0d4
forward_align_finished
shiyutang Nov 11, 2022
d2bf25b
update
shiyutang Nov 15, 2022
933b367
backwardaligned_dot2
shiyutang Nov 16, 2022
360f1db
backward_blankin
shiyutang Nov 16, 2022
db3a3e2
train_divergy
shiyutang Nov 17, 2022
7a90e9b
fix_optimizer
shiyutang Nov 18, 2022
0abed97
fix_metric
shiyutang Nov 21, 2022
0e466d4
align_train_load+metric467
shiyutang Nov 21, 2022
080bac2
newest
shiyutang Nov 22, 2022
7de03d8
paddle2.2.2
shiyutang Nov 23, 2022
cc31da5
update_init_aug
shiyutang Nov 29, 2022
65d36ee
rm_redundant
shiyutang Nov 30, 2022
a75a0c3
update
shiyutang Nov 30, 2022
ea94836
update
shiyutang Nov 30, 2022
e05f390
update
shiyutang Dec 1, 2022
6e25d85
update_init
shiyutang Dec 1, 2022
24f8aed
train_with2.4and new aug
shiyutang Dec 1, 2022
4847dc7
validate_on2.4
shiyutang Dec 5, 2022
a3d151a
update_init
shiyutang Dec 13, 2022
d6fad22
add_multiple_backbone+initfree
shiyutang Dec 19, 2022
8738173
Merge branch 'develop' into paddle2.2.2
shiyutang Jan 12, 2023
14933b9
fix_format
shiyutang Jan 12, 2023
72c42c2
Merge branch 'develop' of https://github.com/PaddlePaddle/PaddleSeg i…
shiyutang Jan 12, 2023
3d54334
fix_by_comment_test_train_ok_no_acc
shiyutang Jan 13, 2023
b86a511
Merge branch 'paddle2.2.2' of https://github.com/shiyutang/PaddleSeg …
shiyutang Jan 13, 2023
dbc43c0
valid_47.6
shiyutang Jan 13, 2023
d7479be
validate_train_47.93
shiyutang Jan 19, 2023
c403cad
valid_train_small_50.4
shiyutang Jan 28, 2023
f658d73
fix_by_comment
shiyutang Jan 30, 2023
85c36a3
maskformer_tipc
shiyutang Jan 31, 2023
094d1a2
fix_large_yml
shiyutang Feb 1, 2023
f16a816
compact_train_loss
shiyutang Mar 1, 2023
fb9668a
Merge branch 'develop' into paddle2.2.2
shiyutang Mar 1, 2023
1 change: 0 additions & 1 deletion configs/_base_/ade20k.yml
@@ -26,7 +26,6 @@ val_dataset:
    - type: Normalize
  mode: val


optimizer:
  type: SGD
  momentum: 0.9
17 changes: 17 additions & 0 deletions configs/maskformer/README.md
@@ -0,0 +1,17 @@
# Per-Pixel Classification is Not All You Need for Semantic Segmentation

## Reference

> Cheng, Bowen, Alex Schwing, and Alexander Kirillov. "Per-pixel classification is not all you need for semantic segmentation." Advances in Neural Information Processing Systems 34 (2021): 17864-17875.

## Performance

### ADE20k

| Model | Backbone | Resolution | Training Iters | mIoU | mIoU (flip) | mIoU (ms+flip) | Links |
|:-:|:-:|:-:|:-:|:-:|:-:|:-:|:-:|
|Maskformer-tiny|SwinTransformer|512x512|160000|47.6|-|-|[model](https://bj.bcebos.com/paddleseg/dygraph/ade20k/maskformer_ade20k_swin_tiny/model.pdparams) \| [log](https://bj.bcebos.com/paddleseg/dygraph/ade20k/maskformer_ade20k_swin_tiny/train.log) \| [vdl](https://www.paddlepaddle.org.cn/paddle/visualdl/service/app/scalar?id=e59773eaad87f677837add5ff110441e)|

* MaskFormer supports Swin Transformer backbones in four sizes: tiny, small, base, and large. Because training takes a long time, accuracy is reported only for the tiny backbone.

* Maskformer-Base and Maskformer-Large need to be evaluated with multi-scale and flip augmentation by default.
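
To illustrate the flip half of the multi-scale and flip evaluation mentioned above, here is a minimal pure-Python sketch. It averages a prediction on the image with the un-mirrored prediction on the mirrored image. The names `hflip`, `flip_averaged`, and `toy_predict` are hypothetical, the "model" is a stand-in with a deliberate left-to-right bias, and this is not PaddleSeg's evaluation code. Multi-scale evaluation would repeat the same averaging over resized copies of the image, which is omitted here.

```python
def hflip(grid):
    """Mirror a 2-D list of rows left to right."""
    return [row[::-1] for row in grid]


def flip_averaged(predict, image):
    """Average predict(image) with the un-mirrored prediction on the mirrored image."""
    direct = predict(image)
    mirrored = hflip(predict(hflip(image)))
    return [[(a + b) / 2 for a, b in zip(row_a, row_b)]
            for row_a, row_b in zip(direct, mirrored)]


def toy_predict(image):
    # Hypothetical stand-in for a model: the score is the pixel value
    # plus a bias that grows with the column index.
    return [[pixel + 0.1 * col for col, pixel in enumerate(row)] for row in image]


image = [[1.0, 2.0, 3.0]]
averaged = flip_averaged(toy_predict, image)
print(averaged)
```

In this toy case the column-dependent bias becomes the same for every column after averaging, which is the kind of left/right asymmetry that flip augmentation is meant to smooth out.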
76 changes: 76 additions & 0 deletions configs/maskformer/maskformer_swin_base_ade20k_512x512_160k.yml
@@ -0,0 +1,76 @@
batch_size: 4
iters: 160000

train_dataset:
  type: MaskedADE20K
  dataset_root: data/ADEChallengeData2016/
  transforms:
    - type: ResizeByShort
      short_size: [320, 384, 448, 512, 576, 640, 704, 768, 832, 896, 960, 1024, 1088, 1152, 1216, 1280, 1344]
      max_size: 2560
    - type: RandomPaddingCrop
      crop_size: [640, 640]
    - type: RandomDistort
      brightness_range: 0.125
      brightness_prob: 1.0
      contrast_range: 0.5
      contrast_prob: 1.0
      saturation_range: 0.5
      saturation_prob: 1.0
      hue_range: 18
      hue_prob: 1.0
    - type: RandomHorizontalFlip
    - type: Padding

Review comment (Collaborator):

RandomPaddingCrop is already applied earlier, so Padding should not be needed here.

If the four yml files differ only in the model, one yml file could serve as the base and the other three could include it via __base__.

Reply (Collaborator, Author):

Removed. Besides the model, the training data processing also differs; the configs have now been changed to inherit as much as possible.

      target_size: [640, 640]
      im_padding_value: 128
    - type: Normalize
      mean: [0.485, 0.456, 0.406]
      std: [0.229, 0.224, 0.225]


val_dataset:
  type: MaskedADE20K
  dataset_root: data/ADEChallengeData2016/
  transforms:
    - type: ResizeByShort
      short_size: 512
    - type: Normalize
      mean: [0.485, 0.456, 0.406]
      std: [0.229, 0.224, 0.225]
  mode: val

model:
  type: MaskFormer
  num_classes: 150
  backbone:
    type: SwinTransformer_base_patch4_window7_384_maskformer
    pretrained: https://bj.bcebos.com/paddleseg/paddleseg/dygraph/ade20k/maskformer_ade20k_swin_base/pretrain/model.pdparams

Review comment (Collaborator):

This pretrained still applies to the whole MaskFormer model, not just the backbone, doesn't it?

Reply (Collaborator, Author):

No, these weights contain only the Swin model; the number of parameters loaded, as shown in the training log, confirms this.


optimizer:
  type: AdamW
  weight_decay: 0.01
  custom_cfg:
    - name: backbone
      lr_mult: 1.0
    - name: norm
      weight_decay_mult: 0.0
    - name: relative_position_bias_table
      weight_decay_mult: 0.0
  grad_clip_cfg:
    name: ClipGradByNorm
    clip_norm: 0.01

lr_scheduler:
  type: PolynomialDecay
  warmup_iters: 1500
  warmup_start_lr: 6.0e-11
  learning_rate: 6.0e-05
  end_lr: 0
  power: 0.9

loss:
  types:
    - type: MaskFormerLoss
      num_classes: 150
      eos_coef: 0.1
  coef: [1]
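
The `lr_scheduler` block in the config above combines a linear warmup with polynomial decay. The following Python sketch shows the shape of that curve using the values from the config. It assumes the warmup ramps linearly from `warmup_start_lr` to `learning_rate` over `warmup_iters`, and that the decay then spans the remaining iterations. The function name `lr_at` is hypothetical, and the exact step bookkeeping inside PaddleSeg may differ.

```python
def lr_at(step, total_iters=160000, learning_rate=6.0e-05, end_lr=0.0,
          power=0.9, warmup_iters=1500, warmup_start_lr=6.0e-11):
    """Learning rate at a given iteration (a sketch, not PaddleSeg's code)."""
    if step < warmup_iters:
        # Linear ramp from warmup_start_lr up to learning_rate.
        return warmup_start_lr + (learning_rate - warmup_start_lr) * step / warmup_iters
    # Assumption: polynomial decay runs over the iterations left after warmup.
    decay_steps = total_iters - warmup_iters
    progress = min(step - warmup_iters, decay_steps) / decay_steps
    return (learning_rate - end_lr) * (1.0 - progress) ** power + end_lr


for step in (0, 750, 1500, 80000, 160000):
    print(step, lr_at(step))
```

With `power: 0.9` the decay is close to linear, so the rate falls steadily from its peak at the end of warmup to `end_lr` at the final iteration.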
75 changes: 75 additions & 0 deletions configs/maskformer/maskformer_swin_large_ade20k_512x512_160k.yml
@@ -0,0 +1,75 @@
batch_size: 4
iters: 160000

train_dataset:
  type: MaskedADE20K
  dataset_root: data/ADEChallengeData2016/
  transforms:
    - type: ResizeByShort
      short_size: [320, 384, 448, 512, 576, 640, 704, 768, 832, 896, 960, 1024, 1088, 1152, 1216, 1280, 1344]
      max_size: 2560
    - type: RandomPaddingCrop
      crop_size: [640, 640]
    - type: RandomDistort
      brightness_range: 0.125
      brightness_prob: 1.0
      contrast_range: 0.5
      contrast_prob: 1.0
      saturation_range: 0.5
      saturation_prob: 1.0
      hue_range: 18
      hue_prob: 1.0
    - type: RandomHorizontalFlip
    - type: Padding
      target_size: [640, 640]
      im_padding_value: 128
    - type: Normalize
      mean: [0.485, 0.456, 0.406]
      std: [0.229, 0.224, 0.225]

val_dataset:
  type: MaskedADE20K
  dataset_root: data/ADEChallengeData2016/
  transforms:
    - type: ResizeByShort
      short_size: 512
    - type: Normalize
      mean: [0.485, 0.456, 0.406]
      std: [0.229, 0.224, 0.225]
  mode: val

model:
  type: MaskFormer
  num_classes: 150
  backbone:
    type: SwinTransformer_large_patch4_window7_384_maskformer
    pretrained: https://bj.bcebos.com/paddleseg/paddleseg/dygraph/ade20k/maskformer_ade20k_swin_large/pretrain/model.pdparams

optimizer:
  type: AdamW
  weight_decay: 0.01
  custom_cfg:
    - name: backbone
      lr_mult: 1.0
    - name: norm
      weight_decay_mult: 0.0
    - name: relative_position_bias_table
      weight_decay_mult: 0.0
  grad_clip_cfg:
    name: ClipGradByNorm
    clip_norm: 0.01

lr_scheduler:
  type: PolynomialDecay
  warmup_iters: 1500
  warmup_start_lr: 6.0e-11
  learning_rate: 6.0e-05
  end_lr: 0
  power: 0.9

loss:
  types:
    - type: MaskFormerLoss
      num_classes: 150
      eos_coef: 0.1
  coef: [1]
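
The `grad_clip_cfg` entry in the config above selects `ClipGradByNorm` with `clip_norm: 0.01`. In Paddle, `paddle.nn.ClipGradByNorm` limits the L2 norm of each gradient tensor separately, unlike `ClipGradByGlobalNorm`, which uses one norm across all tensors. The pure-Python sketch below shows the per-tensor rule on a flat list of numbers. The helper name `clip_by_norm` is hypothetical, and how PaddleSeg passes `grad_clip_cfg` to the optimizer is not shown in this diff.

```python
import math


def clip_by_norm(grad, clip_norm=0.01):
    """Rescale one gradient vector so its L2 norm is at most clip_norm."""
    norm = math.sqrt(sum(g * g for g in grad))
    if norm <= clip_norm:
        # Already within the limit: leave the gradient unchanged.
        return list(grad)
    scale = clip_norm / norm
    return [g * scale for g in grad]


small = clip_by_norm([0.003, 0.004])  # norm 0.005, below the limit
large = clip_by_norm([3.0, 4.0])      # norm 5.0, rescaled down to the limit
print(small, large)
```

Because the direction of each gradient is preserved and only its length is capped, a value as small as 0.01 mainly bounds the step contributed by any single tensor.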
75 changes: 75 additions & 0 deletions configs/maskformer/maskformer_swin_small_ade20k_512x512_160k.yml
@@ -0,0 +1,75 @@
batch_size: 4
iters: 160000

train_dataset:
  type: MaskedADE20K
  dataset_root: data/ADEChallengeData2016/
  transforms:
    - type: ResizeByShort
      short_size: [256, 307, 358, 409, 460, 512, 563, 614, 665, 716, 768, 819, 870, 921, 972, 1024]
      max_size: 2048
    - type: RandomPaddingCrop
      crop_size: [512, 512]
    - type: RandomDistort
      brightness_range: 0.125
      brightness_prob: 1.0
      contrast_range: 0.5
      contrast_prob: 1.0
      saturation_range: 0.5
      saturation_prob: 1.0
      hue_range: 18
      hue_prob: 1.0
    - type: RandomHorizontalFlip
    - type: Padding
      target_size: [512, 512]
      im_padding_value: 128
    - type: Normalize
      mean: [0.485, 0.456, 0.406]
      std: [0.229, 0.224, 0.225]

val_dataset:
  type: MaskedADE20K
  dataset_root: data/ADEChallengeData2016/
  transforms:
    - type: ResizeByShort
      short_size: 512
    - type: Normalize
      mean: [0.485, 0.456, 0.406]
      std: [0.229, 0.224, 0.225]
  mode: val

model:
  type: MaskFormer
  num_classes: 150
  backbone:
    type: SwinTransformer_small_patch4_window7_224_maskformer
    pretrained: https://bj.bcebos.com/paddleseg/paddleseg/dygraph/ade20k/maskformer_ade20k_swin_small/pretrain/model.pdparams

optimizer:
  type: AdamW
  weight_decay: 0.01
  custom_cfg:
    - name: backbone
      lr_mult: 1.0
    - name: norm
      weight_decay_mult: 0.0
    - name: relative_position_bias_table
      weight_decay_mult: 0.0
  grad_clip_cfg:
    name: ClipGradByNorm
    clip_norm: 0.01

lr_scheduler:
  type: PolynomialDecay
  warmup_iters: 1500
  warmup_start_lr: 6.0e-11
  learning_rate: 6.0e-05
  end_lr: 0
  power: 0.9

loss:
  types:
    - type: MaskFormerLoss
      num_classes: 150
      eos_coef: 0.1
  coef: [1]
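
The `short_size` lists in these configs follow a simple pattern: each entry is the crop size scaled by a factor from 0.5 upward in steps of 0.1 (up to 2.0 for the 512 configs, 2.1 for the 640 configs), and `max_size` is four times the crop size. The Python sketch below regenerates both lists and shows one plausible resize rule. The function names are hypothetical, and the cap-the-long-side behaviour is an assumption modelled on common shortest-edge resizing, so the rounding in PaddleSeg's `ResizeByShort` may differ.

```python
import random


def short_size_candidates(crop, lo=5, hi=20):
    """Short-side choices from lo/10 to hi/10 of the crop size, in steps of 0.1."""
    return [x * crop // 10 for x in range(lo, hi + 1)]


def resized_shape(height, width, short_size, max_size):
    """Scale so the short side equals short_size, capping the long side at max_size."""
    scale = short_size / min(height, width)
    if max(height, width) * scale > max_size:
        # Assumption: the long side wins when both limits cannot be met.
        scale = max_size / max(height, width)
    return round(height * scale), round(width * scale)


sizes_512 = short_size_candidates(512)         # list used by the tiny and small configs
sizes_640 = short_size_candidates(640, hi=21)  # list used by the base and large configs
short = random.choice(sizes_512)               # one size is drawn per training sample
print(short, resized_shape(480, 640, short, max_size=2048))
```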
75 changes: 75 additions & 0 deletions configs/maskformer/maskformer_swin_tiny_ade20k_512x512_160k.yml
@@ -0,0 +1,75 @@
batch_size: 2
iters: 160000

train_dataset:
  type: MaskedADE20K
  dataset_root: data/ADEChallengeData2016/
  transforms:
    - type: ResizeByShort
      short_size: [256, 307, 358, 409, 460, 512, 563, 614, 665, 716, 768, 819, 870, 921, 972, 1024]
      max_size: 2048
    - type: RandomPaddingCrop
      crop_size: [512, 512]
    - type: RandomDistort
      brightness_range: 0.125
      brightness_prob: 1.0
      contrast_range: 0.5
      contrast_prob: 1.0
      saturation_range: 0.5
      saturation_prob: 1.0
      hue_range: 18
      hue_prob: 1.0
    - type: RandomHorizontalFlip
    - type: Padding
      target_size: [512, 512]
      im_padding_value: 128
    - type: Normalize
      mean: [0.485, 0.456, 0.406]
      std: [0.229, 0.224, 0.225]

val_dataset:
  type: MaskedADE20K
  dataset_root: data/ADEChallengeData2016/
  transforms:
    - type: ResizeByShort
      short_size: 512
    - type: Normalize
      mean: [0.485, 0.456, 0.406]
      std: [0.229, 0.224, 0.225]
  mode: val

model:
  type: MaskFormer
  num_classes: 150
  backbone:
    type: SwinTransformer_tiny_patch4_window7_224_maskformer
    pretrained: https://bj.bcebos.com/paddleseg/paddleseg/dygraph/ade20k/maskformer_ade20k_swin_tiny/pretrain/model.pdparams

optimizer:
  type: AdamW
  weight_decay: 0.01
  custom_cfg:
    - name: backbone
      lr_mult: 1.0
    - name: norm
      weight_decay_mult: 0.0
    - name: relative_position_bias_table
      weight_decay_mult: 0.0
  grad_clip_cfg:
    name: ClipGradByNorm
    clip_norm: 0.01

lr_scheduler:
  type: PolynomialDecay
  warmup_iters: 1500
  warmup_start_lr: 6.0e-11
  learning_rate: 6.0e-05
  end_lr: 0
  power: 0.9

loss:
  types:
    - type: MaskFormerLoss
      num_classes: 150
      eos_coef: 0.1
  coef: [1]
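
The four configs above differ in only a few keys. For example, the small config differs from the tiny one only in `batch_size` and the backbone entries, which is what the review comment about base-file inheritance is pointing at. The Python sketch below shows the general idea as a recursive dictionary merge. The function `merge_config` and the trimmed dictionaries are hypothetical illustrations, and PaddleSeg's own handling of inherited configs may differ in details such as how lists are treated.

```python
import copy


def merge_config(base, override):
    """Return base with override applied; nested dicts merge, other values replace."""
    merged = copy.deepcopy(base)
    for key, value in override.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            merged[key] = merge_config(merged[key], value)
        else:
            merged[key] = copy.deepcopy(value)
    return merged


# Trimmed-down stand-in for the tiny config.
tiny = {
    "batch_size": 2,
    "iters": 160000,
    "model": {
        "type": "MaskFormer",
        "num_classes": 150,
        "backbone": {"type": "SwinTransformer_tiny_patch4_window7_224_maskformer"},
    },
}

# Only the keys that differ in the small config (pretrained URL omitted for brevity).
small_override = {
    "batch_size": 4,
    "model": {"backbone": {"type": "SwinTransformer_small_patch4_window7_224_maskformer"}},
}

small = merge_config(tiny, small_override)
print(small)
```

Keeping only the overrides in each derived file means a change to shared settings, such as the optimizer or loss, has to be made in one place rather than four.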