【Hackathon 5th No.69】Classification large model: human vision task SOLIDER #2995

Merged 11 commits on Oct 18, 2023
27 changes: 27 additions & 0 deletions docs/zh_CN/models/LUPerson/solider.md
@@ -0,0 +1,27 @@
# SOLIDER

-----
## Contents

- [1. Model Introduction](#1)
- [2. Aligned Logs and Models](#2)

<a name='1'></a>

## 1. Model Introduction

SOLIDER is a semantic controllable self-supervised learning framework that learns general human representations from large amounts of unlabeled human images, so as to maximally benefit downstream human-centric tasks. Unlike existing self-supervised learning methods, it exploits prior knowledge in human images to build pseudo semantic labels and imports more semantic information into the learned representation. Meanwhile, different downstream tasks usually require different ratios of semantic information to appearance information, and a single learned representation cannot satisfy all of them. To solve this problem, SOLIDER introduces a conditional network with a semantic controller, which can meet the different needs of downstream tasks. [Paper](https://arxiv.org/abs/2303.17602).
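As a rough intuition for how a semantic controller trades off the two kinds of information, here is a toy linear blend; the real SOLIDER controller is a learned conditional network, and all names below are illustrative, not part of the framework:

```python
import numpy as np

def blend_representation(semantic_feat, appearance_feat, lam):
    """Toy stand-in for SOLIDER's semantic controller: lam in [0, 1]
    sets the ratio of semantic to appearance information kept in the
    output representation (larger lam keeps more semantic information)."""
    assert 0.0 <= lam <= 1.0
    return lam * semantic_feat + (1.0 - lam) * appearance_feat

# A person re-ID style task tends to want more appearance information,
# so a downstream task might request a small lam.
feat = blend_representation(np.ones(4), np.zeros(4), lam=0.25)
print(feat)  # → [0.25 0.25 0.25 0.25]
```

The point of the conditional design is that one pretrained backbone can serve tasks with different semantic/appearance needs by varying the controller input at inference time, rather than retraining a separate representation per task.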

<a name='2'></a>

## 2. Aligned Logs and Models

| model | weight | log |
| ----------------------------- | ------------------------------------------------------------ | ------------------------------------------------------------ |
Collaborator:

I noticed that this link name is wrong; I have changed SOILDER to SOLIDER, so please update it here accordingly.

| swin_tiny_patch4_window7_224 | https://paddleclas.bj.bcebos.com/models/SOILDER/SwinTransformer_tiny_patch4_window7_224_pretrained.pdparams | Link: https://pan.baidu.com/s/1W5zUFboMMhXETy4HEWbM3Q?pwd=45nx <br/>Extraction code: 45nx |
| swin_small_patch4_window7_224 | https://paddleclas.bj.bcebos.com/models/SOILDER/SwinTransformer_small_patch4_window7_224_pretrained.pdparams | Link: https://pan.baidu.com/s/1sqcUdfv6FyhW9_QgxBUPWA?pwd=letv <br/>Extraction code: letv |
| swin_base_patch4_window7_224 | https://paddleclas.bj.bcebos.com/models/SOILDER/SwinTransformer_base_patch4_window7_224_pretrained.pdparams | Link: https://pan.baidu.com/s/1S2TgDxDRa72C_3FrP8duiA?pwd=u3d2 <br/>Extraction code: u3d2 |

[1]: Pretrained on the LUPerson dataset

<a name='3'></a>
3 changes: 3 additions & 0 deletions ppcls/arch/backbone/__init__.py
@@ -91,6 +91,9 @@
from .variant_models.pp_lcnetv2_variant import PPLCNetV2_base_ShiTu
from .variant_models.efficientnet_variant import EfficientNetB3_watermark
from .variant_models.foundation_vit_variant import CLIP_large_patch14_224_aesthetic
from .variant_models.swin_transformer_variant import SwinTransformer_tiny_patch4_window7_224_SOLIDER
Collaborator:

Please do all of these imports in one statement.

from .variant_models.swin_transformer_variant import SwinTransformer_small_patch4_window7_224_SOLIDER
from .variant_models.swin_transformer_variant import SwinTransformer_base_patch4_window7_224_SOLIDER
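The three imports above, consolidated into one statement as the reviewer suggests (a sketch against the module path used in this PR):

```python
from .variant_models.swin_transformer_variant import (
    SwinTransformer_tiny_patch4_window7_224_SOLIDER,
    SwinTransformer_small_patch4_window7_224_SOLIDER,
    SwinTransformer_base_patch4_window7_224_SOLIDER,
)
```

The parenthesized form keeps one import per symbol line while touching the module path only once, which is the convention Python style guides recommend for long multi-name imports.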
from .model_zoo.adaface_ir_net import AdaFace_IR_18, AdaFace_IR_34, AdaFace_IR_50, AdaFace_IR_101, AdaFace_IR_152, AdaFace_IR_SE_50, AdaFace_IR_SE_101, AdaFace_IR_SE_152, AdaFace_IR_SE_200
from .model_zoo.wideresnet import WideResNet
from .model_zoo.uniformer import UniFormer_small, UniFormer_small_plus, UniFormer_small_plus_dim64, UniFormer_base, UniFormer_base_ls
18 changes: 11 additions & 7 deletions ppcls/arch/backbone/legendary_models/swin_transformer.py
@@ -359,12 +359,8 @@ def __init__(self,
self.window_size = window_size
self.shift_size = shift_size
self.mlp_ratio = mlp_ratio
if min(self.input_resolution) <= self.window_size:
# if window size is larger than input resolution, we don't partition windows
self.shift_size = 0
self.window_size = min(self.input_resolution)
assert 0 <= self.shift_size < self.window_size, "shift_size must in 0-window_size"

self.check_condition()
self.norm1 = norm_layer(dim)
self.attn = WindowAttention(
dim,
@@ -412,6 +408,13 @@ def __init__(self,

self.register_buffer("attn_mask", attn_mask)

def check_condition(self):
if min(self.input_resolution) <= self.window_size:
# if window size is larger than input resolution, we don't partition windows
self.shift_size = 0
self.window_size = min(self.input_resolution)
assert 0 <= self.shift_size < self.window_size, "shift_size must in 0-window_size"

def forward(self, x):
H, W = self.input_resolution
B, L, C = x.shape
@@ -835,7 +838,8 @@ def _load_pretrained(pretrained,
model_url,
use_ssld=False,
use_imagenet22k_pretrained=False,
use_imagenet22kto1k_pretrained=False):
use_imagenet22kto1k_pretrained=False,
**kwargs):
if pretrained is False:
pass
elif pretrained is True:
@@ -988,4 +992,4 @@ def SwinTransformer_large_patch4_window12_384(
use_ssld=use_ssld,
use_imagenet22k_pretrained=use_imagenet22k_pretrained,
use_imagenet22kto1k_pretrained=use_imagenet22kto1k_pretrained)
return model
return model
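The `check_condition` logic factored out in this diff can be exercised standalone; below is a minimal sketch where `clamp_window` is an illustrative name, not part of the PR:

```python
def clamp_window(input_resolution, window_size, shift_size):
    """Mirror of SwinTransformerBlock.check_condition: if the window is at
    least as large as the smaller input dimension, disable the cyclic shift
    and shrink the window to fit, so windows are never partitioned."""
    if min(input_resolution) <= window_size:
        shift_size = 0
        window_size = min(input_resolution)
    assert 0 <= shift_size < window_size, \
        "shift_size must be in [0, window_size)"
    return window_size, shift_size

# A 7x7 window on a 4x4 feature map collapses to a 4x4 window with no shift.
print(clamp_window((4, 4), 7, 3))  # → (4, 0)
# A 7x7 window on an 8x8 feature map is left unchanged.
print(clamp_window((8, 8), 7, 3))  # → (7, 3)
```

Moving this check into a named method lets variant models (such as the SOLIDER backbones added in this PR) override the window/shift handling without duplicating the rest of `__init__`.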
3 changes: 3 additions & 0 deletions ppcls/arch/backbone/variant_models/__init__.py
@@ -2,3 +2,6 @@
from .vgg_variant import VGG19Sigmoid
from .pp_lcnet_variant import PPLCNet_x2_5_Tanh
from .pp_lcnetv2_variant import PPLCNetV2_base_ShiTu
from .swin_transformer_variant import SwinTransformer_base_patch4_window7_224_SOLIDER
Yang-Changhui marked this conversation as resolved.
from .swin_transformer_variant import SwinTransformer_small_patch4_window7_224_SOLIDER
from .swin_transformer_variant import SwinTransformer_tiny_patch4_window7_224_SOLIDER