Swin Transformer V2: Scaling Up Capacity and Resolution #516

tkuri · 2022-10-22T02:10:34Z

Paper summary

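Scales Swin Transformer up to 3 billion parameters and makes it trainable on images at resolutions up to 1,536×1,536. Achieves SOTA on a variety of benchmarks. The model is modified to resolve training instability (reordering of Layer Norm, introduction of cosine attention, etc.). The paper also proposes implementation techniques that substantially reduce GPU memory consumption.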
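To make the cosine attention idea concrete, here is a minimal pure-Python sketch. It assumes the formulation as I read it in the paper: the attention logit is the cosine similarity between query and key divided by a temperature `tau` (clamped to at least 0.01), plus an optional relative position bias. The function names, the default `tau=0.1`, and the plain list-of-lists `bias` are illustrative choices of mine, not taken from the official repository.

```python
import math


def cosine_similarity(q, k, eps=1e-12):
    """Cosine of the angle between vectors q and k (eps guards against zero norms)."""
    dot = sum(a * b for a, b in zip(q, k))
    q_norm = math.sqrt(sum(a * a for a in q))
    k_norm = math.sqrt(sum(b * b for b in k))
    return dot / max(q_norm * k_norm, eps)


def softmax(row):
    """Numerically stable softmax over a list of floats."""
    m = max(row)
    exps = [math.exp(x - m) for x in row]
    total = sum(exps)
    return [e / total for e in exps]


def cosine_attention(queries, keys, values, tau=0.1, bias=None):
    """Single-head attention with cosine-similarity logits.

    tau:  temperature (learnable in a real model; a fixed number here, an assumption).
    bias: optional [num_queries][num_keys] additive bias (hypothetical stand-in
          for a relative position bias).
    """
    tau = max(tau, 0.01)  # keep the temperature from collapsing toward zero
    dim = len(values[0])
    out = []
    for i, q in enumerate(queries):
        logits = [cosine_similarity(q, k) / tau for k in keys]
        if bias is not None:
            logits = [logit + bias[i][j] for j, logit in enumerate(logits)]
        weights = softmax(logits)
        # Weighted sum of value vectors.
        out.append([sum(w * v[d] for w, v in zip(weights, values)) for d in range(dim)])
    return out
```

Because the logits depend only on the angle between query and key, rescaling the queries (or keys) leaves the output unchanged. That bounded range is the property that plausibly helps with the training instability mentioned above, since a few large-magnitude activations can no longer dominate the attention map.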

https://openaccess.thecvf.com/content/CVPR2022/html/Liu_Swin_Transformer_V2_Scaling_Up_Capacity_and_Resolution_CVPR_2022_paper.html

Code

https://github.com/microsoft/Swin-Transformer

Labels (added by tkuri on Oct 22, 2022): Conference: CVPR (Conference on Computer Vision and Pattern Recognition), Year: 2022, Subject: Backbone
