## HaloNet - Pytorch

Implementation of the attention layer from the paper, Scaling Local Self-Attention For Parameter Efficient Visual Backbones. This repository houses only the attention layer, not the full HaloNet backbone.
## Install

```bash
$ pip install halonet-pytorch
```
## Usage

```python
import torch
from halonet_pytorch import HaloAttention

attn = HaloAttention(
    dim = 512,         # dimension of feature map
    block_size = 8,    # neighborhood block size (feature map height and width must be divisible by this)
    halo_size = 4,     # halo size (block receptive field)
    dim_head = 64,     # dimension of each head
    heads = 4          # number of attention heads
).cuda()

fmap = torch.randn(1, 512, 32, 32).cuda()
attn(fmap) # (1, 512, 32, 32)
```
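Since the layer preserves the spatial shape of its input, it can be dropped into an existing convolutional stage. Below is a minimal sketch of wrapping it in a residual block; `HaloBlock`, the batch norm placement, and the chosen hyperparameters are illustrative assumptions, not part of this repository.

```python
import torch
from torch import nn
from halonet_pytorch import HaloAttention

class HaloBlock(nn.Module):
    # hypothetical residual wrapper around HaloAttention, for illustration only
    def __init__(self, dim, block_size, halo_size, dim_head = 64, heads = 4):
        super().__init__()
        self.attn = HaloAttention(
            dim = dim,
            block_size = block_size,
            halo_size = halo_size,
            dim_head = dim_head,
            heads = heads
        )
        self.norm = nn.BatchNorm2d(dim)

    def forward(self, x):
        # feature map height and width must be divisible by block_size
        return x + self.norm(self.attn(x))

block = HaloBlock(dim = 512, block_size = 8, halo_size = 4)
fmap = torch.randn(2, 512, 32, 32)
out = block(fmap)  # (2, 512, 32, 32)
```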
## Citations

```bibtex
@misc{vaswani2021scaling,
    title   = {Scaling Local Self-Attention For Parameter Efficient Visual Backbones},
    author  = {Ashish Vaswani and Prajit Ramachandran and Aravind Srinivas and Niki Parmar and Blake Hechtman and Jonathon Shlens},
    year    = {2021},
    eprint  = {2103.12731},
    archivePrefix = {arXiv},
    primaryClass = {cs.CV}
}
```