This repository has been archived by the owner on Aug 26, 2022. It is now read-only.
v1.1.2
Updates
[#7] Selective Kernel Fusion
[#9] Fix argument bug
New Feature: Selective Kernel Fusion
Since version 1.1.2, you can fuse a selected subset of kernels instead of all of them. Currently, only the Attention and MLP classes are supported.
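from oslo import GPT2MLP, GPT2Attention
# `model` is assumed to be a GPT-2 model that has already been set up with OSLO.
# Fuse only the MLP kernels
model.fuse([GPT2MLP])
# Fuse only the Attention kernels
model.fuse([GPT2Attention])
# Fuse both the MLP and Attention kernels
model.fuse([GPT2MLP, GPT2Attention])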
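To illustrate the idea of selecting what to fuse by class, here is a minimal, self-contained sketch in plain Python. It is not OSLO's implementation: `ToyMLP`, `ToyAttention`, `ToyBlock`, `ToyModel`, and the `fused` flag are hypothetical names made up for this example, and "fusing" is reduced to flipping a flag on the submodules whose class appears in the list.

```python
class ToyMLP:
    """Hypothetical stand-in for an MLP submodule."""
    def __init__(self):
        self.fused = False


class ToyAttention:
    """Hypothetical stand-in for an attention submodule."""
    def __init__(self):
        self.fused = False


class ToyBlock:
    """One transformer-like block holding an attention and an MLP submodule."""
    def __init__(self):
        self.attn = ToyAttention()
        self.mlp = ToyMLP()


class ToyModel:
    """Hypothetical model; only mimics the shape of the fuse([...]) call."""
    def __init__(self, n_layers=2):
        self.blocks = [ToyBlock() for _ in range(n_layers)]

    def modules(self):
        # Walk over every submodule that could be fused.
        for block in self.blocks:
            yield block.attn
            yield block.mlp

    def fuse(self, targets):
        # Mark only the submodules whose class is listed in `targets`.
        selected = tuple(targets)
        for module in self.modules():
            if isinstance(module, selected):
                module.fused = True
        return self


# Usage: fuse only the MLP submodules and leave attention untouched.
toy = ToyModel(n_layers=2)
toy.fuse([ToyMLP])
mlp_states = [block.mlp.fused for block in toy.blocks]
attn_states = [block.attn.fused for block in toy.blocks]
```

Passing classes rather than string names lets the selection rely on an ordinary `isinstance` check, which is one plausible reason for an interface shaped like `model.fuse([GPT2MLP])`.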