
rnn/lstm/gru dynamic quantization #5435

Merged: 60 commits merged into Tencent:master on May 8, 2024
Conversation

@nihui (Member) commented Apr 18, 2024

  • rnn
  • rnn-arm
  • lstm
  • lstm-arm
  • lstm-x86
  • gru
  • gru-arm
  • fix over load s8
  • coverage
  • doc
  • speed test
  • rnn aq
  • rnn-arm aq
  • lstm aq
  • lstm-arm aq
  • lstm-x86 aq
  • gru aq
  • gru-arm aq
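The task list above covers dynamic quantization for each RNN variant and backend: weights are stored as int8 with per-output-row scales, while activation scales are computed on the fly from each input. A minimal pure-Python sketch of that general scheme, purely illustrative (helper names like `quantize_rows` are made up and this is not ncnn's actual kernel code):

```python
def quantize_rows(weight):
    """Symmetric per-row int8 quantization: returns int8 rows and their scales."""
    q_rows, scales = [], []
    for row in weight:
        absmax = max(abs(v) for v in row) or 1.0
        scale = 127.0 / absmax
        q_rows.append([max(-127, min(127, round(v * scale))) for v in row])
        scales.append(scale)
    return q_rows, scales

def dynamic_quant_matvec(q_rows, scales, x):
    """int8 matrix-vector product with a dynamically computed activation scale."""
    absmax = max(abs(v) for v in x) or 1.0
    x_scale = 127.0 / absmax  # computed at inference time, per input
    qx = [max(-127, min(127, round(v * x_scale))) for v in x]
    # int32 accumulation, then dequantize with both weight and activation scales
    return [sum(qw * qv for qw, qv in zip(row, qx)) / (s * x_scale)
            for row, s in zip(q_rows, scales)]

# toy check: the quantized matvec approximates the float matvec
W = [[0.5, -1.0], [2.0, 0.25]]
x = [0.3, -0.7]
ref = [sum(w * v for w, v in zip(row, x)) for row in W]
q_rows, scales = quantize_rows(W)
approx = dynamic_quant_matvec(q_rows, scales, x)
```

This is why the int8 model halves on disk (weights shrink from fp16 to int8 plus a few scales) while accuracy stays close to fp32, as the MAE table below shows.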

@github-actions bot added the x86 label Apr 22, 2024
@nihui changed the title from [WIP] rnn/lstm/gru weight only quantization to rnn/lstm/gru weight only quantization Apr 24, 2024
@nihui changed the title from rnn/lstm/gru weight only quantization to [WIP] rnn/lstm/gru weight only quantization Apr 24, 2024
@nihui changed the title from [WIP] rnn/lstm/gru weight only quantization to [WIP] rnn/lstm/gru dynamic quantization Apr 28, 2024
@nihui (Member, Author) commented May 7, 2024

```python
import torch
import torch.nn as nn
import torch.nn.functional as F
import pnnx

class Model(nn.Module):
    def __init__(self):
        super(Model, self).__init__()

        self.rnn = nn.RNN(input_size=256, hidden_size=256, num_layers=30)
        self.lstm = nn.LSTM(input_size=256, hidden_size=256, num_layers=30)
        self.gru = nn.GRU(input_size=256, hidden_size=256, num_layers=30)

    def forward(self, x):
        out0, _ = self.rnn(x)
        out1, _ = self.lstm(x)
        out2, _ = self.gru(x)
        return out0, out1, out2

net = Model().half().float()
net.eval()

torch.manual_seed(0)
x = torch.rand(300, 1, 256)

pnnx.export(net, "rnn.pt", x)
```
```shell
ncnn2int8 rnn.ncnn.param rnn.ncnn.bin rnn-int8.ncnn.param rnn-int8.ncnn.bin /dev/null
```
| rnn.ncnn.bin / rnn-int8.ncnn.bin | fp16 | int8 |
| --- | --- | --- |
| model size | 60.1M | 30.6M |

| qcom855plus MAE | fp32 | fp16 | int8 |
| --- | --- | --- | --- |
| 30-layer rnn | 0 | 2.29E-08 | 7.31E-08 |
| 30-layer lstm | 0 | 4.39E-09 | 5.54E-09 |
| 30-layer gru | 0 | 6.75E-09 | 1.96E-08 |

| qcom855plus single-thread time | fp32 | fp16 | int8 |
| --- | --- | --- | --- |
| 30-layer rnn | 45.16 | 24.81 | 19.87 |
| 30-layer lstm | 256.51 | 121.99 | 60.7 |
| 30-layer gru | 167.52 | 94.68 | 46.29 |

| i5-12400 single-thread time, 30-layer lstm-int8 model | |
| --- | --- |
| naive (sse2) | 95.24 |
| sse2 | 87.02 |
| avx | 64.85 |
| avx2 | 42.22 |
| avxvnni | 23.24 |
| avx512 | 27.95 |
| avx512vnni | 15.8 |
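The MAE rows above measure how closely the fp16 and int8 paths track the fp32 reference output. For readers reproducing the table, mean absolute error is simply the average element-wise absolute difference; a tiny sketch (hypothetical helper, not an ncnn API):

```python
def mae(a, b):
    """Mean absolute error between two equal-length output sequences."""
    assert len(a) == len(b)
    return sum(abs(x - y) for x, y in zip(a, b)) / len(a)

# toy example: identical outputs give MAE 0, as in the fp32 column above
ref = [0.10, -0.20, 0.30]
out = [0.10, -0.20, 0.30]
```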

@nihui (Member, Author) commented May 8, 2024

| imx6d single-thread time | fp32 | int8 |
| --- | --- | --- |
| 30-layer rnn | 1392.22 | 504.83 |
| 30-layer lstm | 6063.91 | 1833.46 |
| 30-layer gru | 4357.59 | 1300.93 |

@nihui changed the title from [WIP] rnn/lstm/gru dynamic quantization to rnn/lstm/gru dynamic quantization May 8, 2024
@github-actions bot added the doc label May 8, 2024
@nihui merged commit 08b7d99 into Tencent:master May 8, 2024
64 of 66 checks passed