- QAT with plugins
- After the QAT ONNX model is generated, replace the corresponding op with the plugin node.
- If the plugin sits at the tail of the network (i.e., the plugin's output is the network's output), the op it replaces does not need to participate in quantization-aware training, since no downstream layer consumes its output.
For more, see https://github.com/lix19937/pytorch-quantization/tree/main/pytorch_quantization/nn
and https://github.com/lix19937/auto_qat
Did you start with a pretrained model without QAT? If yes, does the FP32 (unquantized) model also show instability?
How did you add the QDQ nodes, and how did you determine the scales (what software did you use? did you perform calibration? was the calibration dataset large enough?)?
Did you perform fine-tuning after adding fake quantization? Did you observe the loss vs. accuracy curves? Did you check that you did not overfit?
Intuitively, you should verify that your model is not overfitting: an overfitted model tends to be unstable once we introduce noise from quantization and limited-precision arithmetic (in floating-point arithmetic, different operation orderings can produce small differences in output).
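To make the calibration/fake-quantization questions above concrete, here is a stdlib-only sketch of the underlying idea, not the actual pytorch-quantization API: max calibration derives an `amax` (and thus a scale) from the calibration dataset, and a QDQ pair then injects quantization noise by rounding and clamping to int8 before dequantizing. The function names are hypothetical.

```python
# Hedged sketch of max calibration + symmetric int8 fake quantization.
# Function names are illustrative, not a real library API.

def amax_calibrate(samples):
    """Max calibration: scale comes from the largest absolute value seen.
    A too-small calibration dataset can miss outliers and misestimate amax."""
    return max(abs(v) for batch in samples for v in batch)

def fake_quantize(x, amax, num_bits=8):
    """Simulate a Q->DQ pair: quantize to a symmetric int grid, then dequantize.
    Values beyond amax saturate, which is one source of quantization noise."""
    qmax = 2 ** (num_bits - 1) - 1          # 127 for int8
    scale = amax / qmax
    q = max(-qmax - 1, min(qmax, round(x / scale)))
    return q * scale

amax = amax_calibrate([[0.1, -0.9], [0.5, 1.0]])   # amax = 1.0
print(fake_quantize(0.5, amax))    # small rounding error around 0.5
print(fake_quantize(2.0, amax))    # saturates to amax -> 1.0
```

During QAT fine-tuning, this fake-quantization noise is present in the forward pass so the weights can adapt to it; the loss/accuracy curves then tell you whether the model recovers or overfits.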
- yolox
- yolov7
- centernet (lidar seg)
- lidar od
- resnet
- hrnet
- hourglass