The same model performs very differently on different platforms #5263

Open
minushuang opened this issue Jan 4, 2024 · 11 comments

Comments

@minushuang

minushuang commented Jan 4, 2024

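The same ncnn model gives normal results on PC, but loses far too much accuracy on mobile.

In the table below, the values in parentheses in the fourth row show the accuracy drop relative to the ncnn-fp16 model on PC, and relative to the pt model.

May I ask the author: is this normal? What could cause such a large gap between Android and PC?

[table]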

@nihui
Member

nihui commented Jan 8, 2024

@minushuang
Author

minushuang commented Feb 26, 2024

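I trained an improved model with higher accuracy, but there is still a gap in mAP between pt, ncnn-pc, and ncnn-android.

| Model | class1 AP | class2 AP | mAP |
| --- | --- | --- | --- |
| pt | 89.97% | 72% | 80.9% |
| ncnn-pc | 88.2% | 64.6% | 76.4% |
| ncnn-android | 84.2% | 61.3% | 72.8% |

I checked the wiki and have not found any problem so far. The C++ preprocessing, postprocessing, and inference code is identical on PC and Android, so it is hard to understand why there is a gap when moving from PC to Android.

Is a gap like this normal?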

@nihui
Member

nihui commented Feb 26, 2024

@minushuang
Author

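Thanks for the quick reply. I checked, and most of the input images are jpg/png. I will rerun the test with png/bmp and see how it goes.

One more question: with the same jpg images and the same inference engine, should the inference results on PC and Android be identical?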

@minushuang
Author

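I reviewed everything again. ncnn-pc now has essentially no accuracy loss (tested in both C++ and Python, loss within 1%), but moving to Android still loses about seven or eight points.

I saved the mat on both PC and Android and found that the two outputs start to differ right after the following line of code.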

ncnn::Mat in_net = ncnn::Mat::from_pixels_resize(img.data, ncnn::Mat::PIXEL_BGR2RGB, img_w, img_h, w, h);

An excerpt is shown below. The overall mismatch ratio is 0.23%, i.e. about 2 in 1000.

all data is 537600
NO.525043 row not equal. pc: 0 0 0, android: 0 0 2147483647
NO.525140 row not equal. pc: 0 0 0, android: 0 0 2
NO.525141 row not equal. pc: 0 0 0, android: 0 0 2147483647
NO.525142 row not equal. pc: 0 0 0, android: 0 0 2147483647
NO.525819 row not equal. pc: 0 0 0, android: 0 0 2147483647
NO.525849 row not equal. pc: 0 0 0, android: 0 0 2147483647
NO.525855 row not equal. pc: 0 0 0, android: 0 0 2147483647
NO.525921 row not equal. pc: 0 0 0, android: 0 0 2147483647
NO.525922 row not equal. pc: 0 0 0, android: 0 0 2147483647
NO.525923 row not equal. pc: 0 0 0, android: 0 0 2147483647
NO.525924 row not equal. pc: 0 0 0, android: 0 0 2147483647
NO.525925 row not equal. pc: 0 0 0, android: 0 0 2147483647
NO.525949 row not equal. pc: 0 0 0, android: 0 0 199
NO.525951 row not equal. pc: 0 0 0, android: 0 0 -417197
NO.525952 row not equal. pc: 0 0 0, android: 0 0 -1606
NO.525954 row not equal. pc: 0 0 0, android: 0 0 2147483647
NO.526624 row not equal. pc: 0 0 0, android: 0 0 3380302
NO.526625 row not equal. pc: 0 0 0, android: 0 0 217396576
NO.526628 row not equal. pc: 0 0 0, android: 0 0 2147483647
NO.526630 row not equal. pc: 0 0 0, android: 0 0 -27239742
NO.526631 row not equal. pc: 0 0 0, android: 0 0 2147483647
NO.526656 row not equal. pc: 0 0 0, android: 0 0 2147483647
NO.526668 row not equal. pc: 0 0 0, android: 0 0 2367962
NO.526730 row not equal. pc: 0 0 0, android: 0 0 2
NO.526733 row not equal. pc: 0 0 0, android: 0 0 -25443
NO.526734 row not equal. pc: 0 0 0, android: 0 0 -1590
NO.526758 row not equal. pc: 0 0 0, android: 0 0 12864
NO.526760 row not equal. pc: 0 0 0, android: 0 0 -2147483648
NO.526761 row not equal. pc: 0 0 0, android: 0 0 -2147483648
NO.526762 row not equal. pc: 0 0 0, android: 0 0 -2147483648
NO.526763 row not equal. pc: 0 0 0, android: 0 0 -2147483648
NO.527273 row not equal. pc: 0 0 0, android: 0 0 2459335
NO.527274 row not equal. pc: 0 0 0, android: 0 0 2147483647
NO.527275 row not equal. pc: 0 0 0, android: 0 0 2147483647
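
A comparison like the one in the log above could be done roughly as follows. This is only a sketch under assumptions: both mats have already been dumped to flat float arrays of equal length, and a "row" is a fixed-size group of values (three per row in the log). `DiffStats` and `compare_rows` are hypothetical names, not ncnn APIs. When dumping an `ncnn::Mat` directly, it is safer to copy channel by channel, since channels may be padded (`cstep`).

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <vector>

struct DiffStats {
    std::size_t total_rows = 0;       // number of rows compared
    std::size_t mismatched_rows = 0;  // rows with at least one differing value

    // Fraction of rows that differ, in [0, 1].
    double ratio() const {
        return total_rows == 0
                   ? 0.0
                   : static_cast<double>(mismatched_rows) / total_rows;
    }
};

// Compare two flat float buffers row by row, `row_len` values per row.
// The comparison is bit-exact: identical NaN bit patterns count as equal,
// and -0.0 counts as different from 0.0.
DiffStats compare_rows(const std::vector<float>& pc,
                       const std::vector<float>& android,
                       std::size_t row_len) {
    assert(pc.size() == android.size());
    assert(row_len > 0 && pc.size() % row_len == 0);

    DiffStats stats;
    stats.total_rows = pc.size() / row_len;
    for (std::size_t r = 0; r < stats.total_rows; ++r) {
        const float* a = pc.data() + r * row_len;
        const float* b = android.data() + r * row_len;
        if (std::memcmp(a, b, row_len * sizeof(float)) != 0) {
            ++stats.mismatched_rows;
        }
    }
    return stats;
}
```

Calling `compare_rows(pc_values, android_values, 3).ratio()` would then give the mismatch fraction that the 0.23% figure refers to, assuming the same row definition.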

@minushuang
Author

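This is caused by the fast instruction set path used in from_pixels_resize. After commenting out the `#if __ARM_NEON` branch, the output of from_pixels_resize is identical on both platforms.

I also tried disabling NEON by passing -DANDROID_ARM_NEON=OFF when building ncnn. It takes effect on armv7 but not on armv8.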
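
A likely explanation for the armv8 behavior, offered as an assumption rather than a confirmed diagnosis: on AArch64, compilers normally define `__ARM_NEON` by default because Advanced SIMD is part of the baseline, and to my knowledge the NDK's `ANDROID_ARM_NEON` option only affects the armeabi-v7a ABI. The sketch below reports what the compiler actually defined for the current build, which is what decides whether an `#if __ARM_NEON` branch is compiled in. The names `kNeonEnabled`, `kAArch64`, and `describe_simd_target` are hypothetical.

```cpp
#include <string>

// True when the compiler defined __ARM_NEON for this translation unit,
// i.e. code guarded by `#if __ARM_NEON` would be compiled in.
constexpr bool kNeonEnabled =
#if defined(__ARM_NEON)
    true;
#else
    false;
#endif

// True when targeting 64-bit ARM (for example the arm64-v8a Android ABI).
constexpr bool kAArch64 =
#if defined(__aarch64__)
    true;
#else
    false;
#endif

// Human-readable summary of the compile-time SIMD target.
std::string describe_simd_target() {
    if (kAArch64) {
        return kNeonEnabled ? "aarch64, NEON on" : "aarch64, NEON off";
    }
    return kNeonEnabled ? "non-aarch64, NEON on" : "non-aarch64, NEON off";
}
```

Logging `describe_simd_target()` once at startup in both the armv7 and armv8 builds would show whether the build flag changed anything for each ABI.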

@nihui
Member

nihui commented Mar 27, 2024

#5390
Please test whether this PR solves the problem you ran into.

@minushuang
Author

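> #5390 Please test whether this PR solves the problem you ran into.

Yes, it does. The output of from_pixels_resize is now consistent between PC and mobile. The model's output accuracy is still degraded.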

@nihui
Member

nihui commented Mar 28, 2024

#5393

There is also this PR, which fixes a computation error in softmax fp16. You can test it as well.

@nihui nihui closed this as completed in 6595743 Mar 28, 2024
@nihui nihui reopened this Mar 28, 2024
@minushuang
Author

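> #5393
>
> There is also this PR, which fixes a computation error in softmax fp16. You can test it as well.

OK, I will test it.
However, when I compared the model's output mat, fp16 computation was actually disabled, and that output does not go through softmax 😊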

@minushuang
Author

Tested. The final mAP barely changed, from 75.67% to 75.66%.
