The current SE-Resnext 152 benchmark result on P40 #10706

chengduoZH · 2018-05-16T11:05:14Z

Env

8 cards, P40
docker container: zcd_paddle-latest-dev
paddle commit: 5f6fd26fbd1eabc083e9ce506e15e4b1f5f9819f
benchmark code: https://github.com/chengduoZH/benchmark/tree/Update_SE_Resnext152/fluid/SE-ResNeXt-152
- commit: ef9f3d4af3fc2a107a0aeb408ae19083dfb5f26d

Network config

se-resnext 152
Excluding data read operations
mem_opt: ON
flower dataset
batch_size: 320 (40/card)

Mode	8 cards (sec/batch)	8 cards (image/sec)	Single card (sec/batch)	Single card (image/sec)	Acceleration ratio
parallel_do	3.100722447	103.2017555	1.783022	22.43382303	4.600275014
parallel_exe	2.870211447	111.4900438	1.71482642	23.32597605	4.779651817
parallel_exe + balance_param_opt	2.864137779	111.7264687	-	-	4.78978751
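
For readers who want to check the arithmetic, here is a minimal sketch of how the image/sec and acceleration-ratio columns appear to be derived from sec/batch, assuming 40 images per card as stated in the config. The function names are illustrative and not part of the benchmark code. The ratio on the last row seems to reuse the parallel_exe single-card figure (111.7264687 / 23.32597605).

```python
# Sketch only: reproduces the table columns from the sec/batch figures.
# IMAGES_PER_CARD comes from "batch_size: 320 (40/card)" in the post;
# the function names are hypothetical helpers, not PaddlePaddle APIs.
IMAGES_PER_CARD = 40


def throughput(sec_per_batch, cards):
    """Images processed per second for a run on `cards` GPUs."""
    return cards * IMAGES_PER_CARD / sec_per_batch


def acceleration_ratio(multi_sec_per_batch, cards, single_sec_per_batch):
    """Multi-card throughput divided by single-card throughput."""
    return throughput(multi_sec_per_batch, cards) / throughput(single_sec_per_batch, 1)


# (8-card sec/batch, single-card sec/batch), copied from the table.
rows = {
    "parallel_do": (3.100722447, 1.783022),
    "parallel_exe": (2.870211447, 1.71482642),
}

for mode, (multi, single) in rows.items():
    print(mode, throughput(multi, 8), acceleration_ratio(multi, 8, single))
```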

typhoonzero · 2018-05-16T11:38:29Z

Can you paste the output of nvidia-smi topo -m? If not all GPUs are connected through a single PCIe switch, performance may be reduced. In that case, you can test the 4-GPU ratio for reference.

chengduoZH · 2018-05-16T12:15:20Z

@typhoonzero Thanks!
The result is:

GPU0	 X 	PIX	PIX	PIX	PXB	PXB	PXB	PXB	SOC	0-13
GPU1	PIX	 X 	PIX	PIX	PXB	PXB	PXB	PXB	SOC	0-13
GPU2	PIX	PIX	 X 	PIX	PXB	PXB	PXB	PXB	SOC	0-13
GPU3	PIX	PIX	PIX	 X 	PXB	PXB	PXB	PXB	SOC	0-13
GPU4	PXB	PXB	PXB	PXB	 X 	PIX	PIX	PIX	SOC	0-13
GPU5	PXB	PXB	PXB	PXB	PIX	 X 	PIX	PIX	SOC	0-13
GPU6	PXB	PXB	PXB	PXB	PIX	PIX	 X 	PIX	SOC	0-13
GPU7	PXB	PXB	PXB	PXB	PIX	PIX	PIX	 X 	SOC	0-13
mlx5_0	SOC	SOC	SOC	SOC	SOC	SOC	SOC	SOC	 X

Legend:

  X   = Self
  SOC  = Connection traversing PCIe as well as the SMP link between CPU sockets(e.g. QPI)
  PHB  = Connection traversing PCIe as well as a PCIe Host Bridge (typically the CPU)
  PXB  = Connection traversing multiple PCIe switches (without traversing the PCIe Host Bridge)
  PIX  = Connection traversing a single PCIe switch
  NV#  = Connection traversing a bonded set of # NVLinks
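
To make the topology easier to read, the matrix can be grouped by which GPUs share a single PCIe switch (PIX). The sketch below assumes the pasted rows list the GPU columns in the order GPU0 to GPU7, since the header row is not included in the output above; the function name is illustrative.

```python
# Sketch only: groups GPUs that reach each other through one PCIe switch.
# Assumes column order GPU0..GPU7 (the header row was not pasted).
TOPO = """\
GPU0 X PIX PIX PIX PXB PXB PXB PXB SOC 0-13
GPU1 PIX X PIX PIX PXB PXB PXB PXB SOC 0-13
GPU2 PIX PIX X PIX PXB PXB PXB PXB SOC 0-13
GPU3 PIX PIX PIX X PXB PXB PXB PXB SOC 0-13
GPU4 PXB PXB PXB PXB X PIX PIX PIX SOC 0-13
GPU5 PXB PXB PXB PXB PIX X PIX PIX SOC 0-13
GPU6 PXB PXB PXB PXB PIX PIX X PIX SOC 0-13
GPU7 PXB PXB PXB PXB PIX PIX PIX X SOC 0-13
mlx5_0 SOC SOC SOC SOC SOC SOC SOC SOC X
"""


def pix_groups(topo_text):
    """Return lists of GPU indices whose pairwise link type is PIX."""
    rows = [line.split() for line in topo_text.splitlines()
            if line.startswith("GPU")]
    n = len(rows)
    groups, seen = [], set()
    for i, row in enumerate(rows):
        if i in seen:
            continue
        links = row[1:1 + n]  # only the GPU-to-GPU columns
        group = [j for j in range(n) if j == i or links[j] == "PIX"]
        seen.update(group)
        groups.append(group)
    return groups


print(pix_groups(TOPO))  # prints [[0, 1, 2, 3], [4, 5, 6, 7]]
```

GPUs 0-3 and GPUs 4-7 each sit behind one switch, and traffic between the two groups crosses multiple switches (PXB).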

chengduoZH · 2018-05-16T12:39:48Z

The following is the performance comparison for four cards:

Mode	4 cards (sec/batch)	4 cards (image/sec)	Single card (sec/batch)	Single card (image/sec)	Acceleration ratio
parallel_do	2.30841	69.31177737	1.783022	22.43382303	3.089610598
parallel_exe	2.192076294	72.99016027	1.71482642	23.32597605	3.129136381
parallel_exe + balance_param_opt	2.179436251	73.41348019	-	-	3.147284385

typhoonzero · 2018-05-16T13:39:26Z

It seems that going from 4 to 8 cards only added the equivalent of about 1.5 GPU cards of throughput. I'm not sure whether this is caused by the topology.
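
The "about 1.5 cards" estimate can be checked against the acceleration ratios posted in the two tables. This is a rough sketch; the dictionary and helper names are made up for illustration, and the numbers are copied from the tables in this thread.

```python
# Sketch only: marginal gain from 4 to 8 cards, in single-card equivalents.
# Values are the acceleration ratios reported in the 4-card and 8-card tables.
ACCELERATION = {
    "parallel_do": {4: 3.089610598, 8: 4.600275014},
    "parallel_exe": {4: 3.129136381, 8: 4.779651817},
    "parallel_exe + balance_param_opt": {4: 3.147284385, 8: 4.78978751},
}


def marginal_gain(mode):
    """Extra single-card equivalents gained by adding cards 5 through 8."""
    return ACCELERATION[mode][8] - ACCELERATION[mode][4]


def scaling_efficiency(mode, cards):
    """Acceleration ratio divided by the number of cards (1.0 is ideal)."""
    return ACCELERATION[mode][cards] / cards


for mode in ACCELERATION:
    print(mode,
          round(marginal_gain(mode), 2),
          round(scaling_efficiency(mode, 4), 2),
          round(scaling_efficiency(mode, 8), 2))
```

The gain works out to roughly 1.5 to 1.65 card equivalents depending on the mode, with per-card efficiency dropping from about 0.77-0.79 at 4 cards to about 0.58-0.60 at 8 cards.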

shanyi15 · 2018-08-15T10:25:56Z

Hello, this issue has not been updated in the past month, so we will close it today for the sake of other users' experience. If you still need to follow up on this question after it is closed, please feel free to reopen it, and we will get back to you within 24 hours. We apologize for any inconvenience caused by the closure, and thank you for your support of PaddlePaddle!

chengduoZH assigned reyoung, panyx0718 and JiayiFeng May 16, 2018

panyx0718 assigned kolinwei and guochaorong May 16, 2018

shanyi15 closed this as completed Aug 15, 2018
