increase parallel tests #34570

Closed
wants to merge 1 commit into from
@lelelelelez lelelelelez commented Aug 3, 2021

PR types

Others

PR changes

Others

Describe

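  1. Increase unit-test parallelism:
    a. Update the mapping between unit tests and memory.
    b. Depending on memory, single-card unit tests run at a parallelism of 48, 14, or 2; multi-card unit tests at 4 or 2; exclusive unit tests at 8, 4, or 2.
  2. Update the rerun logic:
    There are currently 1500+ unit tests, so if no more than 80 unit tests fail on the first run, they can go straight to rerun (1600*5%; this figure needs to be monitored after rollout). Tests that fail on the first run are executed once more at reduced parallelism to see whether they pass. If they pass, there is no need to enter the QA team's existing rerun logic; if this rerun fails, they enter the QA team's existing rerun logic (3 runs with a 50% success rate).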
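The rerun flow in item 2 can be sketched as follows. This is a minimal illustration, not the code in this PR: `plan_rerun`, `run_tests`, and `reduced_parallelism` are hypothetical names, the reduced parallelism value is an assumption because the description does not state it, and the behaviour above the 80-failure threshold is also assumed.

```python
from typing import Callable, Iterable, List

# Parallelism levels quoted in the PR description, highest to lowest.
# The description does not say which memory range maps to which level,
# so only the order is recorded here.
PARALLELISM = {
    "single_card": [48, 14, 2],
    "multi_card": [4, 2],
    "exclusive": [8, 4, 2],
}

# 1600 * 5% from the description, flagged there as a value to monitor.
MAX_FIRST_FAILURES = 80


def plan_rerun(
    failed: Iterable[str],
    run_tests: Callable[[List[str], int], List[str]],
    reduced_parallelism: int = 1,  # assumption: not given in the PR
) -> dict:
    """Decide what happens to tests that failed on the first pass.

    run_tests(tests, parallelism) is a hypothetical runner that returns
    the subset of tests that still fail.
    """
    failed = list(failed)
    if not failed:
        return {"action": "pass", "remaining": []}
    if len(failed) > MAX_FIRST_FAILURES:
        # Assumption: the description does not say what happens above
        # the threshold, so this sketch reports the failures as-is.
        return {"action": "fail", "remaining": failed}

    # One extra attempt at lower parallelism.
    still_failing = run_tests(failed, reduced_parallelism)
    if not still_failing:
        return {"action": "pass", "remaining": []}
    # Hand the remaining tests to the pre-existing QA rerun logic.
    return {"action": "qa_rerun", "remaining": still_failing}
```

Passing the runner in as a callable keeps the decision logic testable without invoking any real test command.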

Note: this change does not touch Windows; the Windows-related changes will go in the next PR.

paddle-bot-old bot commented Aug 3, 2021

Thanks for your contribution!
Please wait for the result of CI firstly. See Paddle CI Manual for details.

@@ -1038,17 +1685,17 @@ def main():
     test_cases = sys.argv[1]
     test_cases = test_cases.split("\n")

-    for unittest in CPU_PARALLEL_JOB:
+    for unittest in CPU_PARALLEL_JOB_NEW:
Changing this also changes the behaviour on Windows. Is Windows handled together with Linux here?

XieYunshen commented Aug 11, 2021

The coverage job for this PR takes longer than other coverage jobs, and the second rerun contained only two unit tests.
For example, this coverage job, which ran all cases:
paddle-test step: 18:05-19:14,
unit-test time ipipe_log_param_TestCases_Total_Time: 3850s
Job link: https://xly.bce.baidu.com/paddlepaddle/paddle/newipipe/detail/3350997/job/6491415

The job for this PR:
paddle-test step: 16:37-18:00,
unit-test time ipipe_log_param_TestCases_Total_Time: 3916s
Job link: https://xly.bce.baidu.com/paddlepaddle/paddle/newipipe/detail/3350117/job/6488749
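
To make the comparison concrete, the arithmetic behind the two jobs can be worked out as below. This is only a sketch over the figures quoted in this comment; `step_minutes` is a hypothetical helper and it assumes both timestamps fall on the same day.

```python
from datetime import datetime


def step_minutes(start: str, end: str) -> int:
    """Wall-clock length of a same-day HH:MM interval, in minutes."""
    fmt = "%H:%M"
    delta = datetime.strptime(end, fmt) - datetime.strptime(start, fmt)
    return int(delta.total_seconds() // 60)


# Figures copied from the comment: paddle-test step window and
# ipipe_log_param_TestCases_Total_Time for each job.
baseline = {"step_min": step_minutes("18:05", "19:14"), "tests_s": 3850}
this_pr = {"step_min": step_minutes("16:37", "18:00"), "tests_s": 3916}

step_diff_min = this_pr["step_min"] - baseline["step_min"]
tests_diff_s = this_pr["tests_s"] - baseline["tests_s"]

print(baseline["step_min"], this_pr["step_min"], step_diff_min, tests_diff_s)
# prints: 69 83 14 66
```

The paddle-test step is 14 minutes longer for this PR, while the reported unit-test time is only 66 seconds longer, so the reported test time accounts for roughly one of the 14 extra minutes.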

@paddle-bot-old

Since you haven't replied for more than a year, we have closed this issue/pr.
If the problem is not solved or there is a follow-up one, please reopen it at any time and we will continue to follow up.

3 participants