Limit CI runs to 4 threads #309

Closed

Contributor

jasoncouture commented Jan 4, 2023

During CI/CD runs of #307, the number of tests running in parallel was affecting overall execution time.
To improve performance, this updates the GitHub Actions workflow to limit the integration tests to 4 test threads.

This was previously the effective limit, as no test file had more than 4 tests in it.

Fixes #310

jasoncouture mentioned this pull request Jan 4, 2023
@@ -60,7 +60,7 @@ jobs:
       - name: Run api tests
         run: cargo test -p bootloader_api
       - name: Run integration tests
-        run: cargo test
+        run: cargo test -- --test-threads=4
Contributor

Shouldn't it already be limited to the number of cores by default?

It seems that by default, Rust's test harness gets the available "parallelism" from the OS: https://github.com/rust-lang/rust/blob/master/library/test/src/helpers/concurrency.rs

Contributor

Linux VMs get 2 cores, so thread::available_parallelism() should return 2.
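
For reference, here is a minimal sketch of how the default thread count appears to be resolved, modeled on the linked concurrency.rs helper. The function name `resolve_test_threads` and its `env_override` parameter are hypothetical and only stand in for the value of the `RUST_TEST_THREADS` environment variable, which the real harness reads itself. An explicit `--test-threads` argument, as used in this PR, takes precedence over both.

```rust
use std::num::NonZeroUsize;
use std::thread;

/// Hypothetical helper mirroring the default thread-count lookup:
/// an explicit override wins, otherwise the OS is asked how much
/// parallelism is available. `env_override` stands in for the value of
/// the `RUST_TEST_THREADS` environment variable (an assumption made so
/// the sketch can be exercised without touching the process environment).
fn resolve_test_threads(env_override: Option<&str>) -> Result<usize, String> {
    match env_override {
        // The override must parse as a positive integer; zero is rejected.
        Some(value) => value
            .parse::<NonZeroUsize>()
            .map(NonZeroUsize::get)
            .map_err(|_| format!("RUST_TEST_THREADS is `{value}`, should be a positive integer")),
        // No override: fall back to the OS-reported parallelism, or 1 if
        // the query fails.
        None => Ok(thread::available_parallelism()
            .map(NonZeroUsize::get)
            .unwrap_or(1)),
    }
}
```

On a runner with 2 cores, `resolve_test_threads(None)` would be expected to yield 2, which is why more than two tests running in parallel in the logs is surprising.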

Contributor Author

It doesn't seem to be, unfortunately. Another option would be to get the list of VM tests and run them with a matrix instead. I can write the code to generate the matrix for GH actions.

But looking at the logs, it appeared it was running at least 4 tests in parallel.

Contributor Author

And on top of that, oversubscription here wouldn't be terrible as long as it's limited. The tests spend a good amount of time doing IO; the CPU-intensive part is relatively quick.

Member

> Perhaps the issue was simply a slow action runner.

Maybe, but 20 minutes without any output should never happen, even on slow machines. So my guess is that some panic occurred in the above case.

Regarding the general slowness of the Windows tests: Is QEMU really expected to be so slow on Windows? Maybe the issue is that we're running multiple threads at the same time. Could you try to update this PR to --test-threads=1 to see whether this improves things?

One thing that we should definitely do is to increase the timeout for the job, e.g. to timeout-minutes: 60.

> I think a better solution here is to generate a matrix from the list of tests, and run them all separately. This has the added benefit of quickly seeing which test failed, and makes the logs easier to read.

I'm not sure about this approach. It makes it easy to accidentally forget some tests (e.g. when a new test is added) and it spams the check list even more. Also, the number of free concurrent jobs is limited anyway, I think to 20 per organization. So running each test as a separate job will probably exceed this limit and lead to wait times, so we would not gain much.

Contributor Author

Ah, no, I mean automate, via building a matrix dynamically using the output of cargo test -- --list
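
A rough sketch of what that automation could look like, assuming the listing uses the usual one-line-per-test `<name>: test` format. The function names `parse_test_list` and `to_json_array` are hypothetical, and how the resulting JSON is handed to the workflow matrix is left open here.

```rust
/// Extracts test names from the stdout of `cargo test -- --list`.
/// Assumes one `<name>: test` line per test; benchmark entries and the
/// trailing summary line do not end in `: test` and are skipped.
fn parse_test_list(output: &str) -> Vec<String> {
    output
        .lines()
        .filter_map(|line| line.trim().strip_suffix(": test"))
        .map(str::to_owned)
        .collect()
}

/// Renders the names as a compact JSON array of strings, escaping only
/// backslashes and double quotes (enough for ordinary Rust test paths).
fn to_json_array(names: &[String]) -> String {
    let quoted: Vec<String> = names
        .iter()
        .map(|n| format!("\"{}\"", n.replace('\\', "\\\\").replace('"', "\\\"")))
        .collect();
    format!("[{}]", quoted.join(","))
}
```

Because the list is regenerated on every run, newly added tests would be picked up without editing the workflow by hand.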

Contributor Author

But the rest makes sense. I'll try a single thread and an increased timeout, and see how we do. :)

Contributor Author

(we can also easily limit concurrency with the matrix too)

Contributor Author

jasoncouture commented Jan 5, 2023

> Regarding the general slowness of the Windows tests: Is QEMU really expected to be so slow on Windows? Maybe the issue is that we're running multiple threads at the same time. Could you try to update this PR to --test-threads=1 to see whether this improves things?

Added. Building as I write this.

> One thing that we should definitely do is to increase the timeout for job, e.g. to timeout-minutes: 60.

Also added.

Member

phil-opp commented Jan 5, 2023

Thanks for the update! Limiting the test runner to a single thread worked quite well on the first try: The integration tests were done in 8 minutes on Windows: https://github.com/rust-osdev/bootloader/actions/runs/3842801795/jobs/6544467654

I restarted the job for good measure, but unfortunately it hung again on the second try: https://github.com/rust-osdev/bootloader/actions/runs/3842801795/jobs/6550043485 . So it looks like something really does go wrong sometimes, which results in an endlessly running test.

I think the best path forward is to finish #314 first to see whether we run into some panic. Once we have hopefully found the issue, we can experiment with different thread counts to improve the CI's run time.

Successfully merging this pull request may close these issues.

Testing significantly slows down with more tests