Limit CI runs to 4 threads #309

jasoncouture · 2023-01-04T00:59:16Z

During CI runs of #307, the number of tests running in parallel was hurting overall execution time.
To improve performance, this PR updates the GitHub Actions workflow to limit test threads to 4.

This was previously the effective limit, as no test file had more than 4 tests in it.

Fixes #310

bjorn3 · 2023-01-04T07:44:13Z

.github/workflows/ci.yml

@@ -60,7 +60,7 @@ jobs:
      - name: Run api tests
        run: cargo test -p bootloader_api
      - name: Run integration tests
-        run: cargo test
+        run: cargo test -- --test-threads=4


Shouldn't it already be limited to the amount of cores by default?

It seems that by default, Rust gets the available "parallelism" from the OS: https://github.com/rust-lang/rust/blob/master/library/test/src/helpers/concurrency.rs

Linux VMs get 2 cores, so thread::available_parallelism() should return 2.
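
For reference, a minimal sketch of how the default thread count is usually derived. As far as I understand the test harness, an explicit --test-threads flag wins, then the RUST_TEST_THREADS environment variable, then the detected parallelism. The helper name effective_test_threads is made up for this note, and unlike the real harness it silently ignores invalid overrides instead of failing.

```rust
use std::env;
use std::thread;

/// Picks a thread count: an explicit override wins, otherwise fall back
/// to the detected parallelism. Simplification (assumption of this sketch):
/// invalid or zero overrides are ignored rather than treated as errors.
fn effective_test_threads(override_value: Option<&str>, detected: usize) -> usize {
    match override_value.and_then(|v| v.trim().parse::<usize>().ok()) {
        Some(n) if n > 0 => n,
        _ => detected.max(1),
    }
}

fn main() {
    // available_parallelism() can fail on some platforms; use 1 in that case.
    let detected = thread::available_parallelism()
        .map(|n| n.get())
        .unwrap_or(1);
    let override_value = env::var("RUST_TEST_THREADS").ok();
    let threads = effective_test_threads(override_value.as_deref(), detected);
    println!("detected parallelism: {detected}, effective test threads: {threads}");
}
```

Running this on a CI runner would show what the OS reports there, which is one way to check the "2 cores" assumption directly.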

It doesn't seem to be, unfortunately. Another option would be to get the list of VM tests and run them with a matrix instead. I can write the code to generate the matrix for GH actions.

But looking at the logs, it appeared it was running at least 4 tests in parallel.

And on top of that, oversubscription here wouldn't be terrible as long as it's limited. The tests spend a good amount of time doing IO, the CPU intense part is relatively quick.

Perhaps the issue was simply a slow action runner.

Maybe, but 20 minutes without any output should never happen, even on slow machines. So my guess is that some panic occurred in the above case.

Regarding the general slowness of the Windows tests: Is QEMU really expected to be so slow on Windows? Maybe the issue is that we're running multiple threads at the same time. Could you try to update this PR to --test-threads=1 to see whether this improves things?

One thing that we should definitely do is to increase the timeout for the job, e.g. to timeout-minutes: 60.

I think a better solution here is to generate a matrix from the list of tests, and run them all separately. This has the added benefit of quickly seeing which test failed, and makes the logs easier to read.

I'm not sure about this approach. It makes it easy to accidentally forget some tests (e.g. when a new test is added) and it spams the check list even more. Also, the number of free concurrent jobs is limited anyway, I think to 20 per organization. So running each test as a separate job will probably exceed this limit and lead to wait times, so we would not gain much.

Ah, no, I mean automate, via building a matrix dynamically using the output of cargo test -- --list

But the rest makes sense. I'll try a single thread and an increased timeout, and see how we do. :)

(we can also easily limit concurrency with the matrix too)
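
To make the dynamic matrix idea concrete, here is a rough sketch, not something from this PR. It assumes the `cargo test -- --list` output contains lines of the form `name: test`, and it renders the names as a JSON object. The function names, the `test` key, and the sample test names are all hypothetical.

```rust
/// Extracts test names from `cargo test -- --list` output.
/// Lines of interest look like `some::test_name: test`; the trailing
/// summary line ("N tests, M benchmarks") and blank lines are skipped.
fn parse_test_list(output: &str) -> Vec<String> {
    output
        .lines()
        .filter_map(|line| line.trim().strip_suffix(": test"))
        .map(str::to_owned)
        .collect()
}

/// Renders the names as a JSON object usable as a job matrix.
/// The key name `test` is an arbitrary choice for this sketch.
fn to_matrix_json(names: &[String]) -> String {
    let items: Vec<String> = names
        .iter()
        // Escape backslashes and quotes so the output stays valid JSON.
        .map(|n| format!("\"{}\"", n.replace('\\', "\\\\").replace('"', "\\\"")))
        .collect();
    format!("{{\"test\":[{}]}}", items.join(","))
}

fn main() {
    // Hypothetical sample input; real input would be captured from cargo.
    let sample = "basic_boot: test\nshould_panic: test\n\n2 tests, 0 benchmarks\n";
    let names = parse_test_list(sample);
    println!("{}", to_matrix_json(&names));
}
```

A workflow could pass such JSON from one job to the next via fromJSON, and strategy.max-parallel could cap how many matrix jobs run at once. Because the list is regenerated on every run, newly added tests would be picked up automatically.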

> Regarding the general slowness of the Windows tests: Is QEMU really expected to be so slow on Windows? Maybe the issue is that we're running multiple threads at the same time. Could you try to update this PR to --test-threads=1 to see whether this improves things?

Added. Building as I write this.

> One thing that we should definitely do is to increase the timeout for the job, e.g. to timeout-minutes: 60.

Also added.

phil-opp · 2023-01-05T10:35:14Z

Thanks for the update! Limiting the test runner to a single thread worked quite well on the first try: The integration tests were done in 8 minutes on Windows: https://github.com/rust-osdev/bootloader/actions/runs/3842801795/jobs/6544467654

I restarted the job for good measure, but unfortunately it hangs again on the second try: https://github.com/rust-osdev/bootloader/actions/runs/3842801795/jobs/6550043485 . So it looks like something really does go wrong sometimes, resulting in an endlessly running test.

I think the best path forward is to finish #314 first to see whether we run into some panic. After we hopefully found the issue, we can experiment with different thread counts to improve the CI's run time.
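
As a side note on the endlessly running test: one possible safeguard, independent of the job-level timeout, is a per-process watchdog. The sketch below is only an illustration and not how this repository's test runner works. run_with_timeout is a hypothetical helper, and `sleep 5` is a Unix-only stand-in for a hung QEMU process.

```rust
use std::io;
use std::process::{Command, ExitStatus};
use std::thread;
use std::time::{Duration, Instant};

/// Runs `cmd` and returns `Ok(Some(status))` if it exits within `timeout`,
/// or `Ok(None)` if the deadline passed and the child had to be killed.
fn run_with_timeout(cmd: &mut Command, timeout: Duration) -> io::Result<Option<ExitStatus>> {
    let mut child = cmd.spawn()?;
    let deadline = Instant::now() + timeout;
    loop {
        // try_wait() does not block; it returns None while the child runs.
        if let Some(status) = child.try_wait()? {
            return Ok(Some(status));
        }
        if Instant::now() >= deadline {
            child.kill()?;
            child.wait()?; // reap the killed process
            return Ok(None);
        }
        thread::sleep(Duration::from_millis(50));
    }
}

fn main() -> io::Result<()> {
    // Placeholder command standing in for a hung emulator process.
    let mut cmd = Command::new("sleep");
    cmd.arg("5");
    match run_with_timeout(&mut cmd, Duration::from_millis(300))? {
        Some(status) => println!("exited: {status}"),
        None => println!("timed out and was killed"),
    }
    Ok(())
}
```

With something like this, a hang would surface as a failed test with a clear message after a bounded time, rather than as a job that produces no output until it is cancelled.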

Limit CI runs to 4 threads

6dbff65

jasoncouture mentioned this pull request Jan 4, 2023

Feature test macros #307

Closed

bjorn3 reviewed Jan 4, 2023

Limit test threads to 1, and increase timeout to 60 minutes

e2e8bef

phil-opp mentioned this pull request Jan 5, 2023

Testing significantly slows down with more tests #310

Open

jasoncouture closed this Jan 9, 2023


Review comment authors, as listed on the page: bjorn3, Vest, bjorn3, jasoncouture, jasoncouture, phil-opp, jasoncouture, jasoncouture, jasoncouture (all Jan 4, 2023); jasoncouture (Jan 5, 2023, edited).
