[CI] Improve SYCL-CTS job in Nightly workflow #12934

Merged
merged 3 commits into from
Mar 7, 2024

Conversation

KornevNikita
Contributor

@KornevNikita KornevNikita commented Mar 6, 2024

  1. Use the normal runner instead of the failing one.
  2. Keep executing the test loop if a suite fails.

1. Extend the number of suites.
2. Add the ability to define ninja's extra args for build-cts.
3. Keep executing the loop if a suite fails.
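
A minimal sketch of the "keep executing the loop if a suite fails" behaviour listed above, written in Python rather than in the workflow's own scripts (which are not shown in this conversation). The function name `run_suites` and the `run_one` callback are hypothetical; in a real job `run_one` would launch the suite's test binary and return its exit code.

```python
from typing import Callable, Iterable, List


def run_suites(suites: Iterable[str], run_one: Callable[[str], int]) -> List[str]:
    """Run every suite and return the names of the ones that failed.

    A non-zero exit code, or an exception raised by run_one, is recorded
    instead of aborting the loop, so one broken suite does not hide the
    results of the suites that come after it.
    """
    failed: List[str] = []
    for suite in suites:
        try:
            exit_code = run_one(suite)
        except Exception as exc:  # e.g. the test binary could not be started
            print(f"{suite}: could not run ({exc})")
            failed.append(suite)
            continue
        if exit_code != 0:
            print(f"{suite}: failed with exit code {exit_code}")
            failed.append(suite)
    return failed
```

The caller can still fail the job at the end, for example by exiting with a non-zero status when the returned list is not empty, so failures stay visible without cutting the run short.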
@KornevNikita
Contributor Author

To check if all works fine: https://github.com/intel/llvm/actions/runs/8179408035

Comment on lines 13 to 19
vector_alias
vector_api
vector_constructors
vector_load_store
vector_operators
vector_swizzle_assignment
vector_swizzles
Contributor

FYI: some of these tests (e.g. vector_*) are in this list because they take a lot of time to compile and execute.

@KornevNikita
Contributor Author

> To check if all works fine: https://github.com/intel/llvm/actions/runs/8179408035

For some reason the workflow didn't take -j8, and the runner is offline again because of some memory issue. I think we should not use this runner. I'll update the workflow to use a normal machine, but a bit later.

@bader
Contributor

bader commented Mar 6, 2024

> To check if all works fine: https://github.com/intel/llvm/actions/runs/8179408035
>
> For some reason workflow didn't take -j8 and the runner is offline again because of some memory issue. I think we should not use this runner. I'll update the workflow to use a normal machine. But a bit later.

A potential reason: the GHA runner process is killed by the OS. It's possible that -j8 launches multiple tools that consume a lot of RAM, and the OS kills processes that are allocating memory. The GHA runner usually allocates memory dynamically to store and send the logs produced by workflows.

@KornevNikita
Contributor Author

> To check if all works fine: https://github.com/intel/llvm/actions/runs/8179408035
>
> For some reason workflow didn't take -j8 and the runner is offline again because of some memory issue. I think we should not use this runner. I'll update the workflow to use a normal machine. But a bit later.
>
> Potential reason - GHA runner process is killed by OS. It's possible that -j8 launches multiple tools consuming a lot of RAM and OS kills processes allocating memory. GHA runner usually allocates memory dynamically to store/send logs produced by workflows.

It's not just that the runner was killed; for some reason the whole machine goes offline after this. We need to reboot it manually every time this happens. Not very convenient :) This machine was unused in CI before, maybe that's why.

About "-j8": as I understand it, ninja takes all 12 threads by default, so "-j8" helps us use less memory. But for some reason the workflow ignored this argument, probably because of a wrong format.
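
One way the "wrong format" guess could play out, sketched in Python under assumptions: if the extra arguments reach ninja as one string instead of separate tokens, splitting them explicitly avoids the problem. `build_ninja_command` and `jobs_for_memory` are hypothetical helpers, not part of the actual workflow, and the per-job memory figure would have to be measured on the runner. `-j N` is ninja's option for limiting the number of parallel jobs.

```python
import shlex
from typing import List


def build_ninja_command(target: str, extra_args: str = "") -> List[str]:
    """Build an argv list for ninja, splitting extra_args into tokens.

    Passing "-j 8" as a single argv element would not be parsed as an
    option plus its value, so the string is tokenized shell-style first.
    """
    return ["ninja", *shlex.split(extra_args), target]


def jobs_for_memory(total_gib: float, per_job_gib: float, cpu_threads: int) -> int:
    """Pick a job count bounded by both CPU threads and available memory.

    per_job_gib is an assumed peak memory use of one compile job.
    """
    if per_job_gib <= 0:
        raise ValueError("per_job_gib must be positive")
    by_memory = int(total_gib // per_job_gib)
    return max(1, min(cpu_threads, by_memory))
```

For example, `build_ninja_command("all", f"-j{jobs_for_memory(32, 4, 12)}")` caps the build at the number of jobs that an assumed 32 GiB machine can hold at an assumed 4 GiB per job, even though 12 threads are available.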

@KornevNikita
Contributor Author

KornevNikita commented Mar 7, 2024

Okay, I'll extend the set a bit later. It takes more than 6 hours to run SYCL-CTS and it's too much. Need to adjust the set.
Alexey mentioned that vector_* tests may take a lot of time, but I'm not sure what took so much time here https://github.com/intel/llvm/actions/runs/8180415137/job/22368396848

@KornevNikita KornevNikita changed the title from "[CI] Improve SYCL-CTS job" to "[CI] Improve SYCL-CTS job in Nightly workflow" Mar 7, 2024
@KornevNikita
Contributor Author

KornevNikita commented Mar 7, 2024

Nightly passes as expected: https://github.com/intel/llvm/actions/runs/8188808112

Contributor

@steffenlarsen steffenlarsen left a comment

Seems like a good logging addition. 👍

@KornevNikita
Contributor Author

Failed pre-commit is unrelated: #12944

@steffenlarsen steffenlarsen merged commit 3d58edd into sycl Mar 7, 2024
28 of 30 checks passed
@KornevNikita KornevNikita deleted the enable-cts-on-debug-runner-2 branch March 11, 2024 11:19
3 participants