-
Notifications
You must be signed in to change notification settings - Fork 8
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Update to Legate 24.09 nightlies #163
Conversation
This comment was marked as resolved.
This comment was marked as resolved.
OK... This is failing to build in CI due to https://github.com/nv-legate/legate.core.internal/issues/1246 will mark as draft until then. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you for taking this on @seberg! Looks not too bad in terms of code changes. Will take another look when the CI is working.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for doing this!
I think your diagnosis of why this is failing (#163 (comment)) is exactly correct.
I left some other small packaging-related suggestions for your consideration.
226e78b
to
a0981da
Compare
Just to note: the way that |
69dcf7f
to
889ab86
Compare
Ping CI to run (close/reopen). |
/ok to test |
This is getting somewhere: The test failures seem pretty harmless (one new mypy failure, and the Two notes about the setup:
Not sure what this is about. Maybe it is picking up the config of the parent process? Not sure that starting legate from a legate process is a good idea :).
|
Hmmmmm, had to actually cancel the GPU job, it was still running after an hour, which I don't think is correct... hanging at: EDIT: Hmmm, no. There are more tests later, I just missed it between all the warnings.
|
This updates code for 24.09 compatibility based on Jacobs start. Additional/larger changes: * legate::cuda::StreamPool::get_stream_pool().get_stream(); is deprecated so use context.get_task_stream() and rewire things as needed. * We are always including the label/experimental channel. * The `install_info.py` file generation seem slightly different, so the test scripts now move into the test folder and then run the tests. (This could maybe be improved.) This has no effect locally, though. Signed-off-by: Sebastian Berg <sebastianb@nvidia.com>
The reason for the slowness of the GPU tests is that they are actually the CPU tests :) legate-boost/ci/test_python_gpu.sh Line 13 in c42e0fa
@jameslamb a typo? |
Let me fix that, although I am confused. I thought the new version of legate would guess at using the GPU if anything!? |
Signed-off-by: Sebastian Berg <sebastianb@nvidia.com>
Well, let's see. Maybe I got the tags wrong also, and we are actually compiling a CPU only version for both or so... |
ah!!! Yes absolutely a typo! Sorry about that, thank you for fixing it. If this PR is going to go on much longer, I recommend just putting up a separate one with that change to get it onto |
The CPU tests are taking 43 minutes right now. Maybe that is actually acceptable for the moment? |
Its acceptable for now, although lets not leave it for too long. From my perspective this PR is fine. @jameslamb will this interfere with your packaging work if we merged it? |
- cunumeric {{ legate_version }} | ||
- legate-core {{ legate_version }} | ||
- cunumeric {{ legate_version }} =*_gpu | ||
- legate {{ legate_version }} =*_gpu |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
oh hey great! I'm glad they now added _gpu
to the build strings for legate
/ cunumeric
. When I'd asked about that before, the folks maintaining legate
/ cunumeric
were against it: https://github.com/nv-legate/legate.core.internal/pull/1035#discussion_r1702268130
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
From my perspective this PR is fine. @jameslamb will this interfere with your packaging work if we merged it?
Thanks for the @
. I totally support merging this... focusing on just supporting legate
/ cunumeric
24.09 will help the packaging work towards publishing a legate-boost
24.09.
I guess I'll put it in then, thanks! 🤞 that CI will be stable enough (and that slowdown tracked down soon!). |
This updates to the nightlies based on Jacobs work (i.e. most of the cmake changes and the larger part of the non-stream related C changes).
A slightly bigger change is that there is now a warning for the stream pool API, so we have to pass through the stream more explicitly.
Locally there is a test failing because
legate --cpus 2
thinks that--cpus
are passed twice. Let's see, the question will be whether we can just remove it (using the default of 4 cpus probably).