RCCL support #93
Merged
jrmadsen (Collaborator) commented Jul 18, 2022 (edited)
- adds support for RCCL similar to MPI
- perfetto args and return values
- timemory rccl_comm_data component
- includes cpack tweak
- handle rocprofiler-dev DEB dependency
- minor tweak to the omnitrace exe to prevent printf functions from being instrumented (instrumenting them causes a deadlock)
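To illustrate what recording "args and ret" around a wrapped call can look like, here is a minimal sketch. It does not use the perfetto or timemory APIs and is not the code from this PR: `example_trace_log`, `example_join_args`, `example_traced_call`, and `example_all_reduce` are hypothetical names, and the in-memory log is only a stand-in for a real tracing backend. It assumes C++17 and a wrapped function with a non-void, streamable return type.

```cpp
#include <cstddef>
#include <sstream>
#include <string>
#include <utility>
#include <vector>

// Hypothetical in-memory sink standing in for a real tracing backend.
struct example_trace_log
{
    std::vector<std::string> entries;
};

// Join the arguments into "a, b, c" using operator<< (C++17 fold expression).
template <typename... Args>
std::string example_join_args(const Args&... args)
{
    std::ostringstream oss;
    std::size_t        idx = 0;
    ((oss << (idx++ == 0 ? "" : ", ") << args), ...);
    return oss.str();
}

// Invoke func(args...), logging the arguments before the call and the
// return value after it. Assumes a non-void return type.
template <typename Func, typename... Args>
auto example_traced_call(example_trace_log& log, const char* name, Func&& func,
                         Args&&... args)
{
    log.entries.push_back(std::string{ name } + "(" + example_join_args(args...) + ")");
    auto ret = std::forward<Func>(func)(std::forward<Args>(args)...);
    std::ostringstream oss;
    oss << name << " -> " << ret;
    log.entries.push_back(oss.str());
    return ret;
}

// Stand-in for a collective call; NOT a real RCCL function.
int example_all_reduce(int count, int rank) { return count * rank; }
```

Calling `example_traced_call(log, "example_all_reduce", example_all_reduce, 4, 2)` returns the wrapped function's result and leaves two entries in `log.entries`: one for the arguments and one for the return value.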
jrmadsen added the following labels on Jul 18, 2022:
- enhancement: New feature or request
- libomnitrace: Involves omnitrace library
- sampling: Statistical sampling via interrupts
- cmake: Modifies the CMake build system
- cpack: Modifies the CPack packaging system
- configuration: Changes/involves configuration options
- rccl: ROCm Communication Collectives Library
jrmadsen force-pushed the rccl-support branch 5 times, most recently from b02b4e9 to 3885920 (July 21, 2022 16:16)
jrmadsen changed the title from "[WIP] RCCL support" to "[WIP] RCCL support + Improved ROCm-SMI Error Handling" on Jul 21, 2022
jrmadsen force-pushed the rccl-support branch 5 times, most recently from c542e65 to e71c467 (July 25, 2022 06:54)
jrmadsen changed the title from "[WIP] RCCL support + Improved ROCm-SMI Error Handling" to "RCCL support" on Jul 25, 2022
jrmadsen force-pushed the rccl-support branch 2 times, most recently from b4d8619 to d148416 (July 25, 2022 10:04)
jrmadsen commented Jul 25, 2022
- also OMNITRACE_SAMPLING_KEEP_INTERNAL option
  - minor modifications to sampling to use the keep-internal option + discard funlockfile
- add tpls/rccl/rccl/rccl.h
- disable ompt
  - enable building testing
- ctest exclude
- remove source /.../setup-env.sh, replace with $GITHUB_ENV
- Recover from rocm-smi errors
  - Disabling rocm-smi after recovering from errors
  - Werror in developer mode
  - Remove State::DelayedInit
  - Add State::Disabled
- based on the ROCm version, we need either <rccl/rccl.h> or <rccl.h>
- updated tests to use configuration files
  - many tests generate a configuration file
  - tests now have GPU option
  - enable ncclCommCount, disable ncclGetVersion
  - add testing for RCCLP via rccl-tests
  - working directory of tests is PROJECT_BINARY_DIR
  - add nccl/rccl functions to get_whole_function_names
  - some clang compiler fixes
jrmadsen added the omnitrace-instrument label on Jul 25, 2022: Involves the omnitrace-instrument executable (binary instrumenter)
Labels
- cmake: Modifies the CMake build system
- configuration: Changes/involves configuration options
- cpack: Modifies the CPack packaging system
- enhancement: New feature or request
- libomnitrace: Involves omnitrace library
- omnitrace-instrument: Involves the omnitrace-instrument executable (binary instrumenter)
- rccl: ROCm Communication Collectives Library
- testing: Extends/improves/modifies testing