TL/MLX5: a2a part 1 -- coll init #790

Merged: 4 commits into openucx:master from tl_mlx5/a2a_part1 on Jun 29, 2023

Conversation

samnordmann
Collaborator

What

a2a part 1 -- coll init

@samnordmann samnordmann force-pushed the tl_mlx5/a2a_part1 branch 3 times, most recently from 92af19e to f1881da Compare June 5, 2023 10:59
@samnordmann samnordmann force-pushed the tl_mlx5/a2a_part1 branch 4 times, most recently from 82e35e2 to 6d5e1c2 Compare June 5, 2023 14:08
Resolved review threads (outdated):
src/components/tl/mlx5/alltoall/alltoall.h (3 threads)
src/components/tl/mlx5/tl_mlx5.c (1 thread)
src/components/tl/mlx5/alltoall/alltoall_mkeys.h (1 thread)
src/components/tl/mlx5/alltoall/alltoall.c (1 thread)
src/components/tl/mlx5/alltoall/alltoall_mkeys.c (3 threads)
src/components/tl/mlx5/tl_mlx5_team.c (1 thread)
@samnordmann samnordmann requested a review from shimmybalsam June 8, 2023 16:42
@samnordmann samnordmann force-pushed the tl_mlx5/a2a_part1 branch 7 times, most recently from 45064ab to bb36257 Compare June 11, 2023 11:24
Resolved review threads:
src/components/tl/mlx5/alltoall/alltoall.h (3 threads, outdated)
src/components/tl/mlx5/alltoall/alltoall.c (5 threads outdated, 1 current)
src/components/tl/mlx5/alltoall/alltoall_mkeys.h (1 thread, outdated)
src/components/tl/mlx5/alltoall/alltoall_mkeys.c (1 thread, outdated)
@manjugv
Contributor

manjugv commented Jun 14, 2023

Depends on #784

@samnordmann samnordmann force-pushed the tl_mlx5/a2a_part1 branch 2 times, most recently from 5445058 to 26ca009 Compare June 19, 2023 19:39
@Sergei-Lebedev
Contributor

@samnordmann please rebase

@samnordmann
Collaborator Author

@edgargabriel we are getting an error with Linter-ROCM. Any idea?
Example:

#warning "hip_version.h has moved to /opt/rocm-5.6.0/include/hip and package include paths have changed. Provide include path as /opt/rocm-5.6.0/include when using cmake packages."

@shimmybalsam
Collaborator

Hi @edgargabriel I am getting this error in PR #801 as well.

@edgargabriel
Contributor

I will have a look

@manjugv manjugv self-requested a review June 29, 2023 13:24
@edgargabriel
Contributor

Not entirely sure why that is happening. UCC compiles for me with ROCm 5.6 without issues, so it is probably a minor difference in the environment. I will try to remove an include path that is there for historical reasons but is no longer required; I think that is what is causing the issue.

@Sergei-Lebedev Sergei-Lebedev enabled auto-merge (squash) June 29, 2023 13:38
@edgargabriel
Contributor

I think this is the cause of the issue: the version check fails for whatever reason, and because of that the build adds the include path to the deprecated directories (which causes the issue). This check doesn't fail on my test systems.

checking if ROCm version is 5.0 or above... no 
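
To illustrate how a check like this can answer "no" even when a recent ROCm is installed, here is a minimal C sketch of a "5.0 or above" comparison. It is not UCC's configure logic: the helper name `rocm_version_at_least` and the idea that the version arrives as a dotted string are assumptions made for the example. It only shows one possible failure mode, where a missing or unparsable version string falls through to the "no" branch.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical helper, not part of UCC: returns 1 if `version` (for example
 * "5.6.0") is at least req_major.req_minor, and 0 otherwise. A NULL, empty,
 * or unparsable string also yields 0, which is the "no" answer. */
static int rocm_version_at_least(const char *version, int req_major,
                                 int req_minor)
{
    int major = 0;
    int minor = 0;

    assert(req_major >= 0 && req_minor >= 0);

    /* sscanf must convert both fields; anything else counts as unreadable. */
    if (version == NULL || sscanf(version, "%d.%d", &major, &minor) != 2) {
        return 0;
    }

    return (major > req_major) ||
           (major == req_major && minor >= req_minor);
}
```

With this sketch, `rocm_version_at_least("5.6.0", 5, 0)` is true, while an empty string gives the same "no" as a genuinely old version such as "4.5.2", so the two cases cannot be told apart from the result alone.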

@Sergei-Lebedev Sergei-Lebedev merged commit 22dc12d into openucx:master Jun 29, 2023
@samnordmann samnordmann deleted the tl_mlx5/a2a_part1 branch June 29, 2023 14:24
jeniaka pushed a commit to jeniaka/ucc that referenced this pull request Aug 14, 2023
* TL/MLX5: a2a part 1 -- coll init

* BUILD: fix inclusion path
janjust pushed a commit to janjust/ucc that referenced this pull request Jan 31, 2024
* TL/MLX5: a2a part 1 -- coll init

* BUILD: fix inclusion path