prov/efa: Fix the ep list scan in cq/cntr read #10543

Open
wants to merge 1 commit into
base: main

shijin-aws
Contributor

We cannot iterate over the eps and post the initial batch of internal rx pkts only once, because more eps can join after the first cq read call. This patch fixes that by introducing a bit in the cq/cntr that indicates whether an ep list scan is needed. The bit is set to true when a new ep is bound to the cq, and is set back to false each time a scan completes.
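
To illustrate the idea, here is a minimal, self-contained sketch of the flag logic. It is not the code in this patch: the `sketch_*` types and functions, the `need_ep_scan` and `rx_posted` fields, and the `scans_done` counter are all hypothetical names made up for the example, and the real provider structures are more involved.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified stand-in for an endpoint bound to a cq. */
struct sketch_ep {
    bool rx_posted;            /* initial internal rx pkts already posted? */
    struct sketch_ep *next;    /* next ep bound to the same cq */
};

/* Hypothetical, simplified stand-in for a cq (a cntr would look the same). */
struct sketch_cq {
    struct sketch_ep *ep_list; /* eps currently bound to this cq */
    bool need_ep_scan;         /* set on bind, cleared after a scan */
    int scans_done;            /* illustration only: counts ep list scans */
};

/* Bind an ep to the cq and mark the cq as needing an ep list scan. */
void sketch_cq_bind_ep(struct sketch_cq *cq, struct sketch_ep *ep)
{
    assert(cq != NULL && ep != NULL);
    ep->rx_posted = false;
    ep->next = cq->ep_list;
    cq->ep_list = ep;
    cq->need_ep_scan = true;
}

/* Read path: scan the ep list only when a bind happened since the last scan. */
void sketch_cq_read(struct sketch_cq *cq)
{
    assert(cq != NULL);
    if (cq->need_ep_scan) {
        for (struct sketch_ep *ep = cq->ep_list; ep != NULL; ep = ep->next) {
            if (!ep->rx_posted) {
                /* Placeholder for posting the initial internal rx pkts. */
                ep->rx_posted = true;
            }
        }
        cq->need_ep_scan = false;
        cq->scans_done++;
    }
    /* Placeholder: polling the device cq for completions would go here. */
}
```

In this sketch, an ep bound after the first read sets the flag again, so the next read scans the list and posts rx pkts for the late joiner, while reads with no new bind in between skip the scan entirely.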

Signed-off-by: Shi Jin <sjina@amazon.com>
@aws-nslick
Contributor

tested and confirmed that this fixed an issue with nccl-net-ofi

@shijin-aws
Contributor Author

bot:aws:retest
