prov/tcp: Set FI_MULTI_RECV for last completed RX slice #9420
Conversation
The behavior looks correct to me. I would like to move the new functionality around a little bit to limit the extra overhead of managing the new struct.
CI failed running mpi over tcp. Restarted it, but the failure may be related to the changes. I didn't notice anything off on the updated changes on a first review, so I may be overlooking an issue. The changes looked good to me.
prov/tcp/src/xnet_cq.c
flags = xfer_entry->cq_flags & ~FI_COMPLETION;
if (flags & FI_RECV) {
	len = xnet_msg_len(&xfer_entry->hdr);
	if (xfer_entry->mrecv) {
Wait -- this check doesn't work since mrecv was moved into the union. Checks like this may need to be updated to something like: xfer_entry->ctrl_flags & FI_MULTI_RECV
I meant XNET_MULTI_RECV for the ctrl_flag... it's the same value as FI_MULTI_RECV, but I'd use the internal definition instead.
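The suggestion above can be sketched as follows. This is a minimal, self-contained illustration, not the provider's actual headers: the struct, the flag value, and the helper name are all assumptions made for demonstration. The point is that once mrecv lives in a union, the pointer may hold another member's bits, so the internal control flag must be tested before the pointer is ever trusted.

```c
/* Hedged sketch -- illustrative types and values, not libfabric's real ones. */
#include <assert.h>
#include <stdint.h>
#include <stddef.h>

#define FI_MULTI_RECV_SKETCH   (1ULL << 16)        /* bit value illustrative only */
#define XNET_MULTI_RECV_SKETCH FI_MULTI_RECV_SKETCH /* internal alias, per review */

struct xfer_entry_sketch {
	uint64_t ctrl_flags;
	union {
		void *mrecv;     /* only meaningful when XNET_MULTI_RECV_SKETCH is set */
		void *other_use; /* union member reused on non-multi-recv paths */
	};
};

/* Test the control flag first; the mrecv pointer may be garbage otherwise. */
static int is_multi_recv(const struct xfer_entry_sketch *e)
{
	return (e->ctrl_flags & XNET_MULTI_RECV_SKETCH) != 0;
}
```

A check like `if (xfer_entry->mrecv)` alone would misfire whenever the other union member happens to be non-NULL, which matches the v4 note below about mrecv possibly being garbage.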
Yeah, it was in the back of my mind. v1 guarantees mrecv but v2 does not. There are two possible cases where mrecv is NULL: one, obviously, is a failed allocation (unlikely); the second is when the first receive consumes the RX buffer beyond the minimum. I'm guessing the second case is causing the failure. I will make a change to accommodate these.
Force-pushed b3d0724 to c0d0b31
Looks better. MPI_TCP passed.
Thanks -- I missed catching the earlier problem when I asked you to move mrecv into a union, which broke the code you had. Code looks reasonable, but I did list a couple cleanups that I think are possible.
When FI_MULTI_RECV is enabled for an RX buffer, the last completed slice of the RX buffer should have FI_MULTI_RECV set in the corresponding CQE.

Currently the TCP provider assumes the last slice of the RX buffer will complete last and sets the FI_MULTI_RECV flag when the last slice is used. Completions can arrive out of order when multiple connections with varied payload sizes share the same RX buffer, so the last slice will not always complete last.

To make sure the last completed RX buffer slice has FI_MULTI_RECV set in its CQE, use a reference counter to track outstanding slices. The count is initialized to 1 when the RX buffer is first posted and incremented on every slice allocation. Upon completion, the count is decremented; when it reaches 0, FI_MULTI_RECV is set for that CQE.

v2 - Minimize impact to non-multi-recv SRX code.
v3 - Fix completion path: set FI_MULTI_RECV if mrecv is NULL and recv_entry is XNET_MULTI_RECV.
v4 - Check ctrl_flag against XNET_MULTI_RECV first, as mrecv may be garbage.

Signed-off-by: Chien Tin Tung <chien.tin.tung@intel.com>
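The reference-counting scheme described in the commit message can be sketched as below. All names here are hypothetical stand-ins for the provider's internals; only the counting discipline follows the description: the posted buffer holds one reference, each allocated slice holds one more, and the completion that drops the count to zero is the one that reports FI_MULTI_RECV.

```c
/* Hedged sketch of the multi-recv reference count -- illustrative names only. */
#include <assert.h>

struct mrecv_sketch {
	int ref; /* 1 for the posted buffer + 1 per outstanding slice */
};

/* Posting the RX buffer: count starts at 1. */
static void mrecv_post(struct mrecv_sketch *m)
{
	m->ref = 1;
}

/* Each slice carved from the buffer takes a reference. */
static void slice_alloc(struct mrecv_sketch *m)
{
	m->ref++;
}

/* When the buffer falls below the minimum and is retired, drop the
 * posting reference. Returns nonzero if no slices remain outstanding. */
static int mrecv_retire(struct mrecv_sketch *m)
{
	return --m->ref == 0;
}

/* On slice completion, drop its reference. The completion that brings
 * the count to zero should carry FI_MULTI_RECV in its CQE. */
static int slice_complete_sets_multi_recv(struct mrecv_sketch *m)
{
	return --m->ref == 0;
}
```

Because every outstanding slice pins the count above zero, it no longer matters in what order completions arrive across connections; only the final decrement signals buffer release.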
Looks good -- thanks! Just waiting for CI to catch up, but it is passing our tests.