HLD for changing teamd expiry timer #1073
Signed-off-by: Saikrishna Arcot <sarcot@microsoft.com>
## Requirements

- Switch running a supported SONiC with patches in libteam for this feature on
Will this feature (patch) stay as a patch in libteam specifically for SONiC, or will it be pushed to the libteam community as well?
Because this is effectively breaking the LACP protocol, I'm not planning on submitting this patch upstream.
# Protocol

To change the number of retries, an Ethernet packet of the following structure
will be sent. This Ethernet packet will have an ethertype of 0x6300, and will
Any reason for ethertype 0x6300 selection?
This is just meant to be a custom ethertype that appears to be unused. I needed something that was unused and is unlikely to get treated like a "normal" data packet.
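To illustrate why a dedicated ethertype keeps these packets apart from ordinary traffic, here is a minimal sketch of building and classifying untagged Ethernet II frames. The function names are hypothetical and not taken from libteam; 0x8809 is the standard Slow Protocols ethertype that carries LACP, and 0x6300 is the custom value proposed in the HLD. VLAN-tagged frames are not handled.

```python
import struct

ETHERTYPE_SLOW_PROTOCOLS = 0x8809  # standard ethertype that carries LACP
ETHERTYPE_RETRY_COUNT = 0x6300     # custom ethertype proposed in this HLD


def build_frame(dst_mac: bytes, src_mac: bytes, ethertype: int, payload: bytes) -> bytes:
    """Prepend a 14-byte Ethernet II header (untagged) to the payload."""
    if len(dst_mac) != 6 or len(src_mac) != 6:
        raise ValueError("MAC addresses must be 6 bytes")
    return dst_mac + src_mac + struct.pack("!H", ethertype) + payload


def classify_frame(frame: bytes) -> str:
    """Return a label for the frame based only on its ethertype field."""
    if len(frame) < 14:
        return "truncated"
    (ethertype,) = struct.unpack("!H", frame[12:14])
    if ethertype == ETHERTYPE_RETRY_COUNT:
        return "retry-count"
    if ethertype == ETHERTYPE_SLOW_PROTOCOLS:
        return "slow-protocols"
    return "other"
```

A receiver that does not know about 0x6300 would fall into the "other" branch and, on a typical device, drop the frame rather than treat it as data.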
# CLI

No new CLI options or config options will be added, as this is not meant to be
A config knob may be required to avoid sending unnecessary 0x6300 packets during warm-reboot when SONiC is connected to a non-SONiC device.
I would like to see a configurable option to enable this feature; the default should be left disabled, with no custom TLV to interfere with WB. The HLD diverges from the standard LACP protocol definition, using a custom TLV to overcome SONiC WB timing, and potentially a phase 2 send-and-ack mechanism in the future that could block WB from proceeding. For deployments where non-standard protocol packets are forbidden, we need a configurable option to control this behavior, preferably leaving it disabled by default.
The HLD has been updated with a new design/approach. By default, this feature is disabled, so there won't be any custom packets unless configured.
Now, in addition to refreshing the PDUs timer, the above-specified Ethernet
packet (with ethertype 0x6300) will be sent to the peer devices, with the new
retry count set to 5. This notifies the peer device that for this device that
Why can't we make this retry count user-configurable, with a default value of 5?
The HLD has been updated with a new design/approach. The retry count is now user-configurable.
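For context on what the retry count in this thread controls, here is a small worked example. It assumes the expiry timer is simply the retry count multiplied by the LACP periodic interval, which matches standard LACP (three intervals: 3 s at the fast rate, 90 s at the slow rate). The function name and parameters are hypothetical and are not teamd's API.

```python
LACP_FAST_INTERVAL_S = 1   # fast periodic time in standard LACP
LACP_SLOW_INTERVAL_S = 30  # slow periodic time in standard LACP
DEFAULT_RETRY_COUNT = 3    # standard LACP expires after three missed intervals


def expiry_seconds(retry_count: int = DEFAULT_RETRY_COUNT, fast_rate: bool = True) -> int:
    """Seconds without a PDU before the peer is considered expired.

    Assumes expiry = retry_count * periodic interval.
    """
    if retry_count < 1:
        raise ValueError("retry_count must be at least 1")
    interval = LACP_FAST_INTERVAL_S if fast_rate else LACP_SLOW_INTERVAL_S
    return retry_count * interval


print(expiry_seconds())                    # standard fast rate
print(expiry_seconds(5))                   # retry count raised to 5, fast rate
print(expiry_seconds(5, fast_rate=False))  # retry count raised to 5, slow rate
```

Under this assumption, raising the retry count from 3 to 5 at the fast rate extends the window in which the peer keeps the LAG up from 3 s to 5 s.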
with ethertype 0x6300, and the data will contain the Actor Information, Partner
Information, and Retry Count TLVs. The receiving device must validate the actor
and partner information, and then update the retry count as specified. No
acknowledgment packet is sent back.
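The receive path described in this hunk could be sketched roughly as follows. This is an illustration under several assumptions, not the HLD's wire format: only the Actor Information type value (0x01) appears in the HLD table, so the Partner Information and Retry Count type values, the 1-byte length field, the opaque treatment of actor/partner data, and the crosswise comparison used for validation are all hypothetical.

```python
import struct
from typing import Dict

TLV_ACTOR_INFO = 0x01    # value listed in the HLD table
TLV_PARTNER_INFO = 0x02  # assumption: not given in this excerpt
TLV_RETRY_COUNT = 0x03   # assumption: not given in this excerpt


def encode_tlv(tlv_type: int, value: bytes) -> bytes:
    """Encode one TLV as type (1 byte), value length (1 byte), value."""
    if len(value) > 255:
        raise ValueError("value too long for a 1-byte length field")
    return struct.pack("!BB", tlv_type, len(value)) + value


def decode_tlvs(payload: bytes) -> Dict[int, bytes]:
    """Split a payload into a {type: value} mapping, rejecting truncated data."""
    tlvs: Dict[int, bytes] = {}
    offset = 0
    while offset < len(payload):
        if offset + 2 > len(payload):
            raise ValueError("truncated TLV header")
        tlv_type, length = struct.unpack_from("!BB", payload, offset)
        offset += 2
        if offset + length > len(payload):
            raise ValueError("truncated TLV value")
        tlvs[tlv_type] = payload[offset:offset + length]
        offset += length
    return tlvs


def build_payload(actor: bytes, partner: bytes, retry_count: int) -> bytes:
    """Build the sender's payload: actor, partner, and retry count TLVs."""
    return (encode_tlv(TLV_ACTOR_INFO, actor)
            + encode_tlv(TLV_PARTNER_INFO, partner)
            + encode_tlv(TLV_RETRY_COUNT, struct.pack("!B", retry_count)))


def handle_payload(payload: bytes, local_actor: bytes, local_partner: bytes,
                   current_retry_count: int) -> int:
    """Return the retry count to use after processing a received payload.

    The sender's actor should be our partner and the sender's partner should
    be us, so the comparison is crosswise. On any mismatch the current retry
    count is kept. No acknowledgment is generated.
    """
    tlvs = decode_tlvs(payload)
    required = {TLV_ACTOR_INFO, TLV_PARTNER_INFO, TLV_RETRY_COUNT}
    if required - tlvs.keys():
        return current_retry_count
    if tlvs[TLV_ACTOR_INFO] != local_partner or tlvs[TLV_PARTNER_INFO] != local_actor:
        return current_retry_count
    if len(tlvs[TLV_RETRY_COUNT]) != 1:
        return current_retry_count
    return tlvs[TLV_RETRY_COUNT][0]
```

Keeping the current value on any validation failure means a stray or spoofed packet cannot shorten or lengthen the timer for a LAG it does not belong to.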
How would you ensure the packet reached the peer without ack?
Ack will be added into the protocol.
| Value | Description       |
|-------|-------------------|
| 0x01  | Actor Information |
How about sending this special packet at the LAG level, instead of sending it on every LAG member, which can be heavy if the LAG is large (for example, a 64-member LAG)?
@saiarcot895 can you please help to add the code PRs into this HLD by referring to #806 ? Thanks.
@saiarcot895 can you please add the code PRs by referring to #806 ? Thanks.
No code has been committed yet for this. There are changes to the design of this feature, and this HLD needs to be updated. I'm moving this PR to draft status.
Signed-off-by: Saikrishna Arcot <sarcot@microsoft.com>
Recording for today's community review https://zoom.us/rec/share/9xfUbBRqllA9BpWfc3UN51f0Q6067KPXtpsLD9owQrUiRPtIpMjEaXpEDmDi8cTc.XSBxLjWF9FtvCjIM
added to 202305 release
move to post 202305 release. code PRs are not ready
Signed-off-by: Saikrishna Arcot <sarcot@microsoft.com>
This PR adds a HLD for changing the duration of teamd's expiry timer, by sending a message to the peer device with the number of retries it should do for this LAG.
Code PRs: