Take non-CMIS xcvrs out of lpmode in SFF Manager #565
@arlakshm @wenyiz2021 for awareness
LGTM
Yes, along with enabling the SFF manager in xcvrd per platform config (sonic-net/sonic-buildimage#20886).
On Thu, Nov 21, 2024 at 5:57 PM wenyiz2021 wrote:
do we only need this PR for the LP mode causing links down issue?
@mihirpat1 can you please review
sonic-xcvrd/xcvrd/sff_mgr.py
Outdated
@@ -435,6 +435,8 @@ def task_worker(self):
# Skip if these essential routines are not available
continue

sfp.set_lpmode(False)
@peterbailey-arista I would suggest using sfp.set_lpmode() only for SFPs that follow SFF8472. All other transceivers, such as QSFP+ and QSFP28, can support lpmode via an EEPROM write. The code above expects each platform to implement set_lpmode(), even though that is NOT required for QSFP-based modules.
if <SFP-type module>:
    sfp.set_lpmode(False)
else:
    api.set_lpmode(False)
After discussing with @byu343, I ended up wrapping api.set_lpmode in a try/except instead. Please let me know if this update works for you. Thanks!
c59fabf
to
78d427e
Compare
sonic-xcvrd/xcvrd/sff_mgr.py
Outdated
@@ -435,6 +435,12 @@ def task_worker(self):
# Skip if these essential routines are not available
continue

try:
    api = sfp.get_xcvr_api()
    api.set_lpmode(False)
@peterbailey-arista Per the implementation below, an exception will not be raised for SFF-8472 modules. Can you please handle this accordingly?
https://github.com/sonic-net/sonic-platform-common/blob/0f2e22faccd093a1e5d18235fe119a860be7855e/sonic_platform_base/sonic_xcvr/api/public/sff8472.py#L308
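The concern in this comment can be illustrated with a minimal sketch. It assumes, as the reviewer states, that set_lpmode for SFF-8472 modules does not raise; the stub returning False here is an assumption for illustration only. The class and function names below are hypothetical stand-ins, not the real sonic_xcvr classes.

```python
class StubSff8472Api:
    """Hypothetical stand-in: assumed to signal failure by return value, not by raising."""

    def set_lpmode(self, lpmode):
        return False


class StubRaisingApi:
    """Hypothetical stand-in for an API whose set_lpmode raises when unsupported."""

    def set_lpmode(self, lpmode):
        raise NotImplementedError


def try_clear_lpmode(api):
    """Mimic a try/except-only approach and report which path was taken."""
    try:
        api.set_lpmode(False)
    except Exception:
        return "exception"
    return "no exception"
```

With a try/except alone, the first stub takes the "no exception" path even though lpmode was not changed, so the failure would go unnoticed.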
I've now updated it to use sfp.set_lpmode() only for SFPs implementing SFF8472, as originally suggested.
sonic-xcvrd/xcvrd/sff_mgr.py
Outdated
@@ -435,6 +435,12 @@ def task_worker(self):
# Skip if these essential routines are not available
continue

try:
    api = sfp.get_xcvr_api()
api has already been obtained earlier in the code:
api = sfp.get_xcvr_api()
78d427e
to
3d14308
Compare
sonic-xcvrd/xcvrd/sff_mgr.py
Outdated
@@ -435,6 +436,11 @@ def task_worker(self):
# Skip if these essential routines are not available
continue

if isinstance(api, Sff8472Api):
    sfp.set_lpmode(False)
@peterbailey-arista Can we check the return value in both cases and log an error if it returns False?
Also, can you please help in fixing the build failure?
I added the check with the new error log. But I am not sure how to resolve the build failure; it does not seem to be related to my change. Do you have any suggestions?
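The approach agreed on in this thread (platform sfp.set_lpmode() for SFF-8472 modules, api.set_lpmode() otherwise, with an error logged on a False return) could look roughly like the sketch below. This is not the code merged in the PR: Sff8472Api here is a local stand-in for the class named in the diff, and StubOtherApi, StubSfp, take_out_of_lpmode, the logger name, and the log message are all hypothetical.

```python
import logging

logger = logging.getLogger("sff_mgr_sketch")  # hypothetical logger name


class Sff8472Api:
    """Local stand-in for the real Sff8472Api class referenced in the diff."""


class StubOtherApi:
    """Hypothetical non-SFF-8472 API with a configurable set_lpmode result."""

    def __init__(self, result=True):
        self.result = result

    def set_lpmode(self, lpmode):
        return self.result


class StubSfp:
    """Hypothetical platform SFP object that records set_lpmode calls."""

    def __init__(self, api, platform_result=True):
        self._api = api
        self.platform_result = platform_result
        self.calls = []

    def get_xcvr_api(self):
        return self._api

    def set_lpmode(self, lpmode):
        self.calls.append(lpmode)
        return self.platform_result


def take_out_of_lpmode(sfp, api, lport):
    """Clear lpmode through the path suited to the module type; log on failure."""
    if isinstance(api, Sff8472Api):
        ok = sfp.set_lpmode(False)   # platform-level call for SFF-8472 modules
    else:
        ok = api.set_lpmode(False)   # API-level call for other modules
    if not ok:
        logger.error("%s: failed to take module out of lpmode", lport)
    return bool(ok)
```

Checking the return value in both branches means a module that reports failure without raising is still logged.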
/azpw run Azure.sonic-platform-daemons
Above build failure is unrelated to the change in this PR. The failure is caused by:
|
3d14308
to
072a3e6
Compare
/azpw run Azure.sonic-platform-daemons
/AzurePipelines run Azure.sonic-platform-daemons
Azure Pipelines successfully started running 1 pipeline(s).
sonic-xcvrd/xcvrd/sff_mgr.py
Outdated
@@ -435,6 +436,17 @@ def task_worker(self):
# Skip if these essential routines are not available
continue

set_lp_success = (
I assume lpmode handling is only needed at a module insertion event, right?
Just a minor suggestion:
Adding the condition check below may avoid unnecessary lpmode handling in other cases (e.g. admin_status/host_tx_ready being changed by config interface shutdown/startup).
if xcvr_inserted:
    <lpmode logic>
Unfortunately, the described scenario still requires setting lpmode to False. If you shut down and then start up the interface without bringing it out of lpmode, the interface remains down even if it was up before the shutdown.
I thought lpmode can get reset after a module reset (sfputil reset), but an interface shut/start (i.e. NPU/PHY/laser tx ON/OFF) wouldn't affect lpmode or other module state unless the user/platform/vendor explicitly triggers something additional for the module as part of the interface shut/start. Is that not the case here?
My mistake, I believe you are correct. I'll add that change, thanks
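A minimal sketch of the gating suggested above, assuming the lpmode handling should run only on module insertion. handle_port_event and clear_lpmode are hypothetical names for illustration; only xcvr_inserted comes from the review comment.

```python
def handle_port_event(xcvr_inserted, clear_lpmode):
    """Run the lpmode logic only when the event is a module insertion.

    Returns True if clear_lpmode was invoked, False if it was skipped
    (for example on admin_status or host_tx_ready changes).
    """
    if not xcvr_inserted:
        return False
    clear_lpmode()
    return True
```

Passing the lpmode logic in as a callable keeps the gate separate from how lpmode is actually cleared.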
/azpw run Azure.sonic-platform-daemons
/AzurePipelines run Azure.sonic-platform-daemons
Azure Pipelines successfully started running 1 pipeline(s).
/Azp Azure.sonic-platform-daemons
Command 'Azure.sonic-platform-daemons' is not supported by Azure Pipelines.
Fix non-CMIS transceivers in down state by bringing them out of lpmode in the SFF Manager Task.
072a3e6
to
94b96b6
Compare
/azpw run Azure.sonic-platform-daemons
/AzurePipelines run Azure.sonic-platform-daemons
Azure Pipelines successfully started running 1 pipeline(s).
@longhuan-cisco @mihirpat1, can you please approve this change if all the comments are addressed?
@peterbailey-arista The changes look good to me. Can you please help in resolving the build failure?
/Azp run Azure.sonic-platform-daemons
Azure Pipelines successfully started running 1 pipeline(s).
Description
Fix non-CMIS transceivers in down state by bringing them out of low power mode in the SFF Manager Task.
This is intended to work together with the change in sonic-net/sonic-buildimage#20886.
Motivation and Context
Non-CMIS transceivers were not functioning correctly when put into low power mode, so xcvrd now brings them out of lpmode.
How Has This Been Tested?
Loaded an image containing this change alongside the change from sonic-net/sonic-buildimage#20886 on an Arista chassis containing a Clearwater2 linecard.
Verified that without this image some interfaces were in a down state, but with the image all interfaces came up as expected.
Additional Information (Optional)