Reset skip error counter on a fabric link if it was down. #3247
Related to sonic-net/sonic-buildimage#19288
@saksarav-nokia to review too
Tested the fix and it seems to be working. However, I see a lot of these log messages, and I am not sure we need them at NOTICE level: 2024 Aug 5 20:50:16.520538 ixre-cpm-chassis12 NOTICE swss0#orchagent: :- updateStateDbTable: PORT0 updates POLL_WITH_ERRORS to 0 0
I can change the message level to INFO.
Changed the log level to INFO, so it will not show up by default now.
Can anyone help review again and see if we can merge this? Thank you.
as comments
test_HashSwitchGlobalConfiguration[inner-frame-ecmp-hash] failed (1 runs remaining out of 2).
/azpw run Azure.sonic-swss
/AzurePipelines run Azure.sonic-swss
Azure Pipelines successfully started running 1 pipeline(s).
I don't quite understand the coverage results. I added a test to cover the change, and the coverage tests passed before the pipeline was restarted. How can it fail the coverage test after the rerun? Starting again: /azpw run Azure.sonic-swss
/azpw run Azure.sonic-swss
/AzurePipelines run Azure.sonic-swss
Azure Pipelines successfully started running 1 pipeline(s).
/azpw run coverage.Azure.sonic-swss.vstest
@prsunny, can you please help merge this PR?
Cherry-pick PR to 202405: #3281
What I did
Reset the skip error counters for fabric link monitoring if a link went down and came back up.
Why I did it
The fabric link monitoring feature does not handle peer card restarts. A peer card reload can cause links to go into the isolation state, and that is a case where the errors can be ignored because they come from the init churn. So in this change, the skip-error-on-init counters are reset if a link went down and came back up.
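The behavior described above can be illustrated with a minimal Python sketch. It is not the orchagent code from this PR: the names `FabricLinkState`, `poll_link`, and `SKIP_POLLS_ON_INIT`, the threshold value, and the count-up form of the skip counter are all assumptions made for illustration only.

```python
from dataclasses import dataclass

# Hypothetical threshold: how many polls after link-up have their errors ignored.
SKIP_POLLS_ON_INIT = 3


@dataclass
class FabricLinkState:
    """Per-link monitoring state (illustrative, not the real STATE_DB schema)."""
    up: bool = False
    skipped_polls: int = 0      # polls already skipped since the last link-up
    polls_with_errors: int = 0  # polls that counted toward isolation


def poll_link(state: FabricLinkState, link_up: bool, has_errors: bool) -> bool:
    """Process one poll. Return True if this poll's errors were counted."""
    if not link_up:
        # Remember that the link went down so the next link-up is detected.
        state.up = False
        return False

    if not state.up:
        # Down -> up transition: reset the skip counter so that errors from
        # the init churn (for example, a peer card reload) are ignored again.
        state.up = True
        state.skipped_polls = 0

    if state.skipped_polls < SKIP_POLLS_ON_INIT:
        # Still inside the grace period: skip this poll regardless of errors.
        state.skipped_polls += 1
        return False

    if has_errors:
        state.polls_with_errors += 1
        return True
    return False
```

In this sketch, a link that flaps gets a fresh grace period on every down-to-up transition, so errors seen right after a peer card restart do not count toward isolation.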
How I verified it
Details if related