orchagent/portsorch: Missing scheduler group after SWSS restart #2174

Merged: 3 commits, Nov 11, 2022

@arvbb arvbb commented Mar 11, 2022

What I did
Added a function to query scheduler group objects during SWSS restart.

Why I did it
Scheduler group objects were removed because they were missing from the temporary view during an SWSS restart. To reproduce the issue:

sudo config warm_restart enable swss
sudo service swss restart
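
The effect described above can be illustrated with a simplified model. This sketch is not syncd's actual comparison logic; it only assumes that an object present in the current view but absent from the temporary view is scheduled for removal. The function name and the generated OIDs are hypothetical, while the counts 321 and 33 are taken from the log in this PR.

```python
def plan_removals(current_view, temp_view):
    """Return OIDs present in the current view but missing from the temp view.

    Simplified assumption: such objects are removed when the view is applied.
    """
    return sorted(set(current_view) - set(temp_view))


# Hypothetical OIDs; only the counts (321 and 33) come from the log.
current = [f"oid:0x17{n:014x}" for n in range(321)]
temp_before_fix = current[:33]  # only 33 scheduler groups reached the temp view
temp_after_fix = current        # assumed effect of the fix: all groups are queried

removed_before = plan_removals(current, temp_before_fix)
removed_after = plan_removals(current, temp_after_fix)
print(len(removed_before), len(removed_after))
```

Under these assumptions the model yields 288 removals before the fix, which matches the "loop removed 288 objects" line in the log, and none once every scheduler group is present in the temporary view.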

How I verified it
Verified that scheduler group objects are no longer removed after the fix and remain in ASIC_DB.
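
One way to run this kind of check is to compare the scheduler group keys in ASIC_DB before and after the restart. The sketch below assumes keys have the form `ASIC_STATE:<object type>:oid:<hex>`; the helper names and sample keys are hypothetical, and fetching the keys from Redis is deliberately left out.

```python
# Assumed key prefix for scheduler group entries in ASIC_DB.
SCHED_GROUP_PREFIX = "ASIC_STATE:SAI_OBJECT_TYPE_SCHEDULER_GROUP:"


def scheduler_group_oids(keys):
    """Collect scheduler group OIDs from an iterable of ASIC_DB key strings."""
    return {
        key[len(SCHED_GROUP_PREFIX):]
        for key in keys
        if key.startswith(SCHED_GROUP_PREFIX)
    }


def missing_after_restart(keys_before, keys_after):
    """Return scheduler group OIDs that existed before the restart but not after."""
    return sorted(scheduler_group_oids(keys_before) - scheduler_group_oids(keys_after))


# Hypothetical key dumps taken before and after 'service swss restart'.
before = [
    "ASIC_STATE:SAI_OBJECT_TYPE_SCHEDULER_GROUP:oid:0x1700000000005e",
    "ASIC_STATE:SAI_OBJECT_TYPE_SCHEDULER_GROUP:oid:0x1700000000005f",
    "ASIC_STATE:SAI_OBJECT_TYPE_PORT:oid:0x1000000000050",
]
after = [
    "ASIC_STATE:SAI_OBJECT_TYPE_SCHEDULER_GROUP:oid:0x1700000000005e",
    "ASIC_STATE:SAI_OBJECT_TYPE_PORT:oid:0x1000000000050",
]
print(missing_after_restart(before, after))
```

An empty result would indicate that no scheduler group was dropped across the restart.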

Details if related
Mar 1 22:46:39.015724 sonic NOTICE swss#orchagent: :- setWarmStartState: orchagent warm start state changed to restored
Mar 1 22:46:39.015724 sonic NOTICE swss#orchagent: :- warmRestoreAndSyncUp: Orchagent state restore done
Mar 1 22:46:39.015724 sonic NOTICE swss#orchagent: :- syncd_apply_view: Notify syncd APPLY_VIEW
Mar 1 22:46:39.015724 sonic NOTICE swss#orchagent: :- notifySyncd: sending syncd: APPLY_VIEW
Mar 1 22:46:39.016078 sonic WARNING syncd[24]: :- processNotifySyncd: syncd received APPLY VIEW, will translate
Mar 1 22:46:39.155799 sonic NOTICE syncd[24]: :- dump: getting took 0.139421 sec
Mar 1 22:46:39.172973 sonic NOTICE syncd[24]: :- getAsicView: ASIC_STATE switch count: 1:
Mar 1 22:46:39.172973 sonic NOTICE syncd[24]: :- getAsicView: oid:0x21000000000000: objects count: 7529
Mar 1 22:46:39.175515 sonic NOTICE syncd[24]: :- getAsicView: get asic view from ASIC_STATE took 0.159394 sec
Mar 1 22:46:39.304183 sonic NOTICE syncd[24]: :- dump: getting took 0.128388 sec
Mar 1 22:46:39.318381 sonic NOTICE syncd[24]: :- getAsicView: TEMP_ASIC_STATE switch count: 1:
Mar 1 22:46:39.318381 sonic NOTICE syncd[24]: :- getAsicView: oid:0x21000000000000: objects count: 7174
Mar 1 22:46:39.320676 sonic NOTICE syncd[24]: :- getAsicView: get asic view from TEMP_ASIC_STATE took 0.145098 sec
Mar 1 22:46:39.412827 sonic NOTICE syncd[24]: :- ComparisonLogic: srand seed for switch oid:0x21000000000000: 1646174799
Mar 1 22:46:39.413412 sonic NOTICE syncd[24]: :- matchOids: matched oids
Mar 1 22:46:39.413412 sonic NOTICE syncd[24]: :- populateExistingObjects: populate existing objects
Mar 1 22:46:39.414330 sonic NOTICE syncd[24]: :- checkInternalObjects: check internal objects
Mar 1 22:46:39.414519 sonic WARNING syncd[24]: :- checkInternalObjects: different number of objects SAI_OBJECT_TYPE_SCHEDULER_GROUP, curr: 321, tmp 33 (not expected if warm boot)
Mar 1 22:46:39.414519 sonic ERR syncd[24]: :- checkInternalObjects: object status is not MATCHED on curr: SAI_OBJECT_TYPE_SCHEDULER_GROUP:oid:0x1700000000005e
Mar 1 22:46:39.414519 sonic ERR syncd[24]: :- checkInternalObjects: object status is not MATCHED on curr: SAI_OBJECT_TYPE_SCHEDULER_GROUP:oid:0x1700000000005f
Mar 1 22:46:39.414519 sonic ERR syncd[24]: :- checkInternalObjects: object status is not MATCHED on curr: SAI_OBJECT_TYPE_SCHEDULER_GROUP:oid:0x17000000000060
..
..
Mar 1 22:46:39.421248 sonic ERR syncd[24]: :- checkInternalObjects: object status is not MATCHED on curr: SAI_OBJECT_TYPE_SCHEDULER_GROUP:oid:0x1700000000034e
Mar 1 22:46:39.439937 sonic NOTICE syncd[24]: :- createPreMatchMap: preMatch map size: 102, tmp oid obj: 186
Mar 1 22:46:39.439937 sonic NOTICE syncd[24]: :- createPreMatchMap: create preMatch map took 0.023570 sec
Mar 1 22:46:39.441150 sonic WARNING syncd[24]: :- logViewObjectCount: object count for SAI_OBJECT_TYPE_SCHEDULER_GROUP on current view 321 is different than on temporary view: 33
Mar 1 22:46:39.443028 sonic WARNING syncd[24]: :- logViewObjectCount: object count is different on both view, there will be ASIC OPERATIONS!
Mar 1 22:46:39.443028 sonic NOTICE syncd[24]: :- checkMatchedPorts: all ports are matched
Mar 1 22:46:39.443082 sonic WARNING syncd[24]: :- performObjectSetTransition: current attr is CREATE_ONLY and object is MATCHED: oid:0x1000000000050 transferring SAI_PORT_ATTR_HW_LANE_LIST:4:0,1,2,3 to temp object
Mar 1 22:46:39.443150 sonic WARNING syncd[24]: :- performObjectSetTransition: current attr is CREATE_ONLY and object is MATCHED: oid:0x1000000000068 transferring SAI_PORT_ATTR_HW_LANE_LIST:4:4,5,6,7 to temp object
..
..
Mar 1 22:46:39.444210 sonic WARNING syncd[24]: :- performObjectSetTransition: current attr is CREATE_ONLY and object is MATCHED: oid:0x1000000000338 transferring SAI_PORT_ATTR_HW_LANE_LIST:4:124,125,126,127 to temp object
Mar 1 22:46:39.516195 sonic NOTICE syncd[24]: :- applyViewTransition: loop removed 288 objects
Mar 1 22:46:39.516680 sonic NOTICE syncd[24]: :- applyViewTransition: comparison logic took 0.073610 sec
Mar 1 22:46:39.516680 sonic NOTICE syncd[24]: :- transferNotProcessed: calling transferNotProcessed
Mar 1 22:46:39.517046 sonic NOTICE syncd[24]: :- compareViews: ASIC operations to execute: 288
Mar 1 22:46:39.517992 sonic NOTICE syncd[24]: :- compareViews: all temporary view objects were processed to FINAL state
Mar 1 22:46:39.518395 sonic NOTICE syncd[24]: :- compareViews: all current view objects were processed to FINAL state
Mar 1 22:46:39.518395 sonic NOTICE syncd[24]: :- executeOperationsOnAsic: operations to execute on ASIC: 288
Mar 1 22:46:39.518416 sonic NOTICE syncd[24]: :- executeOperationsOnAsic: NOT optimized operations
Mar 1 22:46:39.518451 sonic NOTICE syncd[24]: :- executeOperationsOnAsic: remove: SAI_OBJECT_TYPE_SCHEDULER_GROUP:oid:0x1700000000005e
Mar 1 22:46:39.518451 sonic NOTICE syncd[24]: :- executeOperationsOnAsic: remove: SAI_OBJECT_TYPE_SCHEDULER_GROUP:oid:0x1700000000005f
..
..
Mar 1 22:46:39.527085 sonic NOTICE syncd[24]: :- executeOperationsOnAsic: operations on SAI_OBJECT_TYPE_SCHEDULER_GROUP: 288
Mar 1 22:46:39.527085 sonic NOTICE syncd[24]: :- asicGetWithOptimizedRemoveOperations: moved 288 REMOVE operations upper in stack from total 288 operations
Mar 1 22:46:39.527118 sonic NOTICE syncd[24]: :- asicGetWithOptimizedRemoveOperations: optimizing asic remove operations took 0.000160 sec
Mar 1 22:46:39.530969 sonic NOTICE syncd[24]: :- executeOperationsOnAsic: asic apply took 0.012517 sec
Mar 1 22:46:39.530969 sonic NOTICE syncd[24]: :- executeOperationsOnAsic: performed all operations on asic successfully
Mar 1 22:46:39.589620 sonic NOTICE syncd[24]: :- threadFunction: time span 573 ms for 'notify:APPLY_VIEW'
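
The object-count warnings and remove operations in a log like the one above can be extracted mechanically. A minimal sketch, assuming the message formats are exactly those shown in this excerpt; the regular expressions and the `summarize` helper are hypothetical and not part of syncd or any SONiC tool.

```python
import re
from collections import Counter

# Patterns assume the message formats seen in the excerpt above.
MISMATCH_RE = re.compile(
    r"checkInternalObjects: different number of objects (\w+), curr: (\d+), tmp (\d+)"
)
REMOVE_RE = re.compile(r"executeOperationsOnAsic: remove: (\w+):oid:0x[0-9a-f]+")


def summarize(lines):
    """Return ({object_type: (curr, tmp)}, Counter of removals per object type)."""
    mismatches = {}
    removals = Counter()
    for line in lines:
        match = MISMATCH_RE.search(line)
        if match:
            mismatches[match.group(1)] = (int(match.group(2)), int(match.group(3)))
            continue
        match = REMOVE_RE.search(line)
        if match:
            removals[match.group(1)] += 1
    return mismatches, removals


# Three lines copied from the log above.
sample = [
    "Mar 1 22:46:39.414519 sonic WARNING syncd[24]: :- checkInternalObjects: different number of objects SAI_OBJECT_TYPE_SCHEDULER_GROUP, curr: 321, tmp 33 (not expected if warm boot)",
    "Mar 1 22:46:39.518451 sonic NOTICE syncd[24]: :- executeOperationsOnAsic: remove: SAI_OBJECT_TYPE_SCHEDULER_GROUP:oid:0x1700000000005e",
    "Mar 1 22:46:39.518451 sonic NOTICE syncd[24]: :- executeOperationsOnAsic: remove: SAI_OBJECT_TYPE_SCHEDULER_GROUP:oid:0x1700000000005f",
]
mismatches, removals = summarize(sample)
print(mismatches, dict(removals))
```

Any mismatch reported for an object type during a warm restart, or any non-zero removal count, is the symptom this PR addresses.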

kcudnik previously approved these changes Mar 14, 2022

lguohan commented Mar 14, 2022

why is this?
"Scheduler group objects were removed as they were missing in temp view during SWSS restart."

kcudnik commented Mar 14, 2022

why is this? "Scheduler group objects were removed as they were missing in temp view during SWSS restart."

As we discussed in the sync-up, this is related to a second restart of OA (orchagent) while syncd is still running; it is more of a syncd/comparison-logic design issue.

gord1306 commented

This issue seems to be the same case as sonic-net/sonic-sairedis#994

kcudnik commented Mar 16, 2022

This issue seems to be the same case as Azure/sonic-sairedis#994

yea, seems like this will also solve the problem

kcudnik commented Apr 5, 2022

any progress here? could this be moved out of draft and merged?

@arvbb arvbb marked this pull request as ready for review April 5, 2022 22:55
@arvbb arvbb requested a review from prsunny as a code owner April 5, 2022 22:55
sunesh commented Apr 13, 2022

@lguohan, can this PR be merged? Are there any additional items that @arvbb needs to take care of?

@stephenxs stephenxs left a comment

As comment

stephenxs previously approved these changes Apr 19, 2022
@stephenxs stephenxs left a comment

As discussed offline, the comment will be addressed in another PR

arvbb commented Apr 21, 2022

Hi @neethajohn, can this PR be merged? Thanks

prsunny commented Aug 15, 2022

@neethajohn, could you please review and sign off?

@arvbb arvbb dismissed stale reviews from stephenxs and kcudnik via 94b7d0f November 2, 2022 19:48
8 participants