-
Notifications
You must be signed in to change notification settings - Fork 661
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[db_migrator.py] Fix issue while upgrading from 202205 to 202211 via fast reboot #2948
[db_migrator.py] Fix issue while upgrading from 202205 to 202211 via fast reboot #2948
Conversation
scripts/db_migrator.py
Outdated
|
||
if self.stateDB.keys(self.stateDB.STATE_DB, "FAST_REBOOT|system"): | ||
enable_state = self.stateDB.get(self.stateDB.STATE_DB, 'FAST_RESTART_ENABLE_TABLE|system', 'enable') | ||
if self.stateDB.keys(self.stateDB.STATE_DB, "FAST_REBOOT|system") or enable_state == 'true': |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why is this change needed? The check here was added (#2839) for old systems (like 201911) where new flag was not present. The newer branches (where FAST_RESTART_ENABLE_TABLE
flag is present use below changes to delay flex counters:
sonic-net/sonic-swss#1877
#1768
@shlomibitton can you please review if the above analysis is correct?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hi @vaibhavhd , this issue is only reproduced from 202205 to 202211. In 202205, there is already a db migrator function version_3_0_5 which changes old flag to new flag. However, in 202211, the same logic is in version_4_0_0. While upgrading from 202205 to 202211, 202205 sets new flag and we have a dump.rdb; in 202211 db_migrator just sets flag to false.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hi @stepanblyschak and @arfeigin , could you please provide your comment here?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@Junchao-Mellanox correct, good observation.
The suggested change for version_4_0_0 is in place and hence also this change in version_4_0_2 is needed to tie it together so it will all work as expected.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@vaibhavhd As wrote since in version_4_0_0 in the db_migrator according to this PR we are adding new state-db entry for fast-reboot instead of the old version it will be more coherent to check for the newer version when migrating to even newer version (4_0_2) and ignore the former state-db entry.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I still don't understand why these two lines need to be changed. I understand the change in 4_0_0, but w/o the change in 4_0_2, what is the issue?
I am not saying that this is doing any damage, just that I don't understand why we are fixing this if this is not needed.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actually it might as well be doing some damage? For newer systems where we use FAST_RESTART_ENABLE_TABLE|system
flag we do not want to call migrate_config_db_flex_counter_delay_status
, or do we? I think we do not, as flex counters in these newer systems are handled by:
sonic-net/sonic-swss#1877
#1768
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hi @vaibhavhd, you are right, this change is not necessary. I have updated code, could you please review again?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM, good catch and solution.
@vaibhavhd can you please review feedback and if possible aprrove? |
ADO: 25048636 |
…fast reboot (#2948) - What I did Fix issue while upgrading from 202205 to 202211 via fast reboot. This issue is caused by a mismatch version handling for fast reboot in db_migrator. This mismatch causes an incorrect sequence like this: 1. User issue fast-reboot under 202205 2. fast-reboot script set fast reboot flag by command "sonic-db-cli STATE_DB HSET "FAST_RESTART_ENABLE_TABLE|system" "enable" "true" &>/dev/null" 3. system boot to 202211 4. db_migrator found the database version is 3_0_6, it will run 4_0_0, however, it found FAST_REBOOT|system does not exist, and it set FAST_RESTART_ENABLE_TABLE|system enable to false 5. system incorrectly performs cold reboot - How I did it in db migrator if we see there is FAST_RESTART_ENABLE_TABLE already, we should skip fast reboot flag migration - How to verify it unit test manual test
PR for 202211: #2962 |
…2211 via fast reboot (#2962) Backport #2948 to 202211 What I did Fix issue while upgrading from 202205 to 202211 via fast reboot. This issue is caused by a mismatch version handling for fast reboot in db_migrator. In 202205, the db migrator for fast reboot flag handling is 3_0_5: sonic-utilities/scripts/db_migrator.py Line 920 in 56a1ae2 def version_3_0_5(self): However, in master(202211), the db migrator for fast reboot flag handling is 4_0_0: sonic-utilities/scripts/db_migrator.py Line 993 in 7435b1c def version_4_0_0(self): This mismatch causes an incorrect sequence like this: 1. User issue fast-reboot under 202205 2. fast-reboot script set fast reboot flag by command "sonic-db-cli STATE_DB HSET "FAST_RESTART_ENABLE_TABLE|system" "enable" "true" &>/dev/null" 3. system boot to 202211 4. db_migrator found the database version is 3_0_6, it will run 4_0_0, however, it found FAST_REBOOT|system does not exist, and it set FAST_RESTART_ENABLE_TABLE|system enable to false 5. system incorrectly performs cold reboot How I did it in db migrator if we see there is FAST_RESTART_ENABLE_TABLE already, we should skip fast reboot flag migration How to verify it unit test manual test
What I did
Fix issue while upgrading from 202205 to 202211 via fast reboot. This issue is caused by a mismatch version handling for fast reboot in db_migrator.
In 202205, the db migrator for fast reboot flag handling is 3_0_5:
sonic-utilities/scripts/db_migrator.py
Line 920 in 56a1ae2
However, in master(202211), the db migrator for fast reboot flag handling is 4_0_0:
sonic-utilities/scripts/db_migrator.py
Line 993 in 7435b1c
This mismatch causes an incorrect sequence like this:
How I did it
in db migrator if we see there is FAST_RESTART_ENABLE_TABLE already, we should skip fast reboot flag migration
How to verify it
unit test
manual test
Previous command output (if the output of a command-line utility has changed)
New command output (if the output of a command-line utility has changed)