Commit

[docker_image_ctl] flush ASIC_DB on fast boot (sonic-net#19106)
- Why I did it

I observed an issue with fast-reboot: in rare circumstances, a queued FDB event might be written to ASIC_DB by a thread inside syncd after the call to FLUSHDB ASIC_DB was made.
That left ASIC_DB with only one record, the one for that FDB entry, and caused syncd to crash at start:

Mar 15 13:28:42.765108 sonic NOTICE syncd#SAI: :- Syncd: syncd started
Mar 15 13:28:42.765268 sonic NOTICE syncd#SAI: :- onSyncdStart: performing hard reinit since COLD start was performed
Mar 15 13:28:42.765451 sonic NOTICE syncd#SAI: :- readAsicState: loaded 1 switches
Mar 15 13:28:42.765465 sonic NOTICE syncd#SAI: :- readAsicState: switch VID: oid:0x21000000000000
Mar 15 13:28:42.765465 sonic NOTICE syncd#SAI: :- readAsicState: read asic state took 0.000205 sec
Mar 15 13:28:42.766364 sonic NOTICE syncd#SAI: :- onSyncdStart: on syncd start took 0.001097 sec
Mar 15 13:28:42.766376 sonic ERR syncd#SAI: :- run: Runtime error during syncd init: map::at
Mar 15 13:28:42.766376 sonic NOTICE syncd#SAI: :- sendShutdownRequest: sending switch_shutdown_request notification to OA for switch: oid:0x0
Mar 15 13:28:42.766518 sonic NOTICE syncd#SAI: :- sendShutdownRequestAfterException: notification send successfully
The fix is done in sonic-utilities, in the fast-reboot script. However, to allow upgrading from a version without the fix, flush ASIC_DB at boot during fast-reboot as well.

Related to sonic-net/sonic-utilities#3342

- How I did it
Flush ASIC_DB on fast boot.

- How to verify it
Run fast-reboot.
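
After the reboot, one way to confirm that the crash did not recur is to search the log for the error quoted above. The following is a minimal sketch, not part of this change: `syncd_init_failed` is a hypothetical helper name, and the log file to check is passed in by the caller because its location is an assumption.

```shell
#!/bin/bash
# Sketch only: syncd_init_failed is a hypothetical helper, not part of SONiC.
# Returns 0 when the given log file contains the syncd init error quoted in
# the commit message, and non-zero otherwise (including an unreadable file).
syncd_init_failed() {
    local log_file="$1"
    grep -q 'Runtime error during syncd init' "$log_file"
}
```

For example, assuming syslog is stored at `/var/log/syslog`, running `syncd_init_failed /var/log/syslog && echo "syncd crashed at init"` should print nothing after a successful fast-reboot.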

Signed-off-by: Stepan Blyschak <stepanb@nvidia.com>
stepanblyschak authored Jun 24, 2024
1 parent 63c78c2 commit 8c45508
Showing 1 changed file with 3 additions and 0 deletions.
3 changes: 3 additions & 0 deletions files/build_templates/docker_image_ctl.j2
@@ -243,6 +243,9 @@ function postStartAction()
 fi

 if [[ "$BOOT_TYPE" == "fast" ]]; then
+    # Flush ASIC DB. On fast-boot there should be nothing in there.
+    # In the older versions there has been an issue where a queued FDB event might get into ASIC_DB causing syncd crash at boot.
+    $SONIC_DB_CLI ASIC_DB FLUSHDB
     # this is the case when base OS version does not support fast-reboot with reconciliation logic (dump.rdb is absent)
     # In this case, we need to set the flag to indicate fast-reboot is in progress. Set the key to expire in 3 minutes
     $SONIC_DB_CLI STATE_DB SET "FAST_REBOOT|system" "1" "EX" "180"