[VOQ][saidump] Modify generate_dump: replace save_saidump with save_saidump_by_route_size #2972

Merged
merged 12 commits into from
Nov 15, 2023

@JunhongMao (Contributor) commented Sep 6, 2023

Why did I do it?

This fixes the issue sonic-net/sonic-buildimage#13561.
The existing saidump uses the https://github.com/sonic-net/sonic-swss-common/blob/master/common/table_dump.lua script, which loops over ASIC_DB for more than 5 seconds and blocks other processes from accessing it.

This solution uses the Redis SAVE command to save a snapshot of the DB and read it back later, instead of looping through each entry in the table.

Related PRs:
#2972
sonic-net/sonic-buildimage#16466
sonic-net/sonic-sairedis#1288
sonic-net/sonic-sairedis#1298

Work item tracking

Microsoft ADO (25892277):

How did I do it?

Use the Redis SAVE command to save a snapshot of the DB and read it back later, instead of looping through each entry in the table and saving it.

1. Updated dockers/docker-base-bullseye/Dockerfile.j2 to install the Python library rdbtools into all the docker-base-bullseye containers.

2. Updated sonic-buildimage/src/sonic-sairedis/saidump/saidump.cpp to add a new option -r, which converts the JSON files produced by rdbtools into the saidump output format.

3. Added a new script file, syncd/scripts/saidump.sh, to the sairedis repo. This shell script performs the following steps:

  For each ASIC, such as ASIC0,

  3.1. Configure the directory in which Redis saves its dump file.
  redis-cli -h $hostname -p $port CONFIG SET dir $redis_dir > /dev/null

  3.2. Save the Redis data.
  redis-cli -h $hostname -p $port SAVE > /dev/null

  3.3. Run the rdb command to convert the dump file into a JSON file.
  rdb --command json $redis_dir/dump.rdb | tee $redis_dir/dump.json > /dev/null

  3.4. Run saidump -r to convert the JSON file into the same format that saidump produced before. The saidump result is then written to standard output.
  saidump -r $redis_dir/dump.json -m 100

  3.5. Clear the temporary files.
  rm -f $redis_dir/dump.rdb
  rm -f $redis_dir/dump.json

4. Updated sonic-buildimage/src/sonic-utilities/scripts/generate_dump to check the ASIC DB size: if it is larger than ROUTE_TAB_LIMIT_DIRECT_ITERATION entries (default value 24000), the dump is taken with Redis SAVE; otherwise the old method is used, looping through each entry of the Redis DB.
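
The per-ASIC steps 3.1 to 3.5 can be sketched as one shell function. This is a minimal sketch, not the saidump.sh that was merged: the function name `dump_asic_via_rdb` and its three positional arguments are assumptions made for illustration, and the JSON file is written with a plain redirect instead of `tee`. Only the `redis-cli`, `rdb`, and `saidump -r ... -m 100` invocations are taken from the description above.

```shell
#!/bin/sh
# Sketch of steps 3.1-3.5 for one ASIC. The function name and its
# arguments are assumptions for illustration, not the merged script.
dump_asic_via_rdb() {
    hostname="$1"; port="$2"; redis_dir="$3"

    # 3.1 Tell Redis which directory to write dump.rdb into.
    redis-cli -h "$hostname" -p "$port" CONFIG SET dir "$redis_dir" > /dev/null || return 1

    # 3.2 Write a snapshot of the database to $redis_dir/dump.rdb.
    redis-cli -h "$hostname" -p "$port" SAVE > /dev/null || return 1

    # 3.3 Convert the snapshot to JSON with rdbtools.
    rdb --command json "$redis_dir/dump.rdb" > "$redis_dir/dump.json" || return 1

    # 3.4 Let saidump reformat the JSON and print it on standard output.
    saidump -r "$redis_dir/dump.json" -m 100
    status=$?

    # 3.5 Remove the temporary files, whether or not saidump succeeded.
    rm -f "$redis_dir/dump.rdb" "$redis_dir/dump.json"
    return "$status"
}
```

Inside a syncd container the function would be called with the Redis host, port, and directory of that ASIC, for example `dump_asic_via_rdb 240.127.1.2 6379 /var/run/redis1`. Note that this sketch returns early if one of the first three steps fails, so in that case the temporary files are left in place for inspection.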

How to verify it

  1. On a T2 setup with more than 96K routes, execute the CLI command generate_dump.
  2. No error should be shown.
  3. Download the generate_dump result and verify the saidump file after unpacking it.
  4. Test results:
    Execute the show techsupport command with more than 24k routes; the system CPU does not go higher than 3.8%. The results follow:
Oct 25 20:14:01.188269 ixre-egl-board5 NOTICE syncd0#root: saidump.sh: [4] Run saidump -r to update the JSON files' format as same as the saidump before. Then we can get the saidump's result in standard output.
Oct 25 20:14:02.686092 ixre-egl-board5 NOTICE syncd0#root: saidump.sh: [5] Clear the temporary files.
Oct 25 20:14:03.278677 ixre-egl-board5 NOTICE syncd1#root: saidump.sh: hostname:240.127.1.2, port:6379, redis_dir:/var/run/redis1
Oct 25 20:14:03.279898 ixre-egl-board5 NOTICE syncd1#root: saidump.sh: [1] Config Redis consistency directory.
Oct 25 20:14:03.286410 ixre-egl-board5 NOTICE syncd1#root: saidump.sh: [2] SAVE.
Oct 25 20:14:03.459810 ixre-egl-board5 NOTICE syncd1#root: saidump.sh: [3] Run rdb command to convert the dump files into JSON files.
Oct 25 20:14:07.546016 ixre-egl-board5 NOTICE syncd1#root: saidump.sh: [4] Run saidump -r to update the JSON files' format as same as the saidump before. Then we can get the saidump's result in standard output.
Oct 25 20:14:09.028892 ixre-egl-board5 NOTICE syncd1#root: saidump.sh: [5] Clear the temporary files.


==================================================================
top - 20:14:01 up  3:05,  3 users,  load average: 2.79, 2.35, 2.13
Tasks: 488 total,   3 running, 481 sleeping,   0 stopped,   4 zombie
%Cpu(s): 18.8 us,  1.8 sy,  0.0 ni, 79.4 id,  0.0 wa,  0.0 hi,  0.0 si,  0.0 st
MiB Mem :  31978.1 total,  21475.0 free,   7383.4 used,   3119.7 buff/cache
MiB Swap:      0.0 total,      0.0 free,      0.0 used.  24088.1 avail Mem 

--
 536840 root      20   0    2480    576    508 S   0.0   0.0   0:00.00 sh
 536842 root      20   0    2480    568    504 S   0.0   0.0   0:00.00 sh
 536843 root      20   0   11144   5444   4880 S   0.0   0.0   0:00.00 bcmcmd
Wed 25 Oct 2023 08:14:01 PM UTC
==================================================================
top - 20:14:04 up  3:06,  3 users,  load average: 2.80, 2.36, 2.14
Tasks: 484 total,   2 running, 478 sleeping,   0 stopped,   4 zombie
%Cpu(s): 12.4 us,  0.4 sy,  0.0 ni, 87.2 id,  0.0 wa,  0.0 hi,  0.0 si,  0.0 st
MiB Mem :  31978.1 total,  21549.6 free,   7319.1 used,   3109.4 buff/cache
MiB Swap:      0.0 total,      0.0 free,      0.0 used.  24169.8 avail Mem 

--
 537057 root      20   0   62328  43688  14900 S   0.0   0.1   0:00.32 show
 537110 root      20   0    2480    512    444 S   0.0   0.0   0:00.00 sh
 537111 root      20   0   11048   4076   3128 R   0.0   0.0   0:00.01 top
Wed 25 Oct 2023 08:14:04 PM UTC
==================================================================
top - 20:14:06 up  3:06,  3 users,  load average: 2.80, 2.36, 2.14
Tasks: 484 total,   2 running, 478 sleeping,   0 stopped,   4 zombie
%Cpu(s):  8.1 us,  0.7 sy,  0.0 ni, 91.2 id,  0.0 wa,  0.0 hi,  0.0 si,  0.0 st
MiB Mem :  31978.1 total,  21535.2 free,   7318.7 used,   3124.2 buff/cache
MiB Swap:      0.0 total,      0.0 free,      0.0 used.  24155.4 avail Mem 

--
 537033 root      20   0    3896   2992   2684 S   0.0   0.0   0:00.00 saidump.sh
 537130 root      20   0   62332  43828  15040 S   0.0   0.1   0:00.30 show
 537178 root      20   0    2480    572    504 S   0.0   0.0   0:00.00 sh
Wed 25 Oct 2023 08:14:06 PM UTC
==================================================================
top - 20:14:09 up  3:06,  3 users,  load average: 2.74, 2.35, 2.14
Tasks: 479 total,   2 running, 473 sleeping,   0 stopped,   4 zombie
%Cpu(s):  3.4 us,  3.0 sy,  3.0 ni, 90.6 id,  0.0 wa,  0.0 hi,  0.0 si,  0.0 st
MiB Mem :  31978.1 total,  21555.6 free,   7323.9 used,   3098.7 buff/cache
MiB Swap:      0.0 total,      0.0 free,      0.0 used.  24182.9 avail Mem 

--
 537211 root      25   5   12652   3756   1596 S   0.0   0.0   0:00.00 generate_dump
 537213 root      25   5   16740   9284   5392 S   0.0   0.0   0:00.03 python
 537260 root      20   0    2480    568    504 S   0.0   0.0   0:00.00 sh
Wed 25 Oct 2023 08:14:09 PM UTC
==================================================================
top - 20:14:12 up  3:06,  3 users,  load average: 2.74, 2.35, 2.14
Tasks: 482 total,   3 running, 475 sleeping,   0 stopped,   4 zombie
%Cpu(s):  8.2 us,  3.4 sy,  4.1 ni, 84.3 id,  0.0 wa,  0.0 hi,  0.0 si,  0.0 st
MiB Mem :  31978.1 total,  21569.8 free,   7309.5 used,   3098.9 buff/cache
MiB Swap:      0.0 total,      0.0 free,      0.0 used.  24197.5 avail Mem 

--
 537776 root      25   5   11144   5432   4868 S   0.0   0.0   0:00.00 bcmcmd
 537778 root      20   0    2480    568    504 S   0.0   0.0   0:00.00 sh
 537779 root      20   0   11144   5408   4844 S   0.0   0.0   0:00.00 bcmcmd
Wed 25 Oct 2023 08:14:12 PM UTC
==================================================================


admin@ixre-egl-board5:~$ show ip route summ
asic0:
Route Source         Routes               FIB  (vrf default)
kernel               42                   42                   
connected            17                   17                   
ebgp                 109                  109                  
ibgp                 22785                22785                
------
Totals               22953                22953                


asic1:
Route Source         Routes               FIB  (vrf default)
kernel               42                   42                   
connected            17                   17                   
ebgp                 10075                10075                
ibgp                 12819                12819                
------
Totals               22953                22953                


admin@ixre-egl-board5:~$ show ipv6 route summ
asic0:
Route Source         Routes               FIB  (vrf default)
kernel               43                   43                   
connected            46                   46                   
static               1                    1                    
ebgp                 109                  109                  
ibgp                 14786                14786                
------
Totals               14985                14985                


asic1:
Route Source         Routes               FIB  (vrf default)
kernel               43                   43                   
connected            46                   46                   
static               1                    1                    
ebgp                 2075                 2075                 
ibgp                 12820                12820                
------
Totals               14985                14985                
  5. In the syncd crash scenario we also do not see any swss timeouts or high CPU, and the crashed syncd is able to recover without affecting the other syncd.

Previous command output (if the output of a command-line utility has changed)

New command output (if the output of a command-line utility has changed)

The output data of the new saidump is identical to the previous output.

• Saidump for DNX-SAI sonic-net/sonic-buildimage#13561

Solution and modification:
Use the Redis SAVE command to save a snapshot of the DB and read it back later, instead of looping through each entry in the table and saving it.

(1) Updated sonic-buildimage/build_debian.sh to install the Python library rdbtools on the host.
(2) Updated sonic-buildimage/src/sonic-sairedis/saidump/saidump.cpp to add a new option -r, which converts the JSON files produced by rdbtools into the saidump output format.
(3) Added a new script file, files/scripts/saidump.sh, which performs the steps below.
  For each ASIC, such as ASIC0,

  1. Save the Redis data.
  sudo sonic-db-cli -n asic$1 SAVE > /dev/null

  2. Move the dump file to /var/run/redisX/
  docker exec database$1 sh -c "mv /var/lib/redis/dump.rdb /var/run/redis$1/"

  3. Run the rdb command to convert the dump file into a JSON file.
  sudo python /usr/local/bin/rdb --command json  /var/run/redis$1/dump.rdb | sudo tee /var/run/redis$1/dump.json > /dev/null

  4. Run saidump -r to convert the JSON file into the same format that saidump produced before. The saidump result is then written to standard output.
  docker exec syncd$1 sh -c "saidump -r /var/run/redis$1/dump.json"

  5. Clear the temporary files.
  sudo rm -f /var/run/redis$1/dump.rdb
  sudo rm -f /var/run/redis$1/dump.json

(4) Updated sonic-buildimage/src/sonic-utilities/scripts/generate_dump to replace saidump with saidump.sh.
linux-foundation-easycla bot commented Sep 6, 2023

CLA Signed

The committers listed above are authorized under a signed CLA.

@@ -1791,6 +1812,8 @@ main() {

    if [[ "$device_type" != "SpineRouter" ]]; then
        save_saidump
    else
        save_saidump_by_redis_save_cmd
Contributor

I do not feel we should check device_type and do a different saidump based on the device type. Can you check the ASIC DB size and, if it is larger than xxx entries, do it the new way?

Contributor Author

@JunhongMao JunhongMao Sep 19, 2023

@lguohan, please help to review my updates. The size threshold is the variable ROUTE_TAB_LIMIT_DIRECT_ITERATION, with a default value of 24000.
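
The size-based selection discussed in this thread can be sketched as follows. This is a minimal sketch under stated assumptions, not the code in generate_dump: only the variable name ROUTE_TAB_LIMIT_DIRECT_ITERATION and its default of 24000 come from this PR. The helper name `choose_saidump_method` and the two strings it prints are hypothetical, and how the route count is obtained is left out.

```shell
#!/bin/sh
# Threshold named in this PR; the default can be overridden from the environment.
ROUTE_TAB_LIMIT_DIRECT_ITERATION="${ROUTE_TAB_LIMIT_DIRECT_ITERATION:-24000}"

# Hypothetical helper: print which dump method to use for a given route count.
choose_saidump_method() {
    route_size="$1"
    if [ "$route_size" -gt "$ROUTE_TAB_LIMIT_DIRECT_ITERATION" ]; then
        echo "redis_save"        # large table: snapshot with Redis SAVE
    else
        echo "direct_iteration"  # small table: old method, loop over each entry
    fi
}

# Example calls with made-up route counts below and above the threshold.
choose_saidump_method 20000
choose_saidump_method 96000
```

Keying the decision on the table size rather than on device_type means any platform with a large ASIC DB takes the snapshot path, while small tables keep the existing behaviour.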

JunhongMao added a commit to JunhongMao/sonic-utilities that referenced this pull request Sep 19, 2023
sonic-net#2972
SAI DUMP based on the route table size

* [saidump]
• Saidump for DNX-SAI sonic-net/sonic-buildimage#13561

Solution and modification:
Use the Redis SAVE command to save a snapshot of the DB and read it back later, instead of looping through each entry in the table and saving it.

(1) Updated platform/broadcom/docker-syncd-brcm-dnx/Dockerfile.j2 to install the Python library rdbtools into the syncd container.
(2) Updated sonic-buildimage/src/sonic-sairedis/saidump/saidump.cpp to add a new option -r, which converts the JSON files produced by rdbtools into the saidump output format.
(3) Updated sonic-buildimage/build_debian.sh to add a new script file, files/scripts/saidump.sh, to the host. This shell script performs the steps below:
  For each ASIC, such as ASIC0,

  1. Save the Redis data.
  sudo sonic-db-cli -n asic$1 SAVE > /dev/null

  2. Move the dump file to /var/run/redisX/
  docker exec database$1 sh -c "mv /var/lib/redis/dump.rdb /var/run/redis$1/"

  3. Run the rdb command to convert the dump file into a JSON file.
  docker exec syncd$1 sh -c "rdb --command json /var/run/redis$1/dump.rdb | tee /var/run/redis$1/dump.json > /dev/null"

  4. Run saidump -r to convert the JSON file into the same format that saidump produced before. The saidump result is then written to standard output.
  docker exec syncd$1 sh -c "saidump -r /var/run/redis$1/dump.json -m 100"

  5. Clear the temporary files.
  sudo rm -f /var/run/redis$1/dump.rdb
  sudo rm -f /var/run/redis$1/dump.json

(4) Updated sonic-buildimage/src/sonic-utilities/scripts/generate_dump to check the ASIC DB size: if it is larger than xxx entries, the dump is taken with Redis SAVE; otherwise the old method is used, looping through each entry of the Redis DB.
kcudnik pushed a commit to sonic-net/sonic-sairedis that referenced this pull request Sep 25, 2023
…file and displays/format the right output (#1288)

Why I did it
Fix issue: sonic-net/sonic-buildimage#13561
The existing saidump uses the https://github.com/sonic-net/sonic-swss-common/blob/master/common/table_dump.lua script, which loops over ASIC_DB for more than 5 seconds and blocks other processes from accessing it.

This solution uses the Redis SAVE command to save a snapshot of the DB and read it back later, instead of looping through each entry in the table.
Related PRs:
sonic-net/sonic-utilities#2972
sonic-net/sonic-buildimage#16466
@JunhongMao JunhongMao changed the title [VOQ][saidump] Modify generate_dump to save_saidump_by_redis_save_cmd [VOQ][saidump] Modify generate_dump: replace save_saidump with save_saidump_by_route_size Oct 3, 2023
To add the rdbtools into base docker.
Move saidump.sh from host to syncd docker container.
@JunhongMao (Contributor Author)

> Lgtm. Can you add some test results to the PR summary showing that with routes (e.g. 23k per ASIC on a multi-ASIC platform) we don't see high CPU or swss timeouts in the following scenarios:
>
>   1. show tech support
>   2. one syncd crashed.

We have tested these cases and provided the results above. Please see the "How to verify it" section of this PR:
#2972 (comment)

lguohan pushed a commit to sonic-net/sonic-buildimage that referenced this pull request Nov 8, 2023
…rs. (#16466)

Fix #13561

@JunhongMao (Contributor Author)

@lguohan, please help to review this PR. Thanks.

@judyjoseph (Contributor)

@JunhongMao please add a test case covering the case where "$route_size > $ROUTE_TAB_LIMIT_DIRECT_ITERATION" and the new saidump flow is taken. You could add it as part of this PR or in a separate one.

@JunhongMao (Contributor Author)

> @JunhongMao please add a testcase for covering this case where "$route_size > $ROUTE_TAB_LIMIT_DIRECT_ITERATION" and the new saidump flow will be taken. You could add as part of this PR .. or a separate one

@judyjoseph I would suggest merging this PR first, since it has been open for a long time. I will create a separate test PR for that.

@judyjoseph judyjoseph merged commit cd85569 into sonic-net:master Nov 15, 2023
5 checks passed
@mlok-nokia (Contributor)

@judyjoseph Please merge this to the 202205 branch also.

mssonicbld pushed a commit to mssonicbld/sonic-buildimage that referenced this pull request Nov 19, 2023
…rs. (sonic-net#16466)

Fix sonic-net#13561

StormLiangMS pushed a commit that referenced this pull request Nov 19, 2023
…aidump_by_route_size (#2972)

StormLiangMS pushed a commit to sonic-net/sonic-sairedis that referenced this pull request Nov 19, 2023
…file and displays/format the right output (#1288)

mssonicbld pushed a commit to mssonicbld/sonic-buildimage that referenced this pull request Nov 21, 2023
…rs. (sonic-net#16466)

Fix sonic-net#13561

mssonicbld pushed a commit to mssonicbld/sonic-buildimage that referenced this pull request Nov 21, 2023
…rs. (sonic-net#16466)

Fix sonic-net#13561

The existing saidump uses the https://github.com/sonic-net/sonic-swss-common/blob/master/common/table_dump.lua script, which loops over the ASIC_DB for more than 5 seconds and blocks access by other processes.

This solution uses the Redis SAVE command to take a snapshot of the DB and restore it later, instead of looping through each entry in the table.

Related PRs:
sonic-net/sonic-utilities#2972
sonic-net/sonic-sairedis#1288
sonic-net/sonic-sairedis#1298

mssonicbld pushed a commit to sonic-net/sonic-buildimage that referenced this pull request Nov 21, 2023
…rs. (#16466)

mssonicbld pushed a commit to sonic-net/sonic-buildimage that referenced this pull request Nov 21, 2023
…rs. (#16466)

JunhongMao added a commit to JunhongMao/sonic-utilities that referenced this pull request Dec 13, 2023
sonic-net#2972 added the two functions below to scripts/generate_dump:
get_route_table_size_by_asic_id_and_ipver
save_saidump_by_route_size
Unit test scripts need to be added for them.

Related PRs:
sonic-net#2972
sonic-net/sonic-buildimage#16466
sonic-net/sonic-sairedis#1288
sonic-net/sonic-sairedis#1298

Microsoft ADO (25892277):

Add two scripts:
tests/saidump_test.py
tests/saidump_test.sh

The test cases below verify that get_route_table_size_by_asic_id_and_ipver and save_saidump_by_route_size behave correctly.

```
# saidump test list format: [ASIC number, ipv4 and ipv6 route table size, expected function save_cmd arguments]
saidump_test_list = [
    [1, 10000, "docker exec syncd saidump saidump"],
    [1, 12000, "docker exec syncd saidump saidump"],
    [1, 12001, "docker exec syncd saidump.sh saidump"],
    [1, 20000, "docker exec syncd saidump.sh saidump"],
    [2, 10000, "docker exec syncd0 saidump saidump0\ndocker exec syncd1 saidump saidump1"],
    [2, 12000, "docker exec syncd0 saidump saidump0\ndocker exec syncd1 saidump saidump1"],
    [2, 12001, "docker exec syncd0 saidump.sh saidump0\ndocker exec syncd1 saidump.sh saidump1"],
    [2, 20000, "docker exec syncd0 saidump.sh saidump0\ndocker exec syncd1 saidump.sh saidump1"]
]
```
During the build stage, run the command below and check that the test is reported as PASSED.
jumao@1b1ffba5949a:/sonic/src/sonic-utilities$ time python3 setup.py test
tests/saidump_test.py::test_saidump PASSED
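
The expected strings in the test list above appear to follow a simple pattern, which can be sketched as below. This is not the code from tests/saidump_test.py. The function name expected_save_cmd is hypothetical, and the sketch assumes that the listed route table size applies to both the IPv4 and the IPv6 table, so that the sum is compared against the default limit of 24000.

```python
# Default value of ROUTE_TAB_LIMIT_DIRECT_ITERATION quoted in sonic-net#2972.
ROUTE_LIMIT = 24000


def expected_save_cmd(asic_count, route_size, limit=ROUTE_LIMIT):
    """Build the expected save_cmd argument string for one test case."""
    # Assumption: route_size is used for both the IPv4 and the IPv6 table.
    total = 2 * route_size
    tool = "saidump.sh" if total > limit else "saidump"
    if asic_count == 1:
        # Single ASIC: container and output names carry no index.
        return f"docker exec syncd {tool} saidump"
    # Multi ASIC: one line per ASIC, with the ASIC index as suffix.
    return "\n".join(
        f"docker exec syncd{i} {tool} saidump{i}" for i in range(asic_count)
    )


if __name__ == "__main__":
    print(expected_save_cmd(1, 12000))
    print(expected_save_cmd(2, 12001))
```

Under that assumption, 12000 routes per IP version stays on the direct iteration path (24000 is not larger than the limit), while 12001 switches to saidump.sh, which matches the boundary between the listed cases.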
JunhongMao added a commit to JunhongMao/sonic-utilities that referenced this pull request Dec 13, 2023