sonic-cfggen isn't able to render template sporadically when the image is installed from ONIE for the first time #13791
Comments
Do you have direct proof that "sonic-cfggen isn't able to render template"?
No, I don't.
After a minor change in sonic-cfggen, I rendered the template with empty data, and here is the result:
admin@vlab-01:~$ sonic-cfggen -d -t /home/admin/asic_table.j2
So, after checking the code, sonic-cfggen will never render an empty asic_table.json.
After checking the syslog and the creation time of asic_table.json in the show tech dump, I found that sonic-cfggen rendered asic_table.json correctly:
Feb 10 14:55:05.772684 r-panther-16 INFO swss.sh[5270]: Creating new swss container with HWSKU Mellanox-SN2700
According to the code, this log is output immediately before the JSON file is rendered.
Feb 10 14:55:26.943901 r-panther-16 NOTICE swss#buffermgrd: :- main: --- Starting buffermgrd ---
Feb 10 14:59:01.768622 r-panther-16 INFO swss#supervisord 2023-02-10 12:59:01,767 INFO waiting for buffermgrd to stop
Also, the system rebooted at 14:54:
Feb 10 14:54:35.114710 r-panther-16 NOTICE kernel: [ 0.000000] Linux version 5.10.0-18-2-amd64 (debian-kernel@lists.debian.org) (gcc-10 (Debian 10.2.1-6) 10.2.1 20210110, GNU ld (GNU Binutils for Debian) 2.35.2) #1 SMP Debian 5.10.140-1 (2022-09-02)
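For anyone repeating this check on a live system rather than from a dump, the size and modification time of the file can be read directly and compared with the syslog timestamps. A minimal sketch, assuming GNU coreutils stat is available; the temporary file is only a stand-in for /etc/sonic/asic_table.json:

```shell
#!/bin/bash
# Sketch: read the size and modification time of a rendered file so they can
# be compared with syslog timestamps. Assumes GNU coreutils `stat`.
# The temporary file is a stand-in for /etc/sonic/asic_table.json.

target=$(mktemp)
printf '{}\n' > "$target"

size=$(stat -c '%s' "$target")    # size in bytes; 0 would mean an empty file
mtime=$(stat -c '%y' "$target")   # last modification time, human readable

echo "size=${size} bytes, modified=${mtime}"
```

A size of 0 together with a modification time close to the "Creating new swss container" log line would point at the render step itself.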
@stephenxs, based on the analysis of the tech dump files, this does not seem to be a sonic-cfggen issue.
Hi @liuh-80
Hi @stephenxs
Here is an update. We had an offline discussion: when sonic-cfggen hits an exception, an empty JSON file is generated. I created the following script to catch and show the error:
admin@vlab-01:~$ cat test.sh
echo "start"
And here is the reproduction:
So Stephen will help reproduce and get the error message, then we can continue.
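The empty file is what plain shell redirection would produce: the shell creates or truncates the target before the command runs, so a renderer that fails without writing anything still leaves a zero-length file. A minimal sketch of that behaviour, with the error text kept for inspection. Here fake_render is a hypothetical stand-in for the sonic-cfggen call, not the script from this thread:

```shell
#!/bin/bash
# Sketch: a failing command combined with `>` leaves an empty file behind.
# `fake_render` is a hypothetical stand-in for `sonic-cfggen -d -t <template>`.

fake_render() {
    echo "simulated renderer failure" >&2
    return 1
}

workdir=$(mktemp -d)
out="$workdir/asic_table.json"
err="$workdir/render.err"

rc=0
# The shell opens (and truncates) "$out" before fake_render starts.
fake_render > "$out" 2> "$err" || rc=$?

if [ -s "$out" ]; then
    echo "rendered file has content"
else
    echo "rendered file is empty, exit status ${rc}, stderr: $(cat "$err")"
fi
```

Keeping stderr in a separate file is one way to get the error message even when the service start-up log does not show it.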
Thanks Hua.
…13888)
- Why I did it
We suspect the issue #13791 is caused by redis server being temporarily unavailable during system initialization so we do not use -d in sonic-cfggen, for now, to avoid accessing redis server
- How I did it
Provide a string containing required json data when calling sonic-cfggen
- How to verify it
Manually test it
Signed-off-by: Stephen Sun <stephens@nvidia.com>
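The commit message does not show the string itself. As a rough sketch of the idea, the only key the template needs, DEVICE_METADATA|localhost[platform], can be built in the shell and handed to sonic-cfggen on the command line. Two assumptions here: the platform value is purely illustrative, and the string is passed through sonic-cfggen's -a (additional data as a JSON string) option, which may differ from what the actual change does:

```shell
#!/bin/bash
# Sketch: build the minimal JSON the template needs, without reading redis.
# Assumption: the platform value below is illustrative only.

platform="x86_64-mlnx_msn2700-r0"

json=$(printf '{"DEVICE_METADATA":{"localhost":{"platform":"%s"}}}' "$platform")
echo "$json"

# Assumed usage (not executed here): pass the string with -a instead of -d.
#   sonic-cfggen -a "$json" -t /usr/share/sonic/templates/asic_table.j2 > /etc/sonic/asic_table.json
```

Because nothing is read from redis, this path does not depend on the database being up during early initialization.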
Description
Steps to reproduce the issue:
Describe the results you received:
The error message
buffermgrd: ERROR (spawn error)
was observed, which is caused by asic_table.json not being rendered from the template.
From the dump we see that asic_table.json is empty, which is not expected.
asic_table.json should be rendered from the template asic_table.j2 using sonic-cfggen when the swss docker is created.
The template asic_table.j2 is built into the image, which means it should be available if the image is good. The image starts successfully most of the time, which means the image should be good. So asic_table.j2 should be available.
The only possible cause is that sonic-cfggen wasn't able to render the template and generate asic_table.json.
The command to generate the JSON is
sonic-cfggen -d -t /usr/share/sonic/templates/asic_table.j2 > /etc/sonic/asic_table.json
Only DEVICE_METADATA|localhost[platform] is required in the template.
The table DEVICE_METADATA should be good because from the log we see that both platform and hwsku, which are both in DEVICE_METADATA, are available and correct.
In sonic-cfggen, only two if branches are executed. Branch 2 is simple and branch 1 is complicated. I suspect it somehow exited the process, so nothing was output and asic_table.json was empty.
Describe the results you expected:
asic_table.json should be available.
Output of show version:
The issue has been observed on master and 202211 since last December. It occurred 4 times in the past 2 months.
Output of show techsupport:
Additional information you deem important (e.g. issue happens only occasionally):
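One possible way to keep a failed render from leaving an empty asic_table.json in place is to render into a temporary file and install it only when the command succeeded and produced output. A minimal sketch; render_to is a hypothetical helper, not part of SONiC, and echo and false stand in for a working and a failing renderer:

```shell
#!/bin/bash
# Sketch: render into a temporary file, install it only on success.
# `render_to` is a hypothetical helper, not part of SONiC.

render_to() {
    # usage: render_to <target> <command> [args...]
    local target=$1
    shift
    local tmp
    tmp=$(mktemp "${target}.XXXXXX") || return 1
    if "$@" > "$tmp" && [ -s "$tmp" ]; then
        mv "$tmp" "$target"    # rename inside the same directory
    else
        echo "render failed: $*" >&2
        rm -f "$tmp"
        return 1
    fi
}

dir=$(mktemp -d)

ok_rc=0
render_to "$dir/ok.json" echo '{"a": 1}' || ok_rc=$?

bad_rc=0
render_to "$dir/bad.json" false || bad_rc=$?

echo "ok_rc=${ok_rc} bad_rc=${bad_rc}"
```

With the command from the description this would be called as render_to /etc/sonic/asic_table.json sonic-cfggen -d -t /usr/share/sonic/templates/asic_table.j2. Note that mktemp creates the file with restrictive permissions, so a real script may need a chmod before the mv.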