Fixes problems with database - swss - syncd synchronization. #110
…race condition with database flush/set
@@ -5,6 +5,8 @@ After=database.service

[Service]
User={{ sonicadmin_user }}
# Wait for redis server start before database clean by checking the server listening port 6379
ExecStartPre=/bin/bash -c "while true; do if [ -n \"$(netstat -l | grep 6379)\" ]; then break; fi; sleep 1; done"
How about using "nc -z -w 5 127.0.0.1 6379" to check whether the port is open? There could be a port 36379 that matches your criteria.
@lguohan netstat output looks as follows on the box:
admin@switch2:~$ netstat -l
Active Internet connections (only servers)
Proto Recv-Q Send-Q Local Address Foreign Address State
tcp 0 0 localhost:6379 : LISTEN
So I guess it's better to check for ":6379". Then it will cover all the cases.
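A stricter match than a plain `grep` can be sketched as below. This is only an illustration, not part of the PR: the helper name `port_is_listening` is hypothetical, and it assumes `netstat -l`-style lines where the fourth column is the local address and the last column is the state. It compares the text after the last colon with the requested port, so neither 36379 nor 63790 is mistaken for 6379.

```shell
# port_is_listening PORT   (hypothetical helper, not from the PR)
# Reads netstat -l style text on stdin. Succeeds only if a TCP socket in
# LISTEN state has exactly PORT after the last ":" of its local address.
port_is_listening() {
    awk -v port="$1" '
        $1 ~ /^tcp/ && $NF == "LISTEN" {
            n = split($4, parts, ":")      # works for host:port and :::port
            if (parts[n] == port) found = 1
        }
        END { exit(found ? 0 : 1) }
    '
}
```

It could be used as `netstat -ln | port_is_listening 6379`, where `-n` keeps the port numeric instead of resolving it to a service name.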
Redis CLI has a ping command.
okay, let's check with PING
On my box there are a lot more ports:
acsadmin@CCPSCH01030BBLF:~$ sudo netstat -l
On mine too. I've just posted a part of the log.
Requires=database.service
After=database.service
Requires=database.service swss.service
After=database.service swss.service
syncd does not depend on swss,
and syncd should start before swss.
@vitaliy-senchyshyn @lguohan
swss depends on syncd and swss starts after syncd
swss.service is clearing the database
OK, I will test this.
picked this change from sonic-mgmt repo. sonic-net/sonic-mgmt#110
This PR fixes problems with database - swss - syncd synchronization.
There are two problems:
Feb 2 12:56:23 switch2 INFO docker[798]: Could not connect to Redis at 127.0.0.1:6379: Connection refused
Feb 2 12:56:23 switch2 INFO docker[798]: Could not connect to Redis at 127.0.0.1:6379: Connection refused
Feb 2 12:56:23 switch2 NOTICE systemd[1]: swss.service: control process exited, code=exited status=1
Feb 2 12:56:23 switch2 ERR systemd[1]: Failed to start switch state service container.
Feb 2 12:56:23 switch2 NOTICE systemd[1]: Unit swss.service entered failed state.
To solve this, a bash loop is added to swss.service that checks whether the redis server is up using the redis-cli ping command. If it is not, the loop sleeps for a second before the next try.
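The wait loop described above can be sketched as follows. This is a bounded variant for illustration, not the exact line from the PR: the helper names `redis_probe` and `wait_for_redis`, the retry limit, and the delay argument are assumptions. Only the `redis-cli ping` check and the one-second sleep come from the description.

```shell
# redis_probe: prints PONG when the redis server answers (hypothetical wrapper).
redis_probe() {
    redis-cli ping 2>/dev/null
}

# wait_for_redis [MAX_TRIES] [DELAY_SECONDS]   (hypothetical helper)
# Retries the probe until it prints PONG; fails after MAX_TRIES attempts.
wait_for_redis() {
    wfr_max="${1:-30}"
    wfr_delay="${2:-1}"
    wfr_try=1
    while [ "$wfr_try" -le "$wfr_max" ]; do
        if [ "$(redis_probe)" = "PONG" ]; then
            return 0
        fi
        sleep "$wfr_delay"
        wfr_try=$((wfr_try + 1))
    done
    return 1
}
```

Unlike an endless `while true` loop, a retry limit lets the unit fail visibly if the database never comes up. Whether that is preferable here is a design choice the PR does not discuss.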
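The second problem is a race condition with database flush/set: swss.service clears the database on start. As a solution, syncd.service is made dependent on swss.service and is started only after swss.service has started.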