Some of Netdevice created by teamd are not cleaned up when teamd service is disabled. #1450
not cleaned up. The issue was seen on multi-asic platforms and seems to be a timing issue where the SIGTERM sent by teammgrd to teamd via the kill() system call was not cleaning up all teamd processes. So the fix is that, instead of sending an explicit SIGTERM to teamd, we call teamd -k. With this, teamd itself generates the SIGTERM and handles the processing. Signed-off-by: Abhishek Dosi <abdosi@microsoft.com>
LGTM
LGTM.
@judyjoseph Running a test to make sure all BGP sessions are up after config reload on multi-asic platforms. This verifies the change end to end: the port channels get cleaned up, the IP interfaces are recreated correctly on those port channels, and BGP comes up. On multi-asic platforms we have seen that once netdev creation fails, assigning IP addresses to those interfaces also fails, and finally BGP does not come up.
@judyjoseph Updated the config reload result with the BGP summary, and it looks good.
… teamd were (sonic-net#1450)" This reverts commit 4e10374.
…1450) Migrate from using the `imp` module to using the `importlib` module. As of Python 3, the `imp` module has been deprecated in favor of the `importlib` module. Place logic in a new function, `load_module_from_source()` in a new file, `utilities_common/general.py` Also fix some formatting
Why I did:
Netdevices created by the teamd process are not cleaned up when the teamd feature/docker/service is disabled or stopped.
This issue, as observed and addressed by two earlier PRs -
#1159
#1407
is still being observed consistently on multi-asic platforms, and also on the KVM VS testbed, where it is not consistent but happens every few runs (sonic-net/sonic-buildimage#5432).
This seems to be a timing issue where the SIGTERM (generated by docker stop) that teammgrd sends to teamd via kill() was not cleaning up all of the netdevice resources of the teamd processes.
How I did:
The fix is that, instead of sending an explicit SIGTERM to teamd via the kill() system call, we call
teamd -k. With this, teamd itself generates the SIGTERM (https://github.com/jpirko/libteam/blob/master/teamd/teamd.c#L1861) and handles the processing.
This change also reverts the LAG alias <-> PID mapping added in PR #1159, which is no longer needed because we no longer use the kill() call.
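As a rough illustration of the approach described above (not the actual teammgrd code, which is part of the C++ swss sources), the sketch below builds a `teamd -k` invocation for a given LAG instead of signalling a tracked PID. The function names and the injectable `run` parameter are hypothetical; the use of `-t` to select the team device by name is an assumption about how the instance is identified.

```python
import subprocess
from typing import Callable, List


def build_teamd_kill_cmd(lag_alias: str) -> List[str]:
    """Build the command that asks teamd to stop its own running instance.

    Assumption: the instance is selected with -t <team device name>, so no
    alias-to-PID mapping has to be maintained by the caller.
    """
    return ["teamd", "-k", "-t", lag_alias]


def stop_teamd(lag_alias: str, run: Callable = subprocess.run) -> bool:
    """Stop the teamd instance for lag_alias; return True on exit status 0.

    `run` is injectable so the logic can be exercised without teamd installed.
    """
    result = run(build_teamd_kill_cmd(lag_alias), check=False)
    return result.returncode == 0
```

Compared with calling kill() on a stored PID, this leaves signal delivery to teamd itself, which is the behaviour the fix relies on.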
How I verified:
a) Executed the below script 50 times on multi-asic platforms and things were fine.
b) On KVM VS, before the fix:
admin@vlab-01:~$ bash teamd_cleanup.sh
iteration 0
iteration 1
iteration 2
iteration 3
Failed iteration 3 for portchannel cleanup
Completed 3 iteration sucessfully
After the fix, it looks fine.
c) Ran a script to make sure BGP is fine after config reload on multi-asic platforms.
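
The teamd_cleanup.sh script itself is not included here. As a hypothetical sketch of the kind of check such a script could perform after the teamd service is stopped, the helper below scans `ip -o link show` style output for leftover port-channel netdevices. The function name, the `PortChannel` name prefix default, and the sample input are assumptions for illustration only.

```python
from typing import List


def leftover_portchannels(ip_link_output: str, prefix: str = "PortChannel") -> List[str]:
    """Return names of netdevices starting with `prefix`.

    Expects one interface per line, as printed by `ip -o link show`,
    e.g. "5: PortChannel0001: <BROADCAST,MULTICAST> mtu 9100 ...".
    """
    leftovers = []
    for line in ip_link_output.splitlines():
        parts = line.split(": ", 2)
        if len(parts) < 2:
            continue  # not an interface line
        name = parts[1].split("@", 1)[0]  # drop any "@parent" suffix
        if name.startswith(prefix):
            leftovers.append(name)
    return leftovers


# Hypothetical sample output captured after stopping teamd.
sample = (
    "1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN\n"
    "5: PortChannel0001: <BROADCAST,MULTICAST> mtu 9100 qdisc noqueue state DOWN\n"
    "7: Ethernet0: <BROADCAST,MULTICAST> mtu 9100 qdisc mq state DOWN\n"
)
print(leftover_portchannels(sample))
```

An iteration would count as failed whenever the returned list is non-empty after the service has been stopped.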