[code sync] Merge code from sonic-net/sonic-buildimage:202305 to 202305 #372

Merged 2 commits on May 25, 2024

mssonicbld
Collaborator

* 561bb5420 - (head/202305) [eventd]: Close rsyslog plugin when rsyslog SIGTERM and EOF is sent to stream (#18835) (#19035) (2024-05-24) [Zain Budhwani]

zbud-msft and others added 2 commits May 24, 2024 08:01
…o stream (#18835) (#19035)

Fix #18771

Microsoft ADO (number only): 27882794

How I did it

Add signalOnClose to the omprog configuration, and close the rsyslog plugin when it receives an EOF.
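
The EOF half of this change can be sketched as a read loop that stops as soon as the input stream fails. This is a minimal illustration, not the plugin's actual code: the function name `consume_until_eof` is hypothetical, and the per-line work is reduced to a counter.

```cpp
#include <cstddef>
#include <istream>
#include <sstream>
#include <string>

// Hypothetical helper (not from the PR): reads lines until the stream
// reports EOF or an error, then returns how many lines were handled.
std::size_t consume_until_eof(std::istream& in) {
    std::size_t handled = 0;
    std::string line;
    // std::getline returns the stream; it converts to false once a read
    // fails, which includes hitting EOF after the writer closes the pipe.
    while (std::getline(in, line)) {
        ++handled;  // a real plugin would parse and publish the line here
    }
    return handled;  // the caller can now clean up and exit
}
```

In a real process the function would be called with `std::cin`, and the process would exit once it returns, so a closed pipe ends the plugin even if no signal arrives.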

How to verify it

Verify that rsyslog_plugin is running inside the bgp or swss container.

Run docker exec -it bgp supervisorctl restart rsyslogd

Before change:

This does not kill the current rsyslog_plugin process. Instead, rsyslogd closes its write end of the pipe that feeds the plugin's cin, so rsyslog_plugin receives EOF, but rsyslogd does not send SIGTERM or SIGKILL to it. As a result, rsyslog_plugin runs in an infinite loop, constantly calling getline and raising CPU usage to 100% inside the container.

After change: with signalOnClose="on" added to the omprog action in the conf file, rsyslogd sends SIGTERM to the rsyslog_plugin process running inside the container, and rsyslog_plugin exits.

? ( ): rsyslog_plugin/578637 ... [continued]: read()) = -1 (unknown) (INTERNAL ERROR: strerror_r(512, [buf], 128)=22)

UT (will add sonic-mgmt testcase for storming events with logs)

RCA:

1. When rsyslogd is terminated, no signal is sent to its rsyslog_plugin child process, so rsyslog_plugin keeps trying to read from cin with no writer on the other end of the pipe. The rsyslog_plugin process therefore calls getline in an endless loop.

2. Because rsyslogd is terminated while the spawned rsyslog_plugin is still alive, when rsyslogd starts back up and a log is triggered, a new rsyslog_plugin is spawned for the new rsyslogd process. Repeated restarts can leave many "ghost" rsyslog_plugin processes, each at high CPU usage.
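
The spin described in point 1 can be reproduced in miniature. The sketch below is an assumption-laden illustration rather than plugin code: a `std::istringstream` stands in for the closed pipe, and the hypothetical helper `failed_reads_after_eof` bounds the retries so the example terminates. It shows that once the stream has hit EOF, every further `std::getline` call fails immediately instead of blocking, which is why a loop that ignores the failure burns CPU.

```cpp
#include <cstddef>
#include <istream>
#include <sstream>
#include <string>

// Hypothetical demonstration helper (not from the PR).
// Drains the stream to EOF, then retries std::getline `attempts` more
// times and counts how many of those retries fail.
std::size_t failed_reads_after_eof(std::istream& in, std::size_t attempts) {
    std::string line;
    while (std::getline(in, line)) {
        // discard the remaining input
    }
    std::size_t failed = 0;
    for (std::size_t i = 0; i < attempts; ++i) {
        // The stream is already in a failed state, so this call returns
        // without waiting for data.
        if (!std::getline(in, line)) {
            ++failed;
        }
    }
    return failed;
}
```

Calling it with any finite input and `attempts = 1000` returns 1000: none of the retries ever succeeds or waits, matching the runaway loop described above.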
@mssonicbld mssonicbld requested a review from lguohan as a code owner May 25, 2024 03:02
@mssonicbld mssonicbld merged commit b449b15 into sonic-net:202305 May 25, 2024
3 checks passed
abdosi pushed a commit to abdosi/sonic-buildimage-msft that referenced this pull request Jun 28, 2024
…D automatically (#15347)

src/sonic-platform-daemons

* a90bff5 - (HEAD -> 202205, origin/202205) [ycabled] correct the wrong function call for 'config hwmode state' (sonic-net#372) (5 minutes ago) [vdahiya12]
abdosi pushed a commit to abdosi/sonic-buildimage-msft that referenced this pull request Jun 28, 2024
…D automatically (#15366)

src/sonic-platform-daemons

* 18815c7 - (HEAD -> 202205, origin/202205) Revert "[ycabled] refactor code for onboarding async client changes;refactor (sonic-net#355)" (3 minutes ago) [Ying Xie]
* 5324554 - Revert "add async notification support in active-active topo; refactor code for ycable tasks for change events  (sonic-net#327)" (3 minutes ago) [Ying Xie]
* cbbe2b5 - Revert "[ycabled] fix bug for `show mux status` delayed response (sonic-net#364)" (3 minutes ago) [Ying Xie]
* 9746709 - Revert "[dualtor] Fix command `show mux status` (sonic-net#371)" (3 minutes ago) [Ying Xie]
* 551ab3c - Revert "[ycabled] correct the wrong function call for 'config hwmode state' (sonic-net#372)" (3 minutes ago) [Ying Xie]
abdosi pushed a commit to abdosi/sonic-buildimage-msft that referenced this pull request Jun 28, 2024
…D automatically (#15749)

src/sonic-platform-daemons

* 112656c - (HEAD -> 202205, origin/202205) [ycabled][active-active] no initialize Async Client, when no active-active cable type; fix names for all ycabled threads (sonic-net#373) (4 minutes ago) [vdahiya12]
* e325d5a - Revert "Revert "[ycabled] correct the wrong function call for 'config hwmode state' (sonic-net#372)"" (4 minutes ago) [Ying Xie]
* ddabca1 - Revert "Revert "[dualtor] Fix command `show mux status` (sonic-net#371)"" (4 minutes ago) [Ying Xie]
* 28918da - Revert "Revert "[ycabled] fix bug for `show mux status` delayed response (sonic-net#364)"" (4 minutes ago) [Ying Xie]
* a849de9 - Revert "Revert "add async notification support in active-active topo; refactor code for ycable tasks for change events  (sonic-net#327)"" (4 minutes ago) [Ying Xie]
* cf1e73a - Revert "Revert "[ycabled] refactor code for onboarding async client changes;refactor (sonic-net#355)"" (4 minutes ago) [Ying Xie]