
[201803][monit] Restart rsyslog service if rsyslogd consumes > 800 MB memory #2963

Merged (1 commit, Jun 4, 2019)
Conversation

@jleveque (Contributor) commented on Jun 3, 2019:

Configure monit to monitor the resident memory consumption of rsyslogd. If memory usage is > 800 MB for 5 out of 10 checks (2-minute cycle interval, so 10 out of 20 minutes), restart the rsyslog service, because rsyslogd is most likely leaking memory.

@jleveque self-assigned this on Jun 3, 2019
@@ -268,6 +268,10 @@ check system $HOST
if memory usage > 50% for 5 times within 10 cycles then alert
if cpu usage (user) > 90% for 5 times within 10 cycles then alert
if cpu usage (system) > 90% for 5 times within 10 cycles then alert
check process rsyslog with pidfile /var/run/rsyslogd.pid
Collaborator (commenting on /var/run/rsyslogd.pid):

How about the rsyslog processes inside docker? Do they matter?

@jleveque (author) replied:
We have not seen the rsyslogd memory leak occur in an rsyslogd process inside any Docker container. The assumption is that those rsyslogd processes carry a very light load, whereas the rsyslogd process in the host image also acts as the rsyslog server for all of those processes, so it handles a much higher volume of messages.

Collaborator replied:
rsyslog within a container is better managed inside the container itself, for example by using superlance with supervisord, or by using a container option to limit the total memory consumption of the container (see the sketch below).
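For the container-level option mentioned above, a minimal sketch, assuming Docker is the container runtime (the image name and the 512 MB cap are purely illustrative, not values from this PR):

# Hypothetical example: cap the container's total memory so a leaking rsyslogd
# inside it is constrained (and OOM-killed) rather than exhausting the host.
docker run --memory=512m <container-image>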

check process rsyslog with pidfile /var/run/rsyslogd.pid
start program = "/bin/systemctl start rsyslog.service"
stop program = "/bin/systemctl stop rsyslog.service"
if totalmem > 800 MB for 5 times within 10 cycles then restart
Collaborator (commenting on the restart action):

Do we need to keep a restart counter somewhere?

Collaborator replied:
Good question. What is the log message for such a restart? We can search the syslog for such cases.

@jleveque (author) replied on Jun 4, 2019:
Each cycle in which monit detects that memory usage has exceeded the threshold, it will log the following:

ERR monit[607]: 'rsyslog' total mem amount of 1.6 GB matches resource limit [total mem amount>800.0 MB]

And if it meets the criteria (5 of these within 10 cycles), it will log the following when it attempts to restart the service:

INFO monit[607]: 'rsyslog' trying to restart
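That suggests a simple way to approximate a restart counter: search the syslog for monit's restart message. A sketch, assuming the host log lives at /var/log/syslog (the path may differ per platform):

# Count monit's attempts to restart the rsyslog service (log path is assumed).
grep -c "'rsyslog' trying to restart" /var/log/syslog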

@lguohan merged commit 647b257 into sonic-net:201803 on Jun 4, 2019
@jleveque deleted the rsyslog_mem_limit_201803 branch on June 4, 2019 at 21:25
@yxieca (Contributor) commented on Jun 13, 2019:

Get this change into the 201811 branch until we have a better memory resource monitor/mitigation in place.
