[memory_monitoring] Monitoring High Memory Usage of Containers #1016

Open
wants to merge 11 commits into
base: master

@yozhao101 yozhao101 commented Jun 12, 2022

This PR provides a high-level design for monitoring the memory usage of containers in SONiC.

Ubuntu and others added 8 commits June 12, 2022 15:46
Signed-off-by: Ubuntu <yozhao@yozhao-dev.q2ryrfy45r0utbtgx4kh2e1bob.phxx.internal.cloudapp.net>
Signed-off-by: Yong Zhao <yozhao@microsoft.com>
Signed-off-by: Yong Zhao <yozhao@microsoft.com>
Signed-off-by: Yong Zhao <yozhao@microsoft.com>
alerting ability.

Signed-off-by: Yong Zhao <yozhao@microsoft.com>
Signed-off-by: Yong Zhao <yozhao@microsoft.com>
Signed-off-by: Yong Zhao <yozhao@microsoft.com>
Signed-off-by: Yong Zhao <yozhao@microsoft.com>
Signed-off-by: Yong Zhao <yozhao@microsoft.com>
@yozhao101 yozhao101 marked this pull request as ready for review July 11, 2022 21:09
Signed-off-by: Yong Zhao <yozhao@microsoft.com>
5. Monit will write an alerting message into syslog if it receives non-zero
value from `memory_checker` for specified number of times during a monitoring
interval.
6. After the monitoring interval, Monit will write alerting messages into syslog

What is the difference between steps 5 and 6?

either alerting is disabled or specified container is not running.
4. If runtime memory usage is larger than memory threshold, then `memory_checker`
exits with non-zero value; Otherwise, `memory_checker` exits with zero value.
5. Monit will write an alerting message into syslog if it receives non-zero

syslog if it receives non-zero

We need to write both zero and non-zero values to syslog: zero as INFO, non-zero as ERROR or CRITICAL. That way we can be confident that the monitoring process itself is running fine.
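
The request above could be met along the following lines. This is a minimal sketch, not the PR's actual `memory_checker`: the function names `classify` and `report_check`, the message wording, and the use of Python's standard `syslog` module are assumptions for illustration. The value 3 is taken from the Monit rule in this design; usage and threshold are assumed to be in the same (unspecified) unit.

```python
import syslog

EXIT_OK = 0                  # step 4: usage is within the threshold
EXIT_THRESHOLD_EXCEEDED = 3  # the status the Monit rule in this design matches on


def classify(container, usage, threshold):
    """Return (exit_code, syslog_priority, message); usage and threshold share one unit."""
    if usage > threshold:
        message = "container {}: memory usage {} exceeds threshold {}".format(
            container, usage, threshold)
        return EXIT_THRESHOLD_EXCEEDED, syslog.LOG_ERR, message
    message = "container {}: memory usage {} is within threshold {}".format(
        container, usage, threshold)
    return EXIT_OK, syslog.LOG_INFO, message


def report_check(container, usage, threshold):
    """Log the outcome of every check, then return the exit code for Monit."""
    exit_code, priority, message = classify(container, usage, threshold)
    syslog.syslog(priority, message)
    return exit_code
```

Because every run produces a log line, a gap in the INFO entries would itself indicate that the checker or Monit has stopped. A script could pass the result to Monit with `sys.exit(report_check(...))`.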


```bash
check program container_memory_lldp with path "/usr/bin/memory_checker lldp"
if status == 3 for 10 times within 20 minutes then alert repeat every 1 cycles
```

3

You did not define 3 as a non-zero value. Could you list all the possible values and their meanings?
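
One way to address this in the design is a single table of exit values that both `memory_checker` and the Monit rule refer to. The sketch below assumes Python. Only 0 (from step 4) and 3 (from the Monit rule above) come from this PR; the other members and all names are hypothetical placeholders, not values the script is known to use.

```python
from enum import IntEnum


class ExitCode(IntEnum):
    OK = 0                  # from step 4: usage within threshold
    INVALID_ARGUMENT = 1    # hypothetical placeholder
    INTERNAL_ERROR = 2      # hypothetical placeholder
    THRESHOLD_EXCEEDED = 3  # from the Monit rule: `if status == 3`


def should_alert(status):
    """Mirror the Monit test: only the threshold-exceeded status counts toward an alert."""
    return status == ExitCode.THRESHOLD_EXCEEDED
```

With such a table, the Monit condition and the script cannot silently drift apart, and any other non-zero status is visibly not an alert condition.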


#### 2.2.2.1 Show High Memory Alerting of Containers
```
admin@sonic:~$ show feature high_memory_alerting
```

high_memory_alerting

Better name: memory_alert

#### 2.2.2.2 Show Memory Threshold of Containers
```
admin@sonic:~$ show feature memory_threshold
Container Name MemThreshold
```

MemThreshold

What is the unit? KB or B?

```
{
"FEATURE": {
"database": {
```

database

Is it possible that FEATURE entries and containers do not map 1:1?
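
If the mapping can diverge, the design could include a consistency check. A rough sketch, assuming the `FEATURE` table has already been loaded into a dict and the container names have been collected separately (how they are obtained is left open here). The function name and the sample container `extra0` are made up for illustration.

```python
def compare_features_to_containers(feature_table, container_names):
    """Return (features without a container, containers without a FEATURE entry)."""
    features = set(feature_table)
    containers = set(container_names)
    return sorted(features - containers), sorted(containers - features)


# Illustrative data only: "database" and "lldp" appear in this PR, "extra0" is hypothetical.
feature_table = {"database": {}, "lldp": {}}
missing_containers, missing_features = compare_features_to_containers(
    feature_table, ["database", "extra0"])
```

Either list being non-empty would mean a threshold could be configured for something that is not running, or a running container could go unmonitored.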

@linux-foundation-easycla
CLA Missing ID CLA Not Signed
