nbrmgrd/buffermgrd were killed for using too much memory in the latest master image #2840

Closed
keboliu opened this issue Apr 30, 2019 · 1 comment

keboliu commented Apr 30, 2019

Server tasks such as buffermgrd and nbrmgrd were killed for using too much memory:

Apr 30 07:12:11.435319 mtbc-sonic-01-2410 WARNING kernel: [  429.852782] buffermgrd: page allocation stalls for 10908ms, order:0, mode:0x24201ca(GFP_HIGHUSER_MOVABLE|__GFP_COLD)
Apr 30 07:12:11.435330 mtbc-sonic-01-2410 WARNING kernel: [  429.852794] CPU: 1 PID: 10760 Comm: buffermgrd Tainted: G           O    4.9.0-8-2-amd64 #1 Debian 4.9.110-3+deb9u6
Apr 30 07:12:11.435333 mtbc-sonic-01-2410 WARNING kernel: [  429.852796] Hardware name: Mellanox Technologies Ltd. MSN2410/VMOD0001, BIOS 4.6.5 05/31/2018
Apr 30 07:12:11.435336 mtbc-sonic-01-2410 WARNING kernel: [  429.852798]  0000000000000000 ffffffffa5b312c4 ffffffffa62020a8 ffffbcc181513b70
Apr 30 07:12:11.435338 mtbc-sonic-01-2410 WARNING kernel: [  429.852803]  ffffffffa5989cca 024201caa66e7d00 ffffffffa62020a8 ffffbcc181513b10
Apr 30 07:12:11.435340 mtbc-sonic-01-2410 WARNING kernel: [  429.852807]  0000000000000010 ffffbcc181513b80 ffffbcc181513b30 ff283237d4d60199
Apr 30 07:12:11.435342 mtbc-sonic-01-2410 WARNING kernel: [  429.852811] Call Trace:
Apr 30 07:12:11.435344 mtbc-sonic-01-2410 WARNING kernel: [  429.852819]  [<ffffffffa5b312c4>] ? dump_stack+0x5c/0x78
Apr 30 07:12:11.435346 mtbc-sonic-01-2410 WARNING kernel: [  429.852824]  [<ffffffffa5989cca>] ? warn_alloc+0x13a/0x160
Apr 30 07:12:11.435348 mtbc-sonic-01-2410 WARNING kernel: [  429.852828]  [<ffffffffa598a6f5>] ? __alloc_pages_slowpath+0x995/0xbf0
Apr 30 07:12:11.435350 mtbc-sonic-01-2410 WARNING kernel: [  429.852832]  [<ffffffffa598ab51>] ? __alloc_pages_nodemask+0x201/0x260
Apr 30 07:12:11.435352 mtbc-sonic-01-2410 WARNING kernel: [  429.852835]  [<ffffffffa59dbe11>] ? alloc_pages_current+0x91/0x140
Apr 30 07:12:11.435353 mtbc-sonic-01-2410 WARNING kernel: [  429.852838]  [<ffffffffa5983686>] ? filemap_fault+0x326/0x5d0
Apr 30 07:12:11.435355 mtbc-sonic-01-2410 WARNING kernel: [  429.852864]  [<ffffffffc03e3a01>] ? ext4_filemap_fault+0x31/0x50 [ext4]
Apr 30 07:12:11.435357 mtbc-sonic-01-2410 WARNING kernel: [  429.852867]  [<ffffffffa59b4267>] ? __do_fault+0x87/0x170
Apr 30 07:12:11.435358 mtbc-sonic-01-2410 WARNING kernel: [  429.852870]  [<ffffffffa59b8b58>] ? handle_mm_fault+0xe78/0x12b0
Apr 30 07:12:11.435360 mtbc-sonic-01-2410 WARNING kernel: [  429.852874]  [<ffffffffa5861245>] ? __do_page_fault+0x255/0x4f0
Apr 30 07:12:11.435361 mtbc-sonic-01-2410 WARNING kernel: [  429.852879]  [<ffffffffa5e11772>] ? schedule+0x32/0x80
Apr 30 07:12:11.435363 mtbc-sonic-01-2410 WARNING kernel: [  429.852882]  [<ffffffffa5e173d8>] ? page_fault+0x28/0x30
Apr 30 07:12:11.435365 mtbc-sonic-01-2410 WARNING kernel: [  429.852884] Mem-Info:
Apr 30 07:12:11.435367 mtbc-sonic-01-2410 WARNING kernel: [  429.852890] active_anon:1808490 inactive_anon:5483 isolated_anon:0
Apr 30 07:12:11.435368 mtbc-sonic-01-2410 WARNING kernel: [  429.852890]  active_file:39 inactive_file:63 isolated_file:7
Apr 30 07:12:11.435370 mtbc-sonic-01-2410 WARNING kernel: [  429.852890]  unevictable:0 dirty:0 writeback:0 unstable:0
Apr 30 07:12:11.435372 mtbc-sonic-01-2410 WARNING kernel: [  429.852890]  slab_reclaimable:4197 slab_unreclaimable:166808
Apr 30 07:12:11.435373 mtbc-sonic-01-2410 WARNING kernel: [  429.852890]  mapped:3279 shmem:5805 pagetables:6361 bounce:0
Apr 30 07:12:11.435375 mtbc-sonic-01-2410 WARNING kernel: [  429.852890]  free:25271 free_pcp:0 free_cma:0
Apr 30 07:12:11.435412 mtbc-sonic-01-2410 WARNING kernel: [  431.208880] nbrmgrd: page allocation stalls for 11988ms, order:0, mode:0x24201ca(GFP_HIGHUSER_MOVABLE|__GFP_COLD)
Apr 30 07:12:11.435414 mtbc-sonic-01-2410 WARNING kernel: [  431.208892] CPU: 0 PID: 12524 Comm: nbrmgrd Tainted: G           O    4.9.0-8-2-amd64 #1 Debian 4.9.110-3+deb9u6
Apr 30 07:12:11.435416 mtbc-sonic-01-2410 WARNING kernel: [  431.208894] Hardware name: Mellanox Technologies Ltd. MSN2410/VMOD0001, BIOS 4.6.5 05/31/2018
Apr 30 07:12:11.435418 mtbc-sonic-01-2410 WARNING kernel: [  431.208897]  0000000000000000 ffffffffa5b312c4 ffffffffa62020a8 ffffbcc18203bb70
Apr 30 07:12:11.435420 mtbc-sonic-01-2410 WARNING kernel: [  431.208901]  ffffffffa5989cca 024201ca00000006 ffffffffa62020a8 ffffbcc18203bb10
Apr 30 07:12:11.435422 mtbc-sonic-01-2410 WARNING kernel: [  431.208905]  0000000000000010 ffffbcc18203bb80 ffffbcc18203bb30 c34bb573a79c6442
Apr 30 07:12:11.435424 mtbc-sonic-01-2410 WARNING kernel: [  431.208909] Call Trace:
Apr 30 07:12:11.435425 mtbc-sonic-01-2410 WARNING kernel: [  431.208918]  [<ffffffffa5b312c4>] ? dump_stack+0x5c/0x78
Apr 30 07:12:11.435427 mtbc-sonic-01-2410 WARNING kernel: [  431.208923]  [<ffffffffa5989cca>] ? warn_alloc+0x13a/0x160
Apr 30 07:12:11.435429 mtbc-sonic-01-2410 WARNING kernel: [  431.208927]  [<ffffffffa598a6f5>] ? __alloc_pages_slowpath+0x995/0xbf0
Apr 30 07:12:11.435430 mtbc-sonic-01-2410 WARNING kernel: [  431.208930]  [<ffffffffa59dbe11>] ? alloc_pages_current+0x91/0x140
Apr 30 07:12:11.435432 mtbc-sonic-01-2410 WARNING kernel: [  431.208934]  [<ffffffffa598ab51>] ? __alloc_pages_nodemask+0x201/0x260
Apr 30 07:12:11.435434 mtbc-sonic-01-2410 WARNING kernel: [  431.208937]  [<ffffffffa59dbe11>] ? alloc_pages_current+0x91/0x140
Apr 30 07:12:11.435435 mtbc-sonic-01-2410 WARNING kernel: [  431.208939]  [<ffffffffa5983686>] ? filemap_fault+0x326/0x5d0
Apr 30 07:12:11.435437 mtbc-sonic-01-2410 WARNING kernel: [  431.208966]  [<ffffffffc03e3a01>] ? ext4_filemap_fault+0x31/0x50 [ext4]
Apr 30 07:12:11.435438 mtbc-sonic-01-2410 WARNING kernel: [  431.208969]  [<ffffffffa59b4267>] ? __do_fault+0x87/0x170
Apr 30 07:12:11.435440 mtbc-sonic-01-2410 WARNING kernel: [  431.208972]  [<ffffffffa59b8b58>] ? handle_mm_fault+0xe78/0x12b0
Apr 30 07:12:11.435441 mtbc-sonic-01-2410 WARNING kernel: [  431.208975]  [<ffffffffa5a5155f>] ? ep_poll+0x32f/0x350
Apr 30 07:12:11.435443 mtbc-sonic-01-2410 WARNING kernel: [  431.208979]  [<ffffffffa5861245>] ? __do_page_fault+0x255/0x4f0
Apr 30 07:12:11.435445 mtbc-sonic-01-2410 WARNING kernel: [  431.208983]  [<ffffffffa5e173d8>] ? page_fault+0x28/0x30
Apr 30 07:12:11.435447 mtbc-sonic-01-2410 WARNING kernel: [  431.208985] Mem-Info:
Apr 30 07:12:11.435455 mtbc-sonic-01-2410 WARNING kernel: [  431.208992] active_anon:1808490 inactive_anon:5483 isolated_anon:0
Apr 30 07:12:11.435457 mtbc-sonic-01-2410 WARNING kernel: [  431.208992]  active_file:29 inactive_file:46 isolated_file:3
Apr 30 07:12:11.435459 mtbc-sonic-01-2410 WARNING kernel: [  431.208992]  unevictable:0 dirty:0 writeback:0 unstable:0
Apr 30 07:12:11.435460 mtbc-sonic-01-2410 WARNING kernel: [  431.208992]  slab_reclaimable:4194 slab_unreclaimable:166812
Apr 30 07:12:11.435462 mtbc-sonic-01-2410 WARNING kernel: [  431.208992]  mapped:3277 shmem:5805 pagetables:6361 bounce:0
Apr 30 07:12:11.435463 mtbc-sonic-01-2410 WARNING kernel: [  431.208992]  free:25266 free_pcp:0 free_cma:0
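
To put the numbers above in perspective, the following is a minimal sketch that pulls the stall durations and the `Mem-Info` page counters out of the log and converts pages to MiB. It assumes 4 KiB pages (the usual x86_64 default; the log does not state the page size), and the helper names `parse_stalls`, `parse_mem_info`, and `pages_to_mib` are hypothetical, made up for this illustration. The `sample` string is a shortened copy of lines from the log above with the syslog prefix trimmed.

```python
import re

PAGE_SIZE = 4096  # assumption: 4 KiB pages; not stated anywhere in the log

STALL_RE = re.compile(r"(\w+): page allocation stalls for (\d+)ms")
COUNTER_RE = re.compile(r"\b([a-z_]+):(\d+)")


def parse_stalls(log_text):
    """Return (process_name, stall_ms) for every allocation-stall line."""
    return [(name, int(ms)) for name, ms in STALL_RE.findall(log_text)]


def parse_mem_info(lines):
    """Collect the `name:pages` counters from Mem-Info lines into a dict."""
    counters = {}
    for line in lines:
        # Keep only the text after the kernel "[  429.852890]" timestamp.
        payload = line.rsplit("]", 1)[-1]
        for name, pages in COUNTER_RE.findall(payload):
            counters[name] = int(pages)
    return counters


def pages_to_mib(pages, page_size=PAGE_SIZE):
    """Convert a page count to MiB under the assumed page size."""
    return pages * page_size / 2**20


sample = """\
kernel: [  429.852782] buffermgrd: page allocation stalls for 10908ms, order:0, mode:0x24201ca(GFP_HIGHUSER_MOVABLE|__GFP_COLD)
kernel: [  429.852890] active_anon:1808490 inactive_anon:5483 isolated_anon:0
kernel: [  429.852890]  slab_reclaimable:4197 slab_unreclaimable:166808
kernel: [  429.852890]  free:25271 free_pcp:0 free_cma:0
kernel: [  431.208880] nbrmgrd: page allocation stalls for 11988ms, order:0, mode:0x24201ca(GFP_HIGHUSER_MOVABLE|__GFP_COLD)
"""

stalls = parse_stalls(sample)
# Feed only the counter lines to the Mem-Info parser, not the stall lines.
counters = parse_mem_info(
    line for line in sample.splitlines() if "page allocation stalls" not in line
)

for name, ms in stalls:
    print(f"{name} stalled for {ms / 1000:.1f} s")
for key in ("active_anon", "slab_unreclaimable", "free"):
    print(f"{key}: {pages_to_mib(counters[key]):.1f} MiB")
```

Under the 4 KiB assumption, `active_anon` works out to roughly 6.9 GiB of anonymous memory with only about 99 MiB free. Note that these counters are system-wide: the log excerpt alone does not show which process owns the anonymous pages.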

Steps to reproduce the issue:
The problem is easily observed with the latest image after the switch starts up.

Describe the results you received:

Describe the results you expected:

Additional information you deem important (e.g. issue happens only occasionally):

**Output of `show version`:**
SONiC Software Version: SONiC.HEAD.956-ad2c1b2
Distribution: Debian 9.9
Kernel: 4.9.0-8-2-amd64
Build commit: ad2c1b2
Build date: Sun Apr 28 08:38:17 UTC 2019
Built by: johnar@jenkins-worker-3

Platform: x86_64-mlnx_msn2410-r0
HwSKU: ACS-MSN2410
ASIC: mellanox
Serial Number: MT1848K10623
Uptime: 10:34:23 up  3:29,  1 user,  load average: 2.73, 2.55, 2.47

Docker images:
REPOSITORY                 TAG                 IMAGE ID            SIZE
docker-dhcp-relay          HEAD.956-ad2c1b2    4e8db672e338        256MB
docker-dhcp-relay          latest              4e8db672e338        256MB
docker-fpm-quagga          HEAD.956-ad2c1b2    b5d608c9c504        281MB
docker-fpm-quagga          latest              b5d608c9c504        281MB
docker-syncd-mlnx-rpc      HEAD.956-ad2c1b2    e59e740d124e        617MB
docker-syncd-mlnx-rpc      latest              e59e740d124e        617MB
docker-teamd               HEAD.956-ad2c1b2    184617224403        300MB
docker-teamd               latest              184617224403        300MB
docker-sonic-telemetry     HEAD.956-ad2c1b2    0b0b9f93bfd5        300MB
docker-sonic-telemetry     latest              0b0b9f93bfd5        300MB
docker-snmp-sv2            HEAD.956-ad2c1b2    dd7f19154668        317MB
docker-snmp-sv2            latest              dd7f19154668        317MB
docker-router-advertiser   HEAD.956-ad2c1b2    1e0782909d55        279MB
docker-router-advertiser   latest              1e0782909d55        279MB
docker-platform-monitor    HEAD.956-ad2c1b2    98dffb8c7efa        324MB
docker-platform-monitor    latest              98dffb8c7efa        324MB
docker-orchagent           HEAD.956-ad2c1b2    6f752e1afab4        319MB
docker-orchagent           latest              6f752e1afab4        319MB
docker-lldp-sv2            HEAD.956-ad2c1b2    199b4184f107        298MB
docker-lldp-sv2            latest              199b4184f107        298MB
docker-database            HEAD.956-ad2c1b2    6ff3d687146f        280MB
docker-database            latest              6ff3d687146f        280MB
**Attach debug file `sudo generate_dump`:**

syslog.zip

nazariig commented May 2, 2019

@keboliu the issue was fixed in this PR: sonic-net/sonic-swss#864

lguohan closed this as completed May 4, 2019
yxieca pushed a commit that referenced this issue Jul 21, 2023
…atically (#15930)

src/sonic-utilities

* 99864640 - (HEAD -> 202205, origin/202205) [dualtor] Add script to verify consistency between kernel and ASIC  (#2840) (87 minutes ago) [Longxiang Lyu]
* a32ddc1b - [show][muxcable] update `show mux config` to print out `soc_ipv6` as well  (#2909) (88 minutes ago) [Jing Zhang]
* 0c6d0c51 - [202205] Flush RESTAPI db in fast-reboot shutdown path (#2921) (4 hours ago) [bingwang-ms]
mssonicbld added a commit that referenced this issue Dec 1, 2023
…atically (#17370)

#### Why I did it
src/sonic-utilities
```
* 29511a5a - (HEAD -> 202211, origin/202211) [dualtor] Add script to verify consistency between kernel and ASIC  (#2840) (5 hours ago) [Longxiang Lyu]
```
#### How I did it
#### How to verify it
#### Description for the changelog
mlok-nokia pushed a commit to mlok-nokia/sonic-buildimage that referenced this issue Jun 5, 2024
…t to verify consistency between kernel and ASIC (sonic-net#2840)

   * a32ddc1 [show][muxcable] update `show mux config` to print out `soc_ipv6` as well (sonic-net#2909)
   * 0c6d0c5 [202205] Flush RESTAPI db in fast-reboot shutdown path (sonic-net#2921)
   * bc7c792 Add FEC correctable and uncorrectable port stats (sonic-net#2027)
   * 58db48a [show][muxcable] update  to check soc_ipv6 as well
   * 24fc1db [dualtor][route_check] filter out   (sonic-net#2899)
   * d89d483 [route_check][dualtor] Ignore vlan neighbor route miss (sonic-net#2888)