Core dump is not created on OOM exceptions in Docker even though COMPlus_DbgEnableMiniDump is set to 1 #52521

Open
itai opened this issue May 9, 2021 · 11 comments

@itai

itai commented May 9, 2021

Steps to reproduce

Run the mcr.microsoft.com/dotnet/sdk:5.0.202-alpine3.13-amd64 Docker image with a memory limit and privileged capabilities:

docker run --memory 10g --memory-swap 10g --privileged --cap-add=ALL --security-opt seccomp:unconfined -it mcr.microsoft.com/dotnet/sdk:5.0.202-alpine3.13-amd64

Inside the container, create a new app:

dotnet new console -o App -n App

Change App/Program.cs to the following:

using System;
using System.Collections.Generic;

namespace App
{
    class Program
    {
        static void Main(string[] args)
        {
            var l = new List<byte[]>();
            while (true)
            {
                l.Add(new byte[1024 * 1024 * 1024]);
            }
        }
    }
}

Publish the application:

dotnet publish -c Release App/App.csproj

Run the following:

export COMPlus_DbgEnableMiniDump=1
export COMPlus_DbgMiniDumpType=1
export COMPlus_DbgMiniDumpName=/core.dmp
export COMPlus_CreateDumpDiagnostics=1
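
A minimal sketch for double-checking these settings before running the app. The helper names dump_type_name and print_dump_settings are hypothetical and not part of the original report; the numeric mapping follows the documented COMPlus_DbgMiniDumpType values (1 = Mini, 2 = Heap, 3 = Triage, 4 = Full).

```shell
# Hypothetical helper: map a COMPlus_DbgMiniDumpType value to the dump kind it selects.
dump_type_name() {
    case "$1" in
        1) echo "Mini" ;;
        2) echo "Heap" ;;
        3) echo "Triage" ;;
        4) echo "Full" ;;
        '') echo "unset" ;;
        *) echo "Unknown" ;;
    esac
}

# Hypothetical helper: print the dump-related settings visible to the current shell.
print_dump_settings() {
    echo "enabled: ${COMPlus_DbgEnableMiniDump:-unset}"
    echo "type:    $(dump_type_name "${COMPlus_DbgMiniDumpType:-}")"
    echo "path:    ${COMPlus_DbgMiniDumpName:-unset}"
}
```

Calling print_dump_settings in the same shell that later runs dotnet shows whether the exports above took effect there.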

Run the executable:

dotnet ./App/bin/Release/net5.0/App.dll

As expected, we get an "out of memory" exception. However, a core dump is not created. It appears that createdump starts running but doesn't finish. The following file contains the on-screen output: app_output.txt

Configuration

> docker --version
Docker version 20.10.5, build 55c4c88

Inside the container:

# dotnet --info
.NET SDK (reflecting any global.json):
 Version:   5.0.202
 Commit:    db7cc87d51

Runtime Environment:
 OS Name:     alpine
 OS Version:  3.13
 OS Platform: Linux
 RID:         linux-musl-x64
 Base Path:   /usr/share/dotnet/sdk/5.0.202/

Host (useful for support):
  Version: 5.0.5
  Commit:  2f740adc14

.NET SDKs installed:
  5.0.202 [/usr/share/dotnet/sdk]

.NET runtimes installed:
  Microsoft.AspNetCore.App 5.0.5 [/usr/share/dotnet/shared/Microsoft.AspNetCore.App]
  Microsoft.NETCore.App 5.0.5 [/usr/share/dotnet/shared/Microsoft.NETCore.App]
@dotnet-issue-labeler dotnet-issue-labeler bot added the untriaged New issue has not been triaged by the area owner label May 9, 2021
@ghost

ghost commented May 9, 2021

Tagging subscribers to this area: @tommcdon
See info in area-owners.md if you want to be subscribed.

Issue Details

Author: itai
Assignees: -
Labels:

area-Diagnostics-coreclr, untriaged

Milestone: -

@davidfowl
Member

cc @mikem8361

@mikem8361 mikem8361 self-assigned this May 10, 2021
@mikem8361 mikem8361 added this to the 6.0.0 milestone May 10, 2021
@mikem8361 mikem8361 removed the untriaged New issue has not been triaged by the area owner label May 10, 2021
@mikem8361
Member

From the app_output.txt it looks like createdump is getting launched but it gets killed in the middle of the dump generation because of Linux's OOM-killer. Could you try to disable it and run this again?
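
One way to check whether the kernel really killed a process in the container is to read the cgroup's oom_kill counter before and after the run. The sketch below assumes the usual in-container paths for cgroup v1 (memory.oom_control) and cgroup v2 (memory.events); both paths and the helper names are assumptions, not something taken from the attached output.

```shell
# Hypothetical helper: print the oom_kill counter from a cgroup statistics file.
# Prints nothing if the file has no such field.
oom_kill_count() {
    awk '$1 == "oom_kill" { print $2 }' "$1"
}

# Hypothetical helper: try the usual cgroup v1 and v2 locations (assumed paths).
report_oom_kills() {
    for f in /sys/fs/cgroup/memory/memory.oom_control /sys/fs/cgroup/memory.events; do
        if [ -r "$f" ]; then
            echo "$f: oom_kill=$(oom_kill_count "$f")"
        fi
    done
}
```

If the counter printed by report_oom_kills goes up across a run of the app, some process in the cgroup was killed by the OOM killer during that run. It does not say which process, so the kernel log would still be needed to confirm that it was createdump.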

@mikem8361
Member

To disable the OOM killer, set /proc/sys/vm/panic_on_oom to 0. A value of 0 means the kernel will not panic when an out-of-memory error occurs.
# cat /proc/sys/vm/panic_on_oom
# echo 0 > /proc/sys/vm/panic_on_oom
...
# echo 1 > /proc/sys/vm/panic_on_oom

@ghost

ghost commented Jul 13, 2021

This issue has been automatically marked "no recent activity" because it has been marked as "needs author feedback" but has not had any activity for 14 days. It will be closed if no further activity occurs within 7 more days. Any new comment (by anyone, not necessarily the author) will remove "no recent activity".

@itai
Author

itai commented Jul 13, 2021

The default value of /proc/sys/vm/panic_on_oom appears to be 0. I tried setting it to both 0 and 1 and running dotnet ./App/bin/Release/net5.0/App.dll and I still did not get a core dump.

run with panic_on_oom set to 0.txt
run with panic_on_oom set to 1.txt
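
For context, the kernel's sysctl documentation describes vm.panic_on_oom as choosing between panicking and invoking the OOM killer, rather than as a switch that turns the OOM killer off. That would be consistent with neither value changing the outcome here. The sketch below only describes the three documented values; the function name is hypothetical.

```shell
# Hypothetical helper: describe a vm.panic_on_oom value per the kernel sysctl documentation.
describe_panic_on_oom() {
    case "$1" in
        0) echo "no panic: the kernel invokes the OOM killer" ;;
        1) echo "panic on system-wide OOM; cpuset, mempolicy and memory cgroup OOMs still use the OOM killer" ;;
        2) echo "always panic, even for cpuset, mempolicy and memory cgroup OOMs" ;;
        *) echo "unrecognized value" ;;
    esac
}
```

A typical call would be describe_panic_on_oom "$(cat /proc/sys/vm/panic_on_oom)". Because the container in this report runs under a memory cgroup limit, values 0 and 1 both leave the OOM killer in charge, and value 2 would panic the whole host instead of producing a dump.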

@ghost ghost added needs-further-triage Issue has been initially triaged, but needs deeper consideration or reconsideration and removed needs author feedback no-recent-activity labels Jul 13, 2021
@mikem8361 mikem8361 removed the needs-further-triage Issue has been initially triaged, but needs deeper consideration or reconsideration label Jul 13, 2021
@mikem8361
Member

Yes, in both cases createdump is aborted, and it still seems to be the OOM killer doing it. createdump could also be aborting itself when a memory allocation fails, but that would show up as a crash, exception, or signal. I'm not sure what to do about it in either case.

@mikem8361
Member

/cc: @hoyosjs

@mikem8361
Member

I don't know the details, but I was told that it is hard to turn off the OOM killer. I found the directions above on the internet and have no experience with them.

@mikem8361
Member

Moving this to 7.0.0 because there is nothing we can do in createdump to prevent it from being killed in low-memory scenarios.
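
Outside createdump, Linux does expose a per-process knob, /proc/<pid>/oom_score_adj, where -1000 exempts a process from the OOM killer. The sketch below is an untested idea rather than a fix confirmed in this thread. The function name and the optional third argument are hypothetical; the third argument exists only so the function can be tried against a scratch directory.

```shell
# Hypothetical helper: write an OOM score adjustment for a process.
# $1 = pid, $2 = value (-1000..1000), $3 = proc root (defaults to /proc).
set_oom_score_adj() {
    pid=$1
    value=$2
    proc_root=${3:-/proc}
    digits=${value#-}
    # Reject empty input, non-digits, and anything longer than four digits.
    case "$digits" in
        ''|*[!0-9]*|?????*) echo "not a valid integer: $value" >&2; return 2 ;;
    esac
    if [ "$value" -lt -1000 ] || [ "$value" -gt 1000 ]; then
        echo "out of range: $value" >&2
        return 2
    fi
    printf '%s\n' "$value" > "$proc_root/$pid/oom_score_adj"
}
```

Child processes inherit the value, so calling set_oom_score_adj $$ -1000 in the shell before starting dotnet would apply it to the app and to any createdump process it launches. Lowering the value requires CAP_SYS_RESOURCE, which the privileged container in the report should have. Whether this lets createdump finish under the 10g limit has not been verified here, and exempting every process in the cgroup leaves the kernel with nothing to kill, so it should be treated as an experiment only.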
