Ping class responding Timedout not reading response when ICMP Time-to-live exceeded. #73232

Closed
IainStevenson opened this issue Aug 2, 2022 · 36 comments · Fixed by #99875
Labels: area-System.Net, in-pr, os-linux

IainStevenson commented Aug 2, 2022

Description

With reference to #61465, it appears that the fix already produced is not working, at least on Linux containers.

I read the above issue and it matched my scenario, so I was expecting to update my framework/SDKs and move on.

I noted that the fix was not included in 6.0.7, so I pushed on to the .NET 7 preview.

Reproduction Steps

Note: it's a few hops from the container to my outer WAN address, 192.168.1.1.

IPv6

I don't have the facility to test IPv6.

Code used.

The code suggested for a test in issue #61465 was used, with the TTL changed to suit my environment.

using System.Net.NetworkInformation;

var ping = new Ping();
var reply = ping.Send("8.8.8.8", 5000, new byte[32], new PingOptions(3, false)); // Remove PingOptions to make it succeed
Console.WriteLine($"Status: {reply.Status} Address: {reply.Address}");

Dockerfile

Used to generate the container.

Generated by Visual Studio; nothing untoward here.

#See https://aka.ms/containerfastmode to understand how Visual Studio uses this Dockerfile to build your images for faster debugging.

FROM mcr.microsoft.com/dotnet/runtime:7.0 AS base
WORKDIR /app

FROM mcr.microsoft.com/dotnet/sdk:7.0 AS build
WORKDIR /src
COPY ["PingFixTest.csproj", "."]
RUN dotnet restore "./PingFixTest.csproj"
COPY . .
WORKDIR "/src/."
RUN dotnet build "PingFixTest.csproj" -c Release -o /app/build

FROM build AS publish
RUN dotnet publish "PingFixTest.csproj" -c Release -o /app/publish

FROM base AS final
WORKDIR /app
COPY --from=publish /app/publish .
ENTRYPOINT ["dotnet", "PingFixTest.dll"]

Expected behavior

On Windows it works as expected:
Status: TtlExpired Address: 192.168.1.1

It should work the same way in a Linux container, but there it fails with:
Status: TimedOut Address: 0.0.0.0

Actual behavior

Docker log file

Status: TimedOut Address: 0.0.0.0

Please note: I tried running the container with the default Docker network and again with --network host. It made no difference.

Regression?

No response

Known Workarounds

No response

Configuration

Dev environment

Windows 10 Pro 21H2 19044.1826
Visual Studio 2022 Version 17.2.6
SDK .NET 7 preview 6 installed and allowed.
Docker Desktop 4.10.1 (82475) is currently the newest version available.
System.Net.Ping file version 7.0.22.32404.

Other information

Linux command line on the running container

I installed the necessary utilities.

apt update
apt-get install -y traceroute
apt-get install -y iputils-ping

# traceroute 8.8.8.8
traceroute to 8.8.8.8 (8.8.8.8), 30 hops max, 60 byte packets
 1  172.17.0.1 (172.17.0.1)  0.880 ms  0.749 ms  0.625 ms
 2  * * *
 3  * * *
 4  * * *
 5  * * *
 6  * * *
 7  * * *
 8  * * *
 9  * * *
10  * * *
11  * * *
12  * * *
13  * * *
14  * * *
15  * * *
16  * * *
17  * * *
18  * * *
19  * * *
20  * * *
21  * * *
22  * * *
23  * * *
24  * * *
25  * * *
26  * * *
27  * * *
28  * * *
29  * * *
30  * * *
#

172.17.0.1 is the Docker bridge network gateway.
By now I expected to see the above result. Note that it gives up after the maximum hop count (default 30) and does not even get the Success from the final hop to the address (hop 15).

I find it interesting that the Docker network gateway responds but nothing beyond it does.

However interesting that may be, the test code fails in the Ping class even at the Docker gateway address (hop 1), so it may be a red herring.

# traceroute -I 8.8.8.8
traceroute to 8.8.8.8 (8.8.8.8), 30 hops max, 60 byte packets
 1  172.17.0.1 (172.17.0.1)  0.422 ms  0.242 ms  0.219 ms
 2  192.168.0.1 (192.168.0.1)  8.790 ms  8.703 ms  9.670 ms
 3  192.168.1.1 (192.168.1.1)  9.675 ms  9.676 ms  10.607 ms
 4  * * *
 5  10.248.27.157 (10.248.27.157)  55.479 ms  55.487 ms  55.488 ms
 6  10.247.87.167 (10.247.87.167)  54.488 ms  39.425 ms  39.327 ms
 7  * * *
 8  10.247.87.113 (10.247.87.113)  38.963 ms  45.442 ms  45.532 ms
 9  10.247.87.142 (10.247.87.142)  45.534 ms  46.442 ms  51.346 ms
10  87.237.20.146 (87.237.20.146)  46.036 ms  46.340 ms  46.736 ms
11  87.237.20.67 (87.237.20.67)  51.395 ms  41.240 ms  34.692 ms
12  72.14.242.70 (72.14.242.70)  34.497 ms  41.573 ms  41.030 ms
13  108.170.246.129 (108.170.246.129)  41.314 ms  40.970 ms  41.984 ms
14  142.251.54.25 (142.251.54.25)  35.644 ms  43.839 ms  59.243 ms
15  dns.google (8.8.8.8)  48.850 ms  48.872 ms  52.435 ms
#

I also expected to see the above result, as -I forces traceroute to use ICMP messages (which is what Ping.cs is doing), and it all works wonderfully via traceroute.exe :)

Note: The first three hops are Docker and my equipment (I have dual WAN), hop 4 is my entry point to the internet, and hops 5-9 are internal class A addresses on the EE mobile network. The remaining hops run from the UK to the USA.

This is very normal for here.

# ping -t 3 8.8.8.8
PING 8.8.8.8 (8.8.8.8) 56(84) bytes of data.
From 192.168.1.1 icmp_seq=1 Time to live exceeded
From 192.168.1.1 icmp_seq=2 Time to live exceeded
From 192.168.1.1 icmp_seq=3 Time to live exceeded
From 192.168.1.1 icmp_seq=4 Time to live exceeded
From 192.168.1.1 icmp_seq=5 Time to live exceeded

So, the container is connected to the internet and its networking is performing as expected.

My conclusion is that the fix for issue #61465 is not enough, or is missing in action in this release for some reason.

Further information

Wireshark data out and in from container code run.

Request

Frame 51: 74 bytes on wire (592 bits), 74 bytes captured (592 bits) on interface \Device\NPF_{AF47E508-7C19-4E4B-8832-9499515FEC65}, id 0
    Interface id: 0 (\Device\NPF_{AF47E508-7C19-4E4B-8832-9499515FEC65})
        Interface name: \Device\NPF_{AF47E508-7C19-4E4B-8832-9499515FEC65}
        Interface description: Ethernet
    Encapsulation type: Ethernet (1)
    Arrival Time: Aug  2, 2022 14:01:41.082932000 GMT Summer Time
    [Time shift for this packet: 0.000000000 seconds]
    Epoch Time: 1659445301.082932000 seconds
    [Time delta from previous captured frame: 0.136255000 seconds]
    [Time delta from previous displayed frame: 0.000000000 seconds]
    [Time since reference or first frame: 4.793458000 seconds]
    Frame Number: 51
    Frame Length: 74 bytes (592 bits)
    Capture Length: 74 bytes (592 bits)
    [Frame is marked: False]
    [Frame is ignored: False]
    [Protocols in frame: eth:ethertype:ip:icmp:data]
    [Coloring Rule Name: ICMP]
    [Coloring Rule String: icmp || icmpv6]
Ethernet II, Src: Microsof_00:16:0e (00:15:5d:00:16:0e), Dst: TP-Link_e4:b4:6e (14:eb:b6:e4:b4:6e)
    Destination: TP-Link_e4:b4:6e (14:eb:b6:e4:b4:6e)
        Address: TP-Link_e4:b4:6e (14:eb:b6:e4:b4:6e)
        .... ..0. .... .... .... .... = LG bit: Globally unique address (factory default)
        .... ...0 .... .... .... .... = IG bit: Individual address (unicast)
    Source: Microsof_00:16:0e (00:15:5d:00:16:0e)
        Address: Microsof_00:16:0e (00:15:5d:00:16:0e)
        .... ..0. .... .... .... .... = LG bit: Globally unique address (factory default)
        .... ...0 .... .... .... .... = IG bit: Individual address (unicast)
    Type: IPv4 (0x0800)
Internet Protocol Version 4, Src: 192.168.0.122, Dst: 8.8.8.8
    0100 .... = Version: 4
    .... 0101 = Header Length: 20 bytes (5)
    Differentiated Services Field: 0x00 (DSCP: CS0, ECN: Not-ECT)
        0000 00.. = Differentiated Services Codepoint: Default (0)
        .... ..00 = Explicit Congestion Notification: Not ECN-Capable Transport (0)
    Total Length: 60
    Identification: 0xccb6 (52406)
    Flags: 0x00
        0... .... = Reserved bit: Not set
        .0.. .... = Don't fragment: Not set
        ..0. .... = More fragments: Not set
    ...0 0000 0000 0000 = Fragment Offset: 0
    Time to Live: 2
        [Expert Info (Note/Sequence): "Time To Live" only 2]
            ["Time To Live" only 2]
            [Severity level: Note]
            [Group: Sequence]
    Protocol: ICMP (1)
    Header Checksum: 0x0000 [validation disabled]
    [Header checksum status: Unverified]
    Source Address: 192.168.0.122
    Destination Address: 8.8.8.8
Internet Control Message Protocol
    Type: 8 (Echo (ping) request)
    Code: 0
    Checksum: 0xf7f1 [correct]
    [Checksum Status: Good]
    Identifier (BE): 14 (0x000e)
    Identifier (LE): 3584 (0x0e00)
    Sequence Number (BE): 0 (0x0000)
    Sequence Number (LE): 0 (0x0000)
    [No response seen]
        [Expert Info (Warning/Sequence): No response seen to ICMP request]
            [No response seen to ICMP request]
            [Severity level: Warning]
            [Group: Sequence]
    Data (32 bytes)
        Data: 0000000000000000000000000000000000000000000000000000000000000000
        [Length: 32]

Response

Frame 52: 102 bytes on wire (816 bits), 102 bytes captured (816 bits) on interface \Device\NPF_{AF47E508-7C19-4E4B-8832-9499515FEC65}, id 0
    Interface id: 0 (\Device\NPF_{AF47E508-7C19-4E4B-8832-9499515FEC65})
        Interface name: \Device\NPF_{AF47E508-7C19-4E4B-8832-9499515FEC65}
        Interface description: Ethernet
    Encapsulation type: Ethernet (1)
    Arrival Time: Aug  2, 2022 14:01:41.084832000 GMT Summer Time
    [Time shift for this packet: 0.000000000 seconds]
    Epoch Time: 1659445301.084832000 seconds
    [Time delta from previous captured frame: 0.001900000 seconds]
    [Time delta from previous displayed frame: 0.001900000 seconds]
    [Time since reference or first frame: 4.795358000 seconds]
    Frame Number: 52
    Frame Length: 102 bytes (816 bits)
    Capture Length: 102 bytes (816 bits)
    [Frame is marked: False]
    [Frame is ignored: False]
    [Protocols in frame: eth:ethertype:ip:icmp:ip:icmp:data]
    [Coloring Rule Name: ICMP errors]
    [Coloring Rule String: icmp.type eq 3 || icmp.type eq 4 || icmp.type eq 5 || icmp.type eq 11 || icmpv6.type eq 1 || icmpv6.type eq 2 || icmpv6.type eq 3 || icmpv6.type eq 4]
Ethernet II, Src: TP-Link_e4:b4:6e (14:eb:b6:e4:b4:6e), Dst: Microsof_00:16:0e (00:15:5d:00:16:0e)
    Destination: Microsof_00:16:0e (00:15:5d:00:16:0e)
        Address: Microsof_00:16:0e (00:15:5d:00:16:0e)
        .... ..0. .... .... .... .... = LG bit: Globally unique address (factory default)
        .... ...0 .... .... .... .... = IG bit: Individual address (unicast)
    Source: TP-Link_e4:b4:6e (14:eb:b6:e4:b4:6e)
        Address: TP-Link_e4:b4:6e (14:eb:b6:e4:b4:6e)
        .... ..0. .... .... .... .... = LG bit: Globally unique address (factory default)
        .... ...0 .... .... .... .... = IG bit: Individual address (unicast)
    Type: IPv4 (0x0800)
Internet Protocol Version 4, Src: 192.168.1.1, Dst: 192.168.0.122
    0100 .... = Version: 4
    .... 0101 = Header Length: 20 bytes (5)
    Differentiated Services Field: 0xc0 (DSCP: CS6, ECN: Not-ECT)
        1100 00.. = Differentiated Services Codepoint: Class Selector 6 (48)
        .... ..00 = Explicit Congestion Notification: Not ECN-Capable Transport (0)
    Total Length: 88
    Identification: 0x8494 (33940)
    Flags: 0x00
        0... .... = Reserved bit: Not set
        .0.. .... = Don't fragment: Not set
        ..0. .... = More fragments: Not set
    ...0 0000 0000 0000 = Fragment Offset: 0
    Time to Live: 63
    Protocol: ICMP (1)
    Header Checksum: 0x7385 [validation disabled]
    [Header checksum status: Unverified]
    Source Address: 192.168.1.1
    Destination Address: 192.168.0.122
Internet Control Message Protocol
    Type: 11 (Time-to-live exceeded)
    Code: 0 (Time to live exceeded in transit)
    Checksum: 0xf4ff [correct]
    [Checksum Status: Good]
    Unused: 00000000
    Internet Protocol Version 4, Src: 192.168.0.122, Dst: 8.8.8.8
        0100 .... = Version: 4
        .... 0101 = Header Length: 20 bytes (5)
        Differentiated Services Field: 0x00 (DSCP: CS0, ECN: Not-ECT)
            0000 00.. = Differentiated Services Codepoint: Default (0)
            .... ..00 = Explicit Congestion Notification: Not ECN-Capable Transport (0)
        Total Length: 60
        Identification: 0xccb6 (52406)
        Flags: 0x00
            0... .... = Reserved bit: Not set
            .0.. .... = Don't fragment: Not set
            ..0. .... = More fragments: Not set
        ...0 0000 0000 0000 = Fragment Offset: 0
        Time to Live: 1
            [Expert Info (Note/Sequence): "Time To Live" only 1]
                ["Time To Live" only 1]
                [Severity level: Note]
                [Group: Sequence]
        Protocol: ICMP (1)
        Header Checksum: 0x1bd9 [validation disabled]
        [Header checksum status: Unverified]
        Source Address: 192.168.0.122
        Destination Address: 8.8.8.8
    Internet Control Message Protocol
        Type: 8 (Echo (ping) request)
        Code: 0
        Checksum: 0xf7f1 [unverified] [in ICMP error packet]
        [Checksum Status: Unverified]
        Identifier (BE): 14 (0x000e)
        Identifier (LE): 3584 (0x0e00)
        Sequence Number (BE): 0 (0x0000)
        Sequence Number (LE): 0 (0x0000)
        Data (32 bytes)
            Data: 0000000000000000000000000000000000000000000000000000000000000000
            [Length: 32]

Please let me know if there is anything I can help with in resolving this.

@dotnet-issue-labeler

I couldn't figure out the best area label to add to this issue. If you have write-permissions please help me learn by adding exactly one area label.

@ghost ghost added the untriaged New issue has not been triaged by the area owner label Aug 2, 2022
@IainStevenson
Author

I would label this as 'area-System.Net' and 'os-linux', but I don't have write permissions.

@filipnavara filipnavara added area-System.Net os-linux Linux OS (any supported distro) labels Aug 2, 2022

ghost commented Aug 2, 2022

Tagging subscribers to this area: @dotnet/ncl
See info in area-owners.md if you want to be subscribed.

Issue Details

Author: IainStevenson
Assignees: -
Labels:

area-System.Net, os-linux, untriaged

Milestone: -

Member

filipnavara commented Aug 2, 2022

There are two implementations of Ping on Linux which are switched depending on the permissions. #61592 addressed the issue for the one that uses raw sockets. That one is always used on macOS but on Linux it can be used only if you have permission to use raw sockets. You can try to run the program under sudo to see how it behaves (or grant cap_net_raw capability using setcap).

The other implementation spawns the ping utility as a separate process and parses the output. That limits what errors it can report back. I believe #65312 should have addressed the TTL case but apparently it doesn't seem to work in your environment. (cc @rzikm)
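
To see which of the two paths a given environment is likely to take, one option is to try opening a raw ICMP socket directly. This is only a minimal sketch under stated assumptions: `CanOpenRawIcmpSocket` is a hypothetical helper, not a runtime API, and it assumes that a `SocketException` on creation means the process lacks the raw-socket privilege. The runtime's own internal check may differ.

```csharp
using System;
using System.Net.Sockets;

// Hypothetical helper (not part of System.Net.Ping): returns true when this
// process is allowed to create a raw ICMPv4 socket.
static bool CanOpenRawIcmpSocket()
{
    try
    {
        // Creating the socket is enough; nothing is sent.
        using var socket = new Socket(AddressFamily.InterNetwork, SocketType.Raw, ProtocolType.Icmp);
        return true;
    }
    catch (SocketException)
    {
        // Assumption: failure here means the privilege (root or cap_net_raw) is missing.
        return false;
    }
}

Console.WriteLine($"Raw ICMP socket available: {CanOpenRawIcmpSocket()}");
```

Running this inside the container, once as the default user and once with elevated privileges, gives a rough indication of which implementation applies.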

Author

IainStevenson commented Aug 2, 2022

@filipnavara thanks for the note. I cloned the internal RawSocketPermissions class into my code and used it to confirm that the process has permission to use raw sockets in my environment.

My container log shows this at startup:

netmon.cli.AppHost[0] Can use Sockets On this host... True

I had already read the source code for System.Net.NetworkInformation and determined that was an option deeper down the call stack, which then directly led me to #61465 again, which is why I had hoped it was fixed.

I do hope I am doing something wrong, because that is easier to fix :) but I don't think so at this point. I have been looking at this for the best part of two days.

I tried your suggestions anyway but got nowhere; I'm too much of a Linux newbie there. I think it does not matter, as that affects the branch of code that is not being used, i.e. the call out to the ping utility. My code is definitely entitled to use the raw sockets branch of the code.

Member

wfurt commented Aug 2, 2022

It may be worth setting up a complete runnable repro with a container, @IainStevenson. I'm not sure we would be able to reproduce it otherwise.

Author

IainStevenson commented Aug 2, 2022

@wfurt thanks, I was just working on that very thing to help out. The project I am working on right now is here:

https://github.com/IainStevenson/network.monitor

If you clone it, check out the devlop branch, load it into Visual Studio 2022, and run it with Docker as the startup, it will spin up a Linux container in Docker for Windows and start it.

It should take about 10-20 seconds tracing the route to 8.8.8.8 and then set off pinging it every 10 seconds for quite some time.

The logic it runs is as follows.

It performs a traceroute exercise to 8.8.8.8 by walking through a maximum of 30 hops (TTL values) to find each host node on the route to that address.
It then proceeds to ping all of those addresses independently every 10 seconds, storing the results in files on the container.
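
The traceroute step described here can be sketched as a loop over TTL values. This is a rough illustration, not the code from the repo: `TraceRoute` and `PingProbe` are hypothetical names, the probe is passed in as a delegate so the loop can be exercised without a network, and the 5000 ms timeout and 32-byte payload are copied from the repro at the top of this issue.

```csharp
using System;
using System.Collections.Generic;
using System.Net;
using System.Net.NetworkInformation;

// Hypothetical helper: probes TTL 1..maxHops and records who answered at each hop.
// Stops early once the destination itself replies with Success.
static List<(int Ttl, IPStatus Status, IPAddress Address)> TraceRoute(
    Func<int, (IPStatus Status, IPAddress Address)> probe, int maxHops = 30)
{
    var hops = new List<(int Ttl, IPStatus Status, IPAddress Address)>();
    for (int ttl = 1; ttl <= maxHops; ttl++)
    {
        var (status, address) = probe(ttl);
        hops.Add((ttl, status, address));
        if (status == IPStatus.Success) break; // destination reached
    }
    return hops;
}

// A probe backed by the real Ping class, using the same call as the repro.
static (IPStatus Status, IPAddress Address) PingProbe(string target, int ttl)
{
    using var ping = new Ping();
    PingReply reply = ping.Send(target, 5000, new byte[32], new PingOptions(ttl, false));
    return (reply.Status, reply.Address);
}

// Only touches the network when a target is passed, e.g. `dotnet run -- 8.8.8.8`.
if (args.Length > 0)
{
    foreach (var hop in TraceRoute(ttl => PingProbe(args[0], ttl)))
        Console.WriteLine($"{hop.Ttl,2}  {hop.Status,-12}  {hop.Address}");
}
```

With the behaviour reported in this issue, intermediate hops would be expected to show TtlExpired with the router's address on Windows, but TimedOut with 0.0.0.0 in the Linux container.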

On Windows (by starting netmon.cli) it does find each address that responds and then pings all of them every 10 seconds. In my case that's about a dozen addresses.

On Linux, via the Docker startup, it only finds 8.8.8.8 and pings that every 10 seconds.

The data is output to the \usr\share\netmon folder on the container, where there will be a time-stamped JSON file for every ping per host and a summary text file per address pinged.

Please give me a shout if you have any problems with that repo.

BTW: I have a Mac mini, so I may get a chance to see if it's working on there, but I know less about that than Linux LOL; it's too new to me.

Author

IainStevenson commented Aug 2, 2022

OK, that was quite easy to set up on my Mac, and it's behaving the same there too. That is Docker Desktop for Mac with Linux containers, so it's a Linux-specific problem. I will try to run the code natively on the Mac next and see.

Author

IainStevenson commented Aug 2, 2022

Sorry folks, emergency stop.

In that repo, the netmon.cli project from network.monitor.sln, which is executed via the docker compose startup, is using .NET 6, which as I understand it missed the boat for the last change anyway.

Also in that repo is a small PingFixTest project; building and running its Dockerfile gets you the simple code I posted originally. I haven't upgraded the whole repo to the .NET 7 preview yet because of this bug/issue. Let me know if you want that done and I will put in the work for the whole repo.

@IainStevenson
Author

OK, I built a Docker image of pingfixtest and ran it as a Linux image in Docker for Mac under .NET 7, and it fails as described.
I then ran the code natively via dotnet build pingfixtest.csproj, followed by a hop into the bin folder and dotnet pingfixtest.dll, where it worked as expected.
So the same code changes behaviour on a different OS.

In all my environments: on Linux it is a problem, on the Mac it's fine, and on Windows it's fine.

So, in terms of reproduction steps, what I did is this:

git clone repo

cd network.monitor/src/pingfixtest
dotnet build pingfixtest.csproj
docker build -t pingfixtest .
docker run -t pingfixtest

See that it fails

cd bin\debug\net7.0
dotnet pingfixtest.dll

see that it works.

I just repeated these instructions on Windows and Mac, both with Linux containers in Docker for each OS.

@IainStevenson
Author

I am going to go out on a limb here and suggest this may be an endianness problem. Windows works, Mac works, Linux fails.

The code paths in PingRawSockets.cs will result in a TimedOut response in the calling code if the returned message Identifier is not the same as the one configured for the socket.

This fits the profile of the observed problem: the message returns (visible in Wireshark) but is 'ignored', which results in a default timeout.

I am wondering if there is a way to test it with your working debug versions to see what is happening there. I can't get my environment to build in debug or I would have tested my theory already. I am also wondering how to do this without a debug environment ON a Linux development host, which I don't have.

I can't find enough of the source code to see whether MemoryMarshal.Read takes care of endianness or not. This is the only OS difference I can think of at this point, so I thought I'd share.

@IainStevenson
Author

Well, I found my Linux container is little-endian (same as Windows), so maybe not.

@filipnavara
Member

I'm trying to reproduce it under WSL. I do get similar symptoms but I still need to confirm that it's ultimately the same cause.

@IainStevenson
Author

@filipnavara I was wondering that myself; the common denominators of the problem so far are Docker and Linux.

I will also work on a WSL environment here to see if I can eliminate Docker from the environment. I am certainly learning lots with this one :)

@filipnavara
Member

What happens on my WSL is that the ReceiveFrom call doesn't even get the ICMP timeout replies. It just times out.

@IainStevenson
Author

While I was downloading a WSL distro I ran the pingfixtest code through a Windows container on Docker and it worked as expected.

I modified my code to loop from hop 1 to 30 and exit on success.
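
For reference, a hop-by-hop loop of the kind described above might look like the following. This is only a sketch, not the author's actual code: the `TraceRoute` and `FormatHop` helper names are hypothetical, and the 5000 ms timeout and 32-byte buffer simply mirror the original snippet. The output format follows the listings in this thread.

```csharp
using System;
using System.Net;
using System.Net.NetworkInformation;

// Pass a target on the command line to run the trace, e.g. `dotnet run -- 8.8.8.8`.
if (args.Length > 0)
{
    TraceRoute(args[0], maxHops: 30, timeoutMs: 5000);
}

// Hypothetical helper: formats one result line like the listings in this thread.
static string FormatHop(int hop, IPStatus status, IPAddress address) =>
    $"Hop: {hop} Status: {status} Address: {address}";

// Hypothetical helper: sends one echo request per TTL value and stops on Success.
static void TraceRoute(string target, int maxHops, int timeoutMs)
{
    using var ping = new Ping();
    var buffer = new byte[32];
    for (int ttl = 1; ttl <= maxHops; ttl++)
    {
        PingReply reply = ping.Send(target, timeoutMs, buffer, new PingOptions(ttl, false));
        Console.WriteLine(FormatHop(ttl, reply.Status, reply.Address));
        if (reply.Status == IPStatus.Success)
        {
            break; // destination reached
        }
    }
}
```

On a platform where intermediate routers' replies are surfaced, each hop below the destination would be expected to report `TtlExpired` with the router's address rather than `TimedOut`.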

linux container

Hop: 1 Status: TimedOut Address: 0.0.0.0
Hop: 2 Status: TimedOut Address: 0.0.0.0
Hop: 3 Status: TimedOut Address: 0.0.0.0
Hop: 4 Status: TimedOut Address: 0.0.0.0
Hop: 5 Status: TimedOut Address: 0.0.0.0
Hop: 6 Status: TimedOut Address: 0.0.0.0
Hop: 7 Status: TimedOut Address: 0.0.0.0
Hop: 8 Status: TimedOut Address: 0.0.0.0
Hop: 9 Status: TimedOut Address: 0.0.0.0
Hop: 10 Status: TimedOut Address: 0.0.0.0
Hop: 11 Status: TimedOut Address: 0.0.0.0
Hop: 12 Status: TimedOut Address: 0.0.0.0
Hop: 13 Status: TimedOut Address: 0.0.0.0
Hop: 14 Status: TimedOut Address: 0.0.0.0
Hop: 15 Status: TimedOut Address: 0.0.0.0
Hop: 16 Status: Success Address: 8.8.8.8

windows container

Hop: 1 Status: TtlExpired Address: 172.22.128.1 
Hop: 2 Status: TtlExpired Address: 192.168.0.1 
Hop: 3 Status: TtlExpired Address: 192.168.1.1 
Hop: 4 Status: TimedOut Address: 8.8.8.8 
Hop: 5 Status: TtlExpired Address: 10.124.71.1 
Hop: 6 Status: TtlExpired Address: 10.124.222.84 
Hop: 7 Status: TtlExpired Address: 10.124.72.165 
Hop: 8 Status: TimedOut Address: 8.8.8.8 
Hop: 9 Status: TtlExpired Address: 10.247.89.1 
Hop: 10 Status: TtlExpired Address: 10.124.72.154 
Hop: 11 Status: TtlExpired Address: 87.237.20.118 
Hop: 12 Status: TtlExpired Address: 87.237.20.67 
Hop: 13 Status: TtlExpired Address: 72.14.242.70 
Hop: 14 Status: TtlExpired Address: 74.125.242.97 
Hop: 15 Status: TtlExpired Address: 142.251.52.145 
Hop: 16 Status: Success Address: 8.8.8.8

So that agrees with your results. Something about the Linux environment, which dives down into Ping.RawSockets.cs, has a problem with the socket calls.

@IainStevenson
Author

IainStevenson commented Aug 3, 2022

@filipnavara I guess the next question is: Are you getting a reply showing up in wireshark that is getting lost, or is there genuinely nothing coming back? I am seeing replies coming back in. I can post my saved session for you?

@filipnavara
Member

I do see the requests/replies on the outer host machine:

image

That doesn't necessarily mean the guest machine sees the same.

@IainStevenson
Author

IainStevenson commented Aug 3, 2022

@filipnavara I think it may be a Linux default firewall issue.

I read here that this may help.

iptables -A INPUT -p icmp -j ACCEPT

I am going to switch back to Linux containers and see whether that is the problem and whether this helps.

@IainStevenson
Author

IainStevenson commented Aug 3, 2022

@filipnavara Ah how the thick plottens!

I eventually checked the firewall rules by adding docker run arguments that allow me to do so.

{
  "profiles": {
    "PingFixTest": {
      "commandName": "Project"
    },
    "Docker": {
      "commandName": "Docker",
      "DockerfileRunArguments": "--cap-add=NET_ADMIN"
    }
  }
}

And installed iptables and sudo

root@34d384e06918:/app# sudo iptables -L

only to find it's already accepting everything.

root@34d384e06918:/app# sudo iptables -L
Chain INPUT (policy ACCEPT)
target     prot opt source               destination

Chain FORWARD (policy ACCEPT)
target     prot opt source               destination

Chain OUTPUT (policy ACCEPT)
target     prot opt source               destination

So I dug deeper and installed tcpdump and this was the output on the container console whilst the code ran in another console:

root@34d384e06918:/app# tcpdump -i eth0 proto ICMP
tcpdump: verbose output suppressed, use -v[v]... for full protocol decode
listening on eth0, link-type EN10MB (Ethernet), snapshot length 262144 bytes
12:20:40.409381 IP 34d384e06918 > dns.google: ICMP echo request, id 22547, seq 0, length 40
12:20:40.409411 IP 172.17.0.1 > 34d384e06918: ICMP time exceeded in-transit, length 68
12:20:41.508426 IP 34d384e06918 > dns.google: ICMP echo request, id 3706, seq 0, length 40
12:20:41.514649 IP 192.168.0.1 > 34d384e06918: ICMP time exceeded in-transit, length 68
12:20:42.528587 IP 34d384e06918 > dns.google: ICMP echo request, id 9030, seq 0, length 40
12:20:42.538929 IP 192.168.1.1 > 34d384e06918: ICMP time exceeded in-transit, length 68
12:20:43.568730 IP 34d384e06918 > dns.google: ICMP echo request, id 20892, seq 0, length 40
12:20:44.608620 IP 34d384e06918 > dns.google: ICMP echo request, id 56844, seq 0, length 40
12:20:44.652495 IP 10.124.71.1 > 34d384e06918: ICMP time exceeded in-transit, length 36
12:20:45.648616 IP 34d384e06918 > dns.google: ICMP echo request, id 64090, seq 0, length 40
12:20:45.693189 IP 10.124.222.84 > 34d384e06918: ICMP time exceeded in-transit, length 36
12:20:46.688372 IP 34d384e06918 > dns.google: ICMP echo request, id 44493, seq 0, length 40
12:20:46.736771 IP 10.124.72.165 > 34d384e06918: ICMP time exceeded in-transit, length 36
12:20:47.728408 IP 34d384e06918 > dns.google: ICMP echo request, id 36163, seq 0, length 40
12:20:48.768785 IP 34d384e06918 > dns.google: ICMP echo request, id 10149, seq 0, length 40
12:20:48.814220 IP 10.247.89.1 > 34d384e06918: ICMP time exceeded in-transit, length 68
12:20:49.808341 IP 34d384e06918 > dns.google: ICMP echo request, id 6407, seq 0, length 40
12:20:49.855133 IP 10.124.72.154 > 34d384e06918: ICMP time exceeded in-transit, length 36
12:20:50.848499 IP 34d384e06918 > dns.google: ICMP echo request, id 51050, seq 0, length 40
12:20:50.893665 IP 87.237.20.118 > 34d384e06918: ICMP time exceeded in-transit, length 36
12:20:51.888365 IP 34d384e06918 > dns.google: ICMP echo request, id 25129, seq 0, length 40
12:20:51.935730 IP 87.237.20.67 > 34d384e06918: ICMP time exceeded in-transit, length 36
12:20:52.928251 IP 34d384e06918 > dns.google: ICMP echo request, id 28026, seq 0, length 40
12:20:52.975824 IP 72.14.242.70 > 34d384e06918: ICMP time exceeded in-transit, length 36
12:20:53.968375 IP 34d384e06918 > dns.google: ICMP echo request, id 52028, seq 0, length 40
12:20:54.015957 IP 74.125.242.97 > 34d384e06918: ICMP time exceeded in-transit, length 76
12:20:55.008497 IP 34d384e06918 > dns.google: ICMP echo request, id 53384, seq 0, length 40
12:20:55.054848 IP 142.251.52.145 > 34d384e06918: ICMP time exceeded in-transit, length 68
12:20:56.048567 IP 34d384e06918 > dns.google: ICMP echo request, id 24754, seq 0, length 40
12:20:56.095639 IP dns.google > 34d384e06918: ICMP echo reply, id 24754, seq 0, length 40

^C
30 packets captured
30 packets received by filter
0 packets dropped by kernel

So the container is seeing the packets arriving. I can only assume that 'ICMP time exceeded in-transit' equals TTLExpired
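
The assumption above matches the ICMP definitions: "time exceeded in-transit" is ICMP type 11, code 0, which is the condition `IPStatus.TtlExpired` describes. A rough sketch of such a mapping is shown below. `MapIcmpToStatus` is a hypothetical helper written for illustration; it is not the runtime's implementation and covers only a few common type/code pairs.

```csharp
using System;
using System.Net.NetworkInformation;

Console.WriteLine(MapIcmpToStatus(11, 0)); // prints "TtlExpired"
Console.WriteLine(MapIcmpToStatus(0, 0));  // prints "Success"

// Hypothetical helper: maps an ICMPv4 (type, code) pair to the closest IPStatus value.
static IPStatus MapIcmpToStatus(byte type, byte code) => (type, code) switch
{
    (0, _) => IPStatus.Success,                        // echo reply
    (3, 0) => IPStatus.DestinationNetworkUnreachable,
    (3, 1) => IPStatus.DestinationHostUnreachable,
    (3, 2) => IPStatus.DestinationProtocolUnreachable,
    (3, 3) => IPStatus.DestinationPortUnreachable,
    (11, 0) => IPStatus.TtlExpired,                    // time exceeded in transit
    (11, 1) => IPStatus.TtlReassemblyTimeExceeded,     // fragment reassembly time exceeded
    _ => IPStatus.Unknown,
};
```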

@IainStevenson
Author

IainStevenson commented Aug 3, 2022

@filipnavara OOO errr. this does not look good. :(

I stepped up the detail and can see a load of bad cksum reports on the incoming data:

root@34d384e06918:/app# tcpdump -i eth0 proto ICMP -ttnnvvS
tcpdump: listening on eth0, link-type EN10MB (Ethernet), snapshot length 262144 bytes
1659529953.110875 IP (tos 0x0, ttl 1, id 57119, offset 0, flags [none], proto ICMP (1), length 60)
    172.17.0.2 > 8.8.8.8: ICMP echo request, id 1038, seq 0, length 40
1659529953.110913 IP (tos 0xc0, ttl 64, id 43588, offset 0, flags [none], proto ICMP (1), length 88)
    172.17.0.1 > 172.17.0.2: ICMP time exceeded in-transit, length 68
        IP (tos 0x0, ttl 1, id 57119, offset 0, flags [none], proto ICMP (1), length 60)
    172.17.0.2 > 8.8.8.8: ICMP echo request, id 1038, seq 0, length 40
1659529954.144442 IP (tos 0x0, ttl 2, id 20672, offset 0, flags [none], proto ICMP (1), length 60)
    172.17.0.2 > 8.8.8.8: ICMP echo request, id 35660, seq 0, length 40
1659529954.150212 IP (tos 0x0, ttl 37, id 0, offset 0, flags [none], proto ICMP (1), length 88)
    192.168.0.1 > 172.17.0.2: ICMP time exceeded in-transit, length 68
        IP (tos 0x0, ttl 1, id 6951, offset 0, flags [none], proto ICMP (1), length 60, bad cksum 2300 (->e277)!)
    172.17.0.2 > 8.8.8.8: ICMP echo request, id 35660, seq 0, length 40
1659529955.166324 IP (tos 0x0, ttl 3, id 28857, offset 0, flags [none], proto ICMP (1), length 60)
    172.17.0.2 > 8.8.8.8: ICMP echo request, id 63133, seq 0, length 40
1659529955.174256 IP (tos 0x0, ttl 37, id 0, offset 0, flags [none], proto ICMP (1), length 88)
    192.168.1.1 > 172.17.0.2: ICMP time exceeded in-transit, length 68
        IP (tos 0x0, ttl 1, id 6952, offset 0, flags [none], proto ICMP (1), length 60, bad cksum 22ff (->e276)!)
    172.17.0.2 > 8.8.8.8: ICMP echo request, id 63133, seq 0, length 40
1659529956.206211 IP (tos 0x0, ttl 4, id 2223, offset 0, flags [none], proto ICMP (1), length 60)
    172.17.0.2 > 8.8.8.8: ICMP echo request, id 45670, seq 0, length 40
1659529957.246170 IP (tos 0x0, ttl 5, id 35906, offset 0, flags [none], proto ICMP (1), length 60)
    172.17.0.2 > 8.8.8.8: ICMP echo request, id 26926, seq 0, length 40
1659529957.294076 IP (tos 0x0, ttl 37, id 0, offset 0, flags [none], proto ICMP (1), length 56)
    10.124.71.1 > 172.17.0.2: ICMP time exceeded in-transit, length 36
        IP (tos 0x0, ttl 1, id 6954, offset 0, flags [none], proto ICMP (1), length 60, bad cksum 22fd (->e274)!)
    172.17.0.2 > 8.8.8.8: ICMP echo request, id 26926, seq 0, length 40
1659529958.286037 IP (tos 0x0, ttl 6, id 24851, offset 0, flags [none], proto ICMP (1), length 60)
    172.17.0.2 > 8.8.8.8: ICMP echo request, id 18721, seq 0, length 40
1659529958.331288 IP (tos 0x0, ttl 37, id 0, offset 0, flags [none], proto ICMP (1), length 56)
    10.124.222.84 > 172.17.0.2: ICMP time exceeded in-transit, length 36
        IP (tos 0x0, ttl 1, id 6955, offset 0, flags [none], proto ICMP (1), length 60, bad cksum 22fc (->e273)!)
    172.17.0.2 > 8.8.8.8: ICMP echo request, id 18721, seq 0, length 40
1659529959.325876 IP (tos 0x0, ttl 7, id 28431, offset 0, flags [none], proto ICMP (1), length 60)
    172.17.0.2 > 8.8.8.8: ICMP echo request, id 54848, seq 0, length 40
1659529959.371854 IP (tos 0x0, ttl 37, id 0, offset 0, flags [none], proto ICMP (1), length 56)
    10.124.72.165 > 172.17.0.2: ICMP time exceeded in-transit, length 36
        IP (tos 0x0, ttl 1, id 6956, offset 0, flags [none], proto ICMP (1), length 60, bad cksum 22fb (->e272)!)
    172.17.0.2 > 8.8.8.8: ICMP echo request, id 54848, seq 0, length 40
1659529960.366062 IP (tos 0x0, ttl 8, id 36106, offset 0, flags [none], proto ICMP (1), length 60)
    172.17.0.2 > 8.8.8.8: ICMP echo request, id 36884, seq 0, length 40
1659529961.406334 IP (tos 0x0, ttl 9, id 13449, offset 0, flags [none], proto ICMP (1), length 60)
    172.17.0.2 > 8.8.8.8: ICMP echo request, id 43936, seq 0, length 40
1659529961.452158 IP (tos 0x0, ttl 37, id 0, offset 0, flags [none], proto ICMP (1), length 88)
    10.247.89.1 > 172.17.0.2: ICMP time exceeded in-transit, length 68
        IP (tos 0x0, ttl 1, id 6958, offset 0, flags [none], proto ICMP (1), length 60, bad cksum 22f9 (->e270)!)
    172.17.0.2 > 8.8.8.8: ICMP echo request, id 43936, seq 0, length 40
1659529962.446125 IP (tos 0x0, ttl 10, id 32178, offset 0, flags [none], proto ICMP (1), length 60)
    172.17.0.2 > 8.8.8.8: ICMP echo request, id 7944, seq 0, length 40
1659529962.500890 IP (tos 0x0, ttl 37, id 0, offset 0, flags [none], proto ICMP (1), length 56)
    10.124.72.154 > 172.17.0.2: ICMP time exceeded in-transit, length 36
        IP (tos 0x0, ttl 1, id 6959, offset 0, flags [none], proto ICMP (1), length 60, bad cksum 22f8 (->e26f)!)
    172.17.0.2 > 8.8.8.8: ICMP echo request, id 7944, seq 0, length 40
1659529963.486268 IP (tos 0x0, ttl 11, id 9691, offset 0, flags [none], proto ICMP (1), length 60)
    172.17.0.2 > 8.8.8.8: ICMP echo request, id 12497, seq 0, length 40
1659529963.534018 IP (tos 0x0, ttl 37, id 0, offset 0, flags [none], proto ICMP (1), length 56)
    87.237.20.118 > 172.17.0.2: ICMP time exceeded in-transit, length 36
        IP (tos 0x0, ttl 1, id 6960, offset 0, flags [none], proto ICMP (1), length 60, bad cksum 22f7 (->e26e)!)
    172.17.0.2 > 8.8.8.8: ICMP echo request, id 12497, seq 0, length 40
1659529964.526075 IP (tos 0x0, ttl 12, id 11676, offset 0, flags [none], proto ICMP (1), length 60)
    172.17.0.2 > 8.8.8.8: ICMP echo request, id 53484, seq 0, length 40
1659529964.576012 IP (tos 0x0, ttl 37, id 0, offset 0, flags [none], proto ICMP (1), length 56)
    87.237.20.67 > 172.17.0.2: ICMP time exceeded in-transit, length 36
        IP (tos 0x0, ttl 1, id 6961, offset 0, flags [none], proto ICMP (1), length 60, bad cksum 22f6 (->e26d)!)
    172.17.0.2 > 8.8.8.8: ICMP echo request, id 53484, seq 0, length 40
1659529965.566055 IP (tos 0x0, ttl 13, id 45405, offset 0, flags [none], proto ICMP (1), length 60)
    172.17.0.2 > 8.8.8.8: ICMP echo request, id 29655, seq 0, length 40
1659529965.613593 IP (tos 0x0, ttl 37, id 0, offset 0, flags [none], proto ICMP (1), length 56)
    72.14.242.70 > 172.17.0.2: ICMP time exceeded in-transit, length 36
        IP (tos 0x0, ttl 1, id 6962, offset 0, flags [none], proto ICMP (1), length 60, bad cksum 22f5 (->e26c)!)
    172.17.0.2 > 8.8.8.8: ICMP echo request, id 29655, seq 0, length 40
1659529966.606655 IP (tos 0x0, ttl 14, id 2962, offset 0, flags [none], proto ICMP (1), length 60)
    172.17.0.2 > 8.8.8.8: ICMP echo request, id 48420, seq 0, length 40
1659529966.653240 IP (tos 0x0, ttl 37, id 0, offset 0, flags [none], proto ICMP (1), length 96)
    74.125.242.97 > 172.17.0.2: ICMP time exceeded in-transit, length 76
        IP (tos 0x80, ttl 1, id 6963, offset 0, flags [none], proto ICMP (1), length 60, bad cksum 2274 (->e1eb)!)
    172.17.0.2 > 8.8.8.8: ICMP echo request, id 48420, seq 0, length 40
1659529967.646273 IP (tos 0x0, ttl 15, id 21511, offset 0, flags [none], proto ICMP (1), length 60)
    172.17.0.2 > 8.8.8.8: ICMP echo request, id 27097, seq 0, length 40
1659529967.691051 IP (tos 0x0, ttl 37, id 0, offset 0, flags [none], proto ICMP (1), length 88)
    142.251.52.145 > 172.17.0.2: ICMP time exceeded in-transit, length 68
        IP (tos 0x80, ttl 1, id 6964, offset 0, flags [none], proto ICMP (1), length 60, bad cksum 2273 (->e1ea)!)
    172.17.0.2 > 8.8.8.8: ICMP echo request, id 27097, seq 0, length 40
1659529968.685978 IP (tos 0x0, ttl 16, id 51207, offset 0, flags [none], proto ICMP (1), length 60)
    172.17.0.2 > 8.8.8.8: ICMP echo request, id 22417, seq 0, length 40
1659529968.732351 IP (tos 0x0, ttl 37, id 0, offset 0, flags [none], proto ICMP (1), length 60)
    8.8.8.8 > 172.17.0.2: ICMP echo reply, id 22417, seq 0, length 40
^C
30 packets captured
30 packets received by filter
0 packets dropped by kernel
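
A note on reading the output above: the "bad cksum" remarks refer to the copy of the original request's IPv4 header that each router quotes inside its Time Exceeded message, not to the outer packet. The sketch below rebuilds the first such quoted header from the fields tcpdump printed (id 6951, ttl 1, length 60, 172.17.0.2 > 8.8.8.8) and recomputes the checksum. It assumes a 20-byte header with no IP options, and `Ipv4HeaderChecksum` is a hypothetical helper name. A possible cause, which is an assumption and not something confirmed in this thread, is that the quoted header was rewritten in transit (for example by NAT) without its checksum being updated.

```csharp
using System;

// Quoted IPv4 header rebuilt from the tcpdump fields; bytes 10-11 hold the
// checksum as received (0x2300). Assumes no IP options (header length 20).
byte[] header = Convert.FromHexString("4500003c1b27000001012300ac11000208080808");

ushort stored = (ushort)((header[10] << 8) | header[11]);
ushort expected = Ipv4HeaderChecksum(header);
Console.WriteLine($"stored 0x{stored:x4}, expected 0x{expected:x4}");
// prints "stored 0x2300, expected 0xe277", matching tcpdump's "bad cksum 2300 (->e277)!"

// Hypothetical helper: one's-complement sum over the header, skipping the checksum field.
static ushort Ipv4HeaderChecksum(ReadOnlySpan<byte> h)
{
    uint sum = 0;
    for (int i = 0; i + 1 < h.Length; i += 2)
    {
        if (i == 10) continue; // the checksum field itself counts as zero
        sum += (uint)((h[i] << 8) | h[i + 1]);
    }
    while ((sum >> 16) != 0)
    {
        sum = (sum & 0xFFFF) + (sum >> 16); // fold carries back into the low 16 bits
    }
    return (ushort)~sum;
}
```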

@IainStevenson
Author

IainStevenson commented Aug 3, 2022

@filipnavara Well the linux Ping utility also gets those so that is not a factor.

root@34d384e06918:/app# ping -t 3 8.8.8.8
PING 8.8.8.8 (8.8.8.8) 56(84) bytes of data.
From 192.168.1.1 icmp_seq=1 Time to live exceeded
From 192.168.1.1 icmp_seq=2 Time to live exceeded
From 192.168.1.1 icmp_seq=3 Time to live exceeded
^C
--- 8.8.8.8 ping statistics ---
3 packets transmitted, 0 received, +3 errors, 100% packet loss, time 2003ms
13:12:24.499936 IP (tos 0x0, ttl 37, id 0, offset 0, flags [none], proto ICMP (1), length 112)
    192.168.1.1 > 34d384e06918: ICMP time exceeded in-transit, length 92
        IP (tos 0x0, ttl 1, id 7667, offset 0, flags [none], proto ICMP (1), length 84, bad cksum 201c (->df93)!)
    34d384e06918 > dns.google: ICMP echo request, id 3, seq 3, length 64
        0x0000:  4500 0070 0000 0000 2501 27d1 c0a8 0101  E..p....%.'.....
        0x0010:  ac11 0002 0b00 b477 0000 0000 4500 0054  .......w....E..T
        0x0020:  1df3 0000 0101 201c ac11 0002 0808 0808  ................
        0x0030:  0800 d4cd 0003 0003 3874 ea62 0000 0000  ........8t.b....
        0x0040:  3a82 0700 0000 0000 1011 1213 1415 1617  :...............
        0x0050:  1819 1a1b 1c1d 1e1f 2021 2223 2425 2627  .........!"#$%&'
        0x0060:  2829 2a2b 2c2d 2e2f 3031 3233 3435 3637  ()*+,-./01234567
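
The hex dump above can be decoded by hand to show what a raw-socket receiver would have to do with such a packet: skip the outer IPv4 header, read the ICMP type and code, then skip the quoted inner IPv4 header to reach the identifier and sequence number of the original echo request. The sketch below does that for the first 56 bytes of the dump. `ParseIcmpError` is a hypothetical helper for illustration only and is not the code in the runtime.

```csharp
using System;
using System.Buffers.Binary;
using System.Net;

// First 56 bytes of the dump: outer IPv4 header (20) + ICMP header (8)
// + quoted IPv4 header (20) + first 8 bytes of the original echo request.
byte[] packet = Convert.FromHexString(
    "4500007000000000250127d1c0a80101" +
    "ac1100020b00b4770000000045000054" +
    "1df300000101201cac11000208080808" +
    "0800d4cd00030003");

var parsed = ParseIcmpError(packet);
Console.WriteLine(
    $"from {parsed.Source}: type {parsed.Type} code {parsed.Code}, " +
    $"embedded echo id {parsed.EmbeddedId} seq {parsed.EmbeddedSeq}");
// prints "from 192.168.1.1: type 11 code 0, embedded echo id 3 seq 3"

// Hypothetical helper: extracts the fields needed to pair an ICMP error with its request.
static (IPAddress Source, byte Type, byte Code, ushort EmbeddedId, ushort EmbeddedSeq)
    ParseIcmpError(ReadOnlySpan<byte> ip)
{
    int outerHeaderLength = (ip[0] & 0x0F) * 4;          // IHL is in 32-bit words
    var source = new IPAddress(ip.Slice(12, 4));         // sender of the ICMP error
    ReadOnlySpan<byte> icmp = ip.Slice(outerHeaderLength);
    byte type = icmp[0];
    byte code = icmp[1];

    ReadOnlySpan<byte> inner = icmp.Slice(8);            // quoted original datagram
    int innerHeaderLength = (inner[0] & 0x0F) * 4;
    ReadOnlySpan<byte> innerIcmp = inner.Slice(innerHeaderLength);
    ushort id = BinaryPrimitives.ReadUInt16BigEndian(innerIcmp.Slice(4, 2));
    ushort seq = BinaryPrimitives.ReadUInt16BigEndian(innerIcmp.Slice(6, 2));
    return (source, type, code, id, seq);
}
```

The decoded values agree with the tcpdump summary lines above (192.168.1.1, time exceeded, id 3, seq 3). Note that the identifier lives in the quoted request, not in the ICMP error header itself, so a receiver matching on identifier has to look one level deeper for error messages than for echo replies.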

@karelz
Member

karelz commented Aug 9, 2022

Triage: @IainStevenson it seems you are the only person hitting it. Do you have an isolated repro (e.g. in containers) that you could share? I don't see how else we can move forward here ... unless we missed something in the thread.

@karelz karelz added the needs-author-action An issue or pull request that requires more info or actions from the author. label Aug 9, 2022
@ghost

ghost commented Aug 9, 2022

This issue has been marked needs-author-action and may be missing some important information.

@filipnavara
Member

@karelz I can hit it on WSL so it's reproducible.

I'm just not familiar enough with how raw sockets work on Linux though. I've only got as far as finding out that recvfrom doesn't get the TTL exceeded ICMP packets.

@wfurt
Member

wfurt commented Aug 9, 2022

Do we get nothing? (and I'm not sure I would trust WSL)

@filipnavara
Member

It's WSL2, and the regular ping command works. recvfrom gets the successful replies, just not the error ones (which come from a different IP, if that matters).

@IainStevenson
Author

@wfurt @karelz The repo posted here has a solution in it that is isolated and reproduces the problem reliably.

I apologise for the rather messy commentary above, so I will repeat the instructions here.

You can ignore most of my solution code and home in on the isolated repro project within it called PingFixTest.

It was written to prove that the earlier bug fix does not work in the Linux container context in the latest preview, even though the change is included.

The test project was written from the suggestion by the bug fixer on how to test it.

The following solution is set up to use .NET 7 preview 6.

On a Windows/Mac terminal:

git clone https://github.com/IainStevenson/network.monitor.git
cd network.monitor/src/pingfixtest
dotnet build pingfixtest.csproj
docker build -t pingfixtest .
docker run -t pingfixtest

If you run that on a Linux container in Docker on your development host you see this:

Hop: 1 Status: TimedOut Address: 0.0.0.0
Hop: 2 Status: TimedOut Address: 0.0.0.0
Hop: 3 Status: TimedOut Address: 0.0.0.0
Hop: 4 Status: TimedOut Address: 0.0.0.0
Hop: 5 Status: TimedOut Address: 0.0.0.0
Hop: 6 Status: TimedOut Address: 0.0.0.0
Hop: 7 Status: TimedOut Address: 0.0.0.0
Hop: 8 Status: TimedOut Address: 0.0.0.0
Hop: 9 Status: TimedOut Address: 0.0.0.0
Hop: 10 Status: TimedOut Address: 0.0.0.0
Hop: 11 Status: TimedOut Address: 0.0.0.0
Hop: 12 Status: TimedOut Address: 0.0.0.0
Hop: 13 Status: TimedOut Address: 0.0.0.0
Hop: 14 Status: TimedOut Address: 0.0.0.0
Hop: 15 Status: TimedOut Address: 0.0.0.0
Hop: 16 Status: Success Address: 8.8.8.8

If you run the same project code on a Windows container you see this:

Hop: 1 Status: TtlExpired Address: 172.22.128.1 
Hop: 2 Status: TtlExpired Address: 192.168.0.1 
Hop: 3 Status: TtlExpired Address: 192.168.1.1 
Hop: 4 Status: TimedOut Address: 8.8.8.8 
Hop: 5 Status: TtlExpired Address: 10.124.71.1 
Hop: 6 Status: TtlExpired Address: 10.124.222.84 
Hop: 7 Status: TtlExpired Address: 10.124.72.165 
Hop: 8 Status: TimedOut Address: 8.8.8.8 
Hop: 9 Status: TtlExpired Address: 10.247.89.1 
Hop: 10 Status: TtlExpired Address: 10.124.72.154 
Hop: 11 Status: TtlExpired Address: 87.237.20.118 
Hop: 12 Status: TtlExpired Address: 87.237.20.67 
Hop: 13 Status: TtlExpired Address: 72.14.242.70 
Hop: 14 Status: TtlExpired Address: 74.125.242.97 
Hop: 15 Status: TtlExpired Address: 142.251.52.145 
Hop: 16 Status: Success Address: 8.8.8.8

I have replicated this on a Windows and a Mac host.

There are comments above that show that when you run the same code natively on your (Windows/Mac) host it works as expected.

I understand that things work differently on the different base OSes. I am just proving that Ping works when the underlying layer works.

I have confirmed through experimentation that on the Linux host it is traversing the code path that uses sockets rather than calling out externally to the Linux OS.

Therefore there is some problem with seeing the bytes returning from the pinged host in the socket on Linux hosts using the sockets layer.

From what we have all observed, the packets are arriving at the Linux container's operating system AS THEY SHOULD, and the ping utility works correctly on that container OS.

IMHO Sockets is the problem, and/or there is a problem in recognising the returned data in the Ping socket handlers.

My apologies if I am off the mark here. Basically I believe my repo proves reliably that it is broken.

Please understand I will be delighted if it's a code problem of mine.

Personally I don't think this is a Ping problem but a deeper problem with the sockets layer on Linux containers when specifying a low TTL value. Just my 2 cents' worth.

@ghost ghost added needs-further-triage Issue has been initially triaged, but needs deeper consideration or reconsideration and removed needs-author-action An issue or pull request that requires more info or actions from the author. labels Aug 9, 2022
@IainStevenson
Author

To make life simpler I created a clean isolated version here

@karelz
Member

karelz commented Aug 10, 2022

Triage: Given that this is not a regression in 7.0, moving to 8.0.

@karelz karelz added this to the 8.0.0 milestone Aug 10, 2022
@karelz karelz removed untriaged New issue has not been triaged by the area owner needs-further-triage Issue has been initially triaged, but needs deeper consideration or reconsideration labels Aug 10, 2022
@IainStevenson
Author

@karelz

Some more info that may be of use.

I enabled '.NET Framework source stepping' in Visual Studio under Tools / Options / Debugging / General.
Then I enabled 'break when thrown' for all Common Language Runtime exceptions under Debug / Windows / Exception Settings.

When I re-ran the test in the container it threw this Exception.

System.Net.Sockets.SocketException
  HResult=0x80004005
  Message=Connection timed out
  Source=System.Net.Sockets
  StackTrace:
   at System.Net.Sockets.Socket.ReceiveFrom(Byte[] buffer, Int32 offset, Int32 size, SocketFlags socketFlags, EndPoint& remoteEP)

This corresponds with the comment above from @filipnavara about ReceiveFrom receiving no response.

Clearly this Sockets exception is caught somewhere further down and surfaces at the Ping class as a TimedOut response.

This suggests the problem lies within the Sockets library.

I had a look at 'System.Net.Sockets.Socket.ReceiveFrom' and saw that it CAN emit 'NetEventSource' information, but I don't know how to get that set up in my environment to find out what Sockets is experiencing.

@IainStevenson
Author

Just so everyone is sure that something actually came back from the ICMP request, this is the 'tcpdump' of such a test;

root@8d0848e95f8f:/app# tcpdump -i eth0 proto ICMP -v
tcpdump: listening on eth0, link-type EN10MB (Ethernet), snapshot length 262144 bytes
13:50:29.005820 IP (tos 0x0, ttl 7, id 14134, offset 0, flags [none], proto ICMP (1), length 60)
    8d0848e95f8f > dns.google: ICMP echo request, id 59203, seq 0, length 40
13:50:36.102408 IP (tos 0x0, ttl 8, id 61273, offset 0, flags [none], proto ICMP (1), length 60)
    8d0848e95f8f > dns.google: ICMP echo request, id 17603, seq 0, length 40
13:50:36.138599 IP (tos 0x0, ttl 37, id 0, offset 0, flags [none], proto ICMP (1), length 88)
    10.247.89.65 > 8d0848e95f8f: ICMP time exceeded in-transit, length 68
        IP (tos 0x38, ttl 1, id 36525, offset 0, flags [none], proto ICMP (1), length 60, bad cksum af26 (->6eb9)!)
    8d0848e95f8f > dns.google: ICMP echo request, id 17603, seq 0, length 40
13:50:46.175180 IP (tos 0x0, ttl 9, id 55459, offset 0, flags [none], proto ICMP (1), length 60)
    8d0848e95f8f > dns.google: ICMP echo request, id 6385, seq 0, length 40
13:50:46.211203 IP (tos 0x0, ttl 37, id 0, offset 0, flags [none], proto ICMP (1), length 56)
    10.247.87.193 > 8d0848e95f8f: ICMP time exceeded in-transit, length 36
        IP (tos 0x38, ttl 1, id 36528, offset 0, flags [none], proto ICMP (1), length 60, bad cksum af23 (->6eb6)!)
    8d0848e95f8f > dns.google: ICMP echo request, id 6385, seq 0, length 40
^C
5 packets captured
5 packets received by filter
0 packets dropped by kernel

BTW: The 'ICMP time exceeded in-transit' had me looking twice but that is what you get from a 'traceroute -I 8.8.8.8'

@filipnavara
Member

I appreciate the debugging you do! I was already convinced that the TTL packets reach the Linux VM (confirmed by the ping utility working and Wireshark seeing the packets on the outside).

Sorry for the lack of feedback, I am hampered both by a lack of understanding of the raw socket details and health issues.

@IainStevenson
Author

@filipnavara YW. Sorry to hear about your health. I am a dog with a bone on this one. It's not a big earth-shaking problem, but I don't like issues like this and, worse, I don't like not being able to solve it myself :) I am reading the Sockets code now to try and figure out a theory on why it's happening, but that code is scary! Plus I have learned loads of new stuff. After 38 years in IT it's been a rare treat in that respect.

@wfurt
Member

wfurt commented Aug 11, 2022

This may be because we call Connect on the socket. I should be able to take a look once my 7.0 queue is empty.
Adding @tmds just in case
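
To make the Connect hypothesis above concrete: a connected raw socket on Linux is generally expected to deliver only datagrams whose source address matches the connected peer, and Time Exceeded messages come from intermediate routers, not from the destination. The sketch below sends an echo request with a low TTL on a raw ICMP socket without calling `Connect` and reads whatever comes back. It is an illustration under assumptions, not the fix that was eventually merged: `Probe`, `BuildEchoRequest` and `InternetChecksum` are hypothetical helpers, the identifier 0x1234 is arbitrary, and running it needs root or CAP_NET_RAW.

```csharp
using System;
using System.Buffers.Binary;
using System.Net;
using System.Net.Sockets;

// Pass a target IPv4 address to actually send, e.g. `sudo dotnet run -- 8.8.8.8`.
if (args.Length > 0)
{
    Probe(IPAddress.Parse(args[0]), ttl: 3, timeoutMs: 5000);
}

// Hypothetical helper: builds an ICMP echo request (type 8, code 0) with a valid checksum.
static byte[] BuildEchoRequest(ushort id, ushort seq, int payloadLength)
{
    var message = new byte[8 + payloadLength];
    message[0] = 8; // type: echo request; code (message[1]) stays 0
    BinaryPrimitives.WriteUInt16BigEndian(message.AsSpan(4), id);
    BinaryPrimitives.WriteUInt16BigEndian(message.AsSpan(6), seq);
    // The checksum field is still zero here, so it does not affect the sum.
    BinaryPrimitives.WriteUInt16BigEndian(message.AsSpan(2), InternetChecksum(message));
    return message;
}

// Hypothetical helper: 16-bit one's-complement checksum over the whole buffer.
static ushort InternetChecksum(ReadOnlySpan<byte> data)
{
    uint sum = 0;
    int i = 0;
    for (; i + 1 < data.Length; i += 2) sum += (uint)((data[i] << 8) | data[i + 1]);
    if (i < data.Length) sum += (uint)(data[i] << 8); // odd trailing byte
    while ((sum >> 16) != 0) sum = (sum & 0xFFFF) + (sum >> 16);
    return (ushort)~sum;
}

// Hypothetical helper: one probe on an unconnected raw socket.
static void Probe(IPAddress target, short ttl, int timeoutMs)
{
    using var socket = new Socket(AddressFamily.InterNetwork, SocketType.Raw, ProtocolType.Icmp);
    socket.Ttl = ttl;
    socket.ReceiveTimeout = timeoutMs;
    // Deliberately no socket.Connect(...): replies from any source should be delivered.
    socket.SendTo(BuildEchoRequest(id: 0x1234, seq: 0, payloadLength: 32), new IPEndPoint(target, 0));

    var buffer = new byte[1500];
    EndPoint from = new IPEndPoint(IPAddress.Any, 0);
    try
    {
        int received = socket.ReceiveFrom(buffer, ref from);
        int headerLength = (buffer[0] & 0x0F) * 4; // raw IPv4 sockets include the IP header
        Console.WriteLine($"{received} bytes from {from}, ICMP type {buffer[headerLength]}");
    }
    catch (SocketException e) when (e.SocketErrorCode == SocketError.TimedOut)
    {
        Console.WriteLine("timed out");
    }
}
```

A raw ICMP socket may also see unrelated ICMP traffic, so a real implementation would keep reading until the timeout and match each message against the identifier it sent, instead of trusting the first datagram as this sketch does.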

@IainStevenson
Author

There is a load of event source logging going on down inside Sockets.
I was wondering if the NetEventSource tools would shed any light on it, but I am struggling to get collection working on my container. I am following the instructions here, here and here, and I will report anything I find out.

@karelz karelz modified the milestones: 8.0.0, Future Jun 9, 2023
@karelz karelz modified the milestones: Future, 9.0.0 Jul 18, 2023
@dotnet-policy-service dotnet-policy-service bot added the in-pr There is an active PR which will close this issue when it is merged label Mar 18, 2024
@github-actions github-actions bot locked and limited conversation to collaborators May 17, 2024