The vL3 DNS doesn't support PTR requests #1425

Closed
fr-Pursuit opened this issue Feb 21, 2023 · 8 comments
Labels
enhancement New feature or request

@fr-Pursuit

Expected Behavior

The vL3 DNS service should support PTR (i.e., reverse DNS) requests, potentially by returning an empty response.

Current Behavior

An error occurs when the vL3 NSE receives PTR requests. When using the ping command with a vL3 domain name, this results in a brief freeze just before the first ping is sent, while the program waits for the reverse DNS request to time out.

Failure Information (for bugs)

When the NSE receives a PTR request, this error appears in the logs:

Feb  7 12:06:14.957 [TRAC] [id:26676] [type:dnsServer] (1) ⎆ sdk/pkg/tools/dnsutils/dnsconfigs/dnsConfigsHandler.ServeDNS()
Feb  7 12:06:14.957 [TRAC] [id:26676] [type:dnsServer] (1.1)   message-request={"Id":26676,"Response":false,"Opcode":0,"Authoritative":false,"Truncated":false,"RecursionDesired":false,"RecursionAvailable":false,"Zero":false,"AuthenticatedData":false,"CheckingDisabled":false,"Rcode":0,"Question":[{"Name":"3.0.16.172.in-addr.arpa.","Qtype":12,"Qclass":1}],"Answer":null,"Ns":null,"Extra":null}
Feb  7 12:06:14.958 [DEBU] [id:26676] [type:dnsServer] (1.2)   passed clientURLs: [{udp   172.16.0.1   false   }]
Feb  7 12:06:14.958 [DEBU] [id:26676] [type:dnsServer] (1.3)   passed SearchDomains: []
Feb  7 12:06:14.958 [TRAC] [id:26676] [type:dnsServer] (2)  ⎆ sdk/pkg/tools/dnsutils/noloop/noloopDNSHandler.ServeDNS()
Feb  7 12:06:14.958 [ERRO] [id:26676] [noloopDNSHandler:ServeDNS] [type:dnsServer] (2.1)    loop is not allowed: query: ;; opcode: QUERY, status: NOERROR, id: 26676;   ;; flags:; QUERY: 1, ANSWER: 0, AUTHORITY: 0, ADDITIONAL: 0;    ;      ;; QUESTION SECTION:;    ;3.0.16.172.in-addr.arpa.       IN       PTR;   ;
Feb  7 12:06:14.958 [TRAC] [id:26676] [type:dnsServer] (2.2)    message-response={"Id":26676,"Response":true,"Opcode":0,"Authoritative":false,"Truncated":false,"RecursionDesired":false,"RecursionAvailable":false,"Zero":false,"AuthenticatedData":false,"CheckingDisabled":false,"Rcode":2,"Question":[{"Name":"3.0.16.172.in-addr.arpa.","Qtype":12,"Qclass":1}],"Answer":[],"Ns":[],"Extra":[]}
Feb  7 12:06:15.011 [TRAC] [id:26676] [type:dnsServer] (5.1)       message-response={"Id":26676,"Response":true,"Opcode":0,"Authoritative":false,"Truncated":false,"RecursionDesired":false,"RecursionAvailable":false,"Zero":false,"AuthenticatedData":false,"CheckingDisabled":false,"Rcode":2,"Question":[{"Name":"3.0.16.172.in-addr.arpa.","Qtype":12,"Qclass":1}],"Answer":[],"Ns":[],"Extra":[]}
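
For reference, the question name in the log above, 3.0.16.172.in-addr.arpa., is the reverse-lookup form of the address 172.16.0.3, and Qtype 12 is PTR. Below is a minimal Go sketch of that mapping using only the standard library. The reverseName helper is hypothetical and is not part of the NSM sdk.

```go
package main

import (
	"fmt"
	"net"
)

// reverseName returns the in-addr.arpa name queried by a PTR lookup for an
// IPv4 address, or an empty string if addr is not a valid IPv4 address.
// Hypothetical helper for illustration only; not part of the NSM sdk.
func reverseName(addr string) string {
	ip := net.ParseIP(addr).To4()
	if ip == nil {
		return ""
	}
	// Octets are written in reverse order, followed by the in-addr.arpa zone.
	return fmt.Sprintf("%d.%d.%d.%d.in-addr.arpa.", ip[3], ip[2], ip[1], ip[0])
}

func main() {
	// Address taken from the PTR question in the log.
	fmt.Println(reverseName("172.16.0.3"))
}
```

Running it prints the same name that appears in the Question field of the logged request.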

Steps to Reproduce

  1. Deploy the interdomain vL3 example, renaming one alpine pod to alpine2 to avoid confusion and setting the NSM_DNS_TEMPLATES environment variable to {{ index .Labels \"podName\" }}.my-interdomain-vl3-network. so that the DNS service works
  2. Open a shell in the alpine pod
  3. Run ping alpine2.my-interdomain-vl3-network

Context

ping is a great way to see how PTR requests are handled because, when used with a domain name, it performs a reverse DNS query on the source address of the response:


I didn't see how nslookup could directly tell how long each request took, but using the time command, I got this:

root@yelb-appserver-55688766-pm8rf:/# time nslookup -debug yelb-ui.my-interdomain-vl3-network.
Server:         127.0.0.1
Address:        127.0.0.1#53

------------
    QUESTIONS:
        yelb-ui.my-interdomain-vl3-network, type = A, class = IN
    ANSWERS:
    ->  yelb-ui.my-interdomain-vl3-network
        internet address = 172.16.0.3
        ttl = 3511
    AUTHORITY RECORDS:
    ADDITIONAL RECORDS:
------------
Name:   yelb-ui.my-interdomain-vl3-network
Address: 172.16.0.3
------------
    QUESTIONS:
        yelb-ui.my-interdomain-vl3-network, type = AAAA, class = IN
    ANSWERS:
    AUTHORITY RECORDS:
    ADDITIONAL RECORDS:
------------
** server can't find yelb-ui.my-interdomain-vl3-network: SERVFAIL


real    0m0.905s
user    0m0.014s
sys     0m0.000s

Each request took roughly the same time, so I'd estimate around 400 to 500 ms per request.

As for the PTR requests, they come from ping itself. From what I can see, ping performs a reverse DNS query when you use it with a domain name instead of an IP address:

% ping -4 google.com -c 4
PING google.com (216.58.213.78) 56(84) bytes of data.
64 bytes from lhr25s01-in-f14.1e100.net (216.58.213.78): icmp_seq=1 ttl=110 time=4.92 ms
64 bytes from lhr25s01-in-f14.1e100.net (216.58.213.78): icmp_seq=2 ttl=110 time=4.91 ms
64 bytes from par21s18-in-f14.1e100.net (216.58.213.78): icmp_seq=3 ttl=110 time=4.98 ms
64 bytes from lhr25s01-in-f78.1e100.net (216.58.213.78): icmp_seq=4 ttl=110 time=5.00 ms

--- google.com ping statistics ---
4 packets transmitted, 4 received, 0% packet loss, time 3004ms
rtt min/avg/max/mdev = 4.909/4.950/5.000/0.038 ms

% ping -4 216.58.213.78 -c 4
PING 216.58.213.78 (216.58.213.78) 56(84) bytes of data.
64 bytes from 216.58.213.78: icmp_seq=1 ttl=110 time=4.86 ms
64 bytes from 216.58.213.78: icmp_seq=2 ttl=110 time=4.87 ms
64 bytes from 216.58.213.78: icmp_seq=3 ttl=110 time=4.96 ms
64 bytes from 216.58.213.78: icmp_seq=4 ttl=110 time=4.92 ms

--- 216.58.213.78 ping statistics ---
4 packets transmitted, 4 received, 0% packet loss, time 3004ms
rtt min/avg/max/mdev = 4.857/4.902/4.963/0.041 ms

This problem was diagnosed as part of #1414.

The example was deployed on three kind clusters using kind 0.17.0 and the kindest v1.25.3 image.

@denis-tingaikin
Member

@glazychev-art Could you please add a test into deployments-k8s?

@glazychev-art
Contributor

@fr-Pursuit
Could you check PTR requests on the latest main branch if you get a chance?

@fr-Pursuit
Author

fr-Pursuit commented Mar 26, 2023

@glazychev-art Hi! PTR requests seem to be handled correctly now. ping doesn't freeze at all :)

However, I now need to explicitly use IPv4 in order for the ping to work...

/ # ping -c 4 alpine2.my-interdomain-vl3-network
ping: bad address 'alpine2.my-interdomain-vl3-network'

/ # ping -c 4 alpine2.my-interdomain-vl3-network -4
PING alpine2.my-interdomain-vl3-network (172.16.0.2): 56 data bytes
64 bytes from 172.16.0.2: seq=0 ttl=59 time=65.076 ms
64 bytes from 172.16.0.2: seq=1 ttl=59 time=2.696 ms
64 bytes from 172.16.0.2: seq=2 ttl=59 time=12.237 ms
64 bytes from 172.16.0.2: seq=3 ttl=59 time=60.345 ms

When the ping fails, this error appears in the NSE's logs:

Mar 26 21:04:34.885 [TRAC] [id:22041] [type:dnsServer] (1) ⎆ sdk/pkg/tools/dnsutils/dnsconfigs/dnsConfigsHandler.ServeDNS()
Mar 26 21:04:34.885 [TRAC] [id:22041] [type:dnsServer] (1.1)   message-request={"Id":22041,"Response":false,"Opcode":0,"Authoritative":false,"Truncated":false,"RecursionDesired":false,"RecursionAvailable":false,"Zero":false,"AuthenticatedData":false,"CheckingDisabled":false,"Rcode":0,"Question":[{"Name":"alpine2.my-interdomain-vl3-network.","Qtype":28,"Qclass":1}],"Answer":null,"Ns":null,"Extra":null}
Mar 26 21:04:34.885 [DEBU] [id:22041] [type:dnsServer] (1.2)   passed clientURLs: [{udp   172.16.0.1   false   } {udp   172.16.0.1   false   }]
Mar 26 21:04:34.885 [DEBU] [id:22041] [type:dnsServer] (1.3)   passed SearchDomains: []
Mar 26 21:04:34.885 [TRAC] [id:22041] [type:dnsServer] (2)  ⎆ sdk/pkg/tools/dnsutils/noloop/noloopDNSHandler.ServeDNS()
Mar 26 21:04:34.886 [TRAC] [id:22041] [type:dnsServer] (3)   ⎆ sdk/pkg/tools/dnsutils/norecursion/norecursionDNSHandler.ServeDNS()
Mar 26 21:04:34.886 [TRAC] [id:22041] [type:dnsServer] (1) ⎆ sdk/pkg/tools/dnsutils/dnsconfigs/dnsConfigsHandler.ServeDNS()
Mar 26 21:04:34.887 [TRAC] [id:22041] [type:dnsServer] (4)    ⎆ sdk/pkg/tools/dnsutils/memory/memoryHandler.ServeDNS()
Mar 26 21:04:34.887 [TRAC] [id:22041] [type:dnsServer] (1.1)   message-request={"Id":22041,"Response":false,"Opcode":0,"Authoritative":false,"Truncated":false,"RecursionDesired":false,"RecursionAvailable":false,"Zero":false,"AuthenticatedData":false,"CheckingDisabled":false,"Rcode":0,"Question":[{"Name":"alpine2.my-interdomain-vl3-network.","Qtype":28,"Qclass":1}],"Answer":null,"Ns":null,"Extra":null}
Mar 26 21:04:34.887 [DEBU] [id:22041] [type:dnsServer] (1.2)   passed clientURLs: [{udp   172.16.0.1   false   } {udp   172.16.0.1   false   }]
Mar 26 21:04:34.887 [DEBU] [id:22041] [type:dnsServer] (1.3)   passed SearchDomains: []
Mar 26 21:04:34.887 [TRAC] [id:22041] [type:dnsServer] (2)  ⎆ sdk/pkg/tools/dnsutils/noloop/noloopDNSHandler.ServeDNS()
Mar 26 21:04:34.887 [ERRO] [id:22041] [noloopDNSHandler:ServeDNS] [type:dnsServer] (2.1)    loop is not allowed: query: ;; opcode: QUERY, status: NOERROR, id: 22041;   ;; flags:; QUERY: 1, ANSWER: 0, AUTHORITY: 0, ADDITIONAL: 0;    ;      ;; QUESTION SECTION:;    ;alpine2.my-interdomain-vl3-network.    IN       AAAA;  ;
Mar 26 21:04:34.887 [TRAC] [id:22041] [type:dnsServer] (2.2)    message-response={"Id":22041,"Response":true,"Opcode":0,"Authoritative":false,"Truncated":false,"RecursionDesired":false,"RecursionAvailable":false,"Zero":false,"AuthenticatedData":false,"CheckingDisabled":false,"Rcode":2,"Question":[{"Name":"alpine2.my-interdomain-vl3-network.","Qtype":28,"Qclass":1}],"Answer":[],"Ns":[],"Extra":[]}
Mar 26 21:04:34.888 [TRAC] [id:22041] [type:dnsServer] (5)     ⎆ sdk/pkg/tools/dnsutils/fanout/fanoutHandler.ServeDNS()
Mar 26 21:04:34.915 [TRAC] [id:22041] [type:dnsServer] (5.1)       message-response={"Id":22041,"Response":true,"Opcode":0,"Authoritative":false,"Truncated":false,"RecursionDesired":false,"RecursionAvailable":false,"Zero":false,"AuthenticatedData":false,"CheckingDisabled":false,"Rcode":2,"Question":[{"Name":"alpine2.my-interdomain-vl3-network.","Qtype":28,"Qclass":1}],"Answer":[],"Ns":[],"Extra":[]}

It seems the AAAA request fails in a way that makes ping fail before it even attempts to make an A request. I don't remember seeing that before. Could it be caused by a change in NSM or somewhere else?
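
To make the numeric fields in these logs easier to read: Qtype 28 is AAAA, Qtype 12 is PTR, Rcode 0 is NOERROR and Rcode 2 is SERVFAIL. The Go sketch below decodes a logged message-response payload into a one-line summary. The describe helper and the dnsMessage type are hypothetical, and the payload in main is abridged from the log line above.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// dnsMessage holds only the fields of the logged JSON that are decoded here;
// all other fields in the payload are ignored by encoding/json.
type dnsMessage struct {
	Rcode    int
	Question []struct {
		Name  string
		Qtype uint16
	}
}

// Standard DNS numbers for the values seen in this issue. Values missing
// from these maps are rendered as an empty string.
var qtypeNames = map[uint16]string{1: "A", 12: "PTR", 28: "AAAA"}
var rcodeNames = map[int]string{0: "NOERROR", 2: "SERVFAIL", 3: "NXDOMAIN"}

// describe turns a logged message-request or message-response payload into
// a short "name qtype -> rcode" summary.
// Hypothetical helper for illustration only.
func describe(payload string) (string, error) {
	var m dnsMessage
	if err := json.Unmarshal([]byte(payload), &m); err != nil {
		return "", err
	}
	if len(m.Question) == 0 {
		return "", fmt.Errorf("no question in message")
	}
	q := m.Question[0]
	return fmt.Sprintf("%s %s -> %s", q.Name, qtypeNames[q.Qtype], rcodeNames[m.Rcode]), nil
}

func main() {
	// Abridged copy of the logged message-response for the failing AAAA query.
	payload := `{"Id":22041,"Response":true,"Rcode":2,"Question":[{"Name":"alpine2.my-interdomain-vl3-network.","Qtype":28,"Qclass":1}],"Answer":[],"Ns":[],"Extra":[]}`
	summary, err := describe(payload)
	if err != nil {
		panic(err)
	}
	fmt.Println(summary)
}
```

Read this way, the response is a SERVFAIL for the AAAA question rather than an empty NOERROR answer.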

@glazychev-art
Contributor

@fr-Pursuit
No, there were no changes in NSM in this area. This behavior existed before as well; you can check versions v1.8.0 or v1.7.1 :)
The ping utility is the reason for this. Depending on the system settings, it prefers IPv6 over IPv4.

As far as I can see, on some Linux distributions this is solved by changing /etc/gai.conf, but not on Alpine. Perhaps this will help you.

@denis-tingaikin
Member

@glazychev-art I feel we might need to verify it.

@NikitaSkrynnik Could you please compare behaviour with NSM v1.5.0 (with coredns) and current main version?

@NikitaSkrynnik
Contributor

@fr-Pursuit Hello, I've fixed the problem with ping and explicit IPv4. Could you please check it?

@fr-Pursuit
Author

@NikitaSkrynnik Hi! Sorry for the late reply.

It's all fixed, thanks!

/ # ping -c 4 alpine2.my-interdomain-vl3-network
PING alpine2.my-interdomain-vl3-network (172.16.1.2): 56 data bytes
64 bytes from 172.16.1.2: seq=0 ttl=56 time=1.426 ms
64 bytes from 172.16.1.2: seq=1 ttl=56 time=1.503 ms
64 bytes from 172.16.1.2: seq=2 ttl=56 time=1.652 ms
64 bytes from 172.16.1.2: seq=3 ttl=56 time=4.073 ms

--- alpine2.my-interdomain-vl3-network ping statistics ---
4 packets transmitted, 4 packets received, 0% packet loss
round-trip min/avg/max = 1.426/2.163/4.073 ms

@glazychev-art
Contributor

@fr-Pursuit
Thanks for checking!
Can you close the issue?

Projects
Status: Done
4 participants