vL3 DNS doesn't work in an interdomain context #1414
Comments
Hi @fr-Pursuit, could you please recheck your problem with |
Hi, With the added dot at the end of the template, the DNS resolution partially works.
The same error also appears in the logs of the vL3 NSE that is in the same cluster as the
In practice, this error seems to considerably increase the time it takes for a program to perform a DNS resolution in the
PS: I changed the template to use a custom |
Actually, the slow
In the meantime, the following appears in the remote NSE's logs:
The same error appears several times in the logs, as the server seems to handle one PTR request per entry in the pod's DNS search path.
I've also noticed major slowdowns in the yelb app, but these don't seem to be caused by NSM: in order for the app to work regardless of how it is deployed, I added the |
@fr-Pursuit So, we can check:
For ping, force IPv4: |
@glazychev-art Indeed! The
I already tried forcing IPv4 on ping, which didn't change anything. That isn't surprising, since from what I saw the freeze came from reverse DNS queries. |
We can also check how long each request takes with
Where do PTR requests come from? Any ideas? |
I didn't see how
Each request took more or less the same time, so I'd say around 400/500ms per request. As for the
|
Could you please check
Got it, thanks. Apparently we also need to process PTR requests. |
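For readers unfamiliar with the PTR requests discussed above, here is a minimal sketch of how a reverse lookup name is derived from an IP address. It only uses Python's standard `ipaddress` module; the address `172.16.0.2` is a hypothetical vL3 address chosen for illustration, not one taken from this issue.

```python
import ipaddress


def ptr_name(ip: str) -> str:
    """Return the absolute query name used for a reverse (PTR) lookup of `ip`."""
    # reverse_pointer yields e.g. "2.0.16.172.in-addr.arpa" for IPv4
    # and a nibble-reversed "...ip6.arpa" name for IPv6.
    return ipaddress.ip_address(ip).reverse_pointer + "."


if __name__ == "__main__":
    # Hypothetical vL3 address, for illustration only.
    print(ptr_name("172.16.0.2"))
```

A tool such as ping that prints host names typically issues one such query per address it displays, which would be consistent with the reverse DNS queries observed in this thread.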
Here's what I get when I only query
|
Actually, pods connected to a vL3 network seem to use a local resolver running in the |
@fr-Pursuit
This log is from alpine (cmd-nsc). Could you also check the latest main branch of deployments-k8s? We slightly increased the query speed, at least |
Weirdly, I get similar logs in the
I ran
I'll retry with the latest main branch (I'm currently using v1.7.0), and I'll get back to you. |
Great! With the latest branch, the DNS resolutions are noticeably faster:
However, I have another issue... I deployed
The following error appears in the NSE's logs:
All the other names can be resolved fine: only
Any idea about what could have gone wrong? |
Is
Could you upload the full log of the NSE to which it is connected? |
I'm currently redeploying my clusters. It's probably better to restart with a clean setup, just to be sure my previous tinkering didn't affect anything. |
It appears something was wrong with my setup. After a clean reinstall, everything works fine. However, the resolution speedup only applies when I use the absolute DNS name:
If I use the relative name (
|
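One possible contributor to the difference between absolute and relative names is search-list expansion. The sketch below models how a glibc-style stub resolver orders its candidate names, following the `search` and `ndots` rules of resolv.conf. The search domains and `ndots=5` are assumptions based on common Kubernetes pod defaults, not values taken from this issue, and the function name is illustrative.

```python
def candidate_names(name, search, ndots=5):
    """List the absolute names a glibc-style stub resolver would try, in order."""
    if name.endswith("."):
        # An absolute name bypasses the search list entirely.
        return [name]
    expanded = [f"{name}.{domain.rstrip('.')}." for domain in search]
    as_is = name + "."
    if name.count(".") >= ndots:
        # Enough dots: try the name as given first, then the search list.
        return [as_is] + expanded
    # Too few dots: walk the search list first, then try the name as given.
    return expanded + [as_is]


if __name__ == "__main__":
    # Assumed search list, typical for a pod in the "default" namespace.
    search = ["default.svc.cluster.local", "svc.cluster.local", "cluster.local"]
    for candidate in candidate_names("alpine.my-interdomain-vl3-network", search):
        print(candidate)
```

Under these assumptions the relative name produces four queries, with the intended name tried last, while the name with a trailing dot produces one.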
@fr-Pursuit I think it's not quite right to solve all DNS problems in one issue. Could you create separate issues for the other problems? For example, one about PTR requests. |
Seems like the root cause of the problem is fixed. The next problem will be considered separately in #1425 |
Expected Behavior
After deploying the interdomain vL3 example, the vL3 DNS feature should work as described in its associated example.
Specifically, after modifying NSM_DNS_TEMPLATES to avoid using the {{ .NetworkService }} property (which contains the illegal character @ in the case of an interdomain vL3 service), the vL3 NSE should respond to DNS queries with the vL3 IP of the queried pod.
For instance, in the context of the interdomain vL3 example and after setting NSM_DNS_TEMPLATES to {{ index .Labels \"podName\" }}.my-interdomain-vl3-network for both vL3 NSEs, the NSEs should respond to DNS queries about alpine.my-interdomain-vl3-network with the IP address of the alpine pod connected to the my-interdomain-vl3-network vL3 network. Since two alpine pods are deployed in this example (one in each cluster), whether the NSE should return the vL3 IP of the local alpine pod, the remote one, or both, remains to be defined.
Current Behavior
In the context described above, neither the alpine pods nor the NSE pods are able to resolve the alpine.my-interdomain-vl3-network domain name.
Failure Information (for bugs)
Running nslookup alpine.my-interdomain-vl3-network. in either an alpine pod or a vL3 NSE pod results in a SERVFAIL error.
Steps to Reproduce
{{ index .Labels \"podName\" }}.my-interdomain-vl3-network value for the NSM_DNS_TEMPLATES environment variable
apt update && apt install dnsutils && nslookup alpine.my-interdomain-vl3-network. 127.0.0.1 in the NSE pod
Context
The interdomain vL3 example was deployed on 3 VMs, each running a local kind cluster. The nodes of each kind cluster have InternalIPs in different ranges, and multicluster communication between nodes is possible using their InternalIPs. The nodes, however, do not have ExternalIPs.
MetalLB is also deployed in each cluster to provide ExternalIPs to k8s services. These IPs are routable between the three VMs.
I used Kind 0.17.0.
Failure Logs
When a DNS query about the alpine pod is sent to the vL3 NSE, the following error appears in the pod's logs:
The complete logs of the vL3 NSE pod are available here.
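To illustrate why the @ character mentioned under Expected Behavior is a problem, here is a minimal sketch that checks a name against the conventional hostname rules (labels made of letters, digits and hyphens, at most 63 characters each). This is not NSM's own validation logic, and the name containing @ is a hypothetical rendering of an interdomain service name, not one copied from the logs.

```python
import re

# One hostname label: starts and ends with a letter or digit,
# hyphens allowed in between, 1 to 63 characters in total.
LABEL = re.compile(r"[A-Za-z0-9](?:[A-Za-z0-9-]{0,61}[A-Za-z0-9])?")


def is_valid_hostname(name: str) -> bool:
    """Check `name` (optionally ending with a dot) against hostname label rules."""
    if name.endswith("."):
        name = name[:-1]
    if not name or len(name) > 253:
        return False
    return all(LABEL.fullmatch(label) for label in name.split("."))


if __name__ == "__main__":
    print(is_valid_hostname("alpine.my-interdomain-vl3-network."))
    # Hypothetical name produced by a template that includes an interdomain service name.
    print(is_valid_hostname("alpine.my-interdomain-vl3-network@my.cluster3."))
```

The first name passes the check, while the second is rejected because of the @ in one of its labels.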