DNS resolver can run for longer than the given timeout #101

Open
nmenardg-keeper opened this issue Mar 17, 2023 · 4 comments

@nmenardg-keeper

This is very hard to reproduce. We found the problem because our web server has a 60s timeout that we kept exceeding, even with email-validator's default 15s timeout.

Disabling check_deliverability solved our problem.

The problem

The problem lies within:

response = dns_resolver.resolve(domain, "MX")

( https://github.com/JoshData/python-email-validator/blob/main/email_validator/deliverability.py#L40 )

While debugging, I can see that the resolver goes through a loop and only checks the timeout between calls. It also calls time.sleep() in places.
That means that if a single call or sleep takes longer than the timeout, it is not interrupted, so the whole resolution can run for longer than the timeout.
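
To illustrate the behaviour described above, here is a minimal simulation. It is not dnspython's actual code: `run_with_soft_timeout` is a hypothetical name, and `time.sleep` stands in for a blocking network call. The deadline is only checked between steps, so one slow step overruns it.

```python
import time

def run_with_soft_timeout(steps, timeout):
    """Run blocking steps, checking the deadline only between them."""
    start = time.monotonic()
    for step in steps:
        if time.monotonic() - start >= timeout:
            raise TimeoutError("deadline exceeded")
        step()  # blocking call; nothing interrupts it once it has started
    return time.monotonic() - start

# A single step that sleeps for longer than the whole timeout.
elapsed = run_with_soft_timeout([lambda: time.sleep(0.3)], timeout=0.1)
print(f"timeout=0.1s, actual={elapsed:.2f}s")
```

The function returns normally even though the elapsed time is about three times the requested timeout.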

The solution

Using signal, we could interrupt the process. See https://stackoverflow.com/a/494273

This package does it: https://github.com/pnpnpn/timeout-decorator

I'll try implementing the stackoverflow suggestion on my end and write back with news

@JoshData
Owner

This sounds like a suggestion better made to the dnspython library than here, but I would be curious to hear what you find out.

@nmenardg-keeper
Author

Yeah agreed.

Here's the issue in dnspython: rthalley/dnspython#913

I'm testing the signal solution today. Some caveats: signal handlers can only be installed from the main thread, so it only works in effectively single-threaded programs, and it could interfere with other code that also uses signal.SIGALRM.

@nmenardg-keeper
Author

I closed the issue in dnspython. The problem seems to be that deliverability.validate_email_deliverability can call Resolver.resolve up to 4 times, each with the same timeout (because of the except dns.resolver.NoAnswer fallbacks).
The solutions I see on your side could be:

  1. Divide the timeout by 4 (so each of the 4 potential resolve calls gets 1/4 of the total timeout)
  2. Measure time left after every resolve call, and allocate the remaining time (if any) to the following call
    a. e.g. first call takes 5s, next call gets a timeout of 15-5 = 10s

For my signal-based solution to work, I'll have to raise a BaseException, because dnspython catches Exception.
Alternatively, a hacky fix would be to give email_validator a timeout of 1/4 of what I actually want.
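
A minimal sketch of the signal approach, under a few assumptions: it runs on a Unix-like system, in the main thread, and nothing else in the process uses SIGALRM. `HardTimeout`, `hard_timeout` and `swallows_exceptions` are hypothetical names, and the last one only stands in for library code that catches Exception.

```python
import signal
import time
from contextlib import contextmanager

class HardTimeout(BaseException):
    """Derives from BaseException so `except Exception` does not swallow it."""

@contextmanager
def hard_timeout(seconds):
    def on_alarm(signum, frame):
        raise HardTimeout(f"timed out after {seconds}s")

    # Install our handler and remember the previous one so it can be restored.
    previous = signal.signal(signal.SIGALRM, on_alarm)
    signal.setitimer(signal.ITIMER_REAL, seconds)
    try:
        yield
    finally:
        signal.setitimer(signal.ITIMER_REAL, 0)  # cancel any pending alarm
        signal.signal(signal.SIGALRM, previous)

def swallows_exceptions():
    """Hypothetical stand-in for library code that catches Exception."""
    try:
        time.sleep(5)
    except Exception:
        pass

start = time.monotonic()
try:
    with hard_timeout(0.2):
        swallows_exceptions()
    timed_out = False
except HardTimeout:
    timed_out = True
elapsed = time.monotonic() - start
print(timed_out, f"{elapsed:.2f}s")
```

Because the exception does not inherit from Exception, the inner `except Exception` block lets it propagate to the caller.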

@JoshData
Owner

Ah sorry for suggesting you head toward dnspython.

Since the except-block for dns.exception.Timeout is around all of the DNS queries, the first query to time out should end the deliverability checks. So to get a total time that's four times the timeout, each query would have to complete in just under the timeout without timing out.
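
The worst case can be worked through with a small model. This is only arithmetic under the assumptions stated in the thread (up to 4 sequential queries, the same per-query timeout, and the first timeout ending the checks); `total_time` is a hypothetical helper, not part of email-validator.

```python
def total_time(query_durations, per_query_timeout):
    """Total elapsed time when the first timed-out query ends the checks."""
    total = 0.0
    for duration in query_durations:
        if duration >= per_query_timeout:
            # This query hits its timeout, and no further queries are made.
            return total + per_query_timeout
        total += duration
    return total

# Four queries that each finish just under a 15s timeout.
worst_case = total_time([14.9, 14.9, 14.9, 14.9], per_query_timeout=15)
# The second query times out, so the last two never run.
early_stop = total_time([14.9, 20.0, 1.0, 1.0], per_query_timeout=15)
print(worst_case, early_stop)
```

In this model the total only approaches four times the timeout when every query is slow but none actually times out.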

Your option 2 would be OK for me, but it might be a little tricky to implement because the resolver might be provided by the caller and reducing the timeout in future calls would mangle the original value.
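
One way option 2 might look, as a minimal sketch: track a single deadline and give each call only the time that remains. `run_with_shared_budget` and `fake_lookup` are hypothetical names, and the sleeps stand in for DNS queries.

```python
import time

def run_with_shared_budget(calls, total_timeout):
    """Invoke each call with the time left in one overall budget."""
    deadline = time.monotonic() + total_timeout
    results = []
    for call in calls:
        remaining = deadline - time.monotonic()
        if remaining <= 0:
            raise TimeoutError("total timeout exhausted")
        results.append(call(remaining))
    return results

def fake_lookup(duration):
    """Hypothetical stand-in for one DNS query taking `duration` seconds."""
    def lookup(timeout):
        if duration > timeout:
            time.sleep(timeout)
            raise TimeoutError("query timed out")
        time.sleep(duration)
        return timeout  # report the budget this call was given
    return lookup

budgets = run_with_shared_budget(
    [fake_lookup(0.2), fake_lookup(0.1)], total_timeout=1.0
)
print([round(b, 1) for b in budgets])
```

With dnspython, the remaining time could probably be passed per call through the `lifetime` argument of `Resolver.resolve`, which would leave a caller-provided resolver's own settings untouched. Treat that as an assumption to verify against the dnspython version in use.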
