Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

DatagramHandler doing DNS lookup on every log message #91305

Closed
bmerry mannequin opened this issue Mar 29, 2022 · 10 comments
Closed

DatagramHandler doing DNS lookup on every log message #91305

bmerry mannequin opened this issue Mar 29, 2022 · 10 comments
Labels
3.10 only security fixes 3.11 only security fixes 3.12 bugs and security fixes docs Documentation in the Doc dir performance Performance or resource usage stdlib Python modules in the Lib dir

Comments

@bmerry
Copy link
Mannequin

bmerry mannequin commented Mar 29, 2022

BPO 47149
Nosy @ericvsmith, @bmerry

Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.

Show more details

GitHub fields:

assignee = None
closed_at = None
created_at = <Date 2022-03-29.08:47:19.565>
labels = ['library', '3.10', 'performance']
title = 'DatagramHandler doing DNS lookup on every log message'
updated_at = <Date 2022-03-30.07:10:36.998>
user = 'https://github.com/bmerry'

bugs.python.org fields:

activity = <Date 2022-03-30.07:10:36.998>
actor = 'bmerry'
assignee = 'none'
closed = False
closed_date = None
closer = None
components = ['Library (Lib)']
creation = <Date 2022-03-29.08:47:19.565>
creator = 'bmerry'
dependencies = []
files = []
hgrepos = []
issue_num = 47149
keywords = []
message_count = 9.0
messages = ['416247', '416270', '416279', '416286', '416287', '416288', '416289', '416290', '416327']
nosy_count = 2.0
nosy_names = ['eric.smith', 'bmerry']
pr_nums = []
priority = 'normal'
resolution = None
stage = None
status = 'open'
superseder = None
type = 'performance'
url = 'https://bugs.python.org/issue47149'
versions = ['Python 3.10']

@bmerry
Copy link
Mannequin Author

bmerry mannequin commented Mar 29, 2022

logging.DatagramHandler uses socket.sendto to send the messages. If the given address is a hostname rather than an IP address, it will do a DNS lookup every time.

I suspect that fixing bpo-14855 will also fix this, since fixing that issue requires resolving the hostname to determine whether it is an IPv4 or IPv6 address to create a suitable socket.

I've run into this on 3.8, but tagging 3.10 since the code still looks the same.

@bmerry bmerry mannequin added 3.10 only security fixes stdlib Python modules in the Lib dir performance Performance or resource usage labels Mar 29, 2022
@ericvsmith
Copy link
Member

If you don’t look it up every time, how do you deal with DNS timeouts?

@bmerry
Copy link
Mannequin Author

bmerry mannequin commented Mar 29, 2022

If you don’t look it up every time, how do you deal with DNS timeouts?

Do you mean expiring the IP address when the TTL is reached? I suppose that could be an issue for a long-running service, and I don't have a good answer to that. Possibly these days with services meshes and load-balancers it is less of a concern since a logging server can move without changing its IP address.

But it's important for a logging system not to block the service doing the logging (which is one reason for using UDP in the first place). I only discovered this issue because of some flaky DNS servers that would occasionally take several seconds to answer a query, and block the whole asyncio event loop while it waited.

At a minimum it would be useful to document it, so that you know it's something to be concerned about when using DatagramHandler.

@ericvsmith
Copy link
Member

Do you mean expiring the IP address when the TTL is reached?

Yes, that's what I mean. Isn't the resolver library smart enough to cache lookups and handle the TTL timeout by itself?

@bmerry
Copy link
Mannequin Author

bmerry mannequin commented Mar 29, 2022

Yes, that's what I mean. Isn't the resolver library smart enough to cache lookups and handle the TTL timeout by itself?

Apparently not in this case - with tcpdump I can see the DNS requests being fired off several times a second. I'll need to check what the TTL actually is though.

@ericvsmith
Copy link
Member

Hmm. I'm not sure we should try to work around a bad resolver issue. What's your platform, and how did you install Python?

@bmerry
Copy link
Mannequin Author

bmerry mannequin commented Mar 29, 2022

Hmm. I'm not sure we should try to work around a bad resolver issue. What's your platform, and how did you install Python?

Fair point. It's Ubuntu 20.04, running inside Docker, with the default Python (3.8). I've also reproduced it outside Docker (again Ubuntu 20.04 with system Python). The TTL is 30s, so I'm not sure why systemd-resolved isn't caching it for messages logged several times a second.

Even if the system has a local cache though, it's not ideal to have logging block when the TTL expires, particularly in an event-driven (asyncio) service. Updating the address in a background thread while continuing to log to the old address might be better. But my use case is particularly real-time (even 10ms of latency is problematic), and maybe that shouldn't drive the default behaviour.

I blame the lack of standard POSIX functions for doing DNS lookups asynchronously and in a way that provides TTL information to the client.

@ericvsmith
Copy link
Member

I blame the lack of standard POSIX functions for doing DNS lookups asynchronously and in a way that provides TTL information to the client.

I totally agree with that!

And I agree it would be nice to have some way of doing non-blocking lookups when the TTL expires (or whenever it decides it needs to do a lookup). But it's going to be non-trivial, I fear.

@bmerry
Copy link
Mannequin Author

bmerry mannequin commented Mar 30, 2022

But it's going to be non-trivial, I fear.

Yeah. Maybe some documentation is achievable in the short term though, so that users who care more about latency than changing DNS are aware that they should do the lookup themselves?

@ezio-melotti ezio-melotti transferred this issue from another repository Apr 10, 2022
@iritkatriel iritkatriel added docs Documentation in the Doc dir 3.11 only security fixes 3.12 bugs and security fixes labels Aug 26, 2022
@vsajip
Copy link
Member

vsajip commented Aug 27, 2022

Maybe some documentation is achievable in the short term though, so that users who care more about latency than changing DNS are aware that they should do the lookup themselves?

This isn't specific to DatagramHandler, though, right? Isn't it something that users can figure out if they are having latency problems? It seems like every Python API that accepts a hostname can potentially hit this problem depending on the exact application using it. Edit: I guess streaming connections don't have this problem.

vsajip added a commit to vsajip/cpython that referenced this issue Aug 29, 2022
miss-islington pushed a commit to miss-islington/cpython that referenced this issue Aug 29, 2022
…thonGH-96380)

(cherry picked from commit 6324b13)

Co-authored-by: Vinay Sajip <vinay_sajip@yahoo.co.uk>
miss-islington pushed a commit to miss-islington/cpython that referenced this issue Aug 29, 2022
…thonGH-96380)

(cherry picked from commit 6324b13)

Co-authored-by: Vinay Sajip <vinay_sajip@yahoo.co.uk>
@vsajip vsajip closed this as completed Aug 30, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
3.10 only security fixes 3.11 only security fixes 3.12 bugs and security fixes docs Documentation in the Doc dir performance Performance or resource usage stdlib Python modules in the Lib dir
Projects
Development

No branches or pull requests

3 participants