network sysctls #556
base: main
@@ -111,10 +111,56 @@
  '';
};

# use TCP BBR has significantly increased throughput and reduced latency for connections
# https://www.kernel.org/doc/html/latest/networking/ip-sysctl.html
# In some cases, TCP BBR can significantly increase throughput and reduce latency,
# however this is not true in all cases, and should be used with caution
Review comment: What cases in particular?

Review comment: Are we talking about the long fat pipes issue?

Review comment (randomizedcoder): At Edgecast, with ~20k machines covering something like >75% of routes in the world, we set up socket performance sampling using a tool like xtcp2 (https://github.com/randomizedcoder/xtcp2). This is essentially streaming "ss --tcp --info" back to a big ClickHouse cluster, which gave us global visibility of socket performance. Then we ran a series of "canaries"/experiments to enable BBR on some machines in different PoPs all over the world, and carefully analyzed the results to observe the impact on socket performance. In many cases socket performance, like throughput, dropped. This was particularly true for small HTTP transactions and connections with low RTTs (like 20-40ms). This might be because BBR takes time to find the target rate. We did see benefits with BBR, particularly at higher RTTs and in particular on cellular networks. In the end, we kept cubic as the default and wrote automation to detect the sockets where BBR would benefit, and the Edgecast webserver would then switch to BBR only for those destination routes. This gave ~4% performance improvement globally. Of course this was all for Edgecast, which has traffic patterns that could be very different to your use cases. I find that if I enable BBR on my laptop at home, it sucks, particularly for talking to local machines on my LAN, so I don't use BBR on my laptop anymore. My internet connection is pretty good, so I'm ~12ms to most CDNs, which is also why BBR isn't really required there. Anyway, my point is that just blindly turning on BBR could be doing more harm than good.

Review comment: Ok. So my assumption here, when I enabled [...]. But your data suggests that it would actually be worse for these cases, and this should be used mainly to improve throughput when serving mobile clients? Interestingly, your 4% performance improvement seems to match the result that YouTube reported as well. Which version of BBR did you use for your testing? It looks like since 2023 we now also have BBRv3, with some enhancements: https://datatracker.ietf.org/meeting/117/materials/slides-117-ccwg-bbrv3-algorithm-bug-fixes-and-public-internet-deployment-00
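A minimal NixOS sketch of the "cubic by default, BBR only where it helps" approach described above could look like the following. This is a sketch, not part of the PR; the sysctl names are real, but whether this split actually fits this repo's hosts is exactly the open question in the thread.

  # Sketch: keep the kernel-wide default on cubic ...
  boot.kernel.sysctl = {
    "net.ipv4.tcp_congestion_control" = "cubic";
    # ... but allow unprivileged applications to opt into BBR per socket via
    # setsockopt(fd, IPPROTO_TCP, TCP_CONGESTION, "bbr") for destinations
    # that are known to benefit (high-RTT or lossy paths).
    "net.ipv4.tcp_allowed_congestion_control" = "cubic bbr reno";
  };
  # bbr is built as a module; load it explicitly, since nothing makes it the
  # default and would therefore trigger an autoload.
  boot.kernelModules = [ "tcp_bbr" ];

iproute2 can also pin an algorithm for individual destination routes ("ip route replace <prefix> ... congctl bbr"), which is closer to the per-route automation described in the comment above.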

boot.kernel.sysctl = {
  "net.core.default_qdisc" = "fq";
  "net.ipv4.tcp_congestion_control" = "bbr";
Comment on lines 118 to 119: We dropped both settings in #576.

  # "net.ipv4.tcp_congestion_control" = "cubic";

  # Increase TCP buffer sizes for increased throughput
  "net.ipv4.tcp_rmem" = "4096 1000000 16000000";
  "net.ipv4.tcp_wmem" = "4096 1000000 16000000";
  # Default kernel
  #net.ipv4.tcp_rmem = 4096 131072 6291456
  #net.ipv4.tcp_wmem = 4096 16384 4194304
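  # Each triple is "min default max" in bytes per socket: TCP starts a socket
  # at the middle value and autotunes up to the maximum, so this raises the
  # initial buffer to ~1 MB and the ceiling to 16 MB in both directions.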

  # https://github.com/torvalds/linux/blob/master/Documentation/networking/ip-sysctl.rst?plain=1#L1042
  # https://lwn.net/Articles/560082/
  "net.ipv4.tcp_notsent_lowat" = "131072";
  #net.ipv4.tcp_notsent_lowat = 4294967295
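  # Caps the not-yet-sent data queued per socket at 128 KiB: once the unsent
  # backlog exceeds this, the socket stops reporting itself writable, so the
  # application (rather than a large kernel buffer) decides what to send next.
  # The kernel default (4294967295) is effectively unlimited.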

  # Enable reuse of TIME-WAIT sockets globally
  "net.ipv4.tcp_tw_reuse" = 1;
  #net.ipv4.tcp_tw_reuse=2
  "net.ipv4.tcp_timestamps" = 1;
  "net.ipv4.tcp_ecn" = 1;

  # For machines with a lot of UDP traffic increase the buffers
  "net.core.rmem_default" = 26214400;
  "net.core.rmem_max" = 26214400;
  "net.core.wmem_default" = 26214400;
  "net.core.wmem_max" = 26214400;
  #net.core.optmem_max = 20480
  #net.core.rmem_default = 212992
  #net.core.rmem_max = 212992
  #net.core.wmem_default = 212992
  #net.core.wmem_max = 212992
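  # 26214400 bytes = 25 MiB. The *_default values are the initial
  # SO_RCVBUF/SO_SNDBUF for sockets without autotuning (TCP uses the
  # tcp_rmem/tcp_wmem triples above instead), and *_max is the ceiling an
  # application may request via setsockopt().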

  # Increase ephemeral ports
  "net.ipv4.ip_local_port_range" = "1025 65535";
  #net.ipv4.ip_local_port_range = "32768 60999"
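  # Widens the ephemeral range from ~28k ports (32768-60999) to ~64.5k, which
  # helps hosts that open many concurrent outbound connections; outgoing
  # connections may now also grab low registered ports, so hosts running
  # services there may want net.ipv4.ip_local_reserved_ports as well.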

  # detect dead connections more quickly
  "net.ipv4.tcp_keepalive_intvl" = 30;
  #net.ipv4.tcp_keepalive_intvl = 75
  "net.ipv4.tcp_keepalive_probes" = 4;
  #net.ipv4.tcp_keepalive_probes = 9
  "net.ipv4.tcp_keepalive_time" = 120;
  #net.ipv4.tcp_keepalive_time = 7200
  # probing starts after 120 s idle, then 4 probes * 30 s = 120 s,
  # so a dead peer is detected after roughly 4 minutes
  # default: 7200 s idle + 9 probes * 75 s = 7875 s, i.e. over 2 hours
};

# Make sure the serial console is visible in qemu when testing the server configuration
Review comment: @randomizedcoder Regarding the optimization you applied, maybe we should first agree on what use case we are optimizing for. Could you describe what type of server/client setup you based your considerations on? I would like to add this as a comment for future readers.

Something like that (feel free to change it however you want).