
quinn_udp: use async_io instead of tokio #1183

Closed
wants to merge 11 commits

Conversation

@kpp (Contributor) commented Aug 23, 2021

This is a continuation of #1180. I tried to keep both the changes by @dvc94ch and the original code; however, I had to make a hacky fix in cmsg.rs in order to fix the tests.

@dvc94ch all good?

@dvc94ch (Contributor) previously approved these changes Aug 23, 2021 and left a comment:

lgtm

use std::io::IoSliceMut;
use std::net::Ipv4Addr;
use std::time::Instant;
use proto::{EcnCodepoint, Transmit};
Review comment (Contributor) on the code above:

Not very happy about this. I guess it would be better to duplicate the definitions and convert them in quinn.

@Ralith (Collaborator) commented Aug 23, 2021

We should not spawn a hidden, redundant, global executor in the background, which is unfortunately the behavior of the async-io crate (see docs). A better solution to the problem of making quinn-udp tokio-independent would be to lift the executor-related work out of the crate entirely.

@dvc94ch (Contributor) commented Aug 23, 2021

Technically it's only a reactor, not an executor. Opened an issue a few days ago expecting this to come up: smol-rs/async-io#70

@@ -39,10 +37,6 @@ impl UdpSocket {
Self::from_std(socket)
}

pub fn socket_type(&self) -> SocketType {
self.ty
}
Review comment (Contributor) on the diff above:

Are we sure we don't use this in libp2p-quic? You need to be able to figure out whether you can dial IPv4 addresses when bound to an IPv6 address.

Reply (Contributor, author):

Ah, I see. I will revert this change if you are right.

@Ralith (Collaborator) commented Aug 24, 2021

opened an issue a few days ago expecting this to come up smol-rs/async-io#70

The solution I outlined also saves the extra dep and avoids compatibility issues with other runtimes, and prevents us from being blocked on upstream. I think that's a strong case for avoiding async-io entirely.

@kpp (Contributor, author) commented Aug 24, 2021

So far there is #1180 without touching tokio at all.

The solution I outlined

Would you be so kind as to elaborate? I know this PR is questionable, but I am happy to get some help from you.

@Ralith (Collaborator) commented Aug 25, 2021

So far there is paritytech/substrate#1180 without touching tokio at all.

Yeah, I haven't reviewed that yet but I don't have any fundamental objection. By "prevents us from being blocked" I meant the users who aren't using tokio, for whom paritytech/substrate#1180 alone isn't sufficient.

The solution I outlined was:

A better solution to the problem of making quinn-udp tokio-independent would be to lift the executor-related work out of the crate entirely.

To be more specific, this would take the form of removing the mio/tokio dependencies from quinn-udp, and instead exposing whatever (presumably little) additional access is necessary for downstream code to implement binding to a particular reactor.
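
To make that outline concrete, here is a hedged sketch of the direction described (not quinn-udp's actual API; names such as RuntimeAgnosticSocket, configure, try_send, and try_recv are illustrative, and the real crate would use sendmsg/recvmsg with control messages rather than plain send_to/recv_from). The point is that the crate keeps the socket-option and datagram I/O logic, never blocks, and never registers with any reactor, surfacing WouldBlock to the caller:

```rust
// Hypothetical sketch of a runtime-agnostic quinn-udp surface: the crate does
// socket setup and the non-blocking I/O; the caller owns registration/wakeups.
use std::io;
use std::net::{SocketAddr, UdpSocket};
use std::os::unix::io::{AsRawFd, RawFd};

pub struct RuntimeAgnosticSocket {
    socket: UdpSocket,
}

impl RuntimeAgnosticSocket {
    /// Put the socket into non-blocking mode; platform-specific options
    /// (ECN, pktinfo, MTU discovery, ...) would also be applied here.
    pub fn configure(socket: UdpSocket) -> io::Result<Self> {
        socket.set_nonblocking(true)?;
        Ok(Self { socket })
    }

    /// Non-blocking send; Err(WouldBlock) means "wait for writability and retry".
    pub fn try_send(&self, datagram: &[u8], dest: SocketAddr) -> io::Result<usize> {
        self.socket.send_to(datagram, dest)
    }

    /// Non-blocking receive; Err(WouldBlock) means "wait for readability and retry".
    pub fn try_recv(&self, buf: &mut [u8]) -> io::Result<(usize, SocketAddr)> {
        self.socket.recv_from(buf)
    }
}

/// Expose the raw descriptor so any runtime (tokio, async-io, ...) can
/// register it with its own reactor; the crate itself depends on neither.
impl AsRawFd for RuntimeAgnosticSocket {
    fn as_raw_fd(&self) -> RawFd {
        self.socket.as_raw_fd()
    }
}
```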

@Matthias247 (Contributor):

The performance/efficiency of quinn is already not great. By doing that you will make it even much much worse.

Please measure whether your changes actually fulfill whatever goals you have.

I'm also not sure that exposing more is actually the way to go. There have been discussions of larger-scale refactorings of the quinn crate that would run connection processing inside endpoint tasks to fix some issues and recover performance.

If a lot more internals are now used directly by external consumers, those consumers would break in rather severe ways.

@dvc94ch (Contributor) commented Aug 28, 2021

By doing that you will make it even much much worse.

Can you elaborate on what you're talking about? I don't understand.

@Matthias247 (Contributor):

Can you elaborate on what you're talking about? I don't understand.

Quinn at this point in time has somewhere between 30-50% of CPU efficiency of TCP+TLS (I hope you measured this before settling on QUIC 😉). Some amount of this degradation comes from connections and endpoints running on different tasks. It gets way worse if you run a multithreaded runtime and tasks migrate between threads. What will happen if you use another runtime? Exactly the same - you will also have more context switches and an inefficient system.

I would recommend that the parity folks just try to use the library as is, and understand the performance and efficiency implications of using QUIC before trying to change it. As far as I can tell from looking into paritytech/polkadot-sdk#536, there are some assumptions, but no numbers for anything. I can tell you upfront that you likely won't see any difference in congestion controller behavior - they follow the same patterns (apart from the Quinn version having more bugs and less production experience than a TCP stack). The multiplexing also isn't that different from what Yamux does. So will it help you? Maybe.
Go ahead, integrate it "as is" (which should be fine if you use TLS and start a tokio runtime for the endpoint as recommended by @Ralith), discover whether it actually fits your use case, and measure the real benefits and drawbacks.

At that point in time it seems more useful to discuss what could be changed - ideally in a way that benefits all users and doesn't add more code that needs to be maintained on all sides.

@dvc94ch (Contributor) commented Aug 28, 2021

which should be fine if you use TLS and start a tokio runtime for the endpoint as recommended by @Ralith

Well, that's propaganda: there is no performance difference between async-io and Tokio. It's fine if you prefer Tokio.

Addendum: and by "there is no performance difference" I mean tokio caught up. When async-std launched, it featured task spawning with a single allocation and, shortly after, API stability. tokio quickly caught up on performance but took, I think, another two years to gain stability. And even after it was stabilized at 1.0.0, its API is less ergonomic to use, the code base is more complicated, and the reactor/executor components are interdependent.

Quinn at this point in time has somewhere between 30-50% of CPU efficiency of TCP+TLS

that's true

@Ralith (Collaborator) commented Aug 28, 2021

I don't think @Matthias247 was trying to make any sort of point about the relative quality, let alone historical quality, of the implementations, just that doing everything on one thread is more efficient than context-switching, especially in wakeup-heavy workloads like a saturated network transport.

@dvc94ch (Contributor) commented Aug 28, 2021

The problem remains: we have two async runtimes, for better or worse, and I don't see that changing; that ship has sailed. Maybe once GATs are stabilized it will be easier to write runtime-agnostic code.

@Ralith (Collaborator) commented Aug 28, 2021

Sure. That doesn't have anything to do with this PR, which can and should be refactored to introduce runtime-independence by way of leaving async runtime operations to the next layer up.

@kpp marked this pull request as draft on August 30, 2021
@kpp (Contributor, author) commented Aug 30, 2021

Go ahead, integrate it "as is"

We can't until #1180 is merged and released =)

At that point in time it seems more useful to discuss what could be changed - ideally in a way that benefits all users and doesn't add more code that needs to be maintained on all sides.

Thank you all for your comments on this PR. I am not going to participate in a tokio vs async-std war. I feel sorry about all the flame and I am moving this PR to draft.

Let's come to a conclusion. What can really be changed in a reactor- (or runtime-) agnostic way is three functions: 1) initialize IP_RECVTOS, IP_MTU_DISCOVER, ...; 2) send datagrams; 3) recv datagrams. Moreover, the code must play well with both tokio and async-std. I am not even sure the juice is worth the squeeze =)

@djc (Member) commented Aug 30, 2021

If we cannot reach consensus on a way forward for this issue, is paritytech/substrate#1180 still useful?

@kpp (Contributor, author) commented Aug 30, 2021

I am not quite sure. I can use it to benchmark against the async-io implementation (with some additional work to satisfy our needs (socket_type)). But that’s it for now.

@sehz commented Aug 30, 2021

As another user of async-io, I would find it great to have quinn-udp independent of any executor.

@Ralith (Collaborator) commented Aug 30, 2021

What can really be changed in a reactor- (or runtime-) agnostic way is three functions: 1) initialize IP_RECVTOS, IP_MTU_DISCOVER, ...; 2) send datagrams; 3) recv datagrams. Moreover, the code must play well with both tokio and async-std. I am not even sure the juice is worth the squeeze =)

I stand by the approach I've outlined previously. Doing the right thing for the actual I/O operations with various non-standard syscalls on various platforms is the core value proposition of quinn-udp, and doesn't involve any particular runtime. Leave the registration and wakeup logic to the caller, and the overwhelming bulk of the UDP logic remains useful, operating on raw FDs or something spiritually equivalent as necessary.
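
As an illustration of "leave the registration and wakeup logic to the caller", here is a hedged sketch of the caller side using async-io; the loop shape is the illustrative part (a tokio caller could do the same with AsyncFd), and the plain recv_from here stands in for whatever non-blocking receive path the crate would actually expose:

```rust
// Sketch of a caller driving a non-blocking UDP socket with async-io.
use std::io;
use std::net::UdpSocket;

use async_io::Async;

async fn recv_loop(socket: UdpSocket) -> io::Result<()> {
    // Register the socket with async-io's reactor; quinn-udp never sees this type.
    let socket = Async::new(socket)?;
    let mut buf = [0u8; 1500];
    loop {
        // Wait until the reactor reports readability...
        socket.readable().await?;
        // ...then perform the actual non-blocking recv on the inner socket.
        match socket.get_ref().recv_from(&mut buf) {
            Ok((len, peer)) => println!("got {} bytes from {}", len, peer),
            // Spurious wakeup: go back to waiting.
            Err(e) if e.kind() == io::ErrorKind::WouldBlock => continue,
            Err(e) => return Err(e),
        }
    }
}
```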

@sehz commented Aug 30, 2021

One way to go forward is to create an abstraction that doesn't depend on any particular runtime. For example, instead of using std::net::UdpSocket directly, define a trait and work with dyn UdpSocket objects. This should also work for future io_uring-based runtimes.

@Ralith (Collaborator) commented Aug 30, 2021

There is no need to abstract over different UDP socket types because there are not, in fact, multiple possible OS-facing UDP socket types. We can work in terms of (newtyped?) OS resources and avoid any complexity. Every runtime should be able to extract a raw FD or equivalent from their own socket wrapper.
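
A minimal sketch of the "newtyped OS resource" idea (names are illustrative, not quinn-udp API): no dyn trait, just a thin wrapper around the one OS-facing socket type, which every runtime can register via its raw descriptor.

```rust
use std::net::UdpSocket;
use std::os::unix::io::{AsRawFd, RawFd};

/// Thin owned wrapper over the single OS-facing UDP socket type; no trait object.
pub struct OsUdpSocket(pub UdpSocket);

impl AsRawFd for OsUdpSocket {
    fn as_raw_fd(&self) -> RawFd {
        self.0.as_raw_fd()
    }
}

// A tokio caller could then build `AsyncFd::new(sock.0)`, an async-io caller
// `Async::new(sock.0)`; quinn-udp itself never depends on either runtime.
```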

@Matthias247 (Contributor):

Thank you all for your comments on this PR. I am not going to participate in a tokio vs async-std war. I feel sorry about all the flame and I am moving this PR to draft.

I'm sorry that you felt that way. However, as @Ralith mentioned, this was never intended as a discussion of the differences between async runtimes; someone else started that comparison. The discussion was 100% about the fact that any addition of threads will severely reduce the performance and scalability of the library.

I also still stand by my first points: you should start by figuring out your actual goal and have quantified criteria for whether you achieved it or not. Right now this PR just changes code without describing any goal or providing any numbers. In fact, the PR doesn't even have a description that tells anyone what the motivation is. If this were merged, how would anyone know whether the goal had been achieved?

I could now go ahead and measure the content of this PR against my performance goals - but even if I did, how would I know whether it helps you?

And again: please don't take this in a negative way. Take it as feedback on how to make your project succeed, which involves determining goals, quantifying whether those goals have been reached, and figuring out where best to spend engineering effort.

@Ralith (Collaborator) commented Jul 3, 2022

Subsumed by #1364.

@Ralith closed this on Jul 3, 2022.
@kpp deleted the udp_use_async_io branch on July 3, 2022.