Proposal: Back-pressure on opening new streams on yamux #471
The number of streams is probably the thing that peers will impose a limit on, not the number of pending streams. So the "pending stream budget" seems like the wrong thing to measure. A global stream limit is not that useful to begin with. What we need in libp2p is a limit per protocol. This would require a coordination mechanism running on top of a stream muxer.
Thanks for the input! At least from an implementation PoV, I wouldn't immediately know what a useful starting point for the max number of streams is, and without an extension to the messages in yamux, I don't think we can communicate a limit directly. In case both nodes are equally fast in processing streams, it seems okay to me to have them open more streams; that is already the situation today. Also see libp2p/rust-libp2p#3039, which started this thought process. Setting a limit on the pending streams only triggers if the receiving node slows down in processing. A downside of this proposal is that I think we can never increase the budget. Limiting the total number of streams seems too much.
To clarify my intention: I am looking at this from a bugfix point of view. What is the best value-for-money solution so we can get some kind of backpressure into yamux without investing heaps of time? It is a legacy muxer from a spec PoV but is still widely used at this point (i.e. every rust-libp2p deployment unless they are using mplex for some reason).
I’m not yet convinced of adding some sort of back-pressure to yamux in such a fashion: existing nodes will still translate the back-pressure signal as UnsupportedProtocols, leading to false failures in the existing node, thus breaking network compatibility. This looks like a bug that cannot be fixed without switching to a new muxer (which could be a rebranding of yamux, if need be).
The above proposal does not intentionally drop streams. What it does is to stop opening new streams in case the other party starts to drop streams. Once libp2p/rust-libp2p#3071 is fixed, no streams will be dropped unless one node can actually not keep up with the load (i.e. is working at 100% accepting new streams). This would be shipped completely independent of any buffering implementation. With libp2p/rust-yamux#142, we are getting a
We seem to have a different perspective on network compatibility. A new version could, of course, stop opening new streams once it concludes that that is a good idea, but existing versions won’t do that — they will just understand “unsupported protocol” and draw incorrect conclusions from that. There is nothing you can do to change that.
An implementation that draws conclusions from streams being dropped by the remote (i.e. this proposal) does not necessarily have to drop streams itself. That is orthogonal. I originally intended to ship this in the following order:
This by itself is already a fairly good measure, assuming your peer does not apply any explicit backpressure but just tries to process streams as fast as possible. If you open streams faster than your peer can process them, streams awaiting their ACKs will naturally pile up and you will eventually stop opening new ones until the peer has answered them. My suggestion for implementing this would be to wait for all streams to be ACKed before we start opening new ones.
We can roll this out in a backwards-compatible way by first trying to negotiate
Yes, this matches what I meant above by “rebranding of yamux”. I haven’t yet thought through all the details — if you consider error handling then the scheme to be implemented will likely be similar to Reactive Streams (considering substream requests as
Thank you for linking this. I think our current interface almost matches this:
The backpressure bit (returning
If I understand correctly, this proposal has now changed to rolling out a new yamux version. That makes sense, since it’s the only way to solve the problem that doesn’t rely on extrapolating from incomplete information. However, rolling out a new muxer version is really no different from rolling out a new muxer. I can think of a lot more things I’d like to change about yamux than just adding the “Stream Buffer Update” (and I’m not convinced it’s the easiest / most elegant solution for the problem). In fact, we already have a proposal for a more well-behaved stream muxer: qmux (#394).

Operationally, I’d consider #446 a prerequisite for rolling out a new muxer. Paying a 1 RTT penalty if the peer doesn’t support the muxer is probably prohibitive, and updating the network to a state where you can just assume that the new muxer (version) is supported by the vast majority of nodes takes a long time.

Priority-wise, I’m a bit skeptical of both approaches (that’s why progress on #394 has stalled). Our resource control problem would be much better solved by introducing per-protocol stream limits than by a global stream limit. Furthermore, the vast majority of our connections are QUIC connections anyway, so any improvement on the muxer side would only apply to a small minority of connections.
I agree with @marten-seemann, those are several good points. One aspect I see differently concerns the layering of back-pressure: I think there is a place for limiting new substreams regardless of protocol, purely at the muxer level. When a new substream is accepted and its protocol negotiated, the respective handler will need to allocate resources for it, so it makes sense to add a back-pressure signal from handler to muxer as well. However, this is a bigger change in the ecosystem, since now a protocol may be temporarily unavailable on a connection. This deserves a different response code than a permanent negotiation failure to allow the other side to come back later. Communicating a per-protocol budget for new substreams between peers seems excessive to me; I’ll come back to this point.

@thomaseizinger The crucial point of Reactive Streams is not in the semantics of onNext/onError/onComplete (those are present already in the non-back-pressured Rx.Net ancestor), it is the semantics of

Coming back to the last point of the first paragraph: this is why I think this principle should be employed on the muxer level for all new substreams. Propagating demand for all possible protocols is too costly, complex, and in some cases not even possible (since e.g. two protocol names may share the same providing resources).
There is a difference in how much work you need to invest in creating that new muxer (in each language), agreeing on what its functionality should be, etc. That is likely a timeline of months.
That depends entirely on which network you are looking at. IPFS? Probably. A smaller deployment that consists entirely of rust-libp2p based nodes? Likely easier.
This is only true for

Like I said above, my intention here is a low-effort high-impact solution. The suggested changes are fairly trivial to implement on the

I think there is value in doing that, perhaps for now only on the Rust side of things if it benefits some of our users.
Can you explain this again mapped to our problem domain? I am struggling with the abstract description around reactive streams. Here is my proposal again, as an example:
Assuming peer A opens 10 streams, the number of pending (no ACK received yet) streams is 10. Peer B processes 3 streams and thus sends 3 ACK frames back. It read all 10 SYN frames, but because the upper application has only pulled 3 streams, 7 remain in an internal buffer.

To exercise back-pressure, we need to limit the number of pending streams that peer A can have on their end. Say we configure our yamux client with a maximum of 20 pending streams. In the above example, peer A would then have a budget of 13 more streams (7 are still pending out of 20 max). Every time a stream is consumed on peer B, an ACK is sent and thus the budget is increased again.

My proposal allows for this number to be dynamic. We could also hard-code some value on the producer side (or let the user configure it). It would result in largely the same outcome, except that we have to "trust" the other party that they are honoring the limit. If we send the limit to the other party, we can detect a protocol violation and close the connection. Perhaps this would be an even better change to make, at least to start with. It is fully backwards-compatible if we use an unbounded buffer on the receiver side.
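To make the accounting concrete, here is a minimal sketch of the sender-side budget described above. All names (`PendingStreamBudget`, `try_open`, etc.) are made up for illustration and are not an existing rust-yamux API:

```rust
/// Illustrative sender-side budget for pending (un-ACKed) outbound streams.
struct PendingStreamBudget {
    /// Either advertised by the peer (the proposed dynamic limit) or a
    /// locally configured/hard-coded default.
    max_pending: u32,
    /// SYN sent, no ACK received yet.
    pending: u32,
}

impl PendingStreamBudget {
    fn remaining(&self) -> u32 {
        self.max_pending.saturating_sub(self.pending)
    }

    /// Called before opening a new stream; `false` means back-pressure,
    /// i.e. wait for more ACKs before opening.
    fn try_open(&mut self) -> bool {
        if self.pending < self.max_pending {
            self.pending += 1; // a SYN frame goes out
            true
        } else {
            false
        }
    }

    /// The peer accepted one of our streams and ACKed it.
    fn on_ack(&mut self) {
        self.pending = self.pending.saturating_sub(1);
    }
}

// Numbers from the example: max_pending = 20, 10 SYNs sent, 3 ACKs received
// => pending = 7 and remaining() = 13 more streams may be opened.
```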
Disclaimer: for the following I assume that

What you describe is similar to a particular way of using Reactive Streams: have an initial budget (implied) and piggy-back the interpretation

Dynamic updates would only work in one direction, though, because a

I agree that it is interesting to consider compatible improvements. As we seem to agree on, this can only work by voluntarily limiting the request sender (as no quota or enforcement thereof has been built into existing nodes). Without a means to communicate limits, the sender would have to assume some default limit and stick to it. Meanwhile the receiver needs to be prepared to receive an unbounded rate of new substream requests — which will eventually lead to failures (that should be confined to one connection if possible, by using quotas). It is crucial to remember that in this world there is no back-pressure per se, just polite clients and failures. This will lead to resource quotas differing substantially from back-pressured stream buffer sizes.
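For comparison, the Reactive Streams style of back-pressure makes the budget an explicit credit grant from the receiver rather than something inferred from ACKs. A rough sketch of that idea mapped to streams (purely illustrative; none of these names exist in any libp2p codebase):

```rust
/// Receiver side: hands out credit for `n` more inbound streams, analogous
/// to Reactive Streams' demand signalling.
struct StreamReceiver {
    granted: u64, // credit handed out but not yet consumed by the opener
}

impl StreamReceiver {
    /// Called whenever the application frees up capacity; the returned value
    /// would be communicated to the opener as "you may open n more streams".
    fn grant(&mut self, n: u64) -> u64 {
        self.granted += n;
        n
    }
}

/// Opener side: consumes one credit per stream it opens.
struct StreamOpener {
    credit: u64,
}

impl StreamOpener {
    fn on_grant(&mut self, n: u64) {
        self.credit += n;
    }

    fn try_open(&mut self) -> bool {
        if self.credit > 0 {
            self.credit -= 1;
            true
        } else {
            false // no demand signalled by the receiver yet
        }
    }
}
```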
This race condition exists but I'd argue that only a naive implementation would be triggered by it. A stream buffer update can only come into effect (and thus trigger a protocol violation) after the number of pending streams has dropped below the new value (or maybe even 0). I'd consider this an edge-case though: I wouldn't know under which condition a client changes their configured stream buffer size at runtime. Really, the only reason why we even send this value is because we can't assume one from the other party (with 0 being the only safe exception). I'd expect implementations to send this value only once and have it remain constant throughout the session.
Perhaps that is good enough though? I wonder whether that would add delay and if so, how much. My takeaway from this is that going for
That will cause a huge performance regression, depending on the application protocol. Imagine HTTP layered on top of libp2p, and the user loading a website with 50 sub-resources. This would now take 50 RTTs instead of a single RTT.
What happens when this limit is reached?
Yes, that is a good question; also: do you @marten-seemann mean a backlog in the receiver or a quota in the sender?

@thomaseizinger I don’t think we’re on the same page regarding what’d be broken with dynamically lowering buffer size, but that is irrelevant if we take the compatibility route. In that latter case I don’t think we need to dial up the politeness level to ask for permission for every single substream. Having any fixed pending-ACK budget would be an improvement already, so the bound can also be greater than one.
The accept backlog is the number of streams that we have opened but haven't received an ACK for. Once that limit is reached, we won't allow opening new streams until an ACK is received.

Quick comment on terminology: substreams are not a thing (when it comes to libp2p specs). It's called streams. :)
Okay, so that is what I called "pending streams" above. Good to know the go implementation has this feature. I am planning to introduce the same on the Rust side, will also go with a default of 256 then!
Yeah, "Stream" means something very specific in Rust: https://docs.rs/futures/latest/futures/prelude/trait.Stream.html I'd love to be consistent across the ecosystem here but I am not sure we can transition to that fully without causing a lot of ambiguity :/ |
Side note: I think it would be worth writing a "spec" document in here that outlines how we use yamux specifically and, for example, recommends an accept backlog of 256, etc.
I've now implemented this in rust-yamux: libp2p/rust-yamux#153

tl;dr: I think I have a solution for back-pressure. In the above PR, I implemented two tests:
The combination of the above two solves back-pressure, assuming all implementations agree on the same magic number 256 and each implementation has a buffer of 256 streams. In case node A is slower than node B, the buffer on node A will eventually be full. Because the application hasn't written to any of the streams yet, they are not acknowledged. Once node A catches up and processes a new stream, it will acknowledge it to node B, which gives B another credit for opening a new stream whilst at the same time freeing a buffer slot on node A.

This hits a sweet spot in my eyes. We don't need to roll out a new muxer to networks but still manage to not drop streams between compliant implementations. The only downside is that the limit isn't configurable, but I think that is acceptable for the benefit of introducing this in a backwards-compatible way. I'd consider this issue to be solved if we extend the spec with:
@marten-seemann Let me know what you think!
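A sketch of the receiving half of that scheme (hypothetical types, not the actual rust-yamux implementation): inbound streams sit in a bounded buffer and are only ACKed once the application actually accepts them, which is what hands the opener its credit back.

```rust
use std::collections::VecDeque;

/// The magic number discussed above: both the opener's ACK backlog and the
/// receiver's buffer size.
const ACK_BACKLOG: usize = 256;

/// Illustrative inbound side; `S` stands in for whatever type represents a
/// newly received stream.
struct InboundStreams<S> {
    buffer: VecDeque<S>, // SYN seen, not yet accepted by the application
}

impl<S> InboundStreams<S> {
    /// Called when a SYN for a new inbound stream arrives. With both sides
    /// honouring the same backlog of 256, the error branch should never be hit.
    fn on_inbound(&mut self, stream: S) -> Result<(), S> {
        if self.buffer.len() >= ACK_BACKLOG {
            return Err(stream); // caller would reset (RST) the stream
        }
        self.buffer.push_back(stream);
        Ok(())
    }

    /// Called when the application polls for the next inbound stream. Only at
    /// this point is the ACK sent, giving the opener credit for a new stream.
    fn accept(&mut self) -> Option<S> {
        let stream = self.buffer.pop_front()?;
        // send_ack(&stream); // deferred ACK would happen here
        Some(stream)
    }
}
```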
I'm surprised that this is still considered a problem worth solving. I thought we had agreed to not pour more engineering time into yamux.
That's not a valid assumption. 256 is a pretty high number, and I'd be opposed to requiring all implementations to support that many.
It is a problem on the

Regardless, most of the work (which wasn't much) was to make yamux follow the go implementation to also support the ACK backlog of 256.
I went off your comment here: #471 (comment)

I guess with the default window size of 256KB, this could amount to 256KB * 256, which is 64MB of buffered data. However, aren't you implicitly requiring this already anyway with an ACK backlog of 256? What is the point of opening up to 256 streams if you don't expect the other party to buffer them until it is ready to process them?
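For reference, the buffering figure quoted above works out as follows (assuming the default 256 KiB per-stream receive window mentioned in the comment):

```rust
const RECEIVE_WINDOW: u64 = 256 * 1024; // default per-stream window in bytes
const ACK_BACKLOG: u64 = 256;           // un-ACKed streams the opener may have in flight

// 256 KiB * 256 streams = 67_108_864 bytes = 64 MiB of potentially buffered data.
const WORST_CASE_BUFFERED: u64 = RECEIVE_WINDOW * ACK_BACKLOG;
```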
Yes, you can't, and you shouldn't. Relying on your muxer to provide backpressure is the wrong thing to hardcode without #520 anyway.
I'm not opposed to that number in go-libp2p, but making it a requirement for all implementations is a completely different story.
I am happy to change the wording to SHOULD instead of MUST. Does that work for you?
**rust-libp2p yamux implementation**

Documenting an out-of-band discussion with @marten-seemann here. Unless I am missing something, libp2p/rust-yamux#153 only implements what go-libp2p-yamux does already today, that is, tracking a stream accept backlog and adding an upper limit on the outbound side. It does not introduce a new mechanism, nor does it introduce an incompatibility with any existing implementation. Please confirm, @thomaseizinger.

**libp2p yamux specification**

I'm with @marten-seemann on not including new requirements (like MUST and MUST NOT) into the Yamux specification. Even though a multi-stage rollout strategy would theoretically be possible, it seems like too much effort. In my opinion, our time is better spent focusing on QUIC. It seems like we're all on the same page here, @thomaseizinger and @marten-seemann.
Adding a recommendation (not a requirement) to the specification sounds good to me. As far as I understand, this would just update the specification to reflect the status quo, that is, what go-libp2p already does today. Similar to how #519 documented the outbound side of the ACK-backlog, we can document the inbound side of the ACK-backlog. Replacing MUST and MUST NOT with SHOULD and SHOULD NOT in @thomaseizinger's suggestion above works for me:
The entire ACK-backlog section would then read as follows:
Yamux does not offer any form of control over how many streams can be opened by a remote.
This has been an annoyance for a while and transports like QUIC are useful because they do support managing the number of open streams.
I'd like to propose what I believe is a backwards-compatible extension of the yamux spec that gives us some amount of back-pressure for streams.
The yamux specification states, for opening a new stream:
My proposal is as follows:
I think this is backwards-compatible because existing implementations already need to handle receiving `RST` for a stream at any point. What we are doing here is adding meaning to receiving `RST` whilst we have several pending substreams. This allows us to implement a bounded buffer on the receiver side where any stream exceeding that buffer will be answered with a `RST` and be dropped. This means we will have 1 failed stream per physical connection, which allows us to learn the other party's buffer size.

cc @mxinden @marten-seemann @rkuhn
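A sketch of how the opener could act on that signal (illustrative names only; nothing here is an existing implementation): a `RST` that arrives while streams are still pending is interpreted as "the peer's buffer is full", and the backlog observed at that moment is remembered as the peer's buffer size.

```rust
/// Illustrative opener-side state for the RST-based proposal.
struct Opener {
    pending: usize,               // streams opened, not yet ACKed
    learned_limit: Option<usize>, // peer's buffer size, learned from a RST
}

impl Opener {
    fn can_open(&self) -> bool {
        match self.learned_limit {
            Some(limit) => self.pending < limit,
            None => true, // no back-pressure signal observed yet
        }
    }

    fn on_open(&mut self) {
        self.pending += 1;
    }

    fn on_ack(&mut self) {
        self.pending = self.pending.saturating_sub(1);
    }

    /// A RST while we have pending streams means the peer's buffer overflowed:
    /// the reset stream is the one that exceeded it, so the buffer size is
    /// roughly the current backlog minus one.
    fn on_rst(&mut self) {
        if self.pending > 0 && self.learned_limit.is_none() {
            self.learned_limit = Some(self.pending - 1);
        }
        self.pending = self.pending.saturating_sub(1);
    }
}
```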