Make client and server to resync active connections #74
LGTM now, thanks
This seems to follow what we discussed during our meeting.
Given the potential deadlock mentioned in the comments below, I think this requires more testing. It should be tested in a custom Rancher build and it would be beneficial imo to have integration tests for this, though I understand that right now remotedialer doesn't have this in place.
e8d07f6 to 1bdf8ed
My comments are mostly nits at this point, since I am not yet familiar with this project. I may have to come back and think about the solution itself once I learn a bit more about remotedialer.
@maxsokolovsky do we have your final blessing to merge this, so that it has a chance to get into future 2.9.0 alphas?
Issue: 44576
Relates to #68
Depends on #78
Problem
remotedialer allows multiplexing connections between two peers over a single websocket connection by including a connection ID in the messages and using separate buffers. The protocol specifies different message types for different actions (`Connect`, `Data`, `Error`, `Pause`, `Resume`, etc.). In particular, the `Error` type is used to tell the other end that a certain connection must be closed. However, depending on the cause of the original error, this message may never be successfully transmitted, as the sender will give up on sending it (#67 adds additional logging for this situation).
When this happens, one of the peers will never receive a termination message for that connection, leaving the underlying buffers stuck on `Read()` forever and causing goroutine and memory leaks.
Solution
This PR adds a new message type to the protocol (`Resync`), whose payload contains a list of connection IDs. Similarly to how clients send `Ping` control messages, `Resync` messages will periodically tell the receiving peer that any connection not contained in the provided list is no longer needed and can be pruned.
Small caveat: we cannot use Control messages for this purpose, since the WebSocket protocol limits their payload to 125 bytes, which would impose a tight restriction on the number of connections.
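As a rough illustration of the pruning idea described above, here is a minimal Go sketch. It is not the code from this PR: the `session` type, the `encodeResync`, `decodeResync` and `prune` helpers, and the fixed 8-byte big-endian encoding of IDs are all assumptions made for the example.

```go
package main

import (
	"encoding/binary"
	"fmt"
	"sort"
)

// session is a hypothetical stand-in for one end of the multiplexed tunnel.
// It maps each connection ID to a callback that closes that connection.
type session struct {
	conns map[int64]func()
}

// encodeResync packs the active connection IDs into a payload.
// Assumption: each ID is written as a fixed 8-byte big-endian integer.
func encodeResync(ids []int64) []byte {
	buf := make([]byte, 8*len(ids))
	for i, id := range ids {
		binary.BigEndian.PutUint64(buf[8*i:], uint64(id))
	}
	return buf
}

// decodeResync is the inverse of encodeResync.
func decodeResync(payload []byte) ([]int64, error) {
	if len(payload)%8 != 0 {
		return nil, fmt.Errorf("resync payload length %d is not a multiple of 8", len(payload))
	}
	ids := make([]int64, len(payload)/8)
	for i := range ids {
		ids[i] = int64(binary.BigEndian.Uint64(payload[8*i:]))
	}
	return ids, nil
}

// prune closes and forgets every local connection whose ID is absent from
// the active list sent by the peer. It returns the pruned IDs in sorted order.
func (s *session) prune(active []int64) []int64 {
	keep := make(map[int64]struct{}, len(active))
	for _, id := range active {
		keep[id] = struct{}{}
	}
	var pruned []int64
	for id, closeFn := range s.conns {
		if _, ok := keep[id]; !ok {
			closeFn()
			delete(s.conns, id) // deleting during range is safe in Go
			pruned = append(pruned, id)
		}
	}
	sort.Slice(pruned, func(i, j int) bool { return pruned[i] < pruned[j] })
	return pruned
}

func main() {
	s := &session{conns: map[int64]func(){}}
	for _, id := range []int64{1, 2, 3, 4} {
		s.conns[id] = func() {}
	}

	// The peer reports that only connections 1 and 3 are still alive.
	payload := encodeResync([]int64{1, 3})
	active, err := decodeResync(payload)
	if err != nil {
		panic(err)
	}
	fmt.Println("pruned:", s.prune(active), "payload bytes:", len(payload))
}
```

Under this assumed encoding, a 125-byte control-frame payload would hold at most 15 connection IDs, which illustrates why a regular (non-control) message is the better fit for the list.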
CheckList