Improve collator networking #1526
Somehow this made things substantially worse. Not sure why. It seems to be the worst of both worlds.
Actually, it seems much better now. Perhaps we were seeing another issue related to #1517 which is apparently causing validators to crash. Will hold out for better results. One change I can imagine is tweaking the timer to be 2.5s, which should work even better than 4s as it gives 3.5s for connections to be fully closed before reopening.
I'm not sure if adding more time will help connections to "fully close". If the node is not sending anything over the substream during that time, it won't be able to detect that the substream was closed and accept a new one.
It seemed to help in practice, but can you elaborate? How do you force a substream to close if
@altonen could you remind me why you are against merging this fix now?
Which PR fixed the insufficient inbound slots issue? We adjusted the ratios a while back but I don't count that as a fix, and AFAIR there hasn't been any direct PR addressing the issue because it's not something we can fix in [...]. I'm not against merging the PR but it's more of a hack than an actual fix. IMO the proper way to fix this would be to detect when the inbound substream is closed and then consider the connection closed.
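To illustrate the distinction being discussed, here is a minimal Rust sketch of the two policies: tolerating a half-open notification link versus treating a closed inbound substream as a closed connection. All names (`Substream`, `LinkState`, `NotificationLink`, and both methods) are hypothetical and exist only for this example; this is not the networking crate's actual API.

```rust
/// State of one direction of a notification substream (hypothetical type).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum Substream {
    Open,
    Closed,
}

/// How the link as a whole is classified (hypothetical type).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum LinkState {
    Open,
    HalfOpen,
    Closed,
}

/// A notification link made of one inbound and one outbound substream.
struct NotificationLink {
    inbound: Substream,
    outbound: Substream,
}

impl NotificationLink {
    /// Policy that tolerates half-open links: the link only counts as
    /// closed once both directions are closed.
    fn state_allowing_half_open(&self) -> LinkState {
        match (self.inbound, self.outbound) {
            (Substream::Open, Substream::Open) => LinkState::Open,
            (Substream::Closed, Substream::Closed) => LinkState::Closed,
            _ => LinkState::HalfOpen,
        }
    }

    /// Stricter policy suggested in the comment above: as soon as the
    /// inbound substream is closed, the whole link is considered closed.
    /// An outbound-only close is still reported as half-open here, which
    /// is an assumption of this sketch.
    fn state_strict(&self) -> LinkState {
        match (self.inbound, self.outbound) {
            (Substream::Closed, _) => LinkState::Closed,
            (Substream::Open, Substream::Open) => LinkState::Open,
            (Substream::Open, Substream::Closed) => LinkState::HalfOpen,
        }
    }
}
```

Under the strict policy a link with a closed inbound side is reported as closed right away, so a new inbound substream from the same peer could be accepted without waiting on a timer.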
I merely meant adjusting the ratios when I wrote about the fix.
Thanks for the comment, I forgot the discussion about not allowing half-open notification substreams, and that was essentially what I was asking about.
@dmitry-markin please close if not required.
Fixes #1525
This alters the behavior of collators. We now only disconnect from reserved peers between leaves (after new candidates are likely to have been submitted). This avoids some race conditions such as #1499, where disconnections were being immediately followed by reconnections.
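The behavior described above can be sketched roughly as follows. This is a simplified illustration, not the code in this PR: `DeferredDisconnects`, `PeerId`, and the method names are hypothetical. The 2.5s delay used in the test is the value floated in the conversation and is not confirmed as the PR's default.

```rust
use std::time::{Duration, Instant};

/// Hypothetical peer identifier; the real code uses its own type.
type PeerId = u32;

/// Queues disconnect requests for reserved peers and releases them only
/// once a fixed delay has elapsed since the most recent leaf.
struct DeferredDisconnects {
    delay: Duration,
    last_leaf: Option<Instant>,
    pending: Vec<PeerId>,
}

impl DeferredDisconnects {
    fn new(delay: Duration) -> Self {
        Self { delay, last_leaf: None, pending: Vec::new() }
    }

    /// Record that a new leaf was observed; this restarts the delay.
    fn on_new_leaf(&mut self, now: Instant) {
        self.last_leaf = Some(now);
    }

    /// Queue a disconnect instead of performing it immediately.
    fn request_disconnect(&mut self, peer: PeerId) {
        if !self.pending.contains(&peer) {
            self.pending.push(peer);
        }
    }

    /// Return the peers that may be disconnected now. The list is empty
    /// until `delay` has passed since the last leaf, or if no leaf has
    /// been seen yet.
    fn poll(&mut self, now: Instant) -> Vec<PeerId> {
        match self.last_leaf {
            Some(leaf) if now.saturating_duration_since(leaf) >= self.delay => {
                std::mem::take(&mut self.pending)
            }
            _ => Vec::new(),
        }
    }
}
```

Queuing the request rather than acting on it keeps disconnections away from the moment a new leaf arrives, which is when a reconnection to the same peer is most likely to follow.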