Full support for websocket reconnection/resubscription #1966
When using `.on<event>=fn` to attach listeners, only one listener can be set at the same time. Since multiple request managers can use the same provider, the EventTarget API has to be used to ensure all of them receive the events emitted from the provider. This is needed on both the `on` and `removeListener` functions.
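As a rough illustration of the difference, the sketch below contrasts property-style handlers with `addEventListener`-style registration. `FakeSocket` and `ProviderSketch` are hypothetical stand-ins written for this example; they are not the actual web3 provider code nor a real WebSocket.

```javascript
// Minimal stand-in for a WebSocket; NOT the real browser or `websocket` API.
class FakeSocket {
  constructor() {
    this.listeners = new Map(); // event type -> Set of callbacks
  }
  addEventListener(type, fn) {
    if (!this.listeners.has(type)) this.listeners.set(type, new Set());
    this.listeners.get(type).add(fn);
  }
  removeEventListener(type, fn) {
    const set = this.listeners.get(type);
    if (set) set.delete(fn);
  }
  // Helper for the demo: dispatch to the `on<type>` property and to all listeners.
  emit(type, payload) {
    if (typeof this['on' + type] === 'function') this['on' + type](payload);
    // Iterate over a copy so removals during dispatch do not disturb the loop.
    for (const fn of [...(this.listeners.get(type) || [])]) fn(payload);
  }
}

// Hypothetical provider whose `on`/`removeListener` delegate to the listener API.
class ProviderSketch {
  constructor(socket) { this.connection = socket; }
  on(type, callback) { this.connection.addEventListener(type, callback); }
  removeListener(type, callback) { this.connection.removeEventListener(type, callback); }
}

const socket = new FakeSocket();
const log = [];

// Property assignment: the second handler silently replaces the first.
socket.onerror = () => log.push('property A');
socket.onerror = () => log.push('property B');
socket.emit('error');
socket.onerror = null;

// Listener registration: both "request managers" are notified.
const provider = new ProviderSketch(socket);
const managerA = () => log.push('manager A');
const managerB = () => log.push('manager B');
provider.on('error', managerA);
provider.on('error', managerB);
socket.emit('error');

// Removing one listener leaves the other in place.
provider.removeListener('error', managerA);
socket.emit('error');

console.log(log);
```

With property assignment only the last handler runs, while the listener-based `on` reaches every registered manager and `removeListener` detaches exactly one of them.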
The `once` method lets the subscription logic detect whether the provider is able to reconnect/resubscribe and, if so, attach the resubscribe function to the next `connect` event.
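A minimal sketch of that check, assuming Node's built-in `EventEmitter` as a stand-in for the socket. `ProviderSketch` and `resubscribeOnNextConnect` are hypothetical names used only for illustration.

```javascript
const { EventEmitter } = require('events');

// Hypothetical provider; Node's EventEmitter stands in for the real socket.
class ProviderSketch {
  constructor() { this.connection = new EventEmitter(); }
  on(type, callback) { this.connection.on(type, callback); }
  once(type, callback) { this.connection.once(type, callback); }
}

// Hypothetical helper: schedule a resubscription only if the provider has `once`.
function resubscribeOnNextConnect(provider, resubscribe) {
  if (typeof provider.once !== 'function') return false; // provider cannot recover
  provider.once('connect', resubscribe);
  return true;
}

const provider = new ProviderSketch();
let resubscriptions = 0;
const scheduled = resubscribeOnNextConnect(provider, () => { resubscriptions += 1; });

provider.connection.emit('connect'); // triggers the resubscription
provider.connection.emit('connect'); // ignored: the listener was one-shot

// A provider without `once` is treated as unable to reconnect.
const legacyProvider = { on() {} };
const legacyScheduled = resubscribeOnNextConnect(legacyProvider, () => {});

console.log(scheduled, resubscriptions, legacyScheduled);
```

Because the listener is one-shot, a later `connect` event does not trigger a second resubscription unless a new failure schedules one.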
When the subscription fails on start and when it fails after it was successfully established, the same logic needs to be executed: remove subscription, listen for the next `connect` event if available to actually subscribe again, emit the error and call the callback. Prior code did that only for established subscriptions so if a subscription was unable to be set right on start, no resubscription was ever tried. The logic was moved to a single method to avoid duplication of code. In addition reentry is avoided by checking and properly clearing the `_reconnectIntervalId` variable.
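The single failure path described above could look roughly like the following sketch. It uses a boolean guard (`waitingForConnect`, a hypothetical name) in place of `_reconnectIntervalId`, and `SubscriptionSketch` is an illustrative class, not the library's `Subscription`.

```javascript
const { EventEmitter } = require('events');

class SubscriptionSketch extends EventEmitter {
  constructor(provider, callback) {
    super();
    this.provider = provider;       // anything exposing `once(type, fn)`
    this.callback = callback;       // user callback, invoked as callback(err)
    this.active = false;
    this.waitingForConnect = false; // reentry guard (hypothetical name)
    this.subscribeCalls = 0;
  }

  subscribe() {
    this.subscribeCalls += 1;
    this.waitingForConnect = false; // clear the guard so later failures can recover
    this.active = true;
  }

  // Single failure path, used both when subscribing fails on start and
  // when an already established subscription breaks.
  handleFailure(err) {
    this.active = false; // 1. remove the subscription
    if (!this.waitingForConnect && typeof this.provider.once === 'function') {
      this.waitingForConnect = true; // 2. resubscribe on the next `connect`, only once
      this.provider.once('connect', () => this.subscribe());
    }
    this.emit('error', err); // 3. emit the error
    this.callback(err);      // 4. call the callback
  }
}

const provider = new EventEmitter();
const errors = [];
const sub = new SubscriptionSketch(provider, (err) => errors.push(err.message));
sub.on('error', () => {}); // Node's EventEmitter throws on unhandled 'error' events

sub.handleFailure(new Error('could not subscribe on start'));
sub.handleFailure(new Error('connection lost'));
const pendingListeners = provider.listenerCount('connect');

provider.emit('connect'); // the single pending listener resubscribes

console.log(pendingListeners, sub.subscribeCalls, errors);
```

Two consecutive failures leave only one pending `connect` listener, so the subscription is re-established once while both errors still reach the user callback.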
On subscribe, if there is an existing `id`, the subscription listeners are removed. In the case of a resubscription, the listeners have to be kept. Therefore, the `id` property, which will change anyway, must be cleared so the listeners are not removed. Then, after the subscription object resubscribes, the listeners set by the subscription user code remain untouched, making the resubscription transparent to the user code.
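A small sketch of why clearing `id` matters, under the assumption that `subscribe` drops listeners whenever an `id` is present. `SubscriptionSketch`, `resubscribe`, and the `sub-N` identifiers are hypothetical and only illustrate the idea.

```javascript
const { EventEmitter } = require('events');

class SubscriptionSketch extends EventEmitter {
  constructor() {
    super();
    this.id = null;
    this.nextId = 1; // fake id generator standing in for the node's response
  }

  subscribe() {
    if (this.id) {
      // Subscribing over a live subscription starts clean: drop user listeners.
      this.removeAllListeners();
    }
    this.id = 'sub-' + this.nextId++;
    return this;
  }

  resubscribe() {
    // The node assigns a new id anyway; clearing the old one keeps the listeners.
    this.id = null;
    return this.subscribe();
  }
}

const sub = new SubscriptionSketch().subscribe();
const received = [];
sub.on('data', (value) => received.push(value));

sub.resubscribe();
sub.emit('data', 'after resubscribe'); // the user listener is still attached
const idAfterResubscribe = sub.id;

sub.subscribe(); // a plain subscribe with an existing id removes the listeners
const listenersAfterSubscribe = sub.listenerCount('data');

console.log(received, idAfterResubscribe, listenersAfterSubscribe);
```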
When the request manager removes a subscription due to an error, it tries to send an unsubscribe request, which can also fail if, e.g., the network is down. In such a case, the function must not allow reentry. Removing the subscription first ensures it will not do so. In addition, if the subscription was already removed, the callback shall be called anyway.
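The ordering can be sketched as follows. `RequestManagerSketch` and the injected `sendUnsubscribe` function are hypothetical; the demo simulates a failing unsubscribe whose error handling tries to remove the same subscription again.

```javascript
class RequestManagerSketch {
  constructor(sendUnsubscribe) {
    this.subscriptions = new Map();
    this.sendUnsubscribe = sendUnsubscribe; // (id, callback) => void; may fail
    this.unsubscribeRequests = 0;
  }

  addSubscription(id) {
    this.subscriptions.set(id, { id });
  }

  removeSubscription(id, callback) {
    if (!this.subscriptions.has(id)) {
      // Already removed: nothing to send, but the caller still gets an answer.
      if (callback) callback(null);
      return;
    }
    // Delete first, so a failure while unsubscribing cannot re-enter this branch.
    this.subscriptions.delete(id);
    this.unsubscribeRequests += 1;
    this.sendUnsubscribe(id, callback);
  }
}

const callbacks = [];
const manager = new RequestManagerSketch((id, callback) => {
  // Simulated network failure: error handling attempts the removal again.
  manager.removeSubscription(id, () => callbacks.push('reentrant call answered'));
  if (callback) callback(new Error('network down'));
});

manager.addSubscription('0x1');
manager.removeSubscription('0x1', (err) => callbacks.push(err ? 'failed' : 'ok'));

console.log(manager.unsubscribeRequests, callbacks);
```

Only one unsubscribe request is sent even though removal is attempted twice, and both callers receive their callback.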
When error events are emitted by the provider, all subscriptions shall receive the event and trigger the unsubscription/resubscription logic.
By wrapping the available WebSocket implementation (native WebSocket object or `websocket` package) with `websocket-reconnector`, the provider is given a WebSocket that will automatically reconnect on errors. A new option was added to the WebSocket provider to control whether it should automatically reconnect or behave as usual.
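The general idea of such a wrapper can be sketched as below. This is not the `websocket-reconnector` API: `ReconnectingSocketSketch`, the `createSocket` factory, and the `autoReconnect` option name are all assumptions made for illustration, and the retry happens immediately instead of after a delay.

```javascript
const { EventEmitter } = require('events');

class ReconnectingSocketSketch extends EventEmitter {
  constructor(createSocket, options = {}) {
    super();
    this.createSocket = createSocket;                    // factory returning a fresh socket
    this.autoReconnect = options.autoReconnect === true; // hypothetical option name
    this.attempts = 0;
    this.open();
  }

  open() {
    this.attempts += 1;
    this.socket = this.createSocket();
    this.socket.once('close', () => {
      this.emit('close');
      // A real wrapper would wait (typically with backoff) before retrying.
      if (this.autoReconnect) this.open();
    });
  }
}

// Fake sockets: plain emitters on which the demo fires `close` by hand.
const createFakeSocket = () => new EventEmitter();

// Default behaviour: the connection stays closed.
const plain = new ReconnectingSocketSketch(createFakeSocket);
plain.socket.emit('close');

// With the option enabled, a new underlying socket is created on close.
const resilient = new ReconnectingSocketSketch(createFakeSocket, { autoReconnect: true });
const firstSocket = resilient.socket;
firstSocket.emit('close');

console.log(plain.attempts, resilient.attempts);
```

Keeping the behaviour behind an option means existing users who handle disconnections themselves are not affected.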
If a websocket call takes too long to return and a timeout was configured for the provider, the provider should try to restart the connection. This could happen, for instance, if the client loses connectivity, the server closes the connection, and connectivity later returns: since the client never received the closing frame *and* does not attempt to send anything to the server, no error is observed. The `websocket` implementation for Node.js has an option to send keep-alive frames and detect such scenarios, but the standard browser W3C WebSocket does not, so it is "vulnerable" to this kind of failure, which will mostly affect web3 subscriptions.
Looks very interesting! Thanks for describing your PR so well. @nivida you probably need to manually incorporate this into your work on the ethereum provider.
@frozeman thanks for taking a look at it! You made a key point here. Most of the complexity behind this PR was to ensure the lib resubscribes after the websocket reconnects. This PR handles both automatic websocket reconnection and automatic unsubscription/resubscription on websocket reconnection so that, e.g., new header events and logs are received again after reconnection. My rationale was not to notify the client/developer if some error condition is being automatically handled. That worked for me, but to be honest I also thought about the possibility of adding some additional events so the client/developer is aware of the reconnection/resubscription taking place in the background. By the way, Travis is failing only in Node.js 5 due to a rest argument in the websocket reconnection library. I will work on that so all tests pass.
Does this also reattach SolidityEvent listeners? Is that a possibility if not?
@monitz87 yes it does! Once the underlying websocket reconnects, all subscriptions (i.e. |
Maybe this is covered, but better safe than sorry. When I was toying with this concept, reattaching event listeners caused the queue to grow uncontrollably until the memory ran out. Just a heads up. I hope this pans out and your solution gets merged soon.
@gabmontes First of all, thanks for submitting this PR! This was on my roadmap for the next release! The problem is that I've refactored close to the entire code architecture in the ethereumProvider branch because the current code is not as good as it could be ;) (see here: #2000)
This got reimplemented and improved in the ethereumProvider branch, and because of this I will close your PR. Thanks for the inspiration and your efforts; it sped up the process a lot.
Awesome!
@nivida Any word on when the ethereumProvider implementation of this will be available in an npm package version? I just tried a server disconnect/reconnect test with npm v1 beta 37 but there wasn't any reconnect on the client when the server was put back online. Is the fix present in that release, or are any special incantations required to get it to operate? Thanks in advance!
@nivida Is this going to be part of beta.38 or will it be a later version?
+1 -- would be great to get this in v38!
Hi @nivida , fyi -- still can't get this to work intuitively (or at all) in beta 38 or 41 (Chrome, Windows 10). Network right click disable... Network, right click, enable... I can see another websocket is setup and is streaming -- issue is that the subscription callbacks aren't being called (expected sections marked below /###/). Is there any "reconnect" or similar event that needs to be trapped and/or subscriptions recreated? `
` |
@nivida what do you guys think about revisiting this PR but targeting branch It was based on |
@alcuadrado please consider adding this PR to the #3070 list.
@gabmontes This topic is currently being tracked there and is a high priority, fwiw. I'll make sure this PR is ref'd though. Thanks for pinging.
Description
Websocket disconnection is a long-standing issue in the 1.0 branch of this library and is the source of many issues when developing applications that, for any reason, use a not-so-reliable connection to the nodes.
Solving the disconnection and reconnection means basically resolving several major problems:
This PR addresses these problems by:
- Adding `once` and updating `on` to use `addEventListener` instead of just overwriting the default event listener. That was needed to properly deliver the error events to every `RequestManager` using the provider and later allow them to listen for the new `connect` events.
- Updating `RequestManager`'s error handling logic so it performs the same steps when an error occurs on an already established connection and when the connection cannot be established on start due to a recoverable cause. This was needed to resubscribe in all scenarios. In addition, two small bugs were fixed so the resubscriptions can actually be processed and issues are avoided when several resubscriptions are attempted simultaneously (no function reentry).

The changes were extensively tested and work as expected, reconnecting and resubscribing under several different circumstances.
The commits of this PR have extensive comments explaining the rationale behind each one.
Fixes #1085
Fixes #1391
Fixes #1558
Fixes #1852
Fixes #1933
And possibly addresses other issues as well.
@nivida @frozeman Please review and evaluate this PR. I will be happy to address any comment you might have on it.
Thanks!
Type of change
Checklist:
- Ran `npm run test` with success and extended the tests if necessary.
- Ran `npm run build` and tested the resulting file from the `dist` folder in a browser.