-
Notifications
You must be signed in to change notification settings - Fork 266
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Refactors sockets and moves reconnect logic and error handling into it. #7459
Conversation
@Sean-McQ I'd like to review this PR, but I have found it difficult to reproduce the original issue using Colleen's steps. Do you have any tips to make it easier to repro the original issue? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Haven't done a deep dive in to this yet, but making shell/utils/socket.js
steve/norman specific breaks the other places it's used (epinio, harvester, streaming workload logs, shell to deployment). Easiest place to see is Workload list --> Deployment action menu --> Execute Shell
For the most part, Steve/Norman are pretty fundamental to the way we use websockets so I still think it makes sense for some of their requirements to be baked into them but I do see where/how that breaks shells atm. Gonna work on a fix for that and put in an update here. |
As requested, new changes to the socket have been moved to a new class derived from the original socket class (ResourceWatcher) and the ResourceWatcher is in use by the subscribe Vuex methods. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Haven't totally finished reviewing, but hit a functionally blocking bug (see comment in resourceWatcher
send(data) {
and will pick up the rest tomorrow
const INSECURE = 'ws://'; | ||
const SECURE = 'wss://'; | ||
|
||
const STATE_DISCONNECTED = 'disconnected'; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Lines 8 - 18 are core socket stuff already defined in the root socket class
export const EVENT_CONNECT_ERROR = 'connect_error'; | ||
export const EVENT_DISCONNECT_ERROR = 'disconnect_error'; | ||
|
||
export const NO_WATCH = 'NO_WATCH'; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
These and WATCHSTATUSES don't need to be exported
constructor(url, autoReconnect = true, frameTimeout = null, protocol = null, maxTries = null) { | ||
super(url, autoReconnect, frameTimeout, protocol, maxTries); | ||
|
||
this.maintenanceInterval = setInterval(() => { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
maintenanceInterval
is never stopped. I need to investigate this more, but we should probably wire this in to the connect/disconnect
|
||
setUrl(url) { | ||
this.baseUrl = self.location.origin + url.replace('subscribe', ''); | ||
if ( !url.match(/wss?:\/\//) ) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
you could swap this section below out for just super.setUrl(url)
} | ||
} | ||
|
||
isConnecting() { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This can be removed
} | ||
} | ||
|
||
this.dispatchEvent(new CustomEvent(EVENT_MESSAGE, { detail: event })); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This line and lines 246, 247 and 248 are super._onmessage
. Could that be used instead and placed either first or last? That would keep all common code in the root class and extended functionality in parent classe's method together
debug(state, on) { | ||
state.debugSocket = on !== false; | ||
}, | ||
|
||
resetSubscriptions(state) { | ||
clear(state.started); | ||
const watcheKeys = Object.keys(state.socket.watches); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this throws an exception when going from the Virtualization Management
product to a cluster's explorer product
// refactoring this to use new specific methods for subscribing and unsubscribing to resource collections which will make recovering stopped subscriptions a little bit easier. | ||
send(data) { | ||
// we're going to use these values from the data we're sending to determine if we're subscribing or unsubscribing | ||
if (data.resourceType) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This was never true for me (data
is always a string), so this.watches
is never initialised
Summary
Fixes #5997
Socket subscriptions did not maintain adequate information about their current state to self-heal from an error state. Most notably, there was no information as to which resourceVersion or the age of the resourceVersion being stored in vuex that would allow for a quick recovery if a subscription (watch) was stopped. Subscriptions (watches) were tracked in a store key "started" which didn't adequately describe what was being stored, and a subscription's (watch's) error state was being kept in the key "inError" which also doesn't adequately describe that the key is for use only for subscriptions (watches).
A resourceVersion remains valid to use to resubscribe only if it is less than 5 minutes old, these resourceVersions are not necessarily sequential so they can't be guessed. Current logic was to use the latest resourceVersion from a single resource in a collection rather than the collection resourceVersion, causing issues creating the new subscription (watch).
Occurred changes and/or fixed issues
Websocket subscriptions for resources are referred to as "watches" in the Kubernetes API Concepts documentation so they are referred to as such in code and in this description from here on out.
Moves logic for maintaining resource watches into the socket object itself and extends it so that the socket maintains those subscriptions by itself to consolidate logic and location of socket relevant values.
Technical notes summary
A lot of code removed from
shell/plugins/steve/subscribe.js
and replaced inshell/utils/socket.js
which should hopefully make sockets more modular (this lays a decent amount of groundwork for advanced webworkers as well). Changes outside of these files should be relatively minimal and there shouldn't be much in the way of user impact aside from more stable websocket watches.Areas or cases that should be tested
Areas which could experience regressions
This is a major change to sockets and watches so those should be thoroughly tested