use tokio::time::timeout to timeout fetch state and send encrypted #889
url
);
continue;
let work = self.http.get(url.clone()).send();
Since we have issues with node stability, should we add this info to metrics? Each node can report the number of active participants. We can even add an alert on this.
Sounds good! I'll do it in a separate PR, since this one is already going to be big.
6201a06 to 58d7056
As discussed with @volovyks, since we might later replace the client with sockets or similar, we will not move the timeouts into the contract.
I see. Then let's not bake this sort of configuration into terraform at all, and just stick with the defaults in the CLI itself for now.
Looks good!
@@ -63,6 +63,10 @@ pub enum Cli {
/// referer header for mainnet whitelist
#[arg(long, env("MPC_CLIENT_HEADER_REFERER"), default_value(None))]
client_header_referer: Option<String>,
#[clap(flatten)]
mesh_options: mesh::Options,
Do partners need to update their terraform setup?
The previous implementation caused trouble: 1) with a 1s timeout, the protocol had already gotten stuck; 2) with a 500ms timeout, we still saw many timeouts on the /state endpoint.
My guess is that the .timeout() that comes with the reqwest package does not reliably time out an async call; its time accounting may not take task/thread switching into account.
So I switched to tokio::time::timeout, and that solved the problem.
The success rate on dev increased from 70% to 100%.
This change sets the timeout to 1000ms for both fetch /state and send encrypted. Both can be tuned via environment variables.
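As a rough illustration of the "default of 1000ms, overridable through the environment" behaviour described above, here is a std-only sketch. It is a stand-in, not the PR's implementation: the PR wires its options through clap (`#[clap(flatten)] mesh_options: mesh::Options`), and the function names below, as well as whatever variable name is passed in, are hypothetical.

```rust
use std::time::Duration;

/// Fallback when the variable is unset or not a valid integer
/// (1000 ms, matching the default stated in the PR description).
pub const DEFAULT_TIMEOUT_MS: u64 = 1000;

/// Parses a millisecond value, falling back to the default on `None`
/// or on input that is not a non-negative integer.
pub fn parse_timeout_ms(raw: Option<&str>) -> Duration {
    let ms = raw
        .and_then(|s| s.trim().parse::<u64>().ok())
        .unwrap_or(DEFAULT_TIMEOUT_MS);
    Duration::from_millis(ms)
}

/// Reads the timeout from the environment variable `var`
/// (the caller chooses the name; none is assumed here).
pub fn timeout_from_env(var: &str) -> Duration {
    parse_timeout_ms(std::env::var(var).ok().as_deref())
}
```

Keeping the default in code rather than in deployment config matches the decision earlier in this thread to leave these values out of terraform for now.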