-
Notifications
You must be signed in to change notification settings - Fork 13
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[Protocol] Retry failed requests #95
Conversation
This can be used to specify how many times to re-send requests that fail. The default value is "1" so the if the first request fails, 1 more attempt will be made.
Yep, and it's currently implemented, but I don't remember exactly when games do hand out these packets in multiple parts and when not (e.g. exceeding a maximum size, but some games have bigger packet sizes than others).
Agreed, this could be done for every protocol that does multiple queries (I think only the Valve protocol for now). |
Ah yes I forgot its a completley server side implementation dependent thing.
The way I implemented here (retries field in TimeoutSettings), I think it might be confusing to users if protocols without multiple queries don't follow the same retry logic. Would probably need to document which protocols its implemented for as the multiple query part is invisible to a library consumer or just implement everywhere. |
Then we could just treat this as a retry count, which not only applies to valve queries (where are multiple ones) but on every query (GS1 does just one, if it fails and we have 2 retries, try it again, and on those which do have multiple ones (valve), just retry the subqueries). |
Allow fetching the number of retries or the default retries value from a borrowed optional TimeoutSettings.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this is ready for review, the only thing I might add would be unit tests but that can be a separate PR.
src/protocols/types.rs
Outdated
|
||
/// Get the number of retries if there are timeout settings else fall back | ||
/// to the default | ||
pub fn get_retries_or_default(timeout_settings: &Option<TimeoutSettings>) -> usize { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This maybe shouldn't be in impl TimeoutSettings
but you cannot implement functions on external types (Optional
) so I put it here for organization. (Also probably doesn't need to be pub). Let me know what you think.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think thats fine for now.
let mut server_vars = query_vars(address, timeout_settings)?; | ||
let retry_count = TimeoutSettings::get_retries_or_default(&timeout_settings); | ||
let mut server_vars = retry_on_timeout(retry_count, move || { | ||
query_vars(address, timeout_settings.clone()) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The clone here is undesirable but needed because there is no struct to take ownership of it here.
src/protocols/quake/client.rs
Outdated
let data = get_data::<Client>(address, timeout_settings)?; | ||
let retry_count = TimeoutSettings::get_retries_or_default(&timeout_settings); | ||
let data = retry_on_timeout(retry_count, || { | ||
get_data::<Client>(address, timeout_settings.clone()) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Undesirable clone (see before)
Retries also need to be done for games with proprietary queries. rust-gamedig/src/protocols/types.rs Lines 13 to 17 in 2eb1d12
... and i think this hints that retrying queries shouldn't be a top-level code in queries (duplicated stuff), but refactoring this should (?) be done in another pr. |
I agree, I just couldn't see an easy way to do it.
So I think we're kind of stuck in this layer, I guess syntactically might look nicer if added retry to the |
My mistake, I see what you mean now: FFOW and JC2MP call the internal query functions which do not implement retries themselves. So implementing the retries there is much preferred. |
Retries are now implemented as wrappers on the single function that would need to be retried on timeout. In order to avoid cloning of TimeoutSettings, Socket::apply_timeouts() was changed to accept a borrowed TimeoutSettings. And extra helpers were added to the TimeoutSettings impl to reduce repetition.
I have changed so all of the retry_on_timeouts now occur in wrappers around the method in each protocol that makes the query. This means that the proprietary games that call these functions directly don't need to implement retries themselves. I will add some unit tests for retry_on_timeout and then assuming everyone else is happy this is ready to merge. |
Sending packets could also timeout and until error_generic_member_access is stable we have no way of determining the type of the underlying `std::error::Error`.
I added some logic to retry requests when failed, this seems to make
fail (less) often for me. (there are still sometimes malformed packet errors but I don't see WouldBlock errors anymore).
The problem with the specific query seems to be that the
players
andrules
requests sometimes don't receive a response. Inspecting the packet captures the requests look valid so my assumption is that these responses are abnormally large for UDP packets and hence are occasionally dropped on the way to me.For valve query protocol there might be a better way to solve this as I believe that it supports "paginated" multi-packet responses. But I think that having some retry logic could be useful as internally retrying the individual part of the request that fails might be faster than the caller retrying the entire query.
NOTE: Setting
GatherOptions::{gather_player, gather_rules}
to false avoids my original issue and causes query to succeed every time.