[Protocol] Retry failed requests #95

Douile · 2023-09-16T01:26:33Z

I added some logic to retry requests when failed, this seems to make

$ cargo run --example generic -- tf2 91.216.250.10

fail (less) often for me. (there are still sometimes malformed packet errors but I don't see WouldBlock errors anymore).

The problem with the specific query seems to be that the players and rules requests sometimes don't receive a response. Inspecting the packet captures the requests look valid so my assumption is that these responses are abnormally large for UDP packets and hence are occasionally dropped on the way to me.

For valve query protocol there might be a better way to solve this as I believe that it supports "paginated" multi-packet responses. But I think that having some retry logic could be useful as internally retrying the individual part of the request that fails might be faster than the caller retrying the entire query.

NOTE: Setting GatherOptions::{gather_player, gather_rules} to false avoids my original issue and causes query to succeed every time.

This can be used to specify how many times to re-send requests that fail. The default value is "1" so the if the first request fails, 1 more attempt will be made.

CosminPerRam · 2023-09-16T07:09:37Z

For valve query protocol there might be a better way to solve this as I believe that it supports "paginated" multi-packet responses.

Yep, and it's currently implemented, but I don't remember exactly when games do hand out these packets in multiple parts and when not (e.g. exceeding a maximum size, but some games have bigger packet sizes than others).

But I think that having some retry logic could be useful as internally retrying the individual part of the request that fails might be faster than the caller retrying the entire query.

Agreed, this could be done for every protocol that does multiple queries (I think only the Valve protocol for now).

Douile · 2023-09-17T10:13:09Z

Yep, and it's currently implemented

Ah yes I forgot its a completley server side implementation dependent thing.

Agreed, this could be done for every protocol that does multiple queries

The way I implemented here (retries field in TimeoutSettings), I think it might be confusing to users if protocols without multiple queries don't follow the same retry logic. Would probably need to document which protocols its implemented for as the multiple query part is invisible to a library consumer or just implement everywhere.

CosminPerRam · 2023-09-17T10:54:10Z

I think it might be confusing to users if protocols without multiple queries don't follow the same retry logic.

Then we could just treat this as a retry count, which not only applies to valve queries (where are multiple ones) but on every query (GS1 does just one, if it fails and we have 2 retries, try it again, and on those which do have multiple ones (valve), just retry the subqueries).

Allow fetching the number of retries or the default retries value from a borrowed optional TimeoutSettings.

Douile

I think this is ready for review, the only thing I might add would be unit tests but that can be a separate PR.

Douile · 2023-09-18T21:14:12Z

src/protocols/types.rs

+
+    /// Get the number of retries if there are timeout settings else fall back
+    /// to the default
+    pub fn get_retries_or_default(timeout_settings: &Option<TimeoutSettings>) -> usize {


This maybe shouldn't be in impl TimeoutSettings but you cannot implement functions on external types (Optional) so I put it here for organization. (Also probably doesn't need to be pub). Let me know what you think.

I think thats fine for now.

Douile · 2023-09-18T21:15:28Z

src/protocols/gamespy/protocols/one/protocol.rs

-    let mut server_vars = query_vars(address, timeout_settings)?;
+    let retry_count = TimeoutSettings::get_retries_or_default(&timeout_settings);
+    let mut server_vars = retry_on_timeout(retry_count, move || {
+        query_vars(address, timeout_settings.clone())


The clone here is undesirable but needed because there is no struct to take ownership of it here.

Douile · 2023-09-18T21:16:24Z

src/protocols/quake/client.rs

-    let data = get_data::<Client>(address, timeout_settings)?;
+    let retry_count = TimeoutSettings::get_retries_or_default(&timeout_settings);
+    let data = retry_on_timeout(retry_count, || {
+        get_data::<Client>(address, timeout_settings.clone())


Undesirable clone (see before)

CosminPerRam · 2023-09-19T07:50:28Z

Retries also need to be done for games with proprietary queries.

rust-gamedig/src/protocols/types.rs

Lines 13 to 17 in 2eb1d12

    
           pub enum ProprietaryProtocol { 
        
               TheShip, 
        
               FFOW, 
        
               JC2MP, 
        
           }

... and i think this hints that retrying queries shouldn't be a top-level code in queries (duplicated stuff), but refactoring this should (?) be done in another pr.

Douile · 2023-09-19T13:37:54Z

and i think this hints that retrying queries shouldn't be a top-level code in queries (duplicated stuff), but refactoring this should (?) be done in another pr.

I agree, I just couldn't see an easy way to do it.

Could add to generic query, but then that requires duplicating to every game impl as well
Can't add to socket as retrying a recv doesn't work because you don't resend the request

So I think we're kind of stuck in this layer, I guess syntactically might look nicer if added retry to the get_info like function, then query just calls through without needing retry code. But that still would involve a lot of duplicated code just in different places.

Douile · 2023-09-19T15:41:19Z

My mistake, I see what you mean now: FFOW and JC2MP call the internal query functions which do not implement retries themselves. So implementing the retries there is much preferred.

Retries are now implemented as wrappers on the single function that would need to be retried on timeout. In order to avoid cloning of TimeoutSettings, Socket::apply_timeouts() was changed to accept a borrowed TimeoutSettings. And extra helpers were added to the TimeoutSettings impl to reduce repetition.

Douile · 2023-09-24T13:50:30Z

I have changed so all of the retry_on_timeouts now occur in wrappers around the method in each protocol that makes the query. This means that the proprietary games that call these functions directly don't need to implement retries themselves.

I will add some unit tests for retry_on_timeout and then assuming everyone else is happy this is ready to merge.

Sending packets could also timeout and until error_generic_member_access is stable we have no way of determining the type of the underlying `std::error::Error`.

Douile added 2 commits September 16, 2023 02:14

Add retry count to TimeoutSettings

b129142

This can be used to specify how many times to re-send requests that fail. The default value is "1" so the if the first request fails, 1 more attempt will be made.

Add retries to valve queries

7dcf0f0

Douile changed the title ~~Retry failed requests~~ [Protocol] Retry failed requests Sep 16, 2023

CosminPerRam added enhancement New feature or request protocol This is something regarding a protocol labels Sep 16, 2023

Douile added 5 commits September 18, 2023 21:58

[Protocol] &Optional<TimeoutSettings> add get_retries_or_default

5ed2992

Allow fetching the number of retries or the default retries value from a borrowed optional TimeoutSettings.

[Protocol] Add retries to minecraft protocol

2ac0a64

[Protocol] Add retries to quake

2127cae

[Protocol] Add retries to gamespy

a81a9bc

[Protocol] Update TimeoutSettings docs, and change default retries to 0

88cdcfd

Douile commented Sep 18, 2023

View reviewed changes

Douile marked this pull request as ready for review September 18, 2023 21:17

Douile added 2 commits September 18, 2023 22:28

Remove logging from retry_on_timeout

5bc118d

[Protocol] TimeoutSettings make retries non-optional

cb03bc9

Douile added 2 commits September 24, 2023 14:40

[Examples] Add retries to the generic example

102d3b8

Douile added 3 commits September 24, 2023 16:21

Also retry on PacketSend error

ecf0f2c

Sending packets could also timeout and until error_generic_member_access is stable we have no way of determining the type of the underlying `std::error::Error`.

Add retry unit tests

a76f6fa

[Docs] Update changelog

2d972c2

CosminPerRam approved these changes Sep 25, 2023

View reviewed changes

CosminPerRam merged commit c3281be into gamedig:main Sep 25, 2023
12 checks passed

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

[Protocol] Retry failed requests #95

[Protocol] Retry failed requests #95

Douile commented Sep 16, 2023 •

edited

Loading

CosminPerRam commented Sep 16, 2023

Douile commented Sep 17, 2023 •

edited

Loading

CosminPerRam commented Sep 17, 2023

Douile left a comment

Douile Sep 18, 2023

CosminPerRam Sep 19, 2023

Douile Sep 18, 2023

Douile Sep 18, 2023

CosminPerRam commented Sep 19, 2023 •

edited

Loading

Douile commented Sep 19, 2023

Douile commented Sep 19, 2023

Douile commented Sep 24, 2023

[Protocol] Retry failed requests #95

[Protocol] Retry failed requests #95

Conversation

Douile commented Sep 16, 2023 • edited Loading

CosminPerRam commented Sep 16, 2023

Douile commented Sep 17, 2023 • edited Loading

CosminPerRam commented Sep 17, 2023

Douile left a comment

Choose a reason for hiding this comment

Douile Sep 18, 2023

Choose a reason for hiding this comment

CosminPerRam Sep 19, 2023

Choose a reason for hiding this comment

Douile Sep 18, 2023

Choose a reason for hiding this comment

Douile Sep 18, 2023

Choose a reason for hiding this comment

CosminPerRam commented Sep 19, 2023 • edited Loading

Douile commented Sep 19, 2023

Douile commented Sep 19, 2023

Douile commented Sep 24, 2023

Douile commented Sep 16, 2023 •

edited

Loading

Douile commented Sep 17, 2023 •

edited

Loading

CosminPerRam commented Sep 19, 2023 •

edited

Loading