-
Notifications
You must be signed in to change notification settings - Fork 428
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Fix error on ping timeout with stream management #4153
Conversation
This comment was marked as outdated.
This comment was marked as outdated.
Codecov ReportAttention:
Additional details and impacted files@@ Coverage Diff @@
## master #4153 +/- ##
=======================================
Coverage 84.04% 84.04%
=======================================
Files 564 564
Lines 34050 34067 +17
=======================================
+ Hits 28618 28633 +15
- Misses 5432 5434 +2
☔ View full report in Codecov by Sentry. |
969ddc0
to
297a123
Compare
This comment was marked as outdated.
This comment was marked as outdated.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I added a few comments. The main issue is that this PR only handles the case when the server resends unacked stanzas, while the original story required to handle user-sent ping errors as well - see https://xmpp.org/extensions/xep-0199.html#example-3
This comment was marked as outdated.
This comment was marked as outdated.
7707443
to
499f365
Compare
This comment was marked as outdated.
This comment was marked as outdated.
big_tests/tests/sm_SUITE.erl
Outdated
Backend = mongoose_helper:mnesia_or_rdbms_backend(), | ||
[{mod_stream_management, config_parser_helper:mod_config(mod_stream_management, SMConfig)}, | ||
{mod_offline, config_parser_helper:mod_config(mod_offline, #{backend => Backend})}]. | ||
{mod_offline, config_parser_helper:mod_config(mod_offline, #{backend => Backend})}, | ||
{mod_ping, config_parser_helper:mod_config(mod_ping, PingConfig)}]. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
as far as I understand we don't need to add mod_ping module for any other groups/testcases than ping_timeout
big_tests/tests/sm_SUITE.erl
Outdated
logger_ct_backend:start(), | ||
logger_ct_backend:capture(error), | ||
|
||
%% connect Alice and simulate ping timeout by dropping connection |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
think that this comment doesn't reflect the actions in the subsequent steps
big_tests/tests/sm_SUITE.erl
Outdated
FilterFun = fun(_, Msg) -> | ||
re:run(Msg, "parse_iq_id_failed", [global]) /= nomatch | ||
end, | ||
?assertEqual([], logger_ct_backend:recv(FilterFun)), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
checking that some error hasn't occured is a weak approach, especiallly using regular expressions, parse_iq_id_failed identifier may change in the future and this testcase will become false-positive.
I think mocking on the MIM node is a better option to make a validation.
big_tests/tests/sm_SUITE.erl
Outdated
@@ -124,6 +127,9 @@ parallel_unacknowledged_message_hook_cases() -> | |||
unacknowledged_message_hook_resume, | |||
unacknowledged_message_hook_filter]. | |||
|
|||
ping_timeout_cases() -> |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
do we really need a group for just one testcase?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I believed that introducing a separate group would be an optimal solution, given that the stanzas that are sent by mod_ping
tend to create issues in other tests
This comment was marked as outdated.
This comment was marked as outdated.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
looks better now, we may even make an additional check that message is filtered out before it's getting to the crashing handler, e.g. the list of processed stanzas by mod_ping handler is the same as processed by crashing halder plus the filtered out stanza, also you may want to check that the filtered stanza is the one that you expect.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good, I only have a minor concern about a test case.
elasticsearch_and_cassandra_26 / elasticsearch_and_cassandra_mnesia / e0f3cd4 small_tests_25 / small_tests / e0f3cd4 small_tests_26 / small_tests / e0f3cd4 small_tests_26_arm64 / small_tests / e0f3cd4 ldap_mnesia_25 / ldap_mnesia / e0f3cd4 dynamic_domains_pgsql_mnesia_25 / pgsql_mnesia / e0f3cd4 ldap_mnesia_26 / ldap_mnesia / e0f3cd4 pubsub_SUITE:tree+basic:subscribe_options_deliver_option_test{error,
{{badmatch,
[{xmlel,<<"message">>,
[{<<"from">>,<<"pubsub.localhost">>},
{<<"to">>,
<<"bob_subscribe_options_deliver_option_test_1521@localhost/res1">>},
{<<"type">>,<<"headline">>}],
[{xmlel,<<"event">>,
[{<<"xmlns">>,
<<"http://jabber.org/protocol/pubsub#event">>}],
[{xmlel,<<"items">>,
[{<<"node">>,<<"princely_musings_oh8GVddWFFc=">>}],
[{xmlel,<<"item">>,
[{<<"id">>,<<"item1">>}],
[{xmlel,<<"entry">>,
[{<<"xmlns">>,
<<"http://www.w3.org/2005/Atom">>}],
[]}]}]}]},
{xmlel,<<"headers">>,
[{<<"xmlns">>,<<"http://jabber.org/protocol/shim">>}],
[]}]}]},
[{pubsub_SUITE,'-subscribe_options_deliver_option_test/1-fun-0-',3,
[{file,"/home/circleci/project/big_tests/tests/pubsub_SUITE.erl"},
{line,401}]},
{escalus_story,story,4,
[{file,
"/home/circleci/project/big_tests/_build/default/lib/escalus/src/escalus_story.erl"},
{line,72}]},
{test_server,ts_tc,3,[{file,"test_server.erl"},{line,1793}]},
{test_server,run_test_case_eval1,6,
[{file,"test_server.erl"},{line,1302}]},
{test_server,run_test_case_eval,9,
[{file,"test_server.erl"},{line,1234}]}]}} dynamic_domains_pgsql_mnesia_26 / pgsql_mnesia / e0f3cd4 dynamic_domains_mssql_mnesia_26 / odbc_mssql_mnesia / e0f3cd4 dynamic_domains_mysql_redis_26 / mysql_redis / e0f3cd4 internal_mnesia_26 / internal_mnesia / e0f3cd4 pgsql_mnesia_25 / pgsql_mnesia / e0f3cd4 pgsql_mnesia_26 / pgsql_mnesia / e0f3cd4 pgsql_cets_26 / pgsql_cets / e0f3cd4 mysql_redis_26 / mysql_redis / e0f3cd4 mssql_mnesia_26 / odbc_mssql_mnesia / e0f3cd4 small_tests_25 / small_tests / e0f3cd4 ldap_mnesia_26 / ldap_mnesia / e0f3cd4 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good 👍
This PR addresses an issue arising when both
mod_ping
(withtimeout_action = "kill”
) andmod_stream_management
are enabled. Previously, error stanzas were routed back to the server after the ping and resume timeouts passed, leading to the resending of unacknowledged messages. Due to the lack of appropriate stanza handling inmod_ping
, they were processed byejabberd_local
, resulting in ID-related errors.The stanza arrives with the same ID in case of an error, but it doesn't require further processing since the ping timeout handling has already been invoked.
mod_ping
likely skips usingejabberd_local:route_iq
with a callback to avoid another laver of indirection. Handling ping responses directly in the module makes the process more straightforward and easier to understand.