Fixes for migrations tests #24073

Merged
bashtanov merged 3 commits into redpanda-data:dev from migrations-test-fixes2
Nov 12, 2024

bashtanov
Contributor

@bashtanov bashtanov commented Nov 8, 2024

Tests/migrations: relax the migrations tests so that they do not check the number of migrations when using the high-level API under the failure injector.
The migrations table is eventually consistent and may converge late. Rather than wait for it, the test moves on to other actions to see how a disturbed cluster copes with active use.

Raft/consensus: catch ss::condition_variable_timed_out, not ss::timed_out_error, when waiting for a condvar with a timeout. Otherwise the exception bubbles up unhandled, appears in the logs, and may break the logic too.
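
The catch-type mismatch can be modelled outside Seastar. The sketch below is a Python analogy, not Redpanda or Seastar code: `TimedOutError` and `ConditionVariableTimedOut` are hypothetical stand-ins for `ss::timed_out_error` and `ss::condition_variable_timed_out`, assumed to be unrelated types as the review comment on this PR notes. A handler written for one type does not catch the other.

```python
class TimedOutError(Exception):
    """Hypothetical stand-in for ss::timed_out_error."""


class ConditionVariableTimedOut(Exception):
    """Hypothetical stand-in for ss::condition_variable_timed_out.

    Deliberately NOT a subclass of TimedOutError.
    """


def wait_with_timeout():
    # Models a condvar wait whose timeout has expired.
    raise ConditionVariableTimedOut()


def handler_before_fix():
    try:
        wait_with_timeout()
    except TimedOutError:
        # Never reached: the raised type is unrelated to TimedOutError.
        return "handled"


def handler_after_fix():
    try:
        wait_with_timeout()
    except ConditionVariableTimedOut:
        return "handled"


def escapes(handler):
    """Return True if the handler lets an exception bubble up."""
    try:
        handler()
    except Exception:
        return True
    return False
```

Here `escapes(handler_before_fix)` is true and `escapes(handler_after_fix)` is false, which mirrors the one-line change in the catch clause.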

Tests/migrations: run mount/unmount commands without finjector. Otherwise we would need to be prepared for a situation where we get no successful reply from the admin API but the migration is nevertheless created (the node is killed right before it sends a reply). When testing the low-level API we handle this by checking that the migration is present. But the high-level mount/unmount commands auto-remove migration objects on completion, so telling an uncreated migration apart from a completed one in the test logic would be somewhat tricky.
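
The low-level handling described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the actual test code: `FakeAdmin`, `ReplyLost`, and `create_migration_checked` are hypothetical names, and the in-memory client only imitates a node that applies a request and then dies before replying.

```python
class ReplyLost(Exception):
    """Hypothetical: the request was applied but no reply came back."""


class FakeAdmin:
    """In-memory stand-in for an admin API client (not the real rptest class)."""

    def __init__(self, drop_replies=0):
        self.migrations = {}
        self._next_id = 0
        self._drop_replies = drop_replies

    def create_migration(self, spec):
        # The migration is always created...
        self._next_id += 1
        self.migrations[self._next_id] = spec
        # ...but the reply may be lost, as if the node was killed.
        if self._drop_replies > 0:
            self._drop_replies -= 1
            raise ReplyLost()
        return self._next_id

    def list_migrations(self):
        return dict(self.migrations)


def create_migration_checked(admin, spec):
    """Create a migration, tolerating a lost reply by checking for presence."""
    try:
        return admin.create_migration(spec)
    except ReplyLost:
        # No reply does not mean no migration: look for it before retrying.
        for migration_id, existing in admin.list_migrations().items():
            if existing == spec:
                return migration_id
        return admin.create_migration(spec)
```

The presence check only works while the migration object stays listed. With the high-level mount/unmount commands the object may already have been auto-removed on completion, so the lookup cannot tell "never created" from "already finished", and a blind retry could act twice.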

Backports Required

  • none - not a bug fix
  • none - this is a backport
  • none - issue does not exist in previous branches
  • none - papercut/not impactful enough to backport
  • v24.2.x
  • v24.1.x
  • v23.3.x

Release Notes

  • none

otherwise we'll need to be prepared to a situation where we get no successful
reply, but the migration is created
…error

when waiting for a condvar with timeout
@bashtanov
Contributor Author

bazel build failure is https://redpandadata.atlassian.net/issues/CORE-8112

@vbotbuildovich
Collaborator

the below tests from https://buildkite.com/redpanda/redpanda/builds/57834#01930b9c-fc4b-4f48-a4b4-68e8ca969689 have failed and will be retried

translator_test_rpfixture

@bashtanov
Contributor Author

this one is also known https://redpandadata.atlassian.net/issues/CORE-8093

@vbotbuildovich
Collaborator

vbotbuildovich commented Nov 8, 2024

Retry command for Build#57834

please wait until all jobs are finished before running the slash command

/ci-repeat 1
tests/rptest/tests/data_migrations_api_test.py::DataMigrationsApiTest.test_creating_and_listing_migrations
tests/rptest/tests/data_migrations_api_test.py::DataMigrationsApiTest.test_higher_level_migration_api

@vbotbuildovich
Collaborator

non flaky failures in https://buildkite.com/redpanda/redpanda/builds/57834#01930bf9-31a1-44ca-85c9-373157f3f2a0:

"rptest.tests.data_migrations_api_test.DataMigrationsApiTest.test_higher_level_migration_api"

…sence

Finj may make things lag, so tolerate migration absence, but not wrong data.
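
A check in the spirit of this commit message might look like the sketch below. The function name, its parameters, and the dict-based listing are assumptions made for illustration; the real test helpers may differ.

```python
def check_migration(listed, migration_id, expected, tolerate_absence):
    """Validate one migration against a listing.

    listed: dict mapping migration id to its spec, as returned by a
            (hypothetical) list call.
    Returns True if the migration is present and correct, False if it is
    absent and absence is tolerated. Wrong data always fails.
    """
    actual = listed.get(migration_id)
    if actual is None:
        # Under failure injection the listing may lag behind.
        assert tolerate_absence, f"migration {migration_id} is missing"
        return False
    # A present migration must match exactly, lag or no lag.
    assert actual == expected, f"migration {migration_id}: {actual} != {expected}"
    return True
```

With `tolerate_absence=True` an empty listing returns `False` instead of failing, while a listed migration with a mismatching spec still raises `AssertionError`.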
@bashtanov bashtanov force-pushed the migrations-test-fixes2 branch from 056a9cb to eafe51f Compare November 8, 2024 17:46
@bashtanov bashtanov merged commit ac52b7e into redpanda-data:dev Nov 12, 2024
17 checks passed
- } catch (const ss::timed_out_error& e) {
+ } catch (const ss::condition_variable_timed_out& e) {
Member

too bad they didn't inherit from timed_out_error :/

4 participants