Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

QQs: periodically apply policies if there's a discrepancy between the current and desired policy-driven state #12640

Merged
merged 54 commits into from
Nov 4, 2024

Conversation

michaelklishin
Copy link
Member

@michaelklishin michaelklishin commented Nov 4, 2024

This is #12412 by @LoisSotoLopez rebased on top of main.

LoisSotoLopez and others added 30 commits October 24, 2024 07:23
Instead of checking the values for current configuration, represented in
`rabbit_quorum_queue:handle_tick` by the `Overview` variable, against
the effective policy, just regenerate the configuration and compare with
the current configuration.
(some of this is just reverting to the original format to reduce the
diff against main)
Removes the usage of a ShouldLog parameter on several functions
and limits the logging of the message warning about the delivery_limit
not being set to the moment of queueDeclaration
As described in #12413 (comment)
test case queue_topology flaked in CI with the following error:
```
rabbitmq_amqp_client > management_SUITE > cluster_size_3 > queue_topology
    #1. {error,{test_case_failed,{824,
                                  <<"rmq-ct-cluster_size_3-1-21000@localhost">>}}}
```

This flake could not be reproduced locally (neither with Mnesia nor with Khepri).
in order to troubleshoot the flake described in
#12413 (comment)
```
Node: rabbit_shard2@localhost
Case: amqp_system_SUITE:access_failure
Reason: {error,{{badmatch,{error,134,
                                 "Unhandled exception. System.Exception: expected exception not received\n
                                 at Program.Test.accessFailure(String uri) in /home/runner/work/rabbitmq-server/rabbitmq-server/deps/rabbit/test/amqp_system_SUITE_data/fsharp-tests/Program.fs:line 477\n
                                 at Program.main(String[] argv) in /home/runner/work/rabbitmq-server/rabbitmq-server/deps/rabbit/test/amqp_system_SUITE_data/fsharp-tests/Program.fs:line 509\n"}},
                [{amqp_system_SUITE,run_dotnet_test,2,
                                    [{file,"amqp_system_SUITE.erl"},
                                     {line,257}]},
```
Support x-cc message annotation

Support an `x-cc` message annotation in AMQP 1.0
similar to the [CC](https://www.rabbitmq.com/docs/sender-selected) header in AMQP 0.9.1.

The value of the `x-cc` message annotation must by a list of strings.
A message annotation is used since application properties allow only simple types.
Prior to this commit tests
* leader_transfer_quorum_queue_credit_single
* leader_transfer_quorum_queue_credit_batches
flaked in CI during 4.1 (main) and 4.0 mixed version testing.

The follwing error occurred on node 0:
```
[error] <0.1950.0> Timed out waiting for credit reply from quorum queue 'leader_transfer_quorum_queue_credit_batches' in vhost '/'. Hint: Enable feature flag rabbitmq_4.0.0
[warning] <0.1950.0> Closing session for connection <0.1945.0>: {'v1_0.error',
[warning] <0.1950.0>                                             {symbol,<<"amqp:internal-error">>},
[warning] <0.1950.0>                                             {utf8,
[warning] <0.1950.0>                                              <<"Timed out waiting for credit reply from quorum queue 'leader_transfer_quorum_queue_credit_batches' in vhost '/'. Hint: Enable feature flag rabbitmq_4.0.0">>},
[warning] <0.1950.0>                                             undefined}
```

Therefore we enable this feature flag for both tests.

This commit also simplifies some test setups that were necessary for
4.0/3.13 mixed version testing, but isn't necessary anymore for 4.1/4.0
mixed version testing.
This test flakes in CI as described in
#12413 (comment)

The test case fails with
```
Node: rabbit_shard2@localhost
Case: amqp_system_SUITE:access_failure
Reason: {error,{{badmatch,{error,134,
                                 "Unhandled exception. System.Exception: expected exception not received
                                 at Program.Test.accessFailure(String uri) in /home/runner/work/rabbitmq-server/rabbitmq-server/deps/rabbit/test/amqp_system_SUITE_data/fsharp-tests/Program.fs:line 477
                                 at Program.main(String[] argv) in /home/runner/work/rabbitmq-server/rabbitmq-server/deps/rabbit/test/amqp_system_SUITE_data/fsharp-tests/Program.fs:line 509\n"}},
                [{amqp_system_SUITE,run_dotnet_test,2,
                                    [{file,"amqp_system_SUITE.erl"},
                                     {line,257}]},
```

However, RabbitMQ closes the session as expected due to the missing read
permissions to the queue as shown in the RabbitMQ logs:
```
[debug] <0.1321.0> Asked to create a new user 'access_failure', password length in bytes: 24
[info] <0.1321.0> Created user 'access_failure'
[debug] <0.1324.0> Asked to set permissions for user 'access_failure' in virtual host '/' to '.*', '^banana.*', '^banana.*'
[info] <0.1324.0> Successfully set permissions for user 'access_failure' in virtual host '/' to '.*', '^banana.*', '^banana.*'
[info] <0.1333.0> accepting AMQP connection 127.0.0.1:36248 -> 127.0.0.1:25000
[debug] <0.1333.0> User 'access_failure' authenticated successfully by backend rabbit_auth_backend_internal
[info] <0.1333.0> Connection from AMQP 1.0 container 'AMQPNetLite-101d7d51': user 'access_failure' authenticated using SASL mechanism PLAIN and granted access to vhost '/'
[debug] <0.1333.0> AMQP 1.0 connection.open frame: hostname = 127.0.0.1, extracted vhost = /, idle-time-out = undefined
[debug] <0.1333.0> AMQP 1.0 created session process <0.1338.0> for channel number 0
[warning] <0.1338.0> Closing session for connection <0.1333.0>: {'v1_0.error',
[warning] <0.1338.0>                                             {symbol,
[warning] <0.1338.0>                                              <<"amqp:unauthorized-access">>},
[warning] <0.1338.0>                                             {utf8,
[warning] <0.1338.0>                                              <<"read access to queue 'test' in vhost '/' refused for user 'access_failure'">>},
[warning] <0.1338.0>                                             undefined}
[debug] <0.1333.0> AMQP 1.0 closed session process <0.1338.0> with channel number 0
[warning] <0.1333.0> closing AMQP connection <0.1333.0> (127.0.0.1:36248 -> 127.0.0.1:25000, duration: '269ms'):
[warning] <0.1333.0> client unexpectedly closed TCP connection
```

```
let receiver = ReceiverLink(ac.Session, "test-receiver", src)
```
uses a null constructur for the onAttached callback.
ReceiverLink doesn't seem to block.

Given that the exact same authorization error is already tested in test
case attach_source_queue of amqp_auth_SUITE, it's safe to delete this F#
test.
Bumps [org.springframework.boot:spring-boot-starter-parent](https://github.com/spring-projects/spring-boot) from 3.3.4 to 3.3.5.
- [Release notes](https://github.com/spring-projects/spring-boot/releases)
- [Commits](spring-projects/spring-boot@v3.3.4...v3.3.5)

---
updated-dependencies:
- dependency-name: org.springframework.boot:spring-boot-starter-parent
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
Bumps [org.springframework.boot:spring-boot-starter-parent](https://github.com/spring-projects/spring-boot) from 3.3.4 to 3.3.5.
- [Release notes](https://github.com/spring-projects/spring-boot/releases)
- [Commits](spring-projects/spring-boot@v3.3.4...v3.3.5)

---
updated-dependencies:
- dependency-name: org.springframework.boot:spring-boot-starter-parent
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
This is enabled on main and for pull requests. Bazel remains
used in previous branches.
`init_per_group/3`, which starts the broker, was already called earlier
in the function.

This fixes a bug where the node can't be stopped in `end_per_group/2`,
attecting the next group ability to start one.
MarcialRosales and others added 20 commits November 4, 2024 00:34
For example, if the first restarted node doesn't start,
don't try to restart the other nodes. This mimics what
orchestrators such as Kubernetes or BOSH would do
(although they perform this check differently)
Closes #9259.

 ## What?
Allow an AMQP 1.0 client to renew an OAuth 2.0 token before it expires.

 ## Why?
This allows clients to keep the AMQP connection open instead of having
to create a new connection whenever the token expires.

 ## How?
As explained in #9259 (comment)
the client can `PUT` a new token on HTTP API v2 path `/auth/tokens`.
RabbitMQ will then:
1. Store the new token on the given connection.
2. Recheck access to the connection's vhost.
3. Clear all permission caches in the AMQP sessions.
4. Recheck write permissions to exchanges for links publishing to
   RabbitMQ, and recheck read permissions from queues for links
   consuming from RabbitMQ. The latter complies with the user
   expectation in #11364.
Using a log macro has the benefit that location data is added as
explained in https://www.erlang.org/doc/apps/kernel/logger.html#t:metadata/0
By running it

 * On push, when relevant code paths change
 * Every Monday morning

The peer discovery subsystem does not change
particularly often, and this plugin in particular
does not. Nonetheless, we currently run it for
every push unconditionally.
It is possible for a slow running follower with local consumers
to crash after a snapshot installation as it tries to read an entry
from its log that is no longer there (as it has been consumed and
completed by another node but still refers to prior consumers on the
current node).

This commit makes the log effect callback function more defensive
to check that the number of commands returned by the log effect
isn't different from what was requested. if it is different we
consider this a stale read request and return no further effects.

Conflicts:
	deps/rabbit/test/quorum_queue_SUITE.erl
in rabbitmq/server-packages, an Actions-only
repo dedicated to open source RabbitMQ release
automation.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

10 participants