[improve][pip] Change cursor's properties to store chunk ID map. #21027

Closed
wants to merge 1 commit into from

liangyepianzhou
Contributor

Motivation

Allow the broker to filter duplicate chunks effectively, so that chunked messages keep working correctly once deduplication is enabled and the topic ends up with no duplicate chunks.

Modifications

  1. Add chunkIDPushed and chunkIDPersisted to store the chunk ID of each producer's ongoing chunked message. They are used to check whether the chunks within a single message are duplicated.
  2. Change the properties of the MarkDeleteEntry from Map<String, Long> to Map<String, String>.
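The check described in item 1 could look roughly like the sketch below. Only the names chunkIDPushed and chunkIDPersisted come from this description; the class `ChunkDedupSketch`, its methods, and the rule "a chunk ID not greater than the last pushed one is a duplicate" are assumptions made for illustration, not the actual broker code.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch: per-producer tracking of chunk IDs for the ongoing chunked message.
class ChunkDedupSketch {
    // Last chunk ID accepted from each producer (name taken from the PR description).
    private final Map<String, Long> chunkIDPushed = new HashMap<>();
    // Last chunk ID confirmed as persisted for each producer (name taken from the PR description).
    private final Map<String, Long> chunkIDPersisted = new HashMap<>();

    // Assumed rule: drop a chunk whose ID is not greater than the last pushed chunk ID.
    synchronized boolean shouldDrop(String producerName, long chunkId) {
        Long lastPushed = chunkIDPushed.get(producerName);
        if (lastPushed != null && chunkId <= lastPushed) {
            return true;
        }
        chunkIDPushed.put(producerName, chunkId);
        return false;
    }

    // Record that a chunk has been persisted; keeps the highest ID seen.
    synchronized void onPersisted(String producerName, long chunkId) {
        chunkIDPersisted.merge(producerName, chunkId, Math::max);
    }

    // Forget the producer's state once its chunked message is complete.
    synchronized void onMessageCompleted(String producerName) {
        chunkIDPushed.remove(producerName);
        chunkIDPersisted.remove(producerName);
    }

    synchronized Long lastPersisted(String producerName) {
        return chunkIDPersisted.get(producerName);
    }

    public static void main(String[] args) {
        ChunkDedupSketch dedup = new ChunkDedupSketch();
        System.out.println(dedup.shouldDrop("producer_name_1", 0)); // new chunk
        System.out.println(dedup.shouldDrop("producer_name_1", 1)); // new chunk
        System.out.println(dedup.shouldDrop("producer_name_1", 1)); // resent chunk
    }
}
```

In this sketch the two maps are kept separate so that a snapshot could be taken from the persisted state only, which mirrors how the description distinguishes pushed from persisted chunk IDs.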

Verifying this change

  • Make sure that the change passes the CI checks.

(Please pick either of the following options)

This change is a trivial rework / code cleanup without any test coverage.

(or)

This change is already covered by existing tests, such as (please describe tests).

(or)

This change added tests and can be verified as follows:

(example:)

  • Added integration tests for end-to-end deployment with large payloads (10MB)
  • Extended integration test for recovery after broker failure

Does this pull request potentially affect one of the following parts:

If the box was checked, please highlight the changes

  • Dependencies (add or upgrade a dependency)
  • The public API
  • The schema
  • The default values of configurations
  • The threading model
  • The binary protocol
  • The REST endpoints
  • The admin CLI options
  • The metrics
  • Anything that affects deployment

Documentation

  • doc
  • doc-required
  • doc-not-needed
  • doc-complete

Matching PR in forked repository

PR in forked repository:

@github-actions

@liangyepianzhou Please add the following content to your PR description and select a checkbox:

- [ ] `doc` <!-- Your PR contains doc changes -->
- [ ] `doc-required` <!-- Your PR changes impact docs and you will update later -->
- [ ] `doc-not-needed` <!-- Your PR changes do not impact docs -->
- [ ] `doc-complete` <!-- Docs have been already added -->

```java
// Recover properties map
Map<String, String> recoveredProperties;
if (info.getPropertiesCount() == 0 && info.getmarkDeletePropertiesCount() == 0) {
Contributor

The info means ManagedCursorInfo, right?

Contributor Author

Yes, that's right.

# Goals

## In Scope
Chunk messages can be effectively filtered on the broker side. Ensure that chunk messages work normally after enabling deduplication and the topic has no duplicate chunks.
Contributor

@poorbarcode poorbarcode Aug 24, 2023

Background: there are two properties in the metadata of the cursor.

[1]: a structure of properties:

properties: 
  - "producer_name_1" : {{last_persist_sequence_1}}
  - "producer_name_2" : {{last_persist_sequence_2}}

In this PIP, you want to change properties<String, Long> to properties<String, String>, right? Could you also explain this change here?

.build();
```

Change the `properties` of the `MarkDeleteEntry` from `Map<String, Long>` to `Map<String, String>`. In the deduplication design, the `MarkDeleteEntry` properties are used as a snapshot to store the sequence ID map. Once the chunk ID map is introduced, a single `Long` value can no longer hold the two long values needed for each producer. We therefore want to change the `MarkDeleteEntry` properties from `Map<String, Long>` to `Map<String, String>` to make them more flexible.
Contributor

Could you add an example showing the structure you want the `properties` attribute of the cursor metadata to have after this PIP?
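One possible shape for such a structure, as a hedged sketch only: each producer's value in the `Map<String, String>` carries both the sequence ID and the chunk ID. The `sequenceId:chunkId` encoding, the class `DedupSnapshotSketch`, and the fallback that reads a bare number as an old-style `Long` value are all assumptions for illustration; the PIP does not specify a format in this thread.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch: packing two long values per producer into a String-valued properties map.
class DedupSnapshotSketch {
    // Marker used when no chunk ID is recorded (assumed convention).
    static final long NO_CHUNK = -1L;

    // Per-producer state held in the snapshot (hypothetical type).
    static final class ProducerState {
        final long sequenceId;
        final long chunkId;

        ProducerState(long sequenceId, long chunkId) {
            this.sequenceId = sequenceId;
            this.chunkId = chunkId;
        }
    }

    // Assumed encoding: "<sequenceId>:<chunkId>".
    static String encode(ProducerState state) {
        return state.sequenceId + ":" + state.chunkId;
    }

    // Accepts the assumed new encoding and also a bare number, as an old Long value would print.
    static ProducerState decode(String value) {
        int separator = value.indexOf(':');
        if (separator < 0) {
            return new ProducerState(Long.parseLong(value), NO_CHUNK);
        }
        long sequenceId = Long.parseLong(value.substring(0, separator));
        long chunkId = Long.parseLong(value.substring(separator + 1));
        return new ProducerState(sequenceId, chunkId);
    }

    // Build the String-valued properties map from per-producer state.
    static Map<String, String> toProperties(Map<String, ProducerState> states) {
        Map<String, String> properties = new HashMap<>();
        for (Map.Entry<String, ProducerState> entry : states.entrySet()) {
            properties.put(entry.getKey(), encode(entry.getValue()));
        }
        return properties;
    }

    public static void main(String[] args) {
        Map<String, ProducerState> states = new HashMap<>();
        states.put("producer_name_1", new ProducerState(10L, 2L));
        System.out.println(toProperties(states));
    }
}
```

A plain delimiter keeps the values human-readable in cursor metadata; a structured format such as JSON would be another option if more fields are expected later.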
