[Enhancement] Improve delete and rolling strategy for tiered storage modules #8481

Closed
1 task done
lizhimins opened this issue Aug 2, 2024 · 0 comments · Fixed by #8493


lizhimins commented Aug 2, 2024

Before Creating the Enhancement Request

  • I have confirmed that this should be classified as an enhancement rather than a bug/feature.

Summary

Improve and fix several minor issues in the tiered storage module.

Motivation

Improve and fix the tiered storage module.

Describe the Solution You'd Like

  1. File Deletion Policy: Shortly after a broker starts, the data retention time in tiered storage is shorter than that in local storage. The scheduled cleanup task then deletes data in tiered storage, and that data is retransmitted afterwards. This is not a correctness issue, but the retransmission wastes computing resources and bandwidth.

  2. File Rolling Policy: Support configuring the minimum size for retained files. By default, a flat file is rolled only when it is more than one day old and larger than 16 MB.

  3. Index Module: Support force-uploading the last index file, and move the initialization logic out of the constructor.

  4. Remove invalid configuration items that exist only in old versions.

  5. During pop consumption, the revive process generates a large number of random reads against tiered storage, so the amount of data fetched by prefetch is reduced in this case.
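The rolling rule in item 2 can be sketched as a small predicate. This is an illustration in Python, not the broker's Java code: the class name `RollingPolicy`, its field names, and the use of millisecond timestamps are assumptions made for the example. Only the defaults (one day, 16 MB) come from the description above.

```python
from dataclasses import dataclass

ONE_DAY_MS = 24 * 60 * 60 * 1000          # default minimum age from item 2
MIN_ROLL_SIZE_BYTES = 16 * 1024 * 1024    # default minimum size from item 2


@dataclass(frozen=True)
class RollingPolicy:
    """Hypothetical holder for the two configurable rolling thresholds."""
    min_age_ms: int = ONE_DAY_MS
    min_size_bytes: int = MIN_ROLL_SIZE_BYTES

    def should_roll(self, created_at_ms: int, size_bytes: int, now_ms: int) -> bool:
        # Both conditions must hold: the file is older than the minimum age
        # AND larger than the minimum size.
        old_enough = now_ms - created_at_ms > self.min_age_ms
        large_enough = size_bytes > self.min_size_bytes
        return old_enough and large_enough
```

Requiring both conditions keeps a slow topic from producing many tiny files: a file that is old but still small stays open until it grows past the size threshold.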

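One way to avoid the delete-then-retransmit cycle from item 1 is to cap the expiry cutoff at the earliest timestamp still held in local storage. The sketch below is only one possible guard, written in Python for illustration, and is not necessarily what the linked fix does. `Segment`, `select_deletable`, and all parameter names are hypothetical.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Segment:
    """Hypothetical tiered-storage file, identified by its newest message time."""
    name: str
    max_timestamp_ms: int


def select_deletable(segments: List[Segment], now_ms: int,
                     tiered_retention_ms: int,
                     local_earliest_ms: int) -> List[Segment]:
    # Cutoff implied by the tiered-storage retention time alone.
    expire_before_ms = now_ms - tiered_retention_ms
    # Assumption: data newer than the earliest local message could be
    # uploaded again after deletion, so it is never selected here.
    cutoff_ms = min(expire_before_ms, local_earliest_ms)
    return [s for s in segments if s.max_timestamp_ms < cutoff_ms]
```

With this guard, the scheduled task deletes a file only when it has expired in tiered storage and local storage no longer holds the same data, so nothing that was just removed is uploaded again.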

Describe Alternatives You've Considered

None

Additional Context

No response
