feat: Implementing cross-segment read/write for WAL based on local disk #1556

Merged: 16 commits, Sep 4, 2024

@dracoooooo dracoooooo commented Aug 23, 2024

Rationale

Improve the WAL implementation based on local disk.

This is a follow-up task for #1552.

Detailed Changes

  1. Make MAX_FILE_SIZE configurable.
  2. Allocate enough space when creating a segment to avoid remapping when appending to the segment.
  3. Add MultiSegmentLogIterator to enable cross-segment reading.
  4. When writing, if the current segment has insufficient space, create a new segment and write to it instead.
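
The rollover and cross-segment read described in items 2 to 4 can be sketched as follows. This is a minimal in-memory illustration, not the code from this PR: the `Segment` and `Wal` types, the `try_append`, `append`, and `iter` methods, and the use of a `Vec` in place of a preallocated, memory-mapped file are all assumptions made for the example. Only the idea of a configurable maximum size, a fixed capacity per segment, rollover on insufficient space, and one iterator spanning all segments comes from the description above.

```rust
/// Hypothetical in-memory stand-in for a preallocated segment file.
struct Segment {
    id: u64,
    /// Fixed at creation, mimicking a file whose space is allocated up front.
    capacity: usize,
    used: usize,
    records: Vec<Vec<u8>>,
}

impl Segment {
    fn new(id: u64, capacity: usize) -> Self {
        Segment { id, capacity, used: 0, records: Vec::new() }
    }

    /// Appends the record if it fits in the remaining space; never grows the segment.
    fn try_append(&mut self, record: &[u8]) -> bool {
        if record.len() > self.capacity - self.used {
            return false;
        }
        self.used += record.len();
        self.records.push(record.to_vec());
        true
    }
}

/// Hypothetical WAL holding segments in creation order; the last one is active.
struct Wal {
    /// Plays the role of a configurable maximum segment size.
    max_segment_size: usize,
    segments: Vec<Segment>,
}

impl Wal {
    fn new(max_segment_size: usize) -> Self {
        Wal { max_segment_size, segments: vec![Segment::new(0, max_segment_size)] }
    }

    /// Writes to the active segment, rolling over to a new one when space runs out.
    /// Returns the id of the segment that received the record.
    fn append(&mut self, record: &[u8]) -> Result<u64, String> {
        if record.len() > self.max_segment_size {
            return Err(format!("record of {} bytes exceeds segment size", record.len()));
        }
        let last = self.segments.len() - 1;
        if self.segments[last].try_append(record) {
            return Ok(self.segments[last].id);
        }
        let next_id = self.segments[last].id + 1;
        let mut next = Segment::new(next_id, self.max_segment_size);
        next.try_append(record); // always fits: size was checked above
        self.segments.push(next);
        Ok(next_id)
    }

    /// Reads records across all segments in write order.
    fn iter(&self) -> impl Iterator<Item = (u64, &[u8])> + '_ {
        self.segments
            .iter()
            .flat_map(|seg| seg.records.iter().map(move |r| (seg.id, r.as_slice())))
    }
}
```

With a maximum size of 8 bytes, two 4-byte records fill segment 0 and a third record lands in segment 1, while `iter` still yields all three in order. A real implementation would also have to account for record headers and for persisting segment boundaries, which this sketch leaves out.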

Test Plan

Unit test.

@github-actions github-actions bot added the feature New feature or request label Aug 23, 2024
@jiacai2050 jiacai2050 self-requested a review August 26, 2024 02:11

@jiacai2050 jiacai2050 left a comment

LGTM

@jiacai2050 jiacai2050 merged commit 28e4760 into apache:main Sep 4, 2024
9 checks passed
jiacai2050 pushed a commit that referenced this pull request Sep 14, 2024
## Rationale

Currently the WAL based on the local disk does not support the delete
function. This PR implements that functionality.

This is a follow-up task of #1552 and #1556.

## Detailed Changes

1. For each `Segment`, add a hashmap to record the minimum and maximum
sequence numbers of all tables within that segment. During `delete` and
`write` operations, this hashmap will be updated. During read
operations, logs will be filtered based on this hashmap.

2. During the `delete` operation, based on the aforementioned hashmap,
if all logs of all tables in a read-only segment (a segment that is not
currently being written to) are marked as deleted, the segment file will
be physically deleted from the disk.
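
The two steps above can be sketched roughly as follows. This is an illustrative in-memory model, not the code from the commit: the names `SeqRange`, `record_write`, `mark_deleted`, `live_logs`, and `delete_up_to` are hypothetical, and the sketch assumes that a delete request means "drop all logs of a table up to a given sequence number". Removing a `Segment` from the vector stands in for deleting the segment file from disk.

```rust
use std::collections::HashMap;

type TableId = u64;
type SeqNum = u64;

/// Inclusive range of live (not yet deleted) sequence numbers for one table.
#[derive(Debug, Clone, Copy, PartialEq)]
struct SeqRange {
    min: SeqNum,
    max: SeqNum,
}

struct Segment {
    id: u64,
    /// Log entries as (table, sequence number); payloads are omitted.
    logs: Vec<(TableId, SeqNum)>,
    /// Per-table min/max live sequence numbers within this segment.
    table_ranges: HashMap<TableId, SeqRange>,
}

impl Segment {
    fn new(id: u64) -> Self {
        Segment { id, logs: Vec::new(), table_ranges: HashMap::new() }
    }

    /// Write path: store the entry and widen the table's range.
    fn record_write(&mut self, table: TableId, seq: SeqNum) {
        self.logs.push((table, seq));
        self.table_ranges
            .entry(table)
            .and_modify(|r| {
                r.min = r.min.min(seq);
                r.max = r.max.max(seq);
            })
            .or_insert(SeqRange { min: seq, max: seq });
    }

    /// Delete path: only the map changes; the log bytes stay in place.
    fn mark_deleted(&mut self, table: TableId, up_to: SeqNum) {
        if let Some(range) = self.table_ranges.get(&table).copied() {
            if up_to >= range.max {
                self.table_ranges.remove(&table);
            } else if up_to >= range.min {
                self.table_ranges
                    .insert(table, SeqRange { min: up_to + 1, max: range.max });
            }
        }
    }

    /// Read path: skip entries that fall outside their table's live range.
    fn live_logs(&self) -> impl Iterator<Item = (TableId, SeqNum)> + '_ {
        self.logs.iter().copied().filter(move |(t, s)| {
            self.table_ranges
                .get(t)
                .map_or(false, |r| *s >= r.min && *s <= r.max)
        })
    }
}

/// Segments in creation order; the last one is the active (writable) segment.
struct Wal {
    segments: Vec<Segment>,
}

impl Wal {
    /// Marks logs as deleted, then drops read-only segments with no live logs.
    /// Returns the ids of the removed segments.
    fn delete_up_to(&mut self, table: TableId, up_to: SeqNum) -> Vec<u64> {
        for seg in &mut self.segments {
            seg.mark_deleted(table, up_to);
        }
        let active_id = self.segments.last().map(|s| s.id);
        let mut removed = Vec::new();
        self.segments.retain(|seg| {
            let keep = Some(seg.id) == active_id || !seg.table_ranges.is_empty();
            if !keep {
                removed.push(seg.id); // a real WAL would remove the file here
            }
            keep
        });
        removed
    }
}
```

Keeping the active segment even when its map is empty mirrors the rule that only read-only segments are physically deleted. The sketch assumes segment ids are unique.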

## Test Plan

Unit tests, TSBS, and a local script that repeatedly inserts data,
forcibly kills the database process, and restarts it to test
persistence.