feat: Seq scanner scans data by time range #4809

Open. Wants to merge 12 commits into base: main


@evenyag evenyag commented Oct 10, 2024

I hereby agree to the terms of the GreptimeDB CLA.

Refer to a related PR or issue link (optional)

#4757

What's changed and what's your intention?

Seq scanner scans data according to the order of partition ranges.
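As a rough illustration of scanning by partition range order, the sketch below visits each range in the order given and emits the rows that fall inside it, sorted by timestamp. TimeRange, Source, and scan_by_ranges are hypothetical names made up for this example and are not the mito2 types; the end of each range is assumed to be exclusive.

```rust
/// Hypothetical time range in milliseconds; `end` is assumed to be exclusive.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct TimeRange {
    pub start: i64,
    pub end: i64,
}

/// Hypothetical data source (a file or a memtable) reduced to its timestamps.
pub struct Source {
    pub timestamps: Vec<i64>,
}

/// Scans `sources` one partition range at a time, in the order of `ranges`.
/// Each inner vector holds the timestamps of one range, sorted ascending.
pub fn scan_by_ranges(ranges: &[TimeRange], sources: &[Source]) -> Vec<Vec<i64>> {
    ranges
        .iter()
        .map(|range| {
            // Collect rows from every source that fall inside this range.
            let mut rows: Vec<i64> = sources
                .iter()
                .flat_map(|source| source.timestamps.iter().copied())
                .filter(|ts| range.start <= *ts && *ts < range.end)
                .collect();
            // Merge step, simplified to a sort for this sketch.
            rows.sort_unstable();
            rows
        })
        .collect()
}
```

A real scanner would prune sources by their time range metadata instead of filtering every row, and would merge sorted streams rather than sort in memory; the sketch only shows the range-by-range ordering.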

Moves scan_file_ranges and scan_mem_ranges to scan_util. Refactors them so that SeqScan and UnorderedScan can reuse them.

To reuse code, this PR adds a new metrics struct, PartitionMetrics, and shares it between streams in the same partition. The struct also prints the debug log in drop(), so the log is still emitted when a stream is dropped before it is exhausted.
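The log-on-drop idea can be sketched as follows. This is a minimal illustration and not the struct from this PR: the fields, the record_batch method, and the report sink (which stands in for a debug log call) are all assumptions. The shared state lives behind an Arc, so the final report is written exactly once, when the last stream handle is dropped.

```rust
use std::sync::{Arc, Mutex};

/// Hypothetical counters collected while scanning one partition.
#[derive(Debug, Default, Clone, PartialEq)]
pub struct ScanCounters {
    pub num_rows: usize,
    pub num_batches: usize,
}

/// Shared state; dropped once, after the last handle goes away.
struct Inner {
    partition: usize,
    counters: Mutex<ScanCounters>,
    /// Assumed sink for the final report; a real implementation would log instead.
    report: Arc<Mutex<Vec<String>>>,
}

impl Drop for Inner {
    fn drop(&mut self) {
        // Runs even if no stream was polled to completion.
        let counters = self.counters.get_mut().unwrap_or_else(|e| e.into_inner());
        let line = format!(
            "partition {}: {} rows in {} batches",
            self.partition, counters.num_rows, counters.num_batches
        );
        if let Ok(mut sink) = self.report.lock() {
            sink.push(line);
        }
    }
}

/// Cheaply cloneable handle; every stream of a partition holds one clone.
#[derive(Clone)]
pub struct PartitionMetrics(Arc<Inner>);

impl PartitionMetrics {
    pub fn new(partition: usize, report: Arc<Mutex<Vec<String>>>) -> Self {
        Self(Arc::new(Inner {
            partition,
            counters: Mutex::new(ScanCounters::default()),
            report,
        }))
    }

    /// Records one batch of `rows` rows produced by any stream.
    pub fn record_batch(&self, rows: usize) {
        let mut counters = self.0.counters.lock().unwrap();
        counters.num_rows += rows;
        counters.num_batches += 1;
    }
}
```

Putting the report in Drop of the shared inner state, rather than at the end of each stream, means an early cancel still produces one summary per partition.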

It also removes all unused code.

Some work remains:

  • Remove the scan parallelism. But we need to support file-level parallelism.
  • Support splitting multiple row groups in SeqScan.
  • Support field pruning in last_non_null mode if the time range only has one file.
  • More tests for RangeMeta and StreamContext

Checklist

  • I have written the necessary rustdoc comments.
  • I have added the necessary unit tests and integration tests.
  • This PR requires documentation updates.

@github-actions github-actions bot added the docs-not-required This change does not impact docs. label Oct 10, 2024
@evenyag evenyag marked this pull request as ready for review October 10, 2024 06:02

codecov bot commented Oct 10, 2024

Codecov Report

Attention: Patch coverage is 95.70957% with 13 lines in your changes missing coverage. Please review.

Project coverage is 84.01%. Comparing base (a8ed3db) to head (74b4088).
Report is 26 commits behind head on main.

Additional details and impacted files
@@            Coverage Diff             @@
##             main    #4809      +/-   ##
==========================================
- Coverage   84.42%   84.01%   -0.42%     
==========================================
  Files        1124     1128       +4     
  Lines      204759   208254    +3495     
==========================================
+ Hits       172873   174955    +2082     
- Misses      31886    33299    +1413     

src/mito2/src/read/seq_scan.rs (resolved review thread)
@evenyag evenyag mentioned this pull request Oct 14, 2024
7 tasks

evenyag commented Oct 16, 2024

Commit 74b4088 changed PartitionRange::end to be exclusive. @waynexia @discord9
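For reviewers, a small sketch of what an exclusive end implies. HalfOpenRange and from_inclusive are hypothetical names for illustration and do not mirror the actual PartitionRange definition: with a half-open range [start, end), adjacent ranges share a boundary value without overlapping, and an inclusive maximum timestamp has to be bumped by one when converted.

```rust
/// Hypothetical half-open time range: `start` inclusive, `end` exclusive.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct HalfOpenRange {
    pub start: i64,
    pub end: i64,
}

impl HalfOpenRange {
    /// Returns true if `ts` lies in [start, end).
    pub fn contains(&self, ts: i64) -> bool {
        self.start <= ts && ts < self.end
    }

    /// Returns true if the two ranges share at least one timestamp.
    pub fn overlaps(&self, other: &HalfOpenRange) -> bool {
        self.start < other.end && other.start < self.end
    }
}

/// Converts an inclusive [min, max] range to a half-open one.
/// Returns None if `max + 1` would overflow.
pub fn from_inclusive(min: i64, max: i64) -> Option<HalfOpenRange> {
    max.checked_add(1).map(|end| HalfOpenRange { start: min, end })
}
```

With this convention a timestamp on the boundary belongs to exactly one of two adjacent ranges, so it is never scanned twice.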
