refactor: change InvertedIndexWriter method signature to offsets to f… #4250

Merged
merged 1 commit into from
Jul 2, 2024

Conversation

Contributor

@v0y4g3r v0y4g3r commented Jul 2, 2024

…acilliate caching

I hereby agree to the terms of the GreptimeDB CLA.

Refer to a related PR or issue link (optional)

What's changed and what's your intention?

Change the method signatures of InvertedIndexWriter to take offset/size arguments so that we can add a cache layer based on the arguments provided.
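
To illustrate why offset/size arguments make caching easier, here is a minimal sketch. It is not the code from this PR: `RangeReader`, `BlobReader`, and `CachedReader` are hypothetical names, and the trait is simplified to a synchronous method returning raw bytes. The point is only that `(offset, size)` can serve directly as a cache key.

```rust
use std::collections::HashMap;

/// Hypothetical, simplified stand-in for a reader trait: synchronous and
/// returning raw bytes (an assumption made to keep the sketch self-contained).
trait RangeReader {
    fn fst(&mut self, offset: u64, size: u32) -> Vec<u8>;
}

/// Toy reader over an in-memory blob that counts how often it is hit.
struct BlobReader {
    blob: Vec<u8>,
    reads: usize,
}

impl RangeReader for BlobReader {
    fn fst(&mut self, offset: u64, size: u32) -> Vec<u8> {
        self.reads += 1;
        let start = offset as usize;
        let end = start + size as usize;
        self.blob[start..end].to_vec()
    }
}

/// Cache wrapper: the (offset, size) arguments alone form the cache key.
struct CachedReader<R> {
    inner: R,
    cache: HashMap<(u64, u32), Vec<u8>>,
}

impl<R: RangeReader> RangeReader for CachedReader<R> {
    fn fst(&mut self, offset: u64, size: u32) -> Vec<u8> {
        if let Some(hit) = self.cache.get(&(offset, size)) {
            return hit.clone();
        }
        // Cache miss: read through the inner reader and remember the bytes.
        let bytes = self.inner.fst(offset, size);
        self.cache.insert((offset, size), bytes.clone());
        bytes
    }
}

fn main() {
    let inner = BlobReader { blob: (0u8..16).collect(), reads: 0 };
    let mut reader = CachedReader { inner, cache: HashMap::new() };
    let first = reader.fst(4, 3);
    let second = reader.fst(4, 3);
    assert_eq!(first, second);
    // The second call was served from the cache.
    assert_eq!(reader.inner.reads, 1);
}
```

With a metadata struct as the argument, the wrapper would first have to derive a key from the struct; with plain integers the key comes for free.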

Checklist

  • I have written the necessary rustdoc comments.
  • I have added the necessary unit tests and integration tests.
  • This PR requires documentation updates.

Summary by CodeRabbit

  • Refactor

    • Updated parameter handling in fst and bitmap methods to improve performance by using direct offsets and sizes instead of metadata references.
  • Tests

    • Adjusted test cases to align with the updated method parameters, ensuring consistent and accurate testing of the new method signatures.

Contributor

coderabbitai bot commented Jul 2, 2024

Note

Reviews paused

Use the following commands to manage reviews:

  • @coderabbitai resume to resume automatic reviews.
  • @coderabbitai review to trigger a single review.

Walkthrough

The recent changes modify how the InvertedIndexReader and associated methods handle offsets and sizes. Instead of passing a metadata structure, methods now directly take offset and size parameters. This simplifies the interfaces of fst and bitmap methods and adjusts related logic and tests accordingly.
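
The shape of the change described above can be sketched roughly as follows. This is an illustration under assumptions, not the actual diff: `IndexMeta`, its field names, `Blob`, and `fst_by_meta` are hypothetical, and the methods are synchronous and return byte slices for brevity.

```rust
/// Hypothetical metadata struct; the field names are assumptions.
struct IndexMeta {
    base_offset: u64,
    relative_fst_offset: u32,
    fst_size: u32,
}

/// Toy in-memory blob standing in for the index file.
struct Blob(Vec<u8>);

impl Blob {
    /// "Before" shape (sketch): the method resolves the byte range from
    /// the metadata struct itself.
    fn fst_by_meta(&self, meta: &IndexMeta) -> &[u8] {
        let offset = meta.base_offset + meta.relative_fst_offset as u64;
        self.fst(offset, meta.fst_size)
    }

    /// "After" shape (sketch): the caller passes an absolute offset and a size.
    fn fst(&self, offset: u64, size: u32) -> &[u8] {
        let start = offset as usize;
        &self.0[start..start + size as usize]
    }
}

fn main() {
    let blob = Blob((0u8..32).collect());
    let meta = IndexMeta { base_offset: 8, relative_fst_offset: 4, fst_size: 3 };

    // The caller now computes the range before calling the reader.
    let fst_offset = meta.base_offset + meta.relative_fst_offset as u64;
    let fst_size = meta.fst_size;

    assert_eq!(blob.fst(fst_offset, fst_size), blob.fst_by_meta(&meta));
    assert_eq!(blob.fst(fst_offset, fst_size), &[12u8, 13, 14][..]);
}
```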

Changes

File Path | Change Summary
--- | ---
src/index/src/inverted_index/format/reader.rs | Updated InvertedIndexReader trait methods to accept direct offset and size parameters instead of metadata structure.
src/index/src/inverted_index/format/reader/blob.rs | Adjusted fst and bitmap method implementations to use direct offset and size. Updated test calls to match the new signature.
src/index/src/inverted_index/format/writer/blob.rs | Modified calls to fst and bitmap methods in the writer to include base offsets and sizes.
src/index/src/inverted_index/search/fst_values_mapper.rs | Updated method call in FstValuesMapper to use the new offset and size parameters. Simplified test setup closure parameters.
src/index/src/inverted_index/search/index_apply/...rs | Changed method calls in PredicatesIndexApplier to utilize new offsets and sizes, updated test signatures for fetching fst.

In the fields where data blooms,
We shift offsets and sizes for new rooms.
No longer bound by meta’s hold,
Directly we seek, our methods bold.
With simplified code and tests anew,
Our indexing reads, swift and true. 🐰📘



@v0y4g3r v0y4g3r marked this pull request as ready for review July 2, 2024 08:17
@v0y4g3r v0y4g3r requested a review from zhongzc as a code owner July 2, 2024 08:17
@github-actions github-actions bot added the docs-not-required This change does not impact docs. label Jul 2, 2024

@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 0

Review details

Configuration used: CodeRabbit UI
Review profile: CHILL

Commits

Files that changed from the base of the PR and between f2c08b8 and 144aad0.

Files selected for processing (5)
  • src/index/src/inverted_index/format/reader.rs (2 hunks)
  • src/index/src/inverted_index/format/reader/blob.rs (4 hunks)
  • src/index/src/inverted_index/format/writer/blob.rs (2 hunks)
  • src/index/src/inverted_index/search/fst_values_mapper.rs (2 hunks)
  • src/index/src/inverted_index/search/index_apply/predicates_apply.rs (6 hunks)
Additional comments not posted (19)
src/index/src/inverted_index/format/reader.rs (2)

33-34: LGTM! But verify the function usage in the codebase.

The method signature for fst has been updated to use offset and size parameters.

However, ensure that all function calls to fst match the new signature.


36-37: LGTM! But verify the function usage in the codebase.

The method signature for bitmap has been updated to use offset and size parameters.

However, ensure that all function calls to bitmap match the new signature.

Verification successful

All function calls to bitmap match the new signature.

The verification confirms that all instances of the bitmap function have been updated to use the offset: u64 and size: u32 parameters as specified in the new method signature.

  • src/index/src/inverted_index/search/fst_values_mapper.rs
  • src/index/src/inverted_index/format/writer/blob.rs
  • src/index/src/inverted_index/format/reader/blob.rs
Scripts executed

The following scripts were executed for the analysis:


Script:

#!/bin/bash
# Description: Verify all function calls to `bitmap` match the new signature.

# Test: Search for the function usage. Expect: Only occurrences of the new signature.
rg --type rust $'bitmap'

Length of output: 24853



Script:

#!/bin/bash
# Description: Verify all function calls to `bitmap` match the new signature with `offset: u64` and `size: u32`.

# Pattern to match function calls to `bitmap` with the new signature.
ast-grep --lang rust --pattern '$_ .bitmap($_, $_)' 

Length of output: 2127

src/index/src/inverted_index/search/fst_values_mapper.rs (2)

51-51: LGTM! But verify the function usage in the codebase.

The bitmap method call has been updated to use the new offset calculation.

However, ensure that all function calls to bitmap match the new signature.

Verification successful

Verify all function calls to bitmap match the new signature.

The bitmap method is used in multiple files. Ensure that all instances follow the new signature with the updated offset calculation.


All bitmap method calls follow the new signature.

The bitmap method calls across the codebase have been verified, and they all include the updated offset calculation as expected.

  • src/index/src/inverted_index/search/fst_values_mapper.rs
  • src/index/src/inverted_index/search/index_apply/predicates_apply.rs
  • src/index/src/inverted_index/format/reader.rs
  • src/index/src/inverted_index/format/writer/blob.rs
Scripts executed

The following scripts were executed for the analysis:


Script:

#!/bin/bash
# Description: Verify all function calls to `bitmap` match the new signature.

# Test: Search for the function usage. Expect: Only occurrences of the new signature.
rg --type rust $'bitmap'

Length of output: 24853



Script:

#!/bin/bash
# Description: Verify all function calls to `bitmap` match the new signature.

# Extract all instances of the `bitmap` method calls and their surrounding lines for context.
rg 'bitmap\(' -A 2 -B 2

Length of output: 22265


82-82: LGTM! But verify the function usage in the codebase.

The mock implementation for bitmap has been updated to use the new offset calculation.

However, ensure that all function calls to bitmap match the new signature.

src/index/src/inverted_index/format/writer/blob.rs (4)

177-183: LGTM! But verify the function usage in the codebase.

The fst method call has been updated to use the new offset calculation.

However, ensure that all function calls to fst match the new signature.


211-217: LGTM! But verify the function usage in the codebase.

The fst method call has been updated to use the new offset calculation.

However, ensure that all function calls to fst match the new signature.


186-201: LGTM! But verify the function usage in the codebase.

The bitmap method call has been updated to use the new offset calculation.

However, ensure that all function calls to bitmap match the new signature.


220-235: LGTM! But verify the function usage in the codebase.

The bitmap method call has been updated to use the new offset calculation.

However, ensure that all function calls to bitmap match the new signature.

Verification successful

All function calls to bitmap match the new signature.

The bitmap method calls in the following files have been verified to use the new offset calculation:

  • src/index/src/inverted_index/format/writer/blob.rs
  • src/index/src/inverted_index/search/fst_values_mapper.rs
  • src/index/src/inverted_index/format/reader/blob.rs
Scripts executed

The following scripts were executed for the analysis:


Script:

#!/bin/bash
# Description: Verify all function calls to `bitmap` match the new signature.

# Test: Search for the function usage. Expect: Only occurrences of the new signature.
rg --type rust $'bitmap'

Length of output: 24853



Script:

#!/bin/bash
# Description: Verify all function calls to `bitmap` match the new signature.

# Searching for the definition of the `bitmap` function to understand its signature.
rg 'async fn bitmap' --type rust

# Searching for all instances of the `bitmap` function call to verify their arguments.
rg '\.bitmap\(' --type rust

Length of output: 1635

src/index/src/inverted_index/format/reader/blob.rs (6)

64-69: LGTM! But verify the function usage in the codebase.

The fst method implementation has been updated to use the new offset calculation.

However, ensure that all function calls to fst match the new signature.


216-222: LGTM! But verify the function usage in the codebase.

The fst method call has been updated to use the new offset calculation.

However, ensure that all function calls to fst match the new signature.


244-246: LGTM! But verify the function usage in the codebase.

The bitmap method call has been updated to use the new offset calculation.

However, ensure that all function calls to bitmap match the new signature.


204-210: LGTM! But verify the function usage in the codebase.

The fst method call has been updated to use the new offset calculation.

However, ensure that all function calls to fst match the new signature.


236-238: LGTM! But verify the function usage in the codebase.

The bitmap method call has been updated to use the new offset calculation.

However, ensure that all function calls to bitmap match the new signature.

Verification successful

Verification Successful

All function calls to bitmap match the new signature with the parameters offset and size.

  • src/index/src/inverted_index/search/fst_values_mapper.rs
  • src/index/src/inverted_index/format/reader/blob.rs
  • src/index/src/inverted_index/format/writer/blob.rs
Scripts executed

The following scripts were executed for the analysis:


Script:

#!/bin/bash
# Description: Verify all function calls to `bitmap` match the new signature.

# Test: Search for the function usage. Expect: Only occurrences of the new signature.
rg --type rust $'bitmap'

Length of output: 24853



Script:

#!/bin/bash
# Description: Verify all function calls to `bitmap` match the new signature.

# Test: Search for the function usage. Expect: Only occurrences of the new signature.
rg --type rust 'bitmap\('

Length of output: 7350


75-79: LGTM! But verify the function usage in the codebase.

The bitmap method implementation has been updated to use the new offset calculation.

However, ensure that all function calls to bitmap match the new signature.

Verification successful

All bitmap function calls match the new signature.

The bitmap function calls found in the codebase correctly use the new signature with two parameters: offset and size.

  • src/index/src/inverted_index/search/fst_values_mapper.rs: .bitmap(self.metadata.base_offset + relative_offset as u64, size)
  • src/index/src/inverted_index/format/reader/blob.rs: let bitmap = blob_reader.bitmap(meta.base_offset, 2).await.unwrap();
  • src/index/src/inverted_index/format/writer/blob.rs: .bitmap(tag0.base_offset + offset as u64, size)
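
The call sites listed above all resolve an absolute offset as `base_offset + relative_offset as u64` before calling `bitmap`. A small sketch of that arithmetic, with overflow checks added, might look like this. The helper `absolute_range` is hypothetical and does not appear in the PR.

```rust
use std::ops::Range;

/// Hypothetical helper: resolve the absolute byte range of a bitmap from a
/// base offset, a relative offset, and a size. Returns None on overflow.
fn absolute_range(base_offset: u64, relative_offset: u32, size: u32) -> Option<Range<u64>> {
    let start = base_offset.checked_add(relative_offset as u64)?;
    let end = start.checked_add(size as u64)?;
    Some(start..end)
}

fn main() {
    // A bitmap stored 20 bytes past a base offset of 100, 2 bytes long.
    assert_eq!(absolute_range(100, 20, 2), Some(120..122));
    // Arithmetic that would overflow u64 is rejected instead of wrapping.
    assert_eq!(absolute_range(u64::MAX, 1, 2), None);
}
```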
Scripts executed

The following scripts were executed for the analysis:


Script:

#!/bin/bash
# Description: Verify all function calls to `bitmap` match the new signature.

# Test: Search for the function usage. Expect: Only occurrences of the new signature.
rg --type rust $'bitmap'

Length of output: 24853



Script:

#!/bin/bash
# Search for function calls to `bitmap` and verify their signatures.
ast-grep --lang rust --pattern '$_.$bitmap($_, $_)' --json

Length of output: 408



Script:

#!/bin/bash
# Search for function calls to `bitmap` and verify their signatures.
rg '\.bitmap\(' --type rust

Length of output: 1354

src/index/src/inverted_index/search/index_apply/predicates_apply.rs (5)

79-80: LGTM!

The calculation of fst_offset and fst_size is correct and aligns with the new method signature.


81-81: LGTM!

The call to reader.fst with the new parameters is correct and aligns with the new method signature.


164-164: LGTM!

The mock_metas function signature is updated correctly to accept tuples of (&'static str, u32).


204-204: LGTM!

The mock_reader is correctly updated to return metadata with the new structure.


205-207: LGTM!

The test setup is correctly updated to call reader.fst with the new parameters.

@v0y4g3r
Contributor Author

v0y4g3r commented Jul 2, 2024

@coderabbitai pause

Contributor

coderabbitai bot commented Jul 2, 2024

Actions performed

Reviews paused.


codecov bot commented Jul 2, 2024

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 84.69%. Comparing base (f2c08b8) to head (144aad0).

Additional details and impacted files
@@            Coverage Diff             @@
##             main    #4250      +/-   ##
==========================================
- Coverage   84.96%   84.69%   -0.28%     
==========================================
  Files        1052     1052              
  Lines      186787   186830      +43     
==========================================
- Hits       158705   158233     -472     
- Misses      28082    28597     +515     

@fengjiachun
Collaborator

@zhongzc PTAL

@v0y4g3r v0y4g3r added this pull request to the merge queue Jul 2, 2024
Merged via the queue into GreptimeTeam:main with commit 2261360 Jul 2, 2024
55 of 69 checks passed
@v0y4g3r v0y4g3r deleted the refactor/index-method-signature branch July 2, 2024 13:11
Labels
docs-not-required This change does not impact docs.

3 participants