Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Update hashbrown requirement from 0.14.2 to 0.15.0 #6493

Closed
wants to merge 3 commits into from

Conversation

wiedld
Copy link
Contributor

@wiedld wiedld commented Oct 1, 2024

Which issue does this PR close?

Alternative PR to #6486.

Rationale for this change

Update the ver 0.15 of hashbrown.

We are not using the deprecated RawTable, only raw entries.
In order to continue using raw entries, we need to use the inline-more and raw-entry feature flags.

What changes are included in this PR?

See above.

Are there any user-facing changes?

No

@github-actions github-actions bot added parquet Changes to the parquet crate arrow Changes to the arrow crate arrow-flight Changes to the arrow-flight crate labels Oct 1, 2024
@wiedld wiedld marked this pull request as ready for review October 1, 2024 22:39
@@ -29,7 +29,7 @@ fn main() -> Result<(), Box<dyn std::error::Error>> {
// protoc in unbuntu builder needs this option
.protoc_arg("--experimental_allow_proto3_optional")
.out_dir("src")
.compile_with_config(prost_config(), &[proto_path], &[proto_dir])?;
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is a tonic deprecation warning, fixed here. I can move to another PR if a problem.

@crepererum
Copy link
Contributor

From the change log:

Raw entry API is now under raw-entry feature, to be eventually removed.

So if we don't fix that during the upgrade, we should track the migration in a ticket so we're not getting into a pickle when that functionality is actually removed.

@Jefffrey
Copy link
Contributor

Jefffrey commented Oct 2, 2024

From the change log:

Raw entry API is now under raw-entry feature, to be eventually removed.

So if we don't fix that during the upgrade, we should track the migration in a ticket so we're not getting into a pickle when that functionality is actually removed.

Raised #6498

@@ -48,7 +48,7 @@ chrono = { workspace = true }
chrono-tz = { version = "0.10", optional = true }
num = { version = "0.4.1", default-features = false, features = ["std"] }
half = { version = "2.1", default-features = false, features = ["num-traits"] }
hashbrown = { version = "0.14.2", default-features = false }
hashbrown = { version = "0.15.0", default-features = false, features = ["inline-more", "raw-entry"] }
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is setting inline-more strictly necessary? I think it's mutually exclusive from raw-entry as can compile without needing it 🤔

Copy link
Contributor Author

@wiedld wiedld Oct 2, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I got that from reading the src. It's enabled by default for better performance, but we don't compile with the default features. I should have mentioned that in the PR description (sorry). This also points out how I should have really run the performance test. Thank you!

I'm also working on an alternative approach to remove use of raw-entry. I hope to performance test both approaches later today.

@alamb
Copy link
Contributor

alamb commented Oct 2, 2024

Before we merge this, we should certainly run benchmarks given how important hashbrown is to performance

ALso there is a bunch of discussion downstream in DataFusion (which uses the lower level RawEntry APIs) on what to do now that hashbrown plans to remove it

@wiedld wiedld marked this pull request as draft October 2, 2024 16:25
@alamb
Copy link
Contributor

alamb commented Oct 2, 2024

I also wanted to point out given how fundamental hashbrown is to a bunch of things (esp in DataFusion) I think we should let the new 0.15.0 release have some bake time (aka let other projects on the bleeding edge test it and find any bugs)

Thus I think we should target upgrading in the next arrow release (0.52.2) not the one I am preparing now

@wiedld
Copy link
Contributor Author

wiedld commented Oct 2, 2024

Thus I think we should target upgrading in the next arrow release (0.52.2) not the one I am preparing now

Understood. I'll circle back to this one (or anyone else can feel free to take a shot), after I address some higher priority items. I'm partially into an alternative solution, but it's not near ready.

I'll comment here when I pick it up again, if someone else doesn't beat me to it. 😆

@tustvold
Copy link
Contributor

tustvold commented Oct 9, 2024

FWIW I've filed an upstream ticket related to this, to see if the maintainers have any recommendations on how we should move forward here - rust-lang/hashbrown#573

@crepererum
Copy link
Contributor

With #6537 merged, I think we no longer need the raw-entry feature.

@crepererum
Copy link
Contributor

Covered by #6684.

@crepererum crepererum closed this Nov 5, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
arrow Changes to the arrow crate arrow-flight Changes to the arrow-flight crate parquet Changes to the parquet crate
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants