Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Use AHash instead of SipHash for storing attributes and classes. #117

Merged
merged 1 commit into from
Mar 3, 2023
Merged

Use AHash instead of SipHash for storing attributes and classes. #117

merged 1 commit into from
Mar 3, 2023

Conversation

adamreichold
Copy link
Member

This has better performance while still being DoS resistant.

This PR is the hasher change broken out from #91 as that part has no interactions with #101 and appears straight forward enough.

This has better performance while still being DoS resistant.
@j-mendez
Copy link
Member

j-mendez commented Mar 1, 2023

@adamreichold I use these changes on another project and it works solid. Changes lg2m too.

@j-mendez
Copy link
Member

j-mendez commented Mar 1, 2023

Hashbrown could fit well too depending on env (not the core from std) @adamreichold . That’s what I use with ahash. For the project using one hash made more sense due to the allocators.

@adamreichold
Copy link
Member Author

Hashbrown could fit well too depending on env (not the core from std) @adamreichold . That’s what I use with ahash. For the project using one hash made more sense due to the allocators.

There is little difference here in the end as std's HashMap is Hashbrown and Hashbrown uses AHash as its default hasher. So in the end this would result in the same code being used, only in possibly slightly different versions. That being the case, I chose AHash as it is the smaller dependency compared to Hashbrown (switching just the hasher instead of adding another copy of the same hash table implementation).

@cfvescovo
Copy link
Member

LGTM. What should we do with #91?

@adamreichold
Copy link
Member Author

LGTM. What should we do with #91?

Decide whether we want all or parts of #101, then if classes are made to be collected lazily drop the switch to String or if we keep collecting them eagerly merge it.

@j-mendez
Copy link
Member

j-mendez commented Mar 2, 2023

Hashbrown could fit well too depending on env (not the core from std) @adamreichold . That’s what I use with ahash. For the project using one hash made more sense due to the allocators.

There is little difference here in the end as std's HashMap is Hashbrown and Hashbrown uses AHash as its default hasher. So in the end this would result in the same code being used, only in possibly slightly different versions. That being the case, I chose AHash as it is the smaller dependency compared to Hashbrown (switching just the hasher instead of adding another copy of the same hash table implementation).

Hashbrown could fit well too depending on env (not the core from std) @adamreichold . That’s what I use with ahash. For the project using one hash made more sense due to the allocators.

There is little difference here in the end as std's HashMap is Hashbrown and Hashbrown uses AHash as its default hasher. So in the end this would result in the same code being used, only in possibly slightly different versions. That being the case, I chose AHash as it is the smaller dependency compared to Hashbrown (switching just the hasher instead of adding another copy of the same hash table implementation).

Valid, great explanation 🙌.

@cfvescovo cfvescovo merged commit 80ba1a5 into rust-scraper:master Mar 3, 2023
@adamreichold adamreichold deleted the ahash branch March 3, 2023 08:22
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants