Skip to content
This repository has been archived by the owner on Apr 4, 2023. It is now read-only.

Do not generate keys that are too long for LMDB #522

Merged
merged 2 commits into from
May 3, 2022
Merged

Conversation

Kerollmops
Copy link
Member

This PR fixes meilisearch/meilisearch#2338 by making sure that we do not generate keys that are too long for LMDB especially when we are creating our prefix and proximity pairs keys.

@Kerollmops Kerollmops added bug Something isn't working no breaking The related changes are not breaking (DB nor API) labels May 2, 2022
@Kerollmops Kerollmops changed the title Fix lmdb long keys Do not generate too long keys for LMDB May 2, 2022
@Kerollmops Kerollmops changed the title Do not generate too long keys for LMDB Do not generate keys that are too long for LMDB May 2, 2022
Copy link
Member

@ManyTheFish ManyTheFish left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You seem to want to unwrap each key insertion in the database, there are some in typed_chunk.rs.
Is it relevant for you to unwrap them too?

milli/src/update/index_documents/mod.rs Outdated Show resolved Hide resolved
@MarinPostma
Copy link
Contributor

Why weren't we able to get the error ? And why is unwraping here a reasonable fix?

@Kerollmops Kerollmops force-pushed the fix-lmdb-long-keys branch from b4df522 to 5a8263f Compare May 3, 2022 07:57
@Kerollmops Kerollmops force-pushed the fix-lmdb-long-keys branch from 5a8263f to 211c876 Compare May 3, 2022 08:03
@Kerollmops
Copy link
Member Author

Thank you both for your reviews,

The unwraps were a mistake, sorry, I didn't take the time to review before sending the PR. It will not happen again!
I reduced the test data to a smaller text, thank you.

@Kerollmops Kerollmops requested a review from ManyTheFish May 3, 2022 11:47
Copy link
Member Author

@Kerollmops Kerollmops left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

bors merge

@bors
Copy link
Contributor

bors bot commented May 3, 2022

@dvic
Copy link

dvic commented May 3, 2022

I think I read somewhere that the maximum key size can be configured at compile time for LMDB. Is this a route worthwhile investigating? I guess there is probably some performance impact.

@Kerollmops
Copy link
Member Author

Hey @dvic,

I read that too but it, indeed, impacts the performances and I prefer that we keep most of the settings by default, it is easier to debug and the default maximum key size is already set to a big enough value (512bytes), which mean a pair of words of length ~210 bytes can fit.

Thank you very much for your interest in the project!

bors bot added a commit that referenced this pull request May 3, 2022
519: Release v0.26.4: returns facets even when there is no value associated to it + bug fix during indexation r=curquiza a=curquiza

I cherry-picked the commits from these PRs
- #518
- #522

`release-v0.26.4` has been started from the tag `v0.26.3` and not from main

The release tag `v0.26.4` will be done on the branch `release-v0.26.4` once this PR is merge

Co-authored-by: Kerollmops <clement@meilisearch.com>
Co-authored-by: Clémentine Urquizar <clementine@meilisearch.com>
bors bot added a commit that referenced this pull request May 3, 2022
519: Release v0.26.4: returns facets even when there is no value associated to it + bug fix during indexation + bug fix on typo-tolerance r=Kerollmops a=curquiza

I cherry-picked the commits from these PRs
- #518
- #522
- #520

`release-v0.26.4` has been started from the tag `v0.26.3` and not from main

The release tag `v0.26.4` will be done on the branch `release-v0.26.4` once this PR is merge

Co-authored-by: Kerollmops <clement@meilisearch.com>
Co-authored-by: Clémentine Urquizar <clementine@meilisearch.com>
Co-authored-by: ad hoc <postma.marin@protonmail.com>
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
bug Something isn't working no breaking The related changes are not breaking (DB nor API)
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Indexing fails with 0.27.0 rc versions - MDB_BAD_VALSIZE
4 participants