Work on WASM support, part 1/2: Remove the asynchronous code in indexer. #1312
(cargo check passes, tests are broken)
Thanks @StyMaar for your PR. I think that my explanation was not very clear :/ and, more importantly, I did not understand that you wanted to remove the merger thread pool too!

Rationale without segment updater thread and with merger thread pool

Just to clarify what I tried to explain to you in my previous answer (but failed). The segment updater thread ensures that each operation is executed one by one. Removing the segment updater thread then raises the question: can 2 operations be executed at the same time without interfering with each other? The code is complex and it's hard to know what can happen. Here is a theoretical example that can lead to unexpected results:
The tantivy/src/indexer/segment_updater.rs Lines 413 to 447 in 2069e3e
I don't see how the code guarantees safe reads and writes of the metas and ensures that at the end we have index metas with only segments 4 and 5. I looked only at

Rationale without segment updater thread and merger thread pool

I see 2 problems:
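The serialization argument above can be illustrated with a minimal sketch, under stated assumptions: `Op`, `apply`, and `run_serialized` are hypothetical names, not tantivy's API, and the operation sequence used with it is an invented illustration rather than the example from this comment. A single worker thread owns the segment list and drains a channel, so operations are applied strictly one by one, in the order they are received.

```rust
use std::sync::mpsc;
use std::thread;

// Hypothetical stand-in for an operation on the index metas (not tantivy's API).
enum Op {
    // Register a new segment id.
    AddSegment(u32),
    // Replace the listed segments with a single merged segment.
    Merge { inputs: Vec<u32>, output: u32 },
}

// Applies one operation to the list of live segment ids.
fn apply(segments: &mut Vec<u32>, op: Op) {
    match op {
        Op::AddSegment(id) => segments.push(id),
        Op::Merge { inputs, output } => {
            segments.retain(|id| !inputs.contains(id));
            segments.push(output);
        }
    }
}

// Spawns one worker thread that owns the segment list and drains the channel,
// so operations are applied one at a time, in the order they are received.
fn run_serialized(ops: Vec<Op>) -> Vec<u32> {
    let (tx, rx) = mpsc::channel::<Op>();
    let worker = thread::spawn(move || {
        let mut segments = Vec::new();
        for op in rx {
            apply(&mut segments, op);
        }
        segments
    });
    for op in ops {
        tx.send(op).expect("worker thread is alive");
    }
    drop(tx); // closing the channel lets the worker loop end
    worker.join().expect("worker thread panicked")
}
```

Because only the worker thread ever touches `segments`, no two calls to `apply` can overlap. Once that single owner is removed, the same guarantee has to come from some other form of synchronization.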
Closing as this PR is way too naive. Good news, however: incidentally, I am removing all of the async stuff in #1315 (in a correct way).
Of course it's naive: I wrote this pull request explicitly as a way to identify exactly how such a change would break tantivy, in order to reach a design that actually works. I appreciate that you're working on an alternative. Unfortunately it won't work in WASM as it is, but maybe we can find a way to make it work with your new design.
Warning: This PR is far from a mergeable state; it's intended as a PoC and as support for technical discussions.
Motivations:
In order to port Tantivy to WASM (including indexing), there are two main issues in the current Tantivy implementation:

- `segment_updater.rs` currently uses asynchronous code, which relies on `futures::executor` to work. This is what's addressed in this pull request.
- `index_writer.rs` (not the subject of this PR)

What this PR does:

In this PR, I removed all asynchronous code and made the code synchronous instead. At first glance this should not affect the behavior in any way, since the current code uses `futures::executor::block_on` on top of the asynchronous code.

But when discussing it with @fmassot, he questioned whether this would alter the logic:
Using sync instead of async changes this logic.
I'm not entirely sure I understand what he meant, or that he understood what I had in mind, which is why I'm creating this pull request, so we can discuss it more concretely.
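To make the "at first glance" claim concrete, here is a minimal sketch using only the standard library. The `block_on` below is a simplified stand-in written for illustration, not the implementation of `futures::executor::block_on`, and `count_async` / `count_sync` are hypothetical functions. It only shows that a single caller observes the same result either way. It says nothing about whether operations that used to be serialized on another thread can now overlap, which is the point under discussion.

```rust
use std::future::Future;
use std::sync::Arc;
use std::task::{Context, Poll, Wake, Waker};
use std::thread::{self, Thread};

// Waker that unparks the thread blocked in `block_on`.
struct ThreadWaker(Thread);

impl Wake for ThreadWaker {
    fn wake(self: Arc<Self>) {
        self.0.unpark();
    }
}

// Simplified stand-in for a blocking executor: polls the future on the
// current thread and parks the thread whenever the future is pending.
fn block_on<F: Future>(fut: F) -> F::Output {
    let mut fut = Box::pin(fut);
    let waker = Waker::from(Arc::new(ThreadWaker(thread::current())));
    let mut cx = Context::from_waker(&waker);
    loop {
        match fut.as_mut().poll(&mut cx) {
            Poll::Ready(out) => return out,
            Poll::Pending => thread::park(),
        }
    }
}

// Hypothetical operation in "async" form. It returns an already-completed
// future; a real implementation would typically be an `async fn`.
fn count_async(pending: Vec<u32>) -> impl Future<Output = usize> {
    std::future::ready(pending.len())
}

// The same hypothetical operation in synchronous form.
fn count_sync(pending: Vec<u32>) -> usize {
    pending.len()
}
```

From the caller's point of view, `block_on(count_async(v))` and `count_sync(v)` both block until the value is available. The difference that matters here is where, and in what order relative to other operations, the work actually runs.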
Note: what must be done before merging can be considered:

- `schedule_task`)
- `tantivy::PreparedCommit::commit_async`: should we wrap the now-synchronous implementation in an async function (by spawning a thread) or just drop it? What do we do in WASM?
- `*_future` even if they don't hold futures anymore.
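One of the options raised for `commit_async` is offloading the now-synchronous call to a spawned thread. A minimal sketch of that idea follows, under stated assumptions: `commit_sync` and `commit_in_background` are hypothetical stand-ins, not tantivy functions, and the sketch hands back a `JoinHandle` rather than a real `Future` to stay within the standard library.

```rust
use std::thread::{self, JoinHandle};

// Hypothetical stand-in for a synchronous commit (not tantivy's API).
// Returns an opstamp-like number, or an error if there is nothing to commit.
fn commit_sync(pending_docs: u64) -> Result<u64, String> {
    if pending_docs == 0 {
        return Err("nothing to commit".to_string());
    }
    Ok(pending_docs)
}

// Runs the synchronous commit on a freshly spawned thread and returns a
// handle that the caller can join on later to retrieve the result.
fn commit_in_background(pending_docs: u64) -> JoinHandle<Result<u64, String>> {
    thread::spawn(move || commit_sync(pending_docs))
}
```

A caller would retrieve the result with `commit_in_background(n).join()`. Thread spawning is typically unavailable on targets such as `wasm32-unknown-unknown`, so this option would likely not carry over to WASM as-is, which leaves the question above open.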