Memory map file loading #50
Is that 1313M RES on startup, or could there be a leak? (I'm seeing about half when I test with se.zcheck)
Isn't that a persistent pipe using CG-3's libcg3 API as part of the process? 'Cause if so, then GrammarSoft/cg3#74
Hm, could perhaps reload the data every so often as a workaround, though it might be easier to just restart the divvun-checker process in that case ;-)
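For what it's worth, the restart workaround could be sketched roughly like this. It is only an illustration: `RestartingPipe`, `max_requests` and the stand-in child command are made up for the example, and it assumes a child that answers exactly one output line per input line, which may not match how divvun-checker actually talks over its pipe.

```python
import subprocess
import sys


class RestartingPipe:
    """Hypothetical line-in/line-out wrapper that restarts its child
    process after every `max_requests` requests, so any memory the
    child has accumulated is handed back to the OS."""

    def __init__(self, cmd, max_requests=1000):
        self.cmd = cmd
        self.max_requests = max_requests
        self.restarts = 0
        self._served = 0
        self._proc = None

    def _start(self):
        self._proc = subprocess.Popen(
            self.cmd,
            stdin=subprocess.PIPE,
            stdout=subprocess.PIPE,
            text=True,
            bufsize=1,  # line-buffered on our side
        )
        self._served = 0

    def _stop(self):
        if self._proc is not None:
            self._proc.stdin.close()  # EOF lets the child exit cleanly
            self._proc.wait()
            self._proc.stdout.close()
            self._proc = None

    def request(self, line):
        if self._proc is None:
            self._start()
        elif self._served >= self.max_requests:
            self._stop()
            self._start()
            self.restarts += 1
        self._proc.stdin.write(line + "\n")
        self._proc.stdin.flush()
        self._served += 1
        return self._proc.stdout.readline().rstrip("\n")

    def close(self):
        self._stop()


# Stand-in child (NOT divvun-checker): upper-cases each input line.
ECHO_CMD = [
    sys.executable, "-u", "-c",
    "import sys\n"
    "for l in sys.stdin:\n"
    "    sys.stdout.write(l.upper())\n"
    "    sys.stdout.flush()\n",
]

pipe = RestartingPipe(ECHO_CMD, max_requests=2)
results = [pipe.request(w) for w in ["a", "b", "c", "d", "e"]]
pipe.close()
```

The trade-off is that every restart pays the full startup and data-loading cost again, so `max_requests` would have to be tuned against how fast memory actually grows.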
I was profiling a bit for fun, and at least my version that uses hfst-ospell didn't really have memory leaks, but it used an increasing amount of memory on some cache. I disabled that cache in the latest version; could you test that again? I guess we are planning to replace the hfst-ospell stuff with divvunspell, especially if it continues to be the bottleneck?
ah, is it using hfst-ospell? hehe, well we need to fix that then.
If you could give me a list of the hfst-ospell functionality that libdivvun uses, I can inventory anything missing so it can be ported across. If there's not much, I can publish a stable C API header somewhere (basically leeched straight from divvunspell-sdk-swift without the Swift ;) )
Mmh, I can't remember anymore whether I wrote this stuff, but the main part of hfst_ospell seems to be in speller::Spell here: https://github.com/divvun/libdivvun/blob/master/src/cgspell.cpp#L136
The main additions to standard
@unhammer would know more 😊
oh god analysis, nooooo. Someone else can port that across, hahaha
Yeah, as the code shows, we just use […]. You can easily make a pipeline without the speller step and check if that takes the pain away (just edit the pipespec.xml in your zcheck zip and remove that one element).
Grammar checking is using quite a lot of RAM on our Divvun API server:
We've mitigated this for the spellchecking in DivvunSpell by using mmap instead of loading data into RAM, with minimal performance penalty in our use cases. Is this something that can be implemented for these grammar checking pipelines?
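To illustrate the difference being asked about, here is a minimal sketch using Python's standard `mmap` module. It is not how DivvunSpell or libdivvun are implemented; the helper names and the throwaway demo file are made up for the example.

```python
import mmap
import os
import tempfile


def load_into_ram(path):
    """Eager loading: the whole file is copied into a private bytes object."""
    with open(path, "rb") as f:
        return f.read()


def map_read_only(path):
    """Memory mapping: pages are read lazily on first access, and the OS
    can evict them or share them between processes mapping the same file."""
    with open(path, "rb") as f:
        # Length 0 maps the whole file; the mapping stays valid after
        # the file object is closed.
        return mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ)


# Throwaway demo file standing in for a large data file.
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(b"HEADER" + bytes(1_000_000) + b"TAIL")
    demo_path = tmp.name

eager = load_into_ram(demo_path)
mapped = map_read_only(demo_path)

# Both give the same bytes; only the slices touched here get paged in.
same_header = mapped[:6] == eager[:6]
same_tail = mapped[-4:] == eager[-4:]
mapped_len = len(mapped)

mapped.close()
os.remove(demo_path)
```

The catch for the grammar checking pipelines is that mapping only helps if the consuming code can work directly on the on-disk layout; a loader that parses the file into its own heap structures would still end up with a full copy in RAM.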