Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Consider using a Trie for dictionary storage #53

Closed
sspanak opened this issue Sep 11, 2022 · 4 comments
Closed

Consider using a Trie for dictionary storage #53

sspanak opened this issue Sep 11, 2022 · 4 comments
Labels
enhancement New feature or request stale wontfix This will not be worked on

Comments

@sspanak
Copy link
Owner

sspanak commented Sep 11, 2022

Instead of the flat storage in SQL, try using a Trie. It may be much faster and use less space. When switching, ensure the suggestions still have priorities.

Educational info:

A DAWG may be even a better option, because it allows sharing common intermediate nodes. For example:

t | o | p
  | a |
  | i |

This may be useful for compressing the assets. Here is a nice implementation in Python.

@Clam-
Copy link
Contributor

Clam- commented Jan 12, 2023

You may find that SQLite is quite optimized at searching and storing due to the underlying tree nature of its internals.
Historically the very first version of the app I attempted to use a trie-like data structure. I wasn't very successful, since I wasn't sure how to optimally store it in memory vs disk and ended up consuming all memory trying to load a dictionary haha.
SQLite does all that messy work for you, however even in saying all that I would be very interested to see what you end up creating.
Good luck!

@sspanak
Copy link
Owner Author

sspanak commented Jan 26, 2023

You may find that SQLite is quite optimized at searching and storing due to the underlying tree nature of its internals.

This is not really something absolutely necessary. SQLite works just fine. Even in debug mode it loads the suggestion list in less than 10 ms, so there is no need to replace it.

I was not familiar with this data structure. I found about it accidentally, while I was working on another tree in my real job. The demo in the first link seemed quite impressive, so I though I could experiment with it, when I have enough free time. But of course, I immediatelly realized that a relational database is not an optimal storage for this data structure and I knew I had to find a better one.

Anyway, my interest in this could be classified as purely scientific.

Historically the very first version of the app I attempted to use a trie-like data structure. I wasn't very successful, since I wasn't sure how to optimally store it in memory vs disk and ended up consuming all memory trying to load a dictionary

That's strange. According to many sources this is the data structure for dictionaries. It's the most optimal one and it has been used in early phones that had very little processing power and memory. Who knows... you may have not been successful, because of the way Java works... I may find one day!

@github-actions
Copy link

This issue is stale because it has been open for 100 days with no activity.

@sspanak
Copy link
Owner Author

sspanak commented Sep 25, 2023

This is an interesting academic problem, but a more realistic approach is replacing the database as described in #366. If performance issues still persist, this can be reopened.

@sspanak sspanak closed this as completed Sep 25, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement New feature or request stale wontfix This will not be worked on
Projects
None yet
Development

No branches or pull requests

2 participants