This repository has been archived by the owner on Apr 4, 2023. It is now read-only.

speedup exact words #538

Merged
merged 4 commits into from
May 30, 2022

MarinPostma
Contributor

This PR makes exact_words return an Option instead of an empty set, since set creation is costly, as noticed by @Kerollmops.

I was not convinced that this was the cause of all of the performance drop we measured, and then realized that the methods that initialize it were called recursively, which caused the initialization times to add up. While the first fix solves the issue when exact words are not used, using exact words remained far more expensive than it should be. To address this, the exact words are now cached in the Context, so they are initialized only once.
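
The two fixes described above can be sketched roughly as follows. This is a minimal illustration using only the standard library, not the actual milli code: the `Index` and `Context` types, the `loads` counter, the `is_exact` helper, and the `count_exact` function are all hypothetical names invented for this example, and a `HashSet<String>` stands in for whatever set type the real index uses.

```rust
use std::cell::Cell;
use std::collections::HashSet;

/// Hypothetical stand-in for the index. `loads` counts how many times
/// the exact-words set is actually built, to make the cost visible.
struct Index {
    exact_words: Vec<String>,
    loads: Cell<usize>,
}

impl Index {
    /// First fix: return `None` when no exact words are configured,
    /// instead of allocating and returning an empty set.
    fn exact_words(&self) -> Option<HashSet<String>> {
        if self.exact_words.is_empty() {
            return None;
        }
        self.loads.set(self.loads.get() + 1);
        Some(self.exact_words.iter().cloned().collect())
    }
}

/// Hypothetical search context. Second fix: the set is built once here
/// and then shared by every recursive call.
struct Context {
    exact_words: Option<HashSet<String>>,
}

impl Context {
    fn new(index: &Index) -> Self {
        Context { exact_words: index.exact_words() }
    }

    /// Looks a word up in the cached set; `None` means "no exact words".
    fn is_exact(&self, word: &str) -> bool {
        self.exact_words
            .as_ref()
            .map_or(false, |set| set.contains(word))
    }
}

/// Recursive walk over the query words. It only borrows the context,
/// so the set is never rebuilt, however deep the recursion goes.
fn count_exact(ctx: &Context, words: &[&str]) -> usize {
    match words.split_first() {
        None => 0,
        Some((first, rest)) => {
            usize::from(ctx.is_exact(first)) + count_exact(ctx, rest)
        }
    }
}
```

In this sketch, calling `index.exact_words()` inside `count_exact` instead of reading it from the context would rebuild the set at every level of recursion, which is the kind of cost accumulation the description refers to.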

@MarinPostma MarinPostma requested a review from Kerollmops May 24, 2022 07:50
milli/src/index.rs (resolved)
milli/src/search/query_tree.rs (outdated, resolved)
milli/src/search/query_tree.rs (outdated, resolved)
milli/src/search/query_tree.rs (outdated, resolved)
@MarinPostma MarinPostma requested a review from Kerollmops May 24, 2022 10:17
@Kerollmops Kerollmops added the no breaking The related changes are not breaking (DB nor API) label May 24, 2022
milli/src/search/query_tree.rs (outdated, resolved)
milli/src/search/query_tree.rs (outdated, resolved)
@MarinPostma MarinPostma requested a review from Kerollmops May 24, 2022 12:15
Member

@Kerollmops Kerollmops left a comment

Could you please run the search benchmarks on this branch? This way we will see the changes!

@MarinPostma
Contributor Author

group                                                                                                    search_songs_main_9f78e392             search_songs_speedup-exact-words_25fc5766
-----                                                                                                    --------------------------             -----------------------------------------
smol-songs.csv: asc + default/Notstandskomitee                                                           1.01      3.6±0.02ms        ? ?/sec    1.00      3.6±0.03ms        ? ?/sec
smol-songs.csv: asc + default/charles                                                                    1.05      2.5±0.01ms        ? ?/sec    1.00      2.4±0.01ms        ? ?/sec
smol-songs.csv: asc + default/charles mingus                                                             1.18      4.0±0.02ms        ? ?/sec    1.00      3.4±0.01ms        ? ?/sec
smol-songs.csv: asc + default/david                                                                      1.04      3.4±0.01ms        ? ?/sec    1.00      3.2±0.01ms        ? ?/sec
smol-songs.csv: asc + default/david bowie                                                                1.11      5.5±0.02ms        ? ?/sec    1.00      4.9±0.02ms        ? ?/sec
smol-songs.csv: asc + default/john                                                                       1.04      3.7±0.01ms        ? ?/sec    1.00      3.5±0.02ms        ? ?/sec
smol-songs.csv: asc + default/marcus miller                                                              1.12      6.1±0.02ms        ? ?/sec    1.00      5.5±0.03ms        ? ?/sec
smol-songs.csv: asc + default/michael jackson                                                            1.11      5.5±0.02ms        ? ?/sec    1.00      5.0±0.02ms        ? ?/sec
smol-songs.csv: asc + default/tamo                                                                       1.10  1706.7±15.65µs        ? ?/sec    1.00  1552.0±11.03µs        ? ?/sec
smol-songs.csv: asc + default/thelonious monk                                                            1.08      5.7±0.02ms        ? ?/sec    1.00      5.3±0.02ms        ? ?/sec
smol-songs.csv: asc/Notstandskomitee                                                                     1.02      3.3±0.02ms        ? ?/sec    1.00      3.2±0.01ms        ? ?/sec
smol-songs.csv: asc/charles                                                                              1.28    678.5±3.86µs        ? ?/sec    1.00   530.2±12.36µs        ? ?/sec
smol-songs.csv: asc/charles mingus                                                                       1.56  1409.0±40.88µs        ? ?/sec    1.00    903.5±4.18µs        ? ?/sec
smol-songs.csv: asc/david                                                                                1.18    963.3±4.31µs        ? ?/sec    1.00    816.2±8.37µs        ? ?/sec
smol-songs.csv: asc/david bowie                                                                          1.39  1722.6±39.79µs        ? ?/sec    1.00   1238.6±8.40µs        ? ?/sec
smol-songs.csv: asc/john                                                                                 1.20    849.6±3.28µs        ? ?/sec    1.00    707.9±5.37µs        ? ?/sec
smol-songs.csv: asc/marcus miller                                                                        1.43   1635.8±8.91µs        ? ?/sec    1.00  1142.0±17.96µs        ? ?/sec
smol-songs.csv: asc/michael jackson                                                                      1.41  1684.4±18.05µs        ? ?/sec    1.00  1192.5±12.75µs        ? ?/sec
smol-songs.csv: asc/tamo                                                                                 2.73    231.0±1.57µs        ? ?/sec    1.00     84.5±0.84µs        ? ?/sec
smol-songs.csv: asc/thelonious monk                                                                      1.15      4.0±0.02ms        ? ?/sec    1.00      3.5±0.02ms        ? ?/sec
smol-songs.csv: basic filter: <=/Notstandskomitee                                                        2.01   298.6±16.12µs        ? ?/sec    1.00   148.5±11.89µs        ? ?/sec
smol-songs.csv: basic filter: <=/charles                                                                 3.97    193.7±1.28µs        ? ?/sec    1.00     48.9±0.46µs        ? ?/sec
smol-songs.csv: basic filter: <=/charles mingus                                                          5.76    604.8±6.50µs        ? ?/sec    1.00    105.0±0.85µs        ? ?/sec
smol-songs.csv: basic filter: <=/david                                                                   4.49    185.3±1.67µs        ? ?/sec    1.00     41.2±0.61µs        ? ?/sec
smol-songs.csv: basic filter: <=/david bowie                                                             6.78    579.2±6.94µs        ? ?/sec    1.00     85.4±0.99µs        ? ?/sec
smol-songs.csv: basic filter: <=/john                                                                    5.28    180.4±2.24µs        ? ?/sec    1.00     34.2±0.26µs        ? ?/sec
smol-songs.csv: basic filter: <=/marcus miller                                                           6.26    601.8±6.72µs        ? ?/sec    1.00     96.1±0.70µs        ? ?/sec
smol-songs.csv: basic filter: <=/michael jackson                                                         5.48    615.3±4.05µs        ? ?/sec    1.00    112.3±0.96µs        ? ?/sec
smol-songs.csv: basic filter: <=/tamo                                                                    5.26    180.7±1.13µs        ? ?/sec    1.00     34.4±0.43µs        ? ?/sec
smol-songs.csv: basic filter: <=/thelonious monk                                                         3.99    630.8±5.85µs        ? ?/sec    1.00    158.0±1.40µs        ? ?/sec
smol-songs.csv: basic filter: TO/Notstandskomitee                                                        2.09   294.8±14.58µs        ? ?/sec    1.00    140.9±0.96µs        ? ?/sec
smol-songs.csv: basic filter: TO/charles                                                                 3.92    192.8±2.70µs        ? ?/sec    1.00     49.2±0.39µs        ? ?/sec
smol-songs.csv: basic filter: TO/charles mingus                                                          5.72    610.2±4.74µs        ? ?/sec    1.00    106.6±1.24µs        ? ?/sec
smol-songs.csv: basic filter: TO/david                                                                   4.49    189.5±1.35µs        ? ?/sec    1.00     42.2±0.41µs        ? ?/sec
smol-songs.csv: basic filter: TO/david bowie                                                             6.69    581.2±5.23µs        ? ?/sec    1.00     86.9±0.65µs        ? ?/sec
smol-songs.csv: basic filter: TO/john                                                                    5.15    182.6±1.73µs        ? ?/sec    1.00     35.5±0.39µs        ? ?/sec
smol-songs.csv: basic filter: TO/marcus miller                                                           6.14    590.3±7.96µs        ? ?/sec    1.00     96.1±0.93µs        ? ?/sec
smol-songs.csv: basic filter: TO/michael jackson                                                         5.33    600.4±4.35µs        ? ?/sec    1.00    112.5±0.96µs        ? ?/sec
smol-songs.csv: basic filter: TO/tamo                                                                    5.23    181.9±1.46µs        ? ?/sec    1.00     34.8±0.53µs        ? ?/sec
smol-songs.csv: basic filter: TO/thelonious monk                                                         4.30    641.0±8.20µs        ? ?/sec    1.00    149.2±6.64µs        ? ?/sec
smol-songs.csv: basic placeholder/                                                                       1.04     61.4±0.32µs        ? ?/sec    1.00     59.2±0.47µs        ? ?/sec
smol-songs.csv: basic with quote/"Notstandskomitee"                                                      1.00    204.0±1.54µs        ? ?/sec    1.02    208.7±1.45µs        ? ?/sec
smol-songs.csv: basic with quote/"charles"                                                               1.00    173.1±1.12µs        ? ?/sec    1.02    177.3±1.56µs        ? ?/sec
smol-songs.csv: basic with quote/"charles" "mingus"                                                      1.00  1269.4±10.77µs        ? ?/sec    1.01  1288.0±12.00µs        ? ?/sec
smol-songs.csv: basic with quote/"david"                                                                 1.00    247.2±1.73µs        ? ?/sec    1.01    249.2±1.45µs        ? ?/sec
smol-songs.csv: basic with quote/"david" "bowie"                                                         1.00  1436.9±10.27µs        ? ?/sec    1.00   1438.4±8.40µs        ? ?/sec
smol-songs.csv: basic with quote/"john"                                                                  1.00    362.7±2.56µs        ? ?/sec    1.01    365.9±4.02µs        ? ?/sec
smol-songs.csv: basic with quote/"marcus" "miller"                                                       1.00    306.1±1.76µs        ? ?/sec    1.01    309.5±1.92µs        ? ?/sec
smol-songs.csv: basic with quote/"michael" "jackson"                                                     1.00   1442.3±7.33µs        ? ?/sec    1.00   1441.7±7.51µs        ? ?/sec
smol-songs.csv: basic with quote/"tamo"                                                                  1.00    555.9±3.43µs        ? ?/sec    1.00   556.4±16.79µs        ? ?/sec
smol-songs.csv: basic with quote/"thelonious" "monk"                                                     1.00   1752.4±7.07µs        ? ?/sec    1.00  1752.7±11.03µs        ? ?/sec
smol-songs.csv: basic without quote/Notstandskomitee                                                     1.05      3.5±0.03ms        ? ?/sec    1.00      3.3±0.02ms        ? ?/sec
smol-songs.csv: basic without quote/charles                                                              1.48    436.0±3.54µs        ? ?/sec    1.00    295.4±1.98µs        ? ?/sec
smol-songs.csv: basic without quote/charles mingus                                                       1.19      3.6±0.01ms        ? ?/sec    1.00      3.1±0.07ms        ? ?/sec
smol-songs.csv: basic without quote/david                                                                1.33    591.5±3.88µs        ? ?/sec    1.00    446.4±4.42µs        ? ?/sec
smol-songs.csv: basic without quote/david bowie                                                          1.08      6.9±0.03ms        ? ?/sec    1.00      6.4±0.02ms        ? ?/sec
smol-songs.csv: basic without quote/john                                                                 1.10  1432.4±14.73µs        ? ?/sec    1.00   1299.2±9.68µs        ? ?/sec
smol-songs.csv: basic without quote/marcus miller                                                        1.19      3.1±0.01ms        ? ?/sec    1.00      2.6±0.01ms        ? ?/sec
smol-songs.csv: basic without quote/michael jackson                                                      1.12      4.6±0.02ms        ? ?/sec    1.00      4.1±0.02ms        ? ?/sec
smol-songs.csv: basic without quote/tamo                                                                 1.16  1072.8±11.71µs        ? ?/sec    1.00   927.4±10.85µs        ? ?/sec
smol-songs.csv: basic without quote/thelonious monk                                                      1.10      5.9±0.02ms        ? ?/sec    1.00      5.3±0.06ms        ? ?/sec
smol-songs.csv: big filter/Notstandskomitee                                                              1.03      3.5±0.03ms        ? ?/sec    1.00      3.4±0.02ms        ? ?/sec
smol-songs.csv: big filter/charles                                                                       1.46    464.2±2.57µs        ? ?/sec    1.00    318.6±1.52µs        ? ?/sec
smol-songs.csv: big filter/charles mingus                                                                1.64   1257.3±8.79µs        ? ?/sec    1.00    766.8±4.51µs        ? ?/sec
smol-songs.csv: big filter/david                                                                         1.16   1087.6±9.86µs        ? ?/sec    1.00    937.4±7.77µs        ? ?/sec
smol-songs.csv: big filter/david bowie                                                                   1.27      2.3±0.01ms        ? ?/sec    1.00   1821.5±8.19µs        ? ?/sec
smol-songs.csv: big filter/john                                                                          1.14  1238.0±21.04µs        ? ?/sec    1.00  1089.2±11.10µs        ? ?/sec
smol-songs.csv: big filter/marcus miller                                                                 1.59  1334.0±16.04µs        ? ?/sec    1.00    836.5±4.30µs        ? ?/sec
smol-songs.csv: big filter/michael jackson                                                               1.27      2.3±0.01ms        ? ?/sec    1.00  1795.0±12.29µs        ? ?/sec
smol-songs.csv: big filter/tamo                                                                          1.91    304.8±2.25µs        ? ?/sec    1.00    159.5±1.69µs        ? ?/sec
smol-songs.csv: big filter/thelonious monk                                                               1.11      4.2±0.02ms        ? ?/sec    1.00      3.7±0.02ms        ? ?/sec
smol-songs.csv: desc + default/Notstandskomitee                                                          1.03      3.7±0.02ms        ? ?/sec    1.00      3.6±0.01ms        ? ?/sec
smol-songs.csv: desc + default/charles                                                                   1.08   1882.2±9.71µs        ? ?/sec    1.00  1744.4±14.56µs        ? ?/sec
smol-songs.csv: desc + default/charles mingus                                                            1.18      3.1±0.01ms        ? ?/sec    1.00      2.6±0.01ms        ? ?/sec
smol-songs.csv: desc + default/david                                                                     1.02      6.1±0.02ms        ? ?/sec    1.00      6.0±0.03ms        ? ?/sec
smol-songs.csv: desc + default/david bowie                                                               1.07      9.9±0.04ms        ? ?/sec    1.00      9.3±0.03ms        ? ?/sec
smol-songs.csv: desc + default/john                                                                      1.02      4.9±0.08ms        ? ?/sec    1.00      4.8±0.07ms        ? ?/sec
smol-songs.csv: desc + default/marcus miller                                                             1.11      4.7±0.02ms        ? ?/sec    1.00      4.2±0.02ms        ? ?/sec
smol-songs.csv: desc + default/michael jackson                                                           1.07      7.4±0.03ms        ? ?/sec    1.00      6.9±0.02ms        ? ?/sec
smol-songs.csv: desc + default/tamo                                                                      1.09   1708.4±9.45µs        ? ?/sec    1.00   1574.3±9.82µs        ? ?/sec
smol-songs.csv: desc + default/thelonious monk                                                           1.05      5.6±0.02ms        ? ?/sec    1.00      5.4±0.02ms        ? ?/sec
smol-songs.csv: desc/Notstandskomitee                                                                    1.03      3.3±0.03ms        ? ?/sec    1.00      3.2±0.01ms        ? ?/sec
smol-songs.csv: desc/charles                                                                             1.27    676.9±4.33µs        ? ?/sec    1.00    532.5±3.26µs        ? ?/sec
smol-songs.csv: desc/charles mingus                                                                      1.54   1397.8±8.67µs        ? ?/sec    1.00    905.4±6.42µs        ? ?/sec
smol-songs.csv: desc/david                                                                               1.18    960.7±7.83µs        ? ?/sec    1.00    815.5±3.72µs        ? ?/sec
smol-songs.csv: desc/david bowie                                                                         1.40  1736.7±11.55µs        ? ?/sec    1.00  1243.0±10.59µs        ? ?/sec
smol-songs.csv: desc/john                                                                                1.21    851.0±5.14µs        ? ?/sec    1.00    706.1±3.06µs        ? ?/sec
smol-songs.csv: desc/marcus miller                                                                       1.43   1635.6±6.68µs        ? ?/sec    1.00   1145.1±8.11µs        ? ?/sec
smol-songs.csv: desc/michael jackson                                                                     1.40   1669.3±7.92µs        ? ?/sec    1.00   1192.2±6.99µs        ? ?/sec
smol-songs.csv: desc/tamo                                                                                2.70    227.3±3.44µs        ? ?/sec    1.00     84.3±0.61µs        ? ?/sec
smol-songs.csv: desc/thelonious monk                                                                     1.16      4.1±0.01ms        ? ?/sec    1.00      3.5±0.02ms        ? ?/sec
smol-songs.csv: prefix search/a                                                                          1.12  1372.5±12.54µs        ? ?/sec    1.00  1220.6±17.26µs        ? ?/sec
smol-songs.csv: prefix search/b                                                                          1.13   1235.4±8.87µs        ? ?/sec    1.00  1089.7±10.75µs        ? ?/sec
smol-songs.csv: prefix search/i                                                                          1.11  1506.4±28.31µs        ? ?/sec    1.00  1354.4±17.05µs        ? ?/sec
smol-songs.csv: prefix search/s                                                                          1.18   985.1±15.30µs        ? ?/sec    1.00    833.2±4.68µs        ? ?/sec
smol-songs.csv: prefix search/x                                                                          1.47    429.8±3.09µs        ? ?/sec    1.00    291.8±2.27µs        ? ?/sec
smol-songs.csv: proximity/7000 Danses Un Jour Dans Notre Vie                                             3.65     18.7±0.51ms        ? ?/sec    1.00      5.1±0.86ms        ? ?/sec
smol-songs.csv: proximity/The Disneyland Sing-Along Chorus                                               1.45     12.8±0.08ms        ? ?/sec    1.00      8.8±0.03ms        ? ?/sec
smol-songs.csv: proximity/Under Great Northern Lights                                                    1.60      5.1±0.02ms        ? ?/sec    1.00      3.2±0.02ms        ? ?/sec
smol-songs.csv: proximity/black saint sinner lady                                                        1.46      6.1±0.02ms        ? ?/sec    1.00      4.2±0.02ms        ? ?/sec
smol-songs.csv: proximity/les dangeureuses 1960                                                          1.18      6.0±0.02ms        ? ?/sec    1.00      5.1±0.02ms        ? ?/sec
smol-songs.csv: typo/Arethla Franklin                                                                    1.92    902.0±6.83µs        ? ?/sec    1.00    469.6±2.70µs        ? ?/sec
smol-songs.csv: typo/Disnaylande                                                                         1.02      3.3±0.03ms        ? ?/sec    1.00      3.3±0.02ms        ? ?/sec
smol-songs.csv: typo/dire straights                                                                      1.11      4.2±0.02ms        ? ?/sec    1.00      3.8±0.02ms        ? ?/sec
smol-songs.csv: typo/fear of the duck                                                                    3.05      3.0±0.01ms        ? ?/sec    1.00    974.4±3.72µs        ? ?/sec
smol-songs.csv: typo/indochie                                                                            2.08    272.8±2.08µs        ? ?/sec    1.00    131.2±1.38µs        ? ?/sec
smol-songs.csv: typo/indochien                                                                           1.83    309.6±1.87µs        ? ?/sec    1.00    169.5±1.25µs        ? ?/sec
smol-songs.csv: typo/klub des loopers                                                                    2.62  1622.6±14.43µs        ? ?/sec    1.00    618.3±6.16µs        ? ?/sec
smol-songs.csv: typo/michel depech                                                                       1.81    952.6±4.39µs        ? ?/sec    1.00    525.2±5.01µs        ? ?/sec
smol-songs.csv: typo/mongus                                                                              1.92    303.2±2.28µs        ? ?/sec    1.00    158.1±0.95µs        ? ?/sec
smol-songs.csv: typo/stromal                                                                             1.86    310.4±2.77µs        ? ?/sec    1.00    167.1±1.18µs        ? ?/sec
smol-songs.csv: typo/the white striper                                                                   2.23  1792.6±10.24µs        ? ?/sec    1.00    804.0±3.72µs        ? ?/sec
smol-songs.csv: typo/thelonius monk                                                                      2.22    759.0±6.25µs        ? ?/sec    1.00    342.5±4.84µs        ? ?/sec
smol-songs.csv: words/7000 Danses / Le Baiser / je me trompe de mots                                     4.00     94.5±3.12ms        ? ?/sec    1.00     23.6±3.71ms        ? ?/sec
smol-songs.csv: words/Bring Your Daughter To The Slaughter but now this is not part of the title         3.78    197.4±9.04ms        ? ?/sec    1.00     52.2±6.56ms        ? ?/sec
smol-songs.csv: words/The Disneyland Children's Sing-Alone song                                          2.50     33.6±0.46ms        ? ?/sec    1.00     13.4±0.61ms        ? ?/sec
smol-songs.csv: words/les liaisons dangeureuses 1793                                                     1.62      7.2±0.03ms        ? ?/sec    1.00      4.5±0.04ms        ? ?/sec
smol-songs.csv: words/seven nation mummy                                                                 1.94      2.5±0.01ms        ? ?/sec    1.00  1308.4±16.35µs        ? ?/sec
smol-songs.csv: words/the black saint and the sinner lady and the good doggo                             3.06    202.0±7.10ms        ? ?/sec    1.00     66.0±6.13ms        ? ?/sec
smol-songs.csv: words/whathavenotnsuchforth and a good amount of words to pop to match the first one     3.08    205.4±6.88ms        ? ?/sec    1.00     66.6±5.31ms        ? ?/sec

@MarinPostma MarinPostma requested a review from Kerollmops May 30, 2022 07:46
Member

@Kerollmops Kerollmops left a comment

Thank you very much! Perfs are kind of back!
bors merge

@bors
Contributor

bors bot commented May 30, 2022

@bors bors bot merged commit 582930d into main May 30, 2022
@bors bors bot deleted the speedup-exact-words branch May 30, 2022 08:36
@irevoire irevoire added the performance Related to the performance in term of search/indexation speed or RAM/CPU/Disk consumption label Jul 5, 2022
Labels
no breaking The related changes are not breaking (DB nor API) performance Related to the performance in term of search/indexation speed or RAM/CPU/Disk consumption