Sort dictionary entries by number of text processors applied #1200

Casheeew · 2024-07-11T11:25:43Z

Resolves #1060.

This PR adds a textProcessorRuleChainCandidates object, similar to inflectionRuleChainCandidates, so that the text processor rules applied to each entry are easier to trace and entries can be sorted by how many were applied.

For example, after this PR (using CC100 frequency-based sorting), おっと should correctly match 夫 before 音, and へった should correctly match 減る before 下手.
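
To illustrate the idea, here is a minimal sketch of such a comparison. It is not the code from translator.js: the entry shape is simplified, the helper names `shortestChainLength` and `compareEntries` are hypothetical, the frequency values are made up, and using the shortest candidate chain as the count is an assumption. The processor name in the sample data is also only illustrative.

```javascript
// Hypothetical, simplified entry shape; the real TermDictionaryEntry has many more fields.
// Each candidate chain lists the text processors applied to reach the matched text.

/** Returns the length of the shortest chain, or 0 if there are no candidates. */
function shortestChainLength(chains) {
    if (chains.length === 0) { return 0; }
    return Math.min(...chains.map((chain) => chain.length));
}

/** Fewer text processors first; ties fall back to higher frequency first. */
function compareEntries(a, b) {
    const diff = shortestChainLength(a.textProcessorRuleChainCandidates) -
        shortestChainLength(b.textProcessorRuleChainCandidates);
    if (diff !== 0) { return diff; }
    return b.frequency - a.frequency;
}

// Looking up おっと: 夫 matches the text as written, while 音 (おと) only
// matches after a processor rewrites the input. Frequencies are invented.
const entries = [
    {headword: '音', frequency: 900, textProcessorRuleChainCandidates: [['collapseEmphaticSequences']]},
    {headword: '夫', frequency: 500, textProcessorRuleChainCandidates: [[]]},
];

const sorted = [...entries].sort(compareEntries);
console.log(sorted.map((entry) => entry.headword)); // 夫 sorts before 音
```

Because the processor count is compared before frequency, the entry that matches the unmodified input wins even when the other headword is more frequent.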

codspeed-hq · 2024-07-11T11:27:45Z

CodSpeed Performance Report

Merging #1200 will not alter performance

Comparing cashewnuttynuts:sort-by-variants (fa2346e) with master (751acba)

Summary

✅ 5 untouched benchmarks

StefanVukovic99

The translator stuff is pretty complicated so I'll need to look into this more.

test/data/translator-test-results.json

Casheeew · 2024-07-11T13:42:16Z

Okay, I believe the issue is fixed and everything should be sound. I will leave this as a draft a while longer so we can test things more thoroughly, though.

Casheeew · 2024-07-12T08:37:08Z

Tested on Japanese.
Ready for review.

ext/js/language/translator.js

Casheeew added 4 commits July 11, 2024 16:59

sort by text processing chain length

a322242

write dictionary data

12eca5a

fix wrong variantsMap initialization

c73a72e

write dictionary data

26267fe

Casheeew requested a review from a team as a code owner July 11, 2024 11:25

StefanVukovic99 reviewed Jul 11, 2024

test/data/translator-test-results.json (outdated, resolved)

Casheeew marked this pull request as draft July 11, 2024 13:02

fix logic bug

5afa90f

Casheeew added 3 commits July 12, 2024 08:05

fix logic

a128ab5

move textprocessing comparison up

79c7bad

add textProcessorRuleChainCandidates to TermDictionaryEntry

5007672

Casheeew marked this pull request as ready for review July 12, 2024 08:36

StefanVukovic99 added the kind/bug and area/linguistics labels Jul 12, 2024

StefanVukovic99 reviewed Jul 12, 2024

ext/js/language/translator.js (outdated, resolved)

StefanVukovic99 reviewed Jul 12, 2024

ext/js/language/translator.js (outdated, resolved)

StefanVukovic99 reviewed Jul 12, 2024

ext/js/language/translator.js (resolved)

StefanVukovic99 reviewed Jul 12, 2024

ext/js/language/translator.js (outdated, resolved)

Casheeew and others added 6 commits July 12, 2024 20:01

remove comment

ff9bc77

Update ext/js/language/translator.js

d15da5b

Co-authored-by: Stefan Vuković <stefanvukovic44@gmail.com> Signed-off-by: Cashew <52880648+cashewnuttynuts@users.noreply.github.com>

Merge branch 'sort-by-variants' of https://github.com/Scrub1492/lesen…

3e81d1f

…-tan into sort-by-variants

remove unused variable

57820a5

add text replacements to TextProcessorRuleChain

d2404a2

write dictionary data

fa2346e

StefanVukovic99 approved these changes Jul 12, 2024

StefanVukovic99 added this pull request to the merge queue Jul 12, 2024

Merged via the queue into yomidevs:master with commit 502f71c Jul 12, 2024
10 checks passed
