This repository has been archived by the owner on Feb 25, 2023. It is now read-only.

Deinflector optimization #238

Merged (2 commits) on Oct 6, 2019


toasted-nutbread (Collaborator) commented Oct 5, 2019

This change addresses two things:

  1. The deinflection rules added in #235 (Add support for progressive/perfect inflections) weren't specific enough and were generating false positives: invalid terms like 着かないで and 着かないでない were being deinflected incorrectly. This change makes the deinflection rules more specific and also adds support for the とる contraction, which I had previously thought would be more difficult. It fixes all the false positives that I was seeing.

  2. Improves the performance of Deinflector.deinflect by roughly 2x to 5x. Previously, the deinflect algorithm did an intersection test on the rules arrays to see if there was any overlap. This is now a single bitwise & check, performed after each rules array has been converted into a single integer flags value.

    This is similar to the algorithm I saw in rikaichamp when creating #199 (Update deinflect.json), except that I have not changed the JSON file structure. Instead, the JSON data is converted to an array-based structure, which should also be faster to access than object property lookup (i.e. kanaIn = array[0]; rather than kanaIn = object.kanaIn;, and so on). The same conversion turns the rules arrays into integer flags values.

    Performance optimizations #189
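
To make the second point concrete, here is a rough sketch of the idea, not the code from this PR. The rule names, the bit assignments, the helper function names, and the field names other than kanaIn are assumptions chosen for illustration.

```javascript
// Hypothetical rule names and bit assignments, for illustration only.
const RULE_FLAGS = {
    'v1': 1 << 0,
    'v5': 1 << 1,
    'vs': 1 << 2,
    'vk': 1 << 3,
    'adj-i': 1 << 4,
    'iru': 1 << 5
};

// Convert an array of rule names into a single integer bit mask.
// Unknown rule names contribute no bits.
function rulesToFlags(rules) {
    let flags = 0;
    for (const rule of rules) {
        flags |= RULE_FLAGS[rule] || 0;
    }
    return flags;
}

// Array-based check: do the two rule arrays share at least one entry?
function rulesIntersect(rulesA, rulesB) {
    return rulesA.some((rule) => rulesB.includes(rule));
}

// Flag-based check: one bitwise AND on precomputed integers.
function flagsIntersect(flagsA, flagsB) {
    return (flagsA & flagsB) !== 0;
}

// Convert an object-based entry (assumed field names) into an array:
// [kanaIn, kanaOut, rulesInFlags, rulesOutFlags]
function normalizeVariant(variant) {
    return [
        variant.kanaIn,
        variant.kanaOut,
        rulesToFlags(variant.rulesIn),
        rulesToFlags(variant.rulesOut)
    ];
}
```

Under these assumptions, rulesIntersect(['v1', 'v5'], ['v5', 'vs']) and flagsIntersect(rulesToFlags(['v1', 'v5']), rulesToFlags(['v5', 'vs'])) give the same answer, but the flag version does no nested array scanning once the flags have been computed.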

FooSoft (Owner) commented Oct 6, 2019

Sounds good -- I'll push this out to testing to get some more people to hammer on this change.

FooSoft merged commit 14a5e3c into FooSoft:master on Oct 6, 2019

2 participants