add some Japanese transforms #833

Merged: 10 commits, May 4, 2024

Conversation

Lyroxide

Reference: https://tsukuba.repo.nii.ac.jp/record/3420/files/5.pdf

-rya is the contraction of -ba. I can't think of a better romaji name; maybe '-ya' is better?

-cha is the contraction of -ては. It's usually followed by いけない, so "行っちゃいけない" gets highlighted as 行っちゃい (-chau < masu stem), which is incorrect. "行っちゃだめ" won't even show the plain form.

-n usually appears in -んばかり, like いわんばかり

@Lyroxide Lyroxide requested a review from a team as a code owner April 17, 2024 09:16
@StefanVukovic99 StefanVukovic99 added kind/enhancement The issue or PR is a new feature or request area/linguistics The issue or PR is related to linguistics labels Apr 18, 2024

github-actions bot commented Apr 18, 2024

✔️ No visual differences introduced by this PR.

View Playwright Report (note: open the "playwright-report" artifact)

@jamesmaa
Collaborator

Btw you'll have to update after we merged #784

@Lyroxide
Author

@djahandarie Would need your review 🙏

@StefanVukovic99
Collaborator

As the pdf says, the -rya contraction for verbs is pretty rare but there's no harm in having it I guess. Should 帰らなきゃ and such also be supported? Also, what do you think about, instead of having it deinflect directly to the dictionary form, going to -ba, like:

suffixInflection('けりゃ', 'ければ', [], ['ba']) // adding intermediate rule for ba
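To make the chaining idea concrete, here is a minimal sketch, not the project's actual deinflector. The `suffixInflection` stand-in, the `transforms` table, the `deinflect` function, the `'adj-i'` condition name, and the rule that a step may only follow a step whose output conditions it accepts are all assumptions made for illustration.

```javascript
// Hypothetical stand-in for the helper quoted above; the real helper's
// return shape is an assumption here.
function suffixInflection(inflectedSuffix, deinflectedSuffix, conditionsIn, conditionsOut) {
    return {inflectedSuffix, deinflectedSuffix, conditionsIn, conditionsOut};
}

// Illustrative rule table: the contraction rewrites to the -ba form, and a
// separate -ba rule then rewrites to the dictionary form.
const transforms = [
    {name: '-ba -> -(r)ya', rules: [suffixInflection('けりゃ', 'ければ', [], ['ba'])]},
    {name: '-ba', rules: [suffixInflection('ければ', 'い', ['ba'], ['adj-i'])]},
];

// Breadth-first expansion: every candidate is re-examined against all rules,
// so a contraction can reach the dictionary form in two steps.
function deinflect(text) {
    const results = [{text, conditions: [], trace: []}];
    for (let i = 0; i < results.length; ++i) {
        const {text: current, conditions, trace} = results[i];
        for (const {name, rules} of transforms) {
            for (const rule of rules) {
                if (!current.endsWith(rule.inflectedSuffix)) { continue; }
                // The raw input (no conditions yet) accepts any rule; later steps
                // must match one of the conditions produced by the previous step.
                const allowed = conditions.length === 0 ||
                    rule.conditionsIn.some((c) => conditions.includes(c));
                if (!allowed) { continue; }
                const stem = current.slice(0, current.length - rule.inflectedSuffix.length);
                results.push({
                    text: stem + rule.deinflectedSuffix,
                    conditions: rule.conditionsOut,
                    trace: [...trace, name],
                });
            }
        }
    }
    return results;
}

// Prints each candidate form with the transforms applied to reach it.
for (const {text, trace} of deinflect('高けりゃ')) {
    console.log(text, JSON.stringify(trace));
}
```

With this shape, the contraction rule only needs to know about the -ba form; anything the -ba rule can already reach comes for free.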

For naming, since the r sound is not always there, maybe -ya or -(r)ya. If chaining through -ba, maybe name it ba -> (r)ya instead of calling it a contraction.

I don't see this kind of -n for verbs mentioned in the pdf, but I think it's good, since we currently have nothing for 話さんか and such.

-cha is good, I think. 行っちゃいけない will still try to match the longest string, so it will say chau + masu, but that's a separate issue we should hopefully fix soon (we should also start showing the descriptions on hover, and maybe which text preprocessing was applied).
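The longest-match behaviour described here can be illustrated with a small sketch. Everything in it is hypothetical: the `rules` table collapses what would really be multi-step chains into single steps, `dictionary` is a one-entry stand-in, and `findLongestMatch` is not the project's scanner.

```javascript
// Hypothetical single-step rules, written out for 行く only because its
// っ-form is irregular; each collapses a longer real chain into one step.
const rules = [
    {name: '-chau + masu stem', inflectedSuffix: '行っちゃい', deinflectedSuffix: '行く'},
    {name: '-cha', inflectedSuffix: '行っちゃ', deinflectedSuffix: '行く'},
];

// One-entry stand-in for a real dictionary lookup.
const dictionary = new Set(['行く']);

// Tries prefixes of the scanned text from longest to shortest and returns
// the first one that deinflects to a dictionary entry.
function findLongestMatch(text) {
    for (let length = text.length; length > 0; --length) {
        const source = text.slice(0, length);
        for (const rule of rules) {
            if (!source.endsWith(rule.inflectedSuffix)) { continue; }
            const stem = source.slice(0, source.length - rule.inflectedSuffix.length);
            const term = stem + rule.deinflectedSuffix;
            if (dictionary.has(term)) {
                return {source, term, rule: rule.name};
            }
        }
    }
    return null;
}

console.log(findLongestMatch('行っちゃいけない'));
console.log(findLongestMatch('行っちゃだめ'));
```

Because 行っちゃい is one character longer than 行っちゃ, the first call reports the -chau reading even though a -cha rule exists, while the second call can only match through -cha.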

@Lyroxide
Author

I think chaining -ba -> -ya sounds good

> 帰らなきゃ

yeah I forgot about that one. will add that in

> rare

I think it's pretty common, especially for ければ->けりゃ and れば->りゃ. But tbf I don't know if I have ever seen the rest, like 待ちゃ (???)

@jamesmaa jamesmaa added this pull request to the merge queue May 4, 2024
Merged via the queue into yomidevs:master with commit 7e9eed6 May 4, 2024
10 checks passed