Alpha support for German #53

Open: wants to merge 3 commits into base: main


cainesap

Hi Chris!

  • Add a "lang" (language) argument to argparse, with "en" as the default.
  • This enables multilingual use of ERRANT.
  • Prompted by the "MultiGEC" shared task for NLP4CALL 2025; I will add more language options if you approve the idea.
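A minimal sketch of the argparse change described in the list above, for illustration only. The flag spelling `-lang`, the help text, and the `build_parser` helper are assumptions and are not taken from the actual diff; the script's other arguments are omitted.

```python
import argparse


def build_parser():
    """Build a parser with a language option that defaults to English."""
    parser = argparse.ArgumentParser(
        description="Align parallel text files and write M2 output."
    )
    # Hypothetical flag name; "en" stays the default so existing
    # English-only invocations keep working unchanged.
    parser.add_argument(
        "-lang",
        default="en",
        help="Language code of the input text (default: en).",
    )
    return parser


# Parse an explicit argument list rather than sys.argv for the example.
args = build_parser().parse_args(["-lang", "de"])
print(args.lang)  # de
```

Keeping "en" as the default means callers who never pass the flag see no change in behaviour.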

best, Andrew

add "lang" = language to argparse
@chrisjbryant
Owner

chrisjbryant commented Sep 28, 2024

Hey Andrew!

The above won't actually work, because errant.load officially accepts only en as a valid language in the __init__.py file. The lang parameter also controls where the merger and classifier are loaded from, so you'd need an errant.de directory that contains something similar to the errant.en directory.
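To illustrate the two constraints just described (a whitelist of languages, and per-language merger/classifier locations), here is a rough sketch. It is not the actual contents of `__init__.py`: `SUPPORTED_LANGS` and `component_modules` are hypothetical names, and the only detail taken from this thread is that components live under an `errant.<lang>` directory.

```python
# Hypothetical whitelist: only English is officially supported.
SUPPORTED_LANGS = {"en"}


def component_modules(lang, supported=SUPPORTED_LANGS):
    """Return the module paths a loader would import for a given language."""
    if lang not in supported:
        raise ValueError(f"{lang!r} is an unsupported or unknown language")
    # The language code selects the package that holds the rules, so a new
    # language needs its own directory alongside the English one.
    return {
        "merger": f"errant.{lang}.merger",
        "classifier": f"errant.{lang}.classifier",
    }


print(component_modules("en")["merger"])  # errant.en.merger
```

Under this sketch, passing "de" fails at the whitelist check first, and would still need an `errant.de` package even if the check were relaxed.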

You can fudge it and use the English merger/classifier with German spaCy, and most of errant should still work from coarse POS tags, lemmas, etc., but other aspects, like the rules for spelling errors and fine-grained verb errors (tense/SVA/form), are unlikely to work well.
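The workaround above amounts to choosing the spaCy model by language while falling back to the English rules. A small sketch of that selection logic follows. The model names `en_core_web_sm` and `de_core_news_sm` are standard spaCy packages but are an assumption here, since the thread does not say which models were used, and `plan_pipeline` is a hypothetical helper.

```python
# Assumed spaCy model per language; the thread does not name the models.
SPACY_MODELS = {"en": "en_core_web_sm", "de": "de_core_news_sm"}

# Languages that have their own merger/classifier rules.
RULE_LANGS = {"en"}


def plan_pipeline(lang):
    """Pick the spaCy model for `lang` and the language whose rules to use."""
    model = SPACY_MODELS[lang]
    # No German rules exist, so fall back to the English merger/classifier.
    rules = lang if lang in RULE_LANGS else "en"
    return model, rules


print(plan_pipeline("de"))  # ('de_core_news_sm', 'en')
```

With this split, tokenisation, POS tags, and lemmas come from the German model, while error typing still runs the English rules, which is why spelling and fine-grained verb categories may be unreliable.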

@chrisjbryant
Owner

chrisjbryant commented Sep 28, 2024

I just updated the __init__.py file so it should work as you intended now!

However, I don't want to make it an official addition to errant, because the German support is untested. It should certainly suffice for MultiGEC as a custom extension, though!

Let me know how well it works for German!

@chrisjbryant changed the title from "Update parallel_to_m2.py" to "Alpha support for German" on Sep 28, 2024
@cainesap
Author

awesome, thanks Chris, will do!

different model names for English vs German
2 participants