Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Sacrebleu: Add more tokenizers for SacreBLEU metric #2068

Merged
merged 30 commits into from
Oct 9, 2023

Conversation

stancld
Copy link
Contributor

@stancld stancld commented Sep 10, 2023

What does this PR do?

Add more tokenizers for SacreBLEU metric.

Fixes #2054

Before submitting
  • Was this discussed/agreed via a Github issue? (no need for typos and docs improvements)
  • Did you read the contributor guideline, Pull Request section?
  • Did you make sure to update the docs?
  • Did you write any new necessary tests?
PR review

Anyone in the community is free to review the PR once the tests have passed.
If we didn't discuss your PR in Github issues there's a high chance it will not be merged.

Did you have fun?

Make sure you had fun coding 🙃


📚 Documentation preview 📚: https://torchmetrics--2068.org.readthedocs.build/en/2068/

@stancld stancld marked this pull request as ready for review September 10, 2023 13:15
@github-actions github-actions bot added the documentation Improvements or additions to documentation label Sep 10, 2023
@stancld stancld added this to the v1.2.0 milestone Sep 10, 2023
@stancld
Copy link
Contributor Author

stancld commented Sep 10, 2023

Looks like there are some incompatibilities between various python versions and dependencies required for additional tokenizers. Will have a look.

@Borda
Copy link
Member

Borda commented Sep 11, 2023

Will have a look.

lets set it as draft until it is resolved :)

@Borda Borda marked this pull request as draft September 11, 2023 08:13
@SkafteNicki SkafteNicki modified the milestones: v1.2.0, v1.3.0 Sep 15, 2023
@stancld stancld force-pushed the feature/sacrebleu-tokenizers branch 2 times, most recently from b01d247 to 60aae55 Compare September 23, 2023 08:22
@stancld stancld marked this pull request as ready for review September 23, 2023 08:30
@codecov
Copy link

codecov bot commented Sep 23, 2023

Codecov Report

Merging #2068 (47b5030) into master (d8962b4) will increase coverage by 2%.
The diff coverage is 91%.

Additional details and impacted files
@@           Coverage Diff           @@
##           master   #2068    +/-   ##
=======================================
+ Coverage      85%     87%    +2%     
=======================================
  Files         293     293            
  Lines       16370   16428    +58     
=======================================
+ Hits        13863   14284   +421     
+ Misses       2507    2144   -363     

@stancld stancld changed the title Sacrebleu: Add more tokenizers for SacreBLEU metric Draft: Sacrebleu: Add more tokenizers for SacreBLEU metric Sep 23, 2023
@stancld stancld changed the title Draft: Sacrebleu: Add more tokenizers for SacreBLEU metric Sacrebleu: Add more tokenizers for SacreBLEU metric Sep 23, 2023
@stancld stancld marked this pull request as draft September 23, 2023 10:37
@mergify mergify bot added the ready label Oct 2, 2023
@mergify mergify bot added the has conflicts label Oct 3, 2023
@mergify mergify bot removed the has conflicts label Oct 3, 2023
@Borda Borda requested review from Borda and RistoAle97 October 3, 2023 19:49
@Borda
Copy link
Member

Borda commented Oct 3, 2023

@stancld seems also Win is failing...

@mergify mergify bot added the has conflicts label Oct 4, 2023
@Borda Borda enabled auto-merge (squash) October 4, 2023 11:05
@mergify mergify bot removed the has conflicts label Oct 4, 2023
@stancld
Copy link
Contributor Author

stancld commented Oct 8, 2023

@Borda Does anyone of you have access to a Win machine who can try to debug? Honestly, I have no clue what might be wrong and can test it only on Linux or MacOS :/

@Borda
Copy link
Member

Borda commented Oct 8, 2023

@Borda Does anyone of you have access to a Win machine who can try to debug? Honestly, I have no clue what might be wrong and can test it only on Linux or MacOS :/

not me, maybe @SkafteNicki?

@SkafteNicki
Copy link
Member

@Borda Does anyone of you have access to a Win machine who can try to debug? Honestly, I have no clue what might be wrong and can test it only on Linux or MacOS :/

not me, maybe @SkafteNicki?

Yes I have a windows machine, so I will try to debug

@Borda Borda merged commit f283b8f into master Oct 9, 2023
74 checks passed
@Borda Borda deleted the feature/sacrebleu-tokenizers branch October 9, 2023 12:29
matsumotosan added a commit to matsumotosan/metrics that referenced this pull request Oct 14, 2023
Co-authored-by: Jirka Borovec <6035284+Borda@users.noreply.github.com>
Co-authored-by: Nicki Skafte Detlefsen <skaftenicki@gmail.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
documentation Improvements or additions to documentation ready topic: Text
Projects
None yet
Development

Successfully merging this pull request may close these issues.

More tokenizers for the SacreBLEU metric
5 participants