Added TurkishMMLU to LM Evaluation Harness #2283
This looks real good. Thanks!
Hi there! One last thing: could you also add an entry to `lm_eval/tasks/README.md` describing the task in one sentence, as is done for the other entries in that table (mentioning in this sentence that your dataset is not translated from MMLU and not machine-translated!), and note that the language is Turkish?
We want to make sure others can discover this task and your work more easily!
The updated README addresses the issues raised.
Sorry for the delay on this. I fixed up some casing issues and added the documentation. Note also that template YAML stubs should not end in `.yaml`, so that they are not erroneously registered as tasks.
Test failures are unrelated.
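To illustrate the note about template stubs, here is a minimal sketch of an extension-based scan. It is not the harness's actual registration code: the function `discover_task_configs` and the file names `example_task.yaml` and `_template_yaml` are hypothetical, and the only assumption is that task configs are picked up by their `.yaml` suffix.

```python
import tempfile
from pathlib import Path


def discover_task_configs(task_dir: Path) -> list[str]:
    """Return the names of files an extension-based scan would treat as tasks.

    Hypothetical helper: it only assumes that tasks are found by globbing
    for a `.yaml` suffix, which is a simplification for illustration.
    """
    return sorted(p.name for p in task_dir.rglob("*.yaml"))


def demo() -> list[str]:
    with tempfile.TemporaryDirectory() as tmp:
        task_dir = Path(tmp)
        # A real task config: ends in `.yaml`, so the scan picks it up.
        (task_dir / "example_task.yaml").write_text("task: example_task\n")
        # A shared template stub: no `.yaml` suffix, so the scan skips it.
        (task_dir / "_template_yaml").write_text("output_type: multiple_choice\n")
        return discover_task_configs(task_dir)


print(demo())  # ['example_task.yaml']
```

Under that assumption, a stub named with a `.yaml` suffix would show up in the scan alongside the real tasks, which is the erroneous registration the comment warns about.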
* Added TurkishMMLU to LM Evaluation Harness
* Fixed COT name
* Fixed COT name
* Updated Readme
* Fixed Test issues
* Completed Scan for changed tasks
* Updated Readme
* Update README.md
* fixup task naming casing + ensure yaml template stubs aren't registered
---------
Co-authored-by: Hailey Schoelkopf <65563625+haileyschoelkopf@users.noreply.github.com>
Co-authored-by: haileyschoelkopf <hailey@eleuther.ai>
In this pull request, I would like to add our work, TurkishMMLU: Measuring Massive Multitask Language Understanding in Turkish, to the LM Evaluation Harness.
You can find the details of our work in our repository:
https://github.com/ArdaYueksel/TurkishMMLU
Our dataset is also available on Hugging Face: https://huggingface.co/datasets/AYueksel/TurkishMMLU
Key Features: