Test for valid YAML files #308
The documentation is not available anymore as the PR was closed or merged.
Thanks @mathemakitten!
My main suggestion: What do you think about moving the validation code to something like src/evaluate/utils/file_utils.py? This could be just a function that takes a string as input and raises an error if it's not valid YAML. Then we could test that the function works on a few examples, as well as testing all the READMEs. What do you think?
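A minimal sketch of the kind of helper suggested here, assuming PyYAML is installed. The name validate_yaml and the choice of ValueError are hypothetical and not taken from the PR; the function expects the YAML text on its own, not a whole README.

```python
import yaml  # PyYAML; assumed to be installed


def validate_yaml(text: str):
    """Hypothetical helper: return the parsed YAML, or raise ValueError."""
    try:
        # safe_load parses a single YAML document without constructing
        # arbitrary Python objects.
        return yaml.safe_load(text)
    except yaml.YAMLError as err:
        # Re-raise as ValueError so callers need not import yaml themselves.
        raise ValueError(f"Invalid YAML: {err}") from err


# A couple of illustrative examples, as the review comment proposes.
print(validate_yaml("title: Accuracy\ntags:\n- evaluate\n- metric\n"))
try:
    validate_yaml("title: [unclosed")
except ValueError as err:
    print("rejected:", type(err).__name__)
```

Keeping the parsing behind one function would let the test suite check a few hand-written strings as well as every README.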
tests/test_hub.py
        readme_filepaths.extend(glob.glob(glob_path))
    for readme_file in readme_filepaths:
        with open(readme_file) as f_yaml:
            x = yaml.safe_load_all(f_yaml)
Is it not necessary to split off the YAML part of the README?
yaml.safe_load_all treats the README as a series of documents delimited by ---, so calling next() on the generator automatically returns the YAML metadata (the first section).
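To illustrate the point, here is a small sketch assuming PyYAML is installed. The README text is invented for the example and only mimics the metric card layout (YAML metadata between --- markers, then Markdown).

```python
import yaml  # PyYAML; assumed to be installed

# A made-up README in the metric card layout: YAML metadata, then Markdown.
README_TEXT = """\
---
title: Accuracy
tags:
- evaluate
- metric
---
# Metric Card for Accuracy

Plain text body.
"""

# safe_load_all returns a lazy generator with one item per '---' document.
documents = yaml.safe_load_all(README_TEXT)

# next() parses only the first document, which is the metadata block.
metadata = next(documents)
assert isinstance(metadata, dict)
print(metadata)
```

Because the generator is lazy, the Markdown body after the second --- is not parsed unless the generator is advanced again.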
Co-authored-by: Leandro von Werra <lvwerra@users.noreply.github.com>
@lvwerra The bit which actually does the validation is If we wanted to parse out the first bit of the file and pass it around to validate it'd look like this:
However, this isn't any cleaner and also doesn't really generalize, since it depends on loading from the file in a specific way. We could have a general function which wraps
Fair enough, let's leave it like that then! To solve the issue on Windows, maybe you need to load with UTF-8?
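A rough sketch of what loading with UTF-8 could look like, using only the standard library. The helper name read_readme and the temporary file are hypothetical; the relevant part is the explicit encoding argument to open().

```python
import tempfile
from pathlib import Path


def read_readme(path: Path) -> str:
    """Hypothetical helper: read a README as UTF-8 on every platform."""
    # Without encoding=..., open() typically falls back to the locale's
    # preferred encoding, which on Windows is often a legacy code page
    # and can fail on characters such as emoji.
    with open(path, encoding="utf-8") as f:
        return f.read()


with tempfile.TemporaryDirectory() as tmp_dir:
    readme = Path(tmp_dir) / "README.md"
    # Write a tiny README containing a non-ASCII character.
    readme.write_text("---\ntitle: Accuracy 🚀\n---\n", encoding="utf-8")
    text = read_readme(readme)
    print("🚀" in text)
```

Passing the encoding explicitly makes the read behave the same on Linux, macOS, and Windows.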
Awesome, thanks @mathemakitten 🚀
* yaml check fast
* Update tests/test_hub.py
  Co-authored-by: Leandro von Werra <lvwerra@users.noreply.github.com>
* fix merge
* Adding encoding utf8
* code quality
Co-authored-by: Leandro von Werra <lvwerra@users.noreply.github.com>
Closes #296, which hopefully results in fewer broken Spaces. Nothing fancy about this implementation, and it's pretty specific to Hub metric card formats, but it works just fine for what we need. Let me know if you think it should do anything else!