
Docs for EvaluationSuite #340

Merged
merged 21 commits into from
Dec 9, 2022
Conversation

@mathemakitten (Contributor) commented Nov 3, 2022

Adding docs for EvaluationSuite.

@HuggingFaceDocBuilderDev commented Nov 3, 2022

The documentation is not available anymore as the PR was closed or merged.

@mathemakitten mathemakitten force-pushed the hn-evaluation-suite-v2 branch 3 times, most recently from e64cd83 to f8da2a6 Compare November 8, 2022 23:00
@lhoestq lhoestq mentioned this pull request Nov 16, 2022
@lhoestq (Member) left a comment

Cool, thanks! I think you can also mention it in the quick tour :)


```python
{'glue/cola': {'accuracy': 0.0, 'total_time_in_seconds': 0.9766696180449799, 'samples_per_second': 10.238876909079256, 'latency_in_seconds': 0.09766696180449798},
 'glue/sst2': {'accuracy': 0.5, 'total_time_in_seconds': 1.1422595420153812, 'samples_per_second': 8.754577775166744, 'latency_in_seconds': 0.11422595420153811},
 'glue/qqp': {'accuracy': 0.6, 'total_time_in_seconds': 1.3553926559980027, 'samples_per_second': 7.377935800188323, 'latency_in_seconds': 0.13553926559980026},
 'glue/mrpc': {'accuracy': 0.6, 'total_time_in_seconds': 2.021696529001929, 'samples_per_second': 4.946340786832532, 'latency_in_seconds': 0.2021696529001929},
 'glue/mnli': {'accuracy': 0.2, 'total_time_in_seconds': 2.0380110969999805, 'samples_per_second': 4.9067446270142145, 'latency_in_seconds': 0.20380110969999807},
 'glue/qnli': {'accuracy': 0.3, 'total_time_in_seconds': 2.082032073987648, 'samples_per_second': 4.802999975330509, 'latency_in_seconds': 0.20820320739876477},
 'glue/rte': {'accuracy': 0.7, 'total_time_in_seconds': 2.8592985830036923, 'samples_per_second': 3.4973612267855576, 'latency_in_seconds': 0.2859298583003692},
 'glue/wnli': {'accuracy': 0.5, 'total_time_in_seconds': 1.5406486629508436, 'samples_per_second': 6.490772517107661, 'latency_in_seconds': 0.15406486629508437}}
```
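As an aside, the timing fields in that output are internally consistent: `latency_in_seconds` is `total_time_in_seconds` divided by the number of evaluated examples (10 here, judging by the time-to-latency ratio, which is an assumption), and `samples_per_second` is the reciprocal of the latency. A quick stdlib check using the `glue/cola` numbers from the output above:

```python
# Sanity-check the timing metrics reported for glue/cola above.
# Assumes each task ran on 10 examples (inferred from the ratios).
total_time = 0.9766696180449799
samples = 10

latency = total_time / samples
throughput = samples / total_time

assert abs(latency - 0.09766696180449798) < 1e-12
assert abs(throughput - 10.238876909079256) < 1e-9
# latency and throughput are reciprocals of each other
assert abs(latency * throughput - 1.0) < 1e-12
```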
Member:
(nit) Would be nice to show it as a pandas DataFrame for readability

Contributor Author:
Good call, the result is now a list of dicts so it can be easily transformed into a dataframe. I've added that to the example 😄
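Since the results are now a list of dicts, the DataFrame conversion the reviewer asked for is a single call. A minimal sketch with made-up values shaped like the output above (pandas assumed available; field names taken from the sample results):

```python
import pandas as pd

# Hypothetical per-subtask results shaped like the suite output above:
# one dict per task, each with metric and timing fields.
results = [
    {"task_name": "glue/cola", "accuracy": 0.0, "latency_in_seconds": 0.098},
    {"task_name": "glue/sst2", "accuracy": 0.5, "latency_in_seconds": 0.114},
]

# A list of flat dicts maps directly onto DataFrame rows.
df = pd.DataFrame(results)
print(df[["task_name", "accuracy"]])
```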

```python
self.preprocessor = lambda x: {"text": x["text"].lower()}
self.suite = [
    SubTask(
        task_type="text-classification",
```
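For readers skimming the thread, the snippet above comes from a suite definition. The overall shape can be mocked in plain Python; this is an illustrative stand-in with assumed class and field names, not the actual `evaluate` API:

```python
from dataclasses import dataclass, field
from typing import Callable, List, Optional

# Illustrative stand-ins for the suite/subtask classes discussed above --
# names and fields here are assumptions for demonstration only.
@dataclass
class SubTask:
    task_type: str
    data: str = "glue"
    subset: Optional[str] = None
    args_for_task: dict = field(default_factory=dict)

@dataclass
class Suite:
    preprocessor: Callable[[dict], dict]
    suite: List[SubTask]

demo = Suite(
    preprocessor=lambda x: {"text": x["text"].lower()},
    suite=[SubTask(task_type="text-classification", subset="cola")],
)
assert demo.preprocessor({"text": "HELLO"})["text"] == "hello"
```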
Member:
Can you list the available task types, maybe? Or redirect to their docs?

Contributor Author:
I've added a link to the supported tasks on the Evaluator docs so we don't have to maintain the list in two places!

Base automatically changed from hn-evaluation-suite-v2 to main November 16, 2022 15:44
@lvwerra (Member) left a comment
Hi @mathemakitten, this is great, thanks for working on this. I left a few comments, happy to discuss further if you want.

Review threads (resolved) on:
- docs/source/a_quick_tour.mdx
- docs/source/base_evaluator.mdx
- docs/source/evaluation_suite.mdx
mathemakitten and others added 4 commits November 29, 2022 09:08
Co-authored-by: Leandro von Werra <lvwerra@users.noreply.github.com>
Co-authored-by: Leandro von Werra <lvwerra@users.noreply.github.com>
Co-authored-by: Leandro von Werra <lvwerra@users.noreply.github.com>
@lvwerra (Member) left a comment

Just a few minor comments, then we can merge 🚀

Review threads (resolved) on:
- docs/source/a_quick_tour.mdx
- docs/source/evaluation_suite.mdx
```python
>>> suite = EvaluationSuite.load('mathemakitten/glue-evaluation-suite')
>>> results = suite.run("gpt2")
```

| accuracy | total_time_in_seconds | samples_per_second | latency_in_seconds | task_name |
Member:
Would do the same here and remove the table from the codeblock so it's actually rendered as a nice table.

@lvwerra lvwerra merged commit 2814419 into main Dec 9, 2022
@lvwerra lvwerra deleted the hn-docs-evalsuite branch December 9, 2022 16:57

4 participants