Add Danish Discourse dataset #247

Merged (8 commits) on Mar 18, 2024

Conversation

Contributor

@MartinBernstorff MartinBernstorff commented Mar 15, 2024

Awaiting license from @KennethEnevoldsen. Feel free to nitpick; I want to do this right 👍

@MartinBernstorff MartinBernstorff changed the title from "Ddisco" to "Add Danish Discourse dataset" on Mar 15, 2024
Contributor

@KennethEnevoldsen KennethEnevoldsen left a comment

Looks good!

Generally, I would like a performance metric on some models as well (but we already looked over those).

@imenelydiaker adding you as a reviewer as well.

mteb/tasks/Classification/DdiscoCohesionClassification.py (outdated, resolved)
mteb/tasks/Classification/DdiscoCohesionClassification.py (outdated, resolved)
Contributor

@KennethEnevoldsen KennethEnevoldsen left a comment

Actually, I just realized that we also need to update the table in the README with the dataset (I will add that to the guide as well).

MartinBernstorff and others added 2 commits March 15, 2024 15:11
Co-authored-by: Kenneth Enevoldsen <kennethcenevoldsen@gmail.com>
Co-authored-by: Kenneth Enevoldsen <kennethcenevoldsen@gmail.com>
Contributor

@imenelydiaker imenelydiaker left a comment

Made some comments; otherwise everything looks good! Nice job 😄

mteb/tasks/Classification/DdiscoCohesionClassification.py (outdated, resolved)
from datasets import load_dataset
from sentence_transformers import SentenceTransformer

from mteb.abstasks.AbsTaskClassification import AbsTaskClassification

Suggested change
- from mteb.abstasks.AbsTaskClassification import AbsTaskClassification
+ from ...abstasks import AbsTaskClassification

Any reason to prefer this?
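
For context on the question above, here is a minimal sketch of how the two import styles resolve. It builds a throwaway package named `demo_pkg` (a hypothetical name, not the real `mteb` tree) whose layout only mirrors the paths in the suggestion, and it assumes that `abstasks/__init__.py` re-exports the class, which the relative form relies on. It does not claim to show the reviewer's reasoning, only that both forms bind the same object.

```python
import importlib
import sys
import tempfile
from pathlib import Path

# Hypothetical layout for illustration only (NOT the real mteb package):
#   demo_pkg/abstasks/AbsTaskClassification.py   defines the base class
#   demo_pkg/abstasks/__init__.py                re-exports it (assumption)
#   demo_pkg/tasks/Classification/*.py           import it in two styles
FILES = {
    "demo_pkg/__init__.py": "",
    "demo_pkg/abstasks/__init__.py": (
        "from .AbsTaskClassification import AbsTaskClassification\n"
    ),
    "demo_pkg/abstasks/AbsTaskClassification.py": (
        "class AbsTaskClassification:\n    pass\n"
    ),
    "demo_pkg/tasks/__init__.py": "",
    "demo_pkg/tasks/Classification/__init__.py": "",
    # Absolute form: spells out the top-level package name.
    "demo_pkg/tasks/Classification/absolute_style.py": (
        "from demo_pkg.abstasks.AbsTaskClassification "
        "import AbsTaskClassification\n"
    ),
    # Relative form: three dots climb from Classification to demo_pkg.
    "demo_pkg/tasks/Classification/relative_style.py": (
        "from ...abstasks import AbsTaskClassification\n"
    ),
}


def build_and_import():
    """Write the demo package to a temp dir and import both task modules."""
    root = Path(tempfile.mkdtemp())
    for rel_path, source in FILES.items():
        target = root / rel_path
        target.parent.mkdir(parents=True, exist_ok=True)
        target.write_text(source)
    sys.path.insert(0, str(root))
    importlib.invalidate_caches()
    absolute = importlib.import_module(
        "demo_pkg.tasks.Classification.absolute_style"
    )
    relative = importlib.import_module(
        "demo_pkg.tasks.Classification.relative_style"
    )
    return absolute.AbsTaskClassification, relative.AbsTaskClassification


abs_cls, rel_cls = build_and_import()
print(abs_cls is rel_cls)  # True: both styles bind the same class object
```

In this sketch the only practical difference is that the relative form does not hard-code the top-level package name and depends on the `__init__.py` re-export, while the absolute form names the defining module directly.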

mteb/tasks/Classification/DdiscoCohesionClassification.py (outdated, resolved)
mteb/tasks/Classification/DdiscoCohesionClassification.py (outdated, resolved)
MartinBernstorff and others added 3 commits March 15, 2024 17:12
Co-authored-by: Imene Kerboua <33312980+imenelydiaker@users.noreply.github.com>
Co-authored-by: Imene Kerboua <33312980+imenelydiaker@users.noreply.github.com>
Co-authored-by: Imene Kerboua <33312980+imenelydiaker@users.noreply.github.com>
@KennethEnevoldsen KennethEnevoldsen merged commit d46d0f5 into embeddings-benchmark:main Mar 18, 2024
3 checks passed
3 participants