Ensure tasks are neither trivially easy nor impossible #248

Closed
MartinBernstorff opened this issue Mar 15, 2024 · 3 comments

Comments

@MartinBernstorff
Contributor

MartinBernstorff commented Mar 15, 2024

When writing #247, the dataset contained two possible "label" columns:

  • Rating: A rating of the cohesiveness of the comment
  • Domain: Whether the text is from Wikipedia or Reddit

I did not create a task with the domain column as the label, because I expected it to be trivially easy. Ideally, the PR submission process would test for this, e.g. by also requiring the run information to be submitted as .json, so reviewers can see that the task is non-trivial.

On the other end of the spectrum, it's probably also important to ensure that the task is doable. We could do this by running the task with e.g. both

  • intfloat/multilingual-e5-small and
  • sentence-transformers/paraphrase-multilingual-MiniLM-L12-v2,

and ensuring there is a differential between them.
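As a rough sketch of what such a check could look like: the function below takes the main score of each of the two models and flags a task as trivially easy when both scores sit at the ceiling, or as lacking a differential when the scores are nearly identical. The names `check_task_difficulty` and `DifficultyReport`, and the thresholds `ceiling` and `min_gap`, are assumptions made for illustration. They are not part of the repository, and sensible threshold values would depend on the metric.

```python
from dataclasses import dataclass


@dataclass
class DifficultyReport:
    """Outcome of comparing two models' scores on one task (hypothetical helper)."""

    trivially_easy: bool
    no_differential: bool

    @property
    def ok(self) -> bool:
        # The task passes only if neither warning flag is raised.
        return not (self.trivially_easy or self.no_differential)


def check_task_difficulty(
    score_a: float,
    score_b: float,
    ceiling: float = 0.99,  # assumed threshold, not taken from the issue
    min_gap: float = 0.01,  # assumed threshold, not taken from the issue
) -> DifficultyReport:
    """Flag tasks where both models hit the ceiling or score almost the same."""
    trivially_easy = min(score_a, score_b) >= ceiling
    no_differential = abs(score_a - score_b) < min_gap
    return DifficultyReport(trivially_easy, no_differential)
```

A missing differential does not by itself prove that a task is impossible, since two models can simply be equally good, so a check like this would be a prompt for a closer look rather than a hard gate.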

To make this easy to do, I suggest:

  • Writing a wrapper function that users can call, which runs a task with the two models
  • Modifying the evaluation script to:
    • Suffix the task result name with the model, e.g. amazon_counterfactual_classification_e5_small.json
    • Log the location of the result .json to the terminal after running the evaluation, so the user can see where the results are located
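A minimal sketch of how these suggestions could fit together, assuming the actual evaluation is passed in as a callable: `run_with_models`, `result_filename`, and the `evaluate` parameter are hypothetical names, not existing functions in the repository. Note that this sketch suffixes the full model name after the organisation prefix (e.g. `..._multilingual_e5_small.json`), which is slightly longer than the example suffix above.

```python
import json
import re
import tempfile
from pathlib import Path
from typing import Callable, Dict, List, Sequence

# The two models named earlier in this issue.
DEFAULT_MODELS = (
    "intfloat/multilingual-e5-small",
    "sentence-transformers/paraphrase-multilingual-MiniLM-L12-v2",
)


def result_filename(task_name: str, model_name: str) -> str:
    """Build '<task>_<model>.json', dropping the organisation prefix of the model."""
    short_name = model_name.split("/")[-1].lower()
    slug = re.sub(r"[^a-z0-9]+", "_", short_name).strip("_")
    return f"{task_name}_{slug}.json"


def run_with_models(
    task_name: str,
    evaluate: Callable[[str, str], Dict],  # hypothetical: (task, model) -> scores
    output_dir: str,
    models: Sequence[str] = DEFAULT_MODELS,
) -> List[Path]:
    """Run one task with each model, write one result file per model, log each path."""
    out = Path(output_dir)
    out.mkdir(parents=True, exist_ok=True)
    written = []
    for model_name in models:
        scores = evaluate(task_name, model_name)
        path = out / result_filename(task_name, model_name)
        path.write_text(json.dumps(scores, indent=2))
        # Make it apparent to the user where the results ended up.
        print(f"Results for {model_name} written to {path.resolve()}")
        written.append(path)
    return written


if __name__ == "__main__":
    # Stand-in for the real evaluation call; returns placeholder values only.
    def fake_evaluate(task_name: str, model_name: str) -> Dict:
        return {"task": task_name, "model": model_name, "main_score": 0.0}

    with tempfile.TemporaryDirectory() as tmp:
        run_with_models("amazon_counterfactual_classification", fake_evaluate, tmp)
```

Passing the evaluation in as a callable keeps the sketch independent of how the evaluation script is actually invoked, whether through the Python API or the CLI.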

Been a pleasure so far! What do you guys think? 😊

@KennethEnevoldsen
Contributor

@imenelydiaker, I would like your thoughts on this. I generally agree with @MartinBernstorff about at least providing the results, though I would probably just use the CLI for it.

@imenelydiaker
Contributor

I answered this here: #254. It's a really great idea! 🤩

@KennethEnevoldsen
Contributor

This has been added in #275
