When writing #247, the dataset contained two possible "label" columns:

- Rating: a rating of the cohesiveness of the comment
- Domain: whether the text is from Wikipedia or Reddit

I did not make a task with the Domain column as the label, because I imagined it would be trivially easy. Ideally, perhaps the PR submission process should test for this? E.g. by also submitting the run information as `.json`, to check that the task is non-trivial?
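As a rough illustration of what such a check could look like, here is a minimal sketch, assuming a hypothetical run-information schema with a `main_score` field (the real format would be whatever the evaluation run actually writes out):

```python
import json
from pathlib import Path

# Hypothetical schema for the submitted run-information JSON;
# the real format would depend on how the run results are serialized.
EXAMPLE_RUN = {
    "task": "WikipediaRedditDomainClassification",
    "main_score": 0.998,  # accuracy of the embedding model on the task
}

def is_trivial(run: dict, margin: float = 0.05) -> bool:
    """Flag a task as trivial when the model scores within `margin`
    of a perfect score, i.e. there is almost nothing left to learn."""
    return run["main_score"] >= 1.0 - margin

def check_run_file(path: Path) -> bool:
    """Load a submitted run-information JSON and apply the check."""
    run = json.loads(path.read_text())
    return is_trivial(run)

print(is_trivial(EXAMPLE_RUN))  # a 99.8% score suggests a trivial task
```

The threshold is of course a judgment call; the point is only that the check is mechanical once the run information is part of the submission.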
On the other end of the spectrum, it's probably also important to ensure that the task is doable. We could do this by running the task with e.g. both `intfloat/multilingual-e5-small` and `sentence-transformers/paraphrase-multilingual-MiniLM-L12-v2`, and ensuring there is a differential between them.

To facilitate making this easy to do, I suggest submitting the run information as e.g. `amazon_counterfactual_classification_e5_small.json`.

Been a pleasure so far! What do you guys think? 😊

@imenelydiaker, would like your thoughts on this. I generally agree with @MartinBernstorff on at least providing the results, though I would probably just use the CLI for it.
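The differential check between two models could be sketched along the same lines. This is a minimal illustration, assuming hypothetical per-model run files with a `main_score` field (the field names and the 0.01 threshold are placeholders, not an agreed format):

```python
# Hypothetical run-information dicts, e.g. loaded from files named
# "<task>_<model>.json" as suggested above.
def score_differential(run_a: dict, run_b: dict) -> float:
    """Absolute difference in main scores between two models' runs."""
    return abs(run_a["main_score"] - run_b["main_score"])

def task_is_doable(run_a: dict, run_b: dict, min_diff: float = 0.01) -> bool:
    """A crude 'doability' check: if two models of different strength
    cannot be separated by at least `min_diff`, the task may be
    noise-bound rather than measuring embedding quality."""
    return score_differential(run_a, run_b) >= min_diff

e5 = {"model": "intfloat/multilingual-e5-small", "main_score": 0.81}
minilm = {
    "model": "sentence-transformers/paraphrase-multilingual-MiniLM-L12-v2",
    "main_score": 0.74,
}
print(task_is_doable(e5, minilm))  # 0.07 differential -> True
```

The two scores here are made up for illustration; in practice they would come from actually running both models on the proposed task.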