docs: document schema for datasets in tasks #250

MartinBernstorff · 2024-03-15T16:21:38Z

When I added #247, it was not obvious to me which columns a dataset should contain.

I propose documenting this at the AbsTask level, e.g. for classification something like:

class AbsTaskClassification(AbsTask):
    """
    Abstract class for kNN classification tasks.
    The similarity is computed between pairs and the results are ranked.

    Dataset must be a Hugging Face dataset split into train/test, and contain the following columns:
        text: str
        label: int
    """

imenelydiaker · 2024-03-15T16:23:54Z

I completely agree with you! We ran into the same issue when creating the benchmark for French. We should add a docstring like this one for each task type maybe 🤔

MartinBernstorff · 2024-03-15T16:28:00Z

I'd gladly take on part of this btw 👍

KennethEnevoldsen · 2024-03-17T17:00:56Z

Perfect, @MartinBernstorff, I will assign it to you. Feel free to add me as the reviewer.

imenelydiaker assigned KennethEnevoldsen and Muennighoff Mar 15, 2024

KennethEnevoldsen closed this as completed Mar 17, 2024

KennethEnevoldsen reopened this Mar 17, 2024

KennethEnevoldsen assigned MartinBernstorff and unassigned Muennighoff Mar 17, 2024

MartinBernstorff mentioned this issue Mar 18, 2024

docs: add dataset schemas #255

Merged

KennethEnevoldsen closed this as completed in #255 Mar 18, 2024
