Skip to content

Latest commit

 

History

History

task-oriented-dialogues

Task Oriented Dialogues

Below is a general format for task oriented dialogues:

{
    "dataset_name--train/val/test--dialog_id": {
        "original dialog id": str,
        "dialog index": int,
        "original dialog info": dict,
        "log": [
            {
                "turn id": int,
                "user utterance": str,
                "system response": str,
                "dialog history": str,
                "original user side information": dict,
                "original system side information": dict,
                "dst": str,
                "dst accumulated": str
            },
         	...
        ],
        "external knowledge non-flat": {
            "metadata": dict,
            "slots and values": dict
            "intents": dict,
            ...
        },
        "external knowledge": str,
        "intent knowledge": str,
        "prompt": [
            "This is a bot helping users to get navigation. Given the dialog context and external database, please generate a relevant system response for the user.",
            ...
        ]
    },
    ...
}

In general, datasets have the "external knowledge non-flat" and "external knowledge" in the whole dialogue level. There are also some datasets where every turn in "log" has own "external knowledge non-flat" and "external knowledge".

Here are datasets with turn-level "external knowledge":

'SimJointGEN', 'BiTOD', 'OpenDialKG', 'SimJointMovie', 'MS-DC', 'STAR', 'SimJointRestaurant', 'Taskmaster1', 'Taskmaster2', 'Taskmaster3'

And below is a general format for such datasets:

{
    "dataset_name--train/val/test--dialog_id": {
        "original dialog id": str,
        "dialog index": int,
        "original dialog info": dict,
        "log": [
            {
                "turn id": int,
                "user utterance": str,
                "system response": str,
                "dialog history": str,
                "original user side information": dict,
                "original system side information": dict,
                "dst": str,
                "dst accumulated": str
                "external knowledge non-flat": list,
                "external knowledge": str,
            },
         	...
        ]
        "prompt": [
            "This is a bot helping users to get navigation. Given the dialog context and external database, please generate a relevant system response for the user.",
            ...
        ]
    },
    ...
}

Please refer to each dataset folder for more details.