Please organize the datasets as follows, otherwise you may need to revise the write_*.py
files to meet your dataset path and files.
root
├── images
│ ├── 00000005.jpeg
│ ├── 00000008.jpeg
│ └── ...
├── labels
│ ├── 00000005.json
│ ├── 00000008.json
│ └── ...
└── split.json
root
├── images
│ ├── train
│ │ ├── apple_pie
│ │ │ ├── apple_pie_0.jpg
│ │ │ └── ...
│ │ ├── baby_back_ribs
│ │ │ ├── baby_back_ribs_0.jpg
│ │ │ └── ...
│ │ └── ...
│ ├── test
│ │ ├── apple_pie
│ │ │ ├── apple_pie_0.jpg
│ │ │ └── ...
│ │ ├── baby_back_ribs
│ │ │ ├── baby_back_ribs_0.jpg
│ │ │ └── ...
│ │ └── ...
├── texts
│ ├── train_titles.csv
│ └── test_titles.csv
├── class_idx.json
├── text.json
└── split.json
Update: The datasets we used are from (Kaggle), however, the test.jsonl here does not contain the label information. To make the label available for the evaluation, we download the test_seen.jsonl from (Kaggle2), which has the same test set as the previous one with the label information. You can directly download the test_seen.jsonl file here and use it (of course you should rename it as test.jsonl or revise the write_*.py file).
root
├── img
│ ├── xxxxx.png
│ ├── xxxxx.png
│ └── ...
├── train.jsonl
├── dev.jsonl
└── test.jsonl