Releases: v-dvorak/omr-layout-analysis
Models
Datasets
The final dataset is split into four logical parts:
- AudioLabs v2
- Muscima++
- OSLiC
- MZKBlank
Due to GitHub's restrictions on file size, the OSLiC dataset is split into two parts. OSLiC in COCO format keeps the same folder structure as the original dataset.
Quick Start
To train a YOLO model on the datasets, download all archives that are not tagged with COCO and combine their contents into one dataset. When setting up the training, pass the config.yaml file as an argument to the script.
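The "combine them into one" step can be sketched as follows. This is a minimal sketch, not the project's own tooling: the function name `merge_yolo_archives` is hypothetical, and it assumes each archive has already been extracted and contains `images/` and `labels/` folders at its top level, as the YOLO layout in this release describes. It also assumes file names are unique across archives and stops with an error if they are not.

```python
import shutil
from pathlib import Path


def merge_yolo_archives(extracted_dirs, target_dir):
    """Copy images/ and labels/ from each extracted archive into target_dir.

    Hypothetical helper; returns the number of files copied.
    """
    target = Path(target_dir)
    copied = 0
    for source in map(Path, extracted_dirs):
        for subdir in ("images", "labels"):
            source_sub = source / subdir
            if not source_sub.is_dir():
                continue  # this archive has no such folder, nothing to copy
            destination = target / subdir
            destination.mkdir(parents=True, exist_ok=True)
            for file in sorted(source_sub.iterdir()):
                if not file.is_file():
                    continue
                if (destination / file.name).exists():
                    # Assumption: names are unique across archives.
                    raise FileExistsError(f"name clash: {file.name}")
                shutil.copy2(file, destination / file.name)
                copied += 1
    return copied
```

Whether config.yaml expects exactly this merged layout is not stated here, so check the paths in that file before training.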
Dataset Overview
dataset | images | system measures | stave measures | staves | systems | grand staves |
---|---|---|---|---|---|---|
AudioLabs v2 | 940 | 24 186 | 50 064 | 11 143 | 5 376 | 5 375 |
Muscima++ | 140 | 2 888 | 4 616 | 883 | 484 | 94 |
OSLiC | 4 927 | 72 028 | 220 868 | 55 038 | 17 991 | 17 959 |
MZKBlank | 1 006 | 0 | 0 | 0 | 0 | 0 |
total | 7 013 | 99 102 | 275 548 | 67 064 | 23 851 | 23 428 |
COCO format
zip/
img/ ... all images
json/ ... corresponding labels in COCO format
{
"width": 3483,
"height": 1693,
"system_measures": [
{
"left": 211,
"top": 726,
"width": 701,
"height": 120
},
...
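A label file shaped like the excerpt above could be read roughly like this. It is a sketch under assumptions: the helper name `load_boxes` is hypothetical, only the keys visible in the excerpt (`width`, `height`, `system_measures`, and the per-box `left`, `top`, `width`, `height`) are used, and the sample string is the excerpt closed by hand after its first box, since the original is truncated.

```python
import json


def load_boxes(label_text, key="system_measures"):
    """Return (page_size, boxes) parsed from one label document.

    Each box is returned as (left, top, right, bottom) in pixels.
    """
    data = json.loads(label_text)
    page_size = (data["width"], data["height"])
    boxes = [
        (b["left"], b["top"], b["left"] + b["width"], b["top"] + b["height"])
        for b in data.get(key, [])  # a missing key yields an empty list
    ]
    return page_size, boxes


# The excerpt above, closed by hand after the first box.
sample = """
{
  "width": 3483,
  "height": 1693,
  "system_measures": [
    {"left": 211, "top": 726, "width": 701, "height": 120}
  ]
}
"""

page_size, boxes = load_boxes(sample)
print(page_size, boxes)  # (3483, 1693) [(211, 726, 912, 846)]
```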
YOLO format
zip/
images/ ... all images
labels/ ... corresponding labels in YOLO format
Each *.txt file contains one row per object, written as class x_center y_center width height. Box coordinates must be in normalized xywh format (from 0 to 1, relative to the image width and height).
0 0.163365 0.429003 0.205570 0.090634
0 0.328309 0.429003 0.112834 0.090634
0 0.462245 0.429003 0.138961 0.090634
0 0.598048 0.429003 0.124605 0.090634
0 0.741746 0.429003 0.150158 0.090634
0 0.889176 0.429003 0.136090 0.090634
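The relationship between a pixel box (left, top, width, height) and a normalized YOLO row can be sketched as below. Both function names are hypothetical, and the page size has to come from the image itself because a YOLO label file does not store it. The meaning of class id 0 is not stated here, so the code treats it as an opaque integer.

```python
def pixel_box_to_yolo_row(class_id, left, top, width, height,
                          page_width, page_height):
    """Format a pixel box as 'class x_center y_center width height'."""
    x_center = (left + width / 2) / page_width
    y_center = (top + height / 2) / page_height
    return (f"{class_id} {x_center:.6f} {y_center:.6f} "
            f"{width / page_width:.6f} {height / page_height:.6f}")


def yolo_row_to_pixel_box(row, page_width, page_height):
    """Turn one YOLO row back into (class_id, left, top, width, height)."""
    class_id, x_center, y_center, width, height = row.split()
    box_width = float(width) * page_width
    box_height = float(height) * page_height
    left = float(x_center) * page_width - box_width / 2
    top = float(y_center) * page_height - box_height / 2
    return int(class_id), left, top, box_width, box_height


# Example with a 701 x 120 pixel box at (211, 726) on a 3483 x 1693 page.
row = pixel_box_to_yolo_row(0, 211, 726, 701, 120, 3483, 1693)
print(row)
print(yolo_row_to_pixel_box(row, 3483, 1693))
```

Because rows are written with six decimals, the round trip recovers pixel values only approximately; round them if integer coordinates are needed.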