Releases: v-dvorak/omr-layout-analysis

Models

28 Aug 07:11

Each model is released together with the set of arguments that were used to train it. The latest model is trained on the latest dataset.

Datasets

28 Aug 07:06

The final dataset is split into four logical parts:

  • AudioLabs v2
  • Muscima++
  • OSLiC
  • MZKBlank

Due to GitHub's restrictions on file size, the OSLiC dataset is split into two parts. OSLiC in COCO format keeps the same folder structure as the original dataset.

Quick Start

To train a YOLO model on the datasets, download all archives that are not tagged with COCO and combine them into a single dataset. When setting up the training, pass the config.yaml file as an argument to the training script.
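The "combine them into one" step can be sketched as follows. This is only a minimal illustration, assuming each YOLO archive contains `images/` and `labels/` folders as described under "YOLO format" and that file names are unique across archives. The function name `merge_yolo_archives` and the flat output layout are hypothetical, not part of the release. Files outside `images/` and `labels/` (such as config.yaml) are not touched.

```python
import zipfile
from pathlib import Path, PurePosixPath


def merge_yolo_archives(archives, out_dir):
    """Extract several YOLO-format zips into one images/ + labels/ tree.

    Hypothetical helper: assumes unique file names across all archives.
    Returns the number of files written.
    """
    out_dir = Path(out_dir)
    written = 0
    for archive in archives:
        with zipfile.ZipFile(archive) as zf:
            for info in zf.infolist():
                if info.is_dir():
                    continue
                parts = PurePosixPath(info.filename).parts
                # Keep only files that sit somewhere under images/ or labels/.
                sub = next((p for p in parts[:-1] if p in ("images", "labels")), None)
                if sub is None:
                    continue
                target = out_dir / sub / parts[-1]
                if target.exists():
                    # Refuse to overwrite silently if two archives clash.
                    raise FileExistsError(f"name clash: {target}")
                target.parent.mkdir(parents=True, exist_ok=True)
                target.write_bytes(zf.read(info))
                written += 1
    return written
```

Calling `merge_yolo_archives(["part1.zip", "part2.zip"], "dataset")` (archive names are placeholders) would leave a single `dataset/images` and `dataset/labels` pair that a config.yaml can point to.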

Dataset Overview

              images  system measures  stave measures  staves  systems  grand staves
AudioLabs v2     940           24 186          50 064  11 143    5 376         5 375
Muscima++        140            2 888           4 616     883      484            94
OSLiC          4 927           72 028         220 868  55 038   17 991        17 959
MZKBlank       1 006                0               0       0        0             0
total          7 013           99 102         275 548  67 064   23 851        23 428
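As a quick sanity check, the "total" row can be recomputed from the four per-dataset rows. The snippet below is only a sketch: it copies the counts from the overview by hand, and the names `COUNTS`, `TOTAL`, and `column_sums` are illustrative.

```python
# Column order: images, system measures, stave measures,
# staves, systems, grand staves (copied from the overview).
COUNTS = {
    "AudioLabs v2": [940, 24186, 50064, 11143, 5376, 5375],
    "Muscima++": [140, 2888, 4616, 883, 484, 94],
    "OSLiC": [4927, 72028, 220868, 55038, 17991, 17959],
    "MZKBlank": [1006, 0, 0, 0, 0, 0],
}
TOTAL = [7013, 99102, 275548, 67064, 23851, 23428]


def column_sums(counts):
    """Sum each column across all datasets."""
    return [sum(column) for column in zip(*counts.values())]


# Raises AssertionError if the published totals and the rows disagree.
assert column_sums(COUNTS) == TOTAL
```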

COCO format

zip/
    img/     ... all images
    json/    ... corresponding labels in COCO format

Example label file (truncated):
{
 "width": 3483,
 "height": 1693,
 "system_measures": [
  {
   "left": 211,
   "top": 726,
   "width": 701,
   "height": 120
  },
...
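To relate the two label formats, the sketch below converts boxes from a JSON label file of the shape shown above (`left`, `top`, `width`, `height` in pixels, plus the page `width` and `height`) into normalized YOLO rows. The function name `boxes_to_yolo` is hypothetical, and the class id for each label key is an assumption: the release notes do not state which id belongs to which label.

```python
def boxes_to_yolo(page, key="system_measures", class_id=0):
    """Convert pixel boxes under `key` into YOLO rows.

    `class_id` is an assumed mapping; check the dataset's config.yaml
    for the real class ids.
    """
    img_w, img_h = page["width"], page["height"]
    rows = []
    for box in page.get(key, []):
        # YOLO stores the box centre, not the top-left corner.
        x_center = (box["left"] + box["width"] / 2) / img_w
        y_center = (box["top"] + box["height"] / 2) / img_h
        width = box["width"] / img_w
        height = box["height"] / img_h
        rows.append(
            f"{class_id}\t{x_center:.6f}\t{y_center:.6f}\t{width:.6f}\t{height:.6f}"
        )
    return rows


# The single box visible in the truncated example above.
page = {
    "width": 3483,
    "height": 1693,
    "system_measures": [
        {"left": 211, "top": 726, "width": 701, "height": 120},
    ],
}
for row in boxes_to_yolo(page):
    print(row)
```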

YOLO format

zip/
    images/    ... all images
    labels/      ... corresponding labels in YOLO format

Each *.txt label file contains one row per object, in class x_center y_center width height format. Box coordinates must be in normalized xywh format (from 0 to 1), that is, divided by the image width and height.

0	0.163365	0.429003	0.205570	0.090634
0	0.328309	0.429003	0.112834	0.090634
0	0.462245	0.429003	0.138961	0.090634
0	0.598048	0.429003	0.124605	0.090634
0	0.741746	0.429003	0.150158	0.090634
0	0.889176	0.429003	0.136090	0.090634
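Reading such a label file back into pixel coordinates is the reverse operation. The following is a minimal sketch, not the project's own loader: `parse_yolo_labels` is a hypothetical name, and the image size must be supplied by the caller because the label file does not store it.

```python
def parse_yolo_labels(text, img_w, img_h):
    """Parse YOLO label rows into pixel boxes (left, top, width, height)."""
    boxes = []
    for line in text.splitlines():
        fields = line.split()  # handles tabs and spaces alike
        if not fields:
            continue  # skip blank lines
        class_id = int(fields[0])
        x_center, y_center, width, height = map(float, fields[1:5])
        boxes.append({
            "class": class_id,
            # Shift from box centre back to the top-left corner.
            "left": (x_center - width / 2) * img_w,
            "top": (y_center - height / 2) * img_h,
            "width": width * img_w,
            "height": height * img_h,
        })
    return boxes


# First two rows of the example above; the 1000x500 image size is made up.
sample = "0\t0.163365\t0.429003\t0.205570\t0.090634\n0\t0.328309\t0.429003\t0.112834\t0.090634\n"
for box in parse_yolo_labels(sample, img_w=1000, img_h=500):
    print(box)
```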