I have been testing small-object detection on music scores with Detectron2. The results are pretty good, but not good enough for very small objects such as note heads or stems. Is there a way to improve this, or do I need a completely different approach (maybe a different model from the Detectron2 model zoo)?
I am using the largest music score dataset available online (DeepScores V2), so I have a solid dataset of over 100,000 images. My current goal is detecting note stems, and as the example below shows, the model already gives good results after about 8,000 training iterations:
But that is not enough: I would like to detect almost all the stems on that score. Unfortunately, I see no improvement beyond that point; the total loss starts oscillating after around 8,000 iterations and never comes back down.
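Part of the difficulty may be inherent to anchor matching: a stem annotation is only a few pixels wide, while the smallest anchor in the COCO FPN configs is 32x32 px, so even a perfectly centered anchor overlaps a stem poorly. A quick sketch to quantify this (the 4x40 px stem box is an illustrative assumption, not a measured value):

```python
def iou(a, b):
    """Intersection-over-union of two boxes given as (x1, y1, x2, y2)."""
    ix = max(0, min(a[2], b[2]) - max(a[0], b[0]))
    iy = max(0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = ix * iy
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    return inter / (area_a + area_b - inter)

# A hypothetical 4x40 px stem, and a 32x32 anchor centered on it:
stem = (14, 0, 18, 40)
anchor = (0, 4, 32, 36)
print(round(iou(stem, anchor), 3))  # ~0.121, far below the usual 0.5 positive threshold
```

If that is representative, most RPN anchors sitting on a stem get labeled background during training, which would explain why recall saturates.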
Here is what I have tried:
I have tried different starting models for music score detection, and the best one I found is "faster_rcnn_X_101_32x8d_FPN_3x.yaml".
I have tried different ROI batch sizes (16, 32, 64, 128), and for note stems a ROI_HEADS.BATCH_SIZE_PER_IMAGE of 64 works best. Beyond that, I do not know what else to do to improve the detection of small objects on the score.
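For context, the FPN base config ties one anchor size to each pyramid level (32, 64, 128, 256, 512 px by default), so nothing in the model is sized for objects only a few pixels wide. Two settings that seem relevant and that I have not validated yet (the values below are untested guesses, not a recipe):

```python
# Smaller anchors per FPN level (the default is [[32], [64], [128], [256], [512]])
cfg.MODEL.ANCHOR_GENERATOR.SIZES = [[16], [32], [64], [128], [256]]
# Train and test at a higher resolution so stems keep more pixels after resizing
cfg.INPUT.MIN_SIZE_TRAIN = (1200,)
cfg.INPUT.MIN_SIZE_TEST = 1200
```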
Here is the simple Python program I have set up for training:
############################
import detectron2
from detectron2.utils.logger import setup_logger
from detectron2.engine import DefaultTrainer
from detectron2.config import get_cfg
from detectron2.data import MetadataCatalog, DatasetCatalog
from detectron2 import model_zoo
from detectron2.data.datasets import register_coco_instances
import os
setup_logger()
cfg = get_cfg()
cfg.merge_from_file(model_zoo.get_config_file("COCO-Detection/faster_rcnn_X_101_32x8d_FPN_3x.yaml"))
cfg.DATASETS.TEST = ()
cfg.DATALOADER.NUM_WORKERS = 2
cfg.MODEL.WEIGHTS = model_zoo.get_checkpoint_url("COCO-Detection/faster_rcnn_X_101_32x8d_FPN_3x.yaml")
cfg.SOLVER.IMS_PER_BATCH = 2
cfg.SOLVER.BASE_LR = 0.001
cfg.SOLVER.MAX_ITER = 10000
cfg.SOLVER.CHECKPOINT_PERIOD = 2000
cfg.MODEL.ROI_HEADS.BATCH_SIZE_PER_IMAGE = 64
cfg.MODEL.ROI_HEADS.NUM_CLASSES = 2
json_path = 'data/json'
images_path = 'data/images_selected'
# Collect all COCO-format annotation files and register each one as a dataset
json_files = []
coco_names = []
cc = 0
for root, dirs, files in os.walk(json_path):
    for file in files:
        if file.endswith(".json"):
            fileHr = os.path.join(root, file)
            json_files.append(fileHr)
            coco_names.append("coco_train_" + str(cc))
            register_coco_instances("coco_train_" + str(cc), {}, fileHr, images_path)
            cc += 1
cfg.DATASETS.TRAIN = tuple(coco_names)
# Create the output directory and build the trainer from the configuration
os.makedirs(cfg.OUTPUT_DIR, exist_ok=True)
trainer = DefaultTrainer(cfg)
# Train the model
trainer.resume_or_load(resume=False)
trainer.train()
# Save the configuration for later use
with open("mycfgALL.yaml", "w") as f:
f.write(cfg.dump())
############################
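As a side note, the JSON-collection loop in the script can be written more compactly (and deterministically, thanks to sorting) with pathlib; this is an equivalent sketch under the same directory layout:

```python
from pathlib import Path

def collect_json_files(json_dir):
    """Return the sorted paths of all .json files under json_dir, recursively."""
    return sorted(str(p) for p in Path(json_dir).rglob("*.json"))

# In the training script this would replace the os.walk loop:
# json_files = collect_json_files("data/json")
# coco_names = ["coco_train_" + str(i) for i in range(len(json_files))]
```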
Do you have any ideas I could try? Or would you suggest a different approach?
I look forward to hearing from you.
Thank you in advance.