[Github Repo Link](https://github.com/ZanQraaa21/object-detection.git)
This repository provides a step-by-step guide for preparing and training a custom object detection model using popular deep learning frameworks such as TensorFlow, PyTorch, or YOLO. It also covers text recognition modelling and PET value extraction.
The repository structure is listed below to give a clear overview of its contents, what each component does, and how the components relate to each other.
object-detection
├── EasyOCR
├── Yolov5_Deepsort                      # submodule Yolov5_DeepSort
├── custom_textRecog_model               # submodule deep-text-recognition-benchmark
├── Yolov5                               # submodule yolov5
├── yolov8Ndeepsort                      # submodule yolov8Ndeepsort
├── JSON2YOLO                            # submodule JSON2YOLO
├── datasets
│   ├── custom_traffic_sign
│   │   ├── images
│   │   │   ├── train                    # original images
│   │   │   ├── val
│   │   │   └── test
│   │   ├── labels
│   │   │   ├── train                    # YOLO format (class, x_center, y_center, width, height), normalized
│   │   │   ├── val
│   │   │   └── test
│   │   ├── train.txt                    # list of image paths in the training set
│   │   ├── val.txt
│   │   └── test.txt
│   ├── pretrained_models                # YOLOv5 pre-trained models
│   ├── custom_traffic_sign.yaml         # dataset configuration in the fixed format expected by YOLO
│   └── ...
├── results
│   ├── train                            # can be loaded for detection
│   │   ├── yolov5l_results_custom_ts
│   │   │   ├── weights
│   │   │   │   ├── best.pt
│   │   │   │   └── last.pt
│   │   │   ├── labels_correlogram.jpg   # key statistics
│   │   │   ├── F1_curve.png
│   │   │   └── ...
│   │   ├── yolov5m_results_road_coco
│   │   └── ...
│   ├── detect
│   ├── track
│   └── text_detection
├── scripts
│   ├── deep_text_classification_model
│   └── PET_Extraction
├── custom_dataset_preparation.ipynb     # Custom dataset preparation
├── object_detection_impl.ipynb          # Object detection modelling
├── sign_detectionNrecog_model.ipynb     # Sign detection and text recognition
├── train_custom_ocr.ipynb               # Text recognition modelling
├── PET_eval_anomaly_detect.ipynb        # PET extraction program
├── visualize.ipynb                      # Visualizing ground truth from the training set
└── git_impl.ipynb
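The `train.txt`, `val.txt`, and `test.txt` files in the tree above list image paths, one per line. A minimal sketch of how such a list could be generated is shown here. The helper name `write_image_list` and the set of accepted file extensions are assumptions made for illustration, not code from this repository.

```python
from pathlib import Path

# Assumed set of image extensions; adjust to match the dataset.
IMAGE_SUFFIXES = {".jpg", ".jpeg", ".png"}


def write_image_list(image_dir, list_file):
    """Write one image path per line and return the number of paths written."""
    paths = sorted(
        p for p in Path(image_dir).iterdir()
        if p.suffix.lower() in IMAGE_SUFFIXES
    )
    Path(list_file).write_text("".join(f"{p.as_posix()}\n" for p in paths))
    return len(paths)


if __name__ == "__main__":
    # Paths follow the layout shown in the tree; nothing happens if the folder is absent.
    image_dir = Path("datasets/custom_traffic_sign/images/train")
    if image_dir.is_dir():
        write_image_list(image_dir, "datasets/custom_traffic_sign/train.txt")
```

The same call can be repeated for the `val` and `test` folders.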
- A computer with a GPU is highly recommended to run the training process efficiently.
- Familiarity with deep learning and computer vision concepts is assumed.
- Install the required dependencies and libraries, such as TensorFlow, PyTorch, OpenCV, NumPy, etc.
- Create a ClearML account, which can be used to monitor model training.
# Set credentials before connecting to ClearML
# clearml-agent init
%env CLEARML_WEB_HOST=https://app.clear.ml
%env CLEARML_API_HOST=https://api.clear.ml
%env CLEARML_FILES_HOST=https://files.clear.ml
%env CLEARML_API_ACCESS_KEY=##########
%env CLEARML_API_SECRET_KEY=##########
from clearml import Task
# Task.set_credentials(host='http://localhost:8008',key='<access_key>', secret='<secret_key>')
# Create a task and start training (replace the placeholders with your own names)
task = Task.init(project_name='<project_name>', task_name='<task_name>')
# task = Task.get_task(task_id='######')
The first step in building a custom object detection model is to collect a labeled dataset of images that contains the objects of interest. This dataset should be annotated to include the object's bounding boxes and class labels.
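As an illustration of the annotation format, YOLO-style label files store one object per line as `class x_center y_center width height`, with all coordinates normalized by the image size. The sketch below converts a pixel-space corner box into such a line. The function name `to_yolo_line` and the example numbers are hypothetical and not taken from this repository.

```python
def to_yolo_line(class_id, xmin, ymin, xmax, ymax, img_w, img_h):
    """Convert a pixel-space corner box into one YOLO label line."""
    x_center = (xmin + xmax) / 2.0 / img_w
    y_center = (ymin + ymax) / 2.0 / img_h
    width = (xmax - xmin) / img_w
    height = (ymax - ymin) / img_h
    return f"{class_id} {x_center:.6f} {y_center:.6f} {width:.6f} {height:.6f}"


# Hypothetical box on a 640x480 image.
line = to_yolo_line(0, 100, 200, 300, 400, img_w=640, img_h=480)
print(line)  # 0 0.312500 0.625000 0.312500 0.416667
```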
Open-source: [Github Repo Link](https://github.com/ynlx/Yolov5_DeepSort.git)
%cd yolov5
%pip install -qr requirements.txt
# detect
# balanced road-object COCO
!python detect.py --weights results/runs/train/yolov5s_results_road_coco_bal/weights/best.pt --conf 0.05 --source /content/object-detection/datasets/test/street_view --name exp_coco_bal
# road-object COCO
!python detect.py --weights results/runs/train/yolov5l_results_road_coco/weights/sweat_best.pt --conf 0.05 --source /content/object-detection/datasets/test/street_view --name exp_coco_bal
#detect and track
%cd Yolov5_DeepSort
!python detect.py --weights results/yolo/train/yolov5l_results_coco_road/sweat_best.pt --conf 0.5 --source ../datasets/test/video4track/trains.mp4 --name exp_coco_train
#image segmentation
!python segment/predict.py --weights runs/train-seg/exp/weights/best.pt --img 640 --conf 0.25 --source ../datasets/test/LX_TS --name exp_seg
Open-source: [Github Repo Link](https://github.com/ultralytics/yolov5)
%cd yolov5
%pip install -qr requirements.txt
#train
!python train.py --batch 32 --epochs 300 --data ../datasets/custom_traffic_sign.yaml --cfg ./models/yolov5l.yaml --weights ../datasets/pretrained_models/yolov5l.pt --name yolov5l_results_custom_ts --cache
#detect and recognize texts
import os
import torch
from easyocr import Reader

# text_cls_on_traffic_sign is this repository's helper; it must be defined or imported beforehand
device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')
image_path = 'datasets/custom_traffic_sign/images/test/'
output_saveto = './results/text_detection/custom_ocr_detection_dbnet'
# Load the custom-trained YOLOv5 detection model
detection_model = torch.hub.load('ultralytics/yolov5',
                                 'custom',
                                 'results/yolo/train/yolov5l_results_road_coco/weights/sweat_best.pt',
                                 force_reload=True).to(device)
detection_model.conf = 0.5  # confidence threshold
text_reader = Reader(['en'],
                     model_storage_directory='EasyOCR/model/',
                     user_network_directory='EasyOCR/user_network/',
                     detect_network='dbnet18',  # 'craft' and 'dbnet18' are supported
                     recog_network='ts_TPS_ResNet_Attn')
# text_reader = easyocr.Reader(['en'], detect_network='dbnet18')
CROP_SAVE = './results/text_detection/crop_signs'  # alternative: 'datasets/custom_traffic_sign/crop'
textnsign = {'filename': [], 'words': [], 'org': []}
for fn in os.listdir(image_path):
    IMAGE_PATH = os.path.join(image_path, fn)
    text_cls_on_traffic_sign(IMAGE_PATH, output_saveto, detection_model, text_reader,
                             crop=True, cropSaveto=CROP_SAVE, textnsign=textnsign)
Open-source: [Github Repo Link](https://github.com/clovaai/deep-text-recognition-benchmark.git)
Open-source: [Github Repo Link](https://github.com/JaidedAI/EasyOCR.git)
%cd EasyOCR
%pip install -qr requirements.txt
When preparing a custom OCR model, follow these steps:
- Before starting, update both EasyOCR and deep-text-recognition-benchmark by cloning the git repositories instead of installing the Python libraries. This makes it easy to modify the scripts, especially when running a custom model on your own dataset.
- Create a custom lmdb dataset by executing the `create_lmdb_dataset.py` script. The output is the `data.mdb` and `lock.mdb` files, which are used as input to the deep-text-recognition model in the next step. Note: ensure that your dataset has the following structure:
    |---images
    |   |---xxx.jpg
    |   |---...
    |---gt.txt  # lists each image path and its ground-truth label, one 'path\tlabel\n' entry per line
- Copy the `user_network` and `model` folders under the EasyOCR folder to create two new folders.
    - `user_network`: The script ending with `.py` is a brief program that lets EasyOCR load a custom deep-text-recognition model. It outlines the specific `Transformation/Feature-extraction/Prediction` functionalities. For reference, see the scripts under the `custom_textRecog_model/modules` folder. The script ending with `.yaml` lists the custom changes made, usually indicating the dataset used to train the deep learning model. These two scripts must be consistent with each other; otherwise, you may encounter errors when running the EasyOCR model.
    - `model`: The `.pth` files store the trained model weights.
- (Optional) Replace the original `train.py`, `dataset.py`, and `test.py` files under the `custom_textRecog_model` folder with the updated versions. The dataloader iterator API can differ slightly between Python and PyTorch versions. To avoid errors, change `dataloader_iter.next()` to `next(dataloader_iter)`.
- Finally, run the training process with the updated `train.py` script. You can check the results by observing the loss curve, the model accuracy, and the intermediate images during training. After training is completed, you should have a new `.pth` file under `custom_textRecog_model/model`, which can be used as the custom model for EasyOCR.
!python train.py --exp_name ts_resnet_attn_adam2 \
    --train_data datasets/custom_traffic_sign/lmdb_dataset/train/ \
    --valid_data datasets/custom_traffic_sign/lmdb_dataset/val/ \
    --select_data data --batch_ratio .19 --num_iter 30000 --manualSeed 10000 \
    --Transformation TPS --FeatureExtraction ResNet --SequenceModeling BiLSTM --Prediction Attn \
    --imgH 200 --imgW 240 --batch_max_length 30 \
    --character 'ABCDEFGHIJKLMNOPQRSTUVWXYZ0123456789' --FT --PAD --data_filtering_off --sensitive
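The `gt.txt` file used when building the lmdb dataset holds one tab-separated `path\tlabel` entry per line. The sketch below shows one way to write such a file. The helper name `write_gt_file` and the sample entries are hypothetical, and the image paths are assumed to be relative to the dataset folder passed to `create_lmdb_dataset.py`.

```python
def write_gt_file(samples, gt_path):
    """Write (image_path, label) pairs as 'path<TAB>label' lines."""
    with open(gt_path, "w", encoding="utf-8") as f:
        for image_path, label in samples:
            f.write(f"{image_path}\t{label}\n")


# Hypothetical samples; replace with the cropped sign images and their text.
samples = [
    ("images/0001.jpg", "STOP"),
    ("images/0002.jpg", "SPEED50"),
]

if __name__ == "__main__":
    write_gt_file(samples, "gt.txt")
```

Labels should only use characters from the set passed via `--character`, since the model cannot predict anything outside it.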
Open-source: [Github Repo Link](https://github.com/MuhammadMoinFaisal/YOLOv8-DeepSORT-Object-Tracking.git)
%cd yolov8Ndeepsort
%pip install -qr requirements.txt
Before executing the code, make sure the following changes are in place:
- Put the `best_v8l.pt` or other `model.cfg` file that you are working with into the folder `./yolov8Ndeepsort/ultralytics/yolo/v8/detect`. This ensures that the correct model is used for object detection.
- Replace the original `predict.py` downloaded from the open-source repository with the modified version. The original file is located at `./yolov8Ndeepsort/ultralytics/yolo/v8/detect`. The modification adds lines that draw dots, which show in the outputs whether an approaching road object is reaching the TT-Box (also known as the danger zone).
- Add one extra line, `TTbox: None`, to the configuration file. This allows the TT-Box to be passed on the command line rather than written into a new program.
!python predict.py model=best_v8L.pt source="LX_3.mp4" TTbox="[464, 460, 688, 499]"
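To illustrate how a `TTbox` string like the one in the command above could be consumed, the sketch below parses it and tests whether a point falls inside the box. This is not the repository's `predict.py` logic. The helper names are hypothetical, and the four values are assumed to mean `(x1, y1, x2, y2)` pixel corners.

```python
import ast


def parse_ttbox(text):
    """Parse a string such as "[464, 460, 688, 499]" into four integers."""
    values = ast.literal_eval(text)
    if len(values) != 4:
        raise ValueError("TTbox needs exactly four values")
    return tuple(int(v) for v in values)


def point_in_box(point, box):
    """Return True when (x, y) lies inside box, assumed to be (x1, y1, x2, y2)."""
    x, y = point
    x1, y1, x2, y2 = box
    return x1 <= x <= x2 and y1 <= y <= y2


box = parse_ttbox("[464, 460, 688, 499]")
print(point_in_box((500, 480), box))  # True
print(point_in_box((100, 100), box))  # False
```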
Once the training process is complete, the final step is to evaluate the model's performance on the testing set. This will give a clear indication of the model's accuracy and help identify any potential issues that need to be addressed.
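Detection metrics such as precision, recall, and mAP are built on the intersection over union (IoU) between predicted and ground-truth boxes. The sketch below computes IoU for two boxes in `(xmin, ymin, xmax, ymax)` form. It is an illustrative helper with hypothetical example boxes, not the evaluation code used by this repository.

```python
def iou(box_a, box_b):
    """Intersection over union of two (xmin, ymin, xmax, ymax) boxes."""
    ix1 = max(box_a[0], box_b[0])
    iy1 = max(box_a[1], box_b[1])
    ix2 = min(box_a[2], box_b[2])
    iy2 = min(box_a[3], box_b[3])
    # Clamp at zero so non-overlapping boxes give an empty intersection.
    inter = max(0, ix2 - ix1) * max(0, iy2 - iy1)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


# Hypothetical prediction and ground-truth boxes.
print(round(iou((0, 0, 10, 10), (5, 5, 15, 15)), 4))  # 0.1429
```

A prediction is typically counted as correct when its IoU with a ground-truth box of the same class exceeds a chosen threshold, commonly 0.5.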
In conclusion, preparing and training a custom object detection model is a complex process that requires a solid understanding of deep learning and computer vision concepts, as well as a data labelling pipeline. However, by following these steps, a high-quality object detection model can be built that can be used for a wide range of applications.