
A demo for OCR on signboard images (ArT dataset) using PaddleOCR's EAST + CRNN, MMOCR's ASTER, and CRAFT.


chunchet-ng/signboard_ocr


Introduction

This repo contains demo code for OCR on signboard images from the ArT dataset using PaddleOCR's EAST + CRNN, MMOCR's ASTER, and CRAFT. The demonstration runs on Google Colab.

Code Structure/Explanation

  1. Setup on Google Colab (or ➡️ click here to launch this demo directly in Google Colab)
  2. HMean Calculation for Text Detection & Spotting Tasks
  3. Accuracy & Normalized Edit Distance Calculation for Text Recognition Task
  4. Data Exploration for the ArT dataset
  5. Text Detection, Recognition & Spotting on the ArT dataset

Note: run the ArT notebook before opening Signboard_OCR for the first time.
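As a rough illustration of the HMean calculation listed in item 2, here is a minimal sketch. It makes several assumptions that may not match the notebook: boxes are axis-aligned `(x1, y1, x2, y2)` tuples (ArT annotations are polygons), a detection counts as a match when its IoU with an unmatched ground-truth box is at least 0.5, and matching is greedy. The names `iou` and `hmean` are hypothetical helpers, not functions from PaddleOCR or MMOCR.

```python
def iou(a, b):
    """IoU of two axis-aligned boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def hmean(gt_boxes, det_boxes, iou_threshold=0.5):
    """Return (precision, recall, hmean) for one image.

    Assumption: greedy one-to-one matching at a fixed IoU threshold.
    """
    matched_gt = set()
    matches = 0
    for det in det_boxes:
        for i, gt in enumerate(gt_boxes):
            if i not in matched_gt and iou(det, gt) >= iou_threshold:
                matched_gt.add(i)  # each ground truth is matched at most once
                matches += 1
                break
    precision = matches / len(det_boxes) if det_boxes else 0.0
    recall = matches / len(gt_boxes) if gt_boxes else 0.0
    total = precision + recall
    h = 2 * precision * recall / total if total > 0 else 0.0
    return precision, recall, h


gt = [(0, 0, 10, 10), (20, 20, 30, 30)]
det = [(0, 0, 10, 10), (50, 50, 60, 60), (21, 21, 30, 30)]
precision, recall, h = hmean(gt, det)
print(f"precision={precision:.3f} recall={recall:.3f} hmean={h:.3f}")
```

In this toy example two of the three detections match, so precision is 2/3, recall is 1, and their harmonic mean is 0.8. For text spotting, a match would typically also require the recognized transcription to agree with the ground truth.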

Takeaways

  1. You will learn how the HMean is calculated for the text detection and text spotting tasks with detailed examples.
  2. You will learn how the accuracy, number of correctly recognized words, and normalized edit distance are calculated for the text recognition task.
  3. You will learn how to use ArT, a commonly used scene text dataset. We compare pre-trained regular and irregular scene text methods using the evaluation metrics above, then analyze the performance gap between them on signboard images.
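The recognition metrics from takeaway 2 can be sketched as follows. This is an illustrative version under stated assumptions, not the notebook's code: strings are compared as-is (no case folding or punctuation stripping), and the normalized edit distance of a pair is its Levenshtein distance divided by the length of the longer string, averaged over all pairs. `levenshtein` and `recognition_metrics` are hypothetical helper names.

```python
def levenshtein(a, b):
    """Edit distance (insert, delete, substitute) between two strings."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        curr = [i]
        for j, cb in enumerate(b, 1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution
        prev = curr
    return prev[-1]


def recognition_metrics(gts, preds):
    """Return (num_correct, accuracy, mean_normalized_edit_distance)."""
    if len(gts) != len(preds):
        raise ValueError("gts and preds must have the same length")
    n = len(gts)
    correct = sum(g == p for g, p in zip(gts, preds))
    total_ned = 0.0
    for g, p in zip(gts, preds):
        longest = max(len(g), len(p))
        if longest:  # two empty strings contribute a distance of 0
            total_ned += levenshtein(g, p) / longest
    accuracy = correct / n if n else 0.0
    mean_ned = total_ned / n if n else 0.0
    return correct, accuracy, mean_ned


gts = ["coffee", "open", "bakery"]
preds = ["coffee", "opem", "baker"]
correct, accuracy, mean_ned = recognition_metrics(gts, preds)
print(f"correct={correct} accuracy={accuracy:.3f} mean NED={mean_ned:.3f}")
```

Here only "coffee" is recognized exactly, so accuracy is 1/3, while the normalized edit distance still gives partial credit to the near misses "opem" and "baker". Some benchmarks report 1 minus this average so that higher is better.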

Credits

  1. ArT dataset
  2. PaddleOCR
  3. MMOCR
  4. CRAFT
  5. Huge thanks to @nwjun and @alex for improving the notebooks and providing valuable feedback on this demo! 💪😇👍
