
A Deep Learning pipeline to recognize mathematical expressions from images

kasim95/OCR_Math_Expressions

Optical Character Recognition for Handwritten Mathematical Expressions

Introduction

The aim of this repo is to recognize handwritten mathematical expressions in images using Deep Learning.

The Project is divided into three tasks:

  • Character Localization

    Localizes each character in the image with a bounding box

  • Character Classification

    Identifies the class of the character inside each bounding box

  • Syntactic Analysis

    Verifies that the collection of symbols predicted by the previous two tasks represents a valid mathematical expression and generates a MathML representation of it.
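The hand-off between the first two tasks and the third can be sketched as follows. This is a minimal illustration, not the repo's actual code: the `Detection` type and `to_symbol_sequence` function are hypothetical names, and the sketch assumes a single-line expression in which reading order is simply left to right.

```python
from typing import List, NamedTuple


class Detection(NamedTuple):
    """Hypothetical container: one localized and classified character."""
    x: int      # left edge of the bounding box, in pixels
    y: int      # top edge of the bounding box, in pixels
    w: int      # box width
    h: int      # box height
    label: str  # class predicted by the character classifier


def to_symbol_sequence(detections: List[Detection]) -> List[str]:
    """Order detections left to right and return their labels.

    Assumption: the expression sits on one line, so sorting by the
    left edge of each box recovers the reading order.
    """
    return [d.label for d in sorted(detections, key=lambda d: d.x)]


# Detections may arrive in any order from the localization stage.
boxes = [
    Detection(120, 10, 30, 40, "+"),
    Detection(10, 12, 30, 40, "Z"),
    Detection(160, 11, 30, 40, "Y"),
    Detection(45, 20, 30, 20, "="),
    Detection(80, 10, 30, 40, "X"),
]
print(to_symbol_sequence(boxes))
```

The resulting list of symbols is the kind of input a syntactic analysis step could then check and convert to MathML.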


Quick Start

  1. Clone the repo and download the saved weights
  2. Run this command to identify the mathematical expression in the image exp0030.png
python3 evaluate.py -m "trained_models/model3.h5" -i "datasets/object_detection/evaluate/exp0030.png"
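To process several images in turn, the command above can be wrapped in a small script. This is a sketch under assumptions: it only reuses the `-m` and `-i` flags shown above, calls `evaluate.py` once per image, and the helper names `build_commands` and `run_all` as well as the second image file name are hypothetical.

```python
import subprocess
from typing import Iterable, List


def build_commands(model_path: str, image_paths: Iterable[str]) -> List[List[str]]:
    """Build one evaluate.py invocation per image (flags taken from the Quick Start)."""
    return [
        ["python3", "evaluate.py", "-m", model_path, "-i", image]
        for image in image_paths
    ]


def run_all(model_path: str, image_paths: Iterable[str]) -> None:
    """Run the commands one after another; stops at the first failure."""
    for cmd in build_commands(model_path, image_paths):
        subprocess.run(cmd, check=True)


commands = build_commands(
    "trained_models/model3.h5",
    [
        "datasets/object_detection/evaluate/exp0030.png",
        "datasets/object_detection/evaluate/exp0031.png",  # hypothetical file name
    ],
)
for cmd in commands:
    print(" ".join(cmd))
```

Calling `run_all(...)` from the repository root would execute the commands; the block above only prints them.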

Example image: characters in the expression located with bounding boxes (Object Detection)

MathML Output:

<math xmlns="http://www.w3.org/1998/Math/MathML">
    <mrow>
        <mi>Z</mi>
        <mo>=</mo>
        <mi>X</mi>
        <mo>+</mo>
        <mi>Y</mi>
    </mrow>
</math>
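Output of this shape can be produced from a flat list of symbols with Python's standard library. The sketch below is illustrative only and is not taken from the repo's syntactical_analysis code: `symbols_to_mathml` is a hypothetical name, and the tag choice (`mo` for operators, `mn` for digits, `mi` for everything else) is an assumption based on common MathML usage.

```python
import xml.etree.ElementTree as ET
from typing import List

# Operators listed as supported in this README.
OPERATORS = {"=", "+", "/", "÷", "*", "×", "%"}


def symbols_to_mathml(symbols: List[str]) -> str:
    """Wrap each symbol in a MathML token element inside a single <mrow>."""
    math = ET.Element("math", {"xmlns": "http://www.w3.org/1998/Math/MathML"})
    mrow = ET.SubElement(math, "mrow")
    for symbol in symbols:
        if symbol in OPERATORS:
            tag = "mo"   # operator
        elif symbol.isdigit():
            tag = "mn"   # number
        else:
            tag = "mi"   # identifier
        ET.SubElement(mrow, tag).text = symbol
    # Compact (unindented) serialization of the element tree.
    return ET.tostring(math, encoding="unicode")


mathml = symbols_to_mathml(["Z", "=", "X", "+", "Y"])
print(mathml)
```

The string is emitted on one line; pretty-printing it would give the indented layout shown above.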

Training

Convolutional Neural Networks and Dense Networks are trained in Classification_task.ipynb

Object Detection Models are trained in their respective submodules


Submodules

  • keras-yolo3 (ocr_math)

    YOLOv3 implementation in Keras
    Used to train the Tiny-YOLOv3 model on the object_detection dataset

  • models (tf112)

    TensorFlow Object Detection API
    Used to train the Faster-RCNN with ResNet-50 model on the object_detection dataset


Project Structure:

Directories

  • datasets/ : Contains datasets for Object Detection and Character Classification
  • plots/ : Contains plots generated by notebooks and scripts
  • Report/ : Contains Project Report
  • processed_data/ : Contains labels and other processed data generated by Dataset_Preprocessing.ipynb
  • syntactical_analysis :
  • trained_models/ : Contains saved model weights for CNNs and ANNs

Notebooks

  • Dataset_Preprocessing : Notebook containing code to combine the screen dataset, merge it with custom images, and generate train-test splits
  • OD_Character_Segmentation.ipynb : Notebook demonstrating Character Localization using Contour Search
  • OD_Faster-RCNN.ipynb : Notebook demonstrating Character Localization using Faster-RCNN with Resnet50 model
  • OD_yolov3.ipynb : Notebook demonstrating Character Localization using Tiny YOLOv3 model
  • Optical_Character_Recognition : Notebook demonstrating the complete Project Pipeline

Python scripts

  • evaluate.py : Python script that evaluates math expressions from a single image or multiple images using the Project Pipeline
  • utils.py : Python file containing helper functions

Issues

At the moment, parser rules are defined for binary operators only. This limits the scope of the Project to expressions built from the supported operators.

Supported operators in mathematical expression:

  • =
  • +
  • /
  • ÷
  • *
  • ×
  • %
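As an illustration of what a binary-operator-only rule implies, the following sketch checks that a symbol sequence alternates operand, operator, operand. It is not the repo's parser: `is_binary_expression` is a hypothetical name, and the sketch assumes every operand is a single symbol.

```python
from typing import List

# Operators listed above as supported.
SUPPORTED_OPERATORS = {"=", "+", "/", "÷", "*", "×", "%"}


def is_binary_expression(symbols: List[str]) -> bool:
    """Return True if symbols alternate operand, operator, operand, ...

    Assumption: each operand is one symbol, so a valid sequence has odd
    length >= 3, operands at even positions and supported operators at
    odd positions.
    """
    if len(symbols) < 3 or len(symbols) % 2 == 0:
        return False
    for index, symbol in enumerate(symbols):
        expects_operator = index % 2 == 1
        if (symbol in SUPPORTED_OPERATORS) != expects_operator:
            return False
    return True


print(is_binary_expression(["Z", "=", "X", "+", "Y"]))  # valid alternation
print(is_binary_expression(["X", "+"]))                 # missing right operand
print(is_binary_expression(["X", "-", "Y"]))            # "-" is not in the list above
```

Under these assumptions, a sequence that uses an operator outside the list, or that starts or ends with an operator, is rejected.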

Note

This Project has been tested in the following environment:

  • Python 3.6.9
  • Tensorflow 1.12.3
  • Keras 2.2.4
  • OpenCV 3.4.2.16
  • Numpy 1.17.4
  • Pandas 0.25.3