Table of contents

General info

An application that visualises how OCR works, using the Tesseract OCR library and its hOCR output format.

Requirements

Technologies

Project is created with:

Features

  • Displaying Tesseract OCR bounding boxes on the image (words, lines, paragraphs)
  • Displaying recognised phrases over the corresponding text on the image
  • Customisation of draw parameters: bounding box stroke color, font size, etc.
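
As a rough illustration of where the bounding boxes come from: hOCR is HTML in which each recognised element carries a class (for example `ocrx_word`, `ocr_line`, `ocr_par`) and a `title` attribute containing `bbox x0 y0 x1 y1`. The sketch below is not taken from this project's code. The helper names `parseBbox`, `getAttribute` and `extractBoxes` are hypothetical, and the regex-based tag scan is a simplification that assumes well-formed Tesseract output.

```javascript
// Hypothetical helper: read the "bbox x0 y0 x1 y1" property from an hOCR
// title attribute and convert it to x/y/width/height.
function parseBbox(title) {
  const match = /bbox\s+(\d+)\s+(\d+)\s+(\d+)\s+(\d+)/.exec(title);
  if (!match) return null;
  const [x0, y0, x1, y1] = match.slice(1).map(Number);
  return { x: x0, y: y0, width: x1 - x0, height: y1 - y0 };
}

// Hypothetical helper: get one attribute value from an opening tag string.
// Handles both single and double quotes.
function getAttribute(tag, name) {
  const match = new RegExp(`\\b${name}\\s*=\\s*(["'])(.*?)\\1`).exec(tag);
  return match ? match[2] : null;
}

// Hypothetical helper: collect the boxes of every element with the given
// hOCR class. Simplification: scans opening tags with a regex instead of
// using a real HTML parser.
function extractBoxes(hocr, cssClass) {
  const boxes = [];
  for (const [tag] of hocr.matchAll(/<[a-zA-Z][^>]*>/g)) {
    const classes = (getAttribute(tag, "class") || "").split(/\s+/);
    if (!classes.includes(cssClass)) continue;
    const box = parseBbox(getAttribute(tag, "title") || "");
    if (box) boxes.push(box);
  }
  return boxes;
}

// Made-up sample markup in the shape Tesseract emits.
const sample = `
<span class='ocr_line' id='line_1_1' title='bbox 36 92 580 122; baseline 0 -5'>
  <span class='ocrx_word' id='word_1_1' title='bbox 36 92 96 116; x_wconf 95'>Hello</span>
  <span class='ocrx_word' id='word_1_2' title='bbox 109 92 200 122; x_wconf 91'>world</span>
</span>`;

const wordBoxes = extractBoxes(sample, "ocrx_word");
const lineBoxes = extractBoxes(sample, "ocr_line");
```

In a browser, each resulting box could then be drawn on a canvas with `ctx.strokeRect(box.x, box.y, box.width, box.height)`, with `ctx.strokeStyle` set to the chosen stroke color.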

Installation

Clone the repository

Install front-end dependencies

yarn install

Compile assets

yarn encore dev

Run composer

composer install

Create database

php bin/console doctrine:database:create

Run migrations

php bin/console doctrine:migrations:migrate

Screenshots

Homepage

Main page

Image with all bounding boxes

Single news page

Part of the image with recognised text

Comments section