Kavya8897/OCR

OCR

OCR (optical character recognition) is a technology that converts handwritten, typed, or scanned text, or text inside images, into machine-readable text. You can use OCR to extract text from any image file containing text, or from any PDF, scanned, printed, or handwritten document, as long as the text is legible. Typical uses include eliminating manual data entry by digitizing printed documents such as passports, invoices, and bank statements, and creating secure access to sensitive information by digitizing ID cards, credit cards, and similar documents.

Comparison of Different OCRs

In this project I compared different OCR engines on 37 screenshots and analysed the results of each one. I used the spaCy library to clean the extracted text and to apply part-of-speech (POS) tagging. You can check the analysis in the Documentation section.
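The cleaning and POS tagging step can be sketched roughly as follows. This is a minimal illustration, not the project's actual code: the function names `normalize_ocr_text` and `pos_tag` are hypothetical, the whitespace cleanup is an assumed example of "cleaning", and the `en_core_web_sm` model is an assumption (it has to be downloaded separately).

```python
import re


def normalize_ocr_text(text: str) -> str:
    """Collapse runs of whitespace (including line breaks) into single spaces."""
    return re.sub(r"\s+", " ", text).strip()


def pos_tag(text: str, model: str = "en_core_web_sm"):
    """Return (token, POS tag) pairs for the cleaned text.

    Assumes spaCy and the named model are installed; the model name is
    an illustrative choice, not something stated by this project.
    """
    import spacy  # imported lazily so the cleaning helper works without spaCy

    nlp = spacy.load(model)
    doc = nlp(normalize_ocr_text(text))
    # Skip punctuation and whitespace tokens, which OCR output often contains.
    return [(tok.text, tok.pos_) for tok in doc if not tok.is_punct and not tok.is_space]


if __name__ == "__main__":
    raw = "  Invoice\n\nNo.   42 "
    print(normalize_ocr_text(raw))
```

Calling `pos_tag(raw)` on the text returned by any OCR engine gives a list of token and tag pairs that can then be compared across engines.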

Extracting text from Images

Reading text from an image: here you will use EasyOCR, Tesseract, NormCap, and Nanonets, all of which perform optical character recognition (OCR), to read the text embedded in images. You will need to understand some of the configuration options that can be applied.
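As a rough sketch of what reading text with two of these engines can look like in Python, the example below wraps EasyOCR and Tesseract (through the `pytesseract` wrapper, which is an assumption, as is the Pillow dependency) behind one small helper. The function names, the `--psm 6` page segmentation setting, and the file name `sample.png` are illustrative choices, not taken from this project. NormCap and Nanonets are left out because they are a desktop capture tool and a hosted service, respectively.

```python
from typing import Callable, Dict


def read_with_easyocr(image_path: str) -> str:
    """Read text with EasyOCR (assumes the easyocr package is installed)."""
    import easyocr  # lazy import: only needed when this engine is used

    reader = easyocr.Reader(["en"], gpu=False)  # English only, CPU mode
    # detail=0 returns just the recognized strings, without boxes or scores.
    return "\n".join(reader.readtext(image_path, detail=0))


def read_with_tesseract(image_path: str, config: str = "--psm 6") -> str:
    """Read text with Tesseract via pytesseract (assumed wrapper).

    "--psm 6" tells Tesseract to treat the image as a single uniform
    block of text; other page segmentation modes may suit other images.
    """
    import pytesseract
    from PIL import Image

    with Image.open(image_path) as img:
        return pytesseract.image_to_string(img, lang="eng", config=config)


def run_engines(image_path: str, engines: Dict[str, Callable[[str], str]]) -> Dict[str, str]:
    """Run every engine on the same image and collect the stripped output."""
    return {name: engine(image_path).strip() for name, engine in engines.items()}


if __name__ == "__main__":
    engines = {"easyocr": read_with_easyocr, "tesseract": read_with_tesseract}
    for name, text in run_engines("sample.png", engines).items():
        print(f"--- {name} ---\n{text}")
```

Keeping each engine behind the same `str -> str` signature makes it easy to add or remove engines when comparing their output on the same set of images.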

For installation instructions and further questions, refer to these links:

🔗 Links

EasyOCR - https://github.com/JaidedAI/EasyOCR

Tesseract - https://github.com/tesseract-ocr/tesseract

NormCap - https://github.com/dynobo/normcap

Nanonets - https://nanonets.com/

spaCy - https://spacy.io/usage

Dataset

I used a manually created dataset containing 37 images.

Documentation

Comparison of OCRs

Comparison of VGG16 & MobileNet