GitHub - pitbuk101/Text-Recognizing-with-Trained-Model: Design & develop a pipeline to Extract data from unstructured documents

Folder/file - Purpose

20230401T120858Z-001 - Data that is used
src - Source code of the assignment
my_solution.ipynb - Test case of my solution, as a Jupyter notebook
Alternate_Solution.ipynb - Alternate solution
requirements.txt - Requirements to run
result.png - Auto-generated output

Objective: Design and develop a pipeline to extract data from unstructured documents

Here I propose two solutions:

my_solution.ipynb : The traditional approach, where the image is first preprocessed with OpenCV and OCR libraries are then used to extract information from it.
Alternate_Solution.ipynb : A pre-trained deep learning model for text detection and recognition, built on PyTorch and OCR, which can be customized to our use case. I selected this because of its robust two-stage ML pipeline: text detection first (localizing words), then text recognition (identifying all the characters in each word), which gives it an edge in extracting information seamlessly.
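
The two approaches can be sketched roughly as follows. This is a minimal illustration, not the code from the notebooks. It assumes pytesseract as the OCR library for the traditional route (the notebooks may use a different one), and the names preprocess_and_ocr, detect_then_recognize, Word, detect and recognize are hypothetical. The detector and recognizer are passed in as plain callables, so any pre-trained PyTorch model could be plugged in.

```python
from dataclasses import dataclass
from typing import Callable, List, Sequence, Tuple

Box = Tuple[int, int, int, int]  # (x, y, width, height) in pixels


def preprocess_and_ocr(image_path: str) -> str:
    """Traditional route: clean the image with OpenCV, then run OCR on it."""
    # Imported lazily so the rest of the sketch works without these packages.
    import cv2          # pip install opencv-python
    import pytesseract  # assumed OCR library; needs the Tesseract binary

    image = cv2.imread(image_path)
    if image is None:
        raise FileNotFoundError(image_path)
    gray = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)
    # Otsu thresholding chooses the binarization threshold automatically.
    _, binary = cv2.threshold(gray, 0, 255, cv2.THRESH_BINARY + cv2.THRESH_OTSU)
    return pytesseract.image_to_string(binary)


@dataclass
class Word:
    box: Box
    text: str


def detect_then_recognize(
    image: object,
    detect: Callable[[object], Sequence[Box]],
    recognize: Callable[[object, Box], str],
) -> List[Word]:
    """Deep learning route: localize words first, then read each one."""
    words = [Word(box, recognize(image, box)) for box in detect(image)]
    # Crude reading order: sort by top edge, then by left edge.
    return sorted(words, key=lambda w: (w.box[1], w.box[0]))
```

Keeping detection and recognition as separate steps means either stage can be swapped or fine-tuned on its own, which is the customization the alternate solution relies on. The reading-order sort here is deliberately simple and would need line grouping for skewed or multi-column documents.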

