Searching A Text or Keyword in The Handwritten Scanned Pile of Images

muthuvenki/Searching-Text-In-Image

Searching-Text-In-Image

Have you ever struggled to find a text or keyword in a pile of handwritten, scanned documents? Were you frustrated at doing it manually, or did you curse yourself for taking on such a job? You may even have cursed technology for not making the job easier.

While doing this kind of work, did you ever wonder, or search the internet to find out, whether an application could solve your problem? If so, believe me, you are in the right place!

Let’s break down our problem statement into execution steps:

  • Image Processing: Converting the scanned handwritten images into machine-readable text.
  • Search Keyword: Scanning the converted text for the given keywords.
  • Highlight Keyword: Highlighting the found keywords in a different color, which helps the reader find them quickly.
  • Restore Image: After searching and highlighting complete successfully, storing the image back in its original form.
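The hand-off between the first two steps can be sketched as follows. This is a minimal illustration, not the code in this repository: it assumes the OCR engine returns column-style output with `text`, `left`, `top`, `width` and `height` lists (the shape that pytesseract's `image_to_data` produces with `Output.DICT`), and the README does not say which OCR engine is actually used. `WordBox`, `to_word_boxes` and `sample_ocr` are hypothetical names and data introduced for the example.

```python
from typing import Dict, List, NamedTuple


class WordBox(NamedTuple):
    """One recognized word and its bounding box in pixel coordinates."""
    text: str
    left: int
    top: int
    width: int
    height: int


def to_word_boxes(ocr: Dict[str, list]) -> List[WordBox]:
    """Turn column-style OCR output into a list of non-empty WordBox items."""
    boxes = []
    for text, left, top, width, height in zip(
        ocr["text"], ocr["left"], ocr["top"], ocr["width"], ocr["height"]
    ):
        word = str(text).strip()
        if word:  # skip empty entries, which OCR output may contain for layout blocks
            boxes.append(WordBox(word, int(left), int(top), int(width), int(height)))
    return boxes


# Hypothetical OCR result for a tiny image; real values would come from the OCR engine.
sample_ocr = {
    "text": ["", "Florida", "Department", " ", "of", "Transportation"],
    "left": [0, 10, 80, 0, 190, 220],
    "top": [0, 12, 12, 0, 12, 12],
    "width": [300, 60, 100, 0, 20, 130],
    "height": [40, 18, 18, 0, 18, 18],
}
boxes = to_word_boxes(sample_ocr)
print([b.text for b in boxes])
```

Keeping each word together with its box means the search step can work on plain text while the highlight step still knows where to draw.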

Pre-requisites

Pseudo code

  • Step 1: Input the image repository and the keywords to highlight.
  • Step 2: Iterate over the keywords, splitting each keyword into a list of words.
  • Step 3: Iterate keyword by keyword and, within each keyword, word by word.
  • Step 4:
    • A: Load the images in the repository one by one.
    • B: Make a copy of the original image, which is used for rollback.
    • C: Start with the first word of the first keyword.
    • D: Check whether the word is present in the image.
      • If it is:
        • D1: Highlight the word.
        • D2: Check whether the next word of the keyword appears immediately after it.
        • If it does:
          • D1.1: Highlight all the remaining words that follow.
        • Else:
          • D1.2: Roll back the image to org_img, since the consecutive words were not found.

After step 4, the image has been scanned for the keyword, and the keyword is highlighted if it is present.
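The consecutive-word check and rollback in step 4 can be sketched in plain Python. This is an illustration under stated assumptions rather than the contents of ImgTextScan.py: the image is modelled as a nested list of RGB tuples, each OCR word arrives as a `(text, left, top, width, height)` tuple, and `find_phrase`, `highlight_phrase` and `HIGHLIGHT` are hypothetical names. Instead of highlighting word by word and undoing the change, the sketch confirms the whole phrase first and only then paints a copy, which gives the same result as the rollback.

```python
import copy
from typing import List, Tuple

Box = Tuple[str, int, int, int, int]  # (text, left, top, width, height)
HIGHLIGHT = (255, 255, 0)  # yellow, an arbitrary choice for the sketch


def find_phrase(words: List[str], phrase: str) -> List[int]:
    """Return start indexes where all words of the phrase appear consecutively."""
    target = [w.lower() for w in phrase.split()]
    lowered = [w.lower() for w in words]
    if not target:
        return []
    return [
        i for i in range(len(lowered) - len(target) + 1)
        if lowered[i:i + len(target)] == target
    ]


def highlight_phrase(image, boxes: List[Box], phrase: str):
    """Tint matching word boxes on a copy; return the original if nothing matches."""
    starts = find_phrase([b[0] for b in boxes], phrase)
    if not starts:
        return image  # nothing found: hand back the untouched original
    marked = copy.deepcopy(image)
    n = len(phrase.split())
    for start in starts:
        for _, left, top, width, height in boxes[start:start + n]:
            for y in range(top, min(top + height, len(marked))):
                for x in range(left, min(left + width, len(marked[y]))):
                    # Blend with the highlight color so the text stays visible.
                    marked[y][x] = tuple(
                        (p + h) // 2 for p, h in zip(marked[y][x], HIGHLIGHT)
                    )
    return marked


white = (255, 255, 255)
image = [[white] * 6 for _ in range(2)]
boxes = [("Florida", 0, 0, 2, 2), ("Department", 2, 0, 2, 2), ("Store", 4, 0, 2, 2)]
hit = highlight_phrase(image, boxes, "florida department")
miss = highlight_phrase(image, boxes, "Florida Store")
```

Here `hit` is a tinted copy, while `miss` is the original image object, because "Florida" and "Store" are not adjacent. Matching is case-insensitive and exact per word; punctuation attached to OCR words is not handled in this sketch.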

How to run the code?

  • Load the Python file ImgTextScan.py
  • Input:
  • Example Image File: testfilesample1.jpg # Construction Contract
  • Keyword To Highlight: “Florida Department of Transportation”
  • Run the .py file
  • Output: Refer to the output.png image file
Output

Conclusion

This is just a sample demo. The same approach can be applied to real-life business problems, such as:

    1. Searching for a string across millions of scanned document images and finding similar documents.
    2. Finding a specific statement or quotation in large scanned books.
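As a rough sketch of how the same phrase search might scale from one image to a pile of pages, the example below filters a collection of pages by whether their OCR'd words contain the phrase. It assumes OCR has already produced a word list per page; the page names, the sample text, and the `contains_phrase` and `search_documents` helpers are hypothetical and not part of this repository.

```python
from typing import Dict, List


def contains_phrase(words: List[str], phrase: str) -> bool:
    """True if the words of the phrase occur consecutively, ignoring case."""
    target = phrase.lower().split()
    lowered = [w.lower() for w in words]
    return bool(target) and any(
        lowered[i:i + len(target)] == target
        for i in range(len(lowered) - len(target) + 1)
    )


def search_documents(pages: Dict[str, List[str]], phrase: str) -> List[str]:
    """Return the sorted names of pages whose word list contains the phrase."""
    return sorted(
        name for name, words in pages.items() if contains_phrase(words, phrase)
    )


# Hypothetical OCR output for two scanned pages.
pages = {
    "page_001.jpg": "This contract is with the Florida Department of Transportation".split(),
    "page_002.jpg": "Department of Florida records".split(),
}
print(search_documents(pages, "Florida Department of Transportation"))
```

Only the first page is reported, since the second contains the same words but not in consecutive order.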

For clarification on the code, requests for different functionality, or a different use case along similar lines, please feel free to contact me at venkatmuthiahc@gmail.com.
