This project demonstrates the use of Boto3, the AWS SDK for Python, together with Amazon Textract, a machine learning service provided by AWS, to extract text from images.
- Uses Boto3 to interact with the AWS Textract service.
- Extracts text from images in various formats, such as JPEG or PNG.
- Supports processing documents stored in AWS S3 buckets or locally on the file system.
- Provides a simple interface for extracting text from scanned documents.
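
The features above can be sketched roughly as follows. This is a minimal illustration, not the actual contents of `extract_text.py`: the helper names `build_document`, `lines_from_response`, and `extract_text` are hypothetical, and the sketch assumes the synchronous `detect_document_text` call is used and that AWS credentials and a region are already configured.

```python
from typing import List, Optional


def build_document(path: Optional[str] = None,
                   bucket: Optional[str] = None,
                   key: Optional[str] = None) -> dict:
    """Build the Document argument from a local file or an S3 object (hypothetical helper)."""
    if path is not None:
        # Local file: send the raw image bytes with the request.
        with open(path, "rb") as f:
            return {"Bytes": f.read()}
    if bucket is not None and key is not None:
        # S3 object: Textract reads the image directly from the bucket.
        return {"S3Object": {"Bucket": bucket, "Name": key}}
    raise ValueError("Provide either a local path or both bucket and key")


def lines_from_response(response: dict) -> List[str]:
    """Collect the text of every LINE block in a Textract response."""
    return [
        block["Text"]
        for block in response.get("Blocks", [])
        if block.get("BlockType") == "LINE"
    ]


def extract_text(path: Optional[str] = None,
                 bucket: Optional[str] = None,
                 key: Optional[str] = None) -> List[str]:
    """Call Textract and return the detected lines of text."""
    # Imported here so the helpers above can be used without boto3 installed.
    import boto3

    client = boto3.client("textract")
    response = client.detect_document_text(
        Document=build_document(path, bucket, key)
    )
    return lines_from_response(response)
```

Under these assumptions, `extract_text(path="scan.png")` would process a local image and `extract_text(bucket="my-bucket", key="scan.png")` an object in S3; both file and bucket names here are placeholders.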
- Clone the repository:
git clone <repository_url>
- Install all the required dependencies:
pip install -r requirement.txt
- Run the Python application:
python extract_text.py
Note: You need to configure your AWS profile (credentials and region) before running the application.
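
One way to point the code at a specific profile is to create the client from a `boto3.Session`. The sketch below is an assumption about how this could be wired up, not the project's actual code: `session_kwargs` and `make_textract_client` are hypothetical helpers, and the profile is assumed to already exist in your local AWS configuration.

```python
from typing import Optional


def session_kwargs(profile_name: Optional[str] = None,
                   region_name: Optional[str] = None) -> dict:
    """Keep only the settings that were given explicitly (hypothetical helper)."""
    candidates = {"profile_name": profile_name, "region_name": region_name}
    # Unset values are dropped so boto3 falls back to its default configuration.
    return {name: value for name, value in candidates.items() if value is not None}


def make_textract_client(profile_name: Optional[str] = None,
                         region_name: Optional[str] = None):
    """Create a Textract client for the given profile and region."""
    # Imported here so session_kwargs can be used without boto3 installed.
    import boto3

    session = boto3.Session(**session_kwargs(profile_name, region_name))
    return session.client("textract")
```

For example, `make_textract_client(profile_name="textract-demo", region_name="us-east-1")` would use a profile named `textract-demo`; both values are placeholders.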
Contributions are welcome! If you find any bugs or have ideas for improvements, feel free to open an issue or submit a pull request.
You may use this project freely at your own risk. See LICENSE.
Copyright (c) 2023 Mahima Churi