Welcome to GobHobs, the goblin-themed data extraction tool! In the world of goblins, there’s one thing they excel at—looting and extracting valuable information. Inspired by their relentless pursuit of treasures, GobHobs is here to help you “extract” data from multiple sources and organize it in ways only a goblin would dream of!
Once upon a time, a mischievous goblin stumbled upon an ancient library filled with data. From legal PDFs to phone records and web search results, it was all too much for even the greediest goblin to process. But this goblin was no ordinary thief: he wanted all of the data, and he wanted to process it intelligently, organizing it for future "plunders."
Thus, GobHobs was born—a magical tool that allows you to:
- Extract and structure data from PDFs (loot the precious structured data).
- Search and manage phone records (sneaky, sneaky goblin-style investigation).
- Scrape the web and organize it based on your needs (goblin-grade web-looting).
But there’s a twist—the goblin leaves some work for you to do! You can implement the brains of the goblin's operations in either Python (.py) or JavaScript (.js). Let’s dive in!
GobHobs is divided into three microservices:
- PDF Extractor: Extract structured data from PDF files and convert it into JSON format.
- Phone Records: Manage and search phone records by converting CSV data into JSON and performing complex searches.
- Web Scraping: Scrape data from web pages and order it based on relevance or user-defined criteria.
Each microservice provides the option to implement functionality in either Python (.py) or JavaScript (.js), giving you the freedom to choose based on your expertise.
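To make the "order it based on relevance or user-defined criteria" idea concrete, here is a minimal Python sketch of one possible ranking approach. It is not the project's actual implementation: the function names (`tokenize`, `score`, `rank_results`), the result fields (`title`, `snippet`), and the choice to weight title matches double are all assumptions made for illustration.

```python
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphanumeric words."""
    return re.findall(r"[a-z0-9]+", text.lower())


def score(query, result):
    """Count query-term occurrences in a result; title hits count double.

    The "title" and "snippet" keys are hypothetical field names.
    """
    terms = set(tokenize(query))
    title_counts = Counter(tokenize(result.get("title", "")))
    body_counts = Counter(tokenize(result.get("snippet", "")))
    return sum(2 * title_counts[t] + body_counts[t] for t in terms)


def rank_results(query, results, key=None):
    """Return the results sorted best-first.

    `key` is an optional user-defined scoring function that takes one
    result and returns a number. When it is omitted, the term-count
    relevance score above is used.
    """
    scorer = key or (lambda result: score(query, result))
    return sorted(results, key=scorer, reverse=True)


if __name__ == "__main__":
    sample = [
        {"title": "Weather report", "snippet": "Rain."},
        {"title": "Cooking with mushrooms", "snippet": "A goblin favourite."},
        {"title": "Goblin treasure maps", "snippet": "Where goblin hoards hide treasure."},
    ]
    for result in rank_results("goblin treasure", sample):
        print(result["title"])
```

Passing a custom `key` function covers the user-defined case, for example `rank_results(query, results, key=lambda r: len(r["snippet"]))` to prefer longer snippets. A real ranking function would likely need stop-word handling and a better relevance measure than raw term counts.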
- PDF Extractor:
  - Extract tables, key-value pairs, or any structured data from a PDF file.
  - Save the data as a JSON file.
  - You can implement this in either extractfrompdf.py or extractfrompdf.js.
- Phone Records:
  - Convert phone records from CSV format to JSON.
  - Implement a smart search function to find exact and approximate matches (goblins love finding hidden treasures).
  - You can implement this in searchrecords.py or searchrecords.js.
- Web Scraping:
  - Scrape the web based on a query provided by the user.
  - Implement functionality to order and rank the results.
  - You can implement this in ordering.py or ordering.js.
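As an illustration of the Phone Records task above, the following Python sketch converts CSV text to JSON-ready records and then separates exact matches from approximate ones. It is only one possible approach, not the project's required design: the column names (`name`, `phone`), the function names, and the 0.6 similarity cutoff are assumptions, and the fuzzy matching relies on the standard library's `difflib` rather than any method the project prescribes.

```python
import csv
import difflib
import io
import json


def csv_to_records(csv_text):
    """Parse CSV text with a header row into a list of dicts."""
    return list(csv.DictReader(io.StringIO(csv_text)))


def records_to_json(records):
    """Serialize the records as a pretty-printed JSON string."""
    return json.dumps(records, indent=2)


def search_records(records, query, field="name", cutoff=0.6):
    """Return (exact, approximate) matches for `query` in `field`.

    Exact matches ignore case and surrounding whitespace. Approximate
    matches are records whose difflib similarity ratio reaches `cutoff`
    and that are not already exact matches.
    """
    needle = query.strip().lower()
    exact, approximate = [], []
    for record in records:
        # A missing or empty field is treated as an empty string.
        value = (record.get(field) or "").strip().lower()
        if value == needle:
            exact.append(record)
        elif difflib.SequenceMatcher(None, needle, value).ratio() >= cutoff:
            approximate.append(record)
    return exact, approximate


if __name__ == "__main__":
    # Hypothetical sample data; the real CSV layout may differ.
    sample_csv = (
        "name,phone\n"
        "Grizzle,555-0101\n"
        "Grizzel,555-0102\n"
        "Snagtooth,555-0199\n"
    )
    records = csv_to_records(sample_csv)
    print(records_to_json(records))
    exact, approximate = search_records(records, "grizzle")
    print("exact:", exact)
    print("approximate:", approximate)
```

Raising `cutoff` makes the approximate search stricter, while lowering it returns more distant matches. For phone numbers rather than names, normalizing digits before comparing would probably work better than a plain string ratio.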
To unleash the goblin magic, follow these steps:
Ensure you have both Python and Node.js installed on your system. Then, install the necessary Python dependencies by running:
pip install -r requirements.txt
Navigate to the frontend folder and run the Goblin command-line interface (CLI) using:
cd frontend
python shell_script.py
Navigate to the backend directory and run the backend file:
cd backend
python app.py
Once the frontend CLI is running, type the help keyword to see the available commands and their usage.
Watch the video tutorial below to get a complete walkthrough of how to set up and implement each microservice within GobHobs. The video covers:
- Installing dependencies.
- Running the frontend CLI and backend API.
- Using the GobHobs CLI for PDF extraction, phone record searching, and web scraping.
- Implementing your custom logic in Python or JavaScript for each microservice.
Click here to watch the full video walkthrough
Follow these steps to pull the code, make your own changes, and create a new branch:
First, clone the repository to your local machine using Git:
git clone "Repository-link"
cd GobHobs
git checkout -b your-feature-branch
git add .
git commit -m "change-mentioned"
git push origin your-feature-branch
Thank you for joining the GobHobs adventure! We hope you enjoy working on this project as much as the goblins enjoy looting and extracting data.
Whether you're mastering the art of PDF extraction, building intelligent search algorithms for phone records, or crafting the perfect web scraping and ranking logic, the goblins are always watching your progress with excitement. 🧙‍♂️✨
Remember, this project is all about improving your skills in:
- Data extraction.
- Algorithm building.
- Smart searching.
- Web scraping.
Feel free to customize, improve, and make this project your own! The goblins can’t wait to see what you’ll do next.
Good luck, and happy coding! 💻💡🧙‍♀️