Nebula Expired Article Hunter scrapes expired websites for their forgotten articles. This gives you large amounts of expired content that is usually no longer indexed by search engines such as Google or Bing, which you can reuse for your website or marketing campaigns.
- Low memory consumption.
- Configurable using config.ini file:
  - Connection parameters.
  - Files and folder names.
  - Min and max words per article.
- Verbose interface.
- Organizes the discovered articles in a user-friendly way.
- You can scrape as many expired domains as you wish.
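The config.ini options listed above could be read along these lines. This is only a minimal sketch: the section names (`connection`, `files`, `articles`), the keys, the sample values, and the helper functions `load_settings` and `within_word_limits` are all assumptions made for illustration, so check the config.ini shipped with the project for the real names.

```python
import configparser

# Hypothetical config.ini contents; the real file's sections and keys may differ.
SAMPLE_CONFIG = """
[connection]
timeout = 10
retries = 3

[files]
output_folder = articles

[articles]
min_words = 300
max_words = 2000
"""


def load_settings(text):
    """Parse INI text and return the settings as a plain dict."""
    parser = configparser.ConfigParser()
    parser.read_string(text)
    return {
        "timeout": parser.getint("connection", "timeout"),
        "retries": parser.getint("connection", "retries"),
        "output_folder": parser.get("files", "output_folder"),
        "min_words": parser.getint("articles", "min_words"),
        "max_words": parser.getint("articles", "max_words"),
    }


def within_word_limits(article_text, min_words, max_words):
    """Return True if the whitespace-separated word count is within bounds."""
    count = len(article_text.split())
    return min_words <= count <= max_words


settings = load_settings(SAMPLE_CONFIG)
```

With settings like these, an article would be kept only when `within_word_limits(text, settings["min_words"], settings["max_words"])` returns `True`. To read an actual file instead of a string, `parser.read("config.ini")` can be used in place of `read_string`.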
To install and run this project, copy or clone all the files to your preferred folder and run:
pip install -r requirements.txt
python main.py
It's recommended to run in a virtual environment.
Nebula Expired Article Hunter was developed under Python 3.9.0, but it should run fine in any Python 3 environment.
- Speed up the scraping process.
- Add proxy rotation.
- GUI.
- Check the articles for plagiarism.
- Add an expired domain scraper.