A Python project that scrapes website data with Selenium and transforms it into CSV format.
The bot scrapes data from notino.co.uk and writes the result to CSV files.
Python and Selenium provide the backend logic.
This project is released under the MIT License. The libraries and tools it uses have their own licenses:
- Selenium: Apache License 2.0
- Python: Python Software Foundation License
- abstract_scraper.py: Base class with scraping methods.
- scraper.py: Scrapes data from Notino and saves it to notino_raw.csv.
- transformation.py: Transforms raw data, adds extra columns, and saves the result to notino_transformed.csv.
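The split between a base class and a site-specific scraper could look roughly like the sketch below. It is only an illustration of the pattern: the class names `AbstractScraper` and `NotinoScraper`, the method names, and the column names are assumptions, not the actual contents of abstract_scraper.py or scraper.py, and the Selenium calls are replaced by placeholder data.

```python
from abc import ABC, abstractmethod
import csv


class AbstractScraper(ABC):
    """Hypothetical base class holding the logic shared by all scrapers."""

    def __init__(self, url, output_path):
        self.url = url
        self.output_path = output_path

    @abstractmethod
    def scrape(self):
        """Return a list of dicts, one per scraped item."""

    def save(self, rows):
        """Write rows to self.output_path as CSV and return the row count."""
        if not rows:
            return 0
        with open(self.output_path, "w", newline="", encoding="utf-8") as f:
            writer = csv.DictWriter(f, fieldnames=list(rows[0].keys()))
            writer.writeheader()
            writer.writerows(rows)
        return len(rows)


class NotinoScraper(AbstractScraper):
    """Hypothetical site-specific scraper."""

    def scrape(self):
        # Placeholder data (assumed columns); a real implementation
        # would drive Selenium against self.url here.
        return [{"name": "Example toothpaste", "price": "3.50"}]
```

With this layout, supporting another site would mean adding a subclass that implements `scrape()`, while the CSV writing stays in one place.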
- Python 3.7+
- Google Chrome & ChromeDriver
- Required Python packages
- Install packages: pip install -r requirements.txt
- Update the URL in scraper.py to the Notino website for the region you want to scrape (e.g., https://www.notino.co.uk/toothpaste/).
- Run the scraper to collect raw data: python scraper.py
- The raw data will be saved to notino_raw.csv.
- Transform the raw data to the final format: python transformation.py
- The transformed data will be saved to notino_transformed.csv.
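To make the transformation step more concrete, here is a minimal sketch of reading raw CSV rows, adding an extra column, and writing the result. The column names (`name`, `price`, `currency`) and the function names are assumptions chosen for illustration; the real transformation.py may add different columns. The demo runs on in-memory text rather than the actual notino_raw.csv.

```python
import csv
import io


def transform_rows(raw_rows, currency="GBP"):
    """Return copies of the rows with a numeric price and an added currency column."""
    transformed = []
    for row in raw_rows:
        new_row = dict(row)
        new_row["price"] = float(row["price"])  # assumed column name
        new_row["currency"] = currency          # assumed extra column
        transformed.append(new_row)
    return transformed


def transform_csv(raw_file, out_file):
    """Read raw CSV from raw_file, write transformed CSV to out_file, return the row count."""
    rows = transform_rows(csv.DictReader(raw_file))
    if rows:
        writer = csv.DictWriter(out_file, fieldnames=list(rows[0].keys()))
        writer.writeheader()
        writer.writerows(rows)
    return len(rows)


# In-memory demo with made-up data standing in for the raw file.
raw = io.StringIO("name,price\nExample toothpaste,3.50\n")
out = io.StringIO()
count = transform_csv(raw, out)
print(out.getvalue())
```

For real files, the same function can be given handles opened with `open("notino_raw.csv", newline="")` and `open("notino_transformed.csv", "w", newline="")`.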
- Web Scraping: Demonstrates how to scrape data from dynamically loaded web pages using Selenium.
- Data Transformation: Shows how to transform and enhance scraped data with additional information and save it in a structured format.
- Error Handling and Logging: Incorporates error handling and logging so that failures are easier to debug and the scraper is easier to maintain.
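One common way to combine error handling and logging in a scraper is a small retry helper, sketched below with the standard library only. The helper name `with_retries`, the logger name, and the retry policy are assumptions for illustration, not the project's actual implementation. In a Selenium context the `action` would typically be a page load or an element lookup.

```python
import logging
import time

logging.basicConfig(level=logging.INFO, format="%(asctime)s %(levelname)s %(message)s")
logger = logging.getLogger("notino_scraper")  # hypothetical logger name


def with_retries(action, attempts=3, delay_seconds=0.0):
    """Call action() up to `attempts` times, logging every failure.

    Returns the first successful result; re-raises the last exception
    if all attempts fail.
    """
    for attempt in range(1, attempts + 1):
        try:
            return action()
        except Exception as exc:
            logger.warning("Attempt %d/%d failed: %s", attempt, attempts, exc)
            if attempt == attempts:
                logger.error("Giving up after %d attempts", attempts)
                raise
            time.sleep(delay_seconds)
```

A call such as `with_retries(lambda: driver.get(url), attempts=3, delay_seconds=2)` would then log each failed page load before retrying, where `driver` and `url` are whatever the scraper already uses.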