Releases: gonzalopezgil/scraping-interface
Release v1.2.0: Support for Foreground Processes
New Features in version 1.2.0:
- The app's built-in browser can now run the scraping process.
- Users can choose whether a process runs in the background or in the foreground.
- Foreground processes support user interaction.
Release v1.1.0: Enhancements and Stability Improvements
The new version of Scraping Interface incorporates several key features and necessary bug fixes.
Features
- Multiple Pagination XPaths: Pagination now accepts multiple XPaths, so the scraper can reach further pages in more cases instead of only following a "Next" button.
- Process Management Enhancements: Individual processes can now be removed from the table for better task control, with the option of deleting the corresponding file.
- Cross-platform Style Improvement: Visual enhancements improve cross-platform compatibility and the user experience.
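One plausible reading of the multiple pagination XPaths feature is "try each XPath in order and follow the first one that matches". The sketch below illustrates that idea with the standard library only; it is not the application's actual code. The function name `first_matching_link`, the sample markup, and the XPaths are all hypothetical, and `xml.etree.ElementTree` supports only a subset of XPath.

```python
import xml.etree.ElementTree as ET


def first_matching_link(page, pagination_xpaths):
    """Return the href of the first element matched by any XPath, or None."""
    for xpath in pagination_xpaths:
        element = page.find(xpath)
        if element is not None:
            return element.get("href")
    return None


# Hypothetical page with no "Next" button, only a "Load more" link.
PAGE = ET.fromstring(
    "<html><body>"
    "<a class='page' href='/items?page=2'>2</a>"
    "<a class='more' href='/items?page=3'>Load more</a>"
    "</body></html>"
)

# The first XPath finds nothing, so the second one is used as a fallback.
xpaths = [".//a[@class='next']", ".//a[@class='more']"]
next_href = first_matching_link(PAGE, xpaths)
print(next_href)
```

With a single XPath for the "Next" button this page would stop the scrape; with the second XPath as a fallback, the "Load more" link is followed instead.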
Fixes
- Selenium Driver Modifications: Numerous fixes related to the Selenium driver, including size specifications, HTML entity handling, and XPath operations.
- Improved Pagination Stability: Modifications to prevent errors related to unlimited pagination and to improve handling when a pagination button fails.
- Interface and Compatibility Enhancements: Adjustments to window size across platforms, space allocation for UI elements, and updated color schemes for broader compatibility.
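The pagination stability fixes can be pictured as a loop with three guards: a page cap, a check for already visited pages, and error handling around the pagination step. The following is a minimal sketch under those assumptions, not the project's implementation; `paginate`, `get_next_url`, and `max_pages` are hypothetical names, and the "site" is a plain dictionary.

```python
def paginate(start_url, get_next_url, max_pages=50):
    """Visit pages until there is no next page, a repeat, a failure, or the cap."""
    visited = []
    seen = set()
    url = start_url
    while url is not None and url not in seen and len(visited) < max_pages:
        seen.add(url)
        visited.append(url)
        try:
            url = get_next_url(url)
        except Exception:
            # A failing pagination button ends the run but keeps the pages
            # collected so far.
            break
    return visited


# Hypothetical site where page 3 links back to page 1 (an endless loop).
LINKS = {"p1": "p2", "p2": "p3", "p3": "p1"}
pages = paginate("p1", LINKS.get)


def broken_button(url):
    raise RuntimeError("pagination button not clickable")


partial = paginate("p1", broken_button)
print(pages, partial)
```

The visited set stops cyclic pagination, `max_pages` bounds sites that generate new pages indefinitely, and the `try`/`except` turns a broken button into an early but clean stop.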
As always, you are encouraged to open an issue or submit a pull request for any problems or suggestions.
Release v1.0.0: Desktop Application
The first release of the application offers the following key functionalities:
- Web scraping: Extract information from web pages using a browser.
- Dynamic browsing: Navigate the web with Chromium and perform standard browser actions.
- XPath selection: Highlight and select elements on sites using generalized XPath.
- Table preview: Select data from web pages and view it in a table format.
- Pagination support: Extract data from multiple pages with consistent structures.
- Data export: Save scraped data in Excel, CSV, JSON and XML formats.
- Template management: Save and load scraping configurations for reuse.
- Authentication support: Store and use securely encrypted login credentials.
- CAPTCHA handling: Keep data extraction running by solving CAPTCHAs manually when they appear.
- Process monitoring: Track and manage scraping processes with progress indicators.
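To make the data export item concrete, here is a small sketch of writing the same scraped rows as CSV, JSON, and XML using only the Python standard library. It is an illustration, not the application's exporter: the sample rows and the helper names `to_csv`, `to_json`, and `to_xml` are hypothetical, and Excel output is left out because it would need a third-party library.

```python
import csv
import io
import json
import xml.etree.ElementTree as ET

# Hypothetical scraped rows: one dict per table row.
ROWS = [
    {"title": "Item A", "price": "10"},
    {"title": "Item B", "price": "12"},
]


def to_csv(rows):
    """Serialize rows as CSV text with a header line."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=list(rows[0]), lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


def to_json(rows):
    """Serialize rows as a JSON array of objects."""
    return json.dumps(rows, indent=2)


def to_xml(rows):
    """Serialize rows as XML; column names must be valid element names."""
    root = ET.Element("rows")
    for row in rows:
        item = ET.SubElement(root, "row")
        for key, value in row.items():
            ET.SubElement(item, key).text = value
    return ET.tostring(root, encoding="unicode")


print(to_csv(ROWS))
print(to_json(ROWS))
print(to_xml(ROWS))
```

Keeping the rows as a list of dictionaries means each format is a thin serializer over the same in-memory table, which is also the shape a table preview would naturally use.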
Please note that this is the first release, and further improvements and bug fixes are expected in subsequent versions.