Releases · gonzalopezgil/scraping-interface

26 Jun 19:45

Latest

The app's built-in browser can now run the scraping process.
The user can choose whether a process runs in the background or in the foreground.
Foreground processes support user interaction.


16 Jun 07:54

The new version of Scraping Interface incorporates several key features and necessary bug fixes.

Multiple Pagination XPaths: Pagination control now accepts multiple pagination XPaths, so further pages can be visited in more cases than by simply following the "Next" button.
Process Management Enhancements: Individual processes can now be removed from the table for improved task control, with the option of also deleting the corresponding file.
Cross-platform Style Improvement: Visual enhancements improve cross-platform compatibility and the user experience.
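
As an illustration of the multiple pagination XPaths idea, the following minimal sketch tries each XPath in order and returns the first link it finds. It is not the application's actual code: `next_page_link`, `PAGE` and `XPATHS` are hypothetical names, and the standard library's `xml.etree.ElementTree` (which supports only a subset of XPath) stands in for the real browser driver.

```python
import xml.etree.ElementTree as ET


def next_page_link(page, pagination_xpaths):
    """Return the href of the first element matched by any XPath, else None."""
    for xpath in pagination_xpaths:
        element = page.find(xpath)
        # Skip XPaths that match nothing or match an element without a link.
        if element is not None and element.get("href"):
            return element.get("href")
    return None


# Hypothetical page that has numbered page links but no "Next" button.
PAGE = ET.fromstring(
    "<html><body>"
    "<a class='page' href='/items?page=3'>3</a>"
    "</body></html>"
)

# The first XPath targets a "Next" button, the second is a fallback.
XPATHS = [".//a[@class='next']", ".//a[@class='page']"]

print(next_page_link(PAGE, XPATHS))
```

Here the "Next" XPath matches nothing, so the second XPath supplies the link to follow.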

Selenium Driver Modifications: Numerous fixes related to the Selenium driver, including size specifications, HTML entity handling, and XPath operations.
Improved Pagination Stability: Modifications to prevent errors related to unlimited pagination and to improve handling when a pagination button fails.
Interface and Compatibility Enhancements: Adjustments to window size across platforms, space allocation for UI elements, and updated color schemes for broader compatibility.
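
The pagination stability fixes can be pictured with a minimal, driver-agnostic sketch: cap the number of pages visited and stop cleanly when the pagination step fails. This is an assumption about the general approach, not the project's implementation. `scrape_pages`, `FakeSite` and the `max_pages` parameter are hypothetical, and a real version would catch the driver's specific exceptions rather than `Exception`.

```python
def scrape_pages(scrape_page, go_to_next, max_pages=50):
    """Collect rows page by page, stopping at max_pages or on a failed step."""
    results = []
    for _ in range(max_pages):  # hard cap guards against unlimited pagination
        results.extend(scrape_page())
        try:
            if not go_to_next():  # no further page available
                break
        except Exception:
            # The pagination button failed: keep what was scraped so far.
            break
    return results


class FakeSite:
    """Hypothetical stand-in for a browser session, used only for this demo."""

    def __init__(self, pages, fail_at=None):
        self.pages = pages
        self.index = 0
        self.fail_at = fail_at

    def scrape_page(self):
        return list(self.pages[self.index])

    def go_to_next(self):
        if self.fail_at is not None and self.index == self.fail_at:
            raise RuntimeError("pagination button not clickable")
        if self.index + 1 >= len(self.pages):
            return False
        self.index += 1
        return True


site = FakeSite([["a"], ["b"], ["c"]], fail_at=1)
print(scrape_pages(site.scrape_page, site.go_to_next))
```

In the demo the pagination step fails on the second page, so the rows from the first two pages are returned instead of an error.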

As always, you are encouraged to open an issue or submit a pull request for any problems or suggestions.


12 Jun 16:41

Release v1.0.0: Desktop Application

The first release of the application offers the following key functionalities:

Web scraping: Extract information from web pages using a browser.
Dynamic browsing: Navigate the web with Chromium and perform standard browser actions.
XPath selection: Highlight and select elements on sites using generalized XPath.
Table preview: Select data from web pages and view it in a table format.
Pagination support: Extract data from multiple pages with consistent structures.
Data export: Save scraped data in Excel, CSV, JSON and XML formats.
Template management: Save and load scraping configurations for reuse.
Authentication support: Store and use securely encrypted login credentials.
CAPTCHA handling: Continue data extraction without interruption by solving CAPTCHAs manually.
Process monitoring: Track and manage scraping processes with progress indicators.
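
To make the data export feature concrete, here is a minimal sketch of writing the same scraped rows as CSV, JSON and XML using only the Python standard library. It does not reflect the application's real export code: `to_csv`, `to_json`, `to_xml`, `ROWS` and `COLUMNS` are hypothetical, and Excel output is left out because it would need a third-party package.

```python
import csv
import io
import json
import xml.etree.ElementTree as ET


def to_csv(rows, columns):
    """Serialize rows (a list of dicts) as CSV text with a header line."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=columns, lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


def to_json(rows):
    """Serialize rows as an indented JSON array of objects."""
    return json.dumps(rows, ensure_ascii=False, indent=2)


def to_xml(rows, columns):
    """Serialize rows as XML; column names must be valid XML tag names."""
    root = ET.Element("rows")
    for row in rows:
        item = ET.SubElement(root, "row")
        for column in columns:
            ET.SubElement(item, column).text = str(row[column])
    return ET.tostring(root, encoding="unicode")


# Hypothetical scraped data, as it might appear in the table preview.
ROWS = [{"title": "Item A", "price": "10"}, {"title": "Item B", "price": "12"}]
COLUMNS = ["title", "price"]

print(to_csv(ROWS, COLUMNS))
print(to_json(ROWS))
print(to_xml(ROWS, COLUMNS))
```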

Please note that this is the first release, and further improvements and bug fixes are expected in subsequent versions.
