csschooser

Description:

An interactive CLI tool for choosing CSS selectors for a web page. Designed for use as a library with BeautifulSoup and Scrapy.

This project uses the BeautifulSoup and rich libraries to create an interactive element-selecting experience. It can be run as a program or used as a library.

Prerequisites

This project was made using Python 3.10.12 and pip 22.0.2. See requirements.txt for module information.

Installation

Using Git:

git clone https://github.com/Makaze/csschooser.git
cd csschooser
pip install -r requirements.txt

Usage

On the Command Line:

$ python3 csschooser.py

As a Library:

Example using the BeautifulSoup library to print the text from all matching elements:

import csschooser

soup = csschooser.get_soup("http://github.com/Makaze/csschooser")  # Example URL

selector = csschooser.interactive_select(soup)

for tag in soup.select(selector):
    print(tag.get_text().strip())

API / Documentation

get_soup(name):

Takes in a string name and returns a BeautifulSoup instance based on the contents of the file or URL named name. Raises a FileNotFoundError if name is neither a valid URL nor a valid file name.
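
A caller may want to know ahead of time whether a given name will be treated as a URL or as a file. The following is a minimal sketch of one way to make that distinction using only the standard library. The helper name classify_source is hypothetical and not part of csschooser, and the check shown here is an assumption, not a description of how get_soup itself decides.

```python
from pathlib import Path
from urllib.parse import urlparse


def classify_source(name: str) -> str:
    """Return "url" or "file" for `name`; raise FileNotFoundError otherwise.

    Hypothetical helper for illustration only, not part of csschooser.
    """
    parsed = urlparse(name)
    # Assumption: anything with an http(s) scheme and a host counts as a URL.
    if parsed.scheme in ("http", "https") and parsed.netloc:
        return "url"
    # Otherwise fall back to checking for an existing local file.
    if Path(name).is_file():
        return "file"
    raise FileNotFoundError(f"{name!r} is neither a URL nor an existing file")
```

Running this check first lets a script report a bad argument before any network request or parsing is attempted.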

get_regex(s):

Takes in a string s and returns a regular expression pattern, as a string, that matches the outermost element in s. Returns s unchanged if it contains no elements.
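
To illustrate the idea of matching an outermost element with a regular expression, here is a rough sketch. The function outermost_pattern is hypothetical and is not csschooser's get_regex, whose actual pattern may differ. It assumes the element has an explicit closing tag, so void elements such as br are not handled.

```python
import re

# Finds the first opening tag and captures its name,
# e.g. '<div class="x">' captures "div".
_OPEN_TAG = re.compile(r"<([A-Za-z][A-Za-z0-9-]*)\b[^>]*>")


def outermost_pattern(s: str) -> str:
    """Return a regex string matching the outermost element in `s`.

    Hypothetical sketch, not csschooser's get_regex.
    Returns `s` unchanged when no opening tag is found.
    """
    match = _OPEN_TAG.search(s)
    if match is None:
        return s
    name = re.escape(match.group(1))
    # The greedy ".*" (with DOTALL via "(?s)") runs from the first opening
    # tag to the LAST closing tag of the same name, so nested elements of
    # the same type stay inside the match.
    return rf"(?s)<{name}\b[^>]*>.*</{name}>"
```

Because the match is greedy, two sibling elements with the same tag name would be captured together. A real HTML parser is the safer choice when that matters.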

interactive_select(soup):

Takes in soup as a BeautifulSoup instance and prompts the user to enter a CSS selector. Matching elements are highlighted in an auto-scrolling output window. Clears the terminal screen and returns the last chosen selector when the user follows the prompt to exit.

clear(lines):

Takes in an int lines. If lines is >= 1, moves the cursor up one line and to the end of that line, repeated lines times, and returns the resulting backtrack sequence as a string. Otherwise, calls the system's clear-terminal command to clear the screen and returns False.
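
A backtrack sequence of this kind can be built from ANSI escape codes. The sketch below shows one possible construction. The name backtrack_sequence and the specific escape codes are assumptions for illustration, not taken from csschooser's source.

```python
# ANSI CUU: move the cursor up one line.
CURSOR_UP = "\x1b[1A"
# ANSI CUF: move the cursor right by up to 999 columns; most terminals
# stop at the last column, which puts the cursor at the end of the line.
CURSOR_END = "\x1b[999C"


def backtrack_sequence(lines: int) -> str:
    """Build an escape sequence that moves up `lines` lines.

    Hypothetical sketch, not csschooser's clear(). Returns an empty
    string when `lines` is less than 1.
    """
    return (CURSOR_UP + CURSOR_END) * max(lines, 0)
```

Writing the returned string to the terminal, for example with sys.stdout.write, performs the cursor movement. Keeping the function pure makes the sequence easy to inspect or test.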

paginate(console, pretty):

Takes in console as a rich.console.Console instance and pretty as a string, then passes pretty to the console and sends the rendered output to the system's pager utility (less on Linux systems).
