russmckendrick/discogs-scraper

Discogs Scraper 🎵

A basic scraper for generating files for https://www.russ.fm/ 🎸. While this was initially created for personal use, feel free to use it if you find it helpful! 😃 Although the documentation is minimal, the code is fairly straightforward.

You can find the repo containing the website files and config at russmckendrick/records. It's a Hugo-powered site and there are a lot of files.

Getting Started 🚀

  1. Clone the repository to your local machine.
  2. Install the required dependencies using pip install -r requirements.txt.
  3. Run the discogs_scraper.py script to start the scraper.

Configuration ⚙️

To customize the scraper for your needs, copy the secrets.json.example file to secrets.json and fill in the details.
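
As a rough illustration of how a script might read this file, here is a minimal sketch. It assumes secrets.json holds a single JSON object; the helper name load_secrets and its error messages are hypothetical and not taken from discogs_scraper.py. Check secrets.json.example for the keys the scraper actually expects.

```python
import json
from pathlib import Path


def load_secrets(path="secrets.json"):
    """Return the parsed secrets file, failing with a clear message if it is missing.

    Hypothetical helper: the real script may load its configuration differently.
    """
    secrets_path = Path(path)
    if not secrets_path.exists():
        raise FileNotFoundError(
            f"{secrets_path} not found; copy secrets.json.example to "
            f"{secrets_path} and fill in the details"
        )
    with secrets_path.open(encoding="utf-8") as handle:
        secrets = json.load(handle)
    # Assumption: the file is a JSON object of key/value settings.
    if not isinstance(secrets, dict):
        raise ValueError(f"{secrets_path} should contain a JSON object")
    return secrets
```

Failing early with a pointer to secrets.json.example is a small design choice that saves a confusing traceback later in the run.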

How it Works 🛠

The scraper fetches data from the Discogs API and processes the information to generate markdown files and download images. This data can then be used to create a static site showcasing your music collection 🎧.
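
To make the "generate markdown files" step concrete, here is a small sketch that turns one release into a Hugo-style page with front matter. The function name render_release_markdown and the dictionary keys title, artist, and year are assumptions chosen for illustration; the files the real scraper writes will almost certainly contain more fields.

```python
import json


def render_release_markdown(release):
    """Build a markdown page (front matter plus a one-line body) for one release.

    Hypothetical sketch: field names are illustrative, not the scraper's schema.
    """
    title = release["title"]
    artist = release["artist"]
    year = release.get("year", "")
    lines = [
        "---",
        # json.dumps yields a double-quoted, escaped string that YAML accepts.
        f"title: {json.dumps(f'{artist} - {title}')}",
        f"year: {year}",
        "---",
        "",
        f"{title} by {artist}.",
        "",
    ]
    return "\n".join(lines)


if __name__ == "__main__":
    sample = {"title": "Example Album", "artist": "Example Artist", "year": 1999}
    print(render_release_markdown(sample))
```

Quoting the title through json.dumps keeps the front matter valid even when an album name contains quotes or colons.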

Running the Scraper 🏃‍♂️

The scraper can be run using the following commands:

To process just 10 releases, with a 2 second delay between requests, run the script without any flags:

$ python3 discogs_scraper.py

You can add the --all flag to process all releases in your collection:

$ python3 discogs_scraper.py --all

You can also add the --num-items flag to process a specific number of releases:

$ python3 discogs_scraper.py --num-items 100

Finally, you can override the default 2 second delay between requests using the --delay flag. This is not recommended, as it may cause issues with the Discogs API, so be careful:

$ python3 discogs_scraper.py --delay 0
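
The delay exists to space out requests to the Discogs API. A minimal sketch of that idea follows; process_with_delay and handle_release are hypothetical names, and the real script may structure its loop differently.

```python
import time


def process_with_delay(releases, handle_release, delay=2.0, sleep=time.sleep):
    """Call handle_release on each item, pausing `delay` seconds between calls.

    Hypothetical sketch of request throttling; not the scraper's actual loop.
    """
    results = []
    for index, release in enumerate(releases):
        # No pause before the first item, and none at all when delay is 0.
        if index > 0 and delay > 0:
            sleep(delay)
        results.append(handle_release(release))
    return results
```

Passing sleep as a parameter is only there so the pause can be swapped out in tests; in normal use the default time.sleep applies.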

You can also combine the flags, for example to process all releases without any delay:

$ python3 discogs_scraper.py --all --delay 0
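
For readers curious how flags like these are typically wired up, here is a sketch using argparse. The flag names and defaults (10 releases, 2 second delay) come from the descriptions above; everything else, including the helper releases_to_process and the choice that --all takes precedence over --num-items, is an assumption rather than the script's confirmed behaviour.

```python
import argparse


def parse_args(argv=None):
    """Parse the three documented flags (illustrative sketch only)."""
    parser = argparse.ArgumentParser(description="Discogs scraper CLI sketch")
    parser.add_argument("--all", action="store_true",
                        help="process every release in the collection")
    parser.add_argument("--num-items", type=int, default=10,
                        help="number of releases to process")
    parser.add_argument("--delay", type=float, default=2.0,
                        help="seconds to wait between requests")
    return parser.parse_args(argv)


def releases_to_process(args, collection_size):
    """Return how many releases a run would handle for the given flags."""
    # Assumption: --all overrides --num-items when both are given.
    if args.all:
        return collection_size
    return min(args.num_items, collection_size)


if __name__ == "__main__":
    args = parse_args()
    print(releases_to_process(args, collection_size=250), args.delay)
```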

Contribution 🤝

If you'd like to contribute or suggest improvements, feel free to submit a pull request or open an issue on GitHub. We appreciate your input! 🌟

Enjoy scraping and building your music collection website! 🎶

One More Thing... 🤖

Oh yeah, it was mostly written by ChatGPT 💬 with me debugging 🐛 it and adding some features. 🤓

Some random links

Useful when you are reviewing wrong matches and need to move a release from your collection_cache.json file to the collection_cache_overrides.json file.
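
If you would rather script that move than edit the files by hand, something like the sketch below could work. It assumes both files are flat JSON objects keyed by a release identifier, which is a guess about their layout and not confirmed by the repo; move_to_overrides is a hypothetical helper, so inspect your own cache files before relying on it.

```python
import json
from pathlib import Path


def move_to_overrides(release_key,
                      cache_path="collection_cache.json",
                      overrides_path="collection_cache_overrides.json"):
    """Move one entry from the cache file to the overrides file.

    Assumption: both files are JSON objects keyed by release identifier.
    """
    cache_file = Path(cache_path)
    overrides_file = Path(overrides_path)
    cache = json.loads(cache_file.read_text(encoding="utf-8"))
    overrides = {}
    if overrides_file.exists():
        overrides = json.loads(overrides_file.read_text(encoding="utf-8"))
    # Raises KeyError before anything is written if the key is not cached.
    entry = cache.pop(release_key)
    overrides[release_key] = entry
    overrides_file.write_text(json.dumps(overrides, indent=2), encoding="utf-8")
    cache_file.write_text(json.dumps(cache, indent=2), encoding="utf-8")
    return entry
```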
