qzcool/Media-Scrapper

Media-Scrapper

A media scraper that values simplicity and performance. It automatically downloads the best stories under a given tag. Built on top of BeautifulSoup4 and Requests.

Functionality

  1. Download all media from a story (URL) into a folder named after the story.
  2. Create a list of stories by tag (topic), then download all media from those stories (URLs).
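
The two functions above could be sketched roughly as follows. This is a minimal illustration, not the notebook's actual code: the helper names (`story_folder`, `media_targets`, `download_story`, `download_tag`) are hypothetical, and it assumes the story title comes from the page's `<title>` element and the media are plain `<img>` tags, which may not hold for every source. Collecting story URLs from a tag page is site-specific and is not shown.

```python
import re
from pathlib import Path
from urllib.parse import urljoin, urlparse


def story_folder(title, root="."):
    """Return a folder path named after the story, with unsafe characters replaced."""
    safe = re.sub(r'[\\/:*?"<>|]+', "_", title).strip() or "untitled"
    return Path(root) / safe


def media_targets(page_url, srcs, folder):
    """Pair each media URL (made absolute) with a destination file in `folder`."""
    targets = []
    for index, src in enumerate(srcs, start=1):
        url = urljoin(page_url, src)
        # Fall back to a numbered name when the URL path has no file name.
        name = Path(urlparse(url).path).name or f"media_{index}"
        targets.append((url, folder / name))
    return targets


def download_story(page_url, root="."):
    """Fetch a story page and save every <img> it references (needs network access)."""
    # Imported here so the helpers above work without the third-party packages.
    import requests
    from bs4 import BeautifulSoup

    response = requests.get(page_url, timeout=30)
    response.raise_for_status()
    soup = BeautifulSoup(response.text, "html.parser")

    # Assumption: the story name is taken from the page's <title> element.
    title = soup.title.get_text(strip=True) if soup.title else "untitled"
    folder = story_folder(title, root)
    folder.mkdir(parents=True, exist_ok=True)

    srcs = [img["src"] for img in soup.find_all("img", src=True)]
    for url, dest in media_targets(page_url, srcs, folder):
        media = requests.get(url, timeout=30)
        media.raise_for_status()
        dest.write_bytes(media.content)
    return folder


def download_tag(story_urls, root="."):
    """Download every story in a list of URLs, e.g. one collected from a tag page."""
    return [download_story(url, root) for url in story_urls]
```

Keeping the path logic in `story_folder` and `media_targets` separate from the network calls makes it possible to check folder and file names without contacting any site.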

Sample Story Media

Supported Media Sources

  1. 福利秀
  2. 91自拍论坛

Deployment Instructions

  1. Download the repository to the local path where the media will be saved.
  2. Install the package dependencies:
  • tqdm: pip install tqdm
  • fake-useragent: pip install fake-useragent
  • BeautifulSoup4: pip install beautifulsoup4
  • Requests: pip install requests
  3. Open the MediaScrapper.ipynb file with Jupyter Notebook.
  4. Select the code block for the media source you want.
  5. Change the URL or tag to suit your needs.
  6. Run the code block to start scraping media.
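
Before running the notebook, it can help to confirm that the four packages from the install step are importable. The check below is a small sketch and not part of the repository; the function name `missing_packages` is hypothetical, and the mapping from pip names to import names reflects how these packages are normally imported.

```python
from importlib.util import find_spec

# pip package name -> module name used in an import statement.
REQUIRED = {
    "tqdm": "tqdm",
    "fake-useragent": "fake_useragent",
    "beautifulsoup4": "bs4",
    "requests": "requests",
}


def missing_packages(required=REQUIRED):
    """Return the pip names of packages whose modules cannot be found."""
    return [
        pip_name
        for pip_name, module in required.items()
        if find_spec(module) is None
    ]


missing = missing_packages()
if missing:
    print("Missing packages, install with: pip install " + " ".join(missing))
else:
    print("All dependencies are installed.")
```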

Issues

  1. The media sources see a high volume of traffic at night, so we suggest running MediaScrapper at other times.

Disclaimer

Sharing sensitive (adult) content might be against the law. This media scraper is intended purely for personal academic use, and the author accepts no responsibility for legal issues arising from non-personal or non-academic use.