This repo contains code to crawl images and videos:
- ORIGINAL images from Google Search
- ORIGINAL videos from YouTube
ChromeDriver
- Check your current Google Chrome version.
- Download the ChromeDriver release that corresponds to your Chrome version from the ChromeDriver site, then unzip it.
  For example, I'm using Chrome version 95.0.4638.69 on Linux, so I downloaded chromedriver_linux64.zip.
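The version check above can be scripted. The following is a minimal sketch, not part of this repo: the function names are hypothetical, and it assumes that Chrome and ChromeDriver are compatible when the first three parts of their version numbers (major, minor, build) agree.

```python
import re
import subprocess

# Matches a four-part version such as 95.0.4638.69.
VERSION_RE = re.compile(r"(\d+)\.(\d+)\.(\d+)\.(\d+)")


def parse_version(text):
    """Return the first four-part version found in text as a tuple of ints."""
    match = VERSION_RE.search(text)
    if match is None:
        raise ValueError(f"no version number found in {text!r}")
    return tuple(int(part) for part in match.groups())


def versions_compatible(chrome_output, driver_output):
    """Assumed rule: major, minor, and build numbers must be equal."""
    return parse_version(chrome_output)[:3] == parse_version(driver_output)[:3]


def read_version(command):
    """Run `<command> --version` and return the raw text it prints."""
    result = subprocess.run(
        [command, "--version"], capture_output=True, text=True, check=True
    )
    return result.stdout
```

A possible use, assuming the binaries are named `google-chrome` and `./chromedriver` on your machine: `versions_compatible(read_version("google-chrome"), read_version("./chromedriver"))`.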
Environment
conda env create -f environment.yml
Download original images (not thumbnails) from Google Images Search with multi-threading :D
- Get URLs by keywords
python crawl_url.py
- Download images from URLs
python crawl_data.py
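To illustrate the multi-threaded download step, here is a minimal sketch using only the standard library. It is not the code in `crawl_data.py`; `fetch_bytes`, `download_all`, the `.jpg` fallback extension, and the default of 8 workers are all assumptions made for the example.

```python
import hashlib
import os
import urllib.request
from concurrent.futures import ThreadPoolExecutor, as_completed
from urllib.parse import urlparse


def fetch_bytes(url, timeout=10):
    """Download one URL and return the response body as bytes."""
    request = urllib.request.Request(url, headers={"User-Agent": "Mozilla/5.0"})
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.read()


def download_all(urls, out_dir, fetch=fetch_bytes, max_workers=8):
    """Download urls concurrently; return (saved, failed) dicts keyed by URL."""
    os.makedirs(out_dir, exist_ok=True)

    def job(url):
        data = fetch(url)
        # Name each file by a hash of its URL so names are filesystem-safe.
        ext = os.path.splitext(urlparse(url).path)[1] or ".jpg"
        path = os.path.join(out_dir, hashlib.sha1(url.encode("utf-8")).hexdigest() + ext)
        with open(path, "wb") as handle:
            handle.write(data)
        return path

    saved, failed = {}, {}
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        futures = {pool.submit(job, url): url for url in urls}
        for future in as_completed(futures):
            url = futures[future]
            try:
                saved[url] = future.result()
            except Exception as error:  # one bad URL should not stop the rest
                failed[url] = repr(error)
    return saved, failed
```

Threads suit this job because each download spends most of its time waiting on the network. Collecting failures in a separate dict makes it easy to retry only the URLs that did not download.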
Download original videos from YouTube
- Get URLs by keywords
python crawl_youtube_link.py
- Download videos from URLs
python crawl_videos.py
python crawl_videos.py --metadata --thumbnail # thumbnail and metadata only
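As a rough illustration of the "get URLs by keywords" step for videos, the sketch below turns keywords into YouTube search-result URLs that a crawler could then open. How `crawl_youtube_link.py` actually does this is not shown here; `youtube_search_url` and `search_urls` are hypothetical helpers.

```python
from urllib.parse import urlencode

YOUTUBE_SEARCH = "https://www.youtube.com/results"


def youtube_search_url(keyword):
    """Build the YouTube search-results URL for one keyword."""
    return YOUTUBE_SEARCH + "?" + urlencode({"search_query": keyword})


def search_urls(keywords):
    """Map each distinct, non-empty keyword to its search URL, keeping order."""
    cleaned = dict.fromkeys(k.strip() for k in keywords if k.strip())
    return {keyword: youtube_search_url(keyword) for keyword in cleaned}
```

`urlencode` takes care of escaping spaces and non-ASCII characters, so keywords can be passed in as plain text.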
- Init
- Multithreading
- Requirements
- Write Guideline
- Add an argument parser for save_dirs, chromedriver, etc.
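For the last item, one possible shape for such a parser is sketched below with `argparse`. The flag names (`--save-dir`, `--chromedriver`, `--num-threads`) and their defaults are assumptions for illustration, not options the scripts currently accept.

```python
import argparse


def build_parser():
    """Build a command-line parser with hypothetical crawler options."""
    parser = argparse.ArgumentParser(description="Crawl images or videos by keyword.")
    parser.add_argument(
        "--save-dir", default="data",
        help="directory where downloaded files are written (assumed default)",
    )
    parser.add_argument(
        "--chromedriver", default="./chromedriver",
        help="path to the unzipped ChromeDriver binary (assumed default)",
    )
    parser.add_argument(
        "--num-threads", type=int, default=8,
        help="number of download threads (assumed default)",
    )
    return parser
```

A script would then call `args = build_parser().parse_args()` and read `args.save_dir`, `args.chromedriver`, and `args.num_threads` instead of hard-coded values.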