Alibaba scraper using rotating proxies and headless Chrome from ScrapingAnt

ScrapingAnt/alibaba_scraper

Alibaba parser using scrapingant.com

This project shows how to use the ScrapingAnt scraping service to load public data from Alibaba.

ScrapingAnt takes care of the messy work of setting up a browser and proxies for crawling, so you can focus on your data.

Usage

To run this code you need a RapidAPI key. Go to the ScrapingAnt page on RapidAPI and click the "Subscribe to Test" button, then select a plan (there is a free one that includes 100 requests). After that, you can find your API key in the "X-RapidAPI-Key" field on the endpoints page.

With Docker

docker build -t alibaba_scraper . && docker run -it -v ${PWD}/data:/app/data alibaba_scraper adidas --rapidapi_key <RAPID_API_KEY>

Without Docker

This code was written for Python 3.7+.

git clone https://github.com/ScrapingAnt/alibaba_scraper.git
cd alibaba_scraper
python3 -m venv .env
.env/bin/pip install -r requirements.txt
.env/bin/python main.py --help
.env/bin/python main.py adidas --rapidapi_key <RAPID_API_KEY>

Available params

.env/bin/python main.py --help

Usage: main.py [OPTIONS] SEARCH_STRING

Options:
  --rapidapi_key TEXT             Api key from https://rapidapi.com/okami4kak/api/scrapingant  [required]
  --pages INTEGER                 Number of search pages to parse
  --country [ae|br|cn|de|es|fr|gb|hk|in|it|il|jp|nl|ru|sa|us]
                                  Country of proxies location
  --help                          Show this message and exit.
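
The options above can also be assembled from a script, for example to run several searches in a row. The following is a minimal sketch and not part of the project: `build_command` and `COUNTRIES` are hypothetical names, and the default interpreter path assumes the `.env` virtual environment created in the "Without Docker" steps.

```python
# Country codes copied from the --country choices in the help text above.
COUNTRIES = ("ae", "br", "cn", "de", "es", "fr", "gb", "hk",
             "in", "it", "il", "jp", "nl", "ru", "sa", "us")


def build_command(search_string, rapidapi_key, pages=None, country=None,
                  python=".env/bin/python"):
    """Return the argument list for one main.py run (hypothetical helper)."""
    if country is not None and country not in COUNTRIES:
        raise ValueError(f"unsupported country code: {country!r}")
    cmd = [python, "main.py", search_string, "--rapidapi_key", rapidapi_key]
    # Optional flags are added only when a value was given.
    if pages is not None:
        cmd += ["--pages", str(pages)]
    if country is not None:
        cmd += ["--country", country]
    return cmd


if __name__ == "__main__":
    # Placeholder key; replace it with your own before running the command.
    print(" ".join(build_command("adidas", "<RAPID_API_KEY>", pages=2, country="us")))
```

The returned list can be passed to `subprocess.run(cmd, check=True)` from the repository root to start the scraper.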

Sample output:

Output is saved to the data/ directory in CSV format.
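
To inspect the results, the files can be read with Python's standard library. This is a minimal sketch under stated assumptions: the files end in `.csv`, are UTF-8 encoded, and start with a header row. The file names and column names are not documented here, so the code does not rely on them. `load_rows` is a hypothetical helper name.

```python
import csv
from pathlib import Path


def load_rows(data_dir="data"):
    """Yield (file name, row dict) for every row of every CSV file in data_dir."""
    for path in sorted(Path(data_dir).glob("*.csv")):
        # Assumption: UTF-8 files whose first line holds the column names.
        with path.open(newline="", encoding="utf-8") as f:
            for row in csv.DictReader(f):
                yield path.name, row


if __name__ == "__main__":
    for name, row in load_rows():
        print(name, row)
```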