hamosapiens / scraper Public

forked from mitchdaniels/scraper

Python web scraper takes list of URLs and XPaths and outputs CSV of values

Scraper

A Python scraper that takes a list of URLs and a set of XPaths and writes the matching values to a CSV file. To use it, update config.yml with:

The source CSV, which must have a column named 'URL' containing absolute URLs (full hrefs)
The fields, each giving an XPath to evaluate against every page
The name of the output CSV
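
The flow described above can be sketched roughly as follows. This is a minimal illustration, not the code in scraper.py: the key names in CONFIG are hypothetical (this README does not document the config.yml schema), the example XPaths are invented, and page fetching is passed in as a callable so the sketch can run without network access.

```python
import csv
import io

from lxml import html  # third-party; listed under Requirements


# Hypothetical configuration. The real key names in config.yml are not
# documented in this README, so these are assumptions for illustration.
CONFIG = {
    "source": "urls.csv",
    "output": "results.csv",
    "fields": {
        "title": "//h1/text()",
        "price": "//span[@class='price']/text()",
    },
}


def extract_fields(page_source, fields):
    """Apply each XPath to one HTML document and return {field name: value}.

    Assumes every XPath selects text or attribute values, not elements.
    """
    tree = html.fromstring(page_source)
    row = {}
    for name, xpath in fields.items():
        matches = tree.xpath(xpath)
        # Join multiple matches; an XPath with no match yields an empty string.
        row[name] = " | ".join(str(m).strip() for m in matches)
    return row


def scrape(source_file, output_file, fields, fetch):
    """Read the 'URL' column from source_file and write one output row per URL.

    source_file and output_file are open text files (or file-like objects).
    fetch is any callable that maps a URL to the page's HTML text.
    """
    writer = csv.DictWriter(output_file, fieldnames=["URL"] + list(fields))
    writer.writeheader()
    for record in csv.DictReader(source_file):
        url = record["URL"]
        row = extract_fields(fetch(url), fields)
        row["URL"] = url
        writer.writerow(row)
```

In real use, the two file arguments would typically be opened from CONFIG["source"] and CONFIG["output"] with newline="", and fetch could wrap requests.get(url).text. Injecting fetch also makes it straightforward to swap in a threaded or cached downloader later.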

Requirements

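csv (standard library)
pandas (third-party)
Queue (standard library; Python 2 name, renamed queue in Python 3)
threading (standard library)
urllib2 (standard library; Python 2 only, split into urllib.request and urllib.error in Python 3)
re (standard library)
lxml (third-party)
requests (third-party)
PyYAML (third-party)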
