sharkpick/python_scraper

spammer domains scraper

This scraper grabs spammer domains from (currently two) different sources and builds a single list in the format I need for the tools I use. It could easily be adapted to dump JSON arrays or other formats too.
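As a rough idea of what the JSON adaptation could look like, here is a minimal sketch. It is not code from this repo: the function name `domains_to_json` is hypothetical, and it assumes the raw list is plain text with one domain per line, with `#` marking comment lines.

```python
import json


def domains_to_json(lines):
    """Return a JSON array string built from an iterable of raw text lines.

    Assumption (not from the repo): one domain per line, blank lines and
    lines starting with '#' are ignored.
    """
    domains = set()
    for line in lines:
        entry = line.strip().lower()
        if not entry or entry.startswith("#"):
            continue  # skip blanks and comments
        domains.add(entry)
    # Sort so the output is stable from run to run.
    return json.dumps(sorted(domains), indent=2)


if __name__ == "__main__":
    sample = ["b.example\n", "a.example\n", "# comment\n", "A.example\n"]
    print(domains_to_json(sample))
```

Deduplicating and sorting before dumping keeps the output stable, which helps if the JSON file is itself compared between runs.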

Usage:

This guy's really simple. It has to be run with Python 3, but that makes sense at this point. Just add a cron job (or a systemd timer, or whatever) to run it once daily. The blacklists don't update all that often, and after the initial runs I tend to see between 15 and 50 new domains daily.

The scraper uses the included data/ directory to store your raw lists and blacklists. The blacklists are generated by running a diff between yesterday's and today's raw lists, which makes the blacklist output especially usable. If you choose, you can delete the files from the raw/ and blacklists/ directories; blacklists will then start being generated again on the second run (but you should definitely take a look at the first raw list).
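The yesterday-versus-today comparison can be sketched as below. This is an illustration under stated assumptions, not the repo's actual code: it uses a set difference rather than a line-by-line diff, the helper names `load_domains` and `new_domains` are hypothetical, and it assumes each raw list is a text file with one domain per line.

```python
from pathlib import Path


def load_domains(path):
    """Read one raw list file into a set of non-empty, stripped lines."""
    text = Path(path).read_text(encoding="utf-8")
    return {line.strip() for line in text.splitlines() if line.strip()}


def new_domains(yesterday_path, today_path):
    """Return the sorted domains present today but absent yesterday."""
    return sorted(load_domains(today_path) - load_domains(yesterday_path))


if __name__ == "__main__":
    import sys

    # Hypothetical usage: python diff_sketch.py <yesterday_file> <today_file>
    for domain in new_domains(sys.argv[1], sys.argv[2]):
        print(domain)
```

A set difference only reports additions. If a domain drops off the upstream list, it simply never shows up in the result, which matches the "new domains daily" framing above.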

About

malware domains scraper/parser
