Have I Been Pwned (HIBP) is a well-known website that aggregates email breach data. You can search for your email address on the website, and it also offers an API.
This script and Jupyter notebook let you run your own list of email addresses against the Have I Been Pwned API to find out whether they appear in any of the breaches HIBP has loaded.
A Jupyter notebook and an equivalent script
- It's handy to be able to find out whether the email addresses in a list appear in any data breaches
- Have I Been Pwned doesn't support bulk email address lookup as of February 2021
- It's easy to extend this and analyse the addresses further with pandas
There are a couple of scripts floating around, but I didn't like the way they were written, or they used superseded versions of the API or of Python.
Thought someone else might find this useful, so figured I'd share it.
This tool takes a CSV of email addresses and queries the Have I Been Pwned API to find out whether those email addresses appear in any breaches.
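The lookup described above can be sketched roughly as follows. This is a minimal illustration, not the code in this repo: the function names `build_request` and `interpret_status` are made up for the example, and only the standard library is used. The endpoint, the `hibp-api-key` header and the meaning of the status codes are taken from the HIBP v3 docs.

```python
import urllib.parse
import urllib.request

HIBP_URL = "https://haveibeenpwned.com/api/v3/breachedaccount/"


def build_request(email, api_key, user_agent):
    """Build (but do not send) a breach lookup request for one address."""
    # The address goes in the URL path, so it has to be URL-encoded.
    url = HIBP_URL + urllib.parse.quote(email.strip(), safe="")
    headers = {"hibp-api-key": api_key, "user-agent": user_agent}
    return urllib.request.Request(url, headers=headers)


def interpret_status(status):
    """Map an HTTP status code from HIBP to a short result label."""
    if status == 200:
        return "breached"      # response body holds a JSON list of breaches
    if status == 404:
        return "not found"     # address is in no breach HIBP has loaded
    if status == 429:
        return "rate limited"  # honour the retry-after header, then retry
    return "error"
```

To actually send the request you would pass it to `urllib.request.urlopen`. Note that `urlopen` raises `urllib.error.HTTPError` for 404 and 429 responses, so in those cases the status comes from the exception's `code` attribute.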
- Python 3.9 (probably works on other 3.x versions but I haven't tested it)
- Some standard Python libs
- An HIBP API Key https://haveibeenpwned.com/API/Key
- (Don't forget to read the docs https://haveibeenpwned.com/API/v3#Authorisation)
- A CSV file containing email addresses in the first column
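For the last requirement, here is a small sketch of pulling the addresses out of the first column with the standard `csv` module. The helper name `read_emails` is hypothetical, and the sketch assumes the file has no header row; if yours does, drop the first entry.

```python
import csv
import io


def read_emails(csv_file):
    """Return the non-empty values from the first column of a CSV file object."""
    emails = []
    for row in csv.reader(csv_file):
        # Skip blank lines and rows whose first cell is empty.
        if row and row[0].strip():
            emails.append(row[0].strip())
    return emails


# In-memory example; for a real file use open(path, newline="").
sample = io.StringIO("alice@example.com,Alice\n\nbob@example.com,Bob\n")
print(read_emails(sample))
```

Any extra columns are ignored, so an export with names or other fields alongside the addresses should work unchanged.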
First, set up a virtual environment
foo@bar:~$ python3 -m venv env
Then activate it
foo@bar:~$ source env/bin/activate
In the Second Cell:
- Update the fileLocation variable to the location of your email source file
- Update the apikey and user-agent variables to your HIBP API Key and a relevant user-agent per HIBP docs https://haveibeenpwned.com/API/v3#Authorisation
Run all cells
In settings.py:
- Update the fileLocation variable to the location of your email source file
- Update the apikey and user-agent variables to your HIBP API Key and a relevant user-agent per HIBP docs https://haveibeenpwned.com/API/v3#Authorisation
- Save changes
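As a rough idea of what the edited settings might look like, here is a sketch with placeholder values. The names `fileLocation` and `apikey` come from the steps above; `useragent` is a guess, since `user-agent` is not a valid Python identifier, so check the actual name in settings.py.

```python
# settings.py (illustrative values only)
fileLocation = "emails.csv"       # hypothetical path to your CSV of addresses
apikey = "your-hibp-api-key"      # placeholder, not a real key
useragent = "my-hibp-bulk-check"  # assumed variable name for the user-agent
```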
And back in your terminal run
foo@bar:~$ python3 hibp.py