Source code for a Django, PostgreSQL, and AWS Lambda-based tool for scraping data from seasonaljobs.dol.gov, deduplicating employer records, and importing data from other sources.
Code (c) Research Action Design, LLC. Originally produced for Centro de los Derechos del Migrante, Inc.
Released under the GPL v3 license; see the LICENSE file for the full text.
Create a migration using alembic by running
python -m alembic revision --autogenerate -m "<MESSAGE>"
Apply pending migrations to the database by running
python -m alembic upgrade head
The lambda function is built within the container specified by lambda.Dockerfile.
Build the lambda function with sam build.
On initial deploy you will need to do the following:
- Work around the circular dependency between the S3 bucket and its Lambda event notification; for possible approaches, see https://aws.amazon.com/blogs/mt/resolving-circular-dependency-in-provisioning-of-amazon-s3-buckets-with-aws-lambda-event-notifications/
- If the initial deployment fails, the stack will be stuck in a ROLLBACK state, and you will need to run
sam delete
before retrying the initial deployment.
- Build the lambda function with
sam build
- Run
sam deploy --profile cdm
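To tell whether a failed initial deployment has left the stack in a state that requires sam delete, the stack status can be checked before retrying. The following is a minimal sketch, not part of this repository: the function names and the stack name passed in are hypothetical, and it assumes boto3 is installed and the cdm profile is configured.

```python
# Statuses in which a stack that failed during creation cannot be updated
# and has to be deleted before deploying again.
DELETE_BEFORE_RETRY = {"ROLLBACK_COMPLETE", "ROLLBACK_FAILED"}


def needs_delete_before_retry(stack_status: str) -> bool:
    """Return True if the stack must be deleted before the next deploy."""
    return stack_status in DELETE_BEFORE_RETRY


def get_stack_status(stack_name: str, profile: str = "cdm") -> str:
    """Look up the current CloudFormation status of a stack.

    stack_name is whatever name was given to sam deploy (hypothetical here).
    """
    import boto3  # imported lazily so the helper above works without boto3

    session = boto3.Session(profile_name=profile)
    cloudformation = session.client("cloudformation")
    response = cloudformation.describe_stacks(StackName=stack_name)
    return response["Stacks"][0]["StackStatus"]
```

If needs_delete_before_retry(get_stack_status("<STACK NAME>")) returns True, run sam delete before building and deploying again.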
Note: The CloudFormation template does not reliably set up the S3 bucket triggers. You may find it easier to add them through the AWS console UI.
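As an alternative to the console, the triggers mentioned in the note could also be attached with a script. This is a rough sketch under several assumptions: the bucket name, function ARN, key prefix, and the s3:ObjectCreated:* event type are placeholders rather than values taken from this project's template, and the helper names are hypothetical.

```python
def build_lambda_notification(lambda_arn: str, prefix: str = "") -> dict:
    """Build an S3 notification configuration that invokes one Lambda."""
    config = {
        "LambdaFunctionArn": lambda_arn,
        # Assumption: the function should fire when objects are created.
        "Events": ["s3:ObjectCreated:*"],
    }
    if prefix:
        # Optionally restrict the trigger to keys under a given prefix.
        config["Filter"] = {
            "Key": {"FilterRules": [{"Name": "prefix", "Value": prefix}]}
        }
    return {"LambdaFunctionConfigurations": [config]}


def attach_trigger(bucket: str, lambda_arn: str, prefix: str = "",
                   profile: str = "cdm") -> None:
    """Apply the notification configuration to the bucket."""
    import boto3  # imported lazily so the builder above works without boto3

    s3 = boto3.Session(profile_name=profile).client("s3")
    s3.put_bucket_notification_configuration(
        Bucket=bucket,
        NotificationConfiguration=build_lambda_notification(lambda_arn, prefix),
    )
```

Two caveats apply. put_bucket_notification_configuration replaces the bucket's entire existing notification configuration, so any other triggers on the bucket need to be included in the same call. The Lambda function must also already grant S3 permission to invoke it, otherwise the call is rejected.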
- Start up postgres by running
docker-compose up
- Run
pipenv run python interactive_dedupe_session.py