Template for Data Engineering and Data Pipeline projects
This is a high-level description of the project and what it is trying to accomplish.
- Add your requirements to the requirements.txt file for Python pip packages.
- Add any necessary installations to the Dockerfile.
This is a high-level description of the tool(s) and the decisions around why those tool(s) were chosen.
These are instructions on how to test this repo. All tests are located inside the tests folder. We are using pytest.
Run the following steps.
- docker build --tag my-project .
- docker-compose up test
Add your unit tests to files inside the tests folder, and name your files test_somename.py.
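As a minimal sketch of what such a test file might look like, the example below assumes a hypothetical transform function, `normalize_record`, which is not part of this template and is defined inline only to keep the example self-contained. In a real project the function would likely live in your pipeline code and be imported into the test file. pytest collects functions whose names start with `test_` in files named `test_*.py`.

```python
# tests/test_somename.py (hypothetical example file)


def normalize_record(record):
    """Hypothetical transform: lowercase the keys and strip whitespace from string values."""
    return {
        key.lower(): value.strip() if isinstance(value, str) else value
        for key, value in record.items()
    }


def test_normalize_record_cleans_keys_and_values():
    # Keys are lowercased and surrounding whitespace is removed from strings.
    raw = {"Name": "  Ada ", "CITY": "London"}
    assert normalize_record(raw) == {"name": "Ada", "city": "London"}


def test_normalize_record_keeps_non_string_values():
    # Non-string values pass through unchanged.
    raw = {"AGE": 36, "Active": True}
    assert normalize_record(raw) == {"age": 36, "active": True}
```

Plain `assert` statements are enough here; pytest rewrites them to report the compared values when a test fails.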
High-level description of the data source(s) and sink(s), as well as the general pattern and data flow through the pipeline. Discuss any assumptions made.
If you have your own hooks, you can add them to git-hooks.
Use this command to add them to the appropriate folder then commit.
sh git-hooks/copy_hooks.sh
Whatever git-hooks/copy_hooks.sh copies will replace any hooks set up using pre-commit.