- Create a DWH for storing pricing data from various cloud providers.
- Create orchestration that will pull data from the sources periodically.
- Create a dashboard that will show relevant metrics.
- Create a workflow to automatically deploy all of this.
- Cloud: AWS
- Containerization: Docker with Docker-Compose
- Infrastructure: Terraform
- DWH: Redshift with dbt Core
- Orchestration: Prefect
- Data Transformation: Pandas
- Data Visualization: Metabase
If you want to run it locally, see Local setup.
If you want to run it with GitHub Actions, see GitHub Setup.
- Azure price list API.
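As a rough illustration of pulling from a paginated price list, here is a minimal sketch. It assumes the source is the public Azure Retail Prices endpoint and that each response page carries an `Items` list and a `NextPageLink` field; the project's actual extraction code may differ. The names `AZURE_PRICES_URL`, `http_get_json`, and `iter_price_items` are hypothetical and used only for this example.

```python
import json
import urllib.request
from typing import Callable, Iterator

# Assumption: the public Azure Retail Prices endpoint. The project may use a different URL.
AZURE_PRICES_URL = "https://prices.azure.com/api/retail/prices"


def http_get_json(url: str) -> dict:
    """Fetch a URL and decode its JSON body using only the standard library."""
    with urllib.request.urlopen(url, timeout=30) as response:
        return json.load(response)


def iter_price_items(
    first_url: str,
    fetch_json: Callable[[str], dict] = http_get_json,
) -> Iterator[dict]:
    """Yield every price record, following NextPageLink until no next page is given."""
    url = first_url
    while url:
        page = fetch_json(url)
        # Assumption: records live under "Items" and the next page URL under "NextPageLink".
        yield from page.get("Items", [])
        url = page.get("NextPageLink")
```

Passing the fetcher in as a parameter keeps the pagination logic testable without network access; calling `iter_price_items(AZURE_PRICES_URL)` would use the real HTTP fetcher.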
- Lower-powered AWS spot instances often cost as much as on-demand. Even when spot capacity is available, the savings are much smaller than typically advertised.
- Bigger instances come with bigger discounts. For example, m5a.large spot instances are 44% cheaper than on-demand, while a1.medium spot instances are the same price as on-demand.
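The discount figures above are the relative difference between the on-demand and spot price. A minimal sketch of that arithmetic follows; the function name `spot_discount_pct` is hypothetical, and the hourly prices in the usage lines are made-up placeholders chosen to reproduce the percentages, not real AWS prices.

```python
def spot_discount_pct(on_demand_price: float, spot_price: float) -> float:
    """Return how much cheaper spot is than on-demand, as a percentage."""
    if on_demand_price <= 0:
        raise ValueError("on_demand_price must be positive")
    return round((on_demand_price - spot_price) / on_demand_price * 100, 1)


# Placeholder hourly prices for illustration only (not real AWS prices).
large_discount = spot_discount_pct(on_demand_price=0.100, spot_price=0.056)
no_discount = spot_discount_pct(on_demand_price=0.025, spot_price=0.025)
print(large_discount, no_discount)
```

A result of zero corresponds to the a1.medium case, where spot offers no saving over on-demand.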
- Add GCP Data.
- Add persistence for Metabase.
- Add data quality tests to dbt flow.
Thanks to the instructors:
Thanks to colleagues:
- Anna Geller, her articles on Prefect DataOps have been a huge influence.
- Andy Nelson, for telling me about ZoomCamp and generally being a great mentor.
- Matt Little, his articles on Terraform and AWS were what made this entire project possible.