lucasabrantes1/BEES-DATA-ENGINEERING-BREWERIES-CASE

Summary

Introduction

This repository contains a case study built on a dataset of breweries by location and walks through the entire data engineering workflow. The following tools were used:

  • Airflow for orchestration
  • Docker
  • PySpark
  • Databricks
  • Python for API requests
  • GCP







Step-by-Step Instructions: Docker

Navigate to the project directory:

   cd ~/bees-data-engineering-breweries-case

⚠️ Warning: If you are running this project on Windows using VS Code, ensure that the start_airflow.sh file is set to LF (Line Feed) line endings to avoid errors.
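If you would rather check and fix the line endings from a terminal (for example Git Bash or WSL) than from the VS Code status bar, the following is a minimal sketch. The helper names to_lf and has_crlf are hypothetical and not part of this repository, and the sketch assumes start_airflow.sh sits in the current directory.

```shell
# Convert a file from CRLF to LF line endings in place.
# Uses tr plus a temp file so it works with both GNU and BSD tools.
to_lf() {
  file="$1"
  tmp="$(mktemp)" || return 1
  # Delete every carriage return, then write the result back over the
  # original (cat > keeps the file's existing permissions).
  tr -d '\r' < "$file" > "$tmp" && cat "$tmp" > "$file"
  status=$?
  rm -f "$tmp"
  return "$status"
}

# Succeed (exit 0) if any line of the file ends with a carriage return.
has_crlf() {
  grep -q "$(printf '\r')\$" "$1"
}

# Only touch the script if it exists here and actually needs converting.
if [ -f start_airflow.sh ] && has_crlf start_airflow.sh; then
  to_lf start_airflow.sh
fi
```

Running the block a second time is harmless: once the file has LF endings, has_crlf fails and the conversion is skipped.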


Start the Docker containers:

docker-compose up -d

Verify that the containers are running. You should see two containers: one for Airflow and one for SQL Server.

docker ps
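The same check can be scripted instead of read by eye. The sketch below counts running containers whose names start with the project prefix. It assumes both containers share the bees-data-engineering-breweries-case- prefix seen in the Airflow container name, which may not hold if your Compose project is named differently. The helper count_with_prefix is hypothetical.

```shell
# Count container names (one per line on stdin) that start with a prefix.
count_with_prefix() {
  # grep -c prints the number of matching lines; "|| true" keeps a
  # zero-match result from being treated as a failure.
  grep -c -- "^$1" || true
}

# Assumed Compose project prefix, taken from the Airflow container name.
PREFIX="bees-data-engineering-breweries-case-"

# Only query Docker when the CLI is available on this machine.
if command -v docker >/dev/null 2>&1; then
  running="$(docker ps --format '{{.Names}}' | count_with_prefix "$PREFIX")"
  echo "running project containers: $running (expected 2)"
fi
```

If the count is lower than 2, docker ps -a also lists containers that have exited, which is usually the quickest way to see which one failed to start.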

Access the Airflow container:

docker exec -it bees-data-engineering-breweries-case-airflow-1 sh

Create an Airflow user. Inside the Airflow container, run the following command to create an admin user with the username admin and password admin:

airflow users create \
  --username admin \
  --password admin \
  --firstname Admin \
  --lastname User \
  --role Admin \
  --email admin@example.com

Exit the Airflow container:

exit

Stop and remove the Docker containers:

docker-compose down







Steps to Start Airflow Without Example DAGs

Access the Airflow Container:

docker exec -it bees-data-engineering-breweries-case-airflow-1 sh

Find the Airflow Configuration File:

find / -name "airflow.cfg"

Edit the Airflow Configuration File (replace /path/to/airflow.cfg with the path returned by find):

vi /path/to/airflow.cfg

Disable the Example DAGs, either manually in the editor or with sed:

sed -i 's/load_examples = True/load_examples = False/' /path/to/airflow.cfg
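The sed command above only matches the exact text load_examples = True. As a hedged alternative, the sketch below wraps the edit in a small helper that tolerates different spacing around the equals sign and can flip the setting in either direction. The function name set_load_examples is hypothetical, and the demo runs against a scratch file rather than a real airflow.cfg.

```shell
# Set load_examples in an airflow.cfg-style file to True or False.
set_load_examples() {
  cfg="$1"
  value="$2"
  tmp="$(mktemp)" || return 1
  # Match the key at the start of a line, tolerating spaces around "=",
  # and rewrite the whole line with the requested value.
  sed "s/^load_examples[[:space:]]*=.*/load_examples = $value/" "$cfg" > "$tmp" \
    && cat "$tmp" > "$cfg"
  status=$?
  rm -f "$tmp"
  return "$status"
}

# Demo on a scratch file so nothing real is modified.
demo_cfg="$(mktemp)"
printf '[core]\nload_examples = True\n' > "$demo_cfg"
set_load_examples "$demo_cfg" False
grep '^load_examples' "$demo_cfg"   # prints: load_examples = False
rm -f "$demo_cfg"
```

Inside the container you would call it with the real path, for example set_load_examples /path/to/airflow.cfg False, and then restart the container as described in the next step.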

Restart Airflow:

exit
docker restart bees-data-engineering-breweries-case-airflow-1

Re-enable the Example DAGs:
To re-enable the example DAGs, change load_examples = False back to load_examples = True and restart the container:

docker exec -it bees-data-engineering-breweries-case-airflow-1 sh
sed -i 's/load_examples = False/load_examples = True/' /path/to/airflow.cfg
exit
docker restart bees-data-engineering-breweries-case-airflow-1
