This project is an example of an end-to-end data pipeline.
It uses the following components:
- PyFlink
- Kafka (with ZooKeeper)
- Postgres
- Airflow
- dbt
- Metabase
Normally these services would be spread across a few VMs, but everything runs on a single host here for demonstration purposes.
Clone the git repo, then bring the stack up:
docker compose up --build -d
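You can confirm that all of the containers came up with:
docker compose ps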
The Postgres database will be initialized automatically; you will see a users, an events, and an attributed_successful_applications table.
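To inspect the tables directly (the container, user, and database names here are assumptions; adjust them to match docker-compose.yml):
docker exec -it postgres psql -U postgres -d postgres -c '\dt'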
A separate process will generate users, click events, and job application events, then proceed to populate the tables.
It will generate 500 users and 2,000 click events.
The central point of this setup is to simulate click events and attribute successful job applications to users from those events, processed in "real time" through Flink.
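For orientation, here is a minimal sketch of what a job like code/attribute_successful_applications.py could look like using the PyFlink Table API. The topic names and broker address come from this README; every field name, schema, credential, and the one-hour attribution rule are illustrative assumptions, not the project's actual code.

```python
# Sketch only: field names, schemas, credentials, and the attribution
# window below are assumptions, not this repo's actual job.
from pyflink.table import EnvironmentSettings, TableEnvironment

t_env = TableEnvironment.create(EnvironmentSettings.in_streaming_mode())

# Source: click events from the `clicks` Kafka topic.
t_env.execute_sql("""
    CREATE TABLE clicks (
        user_id BIGINT,
        click_id STRING,
        event_time TIMESTAMP(3),
        WATERMARK FOR event_time AS event_time - INTERVAL '5' SECOND
    ) WITH (
        'connector' = 'kafka',
        'topic' = 'clicks',
        'properties.bootstrap.servers' = 'kafka:9092',
        'properties.group.id' = 'attribution',
        'scan.startup.mode' = 'earliest-offset',
        'format' = 'json'
    )
""")

# Source: job application events from the `applications` Kafka topic.
t_env.execute_sql("""
    CREATE TABLE applications (
        user_id BIGINT,
        application_id STRING,
        successful BOOLEAN,
        event_time TIMESTAMP(3),
        WATERMARK FOR event_time AS event_time - INTERVAL '5' SECOND
    ) WITH (
        'connector' = 'kafka',
        'topic' = 'applications',
        'properties.bootstrap.servers' = 'kafka:9092',
        'properties.group.id' = 'attribution',
        'scan.startup.mode' = 'earliest-offset',
        'format' = 'json'
    )
""")

# Sink: attributed rows written to Postgres over JDBC (URL and
# credentials are placeholders).
t_env.execute_sql("""
    CREATE TABLE attributed_successful_applications (
        user_id BIGINT,
        click_id STRING,
        application_id STRING
    ) WITH (
        'connector' = 'jdbc',
        'url' = 'jdbc:postgresql://postgres:5432/postgres',
        'table-name' = 'attributed_successful_applications',
        'username' = 'postgres',
        'password' = 'postgres'
    )
""")

# Attribute each successful application to clicks by the same user in the
# preceding hour (an interval join; the window choice is an assumption).
t_env.execute_sql("""
    INSERT INTO attributed_successful_applications
    SELECT a.user_id, c.click_id, a.application_id
    FROM applications a
    JOIN clicks c
      ON a.user_id = c.user_id
     AND c.event_time BETWEEN a.event_time - INTERVAL '1' HOUR AND a.event_time
    WHERE a.successful
""").wait()
```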
Open the Flink JobManager UI in your browser at http://localhost:8081.
You will need to create the topics:
docker exec -it kafka /bin/bash
kafka-topics.sh --create --bootstrap-server kafka:9092 --replication-factor 1 --partitions 1 --topic clicks
kafka-topics.sh --create --bootstrap-server kafka:9092 --replication-factor 1 --partitions 1 --topic applications
exit
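To confirm both topics were created (assuming kafka-topics.sh is on the container's PATH, as in the session above):
docker exec kafka kafka-topics.sh --list --bootstrap-server kafka:9092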
Run the Flink job:
docker exec jobmanager ./bin/flink run --python code/attribute_successful_applications.py
Or simply run the included helper script:
./run.sh
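Once submitted, the job should appear as RUNNING in the JobManager UI, and rows should begin landing in the attributed_successful_applications table as events flow through.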