Apache Airflow - A platform to programmatically author, schedule, and monitor workflows
RAGFlow is an open-source RAG (Retrieval-Augmented Generation) engine based on deep document understanding.
An orchestration platform for the development, production, and observation of data assets.
🧙 Build, run, and manage data pipelines for integrating and transforming data.
Python ETL framework for stream processing, real-time analytics, LLM pipelines, and RAG.
Meltano: the declarative code-first data integration engine that powers your wildest data and ML-powered product ideas. Say goodbye to writing, maintaining, and scaling your own API integrations.
One framework to develop, deploy and operate data workflows with Python and SQL.
dbt package that is part of Elementary, the dbt-native data observability solution for data & analytics engineers. Monitor your data pipelines in minutes. Available as self-hosted or cloud service with premium features.
Work with your web service, database, and streaming schemas in a single format.
Main repo including core data model, data marts, reference data, terminology, and the clinical concept library
Relational data pipelines for the science lab
Cloud-native, data onboarding architecture for Google Cloud Datasets
Data pipelines from re-usable components
Developed a data pipeline to automate data warehouse ETL by building custom Airflow operators that handle the extraction, transformation, validation, and loading of data from S3 -> Redshift -> S3
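The custom-operator pattern described above can be sketched in plain Python without an Airflow dependency. This is an illustrative stand-in only: in real Airflow the class would subclass airflow.models.BaseOperator and `execute` would receive a task context, and the extract/validate/load callables here are hypothetical placeholders for S3 and Redshift I/O.

```python
# Minimal sketch of a custom ETL operator (extract -> validate -> load),
# mirroring the Airflow operator pattern. No Airflow dependency; all
# names are illustrative, not a real Airflow or AWS API.

class S3ToRedshiftOperator:
    """Extracts records from a source, validates them, then loads them."""

    def __init__(self, extract, validate, load):
        # extract:  () -> list of records
        # validate: record -> bool
        # load:     list of records -> None
        self.extract = extract
        self.validate = validate
        self.load = load

    def execute(self):
        records = self.extract()
        valid = [r for r in records if self.validate(r)]
        self.load(valid)
        return len(valid)


# Usage with in-memory stand-ins for the S3 source and Redshift target:
source = [{"id": 1, "amount": 10.0}, {"id": 2, "amount": None}]
warehouse = []

op = S3ToRedshiftOperator(
    extract=lambda: source,
    validate=lambda r: r["amount"] is not None,
    load=warehouse.extend,
)
loaded = op.execute()  # only the record with a non-null amount is loaded
```

Separating the three callables keeps each stage independently testable, which is the main benefit the repo description claims for custom operators over a monolithic ETL script.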
Build data pipelines with SQL and Python, ingest data from different sources, add quality checks, and build end-to-end flows.
Found a data engineering challenge or participated in a selection process? Share it with us!
Conductor OSS SDK for Python programming language
Easiest way to monitor asynchronous data pipelines
The practical use-cases of how to make your Machine Learning Pipelines robust and reliable using Apache Airflow.