Metal is data flow modeling software that manages data flow processing operators, supports visual modeling, and submits batch jobs.
If you often use Spark SQL to develop ETL pipelines and have accumulated a large number of DTD (DataFrame To DataFrame) operators or operations, you can adapt them to the Metal plugin specification and use Metal to manage them as plugins.
With Metal, you can easily reuse these plugins. A data flow is composed of plugins, and Metal provides two ways to build one:
- The first is the CLI style. You write a spec file that configures the structure of the data flow and the parameters of its data processing operators.
- The second is the visual style. Metal provides a Web UI for data flow design, metal-ui, which is a simple integrated development environment for data flows. Compared with the CLI style, metal-ui makes data flows easier to configure. It manages each data flow as a Project. In metal-ui, you can create and configure projects, draw data flows, track data processing tasks, manage operator plugins, and more.
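To make the idea of a data flow composed of parameterized operators more concrete, here is a minimal, self-contained sketch in plain Python. It is not Metal's plugin API or spec format: the `SPEC` layout, the operator names (`source`, `filter`, `select`), and the `run_flow` helper are all hypothetical, and lists of dicts stand in for Spark DataFrames. It only illustrates the general pattern of a spec that names operators, sets their parameters, and wires them together.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+

# Each operator takes a list of upstream results plus its parameters and
# returns new rows. Lists of dicts stand in for Spark DataFrames here.

def source(inputs, params):
    """Emit the rows listed in the parameters (stand-in for reading a table)."""
    return [dict(row) for row in params["rows"]]

def filter_rows(inputs, params):
    """Keep rows whose `column` value is at least `min_value`."""
    (rows,) = inputs
    return [r for r in rows if r[params["column"]] >= params["min_value"]]

def select(inputs, params):
    """Project each row onto the requested columns."""
    (rows,) = inputs
    return [{c: r[c] for c in params["columns"]} for r in rows]

# Hypothetical operator registry, playing the role of managed plugins.
OPERATORS = {"source": source, "filter": filter_rows, "select": select}

# Hypothetical spec (NOT Metal's actual format): nodes carry an operator
# name and its parameters, edges describe the structure of the flow.
SPEC = {
    "nodes": {
        "orders": {"op": "source", "params": {"rows": [
            {"id": 1, "amount": 5},
            {"id": 2, "amount": 50},
            {"id": 3, "amount": 120},
        ]}},
        "large": {"op": "filter", "params": {"column": "amount", "min_value": 50}},
        "ids": {"op": "select", "params": {"columns": ["id"]}},
    },
    "edges": [("orders", "large"), ("large", "ids")],
}

def run_flow(spec):
    """Run every node after its upstream nodes and return all results."""
    upstream = {name: [] for name in spec["nodes"]}
    for src, dst in spec["edges"]:
        upstream[dst].append(src)
    results = {}
    # static_order() yields each node only after all of its predecessors.
    for name in TopologicalSorter(upstream).static_order():
        node = spec["nodes"][name]
        inputs = [results[u] for u in upstream[name]]
        results[name] = OPERATORS[node["op"]](inputs, node["params"])
    return results

results = run_flow(SPEC)
print(results["ids"])  # [{'id': 2}, {'id': 3}]
```

Keeping operators behind a registry and describing the flow as data is what allows the same operators to be reused across flows, whether the description is written by hand or produced by a visual editor.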
- Supports the Spark SQL batch processing engine
- Supports reuse and management of processing operators
- Supports spark-submit command line submission
- Provides REST-API service
- Supports visual construction of data flows
- Supports operator extension
- Provides a packaging tool
- Provides Web-UI
- Supports user-level and project-level resource isolation
Thanks for your interest in contributing! The easiest way is to send a pull request (PR). Before sending a PR, you need to know how to build the source code and check the code format.
Building Metal requires JDK 11 or later. Pull the latest source from the repository and use Maven install (or package) to build:
git pull origin master
mvn clean package -pl metal-dist -am -Dmaven.test.skip=true
Please check the code format and fix any Spotless errors:
mvn spotless:check
See Contributing.md for more details.
Thanks to JetBrains for the free license.