A framework for developing production-grade Spark jobs.
The key concept of this framework is the Run Context, which is made up of three major components:

- The Spark session, which you use for transformations and actions
- The parameters, parsed from the arguments passed on the `spark-submit` command line
- The config, which holds a separate configuration for each environment

The framework manages those three components for you, so you can focus on the actual business logic: the DataFrame transformations.
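As a rough sketch, such a Run Context could be modeled as a single value that bundles the three components. Everything below (`RunContext`, `JobParams`, the naive argument parsing, the Typesafe `Config` loading) is an illustrative assumption, not this framework's actual API:

```scala
import com.typesafe.config.{Config, ConfigFactory}
import org.apache.spark.sql.SparkSession

// Hypothetical holder for the parsed spark-submit arguments.
final case class JobParams(env: String, inputPath: String, outputPath: String)

// Hypothetical Run Context: one value carrying the session, the parameters,
// and the environment-specific config, so jobs only deal with DataFrames.
final case class RunContext(spark: SparkSession, params: JobParams, config: Config)

object RunContext {
  // Assumed wiring: naive positional parsing, then session and config setup.
  def apply(args: Array[String]): RunContext = {
    val params = JobParams(env = args(0), inputPath = args(1), outputPath = args(2))
    val spark  = SparkSession.builder().appName("example-job").getOrCreate()
    // Loads e.g. application-dev.conf from the classpath for env "dev".
    val config = ConfigFactory.load(s"application-${params.env}")
    RunContext(spark, params, config)
  }
}
```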
To build and run the test cases, simply run `sbt test`.
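For orientation, a test against a local session typically has the following shape; this is a generic ScalaTest sketch, not necessarily how this repository structures its suites:

```scala
import org.apache.spark.sql.SparkSession
import org.scalatest.funsuite.AnyFunSuite

// Generic example of testing a DataFrame transformation on a local session.
class WordCountSpec extends AnyFunSuite {
  private lazy val spark: SparkSession =
    SparkSession.builder().master("local[2]").appName("test").getOrCreate()

  test("counts occurrences of a word") {
    import spark.implicits._
    val counts = Seq("a", "b", "a").toDF("word").groupBy("word").count()
    // groupBy("word").count() yields columns (word, count); count is a Long.
    assert(counts.filter($"word" === "a").head().getLong(1) == 2L)
  }
}
```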
To package the JAR, run `make build-spark-jar`.
The `sbt-assembly` plugin is used to build a fat JAR with the dependencies included.
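For reference, the wiring behind a fat-JAR build usually looks like the sketch below; the plugin version, Spark version, and main class are assumptions, so check the repository for the exact values. The `make build-spark-jar` target then most likely just wraps `sbt assembly`:

```scala
// project/plugins.sbt: registers the assembly task
// (the plugin version here is an assumption).
addSbtPlugin("com.eed3si9n" % "sbt-assembly" % "2.1.5")

// build.sbt: typical settings for a Spark fat JAR. Spark itself is marked
// Provided because spark-submit supplies it at runtime, keeping the JAR lean.
libraryDependencies += "org.apache.spark" %% "spark-sql" % "3.5.1" % Provided
assembly / mainClass := Some("com.example.ExampleJob") // hypothetical entry point
```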
To run the example Spark jobs in local mode, run `make submit-job`.
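Presumably this target assembles the JAR and launches it through `spark-submit` with a `local[*]` master. A job entry point built on the Run Context sketched earlier could look like this (all names are illustrative):

```scala
import org.apache.spark.sql.functions.col

// Hypothetical example job: the session/parameter/config plumbing lives in
// the RunContext, so the job body is pure DataFrame transformation logic.
object ExampleJob {
  def main(args: Array[String]): Unit = {
    val ctx = RunContext(args)
    ctx.spark.read
      .parquet(ctx.params.inputPath)
      .filter(col("amount") > 0) // the actual business logic
      .write
      .mode("overwrite")
      .parquet(ctx.params.outputPath)
    ctx.spark.stop()
  }
}
```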