Releases: pipeline-tools/gusty-demo
General Updates
- Using jupyter/r-notebook as a base image to decrease build time.
- General updating of requirements.txt.
Support for .py Tasks
gusty 0.5.0 introduces an intuitive way to use .py files as task files. The full release notes for gusty 0.5.0 can be found here.
In short, by default, gusty treats a .py file in a DAG directory as a file to be executed by Airflow's PythonOperator, meaning your .py file can say as little as:
print("this will become an operator")
and gusty will take care of the rest.
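As a rough illustration of this default behaviour, the sketch below wraps a .py file in a zero-argument function that runs the file, which is the kind of callable a PythonOperator expects. This is not gusty's actual implementation: the helper name make_file_callable, the file name hello_task.py, and the use of runpy are all assumptions made for illustration.

```python
import runpy
import tempfile
from pathlib import Path


def make_file_callable(path):
    """Return a zero-argument function that executes the .py file at `path`.

    Hypothetical helper: gusty's real mechanism may differ.
    """
    def run_file():
        # run_name="__main__" makes the file behave as if run as a script
        runpy.run_path(str(path), run_name="__main__")
    return run_file


# Write a task file that contains nothing but a print statement
task_dir = Path(tempfile.mkdtemp())
task_file = task_dir / "hello_task.py"
task_file.write_text('print("this will become an operator")\n')

# The resulting function could be handed to PythonOperator(python_callable=...)
python_callable = make_file_callable(task_file)
python_callable()
```

The point of the sketch is only that a task file needs no function definitions of its own: running the whole file can itself be the callable.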
Of course, .py files can be configured just like any other task file. For this, gusty reads a commented block of YAML-style front matter, in the following format, at the beginning of your file.
# ---
# operator: airflow.operators.python.PythonVirtualenvOperator
# requirements:
# - "siuba==0.0.24"
# python_callable: print_phrase
# ---
def print_phrase():
    print("this will become an operator")
We designated a new operator, whose arguments include requirements, and we specified a python_callable, which tells gusty that there is a defined function named print_phrase that we want to use as the python_callable function for our operator.
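To make the file layout concrete, here is a minimal sketch of how the commented block shown above could be separated from the Python code that follows it. The function split_front_matter is a hypothetical helper, not part of gusty, and it only strips the comment markers; the resulting text would still need to be handed to a YAML parser to obtain the operator, requirements, and python_callable values.

```python
def split_front_matter(source):
    """Split a .py task file into (spec_text, python_body).

    Hypothetical helper, not part of gusty: it expects the file to open with
    a block of comment lines delimited by "# ---" lines.
    """
    lines = source.splitlines()
    if not lines or lines[0].strip() != "# ---":
        return "", source  # no front matter: the whole file is the body

    spec_lines = []
    for index, line in enumerate(lines[1:], start=1):
        if line.strip() == "# ---":
            body = "\n".join(lines[index + 1:])
            return "\n".join(spec_lines), body
        # Drop the leading "#" and at most one following space
        uncommented = line[1:] if line.startswith("#") else line
        if uncommented.startswith(" "):
            uncommented = uncommented[1:]
        spec_lines.append(uncommented)

    return "", source  # closing delimiter never found


example = '''# ---
# operator: airflow.operators.python.PythonVirtualenvOperator
# requirements:
#   - "siuba==0.0.24"
# python_callable: print_phrase
# ---
def print_phrase():
    print("this will become an operator")
'''

spec_text, body = split_front_matter(example)
print(spec_text)
```

Because every configuration line is a comment, the file remains valid Python and can still be run or imported on its own.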
Please note gusty will only search for/attach an actual callable function to python_callable when the operator inherits from Airflow's PythonOperator class.
Support for Task Groups
gusty 0.3.0 provides full support for Airflow 2.x task groups through the use of subfolders in a DAG directory, as illustrated in the updated breakfast DAG.
gusty 0.3.0 task group support includes:
- task_group_defaults - Pass a default task group configuration as a dictionary to the create_dag function, which will take any parameter from Airflow's TaskGroup class.
- METADATA.yml support - Task groups can also be configured with a METADATA.yml file in their folder, just like DAGs.
- prefix_group_id off by default - Explicitly set when you want a task group's id prefixed to each task group's task.
- suffix_group_id - A gusty exclusive! Explicitly set when you want a task group's id suffixed to each task group's task.
- smart dependencies (dependency levels) - A task can depend on any other task, regardless of task group. Task groups can depend on only other tasks / task groups that are their siblings.
- smart dependencies (prefixes/suffixes) - If tasks depend on other tasks in their task group, where the task group specifies a prefix or a suffix, the prefixed/suffixed dependency id will be added to the list of task group dependencies automatically.
The full gusty 0.3.0 release notes can be found here.
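The prefix/suffix options and the matching dependency handling described in the list above can be pictured with a small, self-contained sketch. Everything here is an assumption for illustration: the helper names grouped_task_id and resolve_dependencies, the underscore separator, and the example task and group names are hypothetical and do not reflect gusty's internals or its exact id format.

```python
def grouped_task_id(task_id, group_id, prefix=False, suffix=False, sep="_"):
    """Return the task id after applying a task group's prefix/suffix setting.

    Hypothetical helper: the separator and exact id format are assumptions.
    """
    if prefix:
        task_id = f"{group_id}{sep}{task_id}"
    if suffix:
        task_id = f"{task_id}{sep}{group_id}"
    return task_id


def resolve_dependencies(dependencies, group_task_ids, group_id, **naming):
    """Rewrite dependencies on sibling tasks to their prefixed/suffixed ids.

    Dependencies that name tasks outside the group are left unchanged.
    """
    return [
        grouped_task_id(dep, group_id, **naming) if dep in group_task_ids else dep
        for dep in dependencies
    ]


# A task group "morning" containing two tasks, with suffixing switched on
group_tasks = {"toast", "coffee"}
deps = resolve_dependencies(["toast", "wake_up"], group_tasks, "morning", suffix=True)
print(deps)
```

In this sketch a task file can keep referring to its sibling as toast, while the dependency is resolved against the renamed id; the dependency on wake_up, which lives outside the group, passes through untouched.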
Airflow 2.x Demo
Airflow 2.0 is here and operators can now come from many different places ("providers"). To account for this, gusty 0.2.0 requires that a full module.operator string be passed to the operator parameter in a YAML spec. This demo release is structurally the same as the Airflow 1.x Demo, but accounts for the package/application updates of both Airflow 2.0 and gusty 0.2.0.
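To show why a full module.operator string is enough to locate an operator regardless of which package provides it, here is a minimal sketch that resolves such a string with importlib. The function resolve_operator is a hypothetical helper, not gusty's API, and a standard-library class stands in for an Airflow operator so the example runs without Airflow installed.

```python
import importlib


def resolve_operator(dotted_path):
    """Import and return the class named by a full module.operator string.

    Hypothetical helper, not gusty's API.
    """
    module_name, _, class_name = dotted_path.rpartition(".")
    if not module_name:
        raise ValueError(
            f"expected a full module.operator string, got {dotted_path!r}"
        )
    module = importlib.import_module(module_name)
    return getattr(module, class_name)


# With Airflow 2.x installed, the string would look like
# "airflow.operators.bash.BashOperator"; a standard-library class stands in
# here so the example runs without Airflow.
cls = resolve_operator("collections.OrderedDict")
print(cls.__name__)
```

A bare name such as BashOperator carries no module information, which is why the sketch rejects it instead of guessing where the class lives.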
Airflow 1.x Demo
The v0.1.0 Airflow container runs Python 3.9.
Legacy Demo
The older demo contains separate containers for Jupyter and Rmd jobs. These jobs are triggered via gusty operators, which were deprecated in gusty 0.1.0 in favor of custom operators.