I have been meandering a bit, so I think I need to formalize a plan here.
Here is the argument for using some composition-based pattern:
We need to build DAGs for each region. The DAGs will be quite similar, but not necessarily identical. Here is the plan I think makes the most sense (using the modis_aqua_sst "pipeline" as an example):
one DAG file for each "pipeline" (covering all regions), e.g. dags/modis_aqua_sst.py
options for different regions are passed as parameters to a DAGBuilder function
parameters like {"args":[], "kwargs":{}} are kept within the "sst" DAG or nearby (/dags/modis_aqua_sst/regional_params.py)
the builder ModisAquaSSTDAGBuilder(place_name='gom') adds tasks to create a modis_aqua_sst pipeline for the gom region
a list of parameters can be defined to build all ROI DAGs quickly: for region_opts in [{"args":[],"kwargs":{"place_name": "gom"}}, {...}]: ModisAquaSSTDAGBuilder(*region_opts['args'], **region_opts['kwargs']).build()
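The plan above could be sketched roughly as follows. This is a minimal illustration, not the actual implementation: `SketchDAG` is a stand-in for `airflow.DAG` so the example runs without Airflow installed, and the task names and the `make_png` option are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class SketchDAG:
    """Stand-in for airflow.DAG (assumption) so the sketch runs anywhere."""
    dag_id: str
    tasks: list = field(default_factory=list)


class ModisAquaSSTDAGBuilder:
    """Builds one modis_aqua_sst DAG for a single region."""

    def __init__(self, place_name, **overrides):
        self.place_name = place_name
        self.overrides = overrides  # per-region customizations

    def build(self):
        dag = SketchDAG(dag_id=f"modis_aqua_sst_{self.place_name}")
        # Hypothetical task names; real code would attach Airflow operators.
        dag.tasks.append(f"download_{self.place_name}")
        if self.overrides.get("make_png", True):  # hypothetical option
            dag.tasks.append(f"export_png_{self.place_name}")
        return dag


# Could live in dags/modis_aqua_sst/regional_params.py
REGION_OPTS = [
    {"args": [], "kwargs": {"place_name": "gom"}},
    {"args": [], "kwargs": {"place_name": "fknms", "make_png": False}},
]

# One DAG per region, keyed by dag_id.
dags = {}
for region_opts in REGION_OPTS:
    dag = ModisAquaSSTDAGBuilder(
        *region_opts["args"], **region_opts["kwargs"]
    ).build()
    dags[dag.dag_id] = dag
```

In a real DAG file, the built DAG objects would probably also need to be bound to module-level names so that the scheduler can discover them.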
Some thoughts on how editing this pipeline works:
to add an ROI to a pipeline: open the pipeline DAG and add options for the region
to customize a pipeline for an ROI: change the parameters passed to the builder
no global "regions", although enum-like classes should probably be used for "magic" strings (e.g. place_names like "gom", "fknms", etc.)
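One way the enum-like class for place names might look is sketched below. The class name `PlaceName` and the helper `dag_id_for` are hypothetical; only the values "gom" and "fknms" come from the discussion above.

```python
from enum import Enum


class PlaceName(str, Enum):
    """Known region identifiers (names here are illustrative assumptions)."""
    GOM = "gom"
    FKNMS = "fknms"


def dag_id_for(place):
    """Return a DAG id, rejecting place names that are not in the enum."""
    place = PlaceName(place)  # raises ValueError for an unknown string
    return f"modis_aqua_sst_{place.value}"
```

With this, a typo such as "gmo" fails immediately when the DAG file is parsed instead of silently producing a DAG for a region that does not exist.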
So the above looks like a simple factory pattern, but does it make sense to go all the way to an entity-component-like pattern? I think that is overkill because "entities" maintain state outside of this builder pattern anyway. That is, assuming that in this case by "Entity" we mean "DAG", and by "Component" we mean "set of 1+ operators".
I did some work on this on the regionBuilders branch, but it is messy. I think I need to focus in and come back to this after I finish cleaning up the last few experiments.
Now that I have learned more about Airflow, the correct solution is just to subclass the DAG and *Operator classes, and to use mixins for DAG subclass composition if you really want to get fancy.
Other patterns mentioned here are likely to cause more confusion than they are worth.
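A rough sketch of the subclass-plus-mixins idea, under stated assumptions: the `DAG` class below is a stand-in for `airflow.DAG` so the example runs without Airflow, and the mixin names and task ids are hypothetical. Real code would subclass `airflow.DAG` and create operators inside the mixin methods.

```python
class DAG:
    """Stand-in for airflow.DAG (assumption: only dag_id is modeled)."""

    def __init__(self, dag_id, **kwargs):
        self.dag_id = dag_id
        self.task_ids = []


class DownloadMixin:
    """Adds the (hypothetical) download step to a DAG subclass."""

    def add_download_tasks(self):
        self.task_ids.append(f"download_{self.place_name}")


class PngExportMixin:
    """Adds the (hypothetical) PNG export step to a DAG subclass."""

    def add_png_tasks(self):
        self.task_ids.append(f"export_png_{self.place_name}")


class ModisAquaSSTDAG(DownloadMixin, PngExportMixin, DAG):
    """A DAG subclass composed from mixins, one instance per region."""

    def __init__(self, place_name, **kwargs):
        super().__init__(dag_id=f"modis_aqua_sst_{place_name}", **kwargs)
        self.place_name = place_name
        self.add_download_tasks()
        self.add_png_tasks()


gom_dag = ModisAquaSSTDAG(place_name="gom")
```

A region that does not need PNG export could get its own subclass that simply leaves out `PngExportMixin`.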