Replies: 3 comments
-
I guess we're jumping right into the deep end of what the plugin system can or should be able to do! I'm going to answer the subset and QC in separate replies as they likely take advantage of the plugin system in different ways.
-
QC

If I'm understanding the QC use case right, you'd like to use existing dataset routers, but only give them a modified version of the dataset (that's provided from elsewhere)? Since you are looking to modify the path that they respond to, I believe this should be largely possible by getting creative with dataset routers, the ability to access other plugins, and the ability to modify what datasets they are passed.

Single request

This is why the

    from fastapi import APIRouter
    from pydantic import BaseModel
    from xpublish import Dependencies, Plugin, hookimpl


    class QCConfig(BaseModel):
        ...


    class QCPlugin(Plugin):
        name: str = "qc"
        dataset_router_prefix: str = "/run_qc"
        dataset_router_tags: list[str] = ["qc"]

        @hookimpl
        def dataset_router(self, deps: Dependencies):
            # override the existing get_dataset, to retrieve datasets from other plugins, and apply our QC filtering
            def get_dataset(dataset_id: str, qc_post: QCConfig):
                # there may need to be some creativity to make sure we don't try to get the dataset from this plugin again,
                # similar to the `subset_hook_caller` below
                dataset = deps.dataset(dataset_id)
                # do QC things to the dataset here
                return dataset

            # copy the dependencies, swapping in our dataset getter (assumes Dependencies is a pydantic model)
            new_deps = deps.copy(update={"dataset": get_dataset})

            router = APIRouter(prefix=self.dataset_router_prefix, tags=self.dataset_router_tags)

            # add all other dataset routers as sub-routers below this one in addition to the normal mounting;
            # pluggy's subset_hook_caller takes the plugin objects to exclude
            other_routers = deps.plugin_manager.subset_hook_caller("dataset_router", remove_plugins=[self])
            for dataset_router in other_routers(deps=new_deps):
                router.include_router(dataset_router)

            return router

I think this largely meets those needs if the QC process can be run in the same request. It also doesn't modify any of the existing routes, so they won't all become POST. But since request bodies aren't always a good way to make friends, a multi-request process may make more sense. It may also be possible to modify the other dataset routers in the loop before including them, transforming their request types.

Multiple requests

I think to use POST, it would probably make sense to have a multi-step process and use a cache and a plugin that is both a dataset router and provider.
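
A minimal, framework-free sketch of that multi-step flow, assuming an in-memory dict as the cache and plain dicts as stand-ins for datasets. The names `register_qc_config`, `qc_dataset`, `max_value`, and the `qc-<token>` ID scheme are hypothetical and not part of xpublish. Step one stores the POSTed config and hands back a derived dataset ID; step two resolves that ID with a plain GET, which is roughly what a plugin acting as both router and provider would do.

```python
import uuid

# Stand-ins for datasets served by other plugins (hypothetical data).
SOURCE_DATASETS = {"buoy": {"temp": [10.0, 99.0, 12.5]}}

# In-memory cache: derived dataset ID -> (source dataset ID, QC config).
QC_CACHE = {}


def register_qc_config(dataset_id, config):
    """Step 1 (would be the POST route): store the config, return a new dataset ID."""
    if dataset_id not in SOURCE_DATASETS:
        raise KeyError(dataset_id)
    derived_id = f"qc-{uuid.uuid4().hex}"
    QC_CACHE[derived_id] = (dataset_id, dict(config))
    return derived_id


def qc_dataset(derived_id):
    """Step 2 (would be the provider hook): resolve a derived ID on a plain GET."""
    source_id, config = QC_CACHE[derived_id]
    source = SOURCE_DATASETS[source_id]
    limit = config["max_value"]
    # Toy QC rule: drop values above the configured maximum.
    return {name: [v for v in values if v <= limit] for name, values in source.items()}


derived = register_qc_config("buoy", {"max_value": 50.0})
print(qc_dataset(derived))
```

Because the derived ID behaves like any other dataset ID, the existing routers would not need to know a QC config was involved; a real version would also need cache expiry, which this sketch leaves out.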
-
Subset

If subsetting can be part of the path, I think a similar method to the single QC request could be used. Then:

    from fastapi import APIRouter
    from xpublish import Dependencies, Plugin, hookimpl


    class SubsetPlugin(Plugin):
        name: str = "subset"
        # FastAPI path parameters use curly braces
        dataset_router_prefix: str = "/subset/{subset_format}"
        dataset_router_tags: list[str] = ["subset"]

        @hookimpl
        def dataset_router(self, deps: Dependencies):
            # override the existing get_dataset, to retrieve datasets from other plugins, and apply our subsetting
            def get_dataset(dataset_id: str, subset_format: str):
                # there may need to be some creativity to make sure we don't try to get the dataset from this plugin again,
                # similar to the `subset_hook_caller` below
                dataset = deps.dataset(dataset_id)
                # slice and dice the dataset using the subset format
                return dataset

            # copy the dependencies, swapping in our dataset getter (assumes Dependencies is a pydantic model)
            new_deps = deps.copy(update={"dataset": get_dataset})

            router = APIRouter(prefix=self.dataset_router_prefix, tags=self.dataset_router_tags)

            # add all other dataset routers as sub-routers below this one in addition to the normal mounting;
            # pluggy's subset_hook_caller takes the plugin objects to exclude
            other_routers = deps.plugin_manager.subset_hook_caller("dataset_router", remove_plugins=[self])
            for dataset_router in other_routers(deps=new_deps):
                router.include_router(dataset_router)

            return router
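
How `subset_format` gets turned into a slice is left open above. As one possibility, here is a sketch assuming a hypothetical `bbox:west,south,east,north` encoding and datasets with ascending `lat`/`lon` coordinates. `BBox`, `parse_subset_format`, and `apply_subset` are illustrative names only; the single dataset call used is `.sel(...)` with slices, as xarray provides.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BBox:
    west: float
    south: float
    east: float
    north: float


def parse_subset_format(subset_format: str) -> BBox:
    """Parse a hypothetical 'bbox:west,south,east,north' path segment."""
    kind, _, values = subset_format.partition(":")
    if kind != "bbox":
        raise ValueError(f"unsupported subset format: {subset_format!r}")
    parts = [float(p) for p in values.split(",")]
    if len(parts) != 4:
        raise ValueError("bbox needs four comma-separated numbers")
    box = BBox(*parts)
    if box.west > box.east or box.south > box.north:
        raise ValueError("bbox bounds are out of order")
    return box


def apply_subset(dataset, box: BBox):
    """Slice any object with an xarray-style .sel(); assumes ascending lat/lon coords."""
    return dataset.sel(lat=slice(box.south, box.north), lon=slice(box.west, box.east))


print(parse_subset_format("bbox:-75,35,-65,45"))
```

Validating the path segment before touching the dataset keeps a malformed request from reaching the downstream routers at all.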
-
After some playing around with `xpublish` I'm at a crossroad on how to implement a pluggable "output" system. I am trying to make a `Plugin` dynamically change a dataset and provide it back into the `xpublish` system so it would be available to other plugins.

A use-case might explain a bit better... a global model needs to be regionally subset into N regions. We could lazily subset the regions into many `xarray.Dataset` objects and launch `xpublish` with the collection of regions. That works, but requires the regions to be pre-determined. What if I wanted the spatial subset to be defined in the request context? A `Plugin` can take in the parameters required to perform the subset, but the results can't be served back out through `xpublish`, they would need to be returned from inside of the plugin. The goal would be to serve the results of the subset out via the existing `xpublish` plugins... i.e. `dataset_info`, `zarr`, `opendap`, etc. I can get this working now by writing a `SubsetRequest` request parameter class and sub-classing each `Plugin` I want to be a "subsetting plugin" with new request parameters and code to do the subsetting. It isn't pretty, especially as the `Plugin` system will grow.

What if we separated the `input` and the `output` in the `Plugins`? Write them as documented now but have the option to return a `xarray.Dataset` object from the routes and pass it through a different "output" `Plugin` system that has already identified registered output plugins and has created dynamic routes for them on top of the `dataset`.

Another example may help... a QC endpoint that takes in a POSTed QC config object, runs QC functions, and returns the results in a variety of requested formats. If I wanted a `zarr` endpoint for my results I might POST to `/datasets/[dataset_id]/run_qc/zarr/` and if I wanted a CSV response I might POST to `/datasets/[dataset_id]/run_qc.csv`. The `/zarr/` and `.csv` are registered output plugins, but the `Plugin` called `run_qc` computed the result the same way for both and just returned the `xr.Dataset` result.

Thoughts?
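
To make the input/output split concrete, here is a small sketch under heavy assumptions: plain dicts stand in for `xr.Dataset`, JSON stands in for `zarr`, and `OUTPUTS`, `register_output`, `run_qc`, and `handle` are hypothetical names rather than xpublish API. The input step computes one result, and the trailing part of the path selects a registered output.

```python
import csv
import io
import json

# Registry of output formats, keyed by the path suffix that selects them.
OUTPUTS = {}


def register_output(suffix):
    """Decorator that registers a function turning a dataset into a response body."""
    def decorator(func):
        OUTPUTS[suffix] = func
        return func
    return decorator


@register_output(".csv")
def to_csv(dataset):
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(dataset.keys())
    writer.writerows(zip(*dataset.values()))
    return buffer.getvalue()


@register_output("/json/")
def to_json(dataset):
    # JSON stands in for a heavier format such as zarr.
    return json.dumps(dataset, sort_keys=True)


def run_qc(dataset, config):
    """The 'input' step: computed once, regardless of the output requested."""
    keep = [i for i, v in enumerate(dataset["temp"]) if v <= config["max_value"]]
    return {name: [values[i] for i in keep] for name, values in dataset.items()}


def handle(path, dataset, config):
    """Route e.g. 'run_qc.csv' or 'run_qc/json/' to the matching output."""
    for suffix, render in OUTPUTS.items():
        if path == "run_qc" + suffix:
            return render(run_qc(dataset, config))
    raise LookupError(f"no output registered for {path!r}")


data = {"time": [0, 1, 2], "temp": [10.0, 99.0, 12.5]}
print(handle("run_qc.csv", data, {"max_value": 50.0}))
```

The point of the registry is that `run_qc` never learns which format was requested; adding a new output means registering one more function, not subclassing every plugin.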