New Epic: Parameter metadata #772
Thanks @jmcook1186. I like the approach of moving the aggregation method to the aggregate feature; it makes a lot of sense and it's good to be explicit.
Dynamic outputs
Something else to consider is plugins which return a dynamic list of outputs, such as the mock-observations and csv-lookup plugins.
We could just leave it as it is: they just don't have a description or units, so if you want to do some automatic unit conversion it's not possible. I think that's OK, and we should allow users the ability to be succinct in a manifest file. So maybe these plugins need additional (optional) config where you can provide the units and description values? I think it really should be optional.
initialize:
  plugins:
    csv-lookup:
      method: XXXX
      path: "@grnsft/if-plugins"
      global-config:
        metadata:
          cloud/vendor:
            units: string
            description: blah blah
Units -> Type
We've had criticism before because the word unit doesn't quite describe what we are explaining, e.g. some units would be "string". What if we renamed it to type? Does that make more sense? I'm unsure, but there is something not quite right with calling it a unit.
Inputs as well as outputs
I was thinking through this and had a realisation that there was a really good reason to document inputs as well as outputs, but I've lost it now 🤦🏽‍♂️ It might come to me later; it was a really, really good reason to document inputs!
Love the explainer block & explicit units/descriptions!
@josh-swerdlow and I agree that this wouldn't be the end of the world to do now, especially as the number of existing plugins is relatively low (compared to where we hope it'll be in the near future). Lots more to think through; we'll comment more individually soon.
Hi folks,
this epic is a little less settled than the others I described earlier on this forum, which means there's a lot of opportunity to influence its direction by engaging with this post. We're working on it in the open so you can comment and help us refine the thinking.
Background
As we have developed IF we have explored many methods for standardizing the units and associated metadata for plugins, but we still haven't settled on a good general purpose solution.
It is important to have some way to verify plugin metadata. The metadata includes at least the list of parameters and return values and their units. The units are critical because two plugins that both return carbon might have units of lbs C / kWh, gCO2eq, kg C, etc. A plugin later in the pipeline that does some additional transformation on that carbon value needs to know it is getting the value in the expected unit. A plugin expecting gCO2 that receives kgCO2 will return a value with a 1000x error.
Reading the plugin documentation is one simple solution, but there's no guarantee that a plugin's documentation and code develop at the same pace. It could also be difficult to determine what metadata was valid for a plugin run in the past if that information is only available in documentation that might since have been updated or removed. For verifiability, auditability and re-executability we need an in-code solution.
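To make the 1000x example concrete, here is a minimal TypeScript sketch of the kind of check that in-code metadata would allow. The PluginMetadata shape and the findUnitMismatches helper are assumptions invented for this illustration; they are not part of IF.

```typescript
// Hypothetical shape: each plugin declares the unit of every value it
// consumes and produces. None of these names come from the IF codebase.
type ParamMetadata = { unit: string; description: string };
type PluginMetadata = {
  inputs: Record<string, ParamMetadata>;
  outputs: Record<string, ParamMetadata>;
};

// Returns one message per parameter whose unit differs between the
// producing plugin's outputs and the consuming plugin's inputs.
function findUnitMismatches(
  producer: PluginMetadata,
  consumer: PluginMetadata
): string[] {
  const problems: string[] = [];
  for (const name of Object.keys(consumer.inputs)) {
    const expected = consumer.inputs[name];
    const produced = producer.outputs[name];
    // Parameters the producer does not emit are not checked here.
    if (!expected || !produced) continue;
    if (produced.unit !== expected.unit) {
      problems.push(
        `${name}: produced in ${produced.unit}, expected ${expected.unit}`
      );
    }
  }
  return problems;
}

const upstream: PluginMetadata = {
  inputs: {},
  outputs: { carbon: { unit: 'kgCO2', description: 'operational carbon' } },
};
const downstream: PluginMetadata = {
  inputs: { carbon: { unit: 'gCO2', description: 'operational carbon' } },
  outputs: {},
};

// One entry: the carbon value arrives in kgCO2 but gCO2 is expected.
const mismatches = findUnitMismatches(upstream, downstream);
```

The check compares unit strings literally, so it only flags a mismatch; converting between units would need a shared unit vocabulary, which is outside this sketch.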
Our initial solution was to include a list of parameters that can be used by plugins in a file, params.ts, that comes bundled with IF. Plugin builders can add to the list by appending to it at runtime or providing a totally new parameter file that overrides our default one. The rationale is that everyone is clear about the units for specific named parameters, i.e. if you want to name a parameter carbon it has to have the units we define in params.ts (unless you provide a new params file, but this is deliberately an advanced feature).
However, this has turned out to be a fairly poor solution, as it adds too much friction for plugin developers and constrains people's ability to build freely on top of IF and build complex pipelines. It also forced us as the IF core team to impose certain standards that turned out to be very difficult to reason about while still keeping IF as general purpose as possible. For example, for the carbon intensity of electricity, do we choose grid/carbon-intensity? Plugins that do things with the carbon intensity of electricity might not need to work only with grid/carbon-intensity; they might also want to work with other names like electric/carbon-intensity, or perhaps a cloud has its own computed carbon intensity factoring in the on-site generation, which they might want to name cloud/carbon-intensity. There is no way to decide on the one name for every “thing” we want to compute.
This means we need a more flexible solution that balances the need for auditable metadata with the ability to build freely and expressively on IF with the minimum of developer friction.
Proposed solution
There are several parts to the proposed solution. First is to remove the params.ts file and the IF logic that checks the params file for parameter metadata. Instead, we can make use of the metadata field we already expose in our plugin interface and move the metadata definitions into the plugins themselves rather than an external file.
The params file is also used to grab the aggregation method that the aggregate feature should use to aggregate the values for a given parameter across time or across components. This can be moved into the aggregate feature config instead and completely removed from the plugin metadata. This means you only have to provide the information when it is actually needed, e.g.
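As a rough illustration of that split, here is a TypeScript sketch in which the plugin carries only units and descriptions, while the aggregation method lives in the aggregate config. The type names, the metrics field, the sum and avg methods and the parameter values are all assumptions made for this example, not the IF API or manifest schema.

```typescript
// Hypothetical types for illustration only; not the actual IF interfaces.
type AggregationMethod = 'sum' | 'avg';

// Metadata declared by the plugin itself: units and description only.
const pluginMetadata = {
  outputs: {
    carbon: { unit: 'gCO2eq', description: 'carbon emitted per timestep' },
  },
};

// The aggregation method is given in the aggregate config, and only for
// the parameters the user actually wants to aggregate.
const aggregateConfig: { metrics: Record<string, AggregationMethod> } = {
  metrics: { carbon: 'sum', 'cpu/utilization': 'avg' },
};

function aggregate(values: number[], method: AggregationMethod): number {
  const total = values.reduce((acc, v) => acc + v, 0);
  if (method === 'sum') return total;
  // 'avg': guard against dividing by zero for an empty series.
  return values.length === 0 ? 0 : total / values.length;
}

function aggregateMetric(name: string, values: number[]): number {
  const method = aggregateConfig.metrics[name];
  if (method === undefined) {
    throw new Error(`no aggregation method configured for ${name}`);
  }
  return aggregate(values, method);
}

const totalCarbon = aggregateMetric('carbon', [10, 20, 30]); // 60
const meanCpu = aggregateMetric('cpu/utilization', [20, 40, 60]); // 40

// The unit still comes from the plugin metadata, not the aggregate config.
const summary = `${totalCarbon} ${pluginMetadata.outputs.carbon.unit}`;
```

A parameter with no entry in the aggregate config is rejected rather than silently summed, which keeps the choice of method explicit.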
Once this is done, we can develop an explainer feature that collates the metadata from all the plugins in a given pipeline and outputs it as a node in the manifest. This means users can always see what units were generated by each plugin in an execution pipeline and check that the units fed from one plugin to another are consistent. It also opens the door to a future static-analysis-type feature that auto-audits the unit propagation through a pipeline.
The explainer block should look similar to:
❓ Q: An alternative to consider is whether it is actually better for explainer to enrich the existing initialize block for each plugin to create a leaner manifest file. I probably lean towards enriching the initialize block personally, just to keep all the plugin metadata and config in one place in the manifest. It might also be good to rename initialize to plugins or similar to remove any naming confusion.
Finally, parameter mapping should be implemented so that we can automatically add the return values of one plugin to the inputs array under the name required by another plugin.
For example, let's say one of our plugins returned cloud/vendor by default, and another plugin required the same information but expected to receive cloud-vendor-name. We could provide a mapping field in the config for the first plugin so that the cloud/vendor data is actually appended to the manifest data as cloud-vendor-name. There are already ways to do this, but the mapping reduces the amount of redundant data in the final manifest.
It could look as follows:
❓ Q: Will we REQUIRE metadata for each plugin? Would this be a breaking change that makes existing plugins obsolete?
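One way to explore that question: if metadata stayed optional, an explainer could still collate whatever the plugins declare and flag those that declare nothing, rather than failing. The TypeScript sketch below assumes a hypothetical plugin shape and output format; it is an illustration of the trade-off, not the agreed design.

```typescript
// Hypothetical shapes for illustration; not the real IF plugin interface.
type ParamInfo = { unit: string; description: string };
type PipelinePlugin = {
  name: string;
  metadata?: { outputs?: Record<string, ParamInfo> }; // optional on purpose
};
type Explanation = {
  plugins: Record<string, Record<string, ParamInfo>>;
  undocumented: string[]; // plugins that declared no output metadata
};

// Collates output metadata from every plugin, in pipeline order.
function explain(pipeline: PipelinePlugin[]): Explanation {
  const result: Explanation = { plugins: {}, undocumented: [] };
  for (const plugin of pipeline) {
    const outputs = plugin.metadata?.outputs;
    if (outputs === undefined || Object.keys(outputs).length === 0) {
      // An older plugin without metadata still runs; it is only flagged.
      result.undocumented.push(plugin.name);
    } else {
      result.plugins[plugin.name] = outputs;
    }
  }
  return result;
}

const explanation = explain([
  { name: 'legacy-plugin' },
  {
    name: 'carbon-plugin',
    metadata: {
      outputs: {
        carbon: { unit: 'gCO2eq', description: 'carbon emitted per timestep' },
      },
    },
  },
]);
```

With this shape, requiring metadata later would amount to treating a non-empty undocumented list as an error, so the breaking change could be staged rather than immediate.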
Tasks
Remove params.ts and related logic from IF
Add metadata to builtins and alert community to add the same to their existing plugins
Build explainer feature that collates metadata from all the plugins in a pipeline and either enriches the initialize info or adds a new block.
How you can help
You can read through this post and give feedback in comments, especially if you are a plugin developer affected by these changes. Later, when the specific tasks are available as tickets on our issue board you can let us know if you want to work on one. There may be some that are reserved for core developers, but in general we are keen to open up IF development to the community.
@jawache @zanete @narekhovhannisyan @MariamKhalatova @manushak