A docker image to process Semantic Convention YAML models.
See CONTRIBUTING.md
for information on making changes to this repository.
The image can be used to generate Markdown tables or code.
docker run --rm -v<yaml-path>:<some-path> -v<output-path>:<some-path> otel/semconvgen [OPTION]
For help try:
docker run --rm otel/semconvgen -h
The expected YAML input file format is documented in syntax.md.
A JSON schema definition for the YAML files is also available: semconv.schema.json.
It can be used, for example, in VS Code to get validation and auto-completion.
With the redhat.vscode-yaml plugin, add the following snippet to your VS Code settings.json to apply it to the test YAML files:
{
    "yaml.schemas": {
        "./semantic-conventions/semconv.schema.json": [
            "semantic-conventions/src/tests/**/*.yaml"
        ]
    }
}
Tables can be generated using the command:
docker run --rm otel/semconvgen --yaml-root {yaml_folder} markdown --markdown-root {markdown_folder}
Here `{yaml_folder}` is the absolute path to the directory containing the YAML files, and `{markdown_folder}` is the absolute path to the directory containing the Markdown definitions (`specification` for opentelemetry-specification).
The tool will automatically replace the tables with the up-to-date definition of the semantic conventions. To do so, the tool looks for special tags in the Markdown.
<!-- semconv {semantic_convention_id} -->
<!-- endsemconv -->
Everything between these two tags will be replaced with the table definition.
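The replacement step can be pictured with a short Python sketch. This is not the tool's implementation: `SEMCONV_BLOCK`, `replace_tables`, and the `render_table` callback are hypothetical names, and the regular expression is only an assumption about how the tags could be matched.

```python
import re

# Matches an opening tag such as <!-- semconv http.server(full) -->, the text
# after it, and the closing <!-- endsemconv --> tag. The body is matched lazily
# so that several tagged regions in one file are handled independently.
SEMCONV_BLOCK = re.compile(
    r"(<!--\s*semconv\s+(?P<spec>.+?)\s*-->)(?P<body>.*?)(<!--\s*endsemconv\s*-->)",
    re.DOTALL,
)


def replace_tables(markdown, render_table):
    """Replace the text between each pair of tags with render_table(spec)."""

    def _substitute(match):
        table = render_table(match.group("spec"))
        # Keep both tags so the file can be regenerated again later.
        return f"{match.group(1)}\n{table}\n{match.group(4)}"

    return SEMCONV_BLOCK.sub(_substitute, markdown)
```

Because both tags are kept in the output, running the replacement twice with the same renderer leaves the document unchanged.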
The `{semantic_convention_id}` MUST be the `id` field in the YAML files of the semantic convention for which we want to generate the table.
After `{semantic_convention_id}`, optional parameters enclosed in parentheses can be added to customize the output:
- `tag={tag}`: prints only the attributes that have `{tag}` as a tag;
- `full`: prints attributes inherited from the parent semantic conventions or from included ones;
- `ref`: prints attributes that are referenced from another semantic convention;
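As an illustration of the tag syntax, the following Python sketch splits the text of a tag into the id and its optional parameters. `parse_semconv_tag` is a hypothetical helper written for this README, not a function exposed by the tool, and the real parser may accept or reject inputs differently.

```python
def parse_semconv_tag(spec):
    """Split 'id(param, key=value)' into the id and its optional parameters."""
    spec = spec.strip()
    semconv_id, params = spec, {}
    if spec.endswith(")") and "(" in spec:
        semconv_id, _, raw = spec[:-1].partition("(")
        for item in raw.split(","):
            item = item.strip()
            if not item:
                continue  # "http.server()" carries no parameters
            key, sep, value = item.partition("=")
            # Flags such as "full" have no value; record them as True.
            params[key.strip()] = value.strip() if sep else True
    return {"id": semconv_id.strip(), "params": params}
```

For example, `parse_semconv_tag("http.server(tag=network, full)")` yields the id `http.server` with the parameters `tag` set to `network` and `full` set to `True`.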
By default, Markdown tables are rendered with stability badges, which can be disabled with `--md-disable-stable-badge`, `--md-disable-experimental-badge`, and `--md-disable-deprecated-badge`.
When badges are disabled, the stability column contains a plain-text representation of the stability or deprecation status.
These examples assume that a semantic convention with the id `http.server` extends another semantic convention with the id `http`.
- `<!-- semconv http.server -->` will print only the attributes of the `http.server` semantic convention.
- `<!-- semconv http.server(full) -->` will print the attributes of the `http` semantic convention and also the attributes of the `http.server` semantic convention.
- `<!-- semconv http.server() -->` is equivalent to `<!-- semconv http.server -->`.
- `<!-- semconv http.server(tag=network) -->` will print the attributes of the `http.server` semantic convention that have the tag `network`.
- `<!-- semconv http.server(tag=network, full) -->` will print the attributes of both `http` and `http.server` semantic conventions that have the tag `network`.
- `<!-- semconv metric.http.server.active_requests(metric_table) -->` will print a table describing a single metric `http.server.active_requests`.
You can check compatibility between the local semantic conventions specified with `--yaml-root` and a specific released version of the OpenTelemetry semantic conventions using the following command:
docker run --rm otel/semconvgen --yaml-root {yaml_folder} compatibility --previous-version {semconv version}
The `{semconv version}` (e.g. `1.24.0`) is the previously released version of semantic conventions.
The following checks are performed:
- On all attributes and metrics (experimental and stable):
  - attributes and metrics must not be removed
  - enum attribute members must not be removed
- On stable attributes and attribute templates:
  - stability must not be changed
  - the type of attribute must not be changed
  - enum attribute: type of value must not be changed
- On stable enum attribute members:
  - stability must not be changed
  - `id` and `value` must not be changed
- On stable metrics:
  - stability must not be changed
  - instrument and unit must not be changed
  - new attributes should not be added. This check does not take into account opt-in attributes. Adding new attributes to a metric is not always breaking, so it's considered non-critical and can be suppressed with `--ignore-warnings`
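To make the attribute rules concrete, here is a minimal Python sketch of the first two groups of checks. The `Attr` class is a hypothetical, simplified stand-in for a parsed attribute, and `check_attributes` is an illustrative name; the tool's real checker covers more cases (enum members, metrics, templates).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Attr:
    """Hypothetical, simplified view of an attribute definition."""
    name: str
    type: str
    stability: str  # "stable" or "experimental"


def check_attributes(previous, current):
    """Return the problems found when moving from previous to current."""
    current_by_name = {attr.name: attr for attr in current}
    problems = []
    for old in previous:
        new = current_by_name.get(old.name)
        if new is None:
            # Removal is checked for every attribute, stable or experimental.
            problems.append(f"{old.name}: attribute was removed")
            continue
        if old.stability == "stable":
            # Stability and type are only locked once an attribute is stable.
            if new.stability != old.stability:
                problems.append(f"{old.name}: stability changed to {new.stability}")
            if new.type != old.type:
                problems.append(f"{old.name}: type changed from {old.type} to {new.type}")
    return problems
```

Note that a type change on an experimental attribute produces no problem in this sketch, which mirrors the split between the two groups of rules.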
Previous versions of semantic conventions are not always compatible with newer versions of build-tools. You can suppress validation errors by adding the `--continue-on-validation-errors` flag:
docker run --rm otel/semconvgen --yaml-root {yaml_folder} --continue-on-validation-errors compatibility --previous-version {semconv version}
The image supports Jinja templates to generate code from the models.
For example, opentelemetry-java generates typed constants for semantic conventions; see https://github.com/open-telemetry/semantic-conventions-java for all of them.
The commands used to generate them are in the semantic-conventions-java repo.
By default, all models are fed into the specified template at once, so only a single file is generated. This is helpful for generating constants for the semantic attributes, as opentelemetry-java does.
If the parameter `--file-per-group {pattern}` is set, a single YAML model is fed into the template at a time, and the value of `pattern` is resolved from the model and may be used in the output argument.
This way, multiple files are generated. The value of `pattern` can be one of the following:
- `semconv_id`: The id of the semantic convention.
- `prefix`: The prefix with which all attributes start.
- `extends`: The id of the parent semantic convention.
- `root_namespace`: The root namespace of the attributes to group by.
The `--output` parameter, when `--file-per-group` is used, is evaluated as a template. The following variables are provided to output:
- `prefix`: A prefix name for files, determined from the grouping, e.g. `http`, `database`, `user-agent`.
- `pascal_prefix`: A Pascal-case prefix name for files, e.g. `Http`, `Database`, `UserAgent`.
- `camel_prefix`: A camel-case prefix name for files, e.g. `http`, `database`, `userAgent`.
- `snake_prefix`: A snake-case prefix name for files, e.g. `http`, `database`, `user_agent`.
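The relationship between the four variables can be sketched in Python as follows. `prefix_variants` and `render_output_path` are hypothetical helpers, the word-splitting rule is an assumption based on the examples above, and the substitution is a plain string replacement rather than real Jinja rendering.

```python
import re


def prefix_variants(prefix):
    """Derive the Pascal-, camel-, and snake-case forms of a grouping prefix."""
    # Assumption: words in the prefix are separated by '-', '_' or '.'.
    words = [word for word in re.split(r"[-_.]", prefix) if word]
    pascal = "".join(word[0].upper() + word[1:] for word in words)
    camel = pascal[0].lower() + pascal[1:] if pascal else ""
    return {
        "prefix": prefix,
        "pascal_prefix": pascal,
        "camel_prefix": camel,
        "snake_prefix": "_".join(words).lower(),
    }


def render_output_path(template, prefix):
    """Fill {{name}} placeholders in an --output value (plain substitution)."""
    path = template
    for name, value in prefix_variants(prefix).items():
        path = path.replace("{{" + name + "}}", value)
    return path
```

With the prefix `user-agent`, the sketch produces `UserAgent`, `userAgent`, and `user_agent`, matching the examples in the list.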
For example, you could do the following:
docker run --rm \
-v ${SCRIPT_DIR}/opentelemetry-specification/semantic_conventions/trace:/source \
-v ${SCRIPT_DIR}/templates:/templates \
-v ${ROOT_DIR}/semconv/src/main/java/io/opentelemetry/semconv/trace/attributes/:/output \
otel/semconvgen:$GENERATOR_VERSION \
--yaml-root /source \
code \
--template /templates/SemanticAttributes.java.j2 \
--file-per-group root_namespace \
--output "/output/{{pascal_prefix}}Attributes.java" \
...other parameters...
Finally, additional values can be passed to the template in the form of `key=value` pairs separated by commas using the `--parameters [{key=value},]+` or `-D` flag.
Generating code from older versions of semantic conventions with new tooling is, in general, not supported.
However, in some cases minor incompatibilities in semantic conventions can be suppressed by adding the `--continue-on-validation-errors` flag:
docker run --rm \
otel/semconvgen:$GENERATOR_VERSION \
--yaml-root /source \
--continue-on-validation-errors \
code \
...other parameters...
The image also supports customizing Whitespace Control in Jinja templates via the additional flag `--trim-whitespace`. Providing the flag will enable both `lstrip_blocks` and `trim_blocks`.
The `COLORED_DIFF` environment variable is set in the semantic-conventions Dockerfile. When this environment variable is set, errors related to reformatting tables will show a "colored diff" using standard ANSI control characters. While this should be supported natively in any modern terminal environment, you may unset this variable if issues arise. Doing so will enable a fallback of non-colored inline diffs showing what was "added" and what was "removed", followed by the exact tokens added/removed encased in single quotes.
When the template is processed, it has access to a set of variables that depends on the `--file-per-group` value (or lack of it).
You can access properties of these variables and call Jinja or Python functions defined on them.
Without `--file-per-group`, the template processes all parsed semantic conventions.
- `semconvs` - the dictionary containing parsed `BaseSemanticConvention` instances with semconv `id` as a key
- `attributes_and_templates` - the dictionary containing all attributes (including template ones) grouped by their root namespace. Each element in the dictionary is a list of attributes that share the same root namespace. Attributes that don't have a namespace appear under the `""` key. Attributes and templates are sorted by attribute name.
- `attributes` - the list of all attributes from all parsed semantic conventions. Does not include template attributes.
- `attribute_templates` - the list of all attribute templates from all parsed semantic conventions.
- `metrics` - the list of all metric semantic conventions sorted by metric name.
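The shape of the `attributes_and_templates` dictionary can be illustrated with a small Python sketch. It works on plain attribute names rather than the tool's attribute objects, and it assumes the root namespace is the text before the first dot; both `root_namespace` and `group_by_root_namespace` are hypothetical helper names.

```python
from collections import defaultdict


def root_namespace(attribute_name):
    """Return the text before the first dot, or "" when there is no dot."""
    head, sep, _ = attribute_name.partition(".")
    return head if sep else ""


def group_by_root_namespace(attribute_names):
    """Group attribute names by root namespace, each group sorted by name."""
    groups = defaultdict(list)
    # Sorting first means every per-namespace list ends up sorted as well.
    for name in sorted(attribute_names):
        groups[root_namespace(name)].append(name)
    return dict(groups)
```

An attribute without a namespace, such as a bare `peer`, lands under the `""` key in this sketch.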
With `--file-per-group root_namespace`, the template processes a single namespace and is called for each namespace detected.
- `attributes_and_templates` - the list containing all attributes (including template ones) in the given root namespace. Attributes are sorted by their name.
- `enum_attributes` - the list containing all enum attributes in the given root namespace. Attributes are sorted by their name.
- `root_namespace` - the root namespace being processed.
With any other `--file-per-group` pattern, the template processes a single pattern value and is called for each distinct value.
- `semconv` - the instance of parsed `BaseSemanticConvention` being processed.
Jinja templates have a notion of filters that transform objects or filter lists.
Semconvgen supports the following additional filters to simplify common operations in templates.
- `is_definition` - Checks if the attribute is the original definition of the attribute and not a reference.
- `is_deprecated` - Checks if the attribute is deprecated. The same check can also be done with `(attribute.stability | string()) == "StabilityLevel.DEPRECATED"`
- `is_experimental` - Checks if the attribute is experimental. The same check can also be done with `(attribute.stability | string()) == "StabilityLevel.EXPERIMENTAL"`
- `is_stable` - Checks if the attribute is stable. The same check can also be done with `(attribute.stability | string()) == "StabilityLevel.STABLE"`
- `is_template` - Checks if the attribute is a template attribute.
- `attribute | print_member_value(member)` - Applies to enum attributes only and takes `EnumMember` as a parameter. Prints the value of a given enum member as a constant: strings are quoted, integers are printed as is.
- `first_up` - Upper-cases the first character in the string. Does not modify anything else.
- `regex_replace(text, pattern, replace)` - Makes a regex-based replacement in the `text` string using `pattern`.
- `to_camelcase` - Converts a string to camel case (using `.` and `_` as word delimiters in the original string). The first character of every word is upper-cased, other characters are lower-cased. E.g. `foo.bAR_baz` becomes `fooBarBaz`.
- `to_const_name` - Converts a string to a Python or Java constant name (SNAKE_CASE), replacing `.` or `-` with `_`. E.g. `foo.bAR-baz` becomes `FOO_BAR_BAZ`.
- `to_doc_brief` - Trims whitespace and removes the dot at the end. E.g. `Hello world.\t` becomes `Hello world`.
- `is_metric` - Checks if the semantic convention describes a metric.
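The string filters are easiest to understand through their input and output. The Python functions below are approximations written from the descriptions in the list above, not the tool's source code; in particular, the `first_upper` argument of `to_camelcase` and the lower-casing of the first word are assumptions.

```python
def first_up(text):
    """Upper-case the first character and leave the rest untouched."""
    return text[:1].upper() + text[1:]


def to_const_name(name):
    """Upper-case the name and replace '.' and '-' with '_'."""
    return name.upper().replace(".", "_").replace("-", "_")


def to_doc_brief(text):
    """Trim surrounding whitespace and drop one trailing dot, if present."""
    text = text.strip()
    return text[:-1] if text.endswith(".") else text


def to_camelcase(name, first_upper=False):
    """Join words split on '.' and '_', capitalizing every word but the first."""
    first, *rest = name.replace("_", ".").split(".")
    # Assumption: the first word is lower-cased unless first_upper is set.
    first = first.capitalize() if first_upper else first.lower()
    return first + "".join(word.capitalize() for word in rest)
```

Calling `to_camelcase("foo.bAR_baz")` in this sketch gives `fooBarBaz`, and `to_const_name("foo.bAR-baz")` gives `FOO_BAR_BAZ`, in line with the examples in the list.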
First, we should iterate over all attributes.
{%- for attribute in attributes_and_templates %}
...
{%- endfor %}
Now, for each attribute we want to generate a constant declaration like
SERVER_ADDRESS = "server.address"
"""
Server domain name if available without reverse DNS lookup; otherwise, IP address or Unix domain socket name.
Note: When observed from the client side, and when communicating through an intermediary, `server.address` SHOULD represent the server address behind any intermediaries, for example proxies, if it's available.
"""
we can achieve it with the following template:
```jinja
{{attribute.fqn | to_const_name}} = "{{attribute.fqn}}"
"""
{{attribute.brief | to_doc_brief}}.
{%- if attribute.note %}
Note: {{attribute.note | to_doc_brief | indent}}.
{%- endif %}
"""
```
We should also annotate deprecated attributes and potentially generate template attributes differently. Here's a full example:
{%- for attribute in attributes_and_templates %}
{% if attribute | is_template %}
{{attribute.fqn | to_const_name}}_TEMPLATE = "{{attribute.fqn}}"
{%- else %}
{{attribute.fqn | to_const_name}} = "{{attribute.fqn}}"
{%- endif %}
"""
{{attribute.brief | to_doc_brief}}.
{%- if attribute.note %}
Note: {{attribute.note | to_doc_brief | indent}}.
{%- endif %}
{%- if attribute | is_deprecated %}
Deprecated: {{attribute.deprecated | to_doc_brief}}.
{%- endif %}
"""
{%- endfor %}
It's possible to split attributes into stable and unstable ones, for example to ship them in different artifacts or namespaces.
You can achieve this by running code generation twice with different filters and output destinations.
Here's an example of how to keep one template file for both:
{%- set filtered_attributes = attributes_and_templates | select(filter) | list %}
{%- for attribute in filtered_attributes %}
...
{%- endfor %}
Here `select` applies the Jinja test whose name is stored in the `filter` variable, which we can set in the generation script:
docker run --rm \
-v ${SCRIPT_DIR}/semantic-conventions/model:/source \
-v ${SCRIPT_DIR}/templates:/templates \
-v ${ROOT_DIR}/opentelemetry-semantic-conventions/src/opentelemetry/semconv/:/output \
otel/semconvgen:$OTEL_SEMCONV_GEN_IMG_VERSION \
-f /source code \
--template /templates/semantic_attributes.j2 \
--output /output/{{snake_prefix}}_attributes.py \
--file-per-group root_namespace \
-Dfilter=is_stable
Here we run the generation with the `filter` variable set to `is_stable`, which resolves to the `attributes_and_templates | select("is_stable")` expression.
It will apply the `is_stable` custom function to each attribute and collect only stable ones.
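As a plain-Python analogy of that resolution: the name passed with `-Dfilter=...` is looked up as a test, and the test is applied to each attribute. The `TESTS` registry and the dictionary-shaped attributes below are hypothetical stand-ins; the real tests operate on the tool's attribute objects inside Jinja.

```python
# Hypothetical registry mapping test names to predicates.
TESTS = {
    "is_stable": lambda attr: attr["stability"] == "stable",
    "is_experimental": lambda attr: attr["stability"] == "experimental",
}


def select(attributes, test_name):
    """Keep the attributes for which the named test returns True."""
    test = TESTS[test_name]
    return [attr for attr in attributes if test(attr)]


attributes = [
    {"fqn": "server.address", "stability": "stable"},
    {"fqn": "server.socket.domain", "stability": "experimental"},
]
stable_names = [attr["fqn"] for attr in select(attributes, "is_stable")]
```

Switching the test name to `is_experimental` selects the other attribute without any change to the template itself.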
We can also generate experimental attributes by changing the destination path and filter value:
docker run --rm \
-v ${SCRIPT_DIR}/semantic-conventions/model:/source \
-v ${SCRIPT_DIR}/templates:/templates \
-v ${ROOT_DIR}/opentelemetry-semantic-conventions/src/opentelemetry/semconv/:/output \
otel/semconvgen:$OTEL_SEMCONV_GEN_IMG_VERSION \
-f /source code \
--template /templates/semantic_attributes.j2 \
--output /output/experimental/{{snake_prefix}}_attributes.py \
--file-per-group root_namespace \
-Dfilter=is_experimental
Enum attribute members could be generated in the following way:
{%- for attribute in enum_attributes %}
{%- set class_name = attribute.fqn | to_camelcase(True) ~ "Values" %}
{%- set type = attribute.attr_type.enum_type %}
class {{class_name}}(Enum):
{%- for member in attribute.attr_type.members %}
{{ member.member_id | to_const_name }} = {{ attribute | print_member_value(member) }}
"""{{member.brief | to_doc_brief}}."""
{% endfor %}
{% endfor %}
resulting in an enum like this:
class NetworkTransportValues(Enum):
    TCP = "tcp"
    """TCP."""
    UDP = "udp"
    """UDP."""
    PIPE = "pipe"
    """Named or anonymous pipe."""
    UNIX = "unix"
    """Unix domain socket."""
In some cases you might want to skip certain namespaces. For example, JVM attribute and metric definitions might not be very useful in a Python application.
You can create a list of excluded namespaces and pass it to the template as a parameter (or hardcode it):
{%- if root_namespace not in ("jvm", "dotnet") %}
...
{%- endif %}
If the result of the rendering is an empty string, the code generator does not store it.
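The effect of that rule can be sketched in Python as follows. `store_rendered` is a hypothetical helper, and treating only the exact empty string as "empty" is an assumption taken from the sentence above.

```python
from pathlib import Path


def store_rendered(path, rendered):
    """Write rendered output to path; skip the file when nothing was rendered."""
    if rendered == "":
        # An excluded namespace renders to "", so no file is created for it.
        return False
    path.parent.mkdir(parents=True, exist_ok=True)
    path.write_text(rendered, encoding="utf-8")
    return True
```

Under this assumption, a template that wraps its whole body in the `if` block above and trims all surrounding whitespace produces no file for `jvm` or `dotnet`.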
You can generate metric names as constants, but could also generate method definitions that create instruments and populate name, description, and unit:
"""
Duration of HTTP client requests
"""
@staticmethod
def create_http_client_request_duration(meter: Meter) -> Histogram:
    return meter.create_histogram(
        name="http.client.request.duration",
        description="Duration of HTTP client requests.",
        unit="s",
    )
Since metric types (like `Histogram`) and factory methods (like `create_histogram`) depend on the language, it's necessary to define mappings in the template.
For example, this is a macro rendering the Python instrument type name based on the semantic convention type:
{%- macro to_python_instrument_type(instrument) -%}
{%- if instrument == "counter" -%}
Counter
{%- elif instrument == "histogram" -%}
Histogram
{%- elif instrument == "updowncounter" -%}
UpDownCounter
{%- elif instrument == "gauge" -%}
ObservableGauge
{%- endif -%}
{%- endmacro %}
We'd need a very similar one for the factory method.
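In a template this mapping would be written as another Jinja macro with the same `if`/`elif` structure. The Python sketch below only shows the mapping such a macro could encode, assuming its result is appended to `meter.create_`. The dictionary name `INSTRUMENT_FACTORY_SUFFIX` is hypothetical; the resulting method names (`create_counter`, `create_histogram`, `create_up_down_counter`, `create_observable_gauge`) are methods of the OpenTelemetry Python `Meter`.

```python
# Semantic convention instrument -> suffix of the Meter factory method.
INSTRUMENT_FACTORY_SUFFIX = {
    "counter": "counter",
    "histogram": "histogram",
    "updowncounter": "up_down_counter",
    "gauge": "observable_gauge",
}


def to_python_instrument_factory(instrument):
    """Return the text that follows 'meter.create_' for the given instrument."""
    return INSTRUMENT_FACTORY_SUFFIX[instrument]


factory_names = {
    instrument: "create_" + to_python_instrument_factory(instrument)
    for instrument in INSTRUMENT_FACTORY_SUFFIX
}
```

Keeping the type mapping and the factory mapping side by side makes it harder for the two to drift apart when a new instrument kind is added.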
This is the template that generates the metric definition above:
"""
{{metric.brief | to_doc_brief}}
"""
@staticmethod
{%- if metric.instrument == "gauge" %}
def create_{{ metric.metric_name | replace(".", "_") }}(meter: Meter, callback: Sequence[Callable]) -> {{to_python_instrument_type(metric.instrument)}}:
{%- else %}
def create_{{ metric.metric_name | replace(".", "_") }}(meter: Meter) -> {{to_python_instrument_type(metric.instrument)}}:
{%- endif %}
    return meter.create_{{to_python_instrument_factory(metric.instrument)}}(
        name="{{ metric.metric_name }}",
        {%- if metric.instrument == "gauge" %}
        callbacks=callback,
        {%- endif %}
        description="{{ metric.brief }}",
        unit="{{ metric.unit }}",
    )