Use short keys for OTLP/JSON
Resolves open-telemetry#412

This change sets short one- or two-letter keys for all fields when JSON
encoding is used. This makes the uncompressed payload about 1.3 to 1.6 times
smaller.

Here is a size comparison using some sample data from the exp-otelproto test bench:

```
===== Encoded sizes
Encoding                       Uncomp/json[Improved]   zlib/json[Improved]
OTLP 0.18/Logs                 52577 by   [1.000]      4601 by [1.000]
ShortKeys/Logs                 39668 by   [1.325]      4344 by [1.059]

Encoding                       Uncomp/json[Improved]   zlib/json[Improved]
OTLP 0.18/Trace/Attribs        41704 by   [1.000]      3189 by [1.000]
ShortKeys/Trace/Attribs        31998 by   [1.303]      3353 by [0.951]

Encoding                       Uncomp/json[Improved]   zlib/json[Improved]
OTLP 0.18/Trace/Events         49302 by   [1.000]      1917 by [1.000]
ShortKeys/Trace/Events         34396 by   [1.433]      2430 by [0.789]

Encoding                       Uncomp/json[Improved]   zlib/json[Improved]
OTLP 0.18/Metric/Histogram     42376 by   [1.000]      1067 by [1.000]
ShortKeys/Metric/Histogram     27071 by   [1.565]       839 by [1.272]

Encoding                       Uncomp/json[Improved]   zlib/json[Improved]
OTLP 0.18/Metric/MixOne       184836 by   [1.000]      2778 by [1.000]
ShortKeys/Metric/MixOne       119031 by   [1.553]      2143 by [1.296]

Encoding                       Uncomp/json[Improved]   zlib/json[Improved]
OTLP 0.18/Metric/MixSeries    707615 by   [1.000]     11482 by [1.000]
ShortKeys/Metric/MixSeries    457010 by   [1.548]      9829 by [1.168]
```
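
The `[Improved]` column is the baseline size divided by the new size, so values above 1 mean the short-key encoding is smaller. A minimal sketch of how such a comparison could be reproduced follows. The payloads `long_doc` and `short_doc` are hypothetical stand-ins, not the exp-otelproto sample data, and the helper names are illustrative only.

```python
import json
import zlib


def encoded_sizes(doc):
    """Return (uncompressed, zlib) sizes in bytes of doc encoded as compact JSON."""
    raw = json.dumps(doc, separators=(",", ":")).encode("utf-8")
    return len(raw), len(zlib.compress(raw))


def improvement(baseline_bytes, candidate_bytes):
    """Ratio as in the [Improved] column: above 1 means the candidate is smaller."""
    return round(baseline_bytes / candidate_bytes, 3)


# Hypothetical payloads: identical values, different key lengths.
long_doc = {
    "resourceSpans": [
        {"name": f"span-{i}", "startTimeUnixNano": str(i)} for i in range(200)
    ]
}
short_doc = {"s": [{"n": f"span-{i}", "s": str(i)} for i in range(200)]}

long_raw, long_zlib = encoded_sizes(long_doc)
short_raw, short_zlib = encoded_sizes(short_doc)
print("uncompressed:", improvement(long_raw, short_raw))
print("zlib:", improvement(long_zlib, short_zlib))

# Reproducing two entries from the table above.
print(improvement(52577, 39668))  # 1.325 (Logs, uncompressed)
print(improvement(1917, 2430))    # 0.789 (Trace/Events, zlib)
```

Note that the zlib ratio can fall below 1, as the two Trace rows in the table show, so the gain applies mainly to uncompressed payloads.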

Unfortunately, this is a breaking change for the default configuration of the
Protobuf/JSON marshaler, which marshals field names in lowerCamelCase.
It is not a breaking change for marshalers that use the "OrigName=true"
JSON marshaling option: their output does not change.
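
The rule described above can be modeled in a few lines. This is a simplified sketch of the key selection, not the real marshaler: the function names `lower_camel` and `json_key` are hypothetical, and the model assumes plain snake_case field names.

```python
def lower_camel(field_name):
    """Convert a snake_case proto field name to lowerCamelCase."""
    head, *rest = field_name.split("_")
    return head + "".join(part[:1].upper() + part[1:] for part in rest)


def json_key(field_name, json_name=None, orig_name=False):
    """Pick the JSON key for a field (simplified model).

    orig_name=True mirrors the "OrigName=true" option: the proto field name
    is used verbatim and any json_name override is ignored.
    """
    if orig_name:
        return field_name
    if json_name is not None:
        return json_name
    return lower_camel(field_name)


# Before this change (no json_name option on the field):
print(json_key("resource_spans"))                       # resourceSpans
# After this change, default marshaler:
print(json_key("resource_spans", json_name="s"))        # s
# After this change, OrigName=true: same output as before
print(json_key("resource_spans", json_name="s", orig_name=True))  # resource_spans
```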

I do not see an easy way to make this change gracefully. It would require
duplicating the entire proto, giving the duplicate messages different names,
and then handling the duplicates in the receivers. That is quite a lot of work
and is also error-prone. I think in this particular case we should
not attempt to handle it gracefully and should simply stick to our formal
stability guarantees, which for JSON are "Alpha" and allow any changes
at any time.

### Example Outputs

JSON output before this change, using the default lowerCamelCase marshaler:

```json
{
    "resourceSpans": [
        {
            "resource": {
                "attributes": [
                    {
                        "key": "StartTimeUnixnano",
                        "value": {
                            "intValue": "12345678"
                        }
                    },
                    {
                        "key": "Pid",
                        "value": {
                            "intValue": "1234"
                        }
                    },
                    {
                        "key": "HostName",
                        "value": {
                            "stringValue": "fakehost"
                        }
                    },
                    {
                        "key": "ServiceName",
                        "value": {
                            "stringValue": "generator"
                        }
                    }
                ]
            },
            "scopeSpans": [
                {
                    "scope": {
                        "name": "io.opentelemetry"
                    },
                    "spans": [
                        {
                            "traceId": "AQAAAAAAAADw3rwKeFY0Eg==",
                            "spanId": "AQAAAAAAAAA=",
                            "name": "load-generator-span",
                            "kind": "SPAN_KIND_CLIENT",
                            "startTimeUnixNano": "1572516672000000013",
                            "endTimeUnixNano": "1572516672000000013",
                            "attributes": [
                                {
                                    "key": "db.mongodb.collection",
                                    "value": {
                                        "stringValue": "!##$"
                                    }
                                }
                            ],
                            "events": [
                                {
                                    "timeUnixNano": "1572516672000000013",
                                    "attributes": [
                                        {
                                            "key": "te",
                                            "value": {
                                                "intValue": "1"
                                            }
                                        }
                                    ]
                                }
                            ]
                        }
                    ]
                }
            ]
        }
    ]
}
```

JSON output before and after this change, using the OrigName=true marshaler:

```json
{
    "resource_spans": [
        {
            "resource": {
                "attributes": [
                    {
                        "key": "StartTimeUnixnano",
                        "value": {
                            "int_value": "12345678"
                        }
                    },
                    {
                        "key": "Pid",
                        "value": {
                            "int_value": "1234"
                        }
                    },
                    {
                        "key": "HostName",
                        "value": {
                            "string_value": "fakehost"
                        }
                    },
                    {
                        "key": "ServiceName",
                        "value": {
                            "string_value": "generator"
                        }
                    }
                ]
            },
            "scope_spans": [
                {
                    "scope": {
                        "name": "io.opentelemetry"
                    },
                    "spans": [
                        {
                            "trace_id": "AQAAAAAAAADw3rwKeFY0Eg==",
                            "span_id": "AQAAAAAAAAA=",
                            "name": "load-generator-span",
                            "kind": "SPAN_KIND_CLIENT",
                            "start_time_unix_nano": "1572516672000000013",
                            "end_time_unix_nano": "1572516672000000013",
                            "attributes": [
                                {
                                    "key": "db.mongodb.collection",
                                    "value": {
                                        "string_value": "!##$"
                                    }
                                }
                            ],
                            "events": [
                                {
                                    "time_unix_nano": "1572516672000000013",
                                    "attributes": [
                                        {
                                            "key": "te",
                                            "value": {
                                                "int_value": "1"
                                            }
                                        }
                                    ]
                                }
                            ]
                        }
                    ]
                }
            ]
        }
    ]
}
```

JSON output after this change, using the proposed short keys and the default
lowerCamelCase marshaler:

```json
{
    "s": [
        {
            "r": {
                "a": [
                    {
                        "k": "StartTimeUnixnano",
                        "v": {
                            "i": "12345678"
                        }
                    },
                    {
                        "k": "Pid",
                        "v": {
                            "i": "1234"
                        }
                    },
                    {
                        "k": "HostName",
                        "v": {
                            "s": "fakehost"
                        }
                    },
                    {
                        "k": "ServiceName",
                        "v": {
                            "s": "generator"
                        }
                    }
                ]
            },
            "s": [
                {
                    "i": {
                        "n": "io.opentelemetry"
                    },
                    "s": [
                        {
                            "ti": "AQAAAAAAAADw3rwKeFY0Eg==",
                            "si": "AQAAAAAAAAA=",
                            "n": "load-generator-span",
                            "k": "SPAN_KIND_CLIENT",
                            "s": "1572516672000000013",
                            "e": "1572516672000000013",
                            "a": [
                                {
                                    "k": "db.mongodb.collection",
                                    "v": {
                                        "s": "!##$"
                                    }
                                }
                            ],
                            "ev": [
                                {
                                    "t": "1572516672000000013",
                                    "a": [
                                        {
                                            "k": "te",
                                            "v": {
                                                "i": "1"
                                            }
                                        }
                                    ]
                                }
                            ]
                        }
                    ]
                }
            ]
        }
    ]
}
```
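
In the short-key output the same key means different things in different messages: `"s"` is `resourceSpans` at the top level, `scopeSpans` inside a resource entry, `spans` inside a scope entry, `startTimeUnixNano` inside a span, and `stringValue` inside a value. A reader that wants the long names back therefore has to track which message it is in. The sketch below illustrates this with a hypothetical `SCHEMA` table inferred only from the example outputs above. It is incomplete, and the message names `ScopeSpans`, `Span`, and `Event` are assumed.

```python
# Per-message map: short key -> (long lowerCamelCase key, nested message or None).
# Inferred from the example outputs only; hypothetical and incomplete.
SCHEMA = {
    "ExportTraceServiceRequest": {"s": ("resourceSpans", "ResourceSpans")},
    "ResourceSpans": {
        "r": ("resource", "Resource"),
        "s": ("scopeSpans", "ScopeSpans"),
    },
    "Resource": {"a": ("attributes", "KeyValue")},
    "ScopeSpans": {
        "i": ("scope", "InstrumentationScope"),
        "s": ("spans", "Span"),
    },
    "InstrumentationScope": {"n": ("name", None)},
    "Span": {
        "ti": ("traceId", None),
        "si": ("spanId", None),
        "n": ("name", None),
        "k": ("kind", None),
        "s": ("startTimeUnixNano", None),
        "e": ("endTimeUnixNano", None),
        "a": ("attributes", "KeyValue"),
        "ev": ("events", "Event"),
    },
    "Event": {"t": ("timeUnixNano", None), "a": ("attributes", "KeyValue")},
    "KeyValue": {"k": ("key", None), "v": ("value", "AnyValue")},
    "AnyValue": {"s": ("stringValue", None), "i": ("intValue", None)},
}


def expand(node, message):
    """Recursively rename short keys to long keys, guided by the message type."""
    if isinstance(node, list):
        return [expand(item, message) for item in node]
    if not isinstance(node, dict) or message is None:
        return node
    out = {}
    for short, value in node.items():
        # Unknown keys are passed through unchanged.
        long_key, child = SCHEMA[message].get(short, (short, None))
        out[long_key] = expand(value, child)
    return out


short_doc = {
    "s": [
        {
            "r": {"a": [{"k": "Pid", "v": {"i": "1234"}}]},
            "s": [
                {
                    "i": {"n": "io.opentelemetry"},
                    "s": [{"n": "load-generator-span", "s": "1572516672000000013"}],
                }
            ],
        }
    ]
}
expanded = expand(short_doc, "ExportTraceServiceRequest")
print(expanded)
```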
tigrannajaryan committed Jul 5, 2022
1 parent 95cf8f4 commit d74b93f
Showing 9 changed files with 143 additions and 142 deletions.
1 change: 1 addition & 0 deletions CHANGELOG.md
@@ -18,6 +18,7 @@ Full list of differences found in [this compare](https://github.com/open-telemet
InstrumentationLibraryMetrics messages. Delete deprecated
instrumentation_library_logs, instrumentation_library_spans and
instrumentation_library_metrics fields.
* :stop_sign: [BREAKING] Use short keys for OTLP/JSON.

### Added

2 changes: 1 addition & 1 deletion opentelemetry/proto/collector/logs/v1/logs_service.proto
@@ -39,7 +39,7 @@ message ExportLogsServiceRequest {
// element. Intermediary nodes (such as OpenTelemetry Collector) that receive
// data from multiple origins typically batch the data before forwarding further and
// in that case this array will contain multiple elements.
repeated opentelemetry.proto.logs.v1.ResourceLogs resource_logs = 1;
repeated opentelemetry.proto.logs.v1.ResourceLogs resource_logs = 1 [json_name="l"];
}

message ExportLogsServiceResponse {
@@ -39,7 +39,7 @@ message ExportMetricsServiceRequest {
// element. Intermediary nodes (such as OpenTelemetry Collector) that receive
// data from multiple origins typically batch the data before forwarding further and
// in that case this array will contain multiple elements.
repeated opentelemetry.proto.metrics.v1.ResourceMetrics resource_metrics = 1;
repeated opentelemetry.proto.metrics.v1.ResourceMetrics resource_metrics = 1 [json_name="m"];
}

message ExportMetricsServiceResponse {
2 changes: 1 addition & 1 deletion opentelemetry/proto/collector/trace/v1/trace_service.proto
@@ -39,7 +39,7 @@ message ExportTraceServiceRequest {
// element. Intermediary nodes (such as OpenTelemetry Collector) that receive
// data from multiple origins typically batch the data before forwarding further and
// in that case this array will contain multiple elements.
repeated opentelemetry.proto.trace.v1.ResourceSpans resource_spans = 1;
repeated opentelemetry.proto.trace.v1.ResourceSpans resource_spans = 1 [json_name="s"];
}

message ExportTraceServiceResponse {
30 changes: 15 additions & 15 deletions opentelemetry/proto/common/v1/common.proto
@@ -29,21 +29,21 @@ message AnyValue {
// The value is one of the listed fields. It is valid for all values to be unspecified
// in which case this AnyValue is considered to be "empty".
oneof value {
string string_value = 1;
bool bool_value = 2;
int64 int_value = 3;
double double_value = 4;
ArrayValue array_value = 5;
KeyValueList kvlist_value = 6;
bytes bytes_value = 7;
string string_value = 1 [json_name="s"];
bool bool_value = 2 [json_name="b"];
int64 int_value = 3 [json_name="i"];
double double_value = 4 [json_name="d"];
ArrayValue array_value = 5 [json_name="a"];
KeyValueList kvlist_value = 6 [json_name="l"];
bytes bytes_value = 7 [json_name="by"];
}
}

// ArrayValue is a list of AnyValue messages. We need ArrayValue as a message
// since oneof in AnyValue does not allow repeated fields.
message ArrayValue {
// Array of values. The array may be empty (contain 0 elements).
repeated AnyValue values = 1;
repeated AnyValue values = 1 [json_name="v"];
}

// KeyValueList is a list of KeyValue messages. We need KeyValueList as a message
@@ -56,22 +56,22 @@ message KeyValueList {
// contain 0 elements).
// The keys MUST be unique (it is not allowed to have more than one
// value with the same key).
repeated KeyValue values = 1;
repeated KeyValue values = 1 [json_name="v"];
}

// KeyValue is a key-value pair that is used to store Span attributes, Link
// attributes, etc.
message KeyValue {
string key = 1;
AnyValue value = 2;
string key = 1 [json_name="k"];
AnyValue value = 2 [json_name="v"];
}

// InstrumentationScope is a message representing the instrumentation scope information
// such as the fully qualified name and version.
message InstrumentationScope {
// An empty instrumentation scope name means the name is unknown.
string name = 1;
string version = 2;
repeated KeyValue attributes = 3;
uint32 dropped_attributes_count = 4;
string name = 1 [json_name="n"];
string version = 2 [json_name="v"];
repeated KeyValue attributes = 3 [json_name="a"];
uint32 dropped_attributes_count = 4 [json_name="da"];
}
34 changes: 17 additions & 17 deletions opentelemetry/proto/logs/v1/logs.proto
@@ -41,7 +41,7 @@ message LogsData {
// one element. Intermediary nodes that receive data from multiple origins
// typically batch the data before forwarding further and in that case this
// array will contain multiple elements.
repeated ResourceLogs resource_logs = 1;
repeated ResourceLogs resource_logs = 1 [json_name="l"];
}

// A collection of ScopeLogs from a Resource.
@@ -50,28 +50,28 @@ message ResourceLogs {

// The resource for the logs in this message.
// If this field is not set then resource info is unknown.
opentelemetry.proto.resource.v1.Resource resource = 1;
opentelemetry.proto.resource.v1.Resource resource = 1 [json_name="r"];

// A list of ScopeLogs that originate from a resource.
repeated ScopeLogs scope_logs = 2;
repeated ScopeLogs scope_logs = 2 [json_name="l"];

// This schema_url applies to the data in the "resource" field. It does not apply
// to the data in the "scope_logs" field which have their own schema_url field.
string schema_url = 3;
string schema_url = 3 [json_name="u"];
}

// A collection of Logs produced by a Scope.
message ScopeLogs {
// The instrumentation scope information for the logs in this message.
// Semantically when InstrumentationScope isn't set, it is equivalent with
// an empty instrumentation scope name (unknown).
opentelemetry.proto.common.v1.InstrumentationScope scope = 1;
opentelemetry.proto.common.v1.InstrumentationScope scope = 1 [json_name="i"];

// A list of log records.
repeated LogRecord log_records = 2;
repeated LogRecord log_records = 2 [json_name="l"];

// This schema_url applies to all logs in the "logs" field.
string schema_url = 3;
string schema_url = 3 [json_name="u"];
}

// Possible values for LogRecord.SeverityNumber.
@@ -118,7 +118,7 @@ message LogRecord {
// time_unix_nano is the time when the event occurred.
// Value is UNIX Epoch time in nanoseconds since 00:00:00 UTC on 1 January 1970.
// Value of 0 indicates unknown or missing timestamp.
fixed64 time_unix_nano = 1;
fixed64 time_unix_nano = 1 [json_name="t"];

// Time when the event was observed by the collection system.
// For events that originate in OpenTelemetry (e.g. using OpenTelemetry Logging SDK)
@@ -135,43 +135,43 @@ message LogRecord {
//
// Value is UNIX Epoch time in nanoseconds since 00:00:00 UTC on 1 January 1970.
// Value of 0 indicates unknown or missing timestamp.
fixed64 observed_time_unix_nano = 11;
fixed64 observed_time_unix_nano = 11 [json_name="o"];

// Numerical value of the severity, normalized to values described in Log Data Model.
// [Optional].
SeverityNumber severity_number = 2;
SeverityNumber severity_number = 2 [json_name="sn"];

// The severity text (also known as log level). The original string representation as
// it is known at the source. [Optional].
string severity_text = 3;
string severity_text = 3 [json_name="st"];

// A value containing the body of the log record. Can be for example a human-readable
// string message (including multi-line) describing the event in a free form or it can
// be a structured data composed of arrays and maps of other values. [Optional].
opentelemetry.proto.common.v1.AnyValue body = 5;
opentelemetry.proto.common.v1.AnyValue body = 5 [json_name="b"];

// Additional attributes that describe the specific event occurrence. [Optional].
// Attribute keys MUST be unique (it is not allowed to have more than one
// attribute with the same key).
repeated opentelemetry.proto.common.v1.KeyValue attributes = 6;
uint32 dropped_attributes_count = 7;
repeated opentelemetry.proto.common.v1.KeyValue attributes = 6 [json_name="a"];
uint32 dropped_attributes_count = 7 [json_name="da"];

// Flags, a bit field. 8 least significant bits are the trace flags as
// defined in W3C Trace Context specification. 24 most significant bits are reserved
// and must be set to 0. Readers must not assume that 24 most significant bits
// will be zero and must correctly mask the bits when reading 8-bit trace flag (use
// flags & TRACE_FLAGS_MASK). [Optional].
fixed32 flags = 8;
fixed32 flags = 8 [json_name="f"];

// A unique identifier for a trace. All logs from the same trace share
// the same `trace_id`. The ID is a 16-byte array. An ID with all zeroes
// is considered invalid. Can be set for logs that are part of request processing
// and have an assigned trace id. [Optional].
bytes trace_id = 9;
bytes trace_id = 9 [json_name="ti"];

// A unique identifier for a span within a trace, assigned when the span
// is created. The ID is an 8-byte array. An ID with all zeroes is considered
// invalid. Can be set for logs that are part of a particular processing span.
// If span_id is present trace_id SHOULD be also present. [Optional].
bytes span_id = 10;
bytes span_id = 10 [json_name="si"];
}