Define a data model for span links #7171

Closed
axw opened this issue Feb 2, 2022 · 2 comments · Fixed by #7291

axw commented Feb 2, 2022

OpenTelemetry has a concept of "span links": a property of a span that holds a collection of span identifiers. This can be used, for example, when restarting a trace: in this case the incoming span context could be preserved in a span link. For the same reasons, we would also like to support span links in Elastic APM: see elastic/apm#286 (comment).

We need to define how span links should be stored in Elasticsearch. See elastic/apm#122 for some previous discussions on this.

@simitt simitt added this to the 8.2 milestone Feb 8, 2022

axw commented Feb 10, 2022

https://github.com/open-telemetry/opentelemetry-specification/blob/main/specification/trace/api.md#specifying-links says:

A Link is structurally defined by the following properties:

  • SpanContext of the Span to link to.
  • Zero or more Attributes further describing the link.

We definitely need the trace context: a trace ID and span ID. The span ID may correspond to either a transaction or span in our model. We could skip storing attributes for the moment, but leave room for introducing them later.

I propose we use an array-of-objects encoding, to minimise storage costs and simplify queries. Each object would have a trace ID and span ID, and maybe later some attributes/labels.

Naming-wise, I can see a couple of options:

  • extend the ECS related fields, e.g. related.spans
  • introduce a field under span, e.g. span.links

My current preference is for the latter. It's probably premature to try and extend the related fieldset. So if we go with span.links, we would have a mapping along these lines:

links:
  properties:
    trace:
      properties:
        id:
          type: keyword
    span:
      properties:
        id:
          type: keyword

e.g.

{
  "trace": {
    "id": "trace1"
  },
  "transaction": {
    "id": "transaction1",
    "name": "GET /foo"
  },
  "span": {
    "links": [
      {"trace": {"id": "trace1"}, "span": {"id": "span1"}},
      {"trace": {"id": "trace2"}, "span": {"id": "span2"}}
    ]
  }
}
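
As a rough illustration of the array-of-objects encoding, the following Python sketch builds the span.links value from trace ID and span ID pairs. The `SpanLink` class and `encode_links` helper are hypothetical names used only for illustration; they are not part of APM Server or any agent.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass(frozen=True)
class SpanLink:
    """Hypothetical in-memory form of a link: the linked span's trace context."""

    trace_id: str
    span_id: str  # may identify either a transaction or a span


def encode_links(links: List[SpanLink]) -> List[Dict[str, Dict[str, str]]]:
    """Encode links as the array of objects stored under span.links."""
    return [
        {"trace": {"id": link.trace_id}, "span": {"id": link.span_id}}
        for link in links
    ]


# Reproduces the two links from the example document above.
example_links = encode_links(
    [SpanLink("trace1", "span1"), SpanLink("trace2", "span2")]
)
```

Attributes are left out on purpose; if they were added later, each object could gain another key without changing the existing ones.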

Because Elasticsearch flattens arrays of objects into multi-valued fields, the pairing of trace ID and span ID within each link is lost, so we won't be able to sensibly search for links by both trace ID and span ID. For example, the document above has two links to spans in different traces. A search for span.links.trace.id:trace1 and span.links.span.id:span2 would match that document, even though no single link has that combination. The client (APM UI) will need to perform additional client-side filtering.
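
The false positive can be reproduced without Elasticsearch. The sketch below models flattening by collecting each leaf field of the link objects into a set of values, then shows the per-link check a client would need to apply afterwards. All function names are illustrative assumptions, and the set-based matching is only a simplified model of how term matching behaves on flattened object arrays, not actual Elasticsearch code.

```python
from typing import Dict, List, Set

Link = Dict[str, Dict[str, str]]


def flatten_links(links: List[Link]) -> Dict[str, Set[str]]:
    """Model flattening: each leaf field becomes a multi-valued field,
    so the pairing of trace ID and span ID within one link is lost."""
    return {
        "span.links.trace.id": {link["trace"]["id"] for link in links},
        "span.links.span.id": {link["span"]["id"] for link in links},
    }


def index_matches(links: List[Link], trace_id: str, span_id: str) -> bool:
    """Simplified model of searching both flattened fields at once."""
    flat = flatten_links(links)
    return (
        trace_id in flat["span.links.trace.id"]
        and span_id in flat["span.links.span.id"]
    )


def has_exact_link(links: List[Link], trace_id: str, span_id: str) -> bool:
    """Client-side filter: require both IDs to occur in the same link."""
    return any(
        link["trace"]["id"] == trace_id and link["span"]["id"] == span_id
        for link in links
    )


links = [
    {"trace": {"id": "trace1"}, "span": {"id": "span1"}},
    {"trace": {"id": "trace2"}, "span": {"id": "span2"}},
]

# The flattened model matches trace1 + span2 although no such link exists.
false_positive = index_matches(links, "trace1", "span2")
# The per-link check correctly rejects the combination.
exact = has_exact_link(links, "trace1", "span2")
```

Here `false_positive` is `True` while `exact` is `False`, which is the gap the client-side filtering has to close.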

axw commented Feb 11, 2022

> Due to the way arrays of objects are flattened, we won't be able to sensibly search for links by both trace ID and span ID. e.g. consider the example above, where there are two links to spans in different traces. If one were to search trace.id:trace1 and span.id:span2, the above example document would be matched. The client (APM UI) will need to perform additional client-side filtering.

I have confirmation from @cauemarcondes that this won't pose a problem for the UI, as it only ever queries by trace.id in the waterfall anyway, and then filters depending on the view (focused vs. full trace waterfall).
