This repository has been archived by the owner on Dec 6, 2024. It is now read-only.

Proposal: Resource Scope and Namespace API #78

Closed
jmacd opened this issue Jan 11, 2020 · 6 comments
Labels
Future Discussions Things to look at for future improvements release:after-ga Not required before GA release, and not going to work on before GA

Comments

@jmacd
Contributor

jmacd commented Jan 11, 2020

Resource Scope and Namespace API

Resources

"Resources" is a term for properties of things such as the environment, process names and identifiers, and shard numbers: things that are usually statically known within a library of code.

In current terms, we call such properties "attributes" on a span or span event, "labels" on a metric, and "correlations" in the distributed context. Resources are the implicit properties that become span attributes and metric labels when and where they are used. This is a generally accepted idea for process-wide properties, which may be initialized with the SDK and are not an absolute requirement in the API.

Resources are not included in the current OpenTelemetry APIs (they were removed in the v0.2 release).

Namespaces

Namespace refers to a qualifier on names used in the OpenTelemetry API. Both spans and metric instruments are named entities, and there is concern that unrelated code may use the same name, so namespaces are introduced. Names are only considered identical when their namespaces match. Namespace is a property of the exported span and metric instrument.

The metric API recommends that Meter implementations SHOULD generate errors when metric instruments are registered with the same name and different kind. For this to work reliably, the API should support a namespace. Most existing metric APIs include a namespace concept, so this is probably a requirement for OpenTelemetry v1.0.
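As a rough illustration of why a namespace makes the "same name, different kind" check reliable, here is a minimal sketch. The `Registry`, `Kind`, and `Register` names are hypothetical and are not part of any OpenTelemetry API; the sketch only assumes that an SDK keeps some map of registered instruments.

```go
package main

import (
	"errors"
	"fmt"
)

// Kind identifies the type of a metric instrument (hypothetical).
type Kind int

const (
	CounterKind Kind = iota
	GaugeKind
)

// qualifiedName pairs a namespace with an instrument name. Two names are
// identical only when both fields match.
type qualifiedName struct {
	Namespace string
	Name      string
}

// ErrKindConflict is returned when a name is re-registered with another kind.
var ErrKindConflict = errors.New("instrument already registered with a different kind")

// Registry stands in for the bookkeeping a Meter SDK might do.
type Registry struct {
	kinds map[qualifiedName]Kind
}

func NewRegistry() *Registry {
	return &Registry{kinds: make(map[qualifiedName]Kind)}
}

// Register records the instrument, or fails if the same namespace and name
// were previously registered with a different kind.
func (r *Registry) Register(namespace, name string, kind Kind) error {
	key := qualifiedName{Namespace: namespace, Name: name}
	if prev, ok := r.kinds[key]; ok && prev != kind {
		return fmt.Errorf("%s/%s: %w", namespace, name, ErrKindConflict)
	}
	r.kinds[key] = kind
	return nil
}

func main() {
	r := NewRegistry()
	fmt.Println(r.Register("pkgA", "requests", CounterKind)) // no conflict
	fmt.Println(r.Register("pkgB", "requests", GaugeKind))   // other namespace, no conflict
	fmt.Println(r.Register("pkgA", "requests", GaugeKind))   // same namespace, different kind
}
```

Without the namespace field in the key, the second registration would be reported as a conflict even though the two packages are unrelated.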

Status of this issue

This issue is filed alongside #68 and #73, to raise a complex issue for discussion and consideration. There is also a relationship between these issues, discussed below. This is not meant as a proposal to be incorporated into the v1.0 OpenTelemetry API.

Detailed discussion

Across the OpenTelemetry APIs, every method is defined in association with a "current" distributed context. This proposal introduces the notion of a "current" static context. These contexts are practically identical, only they are used differently.

Distributed context is passed from call to call dynamically. Static context is organized by units of code. We sometimes refer to the unit of code as "libraries", "components", or "modules". Concretely, this proposal refers to this concept as a Resource Scope.

The "current" Resource Scope determines the following properties of the static OpenTelemetry context:

  1. Tracer SDK: an implementation of the trace.Tracer API
  2. Meter SDK: an implementation of the metric.Meter API
  3. Namespace: the namespace of any new Span or Metric instrument
  4. Resources: implicit properties associated with Metric events.

To understand why the current Scope's resources only apply to Metric events, it is important to recognize that Spans are scopes of their own. Spans start with a set of attributes that serve, in this proposal, as a new Resource Scope. Span events happen in their own scope, whereas metric events happen in the context of another scope.

Scope type

The Scope type supports accessors that return the Tracer and Meter API for use. In this proposal, all Tracer and Meter API functionality is accessed via a Scope, the contract being that when these API functions are called, the Scope's namespace and resources take effect.
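The contract described above could look roughly like the following. This is a simplified sketch, not the prototype's actual type: `KeyValue`, `QualifiedName`, and `SpanName` are invented for illustration, and only the `WithNamespace`, `WithResources`, `Tracer`, and `Meter` method names echo the prototype code shown later in this issue.

```go
package main

import "fmt"

// KeyValue is a single resource, attribute, or label (hypothetical type).
type KeyValue struct{ Key, Value string }

// Scope is an immutable bundle of namespace and resources.
type Scope struct {
	namespace string
	resources []KeyValue
}

// WithNamespace returns a copy of the scope with a different namespace.
func (s Scope) WithNamespace(ns string) Scope {
	s.namespace = ns
	return s
}

// WithResources returns a copy of the scope with extra resources appended.
// The receiver's slice is never modified.
func (s Scope) WithResources(kvs ...KeyValue) Scope {
	merged := make([]KeyValue, 0, len(s.resources)+len(kvs))
	merged = append(merged, s.resources...)
	merged = append(merged, kvs...)
	s.resources = merged
	return s
}

// Tracer and Meter are handles bound to the scope that produced them.
type Tracer struct{ scope Scope }
type Meter struct{ scope Scope }

func (s Scope) Tracer() Tracer { return Tracer{scope: s} }
func (s Scope) Meter() Meter   { return Meter{scope: s} }

// QualifiedName shows the namespace taking effect when an instrument is named.
func (m Meter) QualifiedName(name string) string {
	return m.scope.namespace + "/" + name
}

// SpanName shows the namespace taking effect when a span is named.
func (t Tracer) SpanName(name string) string {
	return t.scope.namespace + "/" + name
}

func main() {
	base := Scope{}.WithNamespace("example").WithResources(KeyValue{"shard", "7"})
	child := base.WithNamespace("somepackage")

	fmt.Println(base.Meter().QualifiedName("instrument1"))
	fmt.Println(child.Meter().QualifiedName("instrument1"))
	fmt.Println(child.Tracer().SpanName("fetch"))
}
```

Making each `With...` method return a copy means a library can narrow its own scope without affecting the scope it was handed.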

Tracer relationship

The Tracer API supports only a Start method. When starting a span, the resources associated with the Tracer's Scope are included in the span's attributes. The Scope's namespace is associated with the new span.

The Span interface returned by Start is considered a scope of its own. Events recorded through the Span interface do not implicitly take on static properties from the current Resource Scope.
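A small sketch of this relationship, under stated assumptions: the `Scope` and `Span` structs are hypothetical, `Start` is hung directly on the scope for brevity, and letting explicit attributes win over resources on a key collision is a choice made here, not something the proposal specifies.

```go
package main

import "fmt"

// Scope is a hypothetical static context: a namespace plus resources.
type Scope struct {
	Namespace string
	Resources map[string]string
}

// Span carries its own attributes and therefore acts as its own scope.
type Span struct {
	Namespace  string
	Name       string
	Attributes map[string]string
	Events     []string
}

// Start creates a span whose attributes begin as a copy of the scope's
// resources. Explicit attributes win on key collisions (an assumption).
func (s Scope) Start(name string, attrs map[string]string) *Span {
	merged := make(map[string]string, len(s.Resources)+len(attrs))
	for k, v := range s.Resources {
		merged[k] = v
	}
	for k, v := range attrs {
		merged[k] = v
	}
	return &Span{Namespace: s.Namespace, Name: name, Attributes: merged}
}

// AddEvent records an event in the span's own scope; it does not consult
// any other Resource Scope.
func (sp *Span) AddEvent(msg string) {
	sp.Events = append(sp.Events, msg)
}

func main() {
	sc := Scope{Namespace: "example", Resources: map[string]string{"shard": "7"}}
	span := sc.Start("fetch", map[string]string{"peer": "db"})

	// Changing the scope afterwards does not reach into the started span.
	sc.Resources["shard"] = "8"
	span.AddEvent("cache miss")

	fmt.Println(span.Namespace, span.Attributes["shard"], span.Attributes["peer"], len(span.Events))
}
```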

Meter relationship

The Meter API supports New constructors for each kind of instrument. When creating metric instruments, the namespace associated with the Meter's Scope is used to disambiguate the new instrument's name.

The Meter API supports a RecordBatch function that reports multiple metric events. When recording a batch of measurements, the resources associated with the Meter's Scope are included in the metric event's labels.

Metric instruments and bound instruments are not considered scopes the way spans are. Metric events take on static properties from the current Resource Scope when they are called.

Global Scope provider

This proposal replaces the independent global Tracer and Meter singletons with a single global Scope. The global scope will be used as a default whenever there is no "current" scope otherwise defined.

The global Scope is used as the default "current" Resource Scope, allowing process-wide resources to be set through the API. This proposal recommends #74, i.e., that the global scope only be initialized once.
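One way to get "initialize once, fall back when no current scope is set" is sketched below. The function names `SetGlobal`, `InContext`, and `Current` are illustrative (the prototype's own spellings appear later in this issue), and the use of `atomic.Pointer` assumes Go 1.19 or newer.

```go
package main

import (
	"context"
	"fmt"
	"sync/atomic"
)

// Scope is reduced to a namespace for this sketch.
type Scope struct{ Namespace string }

// scopeKey is the private context key under which a Scope is stored.
type scopeKey struct{}

// global holds the process-wide scope; nil until SetGlobal succeeds.
var global atomic.Pointer[Scope]

// SetGlobal installs the process-wide scope. Only the first call takes
// effect; later calls return false and leave the global scope unchanged.
func SetGlobal(s Scope) bool {
	return global.CompareAndSwap(nil, &s)
}

// InContext returns a context in which s is the "current" scope.
func InContext(ctx context.Context, s Scope) context.Context {
	return context.WithValue(ctx, scopeKey{}, s)
}

// Current returns the scope stored in ctx, else the global scope, else
// an empty scope.
func Current(ctx context.Context) Scope {
	if s, ok := ctx.Value(scopeKey{}).(Scope); ok {
		return s
	}
	if g := global.Load(); g != nil {
		return *g
	}
	return Scope{}
}

// demo exercises the three functions and reports what happened.
func demo() (first, second bool, fallback, explicit string) {
	first = SetGlobal(Scope{Namespace: "example"})
	second = SetGlobal(Scope{Namespace: "ignored"})
	ctx := context.Background()
	fallback = Current(ctx).Namespace
	explicit = Current(InContext(ctx, Scope{Namespace: "somepackage"})).Namespace
	return first, second, fallback, explicit
}

func main() {
	first, second, fallback, explicit := demo()
	fmt.Println(first, second, fallback, explicit)
}
```

The compare-and-swap makes the second `SetGlobal` call a harmless no-op rather than a silent replacement, which is the behavior the reference to #74 asks for.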

Changes to existing APIs

The new functionality is nearly independent of existing Trace, Metric, and Context Propagation APIs. The Tracer API is unchanged in this proposal. Context propagation is completely independent of static resource scope.

The Meter API is simplified by this proposal. The metric LabelSet API moves into the Resource Scope and disappears from the metric API. In each of the metric calling conventions, where the former call accepted a LabelSet, the replacement in this proposal takes a list of additional labels (called "call site" labels). The current resource scope is combined with the call site labels to generate a call-site resource scope.
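The combination step might be sketched as follows. The `Label` type and `callSiteScope` function are hypothetical, and two details are assumptions of this sketch rather than part of the proposal: call-site labels override scope resources on duplicate keys, and the result is sorted by key so that equal label sets compare equal.

```go
package main

import (
	"fmt"
	"sort"
)

// Label is a key/value pair attached to a metric event (hypothetical type).
type Label struct{ Key, Value string }

// callSiteScope combines the current scope's resources with the labels
// given at the call site and returns them sorted by key.
func callSiteScope(resources, callSite []Label) []Label {
	byKey := make(map[string]string, len(resources)+len(callSite))
	for _, l := range resources {
		byKey[l.Key] = l.Value
	}
	for _, l := range callSite {
		byKey[l.Key] = l.Value // call-site value wins (assumption)
	}
	out := make([]Label, 0, len(byKey))
	for k, v := range byKey {
		out = append(out, Label{Key: k, Value: v})
	}
	sort.Slice(out, func(i, j int) bool { return out[i].Key < out[j].Key })
	return out
}

func main() {
	resources := []Label{{"service", "ABC"}, {"zone", "Q"}}
	fmt.Println(callSiteScope(resources, []Label{{"code", "200"}}))
	fmt.Println(callSiteScope(resources, nil))
}
```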

Benefits of using Scopes

Using the Scope API as proposed here ensures that it is easy for developers to coordinate their span attributes with their metric labels. Whereas before the developer was responsible for computing metric label sets, these variables can now be placed into current resource scope and used implicitly.

Metrics that happen in the context of a Span will automatically include the span's attributes in their LabelSet, by setting the Span as the Resource Scope. This addresses a topic raised in open-telemetry/opentelemetry-specification#381 about making the Tracer and Meter APIs more "aware" of each other.

If OpenTelemetry adds a logging interface, the current resource scope would implicitly apply to log events.

Relationship with Context Propagation

Context propagation happens independently of the current resource scope. In the context of #66, the Scope type here determines the current set of Propagators. The global Scope determines the global propagators.

Relationship with Span

It is unclear what the relationship between Scope and Span should be. Should the Scope's Span have been started by the Scope's Tracer? Must it have been? There are subtle implications. What resource scope should the Span's Tracer() method return?

Followed to its logical conclusion, the proposal here would replace the Span.Tracer() accessor by a Scope() accessor, and the former functionality would be accessed through Span.Scope().Tracer(). This would make it easy to explicitly switch into a Span's scope.
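To make the accessor chain concrete, here is a hypothetical sketch in which a span exposes `Scope()` instead of `Tracer()`. All types are invented for illustration; `EventLabels` merely reports which labels a metric event recorded through that meter would carry.

```go
package main

import "fmt"

// Scope is a hypothetical static context: a namespace plus resources.
type Scope struct {
	Namespace string
	Resources map[string]string
}

// WithResources returns a new scope with extra resources layered on top.
func (s Scope) WithResources(extra map[string]string) Scope {
	merged := make(map[string]string, len(s.Resources)+len(extra))
	for k, v := range s.Resources {
		merged[k] = v
	}
	for k, v := range extra {
		merged[k] = v
	}
	return Scope{Namespace: s.Namespace, Resources: merged}
}

type Tracer struct{ scope Scope }
type Meter struct{ scope Scope }

func (s Scope) Tracer() Tracer { return Tracer{scope: s} }
func (s Scope) Meter() Meter   { return Meter{scope: s} }

// Span holds the scope formed from its tracer's scope plus its attributes.
type Span struct{ scope Scope }

// Start begins a span; the span's attributes become part of its scope.
func (t Tracer) Start(attrs map[string]string) *Span {
	return &Span{scope: t.scope.WithResources(attrs)}
}

// Scope takes the place of a Span.Tracer() accessor.
func (sp *Span) Scope() Scope { return sp.scope }

// EventLabels returns the labels a metric event recorded through m would carry.
func (m Meter) EventLabels() map[string]string { return m.scope.Resources }

func main() {
	root := Scope{Namespace: "example", Resources: map[string]string{"shard": "7"}}
	span := root.Tracer().Start(map[string]string{"peer": "db"})

	// Metrics recorded in the span's scope pick up the span's attributes.
	labels := span.Scope().Meter().EventLabels()
	fmt.Println(labels["shard"], labels["peer"])

	// The former Span.Tracer() is reached through the scope.
	child := span.Scope().Tracer().Start(nil)
	fmt.Println(child.Scope().Resources["peer"])
}
```

In this shape, child spans started through `span.Scope().Tracer()` inherit the parent span's attributes, which is one of the "subtle implications" the questions above point at.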

Relationship with "Named" Tracers and Meters

The "Named Tracer" and "Named Meter" proposal #76 overlaps with this topic because it sounds appealingly similar to a namespace, and it is also an implied resource (i.e., the reporting library).

This proposal does not directly address the topic of that proposal, however. The question behind "Named" Tracers and Meters is whether the developer is obligated to provide a name when obtaining a new Tracer or Meter. The same can be done in this proposal by preventing the construction of a Scope without providing a name.
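A minimal sketch of an obligatory name, assuming a hypothetical `NewScope` constructor. Note that Go cannot fully prevent a zero-value `Scope{}` from being declared, so an implementation along these lines would probably also need to treat an unnamed scope as a no-op.

```go
package main

import (
	"errors"
	"fmt"
	"strings"
)

// ErrMissingName is returned when a scope is requested without a name.
var ErrMissingName = errors.New("scope: a non-empty name is required")

// Scope keeps its name unexported so that NewScope is the only way for
// other packages to obtain a named scope.
type Scope struct{ name string }

// NewScope rejects empty or whitespace-only names.
func NewScope(name string) (Scope, error) {
	if strings.TrimSpace(name) == "" {
		return Scope{}, ErrMissingName
	}
	return Scope{name: name}, nil
}

// Name returns the name the scope was constructed with.
func (s Scope) Name() string { return s.name }

func main() {
	if _, err := NewScope(""); err != nil {
		fmt.Println("rejected:", err)
	}
	s, _ := NewScope("somepackage")
	fmt.Println("accepted:", s.Name())
}
```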

Prototype code

This proposal has been prototyped in the Golang repository. See open-telemetry/opentelemetry-go#427. The new SDK initialization sequence looks like:

// initTelemetry initializes the global OpenTelemetry SDK and returns
// a function to shut it down before exiting the process.
func initTelemetry() func() {
	tracer := initTracer()
	meter := initMeter()
	global.SetScope(
		scope.WithTracerSDK(tracer.Get()).
			WithMeterSDK(meter.Get()).
			WithNamespace("example").
			AddResources(
				key.String("process1", "value1"),
				key.String("process2", "value2"),
			),
	)
	return func() {
		tracer.Stop()
		meter.Stop()
	}
}

To inject resources into a module of instrumented code, for example:

   // All instrumentation from this client is tagged by "shard=...".
   ctx := global.Scope().WithResources(key.String("shard", ...)).InContext(context.Background())
   someClient := somepackage.NewClient(ctx)

To attach a namespace to a group of metric instruments:

func NewClient(ctx context.Context) *Client {
   // All metrics used in this code are namespaced by "somepackage".
   scope := scope.Current(ctx).WithNamespace("somepackage")
   meter := scope.Meter()
   client := &Client{
      instrument1: meter.NewCounter("instrument1"),
      instrument2: meter.NewCounter("instrument2"),
   }
   return client
}

Points of interest:

  1. See the above example/basic/main.go for more context
  2. An example of four ways to generate equivalent metric events with different uses of Scope
  3. The new Scope type
  4. The new current Scope machinery
  5. The label.Set type is now concrete
@dyladan
Member

dyladan commented Jan 14, 2020

Thanks for writing this up @jmacd. The quick run-through in the spec meeting last week was a little too much for me to follow in real-time so I appreciate having a full writeup. I do have some concerns over the general complexity of this, especially in the simple/average use case. I think this could be easily misunderstood and misused by SIG maintainers and end-users alike.

In addition, I have some more specific questions:

Would it be fair to say that this is a fancy way to add attributes and labels to metrics and spans automatically, based on their current scope? Does it provide anything else?

What is meant by "preventing the construction of a Scope providing a name"? Was this meant to say "preventing the construction of a Scope without providing a name"?

Is this meant to live alongside named tracers/meters, or to replace it? If replace, then how would it solve the core use case where a disabled instrumentation library should get a no-op tracer?

This is not meant as a proposal to be incorporated into the v1.0 OpenTelemetry API.

Does that mean you are in favor of moving forward with the current named tracers and this would be an addition on top of it later?

@jmacd
Contributor Author

jmacd commented Jan 17, 2020

Would it be fair to say that this is a fancy way to add attributes and labels to metrics and spans automatically, based on their current scope? Does it provide anything else?

It also incorporates the notion of a namespace. Although I think namespace can be separated from the tracer/meter object (e.g., it could be attached to the name instead), I prefer it be implied by the tracer/meter.

It has the notion that a single Scope manages a reference to both a Tracer and a Meter at once. It makes it the default that these interfaces will be coordinated with the same resources and library name. I worry that we've created interfaces that are too separate. Tying the resources to the scope ensures they are coordinated across tracer and meter.

Was this meant to say "preventing the construction of a Scope without providing a name"?

Yes. (Updated.)

Is this meant to live alongside named tracers/meters, or to replace it? If replace, then how would it solve the core use case where a disabled instrumentation library should get a no-op tracer?

I am not sure I would say "replace". I was trying to separate the matter into separate discussions, as in:
(1) If tracer/meters had resources baked in, we could use resources to determine library name
(2) If we used resources to determine library name, we might still require an obligatory name be provided.

I see (2) as the fundamental concern, but you asked how to solve the use-case. If the resources are stored as a map[Key]Value, then any time a tracer/meter is used you can perform a lookup in a map of disabled libraries (i.e., LibraryConfig[Resources[LibraryNameKey]].IsDisabled). This can be done when scopes are created, to avoid computing this for every method call.
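A sketch of that lookup, done once at scope construction. The expression `LibraryConfig[Resources[LibraryNameKey]].IsDisabled` comes from the comment above; everything else (the `"library.name"` key, the `LibrarySettings` struct, the `Tracer` interface and its two implementations) is a hypothetical stand-in.

```go
package main

import "fmt"

type Key string
type Value string

// LibraryNameKey is the resource key holding the library name (assumed name).
const LibraryNameKey Key = "library.name"

// LibrarySettings is the per-library configuration.
type LibrarySettings struct{ IsDisabled bool }

// LibraryConfig maps library names to settings. A missing entry yields the
// zero value, so unknown libraries are enabled.
var LibraryConfig = map[Value]LibrarySettings{
	"mongodb": {IsDisabled: true},
}

// Scope caches the disabled decision so it is not recomputed per call.
type Scope struct {
	resources map[Key]Value
	disabled  bool
}

// NewScope performs the lookup once, when the scope is created.
func NewScope(resources map[Key]Value) Scope {
	return Scope{
		resources: resources,
		disabled:  LibraryConfig[resources[LibraryNameKey]].IsDisabled,
	}
}

// Tracer is reduced to one method that reports what it would record.
type Tracer interface {
	Start(name string) string
}

type noopTracer struct{}

func (noopTracer) Start(string) string { return "" }

type realTracer struct{ library Value }

func (t realTracer) Start(name string) string { return string(t.library) + ":" + name }

// Tracer hands a no-op tracer to disabled libraries.
func (s Scope) Tracer() Tracer {
	if s.disabled {
		return noopTracer{}
	}
	return realTracer{library: s.resources[LibraryNameKey]}
}

func main() {
	mongo := NewScope(map[Key]Value{LibraryNameKey: "mongodb"})
	redis := NewScope(map[Key]Value{LibraryNameKey: "redis"})
	fmt.Printf("%q %q\n", mongo.Tracer().Start("find"), redis.Tracer().Start("get"))
}
```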

This approach supports disabling tracing or metrics based on other resource values, potentially on boolean predicates over multiple resource values. Say you want to disable mongodb for a specific database name, for example.
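Predicates over several resource values could be composed along these lines. The `Predicate`, `Equals`, `And`, and `Disabled` helpers and the `"library.name"` and `"db.name"` keys are all hypothetical; the sketch only illustrates the mongodb example given above.

```go
package main

import "fmt"

// Resources are the static properties of a scope.
type Resources map[string]string

// Predicate decides whether a set of resources matches a rule.
type Predicate func(Resources) bool

// Equals matches when key is present and has exactly the given value.
func Equals(key, value string) Predicate {
	return func(r Resources) bool {
		v, ok := r[key]
		return ok && v == value
	}
}

// And matches when every predicate matches.
func And(ps ...Predicate) Predicate {
	return func(r Resources) bool {
		for _, p := range ps {
			if !p(r) {
				return false
			}
		}
		return true
	}
}

// Disabled reports whether any rule matches the resources.
func Disabled(r Resources, rules ...Predicate) bool {
	for _, rule := range rules {
		if rule(r) {
			return true
		}
	}
	return false
}

func main() {
	// Disable mongodb instrumentation for one specific database only.
	rule := And(Equals("library.name", "mongodb"), Equals("db.name", "audit"))

	fmt.Println(Disabled(Resources{"library.name": "mongodb", "db.name": "audit"}, rule))
	fmt.Println(Disabled(Resources{"library.name": "mongodb", "db.name": "orders"}, rule))
}
```

As with the map lookup, the result can be evaluated when a scope is created and cached, so the per-call cost stays constant.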

Does that mean you are in favor of moving forward with the current named tracers and this would be an addition on top of it later?

I agree that we could turn these library names into resources later. I'm not sure we can retrofit something like a Scope in, which would make it easier to coordinate resources across tracer/meter impls.

@evankanderson

This looks useful to me as an implementer of a multi-tenant service. I can see a few different ways that this helps:

  1. This is useful for associating metrics (or spans) with logical Resources based on work done on their behalf (e.g. activation work when scaling from zero in Knative)
  2. For storage services, this is useful for exporting logical Resources which may be "data at rest" rather than actual computation (e.g. data used/quota for a particular bucket or keyspace)

Being able to use a Resource bound to the context when performing this work makes it easier to share a common set of base labels (disk=XYZ, zone=Q, service=ABC, etc.) across several different measurements which may not have the same full label set. (In Knative, an example of this is request metrics including both queue size (with no additional labels) and request statistics (with response code labels).)

@andrewhsu andrewhsu removed the metrics Relates to the Metrics API/SDK label Oct 23, 2020
@andrewhsu
Member

from the spec issue triage mtg today, taking metrics label off since this issue kinda applies to more

@Oberon00
Member

It looks like the problem this issue proposes to solve is now solved, only slightly differently, by #201. At least I believe so. Or are there some leftover use cases that the solution in this issue would solve that #201 does not?

@jmacd
Contributor Author

jmacd commented Jul 13, 2022

I consider this effectively replaced by #207.

@jmacd jmacd closed this as completed Jul 13, 2022

5 participants