Verify compliant metric SDK specification implementation: MetricReader/MetricReader operations/Collect #3662

MrAlias · 2023-02-03T16:38:10Z

Identify all the normative requirements, recommendations, and options the specification defines, and list them as comments on this issue.
Ensure the current metric SDK implementation is compliant with the normative requirements, recommendations, and options in those comments.

dashpole · 2023-07-21T18:56:12Z

I would like to take this.

dashpole · 2023-07-21T19:01:36Z

MetricReader is an SDK implementation object that provides the common configurable aspects of the OpenTelemetry Metrics SDK and determines the following capabilities:

Registering MetricProducer(s)

opentelemetry-go/sdk/metric/reader.go

Lines 60 to 65 in cbc5890

    
    // RegisterProducer registers a an external Producer with this Reader.
    // The Producer is used as a source of aggregated metric data which is
    // incorporated into metrics collected from the SDK.
    //
    // This method needs to be concurrent safe.
    RegisterProducer(Producer)

Collecting metrics from the SDK and any registered MetricProducers on demand.

opentelemetry-go/sdk/metric/reader.go

Lines 73 to 79 in cbc5890

    
    // Collect gathers and returns all metric data related to the Reader from
    // the SDK and stores it in out. An error is returned if this is called
    // after Shutdown or if out is nil.
    //
    // This method needs to be concurrent safe, and the cancelation of the
    // passed context is expected to be honored.
    Collect(ctx context.Context, rm *metricdata.ResourceMetrics) error

Handling the ForceFlush and Shutdown signals from the SDK.

opentelemetry-go/sdk/metric/reader.go

Lines 81 to 103 in cbc5890

    
    // ForceFlush flushes all metric measurements held in an export pipeline.
    //
    // This deadline or cancellation of the passed context are honored. An appropriate
    // error will be returned in these situations. There is no guaranteed that all
    // telemetry be flushed or all resources have been released in these
    // situations.
    //
    // This method needs to be concurrent safe.
    ForceFlush(context.Context) error

    // Shutdown flushes all metric measurements held in an export pipeline and releases any
    // held computational resources.
    //
    // This deadline or cancellation of the passed context are honored. An appropriate
    // error will be returned in these situations. There is no guaranteed that all
    // telemetry be flushed or all resources have been released in these
    // situations.
    //
    // After Shutdown is called, calls to Collect will perform no operation and instead will return
    // an error indicating the shutdown state.
    //
    // This method needs to be concurrent safe.
    Shutdown(context.Context) error

dashpole · 2023-07-21T19:07:58Z

To construct a MetricReader when setting up an SDK, the caller SHOULD provide at least the following:

The exporter to use, which is a MetricExporter instance.

This option is ONLY exposed on periodic readers.

opentelemetry-go/sdk/metric/periodic_reader.go

Line 114 in cbc5890

    
           func NewPeriodicReader(exporter Exporter, options ...PeriodicReaderOption) *PeriodicReader {

The default output aggregation (optional), a function of instrument kind. If not configured, the default aggregation SHOULD be used.

This option is only provided on ManualReaders:

opentelemetry-go/sdk/metric/manual_reader.go

Lines 222 to 226 in cbc5890

    
    // WithAggregationSelector sets the AggregationSelector a reader will use to
    // determine the aggregation to use for an instrument based on its kind. If
    // this option is not used, the reader will use the DefaultAggregationSelector
    // or the aggregation explicitly passed for a view matching an instrument.
    func WithAggregationSelector(selector AggregationSelector) ManualReaderOption {

For PeriodicReaders, the MetricExporter provides the default output aggregation.

The default output temporality (optional), a function of instrument kind. If not configured, the Cumulative temporality SHOULD be used.

This option is only provided for ManualReaders:

opentelemetry-go/sdk/metric/manual_reader.go

Lines 205 to 210 in cbc5890

    
    // WithTemporalitySelector sets the TemporalitySelector a reader will use to
    // determine the Temporality of an instrument based on its kind. If this
    // option is not used, the reader will use the DefaultTemporalitySelector.
    func WithTemporalitySelector(selector TemporalitySelector) ManualReaderOption {
    	return temporalitySelectorOption{selector: selector}
    }

For PeriodicReaders, the MetricExporter provides the default output temporality.

The default aggregation cardinality limit to use, a function of instrument kind. If not configured, a default value of 2000 SHOULD be used.

This option is not provided when constructing a MetricReader.
IMO, this should not be part of the stable specification.

dashpole · 2023-07-21T19:11:12Z

A common sub-class of MetricReader, the periodic exporting MetricReader SHOULD be provided to be used typically with push-based metrics collection.

We provide a PeriodicReader:

opentelemetry-go/sdk/metric/periodic_reader.go

Line 114 in cbc5890

    
           func NewPeriodicReader(exporter Exporter, options ...PeriodicReaderOption) *PeriodicReader {

dashpole · 2023-07-21T20:16:00Z

The MetricReader MUST ensure that data points from OpenTelemetry instruments are output in the configured aggregation temporality for each instrument kind. For synchronous instruments being output with Cumulative temporality, this means converting Delta to Cumulative aggregation temporality. For asynchronous instruments being output with Delta temporality, this means converting Cumulative to Delta aggregation temporality.

The periodic reader defaults to the exporter's temporality:

opentelemetry-go/sdk/metric/periodic_reader.go

Lines 213 to 215 in cbc5890

    
    // temporality reports the Temporality for the instrument kind provided.
    func (r *PeriodicReader) temporality(kind InstrumentKind) metricdata.Temporality {
    	return r.exporter.Temporality(kind)

The manual reader defaults to the configured temporality:

opentelemetry-go/sdk/metric/manual_reader.go

Lines 84 to 87 in cbc5890

    
    // temporality reports the Temporality for the instrument kind provided.
    func (mr *ManualReader) temporality(kind InstrumentKind) metricdata.Temporality {
    	return mr.temporalitySelector(kind)
    }

The collection + aggregation pipeline uses the temporality of the reader:

opentelemetry-go/sdk/metric/pipeline.go

Lines 328 to 329 in cbc5890

    
    b := aggregate.Builder[N]{
    	Temporality: i.pipeline.reader.temporality(kind),

That temporality is used to ensure that the aggregation is output in the correct temporality:

opentelemetry-go/sdk/metric/internal/aggregate/aggregate.go

Lines 92 to 96 in cbc5890

    
    switch b.Temporality {
    case metricdata.DeltaTemporality:
    	s = newPrecomputedDeltaSum[N](monotonic)
    default:
    	s = newPrecomputedCumulativeSum[N](monotonic)

I believe that satisfies the "MUST ensure that data points ... are output in the configured aggregation temporality", even though much of the implementation details are not implemented within our Readers.
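To make the temporality requirement concrete, here is a minimal standalone sketch of how one aggregation can be emitted as either Cumulative or Delta. The `Temporality` constants and the `sum` type are hypothetical stand-ins invented for illustration; they are not the SDK's `metricdata` or `aggregate` types.

```go
package main

import "fmt"

// Temporality is a hypothetical stand-in for metricdata.Temporality.
type Temporality int

const (
	Cumulative Temporality = iota
	Delta
)

// sum is a simplified, hypothetical sum aggregator (not the SDK's type).
type sum struct {
	temporality Temporality
	total       int64 // running total since the aggregator was created
	reported    int64 // total at the time of the previous collection
}

// Add records one measurement from a synchronous instrument.
func (s *sum) Add(v int64) { s.total += v }

// Collect returns the value to export for the configured temporality.
func (s *sum) Collect() int64 {
	if s.temporality == Delta {
		d := s.total - s.reported
		s.reported = s.total
		return d
	}
	return s.total
}

func main() {
	c := &sum{temporality: Cumulative}
	d := &sum{temporality: Delta}
	for _, v := range []int64{3, 4} {
		c.Add(v)
		d.Add(v)
	}
	fmt.Println(c.Collect(), d.Collect()) // prints "7 7"
	c.Add(5)
	d.Add(5)
	fmt.Println(c.Collect(), d.Collect()) // prints "12 5"
}
```

The same measurements yield different exported values only because of the configured temporality, which is the property the reader passes down to the aggregation builder.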

dashpole · 2023-07-21T20:34:45Z

The SDK MUST support multiple MetricReader instances to be registered on the same MeterProvider, and the MetricReader.Collect invocation on one MetricReader instance SHOULD NOT introduce side-effects to other MetricReader instances.

Multiple readers can be registered on a single MeterProvider (readers are appended to a list):

opentelemetry-go/sdk/metric/config.go

Lines 120 to 130 in cbc5890

    
    // WithReader associates Reader r with a MeterProvider.
    //
    // By default, if this option is not used, the MeterProvider will perform no
    // operations; no data will be exported without a Reader.
    func WithReader(r Reader) Option {
    	return optionFunc(func(cfg config) config {
    		if r == nil {
    			return cfg
    		}
    		cfg.readers = append(cfg.readers, r)
    		return cfg

Each reader results in an independent pipeline:

opentelemetry-go/sdk/metric/pipeline.go

Lines 481 to 488 in cbc5890

    
    func newPipelines(res *resource.Resource, readers []Reader, views []View) pipelines {
    	pipes := make([]*pipeline, 0, len(readers))
    	for _, r := range readers {
    		p := newPipeline(res, r, views)
    		r.register(p)
    		pipes = append(pipes, p)
    	}
    	return pipes

Invoking Collect on a reader only calls produce on its own pipeline. Periodic:

opentelemetry-go/sdk/metric/periodic_reader.go

Line 273 in cbc5890

err := ph.produce(ctx, rm)

and Manual:

opentelemetry-go/sdk/metric/manual_reader.go

Line 148 in cbc5890

err := ph.produce(ctx, rm)

produce only invokes callbacks and computes aggregations from the pipeline's own, independent callbacks and aggregations:

opentelemetry-go/sdk/metric/pipeline.go

Lines 123 to 179 in cbc5890

    
    func (p *pipeline) produce(ctx context.Context, rm *metricdata.ResourceMetrics) error {
    	p.Lock()
    	defer p.Unlock()

    	var errs multierror
    	for _, c := range p.callbacks {
    		// TODO make the callbacks parallel. ( #3034 )
    		if err := c(ctx); err != nil {
    			errs.append(err)
    		}
    		if err := ctx.Err(); err != nil {
    			rm.Resource = nil
    			rm.ScopeMetrics = rm.ScopeMetrics[:0]
    			return err
    		}
    	}
    	for e := p.multiCallbacks.Front(); e != nil; e = e.Next() {
    		// TODO make the callbacks parallel. ( #3034 )
    		f := e.Value.(multiCallback)
    		if err := f(ctx); err != nil {
    			errs.append(err)
    		}
    		if err := ctx.Err(); err != nil {
    			// This means the context expired before we finished running callbacks.
    			rm.Resource = nil
    			rm.ScopeMetrics = rm.ScopeMetrics[:0]
    			return err
    		}
    	}

    	rm.Resource = p.resource
    	rm.ScopeMetrics = internal.ReuseSlice(rm.ScopeMetrics, len(p.aggregations))

    	i := 0
    	for scope, instruments := range p.aggregations {
    		rm.ScopeMetrics[i].Metrics = internal.ReuseSlice(rm.ScopeMetrics[i].Metrics, len(instruments))
    		j := 0
    		for _, inst := range instruments {
    			data := rm.ScopeMetrics[i].Metrics[j].Data
    			if n := inst.compAgg(&data); n > 0 {
    				rm.ScopeMetrics[i].Metrics[j].Name = inst.name
    				rm.ScopeMetrics[i].Metrics[j].Description = inst.description
    				rm.ScopeMetrics[i].Metrics[j].Unit = inst.unit
    				rm.ScopeMetrics[i].Metrics[j].Data = data
    				j++
    			}
    		}
    		rm.ScopeMetrics[i].Metrics = rm.ScopeMetrics[i].Metrics[:j]
    		if len(rm.ScopeMetrics[i].Metrics) > 0 {
    			rm.ScopeMetrics[i].Scope = scope
    			i++
    		}
    	}

    	rm.ScopeMetrics = rm.ScopeMetrics[:i]

    	return errs.errorOrNil()
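The isolation argument above can be illustrated with a small standalone model. This is only a sketch under stated assumptions: `provider`, `pipeline`, `record`, and `produce` here are hypothetical simplifications, each pipeline holds a single delta counter, and none of this is the SDK's actual code.

```go
package main

import (
	"fmt"
	"sync"
)

// pipeline is a hypothetical per-reader pipeline that owns its own state.
type pipeline struct {
	mu      sync.Mutex
	pending int64 // measurements recorded since this pipeline's last produce
}

// record stores a measurement in this pipeline only.
func (p *pipeline) record(v int64) {
	p.mu.Lock()
	defer p.mu.Unlock()
	p.pending += v
}

// produce returns the pending delta and resets this pipeline's state.
func (p *pipeline) produce() int64 {
	p.mu.Lock()
	defer p.mu.Unlock()
	v := p.pending
	p.pending = 0
	return v
}

// provider fans every measurement out to one pipeline per reader.
type provider struct{ pipes []*pipeline }

// newProvider creates one independent pipeline per reader.
func newProvider(readers int) *provider {
	mp := &provider{}
	for i := 0; i < readers; i++ {
		mp.pipes = append(mp.pipes, &pipeline{})
	}
	return mp
}

// add records v in every pipeline.
func (mp *provider) add(v int64) {
	for _, p := range mp.pipes {
		p.record(v)
	}
}

func main() {
	mp := newProvider(2)
	mp.add(10)
	fmt.Println(mp.pipes[0].produce()) // prints 10
	fmt.Println(mp.pipes[0].produce()) // prints 0: reader 0 already collected
	fmt.Println(mp.pipes[1].produce()) // prints 10: reader 1 is unaffected
}
```

Because state is duplicated per pipeline rather than shared, collecting through one reader cannot reset or consume what another reader will see.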

dashpole · 2023-07-21T20:38:12Z

The SDK MUST NOT allow a MetricReader instance to be registered on more than one MeterProvider instance.

We return this error when a reader is registered more than once (e.g. with different MeterProviders):

opentelemetry-go/sdk/metric/reader.go

Line 27 in cbc5890

var errDuplicateRegister = fmt.Errorf("duplicate reader registration")

Periodic Reader impl:

opentelemetry-go/sdk/metric/periodic_reader.go

Lines 188 to 195 in cbc5890

    
    // register registers p as the producer of this reader.
    func (r *PeriodicReader) register(p sdkProducer) {
    	// Only register once. If producer is already set, do nothing.
    	if !r.sdkProducer.CompareAndSwap(nil, produceHolder{produce: p.produce}) {
    		msg := "did not register periodic reader"
    		global.Error(errDuplicateRegister, msg)
    	}
    }

Manual Reader impl:

opentelemetry-go/sdk/metric/manual_reader.go

Lines 57 to 65 in cbc5890

    
    // register stores the sdkProducer which enables the caller
    // to read metrics from the SDK on demand.
    func (mr *ManualReader) register(p sdkProducer) {
    	// Only register once. If producer is already set, do nothing.
    	if !mr.sdkProducer.CompareAndSwap(nil, produceHolder{produce: p.produce}) {
    		msg := "did not register manual reader"
    		global.Error(errDuplicateRegister, msg)
    	}
    }
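The register-once pattern can be sketched in isolation as follows. This is an illustrative model, not the SDK code: it assumes Go 1.19+ and uses `atomic.Pointer` rather than the `atomic.Value` used in the snippets above, the `producer` and `reader` types are hypothetical, and `register` returns the error instead of logging it through `global.Error`.

```go
package main

import (
	"errors"
	"fmt"
	"sync/atomic"
)

var errDuplicateRegister = errors.New("duplicate reader registration")

// producer is a hypothetical stand-in for a MeterProvider's pipeline.
type producer struct{ name string }

// reader is a hypothetical reader that accepts exactly one producer.
type reader struct {
	sdkProducer atomic.Pointer[producer]
}

// register stores p only if no producer was stored before.
// Unlike the SDK, this sketch returns the error to the caller.
func (r *reader) register(p *producer) error {
	if !r.sdkProducer.CompareAndSwap(nil, p) {
		return errDuplicateRegister
	}
	return nil
}

func main() {
	r := &reader{}
	fmt.Println(r.register(&producer{name: "provider-a"})) // prints <nil>
	fmt.Println(r.register(&producer{name: "provider-b"})) // prints duplicate reader registration
	fmt.Println(r.sdkProducer.Load().name)                 // prints provider-a
}
```

The compare-and-swap makes the check and the store a single atomic step, so two providers racing to register the same reader cannot both succeed.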

dashpole · 2023-07-21T20:44:02Z

The SDK SHOULD provide a way to allow MetricReader to respond to MeterProvider.ForceFlush and MeterProvider.Shutdown.

The ForceFlush and Shutdown signals of readers are aggregated when the MeterProvider is created:

opentelemetry-go/sdk/metric/config.go

Lines 32 to 43 in cbc5890

    
    // readerSignals returns a force-flush and shutdown function for a
    // MeterProvider to call in their respective options. All Readers c contains
    // will have their force-flush and shutdown methods unified into returned
    // single functions.
    func (c config) readerSignals() (forceFlush, shutdown func(context.Context) error) {
    	var fFuncs, sFuncs []func(context.Context) error
    	for _, r := range c.readers {
    		sFuncs = append(sFuncs, r.Shutdown)
    		fFuncs = append(fFuncs, r.ForceFlush)
    	}

    	return unify(fFuncs), unifyShutdown(sFuncs)

opentelemetry-go/sdk/metric/provider.go

Lines 51 to 58 in cbc5890

    
    func NewMeterProvider(options ...Option) *MeterProvider {
    	conf := newConfig(options)
    	flush, sdown := conf.readerSignals()

    	mp := &MeterProvider{
    		pipes:      newPipelines(conf.res, conf.readers, conf.views),
    		forceFlush: flush,
    		shutdown:   sdown,

And then are invoked when ForceFlush and Shutdown are invoked on the MeterProvider:

opentelemetry-go/sdk/metric/provider.go

Lines 114 to 118 in cbc5890

    
    func (mp *MeterProvider) ForceFlush(ctx context.Context) error {
    	if mp.forceFlush != nil {
    		return mp.forceFlush(ctx)
    	}
    	return nil

opentelemetry-go/sdk/metric/provider.go

Lines 137 to 151 in cbc5890

    
    func (mp *MeterProvider) Shutdown(ctx context.Context) error {
    	// Even though it may seem like there is a synchronization issue between the
    	// call to `Store` and checking `shutdown`, the Go concurrency model ensures
    	// that is not the case, as all the atomic operations executed in a program
    	// behave as though executed in some sequentially consistent order. This
    	// definition provides the same semantics as C++'s sequentially consistent
    	// atomics and Java's volatile variables.
    	// See https://go.dev/ref/mem#atomic and https://pkg.go.dev/sync/atomic.

    	mp.stopped.Store(true)
    	if mp.shutdown != nil {
    		return mp.shutdown(ctx)
    	}
    	return nil
    }
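The body of `unify` is not shown in this thread, so the following is only a guess at its shape: a sketch that folds several per-reader signal functions into one, calls every one of them, and joins their errors. The `signalFunc` type, this `unify` implementation, and `demo` are all hypothetical; `errors.Join` requires Go 1.20+.

```go
package main

import (
	"context"
	"errors"
	"fmt"
)

// signalFunc is a hypothetical alias for a ForceFlush or Shutdown method value.
type signalFunc func(context.Context) error

// unify returns one function that calls every function in funcs and joins
// any errors. It is a sketch, not the SDK's unexported unify.
func unify(funcs []signalFunc) signalFunc {
	return func(ctx context.Context) error {
		var errs []error
		for _, f := range funcs {
			if err := f(ctx); err != nil {
				errs = append(errs, err)
			}
		}
		return errors.Join(errs...) // nil when no function failed
	}
}

// demo unifies two fake readers, one of which fails, and reports how many
// were called along with the combined error.
func demo() (calls int, err error) {
	ok := func(context.Context) error { calls++; return nil }
	bad := func(context.Context) error { calls++; return errors.New("flush failed") }
	err = unify([]signalFunc{bad, ok})(context.Background())
	return calls, err
}

func main() {
	calls, err := demo()
	fmt.Println(calls, err)
}
```

In this sketch a failing reader does not prevent the remaining readers from receiving the signal, which is the behavior one would want for a provider-level ForceFlush or Shutdown.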

dashpole · 2023-07-21T20:53:00Z

MetricReader Operations:

Collect

opentelemetry-go/sdk/metric/manual_reader.go

Line 129 in cbc5890

    
           func (mr *ManualReader) Collect(ctx context.Context, rm *metricdata.ResourceMetrics) error {

opentelemetry-go/sdk/metric/periodic_reader.go

Line 249 in cbc5890

    
           func (r *PeriodicReader) Collect(ctx context.Context, rm *metricdata.ResourceMetrics) error {

Collects the metrics from the SDK and any registered MetricProducers.

From SDK:

opentelemetry-go/sdk/metric/manual_reader.go

Line 148 in cbc5890

err := ph.produce(ctx, rm)

opentelemetry-go/sdk/metric/periodic_reader.go

Line 273 in cbc5890

err := ph.produce(ctx, rm)

From Producers:

opentelemetry-go/sdk/metric/manual_reader.go

Lines 153 to 159 in cbc5890

    
    for _, producer := range mr.externalProducers.Load().([]Producer) {
    	externalMetrics, err := producer.Produce(ctx)
    	if err != nil {
    		errs = append(errs, err)
    	}
    	rm.ScopeMetrics = append(rm.ScopeMetrics, externalMetrics...)
    }

opentelemetry-go/sdk/metric/periodic_reader.go

Lines 278 to 284 in cbc5890

    
    for _, producer := range r.externalProducers.Load().([]Producer) {
    	externalMetrics, err := producer.Produce(ctx)
    	if err != nil {
    		errs = append(errs, err)
    	}
    	rm.ScopeMetrics = append(rm.ScopeMetrics, externalMetrics...)
    }

Collect SHOULD provide a way to let the caller know whether it succeeded, failed or timed out.

Collect returns an error. See signature above.

When the Collect operation fails or times out on some of the instruments, the SDK MAY return successfully collected results and a failed reasons list to the caller.

We support partial results by aggregating errors and continuing on during callbacks that return errors:

opentelemetry-go/sdk/metric/pipeline.go

Lines 130 to 132 in cbc5890

    
    if err := c(ctx); err != nil {
    	errs.append(err)
    }

opentelemetry-go/sdk/metric/pipeline.go

Lines 142 to 144 in cbc5890

    
    if err := f(ctx); err != nil {
    	errs.append(err)
    }
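As a rough illustration of the partial-results behavior, the sketch below keeps collecting after a callback fails, returns the values that were gathered together with the accumulated errors, and gives up entirely only when the context is done. The `callback` type, `collect`, and `demo` are hypothetical, and `errors.Join` (Go 1.20+) stands in for the SDK's `multierror`.

```go
package main

import (
	"context"
	"errors"
	"fmt"
)

// callback is a hypothetical observable-instrument callback returning one value.
type callback func(context.Context) (int64, error)

// collect runs every callback. Failures are recorded and skipped so that the
// remaining callbacks still contribute results; a done context aborts the
// whole collection and discards partial results.
func collect(ctx context.Context, callbacks []callback) ([]int64, error) {
	var (
		results []int64
		errs    []error
	)
	for _, cb := range callbacks {
		v, err := cb(ctx)
		if err != nil {
			errs = append(errs, err)
		} else {
			results = append(results, v)
		}
		if err := ctx.Err(); err != nil {
			return nil, err
		}
	}
	return results, errors.Join(errs...)
}

// demo collects from three callbacks, the second of which fails.
func demo() (collected int, failed bool) {
	cbs := []callback{
		func(context.Context) (int64, error) { return 1, nil },
		func(context.Context) (int64, error) { return 0, errors.New("callback failed") },
		func(context.Context) (int64, error) { return 3, nil },
	}
	results, err := collect(context.Background(), cbs)
	return len(results), err != nil
}

func main() {
	collected, failed := demo()
	fmt.Println(collected, failed) // prints "2 true"
}
```

The caller receives both the successfully collected values and a non-nil error, matching the MAY clause quoted above.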

dashpole · 2023-07-21T21:02:41Z

Shutdown

opentelemetry-go/sdk/metric/reader.go

Lines 91 to 103 in cbc5890

    
    // Shutdown flushes all metric measurements held in an export pipeline and releases any
    // held computational resources.
    //
    // This deadline or cancellation of the passed context are honored. An appropriate
    // error will be returned in these situations. There is no guaranteed that all
    // telemetry be flushed or all resources have been released in these
    // situations.
    //
    // After Shutdown is called, calls to Collect will perform no operation and instead will return
    // an error indicating the shutdown state.
    //
    // This method needs to be concurrent safe.
    Shutdown(context.Context) error

Shutdown MUST be called only once for each MetricReader instance. After the call to Shutdown, subsequent invocations to Collect are not allowed. SDKs SHOULD return some failure for these calls, if possible.

In Shutdown(), readers replace their stored producer with a shutdownProducer, whose produce method returns an error that is then returned to the caller of Collect:

opentelemetry-go/sdk/metric/manual_reader.go

Lines 107 to 110 in cbc5890

    
    // Any future call to Collect will now return ErrReaderShutdown.
    mr.sdkProducer.Store(produceHolder{
    	produce: shutdownProducer{}.produce,
    })

opentelemetry-go/sdk/metric/periodic_reader.go

Lines 330 to 333 in cbc5890

    
    // Any future call to Collect will now return ErrReaderShutdown.
    ph := r.sdkProducer.Swap(produceHolder{
    	produce: shutdownProducer{}.produce,
    })

Shutdown SHOULD provide a way to let the caller know whether it succeeded, failed or timed out.

Shutdown returns an error.

Shutdown SHOULD complete or abort within some timeout.

Shutdown does not itself impose a timeout on the collection or export of telemetry in the periodic reader.

It does accept a context, which allows users to easily place a timeout on collection or export.
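A caller-imposed timeout might look like the sketch below. Everything here is a simplified model rather than the SDK: the `reader` type, `slowFlush`, and `demo` are hypothetical, and an `atomic.Bool` (Go 1.19+) replaces the producer swap shown earlier. It shows Shutdown aborting once the context deadline passes and Collect returning an error afterwards.

```go
package main

import (
	"context"
	"errors"
	"fmt"
	"sync/atomic"
	"time"
)

var errReaderShutdown = errors.New("reader is shutdown")

// reader is a hypothetical reader whose final flush honors the context.
type reader struct {
	stopped atomic.Bool
	flush   func(context.Context) error
}

// Collect fails once the reader has been shut down.
func (r *reader) Collect(context.Context) error {
	if r.stopped.Load() {
		return errReaderShutdown
	}
	return nil
}

// Shutdown marks the reader as stopped exactly once, then flushes.
func (r *reader) Shutdown(ctx context.Context) error {
	if !r.stopped.CompareAndSwap(false, true) {
		return errReaderShutdown
	}
	return r.flush(ctx)
}

// slowFlush simulates an export that takes longer than the caller will wait.
func slowFlush(ctx context.Context) error {
	select {
	case <-time.After(time.Second):
		return nil
	case <-ctx.Done():
		return ctx.Err()
	}
}

// demo shuts a reader down under a 10ms deadline, then tries to collect.
func demo() (shutdownErr, collectErr error) {
	r := &reader{flush: slowFlush}
	ctx, cancel := context.WithTimeout(context.Background(), 10*time.Millisecond)
	defer cancel()
	shutdownErr = r.Shutdown(ctx)
	collectErr = r.Collect(context.Background())
	return shutdownErr, collectErr
}

func main() {
	shutdownErr, collectErr := demo()
	fmt.Println(shutdownErr) // prints context deadline exceeded
	fmt.Println(collectErr)  // prints reader is shutdown
}
```

Leaving the deadline to the caller keeps the reader free of a hard-coded timeout while still letting Shutdown complete or abort within a bounded time.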

dashpole · 2023-07-21T21:07:38Z

That is all for the MetricReader, MetricReader operations, Collect, and Shutdown sections. The bolded text above marks items that are contentious or areas where we differ from the specification. Our only differences occur in SHOULD sections.

dashpole · 2023-07-24T15:46:56Z

I think I misread the intention of this issue... If this is only about verifying Collect, see #3662 (comment)

I did not find any issues regarding the Collect function.

MrAlias · 2023-07-27T17:25:09Z

The targeted material has all been identified and checked to be compliant. Closing.

MrAlias added this to Go: Metric SDK (GA) Feb 3, 2023

MrAlias converted this from a draft issue Feb 3, 2023

MrAlias mentioned this issue Feb 3, 2023

Have OTel TC review metric SDK and sign-off on stable release #3674

Closed

33 tasks

MrAlias added pkg:SDK Related to an SDK package area:metrics Part of OpenTelemetry Metrics labels Feb 3, 2023

dashpole self-assigned this Jul 21, 2023

dashpole moved this from Todo to In Progress in Go: Metric SDK (GA) Jul 21, 2023

dashpole mentioned this issue Jul 24, 2023

Aggregation cardinality default configuration should be experimental open-telemetry/opentelemetry-specification#3618

Closed

MrAlias closed this as completed Jul 27, 2023

github-project-automation bot moved this from In Progress to Done in Go: Metric SDK (GA) Jul 27, 2023

MrAlias added this to the v1.17.0/v0.40.0 milestone Aug 3, 2023
