Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Design experiment #3

Open
wants to merge 141 commits into
base: main
Choose a base branch
from
Open

Design experiment #3

wants to merge 141 commits into from

Conversation

victlu
Copy link
Collaborator

@victlu victlu commented Feb 2, 2021

A PR for discussion. Not intended to be merged.

Currently, there are 3 different designs…

./examples3 – An interpretation for a simpler API/SDK (from Victor)
./experimental – An interpretation from a .NET perspective (From Noah)
./examples2 – Trying to implement an interpretation of existing OTel Metric Spec (From Victor)

public LabelSet DefaultLabels { get; }

public void Record(int measurement) { }
public void Record(int measurement, string labelName1, string labelValue1) { }
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks like you're separating Key and Value. So, why not Record(..., string[] keys, string[] values)?

Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry I completely missed some of these comments. My bad!
A few reasons:
a) It is one less allocation per metric recording (cost ~10ns)
b) passing strings at Record implies the names can be different every time which means the SDK has to search and/or sort them

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

RE: b. Although I agree, I don't think that's your top priority concern.

@codecov-io
Copy link

codecov-io commented Feb 2, 2021

Codecov Report

❗ No coverage uploaded for pull request base (main@b212c91). Click here to learn what that means.
The diff coverage is n/a.

Impacted file tree graph

@@           Coverage Diff           @@
##             main       #3   +/-   ##
=======================================
  Coverage        ?   83.71%           
=======================================
  Files           ?      192           
  Lines           ?     6165           
  Branches        ?        0           
=======================================
  Hits            ?     5161           
  Misses          ?     1004           
  Partials        ?        0           

{
class SquidAnalyzer
{
static readonly MetricSource metrics = new MetricSource("SquidLibrary",
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Need to discuss if we really want to have a "Provider" as the starting point.

I think it makes more sense to just "new Counter(...)" immediately. And employ late binding to attach to a "provider".

{ "SquidBetaBuild", "true" },
{ "SeafoodFeatureLevel", "4" }
});
static readonly Counter squidsAnalyzedCounter = metrics.CreateCounter("squids-analyzed");
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm wondering if there's an easy way (preferably at compile time) to get the caller's fully qualified class name. That can act as the "default" name for this counter. It may be done with reflections (but it's slow).

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just like ILogger, we can also use generics to take a class name that we use for library name and we can get library version via reflection.

string labelValue2 = "shrimp";
string labelValue3 = "clam";
string labelValue4 = "soup";
squidsAnalyzedCounter.Add(1,
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks pretty awkward to me in passing in key/values...


Micrometer (MM) - https://micrometer.io/docs/concepts

1. Do the names that OpenTelemetry chose match industry conventions?
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm wondering maybe we leave the "naming" convention match to the SDK. Thus, API just needs to collect all the component fields, and let the SDK do what they want with forming a correct "name" for the exporter.

Basically, I'm saying maybe we are not so concern with naming from API side as it can be adjusted by the SDK.

Micrometer (MM) - https://micrometer.io/docs/concepts

1. Do the names that OpenTelemetry chose match industry conventions?
For the MM library there are different types of Meters such as timers, counters, gauges, etc. Depending on their type, meters implement operations such
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think I heard may calls for a "timing" type counter. Maybe we include a TimeRange counter of the form...

using (var timer = new Timer(...))
{
// Do Operations...
}

As best I can tell the best conceptual mapping is that MM's Meter == OT's bound instrument. It doesn't worry me too much that OT would use a different
name for the same concept, what worries me more is if OT is reusing the same names but altering the concept they apply to.

2. MM has no equivalent abstraction for OT's Meter or Instrument. It goes straight from MeterProvider (MeterRegistry in MM-terminology) to a bound-instrument
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Agree. I would propose we employ "late binding" to attach a Provider.

If MM has been succesful with these simpler abstractions it raises the question what value is OT getting from its more complex 4 layers of
abstraction (MeterProvider->Meter->Instrument->BoundInstrument)?

3. MM landed on a different set of instrument types than OT did. It would be good to understand why we believe OT's instrument set is the right set to pick?
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, I think the OT SIG also pointed this out. I think Reiley will likely drive this to some better conclusion.

or is there a scenario where the atomicity of the measurements matter?

6. MM has a lot of API surface dedicated to configuring Meters (in MM terminology) and tags. I am guessing this is the kind of functionality that would be sitting in OT's
Views API?
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This will be a HUGE topic for SDK side. And probably will be very vendor opinionated in those discussions!

Histogram ValueRecorder
NA *Observer

4. Prometheus uses Observe() API where OT uses Record() API. Prometheus doesn't appear to have asynchronous/callback style APIs.
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we do need an "async" callback for reporting data. Or at least some examples of how to do this. Need to resolve the issue of Sum vs Incremental dataset. Cause any "Observed" value implies a Time Range, and thus, some kind of aggregation. This needs to be reviewed a bit deeper I think.

var counter = new CounterOptions { Name = "my_counter" } // this defines the metric
metrics.Measure.Counter.Increment(counter); // increments the value of the metric stored in this IMetrics

3. Metrics are never bound. Tags can be provided at the time a measurement is captured and merged with tags set via global tags in the IMetrics interface
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm wondering... Maybe we setup a memory storage for binding a set of KVs. Then, we just pass the id of that entry, instead of "binding" a meter. This way, any "bound" KVs can be shared by any meter and addressible with an id.

Double and long. The typical distinction between counters, gauges, histograms, etc are defined separately as aggregations using the View API.
3. Measures are created directly from a static factory method and they are not associated with any particular collection pipeline. A separate StatsRecorder object provides
the data collection API.
4. The StatsRecorder collection API builds up a list of measurements first, then commits it with a single API call. I didn't see any simpler option presented in the samples
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We likely need a BatchRecord type functionality. To allow our user to inject data for the same "timeframe". I don't know what other scenarios requires "atomicity" (outside of marking same timeframe) but would like to find out more from others what the need here is.

example3/API/MetricSource.cs Outdated Show resolved Hide resolved
victlu and others added 5 commits April 15, 2021 12:11
The API review folks generally don't like to have interfaces with no
members and we can do a type check that avoids needing them. If we
want a better non-type based check I think the best approach will be
to decide on a hinting API.
The perf regression from adding the T was larger than I remembered,
about 15ns. I tracked it down to perf impact of making generic virtual
calls. Unlike normal virtual calls that can be resolved by looking up
the function pointer in a known slot in a table, the generic T requires
a dictionary lookup in the runtime. Thankfully we can work around
that by providing type specific overloads.
@codecov-commenter
Copy link

Codecov Report

❗ No coverage uploaded for pull request base (main@b212c91). Click here to learn what that means.
The diff coverage is n/a.

Impacted file tree graph

@@           Coverage Diff           @@
##             main       #3   +/-   ##
=======================================
  Coverage        ?   83.76%           
=======================================
  Files           ?      192           
  Lines           ?     6165           
  Branches        ?        0           
=======================================
  Hits            ?     5164           
  Misses          ?     1001           
  Partials        ?        0           

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants