Proposed strategy for making the collector an observable application #7223

Closed
djaglowski opened this issue Feb 20, 2023 · 3 comments

djaglowski commented Feb 20, 2023

The OpenTelemetry Collector is fundamentally an observability application. Naturally, we want it to be an observable application as well. There has already been a lot of work toward this goal, but I believe we still lack a clear strategy for gathering and emitting telemetry from the collector itself. To that end, this is a high-level proposal for a strategy that I believe would make the collector an exemplar of how applications should be made observable, while simultaneously providing usability benefits that only the collector can offer.


I believe we should start with the same solution we would recommend to any other application: Use the appropriate OTel instrumentation library to gather and emit telemetry. Therefore, the collector should be instrumented with the otel-go library, and should emit its telemetry data via otel-go exporters.

The collector’s configuration can define which otel-go exporters may be used, as well as which settings should be exposed. Naturally, otel-go’s otlp exporter should be an option, and probably the default.
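For example, the configuration surface might look something like the sketch below. The exporter field mirrors the proposal further down; the endpoint field and its values are purely illustrative and not part of any existing schema:

service:
  telemetry:
    logs:
      exporter: otlp                        # hypothetical field selecting an otel-go exporter
      endpoint: observability-backend:4317  # hypothetical exposed exporter setting
    metrics:
      exporter: otlp
      endpoint: observability-backend:4317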

At the same time, we should acknowledge the unique role of the collector in the OTel ecosystem. Most OTel-instrumented applications will emit data that eventually flows through a collector, so it is natural to look for ways to allow the collector’s own telemetry to flow through itself.

One option would require no additional work beyond the above suggestions: users can simply configure an otlp receiver on their collector that listens for the collector’s own telemetry, as emitted by otel-go’s otlp exporter.
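A rough sketch of that option, assuming the collector’s own otel-go otlp exporter is pointed at the same local endpoint (the receiver configuration below is standard collector config; the pipeline contents are placeholders):

receivers:
  otlp:
    protocols:
      grpc:
        endpoint: localhost:4317  # the collector's own otel-go exporter would target this endpoint

service:
  pipelines:
    logs/own:
      receivers: [ otlp ]
      exporters: [ ... ]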

A slightly more usable option, which still respects the role of the observability library, is to implement a custom exporter for otel-go that is simultaneously a collector receiver. This would allow users to receive the collector’s own telemetry directly onto a collector pipeline simply by configuring this receiver and using it as they would any other receiver: put it in one or more pipelines, then process and/or export the data as desired.

receivers:
  owntelemetry:
  # ... other receiver definitions ...
processors:
  # ... processor definitions ...
exporters:
  # ... exporter definitions ...

service:
  telemetry:
    logs:
      ...
      exporter: owntelemetry
    metrics:
      ...
      exporter: owntelemetry
  pipelines:
    logs: # Typical telemetry flowing through collector
      receivers: [ ... ]
      processors: [ ... ]
      exporters: [ ... ]
    logs/own:
      receivers: [ owntelemetry ]
      processors: [ ... ]
      exporters: [ ... ]

    metrics: # Typical telemetry flowing through collector
      receivers: [ ... ]
      processors: [ ... ]
      exporters: [ ... ]
    metrics/own:
      receivers: [ owntelemetry ]
      processors: [ ... ]
      exporters: [ ... ]

atoulme commented Feb 21, 2023

This is based on the ongoing discussions we had at the last SIG meeting.
Let's talk configuration and user interfaces.

Right now, we can do this:

service:
  telemetry:
    logs:
      level: "debug"

By default, I think that if no exporter is specified, the collector continues its legacy behavior and would export at the debug level in this example.

If no level is set, then the default logging level should apply, which is info:

service:
  telemetry:
    logs:

Now, going over your proposal:

  1. If no exporter is set (I'm not sold on the term "exporter", but naming is hard), then continue with legacy mode.
  2. If an exporter is set but its value is invalid, or no such exporter exists, log a warning but continue with legacy mode.

I see a couple of things I would like to see envisioned:

Syncing logs / Disabling logs
I would like a way to completely disable logging (and metrics, to be consistent). As discussed during the SIG meeting, an "enabled" flag (default true) makes sense there, to avoid touching levels or other lib-specific elements.
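A minimal sketch of what that could look like (the enabled field is hypothetical, not an existing setting):

service:
  telemetry:
    logs:
      enabled: false   # hypothetical flag, default true
    metrics:
      enabled: false   # hypothetical flag, default true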
Filtering logs
I would like to be able to filter logs and still export them as normal to stdout.

Out of this, I see the following work items:

  • Add an "enabled" flag for logs and metrics.
  • Use the Go logs SDK to map from our internal zap logger to exported log data. There is no such SDK yet; we could try to create something that goes straight to log data, or we could wait and get this through the Go SDK.
  • Create the "exporter" config field and map its behavior to a pipeline. That's a new receiver we can start on now, imo. The receiver can be a programmatic receiver that accepts a Logs object passed in.
  • Add a new exporter backed by a zap logger that writes to stdout. This supports the use case where I want all my logs to go to stdout as usual, but filtered on the way out (see the sketch after this list).

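As a sketch of the last two items together, this is roughly what a pipeline for filtered own-logs could look like. The owntelemetry receiver is the hypothetical component from this proposal, the filter processor config is illustrative only, and the existing logging exporter stands in for the stdout exporter described above:

receivers:
  owntelemetry:        # hypothetical receiver fed the collector's own logs

processors:
  filter/ownlogs:      # drop unwanted records before they are written out (illustrative config)
    logs:
      exclude:
        match_type: regexp
        bodies:
          - ".*not interesting.*"

exporters:
  logging:             # existing exporter that writes records to the collector's console output

service:
  pipelines:
    logs/own:
      receivers: [ owntelemetry ]
      processors: [ filter/ownlogs ]
      exporters: [ logging ]
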
@TylerHelmuth

Part of this discussion feels related to #6629

@djaglowski

@TylerHelmuth, thanks for linking that. Clearly a lot of relevant discussion has been going on over there. I think that issue makes it clear that running our own telemetry through the collector is controversial and problematic enough that this proposal goes too far in proposing the option to do so.

My primary intention was to formalize the general principle of how the collector's telemetry should be gathered and emitted. I think @jpkrohling stated it well here:

The common practice I've been seeing is to have the monitors being external to the process being monitored, and I think this applies here as well ("who monitors the monitor" type of question). In that sense, I would recommend our users use an external OTLP endpoint whenever possible to receive the collector's telemetry data and would not have this enabled by default.

Closing this since the discussion in #7106 seems to be headed toward this idea anyway.
