Add exporter to Datadog #572
Conversation
Force-pushed from e3a3619 to e44f29b
Force-pushed from e44f29b to f56b985
Force-pushed from 4405774 to 7664e6d
A few comments about details here and there. I tried it and I was able to see the spans in Datadog.
Are the events missing?
.. code-block:: sh

    pip install opentelemetry-api
nit: we can roll these up into a single pip install.
Not at the moment
assert len(argv) == 2

with tracer.start_as_current_span("client"):
Do we want to have this example live within a while True loop with a slight delay? That might better demonstrate practical usage in the product.
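A rough sketch of what that could look like (assuming the tracer from the example is already configured; the one-second delay is just an illustrative value):

import time

while True:
    with tracer.start_as_current_span("client"):
        # the example's request logic would go here
        ...
    time.sleep(1)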
if __name__ == "__main__":
    app.run(port=8082)
tiny nit: might be nice to have the port easily configurable via environment variable in case there's a conflict.
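For instance, something along these lines (FLASK_PORT is just a suggested variable name, not something the PR defines):

import os

if __name__ == "__main__":
    # fall back to 8082 when the environment variable is not set
    app.run(port=int(os.environ.get("FLASK_PORT", "8082")))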
        self._agent_writer = None

    @property
    def agent_writer(self):
I think this should be a private method. We keep it internal within ddtrace.
    else span.attributes["http.method"]
)

return span.attributes.get("component", span.name)
Not familiar with the otel spec, please excuse the naivety, but will component have a low-ish cardinality?
def shutdown(self):
    if self.agent_writer.started:
        self.agent_writer.stop()
        self.agent_writer.join(self.agent_writer.exit_timeout)
Are the otel folks cool with having a blocking operation like this? It could block for up to a second potentially. Might be worth adding a comment to state that it's required to ensure that the traces are flushed before shutdown. 🙂
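For example, keeping the same code and just adding the suggested note (a sketch, not final wording):

def shutdown(self):
    if self.agent_writer.started:
        self.agent_writer.stop()
        # blocking join is required to ensure queued traces are flushed
        # to the agent before shutdown; may wait up to exit_timeout
        self.agent_writer.join(self.agent_writer.exit_timeout)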
def _get_span_type(span):
    """Map span to Datadog span type"""
Are other span_types possible? Hopefully we're able to do http, web, worker, and cache as well.
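A hypothetical sketch of what a broader mapping could look like (the attribute and type names here are assumptions for illustration, not what the PR implements):

from opentelemetry.trace import SpanKind

# illustrative only: map span kind and attributes to Datadog span types
_SPAN_TYPE_BY_KIND = {
    SpanKind.SERVER: "web",
    SpanKind.CLIENT: "http",
    SpanKind.CONSUMER: "worker",
}

def _get_span_type(span):
    """Map span to a Datadog span type"""
    if "db.type" in span.attributes:
        # e.g. redis or memcached backends could be reported as "cache"
        return "cache"
    return _SPAN_TYPE_BY_KIND.get(span.kind, None)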
I looked at the span processor piece and left some comments. It's looking pretty good.
while self.check_traces_queue:
    trace_id = self.check_traces_queue.pop()
    if trace_id is self._FLUSH_TOKEN:
        notify_flush = True
I'm wondering if this should break the loop here to do the force flushing ASAP.
I'm following here the approach in BatchExportSpanProcessor, where force flushing still exports all the spans that had been queued up. Similarly, in the Datadog exporter we want to export all traces that are ready to be exported. Breaking the loop when we get the flush token would mean dropping traces that are ready to be exported. But I might have misunderstood how the batch exporter functions with these condition variable locks.
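In pseudocode, the intent of the loop is roughly this (the helper and attribute names below are placeholders, not the actual implementation):

notify_flush = False
while self.check_traces_queue:
    trace_id = self.check_traces_queue.pop()
    if trace_id is self._FLUSH_TOKEN:
        # remember that a flush was requested, but keep draining the queue
        # so traces that were already ready to export are not dropped
        notify_flush = True
    else:
        self._export_trace(trace_id)  # placeholder for the actual export step
if notify_flush:
    with self.flush_condition:  # placeholder condition variable
        self.flush_condition.notify()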
# check whether trace is exportable again in case that new
# spans were started since we last concluded trace was
# exportable
Is this case possible? Can't we consider that when all started spans are finished the trace is done and no more spans will come from the same trace?
I was thinking here of the async case where parent spans will exit before all child spans have done so.
In that case the number of started spans would be greater than the number of ended spans (there are some child spans that are not ended yet).
My point is, is the following case possible in a real scenario?

parent = tracer.start_span("parent")
with tracer.use_span(parent, end_on_exit=True):
    with tracer.start_as_current_span("foo"):
        print("Hello world from OpenTelemetry Python!")

# Is the trace done here?
with tracer.start_as_current_span("child", parent=parent):
    print("here we go again")

Is it possible to start a span using a parent that is ended?
Is it possible to start a span using a parent that is ended?

Looks to be the case. The exporter output of the above is these three spans:

# "foo"
Span(name="foo", context=SpanContext(trace_id=0xc979ccb3a89fa1edacef2e9bd1e6255b, span_id=0x1589d32ad80407e0, trace_state={}, is_remote=False), kind=SpanKind.INTERNAL, parent=Span(name="parent", context=SpanContext(trace_id=0xc979ccb3a89fa1edacef2e9bd1e6255b, span_id=0x4943984635edaebb, trace_state={}, is_remote=False)), start_time=2020-05-08T19:47:44.261087Z, end_time=2020-05-08T19:47:44.261138Z)
# "parent"
Span(name="parent", context=SpanContext(trace_id=0xc979ccb3a89fa1edacef2e9bd1e6255b, span_id=0x4943984635edaebb, trace_state={}, is_remote=False), kind=SpanKind.INTERNAL, parent=None, start_time=2020-05-08T19:47:44.260045Z, end_time=2020-05-08T19:47:44.261158Z)
# "child"
Span(name="child", context=SpanContext(trace_id=0xc979ccb3a89fa1edacef2e9bd1e6255b, span_id=0x05c69b112b2910f8, trace_state={}, is_remote=False), kind=SpanKind.INTERNAL, parent=Span(name="parent", context=SpanContext(trace_id=0xc979ccb3a89fa1edacef2e9bd1e6255b, span_id=0x4943984635edaebb, trace_state={}, is_remote=False)), start_time=2020-05-08T19:47:46.524331Z, end_time=2020-05-08T19:47:46.524393Z)
if not self._flushing:
    with self.condition:
The BatchSpanProcessor has logic to avoid waiting on the condition if there are too many spans to be sent. Do you think a similar approach could help here?
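For reference, that check looks roughly like this in the BatchExportSpanProcessor worker (attribute names assumed, sketch only):

with self.condition:
    # skip the timed wait when enough spans are already queued for export
    if len(self.queue) < self.max_export_batch_size and not self._flushing:
        self.condition.wait(timeout)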
The Datadog exporter does not have a notion of "max batch size" as of yet. There is a limit on the number of spans stored per trace (4096), but that's handled as spans are started. At export time we export individual traces. Under the hood, writing to a Datadog agent is handled by the ddtrace library, which maintains its own queue of traces and has logic around payload sizes.
Changes LGTM, merging this for the 0.7b.0 release.
Accepted so we can start verifying interaction with the Datadog client with users. Thanks!
I have a concern around merging this PR that I would like addressed. The spec specifically states the following here:
Other vendor-specific exporters (exporters that implement vendor protocols) should not be included in language libraries and should be placed elsewhere (the exact approach for storing and maintaining vendor-specific exporters will be defined in the future).
We asked that the Azure exporter be moved out specifically because of this, and it came up in the Stackdriver PR as well. I'd like my comment addressed before moving to approve.
Just wanted to follow up: here's the link to the Azure exporter repo that was moved as per #193. I think long term we'll want all exporters to live in the contrib repo, so as long as we're OK with merging this in until we move all exporters into the contrib repo, then I'm fine to approve. @c24t @toumorokoshi thoughts?
That sounds good to me. As we talked about a bit in the SIG call, it's much easier to develop exporters in the main repo since we don't have the same tests and tooling available in other repos. I think it's fine in general to develop in this repo and then move packages out once they're stable, like we did for the Azure exporter.
Created #691 to track moving the exporter code out of this repo.
Add an exporter to Datadog. This implementation makes use of ddtrace to handle the creation of Datadog traces and writing them to the Datadog agent. Co-Authored-By: Mauricio Vásquez <mauricio@kinvolk.io>