Summary
We wanted to broadcast some recently-discovered issues around Datadog tracing integration in App Mesh and provide information on each of them.
Configuring the Datadog agent as a DaemonSet
Context:
The Datadog agent that receives trace data from Envoy can be deployed in three ways: as a cluster agent (one per cluster), as a sidecar container, or as a DaemonSet (in Kubernetes). To configure Datadog tracing in Envoy, users specify the address and port of the Datadog agent (see the Datadog tracing variables section of https://docs.aws.amazon.com/app-mesh/latest/userguide/envoy.html).
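For reference, a minimal sketch of how those variables typically appear on the Envoy sidecar container (the variable names are the ones documented on that page; the image tag, agent address, and port value are illustrative):

```yaml
# Illustrative Envoy sidecar container with Datadog tracing enabled.
# The agent address here points at a Service-backed agent; the DaemonSet
# deployment is the case that is currently blocked (see Problem below).
containers:
  - name: envoy
    image: public.ecr.aws/appmesh/aws-appmesh-envoy:<tag>  # placeholder tag
    env:
      - name: ENABLE_ENVOY_DATADOG_TRACING   # turn the Datadog tracer on
        value: "1"
      - name: DATADOG_TRACER_ADDRESS         # where the Datadog agent listens for traces
        value: "datadog-agent.monitoring.svc.cluster.local"
      - name: DATADOG_TRACER_PORT            # Datadog APM port (8126 by default)
        value: "8126"
```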
Problem:
When the agent is deployed as a DaemonSet, its address varies per node because the agent is bound to the node-local IP address. App Mesh currently assumes the address is either a service endpoint or localhost (for sidecars) and offers no way to configure the node-local IP address dynamically, which blocks the DaemonSet use case.
Next Steps:
We have added the capability to configure the Datadog address as status.hostIP to handle the DaemonSet use case: Allow configuring tracing address as status.hostIP using the downward API aws-app-mesh-controller-for-k8s#425. This will be included in the upcoming 1.3.0 App Mesh controller release.
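For context, this builds on the standard Kubernetes downward API. A minimal sketch of what the resulting Envoy sidecar environment looks like (env var names are from the App Mesh Envoy docs; setting this by hand only applies if you manage the Envoy container spec yourself, and it assumes the Datadog DaemonSet exposes its APM port on the host):

```yaml
# Sketch: resolve the Datadog agent address to the node-local IP via the
# Kubernetes downward API, so each pod talks to the agent on its own node.
env:
  - name: DATADOG_TRACER_ADDRESS
    valueFrom:
      fieldRef:
        fieldPath: status.hostIP   # IP of the node this pod is scheduled on
  - name: DATADOG_TRACER_PORT
    value: "8126"                  # assumes the agent DaemonSet exposes APM on a hostPort
```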
Missing URL information when inspecting traces in the Datadog UI
Problem:
Users have identified an issue where the HTTP URL shows up as ? in the Datadog UI when expanding a trace.
Next Steps:
The Datadog team has identified the root cause in the tracing library used by the Datadog plugin in Envoy. We are coordinating with them to track the fix and will release a new Envoy image that includes it.
Missing egress traces for select clusters (applies to all tracing, not just Datadog)
Context:
For Envoy to perform tracing, a request must be routed through Envoy's HTTP filter (as opposed to the TCP filter, since traces cannot be inspected at the TCP level). In App Mesh, whether this happens depends on how the cluster for the destination of the egress traffic is configured, which currently happens in one of the following ways:
Explicitly defined as a Virtual Node's backend: if Virtual Node A has Virtual Node B as a backend, we get traces between Envoy A and Envoy B as long as Virtual Node B uses a non-TCP listener.
An implicit AWS cluster is automatically created for each Envoy; this cluster is currently modeled as a TCP cluster.
With the ALLOW_ALL egress filter, a catch-all TCP cluster is generated.
Problem:
Given the above, egress traffic using the ALLOW_ALL egress filter and calls to AWS services currently do not generate tracing data. Traces between the application and Envoy, as well as Envoy-to-Envoy traffic over HTTP, are still generated.
Next Steps:
We've cut an issue to allow traces to be generated when calling other AWS services: Feature Request: Way to enable tracing on the default *.amazonaws.com cluster #308. For a general workaround, users can model the egress destination as a Virtual Node. Please let us know if you have use cases where an alternative solution would be preferred.
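For illustration, a minimal sketch of that workaround with the App Mesh controller's v1beta2 CRDs (all names, the namespace, and the external hostname are hypothetical; the important part is the http listener, so the calling Envoy builds an HTTP cluster for the destination and can trace requests to it):

```yaml
# Hypothetical Virtual Node modeling an external HTTP destination so that
# egress traffic to it goes through Envoy's HTTP filter and is traced.
apiVersion: appmesh.k8s.aws/v1beta2
kind: VirtualNode
metadata:
  name: external-api
  namespace: my-app
spec:
  listeners:
    - portMapping:
        port: 80
        protocol: http            # a non-TCP listener is what enables tracing
  serviceDiscovery:
    dns:
      hostname: api.example.com   # hypothetical external hostname
---
# Virtual Service in front of the Virtual Node; the calling Virtual Node then
# lists this service under its backends so its Envoy routes to it over HTTP.
apiVersion: appmesh.k8s.aws/v1beta2
kind: VirtualService
metadata:
  name: api.example.com
  namespace: my-app
spec:
  awsName: api.example.com
  provider:
    virtualNode:
      virtualNodeRef:
        name: external-api
```

Note that this only helps for plain-HTTP egress; traffic that is TLS-encrypted end to end from the application still cannot go through the HTTP filter, which is the same limitation described above.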