Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat: add aws xray header context extraction #3898

Open
wants to merge 41 commits into
base: master
Choose a base branch
from

Conversation

wconti27
Copy link
Contributor

What does this PR do?

Add code to extract Trace Context Propagation from AWSTraceHeader, which is used by dd-trace-java to propagate context for some AWS services.

Motivation

Improving AWS propagation & dd-trace-js / dd-trace-java compatability.

Plugin Checklist

Additional Notes

Security

Datadog employees:

  • If this PR touches code that signs or publishes builds or packages, or handles credentials of any kind, I've requested a review from @DataDog/security-design-and-guidance.
  • This PR doesn't touch any of that.

Unsure? Have a question? Request a review!

@wconti27 wconti27 requested review from a team as code owners December 20, 2023 18:13
@wconti27 wconti27 requested a review from jbertran December 20, 2023 18:13
@wconti27 wconti27 changed the base branch from master to conti/add-context-extraction-aws-kinesis December 20, 2023 18:13
Copy link

github-actions bot commented Dec 20, 2023

Overall package size

Self size: 8.26 MB
Deduped: 94.55 MB
No deduping: 95.07 MB

Dependency sizes | name | version | self size | total size | |------|---------|-----------|------------| | @datadog/libdatadog | 0.2.2 | 29.27 MB | 29.27 MB | | @datadog/native-appsec | 8.3.0 | 19.37 MB | 19.38 MB | | @datadog/native-iast-taint-tracking | 3.2.0 | 13.9 MB | 13.91 MB | | @datadog/pprof | 5.4.1 | 9.76 MB | 10.13 MB | | protobufjs | 7.2.5 | 2.77 MB | 5.16 MB | | @datadog/native-iast-rewriter | 2.6.0 | 2.58 MB | 2.72 MB | | @opentelemetry/core | 1.14.0 | 872.87 kB | 1.47 MB | | @datadog/native-metrics | 3.0.1 | 1.06 MB | 1.46 MB | | @opentelemetry/api | 1.8.0 | 1.21 MB | 1.21 MB | | import-in-the-middle | 1.11.2 | 112.74 kB | 826.22 kB | | source-map | 0.7.4 | 226 kB | 226 kB | | opentracing | 0.14.7 | 194.81 kB | 194.81 kB | | lru-cache | 7.18.3 | 133.92 kB | 133.92 kB | | pprof-format | 2.1.0 | 111.69 kB | 111.69 kB | | @datadog/sketches-js | 2.1.0 | 109.9 kB | 109.9 kB | | semver | 7.6.3 | 95.82 kB | 95.82 kB | | lodash.sortby | 4.7.0 | 75.76 kB | 75.76 kB | | ignore | 5.3.1 | 51.46 kB | 51.46 kB | | shell-quote | 1.8.1 | 44.96 kB | 44.96 kB | | istanbul-lib-coverage | 3.2.0 | 29.34 kB | 29.34 kB | | rfdc | 1.3.1 | 25.21 kB | 25.21 kB | | @isaacs/ttlcache | 1.4.1 | 25.2 kB | 25.2 kB | | tlhunter-sorted-set | 0.1.0 | 24.94 kB | 24.94 kB | | limiter | 1.1.5 | 23.17 kB | 23.17 kB | | dc-polyfill | 0.1.4 | 23.1 kB | 23.1 kB | | retry | 0.13.1 | 18.85 kB | 18.85 kB | | jest-docblock | 29.7.0 | 8.99 kB | 12.76 kB | | crypto-randomuuid | 1.0.0 | 11.18 kB | 11.18 kB | | path-to-regexp | 0.1.12 | 6.6 kB | 6.6 kB | | koalas | 1.0.2 | 6.47 kB | 6.47 kB | | module-details-from-path | 1.0.3 | 4.47 kB | 4.47 kB |

🤖 This report was automatically generated by heaviest-objects-in-the-universe

Copy link

codecov bot commented Dec 20, 2023

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 83.67%. Comparing base (de47a4b) to head (7d99016).
Report is 2 commits behind head on master.

Additional details and impacted files
@@            Coverage Diff             @@
##           master    #3898      +/-   ##
==========================================
- Coverage   85.23%   83.67%   -1.57%     
==========================================
  Files         247      112     -135     
  Lines       10961     4563    -6398     
  Branches       33       33              
==========================================
- Hits         9343     3818    -5525     
+ Misses       1618      745     -873     

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

@pr-commenter
Copy link

pr-commenter bot commented Dec 20, 2023

Benchmarks

Benchmark execution time: 2024-12-13 20:33:13

Comparing candidate commit 33ee7de in PR branch conti/add-aws-xray-header-context-extraction with baseline commit 83c6928 in branch master.

Found 0 performance improvements and 0 performance regressions! Performance is the same for 264 metrics, 2 unstable metrics.

Copy link
Contributor

@Qard Qard left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Needs a test to verify the injection works correctly. Also, is support for this something we actually have consensus on and is expected to exist across all languages, or are we just implementing this because Java happened to have it?

@wconti27
Copy link
Contributor Author

Needs a test to verify the injection works correctly. Also, is support for this something we actually have consensus on and is expected to exist across all languages, or are we just implementing this because Java happened to have it?

There's no consensus on which should be used by default at the moment. This is mainly being added since Java is the most popular language for APM, and if we want propagation between a Java producer, we need to be able to extract context using the AwsTraceHeader


const parsedAttributes = this.parseDatadogAttributes(datadogAttribute)
if (parsedAttributes) return this.tracer.extract('text_map', parsedAttributes)
if (message.Attributes && message.Attributes.AWSTraceHeader) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How does this interact with W3C tracestate? We have just started providing guarantees that the tracestate is always kept even if we use our own extraction mechanism. For example, tracer.extract will now look at both Datadog and W3C headers to ensure the tracestate is kept. This doesn't seem to be handled here.

In general, I agree with @Qard that this change is significant enough that it shouldn't be done without proper consensus and a specification. Any changes to propagation are extremely difficult to change after the fact since any change would break compatibility with previous versions, including other languages.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Follow up question to this; were the same concerns considered for the Java dd-trace-java support for this functionality? (I believe it was this PR: DataDog/dd-trace-java#4730)

Language and platform agnostic interoperability between dd-trace libraries would deliver significant immediate value.

If we presume that the tracestate issue is resolved and the value is persisted, are there other concerns with the AWS-specific logic encapsulated in an SQS-specific implementation?

I'm obviously not the original PR author but I'd like to do what I can to help incorporate this functionality, even if it means pretending I'm a qualified project manager. 😆

Comment on lines 226 to 233
if (!request.params.AttributeNames) {
request.params.AttributeNames = ['AWSTraceHeader']
} else if (
!request.params.AttributeNames.includes('AWSTraceHeader') &&
!request.params.AttributeNames.includes('All')
) {
request.params.AttributeNames.push('AWSTraceHeader')
}
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you explain why you request an attribute with key 'AWSTraceHeader' while I see on the Java side that it's written with the key "X-Amzn-Trace-Id"? Shouldn't it be the same key on both sides ?

Copy link

@vandonr vandonr Jan 22, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok wait I think I found the answer here : https://docs.aws.amazon.com/xray/latest/devguide/xray-services-sqs.html
If I understand correctly, what's written on produce to the http request in the X-Amzn-Trace-Id gets added to the message system attributes under the key AWSTraceHeader

Base automatically changed from conti/add-context-extraction-aws-kinesis to master January 25, 2024 19:15
@rbonestell
Copy link

I've been searching for this solution for a while, and anxiously await the merge and resolution to this! This will help many of us to implement proper tracing for our headless NodeJS services which use SQS!

@jeebay
Copy link

jeebay commented Apr 16, 2024

I've similarly been waiting for this to land as it would clean up disconnected spans in DataDog. We have a polyglot environment and the proposed changes above work great for Java. It would be helpful to have the same behavior across languages.

@Qard @rochdev are there any blockers to validating or merging the proposed changes? I only see linting issues in the diff. I'd love to contribute to get this over the line.

@ConnorSinnott
Copy link

Is this still alive? We're really looking forward to this implementation. Its a bit glaring that the lambdas support this feature but containerized services don't.

@jsonmorris
Copy link

Hi all - any movement on this PR?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

8 participants