-
Notifications
You must be signed in to change notification settings - Fork 291
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Inject trace context into AWS Step Functions input #7585
base: master
Are you sure you want to change the base?
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Using reflection is not efficient and should be avoided. Instead, MethodHandles can be used.
The pull request lacks also of a minimum test coverage. Test must be added
...s-java-sdk-2.2/src/main/java/datadog/trace/instrumentation/aws/v2/AwsSdkClientDecorator.java
Outdated
Show resolved
Hide resolved
...s-java-sdk-2.2/src/main/java/datadog/trace/instrumentation/aws/v2/AwsSdkClientDecorator.java
Outdated
Show resolved
Hide resolved
...s-java-sdk-2.2/src/main/java/datadog/trace/instrumentation/aws/v2/AwsSdkClientDecorator.java
Outdated
Show resolved
Hide resolved
BenchmarksStartupParameters
See matching parameters
SummaryFound 0 performance improvements and 0 performance regressions! Performance is the same for 55 metrics, 8 unstable metrics. Startup time reports for petclinicgantt
title petclinic - global startup overhead: candidate=1.42.0-SNAPSHOT~30397802a8, baseline=1.42.0-SNAPSHOT~0cf47ed20e
dateFormat X
axisFormat %s
section tracing
Agent [baseline] (1.085 s) : 0, 1085441
Total [baseline] (10.494 s) : 0, 10493826
Agent [candidate] (1.085 s) : 0, 1085170
Total [candidate] (10.444 s) : 0, 10444175
section appsec
Agent [baseline] (1.219 s) : 0, 1218927
Total [baseline] (10.71 s) : 0, 10709842
Agent [candidate] (1.218 s) : 0, 1217653
Total [candidate] (10.664 s) : 0, 10664012
section iast
Agent [baseline] (1.209 s) : 0, 1208962
Total [baseline] (10.907 s) : 0, 10906523
Agent [candidate] (1.21 s) : 0, 1209769
Total [candidate] (10.888 s) : 0, 10887649
section profiling
Agent [baseline] (1.281 s) : 0, 1280920
Total [baseline] (10.705 s) : 0, 10704917
Agent [candidate] (1.279 s) : 0, 1278615
Total [candidate] (10.743 s) : 0, 10743475
gantt
title petclinic - break down per module: candidate=1.42.0-SNAPSHOT~30397802a8, baseline=1.42.0-SNAPSHOT~0cf47ed20e
dateFormat X
axisFormat %s
section tracing
BytebuddyAgent [baseline] (690.482 ms) : 0, 690482
BytebuddyAgent [candidate] (690.764 ms) : 0, 690764
GlobalTracer [baseline] (315.731 ms) : 0, 315731
GlobalTracer [candidate] (315.511 ms) : 0, 315511
AppSec [baseline] (54.354 ms) : 0, 54354
AppSec [candidate] (54.082 ms) : 0, 54082
Remote Config [baseline] (669.417 µs) : 0, 669
Remote Config [candidate] (659.642 µs) : 0, 660
Telemetry [baseline] (10.371 ms) : 0, 10371
Telemetry [candidate] (10.331 ms) : 0, 10331
section appsec
BytebuddyAgent [baseline] (707.487 ms) : 0, 707487
BytebuddyAgent [candidate] (706.179 ms) : 0, 706179
GlobalTracer [baseline] (312.766 ms) : 0, 312766
GlobalTracer [candidate] (312.316 ms) : 0, 312316
AppSec [baseline] (166.529 ms) : 0, 166529
AppSec [candidate] (166.271 ms) : 0, 166271
Remote Config [baseline] (630.732 µs) : 0, 631
Remote Config [candidate] (638.111 µs) : 0, 638
Telemetry [baseline] (7.369 ms) : 0, 7369
Telemetry [candidate] (8.088 ms) : 0, 8088
IAST [baseline] (20.663 ms) : 0, 20663
IAST [candidate] (20.726 ms) : 0, 20726
section iast
BytebuddyAgent [baseline] (805.814 ms) : 0, 805814
BytebuddyAgent [candidate] (806.248 ms) : 0, 806248
GlobalTracer [baseline] (303.735 ms) : 0, 303735
GlobalTracer [candidate] (303.975 ms) : 0, 303975
AppSec [baseline] (57.419 ms) : 0, 57419
AppSec [candidate] (55.448 ms) : 0, 55448
Remote Config [baseline] (600.239 µs) : 0, 600
Remote Config [candidate] (588.339 µs) : 0, 588
Telemetry [baseline] (7.157 ms) : 0, 7157
Telemetry [candidate] (6.998 ms) : 0, 6998
IAST [baseline] (20.426 ms) : 0, 20426
IAST [candidate] (22.655 ms) : 0, 22655
section profiling
ProfilingAgent [baseline] (91.748 ms) : 0, 91748
ProfilingAgent [candidate] (92.279 ms) : 0, 92279
BytebuddyAgent [baseline] (684.997 ms) : 0, 684997
BytebuddyAgent [candidate] (682.112 ms) : 0, 682112
GlobalTracer [baseline] (398.109 ms) : 0, 398109
GlobalTracer [candidate] (398.075 ms) : 0, 398075
AppSec [baseline] (54.61 ms) : 0, 54610
AppSec [candidate] (54.84 ms) : 0, 54840
Remote Config [baseline] (646.44 µs) : 0, 646
Remote Config [candidate] (654.178 µs) : 0, 654
Telemetry [baseline] (11.611 ms) : 0, 11611
Telemetry [candidate] (11.736 ms) : 0, 11736
Profiling [baseline] (91.772 ms) : 0, 91772
Profiling [candidate] (92.302 ms) : 0, 92302
Startup time reports for insecure-bankgantt
title insecure-bank - global startup overhead: candidate=1.42.0-SNAPSHOT~30397802a8, baseline=1.42.0-SNAPSHOT~0cf47ed20e
dateFormat X
axisFormat %s
section tracing
Agent [baseline] (1.082 s) : 0, 1081765
Total [baseline] (8.615 s) : 0, 8615252
Agent [candidate] (1.098 s) : 0, 1098301
Total [candidate] (8.696 s) : 0, 8696397
section iast
Agent [baseline] (1.209 s) : 0, 1208615
Total [baseline] (9.158 s) : 0, 9158480
Agent [candidate] (1.207 s) : 0, 1207464
Total [candidate] (9.177 s) : 0, 9177432
section iast_HARDCODED_SECRET_DISABLED
Agent [baseline] (1.209 s) : 0, 1208603
Total [baseline] (9.126 s) : 0, 9126263
Agent [candidate] (1.209 s) : 0, 1208967
Total [candidate] (9.148 s) : 0, 9147649
section iast_TELEMETRY_OFF
Agent [baseline] (1.205 s) : 0, 1205062
Total [baseline] (9.172 s) : 0, 9171690
Agent [candidate] (1.213 s) : 0, 1212908
Total [candidate] (9.126 s) : 0, 9125775
gantt
title insecure-bank - break down per module: candidate=1.42.0-SNAPSHOT~30397802a8, baseline=1.42.0-SNAPSHOT~0cf47ed20e
dateFormat X
axisFormat %s
section tracing
BytebuddyAgent [baseline] (689.889 ms) : 0, 689889
BytebuddyAgent [candidate] (697.531 ms) : 0, 697531
GlobalTracer [baseline] (314.938 ms) : 0, 314938
GlobalTracer [candidate] (320.429 ms) : 0, 320429
AppSec [baseline] (54.296 ms) : 0, 54296
AppSec [candidate] (55.151 ms) : 0, 55151
Remote Config [baseline] (661.435 µs) : 0, 661
Remote Config [candidate] (681.116 µs) : 0, 681
Telemetry [baseline] (8.151 ms) : 0, 8151
Telemetry [candidate] (10.515 ms) : 0, 10515
section iast
BytebuddyAgent [baseline] (805.32 ms) : 0, 805320
BytebuddyAgent [candidate] (804.574 ms) : 0, 804574
GlobalTracer [baseline] (303.5 ms) : 0, 303500
GlobalTracer [candidate] (303.506 ms) : 0, 303506
AppSec [baseline] (57.814 ms) : 0, 57814
AppSec [candidate] (56.765 ms) : 0, 56765
IAST [baseline] (20.466 ms) : 0, 20466
IAST [candidate] (21.155 ms) : 0, 21155
Remote Config [baseline] (612.536 µs) : 0, 613
Remote Config [candidate] (604.832 µs) : 0, 605
Telemetry [baseline] (7.078 ms) : 0, 7078
Telemetry [candidate] (7.016 ms) : 0, 7016
section iast_HARDCODED_SECRET_DISABLED
BytebuddyAgent [baseline] (804.826 ms) : 0, 804826
BytebuddyAgent [candidate] (805.217 ms) : 0, 805217
GlobalTracer [baseline] (303.641 ms) : 0, 303641
GlobalTracer [candidate] (303.353 ms) : 0, 303353
AppSec [baseline] (58.101 ms) : 0, 58101
AppSec [candidate] (55.719 ms) : 0, 55719
IAST [baseline] (20.458 ms) : 0, 20458
IAST [candidate] (22.998 ms) : 0, 22998
Remote Config [baseline] (599.373 µs) : 0, 599
Remote Config [candidate] (611.909 µs) : 0, 612
Telemetry [baseline] (7.183 ms) : 0, 7183
Telemetry [candidate] (7.195 ms) : 0, 7195
section iast_TELEMETRY_OFF
BytebuddyAgent [baseline] (801.581 ms) : 0, 801581
BytebuddyAgent [candidate] (806.708 ms) : 0, 806708
GlobalTracer [baseline] (304.357 ms) : 0, 304357
GlobalTracer [candidate] (305.745 ms) : 0, 305745
AppSec [baseline] (57.695 ms) : 0, 57695
AppSec [candidate] (58.575 ms) : 0, 58575
IAST [baseline] (19.998 ms) : 0, 19998
IAST [candidate] (20.235 ms) : 0, 20235
Remote Config [baseline] (595.976 µs) : 0, 596
Remote Config [candidate] (604.533 µs) : 0, 605
Telemetry [baseline] (6.979 ms) : 0, 6979
Telemetry [candidate] (7.109 ms) : 0, 7109
LoadParameters
See matching parameters
SummaryFound 0 performance improvements and 0 performance regressions! Performance is the same for 11 metrics, 17 unstable metrics. Request duration reports for insecure-bankgantt
title insecure-bank - request duration [CI 0.99] : candidate=1.42.0-SNAPSHOT~30397802a8, baseline=1.42.0-SNAPSHOT~0cf47ed20e
dateFormat X
axisFormat %s
section baseline
no_agent (376.487 µs) : 357, 396
. : milestone, 376,
iast (483.745 µs) : 462, 505
. : milestone, 484,
iast_FULL (558.336 µs) : 537, 580
. : milestone, 558,
iast_GLOBAL (507.661 µs) : 487, 529
. : milestone, 508,
iast_HARDCODED_SECRET_DISABLED (480.943 µs) : 460, 502
. : milestone, 481,
iast_INACTIVE (441.893 µs) : 421, 462
. : milestone, 442,
iast_TELEMETRY_OFF (475.558 µs) : 454, 497
. : milestone, 476,
tracing (445.09 µs) : 424, 466
. : milestone, 445,
section candidate
no_agent (366.239 µs) : 347, 386
. : milestone, 366,
iast (486.627 µs) : 465, 508
. : milestone, 487,
iast_FULL (561.464 µs) : 540, 583
. : milestone, 561,
iast_GLOBAL (518.5 µs) : 496, 541
. : milestone, 519,
iast_HARDCODED_SECRET_DISABLED (482.718 µs) : 461, 504
. : milestone, 483,
iast_INACTIVE (449.674 µs) : 428, 471
. : milestone, 450,
iast_TELEMETRY_OFF (466.066 µs) : 445, 487
. : milestone, 466,
tracing (441.142 µs) : 420, 462
. : milestone, 441,
Request duration reports for petclinicgantt
title petclinic - request duration [CI 0.99] : candidate=1.42.0-SNAPSHOT~30397802a8, baseline=1.42.0-SNAPSHOT~0cf47ed20e
dateFormat X
axisFormat %s
section baseline
no_agent (1.34 ms) : 1321, 1359
. : milestone, 1340,
appsec (1.728 ms) : 1706, 1750
. : milestone, 1728,
appsec_no_iast (1.726 ms) : 1702, 1749
. : milestone, 1726,
iast (1.472 ms) : 1450, 1495
. : milestone, 1472,
profiling (1.486 ms) : 1462, 1511
. : milestone, 1486,
tracing (1.472 ms) : 1447, 1496
. : milestone, 1472,
section candidate
no_agent (1.332 ms) : 1313, 1351
. : milestone, 1332,
appsec (1.72 ms) : 1697, 1743
. : milestone, 1720,
appsec_no_iast (1.71 ms) : 1686, 1733
. : milestone, 1710,
iast (1.482 ms) : 1460, 1504
. : milestone, 1482,
profiling (1.474 ms) : 1450, 1498
. : milestone, 1474,
tracing (1.475 ms) : 1450, 1500
. : milestone, 1475,
DacapoParameters
See matching parameters
SummaryFound 0 performance improvements and 0 performance regressions! Performance is the same for 12 metrics, 0 unstable metrics. Execution time for tomcatgantt
title tomcat - execution time [CI 0.99] : candidate=1.42.0-SNAPSHOT~30397802a8, baseline=1.42.0-SNAPSHOT~0cf47ed20e
dateFormat X
axisFormat %s
section baseline
no_agent (1.469 ms) : 1457, 1480
. : milestone, 1469,
appsec (2.341 ms) : 2300, 2383
. : milestone, 2341,
iast (2.079 ms) : 2028, 2130
. : milestone, 2079,
iast_GLOBAL (2.127 ms) : 2076, 2179
. : milestone, 2127,
profiling (1.946 ms) : 1903, 1988
. : milestone, 1946,
tracing (1.931 ms) : 1891, 1971
. : milestone, 1931,
section candidate
no_agent (1.466 ms) : 1455, 1477
. : milestone, 1466,
appsec (2.335 ms) : 2294, 2376
. : milestone, 2335,
iast (2.074 ms) : 2022, 2126
. : milestone, 2074,
iast_GLOBAL (2.121 ms) : 2069, 2174
. : milestone, 2121,
profiling (1.943 ms) : 1900, 1985
. : milestone, 1943,
tracing (1.918 ms) : 1878, 1958
. : milestone, 1918,
Execution time for biojavagantt
title biojava - execution time [CI 0.99] : candidate=1.42.0-SNAPSHOT~30397802a8, baseline=1.42.0-SNAPSHOT~0cf47ed20e
dateFormat X
axisFormat %s
section baseline
no_agent (14.919 s) : 14919000, 14919000
. : milestone, 14919000,
appsec (15.43 s) : 15430000, 15430000
. : milestone, 15430000,
iast (18.839 s) : 18839000, 18839000
. : milestone, 18839000,
iast_GLOBAL (17.952 s) : 17952000, 17952000
. : milestone, 17952000,
profiling (15.174 s) : 15174000, 15174000
. : milestone, 15174000,
tracing (15.373 s) : 15373000, 15373000
. : milestone, 15373000,
section candidate
no_agent (15.099 s) : 15099000, 15099000
. : milestone, 15099000,
appsec (15.29 s) : 15290000, 15290000
. : milestone, 15290000,
iast (18.926 s) : 18926000, 18926000
. : milestone, 18926000,
iast_GLOBAL (18.169 s) : 18169000, 18169000
. : milestone, 18169000,
profiling (15.184 s) : 15184000, 15184000
. : milestone, 15184000,
tracing (15.237 s) : 15237000, 15237000
. : milestone, 15237000,
|
...a-sfn-2.0/src/main/java/datadog/trace/instrumentation/aws/v2/sfn/InputAttributeInjector.java
Outdated
Show resolved
Hide resolved
...a-sfn-2.0/src/main/java/datadog/trace/instrumentation/aws/v2/sfn/InputAttributeInjector.java
Show resolved
Hide resolved
...a-sfn-2.0/src/main/java/datadog/trace/instrumentation/aws/v2/sfn/InputAttributeInjector.java
Outdated
Show resolved
Hide resolved
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I left a bunch of comments about refactoring and testing
// Include httpclient instrumentation for testing because it is a dependency for aws-sdk. | ||
testImplementation project(':dd-java-agent:instrumentation:apache-httpclient-4') | ||
testImplementation project(':dd-java-agent:instrumentation:aws-java-sdk-2.2') | ||
testImplementation 'software.amazon.awssdk:sfn:2.27.2' |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It mismatches muzzle requirements. Is it expected?
...a-sfn-2.0/src/main/java/datadog/trace/instrumentation/aws/v2/sfn/InputAttributeInjector.java
Outdated
Show resolved
Hide resolved
...a-sfn-2.0/src/main/java/datadog/trace/instrumentation/aws/v2/sfn/InputAttributeInjector.java
Outdated
Show resolved
Hide resolved
...a-sfn-2.0/src/main/java/datadog/trace/instrumentation/aws/v2/sfn/InputAttributeInjector.java
Outdated
Show resolved
Hide resolved
String ddTraceContextJSON = InputAttributeInjector.buildTraceContext(span); | ||
// Inject the trace context into the Step Function input | ||
StringBuilder modifiedInput = | ||
InputAttributeInjector.getModifiedInput(request.input(), ddTraceContextJSON); | ||
|
||
return request.toBuilder().input(modifiedInput.toString()).build(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can be deduplicated using a dedicated method:
SdkRequest injectTraceContext(request, span) {
String traceContext = InputAttributeInjector.buildTraceContext(span);
// Inject the trace context into the Step Function input
String modifiedInput = InputAttributeInjector.getModifiedInput(request.input(), traceContext);
return request.toBuilder().input(modifiedInput).build()
}
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Since I had to handle both StartExecutionRequest
and StartSyncExecutionRequest
param types I ended up doing method overloading for injectTraceContext()
dd-java-agent/instrumentation/aws-java-sfn-2.0/src/test/groovy/SfnClientTest.groovy
Outdated
Show resolved
Hide resolved
dd-java-agent/instrumentation/aws-java-sfn-2.0/src/test/groovy/SfnClientTest.groovy
Show resolved
Hide resolved
} | ||
}) | ||
|
||
def endPoint = "http://" + LOCALSTACK.getHost() + ":" + LOCALSTACK.getMappedPort(4566) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can be define as fixture field to avoid duplication (already computed at setup)
dd-java-agent/instrumentation/aws-java-sfn-2.0/src/test/groovy/SfnClientTest.groovy
Outdated
Show resolved
Hide resolved
@PerfectSlayer I think I covered all of the comments/corrections, but for some reason the CI's muzzle check is continuing to fail :/ It seems like it's mostly timing out, so may be unrelated to my changes? |
public class InputAttributeInjector { | ||
public static String buildTraceContext(AgentSpan span) { | ||
// Extract span tags | ||
StringBuilder spanTagsJSON = new StringBuilder(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This worries me because it is a potential injection vector. I would prefer the use of a JsonBuffer that handles escaping properly.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah I updated this with the JsonBuffer but when I run muzzle locally it fails with
error: cannot find symbol
import datadog.trace.bootstrap.JsonBuffer;
^
symbol: class JsonBuffer
location: package datadog.trace.bootstrap
Any ideas on if I did something very obviously wrong?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I ran into the same muzzle issue trying to use a JSON parsing library when doing the EventBridge instrumentation. I was basically told we can't use additional libraries in instrumentations and to do the JSON parsing manually with stringbuilder.
There are obviously security concerns with this, so it looks like we're blocked until there is a standardized way to parse JSONs in instrumentations in this repo.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I just merge #7973 It should provide API to build JSON payload from instrumentations (see JsonWriter
and JsonMapper
).
.../aws-java-sfn-2.0/src/main/java/datadog/trace/instrumentation/aws/v2/sfn/SfnInterceptor.java
Show resolved
Hide resolved
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'd like to see the Json construction done in a safer way.
I also think we should guard against exceptions propagating out of the listener.
spanTagsJSON.beginObject(); | ||
span.getTags() | ||
.forEach((tagKey, tagValue) -> spanTagsJSON.name(tagKey).value(tagValue.toString())); | ||
spanTagsJSON.endObject(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is this JSON format for the x-datadog-tags
expected? Not urgent but need to know at some point if we need to handle this format or not.
The value from this impl looks like this {"component":"java-aws-sdk","thread":{"name":"main","id":"1"},"env":"dev","span":{"kind":"client"}}
Usually we expect something like _dd.p.dm=-0,_dd.p.tid=658b547000000000
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
it does seem like this was written as a json encoding of the tags from the start of this pr. and reading the other associated PR's, it's not entirely clear if they expect the tags to be json-encoded or comma-separated-key=values ... will dig further.
I'm working on #7973 that might help with the JSON part. |
@nhulston is this PR ready to be reviewed ? I see that it does not build. Otherwise it can be put in draft for now |
What Does This Do
Adds an instrumentation for AWS SDK Step Functions. This enables tracing for when a Lambda function invokes a Step Function. Trace context is injected into the Step Function's
StartExecutionRequest/StartSyncExecutionRequest.Input
object.Example of traced step function in the Serverless org: app
The Logs to Traces Reducer will read the trace context from the Step Function logs and create a span for the Step Function.
Motivation
Continues the work done in Python and NodeJS.
Additional Notes
Screenshots of feature:
Contributor Checklist
type:
and (comp:
orinst:
) labels in addition to any usefull labelsclose
,fix
or any linking keywords when referencing an issue.Use
solves
instead, and assign the PR milestone to the issueJira ticket: [SVLS-5249](https://datadoghq.atlassian.net/browse/SVLS-5249)