Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Support tainted strings coming from database for SQLi, SSTi and Code injection #4904

Open
wants to merge 10 commits into
base: master
Choose a base branch
from

Conversation

uurien
Copy link
Collaborator

@uurien uurien commented Nov 18, 2024

What does this PR do?

Makes the iast able to detect vulnerabilities coming from database sources, tainting the first row in the sql database outputs.
If the data is used directly in a SQL, Template or Code eval a vulnerability is detected.

Motivation

Improve our vulnerability detections coverage.

Checklist

  • Unit tests.

Additional Notes

System tests: DataDog/system-tests#3600

Copy link

github-actions bot commented Nov 18, 2024

Overall package size

Self size: 8.26 MB
Deduped: 94.83 MB
No deduping: 95.4 MB

Dependency sizes | name | version | self size | total size | |------|---------|-----------|------------| | @datadog/libdatadog | 0.2.2 | 29.27 MB | 29.27 MB | | @datadog/native-appsec | 8.3.0 | 19.37 MB | 19.38 MB | | @datadog/native-iast-taint-tracking | 3.2.0 | 13.9 MB | 13.91 MB | | @datadog/pprof | 5.4.1 | 9.76 MB | 10.13 MB | | protobufjs | 7.2.5 | 2.77 MB | 5.16 MB | | @datadog/native-iast-rewriter | 2.6.0 | 2.58 MB | 2.72 MB | | @opentelemetry/core | 1.14.0 | 872.87 kB | 1.47 MB | | @datadog/native-metrics | 3.0.1 | 1.06 MB | 1.46 MB | | @opentelemetry/api | 1.8.0 | 1.21 MB | 1.21 MB | | import-in-the-middle | 1.11.2 | 112.74 kB | 826.22 kB | | msgpack-lite | 0.1.26 | 201.16 kB | 281.59 kB | | source-map | 0.7.4 | 226 kB | 226 kB | | opentracing | 0.14.7 | 194.81 kB | 194.81 kB | | lru-cache | 7.18.3 | 133.92 kB | 133.92 kB | | pprof-format | 2.1.0 | 111.69 kB | 111.69 kB | | @datadog/sketches-js | 2.1.0 | 109.9 kB | 109.9 kB | | semver | 7.6.3 | 95.82 kB | 95.82 kB | | lodash.sortby | 4.7.0 | 75.76 kB | 75.76 kB | | ignore | 5.3.1 | 51.46 kB | 51.46 kB | | int64-buffer | 0.1.10 | 49.18 kB | 49.18 kB | | shell-quote | 1.8.1 | 44.96 kB | 44.96 kB | | istanbul-lib-coverage | 3.2.0 | 29.34 kB | 29.34 kB | | rfdc | 1.3.1 | 25.21 kB | 25.21 kB | | @isaacs/ttlcache | 1.4.1 | 25.2 kB | 25.2 kB | | tlhunter-sorted-set | 0.1.0 | 24.94 kB | 24.94 kB | | limiter | 1.1.5 | 23.17 kB | 23.17 kB | | dc-polyfill | 0.1.4 | 23.1 kB | 23.1 kB | | retry | 0.13.1 | 18.85 kB | 18.85 kB | | jest-docblock | 29.7.0 | 8.99 kB | 12.76 kB | | crypto-randomuuid | 1.0.0 | 11.18 kB | 11.18 kB | | path-to-regexp | 0.1.12 | 6.6 kB | 6.6 kB | | koalas | 1.0.2 | 6.47 kB | 6.47 kB | | module-details-from-path | 1.0.3 | 4.47 kB | 4.47 kB |

🤖 This report was automatically generated by heaviest-objects-in-the-universe

@pr-commenter
Copy link

pr-commenter bot commented Nov 18, 2024

Benchmarks

Benchmark execution time: 2024-12-12 17:36:01

Comparing candidate commit 57233a0 in PR branch ugaitz/taint-database-sources-sequelize-pg with baseline commit f2a3601 in branch master.

Found 0 performance improvements and 0 performance regressions! Performance is the same for 262 metrics, 4 unstable metrics.

@uurien uurien changed the title Commit to change branch Support tainted strings coming from database for SQLi, SSTi and Code injection Nov 28, 2024
@uurien uurien force-pushed the ugaitz/taint-database-sources-sequelize-pg branch 3 times, most recently from 62c7ec2 to aff39bb Compare November 29, 2024 12:36
Copy link

codecov bot commented Nov 29, 2024

Codecov Report

Attention: Patch coverage is 84.21053% with 6 lines in your changes missing coverage. Please review.

Project coverage is 88.55%. Comparing base (b6c11a6) to head (a171415).
Report is 2 commits behind head on master.

Files with missing lines Patch % Lines
.../dd-trace/src/appsec/iast/taint-tracking/plugin.js 86.36% 3 Missing ⚠️
...c/appsec/iast/analyzers/code-injection-analyzer.js 33.33% 2 Missing ⚠️
...psec/iast/analyzers/template-injection-analyzer.js 0.00% 1 Missing ⚠️
Additional details and impacted files
@@             Coverage Diff             @@
##           master    #4904       +/-   ##
===========================================
+ Coverage   63.93%   88.55%   +24.62%     
===========================================
  Files         290      119      -171     
  Lines       13216     4245     -8971     
===========================================
- Hits         8449     3759     -4690     
+ Misses       4767      486     -4281     

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

@uurien uurien force-pushed the ugaitz/taint-database-sources-sequelize-pg branch 2 times, most recently from e046315 to 55d6e19 Compare December 2, 2024 10:42
@uurien uurien marked this pull request as ready for review December 2, 2024 16:10
@uurien uurien requested review from a team as code owners December 2, 2024 16:10
if (!iastContext) return result

let rowsToTaint = this.iastConfig?.dbRowsToTaint
if (typeof rowsToTaint !== 'number') {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think that we should trust dbRowsToTaint is a number since it's coming from config ?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It is a simple and cheap check, why stop doing it? (and now it is done only once in onConfigure)

if (dbOrigin === 'sequelize' && result.dataValues) {
result.dataValues = this._taintDatabaseResult(result.dataValues, dbOrigin, iastContext, name)
} else {
Object.keys(result).forEach(key => {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think using for...of is better for performance here

result[i] = this._taintDatabaseResult(result[i], dbOrigin, iastContext, nextName)
}
} else if (result && typeof result === 'object') {
if (dbOrigin === 'sequelize' && result.dataValues) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

IMO distinguishing between databases using dbOrigin in this function is unnecessary and could make the code more complex if we add more database subscriptions in the future.

I suggest sending result.dataValues directly from the sequelize subscription and removing this part from the function. wdyt?

Copy link
Collaborator Author

@uurien uurien Dec 10, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It was my first approach, but it would be super expensive. If we are going to taint only one row, we don't need to iterate each row of the data just to taint only the first one. And this object could be complex, like 100 rows of:

[
  { ..., dataValues: { id: 123, name: "name", children: [{.., dataValues: { /*another complex object */ }]}
]

This is the reason why I decided to check the database type here and not in the instrumentation, going against our good practices.

@uurien uurien force-pushed the ugaitz/taint-database-sources-sequelize-pg branch 2 times, most recently from 7e8cce2 to 75a344b Compare December 10, 2024 14:13
@uurien uurien force-pushed the ugaitz/taint-database-sources-sequelize-pg branch from 75a344b to 57233a0 Compare December 12, 2024 17:27
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants