chore(iast): redaction algorithms refactor #9126

Merged
merged 23 commits into from
Apr 30, 2024

Conversation

avara1986
Member

@avara1986 avara1986 commented Apr 29, 2024

Summary

Refactor of the IAST redaction system. The old algorithms had several problems:

  • If IAST reports two or more vulnerabilities, the last one overrides the previous ones (potential bug).
  • IAST creates a report each time a vulnerability is detected (performance regression).
  • Each vulnerability implements its own redaction algorithm, making it challenging to add more vulnerabilities with redaction.
  • The current redaction mechanism doesn't correctly cover all redaction cases, such as Pattern key and SSRF user/password scrubbing.
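
The first two problems in the list can be pictured with a small sketch. Everything in it is hypothetical: the `Vulnerability`, `OverwritingReporter`, and `AccumulatingReporter` names are made up for illustration and are not taken from dd-trace-py. It only contrasts a reporter that replaces its state on every finding with one that collects findings and builds the report once.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass(frozen=True)
class Vulnerability:
    # Hypothetical, simplified record; the real IAST data model is richer.
    type: str
    evidence: str


@dataclass
class OverwritingReporter:
    """Mimics the described bug: each new finding replaces the previous one."""

    current: Dict[str, List[Vulnerability]] = field(default_factory=dict)

    def add(self, vuln: Vulnerability) -> None:
        # A fresh report is created per finding, so earlier findings are lost.
        self.current = {"vulnerabilities": [vuln]}


@dataclass
class AccumulatingReporter:
    """Collects findings and builds the report only when asked."""

    findings: List[Vulnerability] = field(default_factory=list)

    def add(self, vuln: Vulnerability) -> None:
        # Skip exact duplicates; keep everything else.
        if vuln not in self.findings:
            self.findings.append(vuln)

    def build_report(self) -> Dict[str, List[Vulnerability]]:
        return {"vulnerabilities": list(self.findings)}


old, new = OverwritingReporter(), AccumulatingReporter()
for v in (Vulnerability("SSRF", "http://a"), Vulnerability("CMDI", "ls /tmp")):
    old.add(v)
    new.add(v)
```

With two findings, the overwriting variant ends up holding only the last one, while the accumulating variant keeps both and pays the cost of building a report a single time.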

Description

This PR adds a new algorithm to detect sensitive data. It also migrates the command injection (CMDi), SSRF, path traversal, and header injection vulnerabilities to the new system.

New classes:

  • Sensitive Handler: This class encapsulates the redaction mechanism. The redaction behavior of each vulnerability type now lives in a dictionary of analyzers.
  • Analyzers: Each one implements a simpler way to find sensitive data.
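
A minimal sketch of the handler-plus-analyzers idea, under stated assumptions. The class name echoes the description above, but the method names, the vulnerability-type keys, and both regular expressions are hypothetical and are not the PR's actual code. Each analyzer returns the character ranges it considers sensitive, and the handler looks up the analyzer by vulnerability type and masks those ranges.

```python
import re
from typing import Callable, Dict, List, Tuple

Range = Tuple[int, int]  # half-open [start, end) offsets into the evidence
Analyzer = Callable[[str], List[Range]]

# Hypothetical patterns, for illustration only.
_URL_USERINFO = re.compile(r"://(?P<userinfo>[^/@\s]+)@")
_CMD_SECRET = re.compile(r"(?:--password[= ]|-p\s+)(?P<secret>\S+)")


def url_analyzer(evidence: str) -> List[Range]:
    """Mark the user:password part of a URL authority as sensitive."""
    return [m.span("userinfo") for m in _URL_USERINFO.finditer(evidence)]


def command_analyzer(evidence: str) -> List[Range]:
    """Mark values passed to password-like command-line flags as sensitive."""
    return [m.span("secret") for m in _CMD_SECRET.finditer(evidence)]


class SensitiveHandler:
    """Dispatches to one analyzer per vulnerability type (keys are assumed)."""

    def __init__(self) -> None:
        self._analyzers: Dict[str, Analyzer] = {
            "SSRF": url_analyzer,
            "COMMAND_INJECTION": command_analyzer,
        }

    def redact(self, vulnerability_type: str, evidence: str) -> str:
        analyzer = self._analyzers.get(vulnerability_type)
        if analyzer is None:
            # No analyzer registered for this type: leave the evidence as-is.
            return evidence
        chars = list(evidence)
        for start, end in analyzer(evidence):
            # Replace each sensitive character with a single-character mask.
            chars[start:end] = "*" * (end - start)
        return "".join(chars)


handler = SensitiveHandler()
redacted_url = handler.redact("SSRF", "https://user:secret@example.com/path")
redacted_cmd = handler.redact(
    "COMMAND_INJECTION", "mysql -u root --password=hunter2 db"
)
```

In this shape, supporting redaction for another vulnerability type means registering one more analyzer in the dictionary rather than writing a separate redaction algorithm.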

Deprecated methods:

TODOs

  • Migrate SQL Injection to this new algorithm. File
  • Remove deprecated code. Example

Checklist

  • Change(s) are motivated and described in the PR description
  • Testing strategy is described if automated tests are not included in the PR
  • Risks are described (performance impact, potential for breakage, maintainability)
  • Change is maintainable (easy to change, telemetry, documentation)
  • Library release note guidelines are followed or label changelog/no-changelog is set
  • Documentation is included (in-code, generated user docs, public corp docs)
  • Backport labels are set (if applicable)
  • If this PR changes the public interface, I've notified @DataDog/apm-tees.
  • If change touches code that signs or publishes builds or packages, or handles credentials of any kind, I've requested a review from @DataDog/security-design-and-guidance.

Reviewer Checklist

  • Title is accurate
  • All changes are related to the pull request's stated goal
  • Description motivates each change
  • Avoids breaking API changes
  • Testing strategy adequately addresses listed risks
  • Change is maintainable (easy to change, telemetry, documentation)
  • Release note makes sense to a user of the library
  • Author has acknowledged and discussed the performance implications of this PR as reported in the benchmarks PR comment
  • Backport labels are set in a manner that is consistent with the release branch maintenance policy

@avara1986 avara1986 force-pushed the avara1986/APPSEC-52733-iast_redaction_refactor branch from 03cd1a0 to 7891e0d Compare April 29, 2024 15:59
@datadog-dd-trace-py-rkomorn

datadog-dd-trace-py-rkomorn bot commented Apr 29, 2024

Datadog Report

Branch report: avara1986/APPSEC-52733-iast_redaction_refactor
Commit report: 1be7ed0
Test service: dd-trace-py

✅ 0 Failed, 109116 Passed, 3657 Skipped, 7m 54.1s Total duration (35m 17.65s time saved)

@pr-commenter

pr-commenter bot commented Apr 29, 2024

Benchmarks

Benchmark execution time: 2024-04-30 15:23:31

Comparing candidate commit cc67ac8 in PR branch avara1986/APPSEC-52733-iast_redaction_refactor with baseline commit 1d5b789 in branch main.

Found 1 performance improvements and 8 performance regressions! Performance is the same for 192 metrics, 9 unstable metrics.

scenario:flasksimple-appsec-get

  • 🟥 execution_time [+230.040µs; +271.273µs] or [+3.662%; +4.319%]

scenario:otelspan-start-finish

  • 🟩 max_rss_usage [-761.176KB; -589.275KB] or [-3.380%; -2.617%]

scenario:sethttpmeta-no-collectipvariant

  • 🟥 max_rss_usage [+706.673KB; +780.584KB] or [+3.386%; +3.740%]

scenario:sethttpmeta-obfuscation-no-query

  • 🟥 max_rss_usage [+745.148KB; +818.705KB] or [+3.573%; +3.926%]

scenario:sethttpmeta-obfuscation-send-querystring-disabled

  • 🟥 max_rss_usage [+708.324KB; +784.258KB] or [+3.352%; +3.712%]

scenario:sethttpmeta-obfuscation-worst-case-implicit-query

  • 🟥 max_rss_usage [+717.428KB; +793.586KB] or [+3.395%; +3.756%]

scenario:sethttpmeta-useragentvariant_not_exists_1

  • 🟥 max_rss_usage [+713.730KB; +797.284KB] or [+3.417%; +3.817%]

scenario:tracer-large

  • 🟥 max_rss_usage [+684.108KB; +762.189KB] or [+3.147%; +3.506%]

scenario:tracer-small

  • 🟥 max_rss_usage [+647.584KB; +719.251KB] or [+3.135%; +3.482%]
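
As a sanity check on how to read these intervals, the absolute and relative bounds can be combined to estimate the baseline they refer to. This is only a sketch: it assumes each percentage is the corresponding absolute delta divided by one shared baseline value, which the report does not state. The `implied_baseline` helper is hypothetical.

```python
def implied_baseline(abs_delta: float, rel_delta_percent: float) -> float:
    """Baseline value implied by an absolute delta and its relative delta."""
    return abs_delta / (rel_delta_percent / 100.0)


# flasksimple-appsec-get execution_time: [+230.040µs; +271.273µs]
# or [+3.662%; +4.319%]
low = implied_baseline(230.040, 3.662)
high = implied_baseline(271.273, 4.319)
print(f"implied baseline: {low:.0f}µs to {high:.0f}µs")
```

Under that assumption both bounds point to a baseline of roughly 6.28 ms per request for the `flasksimple-appsec-get` scenario.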

@avara1986 avara1986 changed the title from "Avara1986/appsec 52733 iast redaction refactor" to "chore(iast): redaction algorithms refactor" Apr 30, 2024
@avara1986 avara1986 added the changelog/no-changelog (A changelog entry is not required for this PR.) and ASM (Application Security Monitoring) labels Apr 30, 2024
@codecov-commenter

codecov-commenter commented Apr 30, 2024

Codecov Report

Attention: Patch coverage is 0% with 666 lines in your changes are missing coverage. Please review.

Project coverage is 6.59%. Comparing base (1d5b789) to head (cc67ac8).

Files                                                  Patch %  Lines
...ec/_iast/_evidence_redaction/_sensitive_handler.py  0.00%    171 Missing ⚠️
.../appsec/iast/taint_sinks/test_command_injection.py  0.00%    91 Missing ⚠️
ddtrace/appsec/_iast/reporter.py                       0.00%    82 Missing ⚠️
...ast/taint_sinks/test_command_injection_redacted.py  0.00%    60 Missing ⚠️
tests/appsec/iast/test_iast_propagation_path.py        0.00%    52 Missing ⚠️
...ests/appsec/iast/taint_sinks/test_ssrf_redacted.py  0.00%    28 Missing ⚠️
...sts/appsec/iast/taint_sinks/test_path_traversal.py  0.00%    25 Missing ⚠️
...iast/_evidence_redaction/url_sensitive_analyzer.py  0.00%    24 Missing ⚠️
...race/appsec/_iast/taint_sinks/command_injection.py  0.00%    19 Missing ⚠️
...iast/taint_sinks/test_header_injection_redacted.py  0.00%    16 Missing ⚠️
... and 13 more
Additional details and impacted files
@@             Coverage Diff             @@
##             main    #9126       +/-   ##
===========================================
- Coverage   78.47%    6.59%   -71.88%     
===========================================
  Files        1266     1241       -25     
  Lines      119572   117813     -1759     
===========================================
- Hits        93829     7775    -86054     
- Misses      25743   110038    +84295     

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

@avara1986 avara1986 marked this pull request as ready for review April 30, 2024 14:57
@avara1986 avara1986 requested a review from a team as a code owner April 30, 2024 14:57
@avara1986 avara1986 enabled auto-merge (squash) April 30, 2024 14:57
@avara1986 avara1986 merged commit f1beaae into main Apr 30, 2024
98 of 99 checks passed
@avara1986 avara1986 deleted the avara1986/APPSEC-52733-iast_redaction_refactor branch April 30, 2024 15:42
avara1986 added a commit that referenced this pull request May 8, 2024
# Summarize
Refactor of the IAST redaction system. The old algorithms had several
problems:

## Description
This PR continues #9126:
- Migrate SQL Injection to this new algorithm
- Remove deprecated code

brettlangdon pushed a commit that referenced this pull request May 9, 2024
After the IAST redaction refactor (#9163 and #9126), the `sqlparse` dependency
is deprecated.
github-actions bot pushed a commit that referenced this pull request Jun 10, 2024
After the IAST redaction refactor (#9163 and #9126), the `sqlparse` dependency
is deprecated.

(cherry picked from commit a0a8330)
@DataDog DataDog deleted a comment from github-actions bot Jun 11, 2024
github-actions bot pushed a commit that referenced this pull request Jun 11, 2024
# Summarize
Refactor of the IAST redaction system. The old algorithms had several
problems:

## Description
This PR continues #9126:
- Migrate SQL Injection to this new algorithm
- Remove deprecated code


(cherry picked from commit 8d67869)
gnufede pushed a commit that referenced this pull request Jun 12, 2024
Backport 8d67869 from #9163 to 2.9.

# Summarize
Refactor of the IAST redaction system. The old algorithms had several
problems:

## Description
This PR continues #9126:
- Migrate SQL Injection to this new algorithm
- Remove deprecated code


Co-authored-by: Alberto Vara <alberto.vara@datadoghq.com>
erikayasuda pushed a commit that referenced this pull request Jun 12, 2024
Backport a0a8330 from #9212 to 2.9.

After the IAST redaction refactor (#9163 and #9126), the `sqlparse` dependency
is deprecated.

Co-authored-by: Alberto Vara <alberto.vara@datadoghq.com>
Labels
ASM Application Security Monitoring changelog/no-changelog A changelog entry is not required for this PR.

3 participants