Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Simplified handling of RuleMessage by removing usage of std::shared_ptr #3253

Merged
merged 4 commits into from
Oct 15, 2024

Conversation

eduar-hte
Copy link
Contributor

what

Replace usage of std::shared_ptr to pass a RuleMessage instance with a reference.

why

A RuleMessage instance is created in bool RuleWithActions::evaluate(Transaction *transaction) (see src/rule_with_actions.cc), and is then passed around many times in the evaluation of the rule with a std::shared_ptr copy in every call. Notice that this doesn't copy the RuleMessage instance, which could be costly, but copying a std::shared_ptr is not cheaper than a reference because its control block is guaranteed to be thread-safe, and thus these copies use synchronization primitives (such as atomics) which decrease performance.

Because the lifetime of the RuleMessage is limited to the RuleWithActions::evaluate method where it's created, an instance can be created in the stack (instead of allocating it in the heap and creating a std::shared_ptr) and then simply pass the instance by reference in each function call.

NOTE: Additionally, the current implementation doesn't validate that the received std::shared_ptr stores a valid instance, and just dereferences it. The current version of the code guarantees that the pointer is not going to be nullptr but this is not checked nor asserted. Using a reference explicitly states in the code that each function expects a valid instance, and thus the compiler can enforce it in every method call and the code is explicit in its requirements.

RuleWithActions::performLogging currently creates a new RuleMessage instance on the heap and switches it with the one held in the std::shared_ptr (after storing the current RuleMessage in the list of messages in the Transaction) in a specific condition (when lastLog is false and the following condition holds: hasMultimatch() && isItToBeLogged. This code has been adjusted to leverage the existing (and unused) clean method in RuleMessage, which has been renamed as reset, and just reset the fields in the RuleMessage instance being used.

misc

This is part of a series of PRs to improve performance of the library (6/n). Previous: #3248

@eduar-hte
Copy link
Contributor Author

This is part of a series of PRs to improve performance of the library (6/n). Previous: #3248

In addition to simplifying the codebase, this change introduces some minor performance improvements by replacing the copies of std::shared_ptr with a faster/cheaper reference.

These results were obtained running the benchmark test with 100'000 iterations with the OWASP CRS v4 and are presented as a reference:

  • Mac mini (1st generation)
    • v3/master: 81.33 secs
    • This PR: 79.94 secs (1.71% improvement)
  • Debian (bullseye) container running on WSL on Windows Dev Kit 2023
    • v3/master: 117.98 secs
    • This PR: 114.19 secs (3.22% improvement)

@eduar-hte
Copy link
Contributor Author

eduar-hte commented Sep 10, 2024

Issues 25 New issues

Addressed most of the simple issues (unrelated to this PR) reported by Sonarcloud to reduce 'technical debt'.

The following issues were not addressed to avoid introducing potential bugs in code that was not implemented in this PR:

  • Refactor this code to not nest more than 3 if|for|do|while|switch statements
  • Refactor this function to reduce its Cognitive Complexity from 30 to the 25 allowed.

@marcstern marcstern added the 3.x Related to ModSecurity version 3.x label Sep 20, 2024
airween
airween previously approved these changes Oct 2, 2024
@airween
Copy link
Member

airween commented Oct 2, 2024

Hi @eduar-hte,

many thanks for this great PR. Basically everything is fine, I just have one "side" question: you moved two files from test/regression directory to test/common/ (custom_debug_log.(cc|h)) (which is not a problem, you're right, they are in a better place there), but changed the syntax (I mean format/indentation). These files used to be OTR style, but it seems like now they have Allman.

This is no problem at all, you don't need to modify them, but let me ask: was that intentional or accidentally? I already thinking we should introduce a style convention (for both versions) which should be checked in CI workflow. If once we get there (at this stage), this should be reformatted, I guess.

eduar-hte and others added 4 commits October 7, 2024 11:45
…ired to execute transactions

- Simplifies memory management on error conditions
- Context will be used in unit tests too, in order to provide
  Transaction related instances.
…ead.

- Avoids copying std::shared_ptr when lifetime of the RuleMessage
  is controlled by the caller.
  - The RuleMessage instance is created in RuleWithActions::evaluate and
    then used to call the overloaded version of this method that is
    specialized by subclasses.
  - Once the call to the overloaded method returns, the std::shared_ptr
    is destroyed as it's not stored by any of the callers, so it can
    be replaced with a stack variable and avoid paying the cost of
    copying the std::shared_ptr (and its control block that is
    guaranteed to be thread-safe and thus is not a straightforward
    pointer copy)
- Introduced RuleMessage::reset because this is required by
  RuleWithActions::performLogging when it's not the 'last log', the rule
  has multimatch and it's to be logged.
  - The current version is creating allocating another instance of
    RuleMessage on the heap to copy the Rule & Transaction related state
    while all the other members in the RuleMessage are set to their
    default values.
  - The new version leverages the existent, unused and incomplete
    function 'clean' (renamed as 'reset') to do this on the current
    instance.
    - Notice that the current code preserves the value of m_saveMessage,
      so 'reset' provides an argument for the caller to control whether
      this member should be reinitialized.
- src/modsecurity.cc
  - Replace the redundant type with "auto".
- src/transaction.cc
  - Avoid this unnecessary copy by using a "const" reference.
- test/common/custom_debug_log.cc
  - Use "=default" instead of the default implementation of this special
    member functions.
    - Removed the unnecessary destructor override instead.
  - Annotate this function with "override" or "final".
    - Removed the unnecessary destructor override instead.
  - Remove this "const" qualifier from the return type in all
    declarations.
- test/common/modsecurity_test_context.h
  - Replace the redundant type with "auto".
- test/regression/regression.cc
  - Use the "nullptr" literal.
  - Replace this declaration by a structured binding declaration.
  - Replace "reinterpret_cast" with a safer operation.
- Addresses Sonarcloud issues:
  - Rewrite the code so that you no longer need this "delete".
  - Make the type of this variable a reference-to-const.
Copy link

sonarcloud bot commented Oct 7, 2024

@eduar-hte
Copy link
Contributor Author

This is no problem at all, you don't need to modify them, but let me ask: was that intentional or accidentally? I already thinking we should introduce a style convention (for both versions) which should be checked in CI workflow. If once we get there (at this stage), this should be reformatted, I guess.

You're right, this was an oversight in the 'Introduce ModSecurityTestContext to encapsulate setup of objects required to execute transactions' commit. My editor sometimes does update the formatting when saving files, and I usually catch this and revert those changes.

I've edited the commits to update the formatting to align with the rest of the codebase, as I don't like introducing formatting changes (that overwrite 'blame' history). Other than the moved custom_debug_log.* files, I think the most significant (unnecessary) reformatting was in test/regression/regression.cc.

@eduar-hte
Copy link
Contributor Author

eduar-hte commented Oct 7, 2024

(...) I already thinking we should introduce a style convention (for both versions) which should be checked in CI workflow. If once we get there (at this stage), this should be reformatted, I guess.

I haven't used it, but clang-format (and introducing a .clang-format file to the project) could be useful with this.

Did you notice the check-style (and check-coding-style) configs in Makefile.am (see here)? I think this is a mix of static code checks but also formatting (it follows Google's C++ Style Guide).

@airween
Copy link
Member

airween commented Oct 7, 2024

This is no problem at all, you don't need to modify them, but let me ask: was that intentional or accidentally? I already thinking we should introduce a style convention (for both versions) which should be checked in CI workflow. If once we get there (at this stage), this should be reformatted, I guess.

You're right, this was an oversight in the 'Introduce ModSecurityTestContext to encapsulate setup of objects required to execute transactions' commit. My editor sometimes does update the formatting when saving files, and I usually catch this and revert those changes.

I've edited the commits to update the formatting to align with the rest of the codebase, as I don't like introducing formatting changes (that overwrite 'blame' history). Other than the moved custom_debug_log.* files, I think the most significant (unnecessary) reformatting was in test/regression/regression.cc.

Oh, many thanks!

@airween
Copy link
Member

airween commented Oct 7, 2024

(...) I already thinking we should introduce a style convention (for both versions) which should be checked in CI workflow. If once we get there (at this stage), this should be reformatted, I guess.

I haven't used it, but clang-format (and introducing a .clang-format file to the project) could be useful with this.

Did you notice the check-style (and check-coding-style) configs in Makefile.am (see here)? I think this is a mix of static code checks but also formatting (it follows Google's C++ Style Guide).

I still haven't used .clang-format yet, but thanks for bringing this up, I'll check that.

Also haven't noticed that part of Makefile.am, now I just checked and realized it does nothing (perhaps because I don't have cpplint.py).

In another project where I participate as developer we use astyle, which is easy to integrate into the CI workflow. We can take a look at that.

Thanks again.

Please let me know if you finished the work on this PR, I will merge that.

@eduar-hte
Copy link
Contributor Author

eduar-hte commented Oct 7, 2024

Please let me know if you finished the work on this PR, I will merge that.

Yes, I'm done with this PR.

sonarcloud is now reporting a couple of issues about using std::make_shared & std::make_unique but those were already addressed in #3254.

@airween airween merged commit 99ce977 into owasp-modsecurity:v3/master Oct 15, 2024
49 checks passed
@airween
Copy link
Member

airween commented Oct 15, 2024

Thanks again, @eduar-hte!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
3.x Related to ModSecurity version 3.x
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants