reduce memory footprint and cpu usage when modsecurity and owasp rule… #4091

Merged
merged 1 commit into from
May 19, 2019

weltschraet
Contributor

…s are enabled globally

What this PR does / why we need it:
We have 72 ingress configurations on one cluster. The nginx ingress handles that fine, with a memory footprint of a few hundred MB. But when ModSecurity and the OWASP rules are enabled, the memory footprint increases to 3.5 GB at startup and to over 7 GB on config reload. CPU usage was also far too high, startup times were long, and the health checks sometimes failed during a config reload.
After some research I found this issue at the ModSecurity-nginx repo. After reading it I suspected that ModSecurity also keeps a separate set of rules for each location, and this seems to be the case.
When ModSecurity and the OWASP rules are enabled globally, ingress-nginx applies the rules to each location. This pull request changes that and applies the rules once, in the http block.
Memory consumption went down from the numbers above back to a few hundred MB, and CPU usage dropped noticeably.
It is still possible to apply the rules for each ingress resource separately.

Which issue this PR fixes (optional, in fixes #<issue number>(, fixes #<issue_number>, ...) format, will close that issue when PR gets merged):

maybe fixes #4041
maybe fixes #3926

It is not clear whether the reporters of those issues have ModSecurity enabled.

Special notes for your reviewer:

@k8s-ci-robot
Contributor

Thanks for your pull request. Before we can look at your pull request, you'll need to sign a Contributor License Agreement (CLA).

📝 Please follow instructions at https://git.k8s.io/community/CLA.md#the-contributor-license-agreement to sign the CLA.

It may take a couple minutes for the CLA signature to be fully registered; after that, please reply here with a new comment and we'll verify. Thanks.


Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. I understand the commands that are listed here.

@k8s-ci-robot k8s-ci-robot added cncf-cla: no Indicates the PR's author has not signed the CNCF CLA. size/S Denotes a PR that changes 10-29 lines, ignoring generated files. labels May 16, 2019
@k8s-ci-robot k8s-ci-robot added cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. and removed cncf-cla: no Indicates the PR's author has not signed the CNCF CLA. labels May 16, 2019
@weltschraet
Contributor Author

/retest

@k8s-ci-robot
Contributor

@weltschraet: Cannot trigger testing until a trusted user reviews the PR and leaves an /ok-to-test message.

In response to this:

/retest

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@aledbf
Member

aledbf commented May 18, 2019

@weltschraet please rebase

@weltschraet
Contributor Author

@aledbf done

@aledbf
Member

aledbf commented May 19, 2019

/ok-to-test

@k8s-ci-robot k8s-ci-robot added the ok-to-test Indicates a non-member PR verified by an org member that is safe to test. label May 19, 2019
@codecov-io

Codecov Report

❗ No coverage uploaded for pull request base (master@19501b2). Click here to learn what that means.
The diff coverage is n/a.

@@            Coverage Diff            @@
##             master    #4091   +/-   ##
=========================================
  Coverage          ?   57.68%           
=========================================
  Files             ?       87           
  Lines             ?     6450           
  Branches          ?        0           
=========================================
  Hits              ?     3721           
  Misses            ?     2298           
  Partials          ?      431

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 19501b2...abca32b. Read the comment docs.

@aledbf
Member

aledbf commented May 19, 2019

/lgtm
/approve

@k8s-ci-robot k8s-ci-robot added the lgtm "Looks good to me", indicates that a PR is ready to be merged. label May 19, 2019
@aledbf
Member

aledbf commented May 19, 2019

@weltschraet thanks!

@k8s-ci-robot
Contributor

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: aledbf, weltschraet

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@k8s-ci-robot k8s-ci-robot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label May 19, 2019
@k8s-ci-robot k8s-ci-robot merged commit ff80dca into kubernetes:master May 19, 2019
@weltschraet weltschraet deleted the modsecurity-memory branch May 20, 2019 14:38
@johnmarcou

johnmarcou commented Jul 23, 2019

Hi,

I think the nginx "default server" is affected by the ModSecurity CRS rules. I see a side effect when all of the following apply:

  • enable-owasp-modsecurity-crs: "true" is set in the ConfigMap
  • --default-ssl-certificate=kube-system/ingress-default-certificate is set in the controller args
  • SecRuleEngine is forced to On instead of DetectionOnly

The controller keeps failing to reload the configuration.

Controller log:

-------------------------------------------------------------------------------
NGINX Ingress controller
  Release:    0.25.0
  Build:      git-1387f7b7e
  Repository: https://github.com/kubernetes/ingress-nginx
-------------------------------------------------------------------------------

I0723 04:59:29.061288       8 flags.go:192] Watching for Ingress class: public
W0723 04:59:29.061358       8 flags.go:195] Only Ingresses with class "public" will be processed by this Ingress controller
W0723 04:59:29.061668       8 flags.go:221] SSL certificate chain completion is disabled (--enable-ssl-chain-completion=false)
nginx version: openresty/1.15.8.1
W0723 04:59:29.066074       8 client_config.go:541] Neither --kubeconfig nor --master was specified.  Using the inClusterConfig.  This might not work.
I0723 04:59:29.066248       8 main.go:183] Creating API client for https://10.26.0.1:443
I0723 04:59:29.074516       8 main.go:227] Running in Kubernetes cluster version v1.14 (v1.14.3) - git (clean) commit 5e53fd6bc17c0dec8434817e69b04a25d8ae0ff0 - platform linux/amd64
I0723 04:59:29.077861       8 main.go:91] Validated kube-ingress/external-ingress-nginx-ingress-default-backend as the default backend.
I0723 04:59:29.327408       8 main.go:102] Created fake certificate with PemFileName: /etc/ingress-controller/ssl/default-fake-certificate.pem
E0723 04:59:29.328673       8 main.go:131] v1.14.3
I0723 04:59:29.352456       8 nginx.go:275] Starting NGINX Ingress controller
I0723 04:59:29.370507       8 backend_ssl.go:66] Adding Secret "kube-system/ingress-default-certificate" to the local store
I0723 04:59:29.409719       8 event.go:258] Event(v1.ObjectReference{Kind:"ConfigMap", Namespace:"kube-ingress", Name:"external-ingress-nginx-ingress-controller", UID:"a5f1beee-ad06-11e9-b3a3-0050569117d1", APIVersion:"v1", ResourceVersion:"1130737", FieldPath:""}): type: 'Normal' reason: 'CREATE' ConfigMap kube-ingress/external-ingress-nginx-ingress-controller
I0723 04:59:30.457873       8 event.go:258] Event(v1.ObjectReference{Kind:"Ingress", Namespace:"kube-ingress", Name:"webapp01-nginx", UID:"a5efab8d-ad06-11e9-b3a3-0050569117d1", APIVersion:"networking.k8s.io/v1beta1", ResourceVersion:"1130730", FieldPath:""}): type: 'Normal' reason: 'CREATE' Ingress kube-ingress/webapp01-nginx
I0723 04:59:30.553593       8 nginx.go:319] Starting NGINX process
I0723 04:59:30.553700       8 leaderelection.go:235] attempting to acquire leader lease  kube-ingress/ingress-controller-leader-public-public...
W0723 04:59:30.554239       8 controller.go:384] Service "kube-ingress/external-ingress-nginx-ingress-default-backend" does not have any active Endpoint
W0723 04:59:30.554273       8 controller.go:878] Service "kube-ingress/webapp01-nginx" does not have any active Endpoint.
I0723 04:59:30.555974       8 status.go:86] new leader elected: external-ingress-nginx-ingress-controller-df645c9b5-xtd4t
I0723 04:59:30.560426       8 controller.go:133] Configuration changes detected, backend reload required.
I0723 04:59:30.798596       8 controller.go:149] Backend successfully reloaded.
I0723 04:59:30.798657       8 controller.go:158] Initial sync, sleeping for 1 second.
[23/Jul/2019:04:59:31 +0000] TCP 200 0 0 0.000
2019/07/23 04:59:31 [error] 49#49: *61 [client unix:] ModSecurity: Access denied with code 403 (phase 2). Matched "Operator `Ge' with parameter `5' against variable `TX:ANOMALY_SCORE' (Value: `10' ) [file "/etc/nginx/owasp-modsecurity-crs/rules/REQUEST-949-BLOCKING-EVALUATION.conf"] [line "80"] [id "949110"] [rev ""] [msg "Inbound Anomaly Score Exceeded (Total Score: 10)"] [data ""] [severity "2"] [ver ""] [maturity "0"] [accuracy "0"] [tag "application-multi"] [tag "language-multi"] [tag "platform-multi"] [tag "attack-generic"] [hostname "69.86.0.0"] [uri "/configuration/servers"] [unique_id "15638579718.647903"] [ref ""], client: unix:, server: , request: "POST /configuration/servers HTTP/1.1", host: "nginx-status"
W0723 04:59:31.875612       8 controller.go:176] Dynamic reconfiguration failed: unexpected error code: 403
E0723 04:59:31.875640       8 controller.go:180] Unexpected failure reconfiguring NGINX:
unexpected error code: 403
W0723 04:59:31.875651       8 queue.go:130] requeuing initial-sync, err unexpected error code: 403
I0723 04:59:33.887721       8 controller.go:133] Configuration changes detected, backend reload required.
I0723 04:59:34.113133       8 controller.go:149] Backend successfully reloaded.
I0723 04:59:34.113176       8 controller.go:158] Initial sync, sleeping for 1 second.
[23/Jul/2019:04:59:35 +0000] TCP 200 0 0 0.001
2019/07/23 04:59:35 [error] 456#456: *233 [client unix:] ModSecurity: Access denied with code 403 (phase 2). Matched "Operator `Ge' with parameter `5' against variable `TX:ANOMALY_SCORE' (Value: `10' ) [file "/etc/nginx/owasp-modsecurity-crs/rules/REQUEST-949-BLOCKING-EVALUATION.conf"] [line "80"] [id "949110"] [rev ""] [msg "Inbound Anomaly Score Exceeded (Total Score: 10)"] [data ""] [severity "2"] [ver ""] [maturity "0"] [accuracy "0"] [tag "application-multi"] [tag "language-multi"] [tag "platform-multi"] [tag "attack-generic"] [hostname "69.86.0.0"] [uri "/configuration/servers"] [unique_id "156385797570.328786"] [ref ""], client: unix:, server: , request: "POST /configuration/servers HTTP/1.1", host: "nginx-status"
W0723 04:59:35.163342       8 controller.go:176] Dynamic reconfiguration failed: unexpected error code: 403
E0723 04:59:35.163362       8 controller.go:180] Unexpected failure reconfiguring NGINX:
unexpected error code: 403

In the /var/log/modsec_audit.log, I can see:

---qWK0yl4f---H--
ModSecurity: Warning. Matched "Operator `Rx' with parameter `(?i)(?:;|\{|\||\|\||&|&&|\n|\r|`)\s*[\(,@\'\"\s]*(?:[\w'\"\./]+/|[\\\\'\"\^]*\w[\\\\'\"\^]*:.*\\\\|[\^\.\w '\"/\\\\]*\\\\)?[\"\^]*(?:m[\"\^]*(?:y[\"\^]*s[\"\^]*q[\"\^]*l(?:[\"\^]*(?:d[\"\^]*u[\"\^]*m[ (4978 characters omitted)' against variable `ARGS:json.array_1.sslCert.pemCertKey' (Value: `-----BEGIN CERTIFICATE-----\x0aMIIJJDCCBwygAwIBAgIUcYqMeY48pPJsE55VDLlnrO5kzkIwDQYJKoZIhvcNAQEL\x0aB (8952 characters omitted)' ) [file "/etc/nginx/owasp-modsecurity-crs/rules/REQUEST-932-APPLICATION-ATTACK-RCE.conf"] [line "236"] [id "932110"] [rev ""] [msg "Remote Command Execution: Windows Command Injection"] [data "Matched Data:
...
[ACTUAL TLS CERTIFICATE ]
...
END RSA PRIVATE KEY
...

My understanding is that the Ingress controller fails to start because it loads the certificate with a POST to /configuration/servers to update the config, but that endpoint is also inspected by ModSecurity, which rejects the request with a 403.

Would disabling ModSecurity with a "modsecurity off" directive on the default server, which is used for the NGINX health check and for access to the nginx stats, fix that side effect?

NOTE: whenever ModSecurity is enabled, even in DetectionOnly mode, the warning triggered by the SSL certificate load shows up in the ModSecurity audit log.

Labels
approved Indicates a PR has been approved by an approver from all required OWNERS files. cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. lgtm "Looks good to me", indicates that a PR is ready to be merged. ok-to-test Indicates a non-member PR verified by an org member that is safe to test. size/S Denotes a PR that changes 10-29 lines, ignoring generated files.
Development

Successfully merging this pull request may close these issues.

ingress-nginx crashes on reload of configuration nginx healthcheck error