-
-
Notifications
You must be signed in to change notification settings - Fork 865
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Optimize varnish ban regexp #4231
Conversation
61ba6e7
to
9d29b58
Compare
]; | ||
|
||
yield 'iris to generate a header with exactly the maximum length and a smaller one' => [ | ||
37, | ||
['/foo', '/bar', '/baz'], | ||
[ | ||
'((^|\,)/foo($|\,))|((^|\,)/bar($|\,))', | ||
'(^|\,)/baz($|\,)', | ||
'(/foo|/bar)($|\,)', |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think something needs to be changed in the splitting logic: the 2 segments would now fit the 37 characters limit if merged.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
hi @julienfalque, I'm fixing tests right now.
3f5ab8a
to
483632f
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Adding support for the xkey module could improve the performance even more.
9a52951
to
5d29b8c
Compare
@dunglas FYI xkey is already abandoned by varnish which prefers ykey https://docs.varnish-software.com/varnish-cache-plus/vmods/ykey/. Migrating to x/ykey is not "free" as you need the paid version of varnish (maybe I'm wrong on this one). This PR an optimization for what we already provide in apip. |
xkey still works with the free varnish and won't be removed soon |
…o 100% cpu. Several reasons, but let's explain how that work. This is a question of > rq/s * ban/s * size of the cache tags header * complexity of the regexp I've a high number of requests and bans per seconds. The response has a lot of tags. But I only ban one ressource at a time, so the regexp was not that complex at first look. Varnish stores all ban in a (deduplicated) list. Each cached resource has a pointer to the last ban checked. If you add bans, when you call a cached resources, it will check against all the new bans and store the last ones. The number of regexp analysis was high, but I can change my traffic, so I can only adjust the header itself or the regexp. I looked at the regexp first > ((^|,)\/pim\/v1\/variants($|,))|((^|,)\/pim\/v1\/variants\/e3615f0c-a957-4a85-88b9-8db30d5c5c02($|,))|((^|,)\/pim\/v1\/models\/bebdee7b-fb53-448b-b79e-1bcf0dbcf050($|,)) Step 1: checking the begining of the resources is not useful if tags are only handled by apip. > (\/pim\/v1\/variants($|,))|(\/pim\/v1\/variants\/e3615f0c-a957-4a85-88b9-8db30d5c5c02($|,))|(\/pim\/v1\/models\/bebdee7b-fb53-448b-b79e-1bcf0dbcf050($|,)) Stepe 2: regrouping the end tag at the end is better. > (\/pim\/v1/variants|\/pim\/v1/variants\/e3615f0c-a957-4a85-88b9-8db30d5c5c02|\/pim\/v1/models\/bebdee7b-fb53-448b-b79e-1bcf0dbcf050)($|,) Step 3: remove the prefix from the regexp > (/variants|\/variants\/e3615f0c-a957-4a85-88b9-8db30d5c5c02|\/models\/bebdee7b-fb53-448b-b79e-1bcf0dbcf050)($|,) Step 4: remove the prefix from the source header Bench: +--------------+---------+-------+ | subject | mean | diff | +--------------+---------+-------+ | benchCurrent | 1.612μs | 7.91x | | benchStep1 | 0.575μs | 2.82x | | benchStep2 | 0.504μs | 2.47x | | benchStep3 | 0.492μs | 2.41x | | benchStep4 | 0.204μs | 1.00x | +--------------+---------+-------+ This PR only goes to step 2.
5d29b8c
to
c443491
Compare
thanks @bastnic |
Yes xkey isn't maintained by Varnish.com (the company) anymore but is still maintained by the community as part of the Varnish modules project (I recently contributed to it and the patch has been merged). |
The current ban regexp is not optimize and killed my varnish server to 100% cpu.
Several reasons, but let's explain how that work. This is a question of
I've a high number of requests and bans per seconds. The response has a lot of tags.
But I only ban one resource at a time, so the regexp was not that complex at first look.
Varnish stores all ban in a (deduplicated) list. Each cached resource has a pointer to the last ban checked.
If you add bans, when you call a cached resources, it will check against all the new bans and store the last ones.
The number of regexp analysis was high, but I can change my traffic, so I can only adjust the header itself or the regexp.
I looked at the regexp first
Step 1: checking the begining of the resources is not useful if tags are only handled by apip.
Stepe 2: regrouping the end tag at the end is better.
Step 3: remove the prefix from the regexp
Step 4: remove the prefix from the source header
Bench:
This PR only goes to step 2.
On my production, I published Step3 today at 12:15: