[Performance] Varnish ban regex has very poor performance #495
Comments
very interesting, thank you for the post and the explanations! i am surprised that removing the a while ago, we introduced support for varnish xkey. when you do cache tagging, i recommend to look at https://foshttpcache.readthedocs.io/en/latest/varnish-configuration.html#tag-invalidation-using-xkey as this lets varnish semantically understand that we are tagging, and gets rid of all regex processing - i would expect that to be a lot more efficient. still, i'd be happy for a pull request that improves the regex. as you say, in the general case we do not know what tags could look like. but at the very least when building the regex string we could put |
Xkey is a very bad implementation, and I'm not sure it's more efficient than an efficient regexp. It's also deprecated, so be careful about pushing for it. At least regexps are generic and allow correct bans. The replacement, Ykey, seems not to be available for free.
I hit the problem on APIP projects that do not use FOSHttpCache directly but a copy of its code, which is easier to replace and fix. This issue is a backport to let you know about the problem.
afaik xkey is still available for the OS version https://github.com/varnish/varnish-modules - i was not aware that it is deprecated, the varnish-modules readme does not mention that. but indeed https://docs.varnish-software.com/varnish-cache-plus/vmods/ykey/ lists several issues with xkey. i guess that means i want to move to ykey in our project - though it seems we have not been bitten by the xkey shortcomings.
could we improve something in FOSHttpCache to allow you to use it directly, or is it a policy that you fork dependencies? (for the vcl, the files are made so that they can be included from the repository directly, but i would expect people to need to touch them up and use their own copy. often varnish won't have file system access to where the php code is anyways) |
I don't know, @dunglas? But the point of this issue is of course to provide the best performance out of FOSHttpCache by default. |
AFAIR I didn't copy the code of FOSHttpCache initially. But it may have changed since then! I'll check.
thanks. just to be clear: it is totally within the license rules that parts of that code can be copied into APIP code, i have no issue with that. but imho FOSHttpCache does a decent job of documenting and testing the code, so it could be beneficial for all to reuse the code if APIP uses a significant amount of the features that the library, or also the FOSHttpCacheBundle, offers. if there are a few extension points missing or such, i would be happy to help enable using it directly. but if you only use a small part of the library functionality, then copying a few classes is probably better for maintainability than pulling in all the other functionality.
I just checked, and we indeed now use a method copied from FosHttpCache (it wasn't the case in the beginning): https://github.com/api-platform/core/blame/main/src/HttpCache/VarnishPurger.php#L47 I was planning to remove the dependency to Guzzle in API Platform to switch to Symfony HttpClient, as we try to keep the number of vendors we depend on as low as possible, and to add support for |
Well, a new version of the bundle might use Symfony HttpClient. The reason why php-http is used is because the bundle predates symfony/http-client |
@bastnic i looked at FOSHttpCache/src/ProxyClient/Varnish.php Line 193 in 6dcfc42
i don't think we can do the other optimization steps in the generic case. whats left is the xkey vs ykey discussion. i am updating the documentation in #498. ok if we close this issue? |
I think you're right, I got confused. |
no worries. thanks for the report and the discussions - made me realize we should mention ykey (and also that i should talk to our customer about switching to ykey) |
Includes now defunct API Platform Varnish configuration. From https://github.com/api-platform/api-platform/blob/v2.5.7/api/docker/varnish/conf/default.vcl Please note that is a BAN based purge (which is kinda meh ?) Please see also FriendsOfSymfony/FOSHttpCache#495 api-platform/core#1856 api-platform/api-platform#1947
The regexp used in Varnish ban has very poor performance.
This is a backport of a fix already merged in API Platform, where it's a little easier to fix because we know the pattern of the tags to ban in advance. More specifically, on APIP, we know that a tag always begins with a / and, as tags are IRIs, they are unique and not nested. We can then strip the first group (^|,). This is not the case here, where tags could be anything.
This is something I have wanted to talk about for a long time, but I didn't take the time to share it because I don't have a correct generic solution. Maybe you will have one.
The current ban regexp is not optimized and pushed my Varnish server to 100% CPU.
There are several reasons, but let me first explain how it works. This is a question of
I have a high number of requests and bans per second. Each response has a lot of tags.
But I only ban one resource at a time, so the regexp did not look that complex at first glance.
Varnish stores all bans in a (deduplicated) list. Each cached resource has a pointer to the last ban it was checked against.
If you add bans, the next time a cached resource is requested it is checked against all the new bans, and its pointer is moved to the latest one.
The number of regexp evaluations was high, but I can't change my traffic, so I can only adjust the header itself or the regexp.
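The bookkeeping described above can be modelled roughly as follows. This is only a simplified sketch, not Varnish's actual implementation: the names BanList, CachedObject and lookup are hypothetical, Python's re module stands in for the PCRE engine Varnish uses, and background processing of bans is ignored. It shows why the number of regexp evaluations grows with (requests × new bans).

```python
import re
from dataclasses import dataclass, field


@dataclass
class BanList:
    """Hypothetical stand-in for the Varnish ban list (oldest ban first)."""
    bans: list = field(default_factory=list)
    evaluations: int = 0  # how many regexp evaluations were performed

    def add(self, pattern: str) -> None:
        # The list is deduplicated: an identical ban is not stored twice.
        if all(ban.pattern != pattern for ban in self.bans):
            self.bans.append(re.compile(pattern))


@dataclass
class CachedObject:
    """A cached response with its comma-separated tag header."""
    tags_header: str
    checked: int = 0  # number of bans this object was already tested against


def lookup(obj: CachedObject, ban_list: BanList) -> bool:
    """Return True if the object is still valid, False if a ban matched."""
    # Only bans added since the last check need to be evaluated.
    for ban in ban_list.bans[obj.checked:]:
        ban_list.evaluations += 1
        if ban.search(obj.tags_header):
            return False
    # Remember the last ban checked so it is never evaluated again.
    obj.checked = len(ban_list.bans)
    return True


if __name__ == "__main__":
    bans = BanList()
    obj = CachedObject("/books/1,/books/2")
    bans.add("(^|,)(/authors/1)(,|$)")
    bans.add("(^|,)(/authors/2)(,|$)")
    print(lookup(obj, bans), bans.evaluations)
```

In this model, every object with many tags pays one regexp evaluation per new ban on its next hit, which is why the cost of a single evaluation matters so much under heavy ban traffic.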
I looked at the regexp first
Step 1: checking the beginning of the resources is not useful if tags are only handled by APIP.
Step 2: regrouping the end tag at the end is better.
Step 3: remove the prefix from the regexp
Step 4: remove the prefix from the source header
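As a rough illustration of Steps 1 and 2, the sketch below compares three pattern shapes. Only the leading group (^|,) is quoted in this issue; the overall shape (^|,)(…)(,|$), the function names and the sample tags are assumptions made for the example, and Python's re module is used in place of Varnish's PCRE engine. The reading of Step 2 as "factor the trailing (,|$) out of each alternative" is one interpretation. Steps 3 and 4 are not shown because the prefix in question is not spelled out here.

```python
import re


def generic_pattern(tags):
    # Assumed generic shape: anchored on both sides, safe for arbitrary tags.
    return "(^|,)(" + "|".join(re.escape(t) for t in tags) + ")(,|$)"


def per_tag_pattern(tags):
    # Step 1 applied: leading group dropped, end anchor repeated per tag.
    return "|".join(re.escape(t) + "(,|$)" for t in tags)


def grouped_pattern(tags):
    # Step 2 applied (one reading): end anchor written once, after the group.
    return "(" + "|".join(re.escape(t) for t in tags) + ")(,|$)"


def matches(pattern, header):
    return re.search(pattern, header) is not None


if __name__ == "__main__":
    # Hypothetical tag header: IRIs starting with "/", unique and not nested.
    header = "/books/1,/books/12,/authors/7"
    for tags in (["/books/1"], ["/books/2"], ["/books/12", "/authors/7"]):
        results = {
            matches(generic_pattern(tags), header),
            matches(per_tag_pattern(tags), header),
            matches(grouped_pattern(tags), header),
        }
        # All three shapes agree on this IRI-style header.
        assert len(results) == 1

    # With arbitrary tags the leading group is needed: "foo" must not
    # match inside "barfoo".
    assert not matches(generic_pattern(["foo"]), "barfoo,baz")
    assert matches(grouped_pattern(["foo"]), "barfoo,baz")
```

The last two assertions show why the leading group cannot simply be dropped in the generic case, where a tag may be a suffix of another tag.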
Bench:
On my production, I published Step3 today at 12:15:
On another project, I will let you guess when the patch was released: