Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

CPU spikes in prod every 5 minutes with Smarty v5 #1007

Closed
francislavoie opened this issue May 1, 2024 · 18 comments · Fixed by #1047
Closed

CPU spikes in prod every 5 minutes with Smarty v5 #1007

francislavoie opened this issue May 1, 2024 · 18 comments · Fixed by #1047

Comments

@francislavoie
Copy link

francislavoie commented May 1, 2024

Last week, we tried deploying an update to our prod servers upgrading from Smarty v4.3 to v5.1. Soon after, we started seeing patterns in our CPU graphs showing 100% CPU spikes (for about 5-10 seconds at a time) about every 5 minutes.

We then reverted the change, confirmed that the Smarty version change was the problem. Then, we tried upgrading to v4.4.1 then v4.5.2, to see if those exhibited the behaviour, but they didn't.

Our setup for Smarty is very, very simple. We're not using much in terms of config. It just looks like this (we're running on PHP 8.1 currently):

        // Initialize Smarty
        $smarty = new Smarty();
        $smarty->setTemplateDir("{$basedir}/smarty/templates");
        $smarty->setCompileDir("{$basedir}/smarty/templates_c");
        $smarty->setCaching(Smarty::CACHING_OFF);

        // Plugins, see https://github.com/smarty-php/smarty/issues/813
        $smarty->registerPlugin('modifier', 'json_encode', 'json_encode');
        $smarty->registerPlugin('modifier', 'htmlentities', 'htmlentities');
        $smarty->registerPlugin('modifier', 'htmlspecialchars', 'htmlspecialchars');
        $smarty->registerPlugin('modifier', 'ucfirst', 'ucfirst');
        $smarty->registerPlugin('modifier', 'count', 'count');
        $smarty->registerPlugin('modifier', 'is_object', 'is_object');
        $smarty->registerPlugin('modifier', 'month_names', 'month_names');
        $smarty->registerPlugin('modifier', 'stripos', 'stripos');
        $smarty->registerPlugin('modifier', 'strpos', 'strpos');
        $smarty->registerPlugin('modifier', 'is_null', 'is_null');
        $smarty->registerPlugin('modifier', 'min', 'min');
        $smarty->registerPlugin('modifier', 'max', 'max');
        $smarty->registerPlugin('modifier', 'file_exists', 'file_exists');

        $smarty->registerPlugin('function', 'helper', self::helper(...));
        $smarty->registerPlugin('function', 'popup', self::popup(...));
        $smarty->registerPlugin('function', 'popup_init', self::popup_init(...));
        $smarty->registerPlugin('function', 'querystring', self::querystring(...));
        $smarty->registerPlugin('function', 'selector', self::selector(...));
        $smarty->registerPlugin('function', 'vardump', self::vardump(...));
        $smarty->registerPlugin('modifier', 'ticketdata', self::ticketdata(...));

We're not seeing anything in the release notes that would suggest why there would be somekind of scheduled processing. We of course cleared out our compile dir when deploying.

We're certain that the problem is somewhere between v4.5.2 and v5.1.0, because we tried switching between them a couple of times and were able to replicate seeing CPU spikes. Unfortunately we don't have any kind of production profiling setup at the moment and we can only replicate it in prod due to the traffic levels, so we're not sure how else to dig into it.

Trying to scan through v4.5.2...v5.1.0 is difficult because we're not familiar enough with the codebase (also it's written in a very legacy style so it's hard to follow 😬)

I'd appreciate if you could give us any kind of leads.

@wisskid
Copy link
Contributor

wisskid commented May 1, 2024

I'm afraid I don't have any clue. There is no background processing taking place at all. The only way forward I see is to migrate forward in smaller steps to see what commit in Smarty triggers this. Or add some kind of profiling or strace to see what the CPU is doing.

@wisskid
Copy link
Contributor

wisskid commented May 1, 2024

Some other thoughts:

  • do you have the PHP multibyte extension enabled on production? (you should)
  • what do the timestamps in the templates_c folder tell you? Are the compiled templates somehow recompiled every 5 minutes? (they shouldn't)
  • do you use merge_compiled_includes or {include inline} heavily? (these are ignored silently in smarty 5 and I expect no one to notice the difference, but I'm not 100% sure)

In general: it would be very useful to have some way to replicate the traffic level in production, e.g. by using some kind of replay of access logs against a testing environment. That would allow you to use the steps I suggested above and see what happens without disrupting your production env.

@francislavoie
Copy link
Author

  • Yeah, we have multibyte.
  • Unfortunately we don't have v5 running in prod right now, but next time we try (probably within the next week) we'll keep an eye on compiled template timestamps.
  • Nope, we don't use merge_compiled_includes. We only use {include file='<file>'}, never {include inline}.

I'm thinking we'll take a shot at using https://github.com/reliforp/reli-prof to see what the heavy requests are doing.

@jdpanderson
Copy link

  • Unfortunately we don't have v5 running in prod right now, but next time we try (probably within the next week) we'll keep an eye on compiled template timestamps.

I checked the compiled template timestamps a couple times thinking maybe the issue was regular recompilation. The compiled template file timestamps weren't regularly updated once compiled by v5.

@nikitanp
Copy link

nikitanp commented Jun 5, 2024

We faced the same problem with the Smarty 5 and I found that some of the compiled cache files (not all of them!) were constantly recompiled and became "wasted" in OPcache. Because of that OPcache constantly resets which leads to CPU spikes. It seems that the spikes have the same time periods because of opcache.force_restart_timeout option which is 180 for us.
image

Also I noted that with the Smarty 4 some of our templates also recompile every time (because we use \Smarty::$force_compile option in one place), but they don't affect OPcache.
Now we switched to Smarty 4.5.3 on production.

Our config here

        $this->addExtension(new SmartyExtension());
        $this->setCompileDir(TEMP_PATH.'templates_c/');
        $this->setCompileCheck(self::COMPILECHECK_OFF);
        $this->setTemplateDir(ROOT_PATH.'/templates/');
        $this->muteUndefinedOrNullWarnings();

@francislavoie
Copy link
Author

That's interesting. We also have opcache.force_restart_timeout=180, I assume it's the default. We haven't tried again since (balancing priorities) but that's a good clue re wasted memory, we'll keep an eye on that when we find time to try again.

For the record, if it's useful at all, here's our opcache config:

opcache.enable	true
opcache.enable_cli	false
opcache.use_cwd	true
opcache.validate_timestamps	true
opcache.validate_permission	false
opcache.validate_root	false
opcache.dups_fix	false
opcache.revalidate_path	false
opcache.log_verbosity_level	1
opcache.memory_consumption	128.00 MB
opcache.interned_strings_buffer	8
opcache.max_accelerated_files	10000
opcache.max_wasted_percentage	0.05
opcache.consistency_checks	0
opcache.force_restart_timeout	180
opcache.revalidate_freq	2
opcache.preferred_memory_model	
opcache.blacklist_filename	
opcache.max_file_size	0
opcache.error_log	
opcache.protect_memory	false
opcache.save_comments	true
opcache.record_warnings	false
opcache.enable_file_override	false
opcache.optimization_level	2147401727
opcache.lockfile_path	/tmp
opcache.file_cache	
opcache.file_cache_only	false
opcache.file_cache_consistency_checks	true
opcache.file_update_protection	2
opcache.opt_debug_level	0
opcache.restrict_api	
opcache.huge_code_pages	false
opcache.preload	
opcache.preload_user	
opcache.jit	
opcache.jit_buffer_size	0
opcache.jit_debug	0
opcache.jit_bisect_limit	0
opcache.jit_blacklist_root_trace	16
opcache.jit_blacklist_side_trace	8
opcache.jit_hot_func	127
opcache.jit_hot_loop	64
opcache.jit_hot_return	8
opcache.jit_hot_side_exit	8
opcache.jit_max_exit_counters	8192
opcache.jit_max_loop_unrolls	8
opcache.jit_max_polymorphic_calls	2
opcache.jit_max_recursive_calls	2
opcache.jit_max_recursive_returns	2
opcache.jit_max_root_traces	1024
opcache.jit_max_side_traces	128
opcache.jit_prof_threshold	0

@wisskid
Copy link
Contributor

wisskid commented Jun 28, 2024

@nikitanp do you think these constantly recompiled cache files are a symptom of an underlying error in Smarty?
If so, could you provide more specifics?

@nikitanp
Copy link

@wisskid Firstly thanks for the reply and for your maintaining such a huge library.
I can't be sure, but I suppose it. Maybe Smarty 5 just touches the modified time of compiled templates and OPcache invalidates these files because of that (And we bumped into it because opcache.validate_timestamps is true). I tried to check the difference between Smarty 4 and 5, but it looked like huge refactoring without big changes under the hood. If I can help, feel free to ask me to do something. I would be glad to help.

I suspect some moments:

  1. I made an extension with custom modifiers, blocks and functions. I set protected $cacheable = false; for all blocks for instance because I definitely don't want them to be cached. But I suppose here that they're been correctly compiled and do not recompile.
  2. We do not use a template's cache. We rely on compiled templates.
  3. We force recompile templates for emails, but it works fine on Smarty 4 and doesn't affect OPcache.
  4. We use templates loading on the fly, maybe an issue somewhere there. Loading looks like that:
// runs in a factory method while creating a Template object
    private function addModuleResource(Module $module): void
    {
        $this->registerResource(
            $module->getName(),
            new class ($module) extends CustomPlugin {
                private Module $module;

                public function __construct(Module $module)
                {
                    $this->module = $module;
                }

                protected function fetch($name, &$source, &$mtime): void
                {
                    $path = $this->module->getViewsPath().$name;
                    if (file_exists($path) && is_file($path)) {
                        $source = file_get_contents($path);
                    }
                }

                protected function fetchTimestamp($name): bool|int
                {
                    $path = $this->module->getViewsPath().$name;
                    if (file_exists($path) && is_file($path)) {
                        return filemtime($path);
                    }

                    return false;
                }
            }
        );
    }

    public function fetch($template = null, $cache_id = null, $compile_id = null, $parent = null)
    {
        if ($template && !containsSubstring($template, ':')) {
            $pathData = $this->getLegacyTemplatePaths($template);

            $template = $pathData['path'];
            $this->addTemplateDir($pathData['folder']);
        }

        return parent::fetch($template);
    }

@wisskid
Copy link
Contributor

wisskid commented Jun 30, 2024

So, we have @nikitanp saying that "some of the compiled cache files (not all of them!) were constantly recompiled". This suggests that we are looking for a bug where a template is recompiled (i.e. new code written to a compiled file in the templates_c folder, or the timestamp of this file being renewed) while the source file has not changed.

But @jdpanderson states "I checked the compiled template timestamps a couple times thinking maybe the issue was regular recompilation. The compiled template file timestamps weren't regularly updated once compiled by v5." So at least in his case this does not seem to be the issue.

Or maybe we have two different bugs with similar symptoms.

In any case, it would be very helpful if you could come up with reproduction scenarios.

@tungps250991
Copy link

tungps250991 commented Jul 25, 2024

@jdpanderson @nikitanp @wisskid
Hi, did anyone figure out the problem?

We upgraded from v4 to v5 and also have a problem. When we run performance test by jmeter, the app server is down after running with ~400 users (total of 600 users for the daily test scenario), and there're a lot of "Too Many Requests" error at that moment.
We did compare between 2 archive builds: one with Smarty 5 and one with Smarty 4. We also did a lot of regression test for Smarty 5 but haven't found the problem like the performance test.

Additional information: A major change when upgrading is that we wrapped all our custom plugins (we wrote a lot of it) into a wrapper Extension. But I don't think it's the case, because we checked if the cached plugin exists before calling it:

class CustomSmartyExtension extends Base
{
    private array $functionHandlers = [];
    ...

    public function getFunctionHandler(string $functionName): ?FunctionHandlerInterface
    {
        if (isset($this->functionHandlers[$functionName])) {
            return $this->functionHandlers[$functionName];
        }

        if (! $this->checkPluginExist('function', $functionName)) {
            return null;
        }

        $this->functionHandlers[$functionName] = new class ($functionName) extends \Smarty\FunctionHandler\Base
        {
            private string $function_name;

            public function __construct(string $function_name)
            {
                $this->function_name = $function_name;
            }

            public function handle($params, $template)
            {
                $custom_function_name = "smarty_function_{$this->function_name}";

                if (function_exists($custom_function_name)) {
                    return call_user_func($custom_function_name, $params, $template);
                }

                return null;
            }
        };

        return $this->functionHandlers[$functionName];
    }
    ...
}

and then we add the extension when creating Smarty instance:

class OurSmarty extends Smarty
{
    public function __construct()
    {
        parent::__construct();
        $this->addExtension(new CustomSmartyExtension());
        ...
    }
}

@nikitanp
Copy link

@tungps250991 Unfortunately I haven't had enough time for investigation. We also used an extension class... and we had a lot of stuff in that extension. I can't share everything, but here's the extension's code.

<?php

declare(strict_types=1);

use Smarty\BlockHandler\BlockHandlerInterface;
use Smarty\Extension\Base;
use Smarty\FunctionHandler\FunctionHandlerInterface;

final class SmartyExtension extends Base
{
    public function getModifierCallback(string $modifierName): ?callable
    {
        return match ($modifierName) {
            // PHP native functions
            'json_encode' => 'json_encode',
            'htmlspecialchars' => 'htmlspecialchars',
            'count' => 'count',
            'is_null' => 'is_null',
            'intval' => 'intval',
            'date' => 'date',
            'range' => 'range',
            'strtotime' => 'strtotime',
            'constant' => 'constant',
            'floor' => 'floor',
            'html_entity_decode' => 'html_entity_decode',

            // Custom
            'account_date_format' => new AccountDateFormatModifierCallback(),
            'account_number_format' => new AccountNumberFormatModifierCallback(),
            'date_format' => new DateFormatModifierCallback(),
            'date_lang_format' => new DateLangFormatModifierCallback(),
            'date_locale_format' => new DateLocaleFormatModifierCallback(),
            'lower' => new LowerModifierCallback(),
            'mb_truncate' => new MbTruncateModifierCallback(),
            'number_reduction' => new NumberReductionModifierCallback(),
            'pretty_url' => new PrettyUrlModifierCallback(),
            'reduction' => new ReductionModifierCallback(),
            'remove_http' => new RemoveHttpModifierCallback(),
            'to_array' => new ToArrayModifierCallback(),
            'ucfirst' => new UCFirstModifierCallback(),
            'unescape' => new UnescapeModifierCallback(),
            default => null
        };
    }

    public function getFunctionHandler(string $functionName): ?FunctionHandlerInterface
    {
        return match ($functionName) {
            'action' => new ActionFunctionHandler(),
            'get_dictionary' => new GetDictionaryFunctionHandler(),
            'get_front_assets' => new GetFrontAssetsFunctionHandler(),
            'html_select_options' => new HtmlSelectOptionsFunctionHandler(),
            'legacy_pager' => new LegacyPagerFunctionHandler(),
            'message' => new MessageFunctionHandler(),
            'money_format' => new MoneyFormatFunctionHandler(),
            'pager' => new PagerFunctionHandler(),
            't' => new TFunctionHandler(),
            'url' => new UrlFunctionHandler(),
            'hasAccess' => new HasAccessFunctionHandler(),
            default => null
        };
    }

    public function getBlockHandler(string $blockTagName): ?BlockHandlerInterface
    {
        return match ($blockTagName) {
            'block_include' => new BlockIncludeBlockHandler(),
            'has_access' => new HasAccessBlockHandler(),
            'has_access_to_group' => new HasAccessToGroupBlockHandler(),
            't_namespace' => new TNamespaceBlockHandler(),
            default => null
        };
    }
}

@wisskid
Copy link
Contributor

wisskid commented Jul 25, 2024

Since there is still no reproduction scenario, I can't tell if the following suggestion will make any difference, but I've implemented returning the handlers from the Extension as follows:

See \Smarty\Extension\DefaultExtension::getFunctionHandler:

		if (isset($this->functionHandlers[$functionName])) {
			return $this->functionHandlers[$functionName];
		}

		switch ($functionName) {
			case 'count': $this->functionHandlers[$functionName] = new \Smarty\FunctionHandler\Count(); break;
			case 'counter': $this->functionHandlers[$functionName] = new \Smarty\FunctionHandler\Counter(); break;
			case 'cycle': $this->functionHandlers[$functionName] = new \Smarty\FunctionHandler\Cycle(); break;
			case 'fetch': $this->functionHandlers[$functionName] = new \Smarty\FunctionHandler\Fetch(); break;
			case 'html_checkboxes': $this->functionHandlers[$functionName] = new \Smarty\FunctionHandler\HtmlCheckboxes(); break;
			case 'html_image': $this->functionHandlers[$functionName] = new \Smarty\FunctionHandler\HtmlImage(); break;
			case 'html_options': $this->functionHandlers[$functionName] = new \Smarty\FunctionHandler\HtmlOptions(); break;
			case 'html_radios': $this->functionHandlers[$functionName] = new \Smarty\FunctionHandler\HtmlRadios(); break;
			case 'html_select_date': $this->functionHandlers[$functionName] = new \Smarty\FunctionHandler\HtmlSelectDate(); break;
			case 'html_select_time': $this->functionHandlers[$functionName] = new \Smarty\FunctionHandler\HtmlSelectTime(); break;
			case 'html_table': $this->functionHandlers[$functionName] = new \Smarty\FunctionHandler\HtmlTable(); break;
			case 'mailto': $this->functionHandlers[$functionName] = new \Smarty\FunctionHandler\Mailto(); break;
			case 'math': $this->functionHandlers[$functionName] = new \Smarty\FunctionHandler\Math(); break;
		}

		return $this->functionHandlers[$functionName] ?? null;

This will limit PHP garbage collection for new objects. Maybe it is related.

@francislavoie
Copy link
Author

We don't use any extensions, but we do use "plugins" to declare some functions & modifiers. I listed those in the top comment of the issue.

Did the implementation of registerPlugin() change from v4 -> v5? Does it handle things differently, creating closure instances per every invocation of a plugin? If so, that could be a lead, because we certainly have some very heavy pages that render like 10,000+ rows of HTML and each cell might be calling one or more of these plugin functions.

@wisskid
Copy link
Contributor

wisskid commented Jul 25, 2024

No, that should still be very fast array lookups.

@tungps250991
Copy link

@wisskid @nikitanp
Actually, I just write the anonymous class only in the example above. The actual code looks like this:

class CustomSmartyExtension extends Base
{
    private array $functionHandlers = [];
    ...

    public function getFunctionHandler(string $functionName): ?FunctionHandlerInterface
    {
        if (isset($this->functionHandlers[$functionName])) {
            return $this->functionHandlers[$functionName];
        }

        if (! $this->checkPluginExist('function', $functionName)) {
            return null;
        }

       $function = new CustomSmartyFunction($functionName);
        $this->functionHandlers[$functionName] = $function;

        return $function;
    }
    ...
}

And in CustomSmartyFunction:

class CustomSmartyFunction extends Base
{
    private string $functionName;

    public function __construct(string $function_name)
    {
        $this->functionName = $function_name;
    }

    public function handle($params, Template $template)
    {
        $smarty_function_name = "smarty_function_{$this->functionName}";

        if (function_exists($smarty_function_name)) {
            return call_user_func($smarty_function_name, $params, $template);
        }

        return null;
    }
}

I think this seems to make almost no difference with your code. Doesn't it?
One more thing, we did require_once the plugin file in checkPluginExist() function. Because we leave the custom plugins in a separate folder (so the license of Smarty is not violated)

@stephanlueckl
Copy link
Contributor

stephanlueckl commented Jul 27, 2024

Same problem here: After upgrading 4.x to 5, Production Server startet to have OOM issues.
OPCache wasted memory is rising to its limit within minutes.
And the opcache refresh afterwards causes hickups / heavy load.

I've seen that all included compiled templates are invalidated on every call.
Done so in \Smarty\Template\Compiled::loadCompiledTemplate

Since it is forced there, file is always removed and added again to the opcache.

And as the memory of removed files in the opcache isn't reused, this wasted memory raises.

	private function loadCompiledTemplate(Template $_smarty_tpl) {

		if (function_exists('opcache_invalidate')
			&& (!function_exists('ini_get') || strlen(ini_get("opcache.restrict_api")) < 1)
		) {
			opcache_invalidate($this->filepath, true);
		} elseif (function_exists('apc_compile_file')) {
			apc_compile_file($this->filepath);
		}
		if (defined('HHVM_VERSION')) {
			eval('?>' . file_get_contents($this->filepath));
		} else {
			include $this->filepath;
		}
	}

stephanlueckl added a commit to niceshops/smarty that referenced this issue Jul 27, 2024
stephanlueckl added a commit to niceshops/smarty that referenced this issue Jul 29, 2024
stephanlueckl added a commit to niceshops/smarty that referenced this issue Jul 29, 2024
stephanlueckl added a commit to niceshops/smarty that referenced this issue Jul 29, 2024
stephanlueckl added a commit to niceshops/smarty that referenced this issue Jul 29, 2024
stephanlueckl added a commit to niceshops/smarty that referenced this issue Jul 30, 2024
wisskid pushed a commit that referenced this issue Aug 14, 2024
…s resulting in fast raising wasted OpCache memory #1007 (#1047)

* Fixing forced OpCache Invalidation on every call, which is resulting in fast raising wasted memory
* Fix undefined $path variable warning
---------

Co-authored-by: Daniel Metzner <daniel.metzner@niceshops.com>
@francislavoie
Copy link
Author

Great news! Thank you @stephanlueckl, we'll try the fix sometime next week. Reading the PR, definitely sounds like it aligns with what we were experiencing too.

@francislavoie
Copy link
Author

Due to a myriad of factors, we never got around to trying the fix until today. We can confirm it's resolved for us! Thanks everyone! We're on v5 now.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
6 participants