See README.md
for more details of the Magento Config Cache corruption bug caused by race conditions.
I haven't fully gotten to the bottom of this one yet, but thought I would post my findings as they definitely improve the situation.
I discovered that sometimes the Magento configuration is generated missing a lot of the required nodes. After some brief experimentation, I discovered that the config being saved is consistent with a complete failure of the Mage_Core_Model_Config:loadDb
method.
A full and valid config is generated by the calling of the following commands:
$config = Mage::getConfig();
$config->loadBase();
$config->loadModules();
$config->loadDb();
$config->saveCache();
loadDb
looks like
public function loadDb()
{
if ($this->_isLocalConfigLoaded && Mage::isInstalled()) {
Varien_Profiler::start('config/load-db');
$dbConf = $this->getResourceModel();
$dbConf->loadToXml($this);
Varien_Profiler::stop('config/load-db');
}
return $this;
}
As you can see, if the _isLocalConfigLoaded
flag or Mage::isInstalled()
return false, then no actions will be taken before attempting to save this invalid config in the cache. Ouch.
In my case, it turned out that Mage::isInstalled()
was erroneously returning false
despite the local.xml
being in place. I'm going to continue digging into this further, but for now the following code changes have kept the site alive and stable.
The following basically adds a flag to Mage_Core_Model_Config
which will prevent a saveCache
call unless we have actually succeeded to do a loadDb
call.
+ protected $allowSaveCache = false;
public function loadDb()
{
if ($this->_isLocalConfigLoaded && Mage::isInstalled()) {
Varien_Profiler::start('config/load-db');
$dbConf = $this->getResourceModel();
$dbConf->loadToXml($this);
Varien_Profiler::stop('config/load-db');
+ $this->allowSaveCache = true;
+ } else {
+ $this->allowSaveCache = false;
+ }
return $this;
}
public function saveCache($tags=array())
{
+ if (!$this->allowSaveCache) {
+ return $this;
+ }
+
Ew, that was pretty gross. I know. But this at least prevents bad configuration being cached. The bigger issue here is why my local.xml
was periodically failing to read. I'll update this when I have answers.
This one is pretty simple. At the very end of Mage_Core_Model_Config::saveCache
I added the following line.
unset($this->_cachePartsForSave);
$this->_removeCache($cacheLockId);
+ $this->_allowCacheForInit = true;
return $this;
}
This flag blocks the usage of _canUseCacheForInit
and by extension, loadModulesCache
. This flag is set to false
during the reinit
method.
This means that, without my change, calls like the following would result in multiple full regenerations of the configuration cache when it only needed to do one.
<?php
require_once __DIR__ . '/app/Mage.php';
Mage::app();
$config = Mage::getConfig();
/**
* Sets _allowCacheForInit=false, then regenerates the config and saves it to cache
* This function could be triggered by a failure to grab any part of the config from cache
* See Mage_Core_Model_Config::_loadSectionCache
*
* By the end of this function the config has been fully regenerated and added to cache and
* should be usable again.
*/
$config->reinit();
for ($i=0; $i<100; $i++) {
/**
* Init() attempts to load the config from cache with loadModulesCache
* This fails because of _allowCacheForInit which is false, despite the fact the config
* exists and is valid.
*
* This will trigger a full config regeneration which can be quite expensive depending on
* the amount of store views etc.
*
* If the $config->reinit() above had correctly reset the _allowCacheForInit flag after
* regenerating the config, then this loop would be safe and load onfiguration from the
* cache. This wouldn't cause 100 extra config regenerations to trigger.
*/
$config->init();
}
This may not be an issue for your site, but we had a cron task running Mage::app()->init()
within a loop, which was triggering Mage_Core_Model_Config::init()
to be called in a loop, which caused the configuration to be constantly regenerated multiple times in a loop!
This one line fix settled things down quite a bit.