Extremely slow reindexing when using a MariaDb platform. #1621
Comments
@reviskar for me your query will not work under a Magento Commerce because you get rid of the |
True, did not see that as a possibility. We have |
I'm not sure the double-join can be avoided because |
@reviskar do you have any new insights on this one? Did you try upgrading MariaDB to another version? I saw an issue on the MariaDB tracker that looks like yours: https://jira.mariadb.org/browse/MDEV-15339?attachmentViewMode=list |
@romainruaud Thanks for asking. Seems to be exactly the issue we had. I've solved it as stated above, as it works for us very well and we had limited resources to solve it at that time. |
@romainruaud our service provider will upgrade our staging and production servers to MariaDB 10.4 in a few days' time, so we will find out soon enough whether the newer version is smarter in this regard. |
Great, thank you for keeping us posted about this (very annoying, I admit) issue. Regards |
I recently upgraded MariaDB to 10.4 (as it is supported by Magento 2.4.1) and found I had to disable an optimizer_switch flag, or indexation would be very slow when part of the result set was empty. This impacts more than just Elasticsuite (Magento itself also suffers); I just figured I'd share what I found. rowid_filter is new in MariaDB 10.4, but is about 100 times slower when the result set is empty, causing indexation to be 10 times slower in my case (since I use a lot of global attributes, so store view values are empty). For my.cnf EDIT: Looks like that made it into the Magento documentation! https://devdocs.magento.com/guides/v2.4/performance-best-practices/configuration.html#indexers |
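To check whether a given server is affected by the flag discussed above, one option is to inspect the value returned by `SELECT @@optimizer_switch`, which is a comma-separated list of `name=on|off` pairs. The following is only a minimal sketch: the helper name `parse_optimizer_switch` and the shortened sample string are made up for illustration, and no database connection is involved.

```python
def parse_optimizer_switch(raw):
    """Turn an optimizer_switch string into a {flag_name: enabled} dict."""
    flags = {}
    for item in raw.split(","):
        name, _, state = item.strip().partition("=")
        if name:
            flags[name] = (state == "on")
    return flags


# Hypothetical, shortened sample; a real server returns many more flags.
sample = "index_merge=on,rowid_filter=on"
flags = parse_optimizer_switch(sample)

if flags.get("rowid_filter", False):
    # The statement below is what one would run on the server (or persist
    # in my.cnf under optimizer_switch) to turn the flag off.
    print("rowid_filter is on; consider: "
          "SET GLOBAL optimizer_switch='rowid_filter=off';")
```

Servers older than MariaDB 10.4 do not report a `rowid_filter` entry, which is why the lookup defaults to `False`.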
There's also this: magento/magento2#27129 to consider and potentially patch here. Improves index times even further. |
Hi @Quazz, thank you for the update. On our side, we've had good results so far on MariaDB 10.3 with the following parameters:
|
@romainruaud Thank you for those. I believe 'optimize_join_buffer_size=on' is default from 10.4.3 onwards now. |
About 15-25% improvement (for the full catalogsearch reindex) on a 125K ish catalog with 3 store views in a quick test. Can't fully speak to whether the results are the same, though the number of documents in the index seems to match at least. |
Wow, good catch, we'll have to test it with caution, especially with Magento Enterprise where things are handled differently (row_id/entity_id). |
@Wohlie did you test it in production? By the way, are you using Magento Commerce or Open Source? |
Hi @romainruaud, we are using Magento Commerce 2.3.4. We are not live yet; go-live for this project is at the end of May. This patch has been in our code base since mid-March. We haven't tested the patch on Magento Open Source, but I reused the original code for it, so chances are high that it will also work with Magento Open Source. |
How time flies :) Are you still using this patch in production? If yes, that's probably enough testing now :) Regards |
Hi @romainruaud, yes we still use this patch in production. :) |
OK, so this could probably be integrated into the core if it helps that much with performance. I'll prioritize this. Regards |
The issue only occurs when using a MariaDB platform, and I've found that it only happens with catalog_product_entity_int reindexing in the catalogsearch_fulltext indexer.
Because the query runs in a loop, a full reindex takes over an hour on our production server, while it finishes in under a minute on a local machine.
Preconditions
MariaDB: 10.2.26
Magento Version : 2.3.2
ElasticSuite Version : 2.8.1/2.8.3
Steps to reproduce
$ bin/magento indexer:reindex catalogsearch_fulltext
Expected result
3 rows (0.008 s),
Actual result
3 rows (16.280 s),
Analysis
The problem lies partly in MariaDB failing to use the primary key index instead of the unique one on the catalog_product_entity_int table, but I think the main issue is the suboptimal SQL query being executed.
\Smile\ElasticsuiteCatalog\Model\ResourceModel\Eav\Indexer\Fulltext\Datasource\AbstractAttributeData::getAttributesRawData finds attributes using the entity_ids, and for some reason selects them from the main table first, only to join the entity tables by the entity_id value.
I've fixed our problem by dropping the inner join done here and selecting all IDs straight from the t_default (catalog_product_entity_int) table.
So instead of doing ...
... the query works perfectly well written as:
I would suggest fixing this query :)
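The original queries did not survive in this copy of the issue, so the following is only a rough sketch of the two query shapes described above, not the actual ElasticSuite SQL. It runs against an in-memory SQLite database with a toy schema (an assumption made for illustration; the real tables have more columns and the real query involves further joins), and simply checks that both shapes return the same rows when every attribute row points at an existing entity.

```python
import sqlite3

# Toy stand-in for two Magento tables. Columns are reduced to what the
# illustration needs; this is NOT the full Magento schema.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE catalog_product_entity (entity_id INTEGER PRIMARY KEY);
    CREATE TABLE catalog_product_entity_int (
        value_id     INTEGER PRIMARY KEY,
        attribute_id INTEGER NOT NULL,
        store_id     INTEGER NOT NULL,
        entity_id    INTEGER NOT NULL,
        value        INTEGER,
        UNIQUE (entity_id, attribute_id, store_id)
    );
    INSERT INTO catalog_product_entity (entity_id) VALUES (1), (2), (3);
    INSERT INTO catalog_product_entity_int
        (attribute_id, store_id, entity_id, value)
    VALUES (97, 0, 1, 1), (97, 0, 2, 0), (99, 0, 2, 4), (97, 0, 3, 1);
""")

entity_ids = [1, 2]
placeholders = ", ".join("?" for _ in entity_ids)

# Shape described in the issue: start from the main entity table, then
# join the attribute table on entity_id.
joined_sql = f"""
    SELECT t_default.entity_id, t_default.attribute_id, t_default.value
    FROM catalog_product_entity AS entity
    INNER JOIN catalog_product_entity_int AS t_default
        ON t_default.entity_id = entity.entity_id
    WHERE entity.entity_id IN ({placeholders})
      AND t_default.store_id = 0
    ORDER BY t_default.entity_id, t_default.attribute_id
"""

# Suggested shape: filter the attribute table directly by entity_id.
direct_sql = f"""
    SELECT t_default.entity_id, t_default.attribute_id, t_default.value
    FROM catalog_product_entity_int AS t_default
    WHERE t_default.entity_id IN ({placeholders})
      AND t_default.store_id = 0
    ORDER BY t_default.entity_id, t_default.attribute_id
"""

joined_rows = conn.execute(joined_sql, entity_ids).fetchall()
direct_rows = conn.execute(direct_sql, entity_ids).fetchall()
print(joined_rows == direct_rows)
```

Note that the equivalence relies on entity_id being the column that links both tables. As raised in the comments, Magento Commerce handles this differently (row_id/entity_id), so dropping the join is not necessarily safe there.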