This repository has been archived by the owner on Jun 11, 2024. It is now read-only.

Secure VueStorefrontIndexer reindexing #308

Closed
jonathanribas opened this issue Jul 7, 2020 · 2 comments

jonathanribas commented Jul 7, 2020

First of all, I'm not an Elasticsearch expert, so there may be shortcuts for these API calls. I'm just trying to make things work better than they currently do.

The Vsfbridge reindexing job currently checks neither the Elasticsearch cluster status nor the bulk thread pool queue before sending more operations to update your data.

  1. Check your Elasticsearch status

GET /_cat/health?format=json

Response

{
  "epoch": "1594120992",
  "timestamp": "13:23:12",
  "cluster": "es-xxxxxxxx",
  "status": "green",
  "node.total": "6",
  "node.data": "3",
  "shards": "334",
  "pri": "167",
  "relo": "0",
  "init": "0",
  "unassign": "0",
  "pending_tasks": "0",
  "max_task_wait_time": "-",
  "active_shards_percent": "100.0%"
}

If the status is green, we can proceed; otherwise, we stop the reindex job.

As you can see, this endpoint also reports the number of pending tasks.
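
A minimal sketch of this first check, assuming the response body has already been fetched as a string. The function name `cluster_ready` is hypothetical, and the code accepts both a bare object (as shown above) and a one-element list, since the exact shape may depend on the Elasticsearch version.

```python
import json
from typing import Tuple


def cluster_ready(health_body: str) -> Tuple[bool, int]:
    """Return (is_green, pending_tasks) from a /_cat/health?format=json body."""
    data = json.loads(health_body)
    # Assumption: the row may arrive bare or wrapped in a one-element list.
    row = data[0] if isinstance(data, list) else data
    # The _cat API returns every value as a string, so convert explicitly.
    return row["status"] == "green", int(row["pending_tasks"])


sample = '{"cluster": "es-xxxxxxxx", "status": "green", "pending_tasks": "0"}'
is_green, pending = cluster_ready(sample)
if not is_green:
    print("cluster is not green, stopping reindex job")
else:
    print("cluster is green, pending tasks:", pending)
```

The pending task count is returned alongside the status so that it can be reused in the queue capacity check later on.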

  2. Get your master node

GET /_cat/master?format=json

Response

[ { "id": "xxxxxxx", "host": "10.21.30.47", "ip": "10.21.30.47", "node": "yyyyy" } ]

  3. Get your nodes' thread pool statistics

GET /_cat/thread_pool?format=json

Response

[
  { "node_name": "x-1", "name": "bulk", "active": "0", "queue": "0", "rejected": "0" },
  { "node_name": "x-2", "name": "bulk", "active": "0", "queue": "0", "rejected": "0" },
  { "node_name": "x-3", "name": "bulk", "active": "0", "queue": "0", "rejected": "0" },
  { "node_name": "x-4", "name": "bulk", "active": "0", "queue": "0", "rejected": "0" },
  { "node_name": "x-5", "name": "bulk", "active": "0", "queue": "0", "rejected": "0" },
  { "node_name": "x-6", "name": "bulk", "active": "0", "queue": "0", "rejected": "0" }
]

Parse the result to get the entry for your master node.
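
Picking the master node's row could look roughly like this. It is only a sketch: `master_bulk_row` is a hypothetical helper, and it assumes the `node` field returned by `/_cat/master` matches the `node_name` field returned by `/_cat/thread_pool`.

```python
from typing import Dict, List, Optional


def master_bulk_row(
    master_rows: List[Dict[str, str]],
    pool_rows: List[Dict[str, str]],
    pool_name: str = "bulk",
) -> Optional[Dict[str, str]]:
    """Return the thread pool row of the elected master, or None if absent."""
    if not master_rows:
        return None
    # Assumption: "node" in /_cat/master equals "node_name" in /_cat/thread_pool.
    master_name = master_rows[0]["node"]
    for row in pool_rows:
        if row["node_name"] == master_name and row["name"] == pool_name:
            return row
    return None


master = [{"id": "xxxxxxx", "host": "10.21.30.47", "ip": "10.21.30.47", "node": "x-2"}]
pools = [
    {"node_name": "x-1", "name": "bulk", "active": "0", "queue": "0", "rejected": "0"},
    {"node_name": "x-2", "name": "bulk", "active": "1", "queue": "4", "rejected": "0"},
]
print(master_bulk_row(master, pools))
```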

  4. Get the master node's thread_pool bulk queue size

GET /_nodes/xxxxx/thread_pool, where xxxxx is your master node

Parse the result to get the bulk queue_size.
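
As a rough sketch, extracting `queue_size` from the parsed response might look like the following. The assumed response layout is `nodes.<node id>.thread_pool.<pool name>.queue_size`, and the fallback to a pool named `write` is an assumption for clusters where the bulk pool goes by that name; `bulk_queue_size` is a hypothetical helper.

```python
from typing import Any, Dict, Optional, Sequence


def bulk_queue_size(
    nodes_info: Dict[str, Any],
    pool_names: Sequence[str] = ("bulk", "write"),
) -> Optional[int]:
    """Return queue_size of the first matching pool found, or None."""
    # Assumed layout: {"nodes": {"<id>": {"thread_pool": {"bulk": {"queue_size": N}}}}}
    for node in nodes_info.get("nodes", {}).values():
        pools = node.get("thread_pool", {})
        for name in pool_names:
            if name in pools and "queue_size" in pools[name]:
                return int(pools[name]["queue_size"])
    return None


example = {
    "nodes": {
        "xxxxx": {
            "name": "yyyyy",
            "thread_pool": {"bulk": {"type": "fixed", "queue_size": 200}},
        }
    }
}
print(bulk_queue_size(example))
```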

If pending tasks + batch indexer size (the VueStorefrontIndexer indices setting) is lower than the master node's maximum bulk queue size, the cluster is healthy and we can proceed.

As bulk operations run in a loop, we can log an error if the load is too high.
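
The capacity rule above can be sketched as a small guard called before each batch in the loop. The names `can_send_batch` and the `reindex` logger are hypothetical, and the numbers in the usage lines are made up for illustration.

```python
import logging

logger = logging.getLogger("reindex")


def can_send_batch(pending_tasks: int, batch_size: int, max_queue_size: int) -> bool:
    """True when pending tasks plus the next batch stay below the queue limit."""
    healthy = pending_tasks + batch_size < max_queue_size
    if not healthy:
        # Log instead of raising so the caller decides whether to wait or abort.
        logger.error(
            "Elasticsearch load too high: pending=%d batch=%d queue_size=%d",
            pending_tasks, batch_size, max_queue_size,
        )
    return healthy


# Illustrative values only: a batch of 100 against a queue size of 200.
for pending in (0, 50, 150):
    if can_send_batch(pending, 100, 200):
        print("sending batch, pending tasks:", pending)
    else:
        print("skipping batch, pending tasks:", pending)
```

Returning a boolean keeps the decision (retry later, stop the job) in the indexing loop rather than in the check itself.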

jonathanribas commented Jul 7, 2020

I've also found this article, which suggests setting index.refresh_interval to -1 before launching a bulk operation. At the end of the bulk operation, just re-enable it. If an exception occurs during the bulk operation, we also need to re-enable it.

If you have replicas, they also suggest disabling replication while running bulk operations by setting index.number_of_replicas: 0.

I've run several tests with those settings. The best result so far comes from setting index.refresh_interval to -1 and index.number_of_replicas: 0.
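
One way to make sure the settings are restored even when the bulk operation fails is a `try`/`finally` wrapper. The sketch below is not the indexer's actual code: `bulk_settings` is a hypothetical context manager, `put_settings` stands for whatever function sends the index settings update to Elasticsearch, and the restored values (`1s`, 1 replica) are assumed defaults that should be replaced with the index's real previous settings.

```python
from contextlib import contextmanager
from typing import Any, Callable, Dict, Iterator


@contextmanager
def bulk_settings(
    put_settings: Callable[[str, Dict[str, Any]], None],
    index: str,
    refresh_interval: str = "1s",  # assumed value to restore afterwards
    replicas: int = 1,             # assumed value to restore afterwards
) -> Iterator[None]:
    """Disable refresh and replicas for the duration of a bulk run."""
    put_settings(index, {"index": {"refresh_interval": "-1", "number_of_replicas": 0}})
    try:
        yield
    finally:
        # Runs on success and on exception, so the index is never left degraded.
        put_settings(
            index,
            {"index": {"refresh_interval": refresh_interval, "number_of_replicas": replicas}},
        )


def print_settings(index: str, body: Dict[str, Any]) -> None:
    # Stand-in for the real settings call; it only prints what would be sent.
    print(index, body)


with bulk_settings(print_settings, "vue_storefront_catalog_example"):
    print("running bulk operations")
```

The index name in the usage lines is a placeholder; in practice the wrapper would surround the existing bulk loop.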

Setup

  • 6 nodes (3 master dedicated nodes and 3 data nodes)
  • 11 websites
  • 36000 SKUs

Attached charts (images not included): CPU usage, GC count, GC time, indexing time, total operations rate, total operations time.

camilloop pushed a commit to camilloop/magento2-vsbridge-indexer that referenced this issue Jul 10, 2020
camilloop pushed a commit to camilloop/magento2-vsbridge-indexer that referenced this issue Jul 10, 2020
camilloop pushed a commit to camilloop/magento2-vsbridge-indexer that referenced this issue Jul 10, 2020
camilloop pushed a commit to camilloop/magento2-vsbridge-indexer that referenced this issue Jul 13, 2020
camilloop pushed a commit to camilloop/magento2-vsbridge-indexer that referenced this issue Jul 17, 2020
camilloop pushed a commit to camilloop/magento2-vsbridge-indexer that referenced this issue Jul 19, 2020
camilloop pushed a commit to camilloop/magento2-vsbridge-indexer that referenced this issue Jul 24, 2020
afirlejczyk added a commit that referenced this issue Jul 24, 2020
@afirlejczyk

Changes are available in the master branch.
