Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix reindex command and some es utils #35463

Merged
merged 7 commits into from
Dec 13, 2024

Conversation

AmitPhulera
Copy link
Contributor

@AmitPhulera AmitPhulera commented Dec 3, 2024

Product Description

This PR started out as to fix a small bug that that I saw in the elastic_sync_multiplexed command. But while reindexing India I came across couple other issues which were small enough to get it in the PR itself.
So this PR essentially does three things -

  • Fixes an issue where calculation of size required for reindex was failing on environments with no-sub indices.
  • batch_size param was mistakenly removed from elastic_sync_multiplexed during a refactor, it was added back.
  • Adds back option to purge_ids, the issue we thought would have been fixed with ES 2-5 upgrade, but it still persists.
  • 'doc_in_es' page was broken for users who are not using sub-indices

Review by commit 🐡

Feature Flag

N/A

Safety Assurance

The changes are already covered by the tests where required and I have tested them locally. The change affects a reindex command so it has a very low blast radius.

Automated test coverage

QA Plan

N/A

Rollback instructions

  • This PR can be reverted after deploy with no further considerations

Labels & Review

  • Risk label is set correctly
  • The set of people pinged as reviewers is appropriate for the level of risk of the change

we will not reindex the sub indices as we can create them on the fly after we have new indices ready
@AmitPhulera AmitPhulera requested a review from esoergel as a code owner December 3, 2024 09:03
@AmitPhulera AmitPhulera changed the title Fix size estimation for reindex Fix reindex command and some es utils Dec 12, 2024
@@ -31,6 +32,12 @@ def to_json(doc):
found_indices = {}
es_doc_type = None
for index in iter_index_cnames():
if not settings.IS_SAAS_ENVIRONMENT:
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@gherceg I have been running in circles in refactoring transient utils and was not able to make significant progress on it so I realised by the time I will do that it would be good to fix the page for users.

# Add doc_id field to the documents
script_parts.append("""
if (!ctx._source.containsKey('doc_id')) {
ctx._source['doc_id'] = ctx._id;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does this work if purge_ids is true?

It's referencing ctx._id, which may have been removed by ctx._source.remove('_id');? I'm not confident about what ctx._source.remove() does, so I could be wrong.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes. It will work if purge_ids is true.

We are removing _id from ctx._source which is different from ctx._id. ctx._id is a read only property.

ctx._source.remove is basically a groovy style syntax to remove an object from a map. So it is just removing the attribute from the object.

@AmitPhulera AmitPhulera merged commit bcc5b6d into master Dec 13, 2024
13 checks passed
@AmitPhulera AmitPhulera deleted the ap/fix-size-estimation-for-reindex branch December 13, 2024 05:48
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants