
Snapshot Deletion Could Run more Concurrently to Snapshot Creation #82853

Closed
Tracked by #77466
original-brownbear opened this issue Jan 20, 2022 · 3 comments
Assignees
Labels
:Distributed Coordination/Snapshot/Restore Anything directly related to the `_snapshot/*` APIs >enhancement Team:Distributed (Obsolete) Meta label for distributed team (obsolete). Replaced by Distributed Indexing/Coordination.

Comments

@original-brownbear
Member

original-brownbear commented Jan 20, 2022

Currently, snapshot deletion completely locks the repository it operates on, forbidding both shard-level snapshot creates and snapshot finalisations for the entire duration of the delete.
This is a problem when executing very large deletes (i.e. a large product of snapshot count * indices_per_snapshot), because blob stores like S3 are often somewhat rate limited in how many blobs can be deleted per unit of time.
Without a change to the file structure in a repository I don't think we can speed up deletes much end-to-end beyond the current state. What we can do, however, is considerably reduce the impact of a long-running delete with the following change.

Currently a snapshot delete runs as follows:

  • create delete entry in the cluster state
  • wait for all other repository operations to halt
  • update all metadata in the repository to reflect the removal of the deleted snapshots
  • remove all files now unreferenced (this is the slow part in most real-world cases)
  • remove delete from cluster state to allow other operations to continue

With the way deletes are currently implemented, this level of locking out other operations is unnecessary.

Switching the order of the last two steps to:

  • remove delete from cluster state to allow other operations to continue
  • remove all files now unreferenced (now that we don't block other operations any more, slowness isn't so bad here)

Would drastically reduce the impact of a slow delete without introducing any safety issues.
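The effect of the reordering can be sketched with a toy simulation (hypothetical Python, not the actual Elasticsearch code): the repository "lock" is released before the slow blob deletion starts, so a concurrent operation such as a snapshot finalization no longer has to wait for it.

```python
import threading
import time

repo_lock = threading.Lock()  # stands in for the cluster-state delete entry
events = []

def delete_snapshot_reordered():
    with repo_lock:
        events.append("delete: metadata updated")  # fast work, under the lock
    # Lock released: other repository operations may now run concurrently.
    time.sleep(0.2)                                # simulate slow blob deletes
    events.append("delete: stale blobs removed")

def snapshot_finalization():
    time.sleep(0.05)
    with repo_lock:
        events.append("snapshot: finalized")       # no longer waits for blob deletes

t1 = threading.Thread(target=delete_snapshot_reordered)
t2 = threading.Thread(target=snapshot_finalization)
t1.start(); t2.start()
t1.join(); t2.join()
```

With the old ordering, "snapshot: finalized" could only appear after the slow blob deletion finished; here it lands in the middle of it.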

The reason for this being a safe change is as follows:

When we delete a snapshot, we do the following steps:

  1. Write new metadata to the shard folders in the repository (this introduces no potential for corruption because shard-level metadata only matters once it is referenced from the repository root)
  2. Collect, on heap, all blobs that can be deleted once that metadata is referenced from the root level, as well as all index folders that can then be deleted. (Due to UUID naming these paths will never be referenced again, so once the next step completes we can safely clean them up without worrying about races.)
  3. Write new root-level metadata
  4. Run deletes for everything tracked in step 2.
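A minimal sketch of those four steps, assuming the repository can be modeled as root-level metadata plus blobs tagged with the snapshots referencing them (all names here are hypothetical, not the real implementation):

```python
def delete_snapshot(repo, snapshot):
    # Step 1: write new shard-level metadata (modeled here as computing the
    # surviving snapshot set; harmless until referenced from the root).
    surviving = repo["root_refs"] - {snapshot}
    # Step 2: collect, on heap, the blobs that become unreferenced once the
    # new root metadata is written; UUID naming means these paths can never
    # be referenced again.
    to_delete = {b for b, refs in repo["blobs"].items() if refs <= {snapshot}}
    # Step 3: write new root-level metadata.
    repo["root_refs"] = surviving
    # Step 4: delete everything collected in step 2 (the slow part).
    for b in to_delete:
        del repo["blobs"][b]
    return to_delete

repo = {
    "root_refs": {"snap-1", "snap-2"},
    "blobs": {
        "blob-a": {"snap-1"},            # referenced only by snap-1
        "blob-b": {"snap-1", "snap-2"},  # shared, must survive
    },
}
deleted = delete_snapshot(repo, "snap-1")
```

The proposal only moves the cluster-state unlock between steps 3 and 4; nothing about which blobs get collected or deleted changes.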

This also does not introduce an added risk of leaking blobs in any practically relevant case, because it fundamentally does not change the cleanup properties of step 2 above. We always collect all unreferenced index folders and all unreferenced blobs in the shard folders touched by the delete in step 2. Hence, if a delete fails to fully complete, the next delete simply picks up where it left off. The mechanics of what we try to delete, and when, are untouched by this change. All we are changing is allowing other operations to continue while we delete known-never-to-be-referenced-again blobs concurrently with them.
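The "next delete picks up where the last one left off" property can be illustrated with the same kind of toy model (hypothetical names, just a sketch of the argument): because each delete recomputes the unreferenced set from the current metadata, an interrupted cleanup leaves behind work that the following delete finds again.

```python
def unreferenced_blobs(repo):
    # A blob is stale if no snapshot in the root metadata references it.
    return {b for b, refs in repo["blobs"].items()
            if not (refs & repo["root_refs"])}

def cleanup(repo, fail_after=None):
    stale = sorted(unreferenced_blobs(repo))  # recomputed on every delete
    for i, b in enumerate(stale):
        if fail_after is not None and i >= fail_after:
            return  # simulate the node dying mid-delete
        del repo["blobs"][b]

repo = {
    "root_refs": set(),  # root metadata already updated: nothing referenced
    "blobs": {"blob-a": {"snap-1"}, "blob-b": {"snap-1"}},
}
cleanup(repo, fail_after=1)  # first delete dies after removing one blob
cleanup(repo)                # the next delete removes the rest
```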

@original-brownbear original-brownbear added >enhancement :Distributed Coordination/Snapshot/Restore Anything directly related to the `_snapshot/*` APIs labels Jan 20, 2022
@original-brownbear original-brownbear self-assigned this Jan 20, 2022
@elasticmachine elasticmachine added the Team:Distributed (Obsolete) Meta label for distributed team (obsolete). Replaced by Distributed Indexing/Coordination. label Jan 20, 2022
@elasticmachine
Collaborator

Pinging @elastic/es-distributed (Team:Distributed)

@DaveCTurner
Contributor

Didn't we do this in #86514? If not, what's left of this idea?

@original-brownbear
Member Author

Right, we/I could've done better by avoiding duplicate blob deletes, but that's a lot of work and the current implementation mostly performs just fine now => Closing :)
