Reduce enthusiasm for calling flush APIs #38503

Closed

Conversation

DaveCTurner
Contributor

Today we only really describe the benefits of the flush and synced-flush APIs:

... reduces recovery times ...

... NOTE: It is harmless to request a synced flush ...

This might lead users to believe that they should call these APIs frequently as
a matter of course, but this is not what we recommend.

This change rewords the docs a bit to talk about trade-offs and to mention more
prominently that Elasticsearch does these things automatically.

@DaveCTurner DaveCTurner added the >docs, :Distributed Indexing/Recovery, v7.0.0 and v8.0.0 labels Feb 6, 2019
@DaveCTurner DaveCTurner requested a review from bleskes February 6, 2019 10:47
@elasticmachine
Collaborator

Pinging @elastic/es-distributed

Contributor

@bleskes bleskes left a comment

I left some comments. In general, I think it's no surprise that the old docs read better to me, because I wrote them :) That said, I think this PR tries to do too much in one go. If you want to rewrite the whole page, that's fine with me, but it seems the goal was only to make it clearer that (normal) flushes should very rarely be called.

The flush process of an index makes sure that any data that is currently only
stored in the <<index-modules-translog,transaction log>> is also permanently
stored in Lucene. When restarting, {es} replays any unflushed operations from
the transaction log into Lucene to ensure that searches do not return stale
Contributor

I liked the previous explanation better 👼 The real reason is that we want to bring Lucene back to where it was before the shard was shut down, and Lucene resets itself to the last time it was flushed. Stale searches are only one aspect of it.
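
For illustration only, the replay behaviour described in this comment can be sketched with a toy model. Everything here is an assumption made for the sketch: `ToyShard` and its methods are hypothetical names, a dict stands in for Lucene, a list stands in for the transaction log, and clearing the log on flush is a simplification rather than a description of how Elasticsearch retains history.

```python
class ToyShard:
    """Toy model only: a dict stands in for Lucene, a list for the translog."""

    def __init__(self):
        self.committed = {}  # state as of the last flush (survives a restart)
        self.live = {}       # state visible while the shard is running
        self.translog = []   # operations not yet covered by a flush

    def index(self, doc_id, source):
        # Every operation is recorded in the translog before it is applied.
        self.translog.append((doc_id, source))
        self.live[doc_id] = source

    def flush(self):
        # Persist the current state and drop the operations it now covers
        # (a simplification made for this sketch).
        self.committed = dict(self.live)
        self.translog.clear()

    def restart(self):
        # The store resets to the last flush, then unflushed operations are
        # replayed. Returns the number of replayed operations.
        self.live = dict(self.committed)
        for doc_id, source in self.translog:
            self.live[doc_id] = source
        return len(self.translog)


shard = ToyShard()
shard.index("1", "a")
shard.flush()
shard.index("2", "b")          # not flushed before the restart
before = dict(shard.live)
replayed = shard.restart()     # one operation has to be replayed
```

In this model a flush immediately before the restart would leave nothing to replay, which is the recovery-time benefit the docs mention; the cost side of the trade-off is not modelled.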

Contributor Author

flush can be executed if another flush operation is already executing.
The default is `false` and will cause an exception to be thrown on
the shard level if another flush operation is already running.
flush can be executed if another flush operation is already executing. The
Contributor

The main point here is that the command waits until a flush has happened, and all operations indexed before the command was called are guaranteed to be flushed. I'm tempted to also say we need to change the default, but that's a different discussion :)

Contributor Author

Ok. This wasn't spelled out before as far as I can see, but it's a good point to make. See https://github.com/elastic/elasticsearch/pull/46245/files#diff-31f3afe6b50f556a95a396324be535e5R22-R24.

The default is true, fixed by Nhat recently.
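
As a rough sketch of what setting this parameter explicitly might look like from a client, the snippet below builds (but does not send) a flush request. The `POST /<index>/_flush` endpoint and the `wait_if_ongoing` query parameter come from the flush API docs under discussion; the helper name `flush_request`, the host `http://localhost:9200` and the index name `my-index` are placeholders invented for the example.

```python
from urllib.parse import quote, urlencode
from urllib.request import Request


def flush_request(base_url, index, wait_if_ongoing=True):
    """Build a flush request for one index (hypothetical helper)."""
    # Passing the flag explicitly avoids depending on the default, which
    # this thread notes has changed between versions.
    query = urlencode({"wait_if_ongoing": str(wait_if_ongoing).lower()})
    url = f"{base_url.rstrip('/')}/{quote(index, safe='')}/_flush?{query}"
    return Request(url, method="POST")


# Placeholder host and index name; nothing is sent over the network here.
req = flush_request("http://localhost:9200", "my-index")
```

Sending the request, for example with `urllib.request.urlopen(req)`, is left out so that the sketch does not need a running cluster.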


NOTE: It is harmless to request a synced flush while there is ongoing indexing. Shards that are idle will succeed and shards
that are not will fail. Any shards that succeeded will have faster recovery times.
The Synced Flush API allows an administrator to initiate a synced flush
Contributor

Why did you change the old explanation?

Contributor Author

The changes I made were to address a number of points that I originally found confusing:

  • It says it's useful specifically for rolling restarts, but it's actually useful for all restarts whether rolling or otherwise.
  • The note about future flushes removing the sync_id is out of place in this section on the API, since it applies even if you don't use this API.
  • It introduces the concept of a "low level Lucene commit point" without definition or explanation. This isn't really necessary; we can make that same point more clearly by talking about flushes directly.
  • The formatting (i.e. the numbered list and the NOTE: callout) was unnecessary and made it harder to read than just straight prose.
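
The sync_id behaviour mentioned in the list above can be illustrated with a toy model. This is a sketch under stated assumptions, not Elasticsearch code: `ToyCopy`, `synced_flush` and `can_skip_file_copy` are hypothetical names, both copies are assumed to be idle when the synced flush runs, and the model ignores every other check a real recovery performs.

```python
import uuid


class ToyCopy:
    """Toy model of one shard copy (hypothetical, for illustration only)."""

    def __init__(self):
        self.pending = 0     # operations indexed since the last flush
        self.sync_id = None  # marker written by a synced flush

    def index(self):
        self.pending += 1

    def flush(self):
        # A flush that persists new operations drops any existing marker,
        # whether or not the synced-flush API was ever called explicitly.
        if self.pending:
            self.pending = 0
            self.sync_id = None


def synced_flush(copies):
    # Normal flush first, then the same generated marker on every copy.
    for copy in copies:
        copy.flush()
    marker = uuid.uuid4().hex
    for copy in copies:
        copy.sync_id = marker
    return marker


def can_skip_file_copy(a, b):
    # Matching markers let recovery treat the two copies as identical.
    return a.sync_id is not None and a.sync_id == b.sync_id


primary, replica = ToyCopy(), ToyCopy()
primary.index()
replica.index()
synced_flush([primary, replica])
skip_after_sync = can_skip_file_copy(primary, replica)       # True

primary.index()
primary.flush()                                              # marker removed
skip_after_new_flush = can_skip_file_copy(primary, replica)  # False
```

In a real cluster a synced flush fails on shards that have ongoing indexing, as the NOTE quoted earlier in this thread says; the toy model sidesteps that case by assuming idle copies.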

an opportunity for Elasticsearch to reduce shard resources and also perform
a special kind of flush, called `synced flush`. A synced flush performs a normal flush, then adds
a generated unique marker (sync_id) to all shards.
{es} keeps track of which shards have received indexing activity recently, and
Contributor

As in the previous comment, I like the previous explanation better. It explains what the thing does. The new version provides a higher-level description which, to me, makes it harder to understand what really happens.

Contributor Author

I don't follow you here. I've perhaps reordered the information to start with the higher-level view and then drop down to the details, since this is easier to follow if you aren't already intimately familiar with it. I don't think I've lost any detail here, have I?

@javanna
Member

javanna commented Aug 6, 2019

What should we do with this PR? It was opened months ago and there was some discussion, hence it was not merged. Should we close it or get it in?

@DaveCTurner
Contributor Author

I am holding off on further changes here right now while the work on history retention (#41536) is still in flight, because that is intertwined with the flushing and recovery mechanisms. This PR is still on my list.

@DaveCTurner
Contributor Author

Closing this in favour of #46245. @bleskes I will reply to your comments here, having addressed some of them in #46245.

@DaveCTurner DaveCTurner closed this Sep 2, 2019
@DaveCTurner DaveCTurner deleted the 2019-02-06-flush-docs branch September 2, 2019 17:41
Labels
:Distributed Indexing/Recovery Anything around constructing a new shard, either from a local or a remote source. >docs General docs changes v8.0.0-alpha1