
Enable graceful shutdown of server processes #5193

Open
dlmarion wants to merge 10 commits into base: 2.1

Conversation

dlmarion (Contributor):

Added a ShutdownHook via Hadoop's ShutdownHookManager that interrupts the main server thread and sets the shutdownRequested variable to true. Removed variables from subclasses that were used to track shutdown requests. Modified the server threads' run methods to attempt an orderly shutdown.

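As a reference for the approach described above, here is a minimal sketch using Hadoop's ShutdownHookManager; the class and field names (other than ShutdownHookManager itself) and the priority value are illustrative, not the PR's actual code.

import java.util.concurrent.atomic.AtomicBoolean;
import org.apache.hadoop.util.ShutdownHookManager;

class GracefulShutdownSketch {
  private final AtomicBoolean shutdownRequested = new AtomicBoolean(false);

  // Register a hook that flags shutdown and interrupts the main server
  // thread; Hadoop runs registered hooks in descending priority order
  // when the JVM shuts down. The priority (100) is arbitrary here.
  void registerHook(Thread mainServerThread) {
    ShutdownHookManager.get().addShutdownHook(() -> {
      shutdownRequested.set(true);
      mainServerThread.interrupt();
    }, 100);
  }

  boolean isShutdownRequested() {
    return shutdownRequested.get();
  }
}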
dlmarion added this to the 2.1.4 milestone Dec 17, 2024
dlmarion self-assigned this Dec 17, 2024
dlmarion (Contributor, Author):

Marked as draft because I want to do some more testing.

ctubbsii (Member) left a comment:

A smarter person than me (I don't remember their name, but they were introduced to me as an expert on robust systems) once suggested that it's often not worth the complexity to introduce graceful shutdowns if you already have to handle ungraceful ones. It's better to spend time making sure the ungraceful ones are handled robustly; if they are, there's no harm in making ungraceful shutdowns (SIGKILL) the normal shutdown, and it's not worth the effort to even code graceful shutdowns... especially for a server process that you expect to stay running for long periods of time as part of your backend architecture, where shutdowns are abnormal anyway. I wouldn't take that as an absolute rule (because nothing is), of course, but I'm reminded of that opinion every once in a while, because I think there's some truth to it that's worth thinking about whenever we try to add complexity to handle some edge case for starting/stopping/upgrading/etc.

It's also not clear to me that shutdown hooks are always triggered by a SIGTERM or similar. It seems they also run when System.exit is called, or maybe even when the main method finishes normally... and possibly if the main method throws an Exception or even an Error. We'll have to make sure that whatever we do to handle the shutdown request is suitable for all such cases, whatever they happen to be.

There are also other considerations to be made, like how uncaught exceptions are expected to be handled in the shutdown hooks themselves, and we need to be very careful to shut down quickly. There can be no guarantee that these will even run, as the OS may determine a process is taking too long to stop and could just kill it at any time. So it's possible any complexity added to do this may not even be worth it.

Overall, I kind of like the idea, as long as it shuts down very quickly, doesn't add too much complexity, doesn't add risk by slowing down our emergency Halt measures, and actually gets used enough to be worth the effort. I would strongly prefer we avoid the Hadoop API, though. We've been slowly decoupling/disentangling ourselves from a strict dependency on Hadoop over the years, and we should avoid further mandatory entanglements.

dlmarion (Contributor, Author):

> A smarter person than me (I don't remember their name, but they were introduced to me as an expert on robust systems) once suggested that it's often not worth the complexity to introduce graceful shutdowns if you already have to handle ungraceful ones. It's better to spend time making sure the ungraceful ones are handled robustly; if they are, there's no harm in making ungraceful shutdowns (SIGKILL) the normal shutdown, and it's not worth the effort to even code graceful shutdowns... especially for a server process that you expect to stay running for long periods of time as part of your backend architecture, where shutdowns are abnormal anyway. I wouldn't take that as an absolute rule (because nothing is), of course, but I'm reminded of that opinion every once in a while, because I think there's some truth to it that's worth thinking about whenever we try to add complexity to handle some edge case for starting/stopping/upgrading/etc.

Accumulo is very good at handling ungraceful shutdowns of server processes. However, there exists a case where a user may want to inform the server process to finish what it's doing, then shut down. A good example of this is a long running compaction in progress, but the user wants to scale things differently. The user might want to let the Compactor process finish what it's doing, then shut down, with the alternative that if it takes too long they can always kill it.

> It's also not clear to me that shutdown hooks are always triggered by a SIGTERM or similar. It seems they also run when System.exit is called, or maybe even when the main method finishes normally... and possibly if the main method throws an Exception or even an Error. We'll have to make sure that whatever we do to handle the shutdown request is suitable for all such cases, whatever they happen to be.

  • Runtime.addShutdownHook says that "When the virtual machine begins its shutdown sequence it will start all registered shutdown hooks in some unspecified order and let them run concurrently", and defines the shutdown sequence as being initiated when System.exit is called, when the program exits normally, or in response to an interrupt (see the sketch after this list).
  • This table says that SIGTERM, SIGINT, and SIGHUP will execute the shutdown hooks.
  • Runtime.halt does not start shutdown hooks.
  • I didn't see anything regarding shutdown hooks running on an Exception or Error that terminates the VM; I'm assuming that they won't run.
  • Shutdown hooks won't be executed on SIGKILL.
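For reference, a minimal self-contained sketch of the behavior listed above, using a plain JVM shutdown hook: it fires on normal exit, on System.exit, and on SIGTERM/SIGINT/SIGHUP, but not on Runtime.halt or SIGKILL.

public class ShutdownHookDemo {
  public static void main(String[] args) throws InterruptedException {
    // Registered hooks start concurrently, in no specified order, when
    // the JVM begins its shutdown sequence.
    Runtime.getRuntime().addShutdownHook(new Thread(() -> {
      System.out.println("shutdown hook running");
    }, "demo-shutdown-hook"));

    System.out.println("working... send SIGTERM (kill <pid>) to watch the hook fire");
    Thread.sleep(10_000);
    // Returning from main normally also triggers the hook.
  }
}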

> There are also other considerations to be made, like how uncaught exceptions are expected to be handled in the shutdown hooks themselves, and we need to be very careful to shut down quickly. There can be no guarantee that these will even run, as the OS may determine a process is taking too long to stop and could just kill it at any time. So it's possible any complexity added to do this may not even be worth it.

Uncaught exceptions in shutdown hooks are addressed in the javadoc for Runtime.addShutdownHook.

> Overall, I kind of like the idea, as long as it shuts down very quickly, doesn't add too much complexity, doesn't add risk by slowing down our emergency Halt measures, and actually gets used enough to be worth the effort. I would strongly prefer we avoid the Hadoop API, though. We've been slowly decoupling/disentangling ourselves from a strict dependency on Hadoop over the years, and we should avoid further mandatory entanglements.

dlmarion (Contributor, Author):

@ctubbsii - Thanks for the comments. In answering them, I realized that we need a different way to signal normal shutdown and not use SIGTERM. The reason is that we would need to ensure our shutdown hook thread does not return until the main thread has exited, so that the FileSystem does not get closed out from under us. A long-running shutdown hook thread like that is likely something we don't want.
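To illustrate the concern (an editorial sketch, not code from this PR): a hook that interrupts the main thread and then waits for it holds the entire shutdown sequence open for as long as the main thread takes to exit.

public class BlockingHookDemo {
  public static void main(String[] args) {
    Thread main = Thread.currentThread();
    Runtime.getRuntime().addShutdownHook(new Thread(() -> {
      main.interrupt(); // ask the server loop to stop
      try {
        // The hook cannot return until main exits, so everything else
        // that runs at shutdown (e.g. FileSystem close hooks) waits too.
        main.join();
      } catch (InterruptedException e) {
        Thread.currentThread().interrupt();
      }
    }, "wait-for-main"));

    try {
      Thread.sleep(60_000); // stand-in for the server's main loop
    } catch (InterruptedException e) {
      System.out.println("main thread interrupted; finishing up");
    }
  }
}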

I was trying to avoid creating a Thrift shutdown endpoint for all of the server processes by using SIGTERM instead, but at this point I think that's what I'm going to have to do. The idea already discussed was to create a Thrift RPC endpoint that can be used to signal graceful shutdown, and a Java application (KeywordExecutable) that calls the RPC method given a host/port.

dlmarion (Contributor, Author):

I removed the shutdown hook in aec6d47 and replaced it with a Thrift RPC mechanism.

dlmarion changed the title from "Handle SIGTERM for graceful shutdown of server processes" to "Enable graceful shutdown of server processes" Dec 18, 2024
dlmarion marked this pull request as ready for review December 18, 2024 21:43
dlmarion (Contributor, Author):

This is ready for review. I have kicked off a full IT build.

keith-turner (Contributor) left a comment:

I've only partially looked through this; I've posted the comment I have so far.

LOG.error("Compactor lost lock (reason = {}), exiting.", reason);
gcLogger.logGCInfo(getConfiguration());
});
if (isShutdownRequested()) {
Contributor:

If a compactor process loses its lock, anything it was working on may be cleaned up in the metadata table, and after that cleanup any work the compactor completes would be discarded. So we may want to always halt when the lock is lost.

dlmarion (Contributor, Author):

This is guarding against this Watcher being fired when the Compactor process performs ServiceLock.unlock as part of shutting down at the end of the run method.

keith-turner (Contributor) commented Dec 20, 2024:

We could try to determine whether this is happening in the correct time window, like the tserver does. However, it's not as important for the compactor as for the tserver. If it's still compacting, did not delete its own lock, and its lock is gone, then there is a chance that further work done is pointless, but it's not harmful.

dlmarion (Contributor, Author):

Addressed in 0d5f014

gcLogger.logGCInfo(getConfiguration());
});
if (isShutdownRequested()) {
LOG.warn(
Contributor:

The tablet server should always halt when its lock is lost, as the manager may reassign its tablets after the lock is lost.

dlmarion (Contributor, Author):

This is guarding against this Watcher being fired when the TabletServer process performs ServiceLock.unlock as part of shutting down at the end of the run method.

Contributor:

That makes sense to do; however, the check is too broad. There is a period of time where isShutdownRequested is true but the tablet server has not deleted its lock, is working on shutting down, and could even be stuck. If it loses its lock during this time period, it should still halt.
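A sketch of the narrower guard described here, assuming a separate flag that is set only once the process has deleted its own lock; all names are illustrative, not the PR's actual fields.

import java.util.concurrent.atomic.AtomicBoolean;

class LockLossGuard {
  private final AtomicBoolean shutdownRequested = new AtomicBoolean(false);
  // Set only after this process has itself called ServiceLock.unlock.
  private final AtomicBoolean ownLockDeleted = new AtomicBoolean(false);

  void requestShutdown() {
    shutdownRequested.set(true);
  }

  void markOwnLockDeleted() {
    ownLockDeleted.set(true);
  }

  // Called from the lock-loss watcher: halt unless the loss is explained
  // by our own unlock at the end of an orderly shutdown. A shutdown that
  // is merely requested (or stuck) does not excuse a lost lock.
  void onLockLost(Runnable halt) {
    if (!ownLockDeleted.get()) {
      halt.run();
    }
  }
}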

dlmarion (Contributor, Author):

Addressed in 0d5f014

dlmarion (Contributor, Author):

The full IT build was successful.

ctubbsii (Member) left a comment:

I spent more time reviewing the API and design of the sending and receiving of the shutdown signal than I did on the implementation handling the shutdown request in each server. I skimmed those, but will leave them to others to review more comprehensively.

For the design of the send/receive of the signal, I made a few comments, but overall it looks good.

My main concern would be that adding it to 2.1.4 changes the thrift RPC in a bugfix release, which can have substantial implications if somebody does a rolling restart or for some other reason has a heterogeneous cluster of different 2.1 patch versions. We strive to maintain both forward and backward compatibility within a bugfix release, so people can do things like rolling upgrades. So... if a 2.1.3 instance were to receive a signal like what would be sent by an admin command from 2.1.4, I'd want to know that it has been tested, and that 2.1.3 would ignore the signal, rather than experience a fault.

Additionally, I'm concerned about security. I made a comment about that below.

return ThriftUtil.getClientNoTimeout(this, serverProcess, context);
} catch (TTransportException tte) {
Throwable cause = tte.getCause();
if (cause != null && cause instanceof UnknownHostException) {
Member:

null check is redundant here

Suggested change
if (cause != null && cause instanceof UnknownHostException) {
if (cause instanceof UnknownHostException) {

Throwable cause = tte.getCause();
if (cause != null && cause instanceof UnknownHostException) {
// do not expect to recover from this
throw new RuntimeException(tte);
Member:

UnknownHostException is an IOException, so you could use that:

Suggested change
throw new RuntimeException(tte);
throw new UncheckedIOException((UnknownHostException) cause);

or to also preserve the original:

Suggested change
throw new RuntimeException(tte);
var x = new UncheckedIOException((UnknownHostException) cause);
x.addSuppressed(tte);
throw x;

return coordinatorProcess == null ? Set.of() : Set.of(coordinatorProcess);
case COMPACTOR:
return compactorProcesses == null ? Set.of()
: Set.of(compactorProcesses.toArray(new Process[] {}));
Member:

I think this syntax might be slightly shorter.

Suggested change
: Set.of(compactorProcesses.toArray(new Process[] {}));
: Set.of(compactorProcesses.toArray(new Process[0]));

But this is probably better: since you already have a collection, there's no need to make an intermediate array:

Suggested change
: Set.of(compactorProcesses.toArray(new Process[] {}));
: Set.copyOf(compactorProcesses);

Unless you wanted to check for duplicates and throw an exception, in which case the first option is better. But I don't really think duplicates are a risk here.

}

protected void requestShutdown() {
shutdownRequested.compareAndSet(false, true);
Member:

No need to do the compare in this base implementation, since you're ignoring the return value. You can just set it.

Suggested change
shutdownRequested.compareAndSet(false, true);
shutdownRequested.set(true);

requestShutdown();
}

protected void requestShutdown() {
Member:

I'm not sure why the second method exists. It seems that subclasses can just extend gracefulShutdown instead.

Alternatively, instead of extending either of these, make the gracefulShutdown method call a registered graceful-shutdown hook that's passed in via the constructor. You can even run it as a separate thread, launched from the gracefulShutdown method, so that you can guarantee it isn't blocking the main thread.
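A rough sketch of that alternative, with illustrative names: the hook is supplied at construction time and launched on its own thread, so it can never block the caller, and the compare-and-set ensures it runs at most once.

import java.util.concurrent.atomic.AtomicBoolean;

abstract class AbstractServerSketch {
  private final AtomicBoolean shutdownRequested = new AtomicBoolean(false);
  private final Runnable shutdownHook;

  protected AbstractServerSketch(Runnable shutdownHook) {
    this.shutdownHook = shutdownHook;
  }

  // Invoked by the Thrift handler when a graceful shutdown is requested.
  public final void gracefulShutdown() {
    if (shutdownRequested.compareAndSet(false, true)) {
      new Thread(shutdownHook, "graceful-shutdown-hook").start();
    }
  }

  public final boolean isShutdownRequested() {
    return shutdownRequested.get();
  }
}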

Comment on lines +112 to +116
@Parameter(names = {"-h", "--host"}, description = "<host>")
String hostname = null;

@Parameter(names = {"-p", "--port"}, description = "<port>")
int port = 0;
Member:

It seems unnecessary to use JCommander to split these into separate options. Why not just do:

bin/accumulo admin signalShutdown <host:port>

No need for a separate -h and -p option; they go together anyway. It'd be weird to split them up only to pass them as separate params, so that the code combines them again to act upon it.
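A minimal sketch of parsing the combined argument, splitting on the last colon so the two halves never need to travel separately (bracketed IPv6 handling omitted):

public class HostPortArg {
  public static void main(String[] args) {
    String arg = args.length > 0 ? args[0] : "localhost:9995";
    int idx = arg.lastIndexOf(':');
    if (idx < 0) {
      throw new IllegalArgumentException("expected <host:port>, got: " + arg);
    }
    String host = arg.substring(0, idx);
    int port = Integer.parseInt(arg.substring(idx + 1));
    System.out.println("host=" + host + ", port=" + port);
  }
}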

@@ -170,7 +175,8 @@
 import io.opentelemetry.api.trace.Span;
 import io.opentelemetry.context.Scope;

-public class TabletServer extends AbstractServer implements TabletHostingServer {
+public class TabletServer extends AbstractServer
+    implements TabletHostingServer, ServerProcessService.Iface {
Member:

A lot of these have redundant interface declarations since it's already declared on the AbstractServer super class. It'd be better to omit ServerProcessService.Iface from the child classes, so it's easier to see where things are coming from in the class hierarchy.

dlmarion (Contributor, Author):

That doesn't work with Thrift. You have to put the Thrift interface declarations on the class being used. I agree that they are redundant, but I have run into this many times, especially when overriding behavior of a Thrift server in a test.

Member:

Hmm, interesting. I wasn't aware there was a problem with that. Then, perhaps it should be put on a separate "handler" class, like the other handlers, and be a member of the AbstractServer?
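A sketch of that handler pattern, using a hypothetical stand-in interface (the real ServerProcessService.Iface is Thrift-generated and has a different shape): the handler carries the Thrift interface and delegates to the server, so server subclasses never re-declare it.

// Hypothetical stand-in for the Thrift-generated service interface.
interface ServerProcessIface {
  void gracefulShutdown();
}

// Owned by the server (e.g. a field of AbstractServer); it implements
// the Thrift interface and simply delegates back to the server.
class ServerProcessHandler implements ServerProcessIface {
  private final Runnable requestShutdown;

  ServerProcessHandler(Runnable requestShutdown) {
    this.requestShutdown = requestShutdown;
  }

  @Override
  public void gracefulShutdown() {
    requestShutdown.run(); // e.g. server::requestShutdown
  }
}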

// Don't interrupt the server thread, that will cause
// IO operations to fail as the servers are finishing
// their work.
requestShutdown();
Member:

I think there needs to be some kind of security on this. Since it's coming via thrift, it should be possible to use the ServerContext with the SystemCredentials to verify that the request is coming from an admin utility with access to this cluster's instance.secret / config file.
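One possible shape for such a check, sketched as a generic shared-secret comparison; a real implementation would presumably verify the request against SystemCredentials via the ServerContext rather than this hypothetical helper.

import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;

class ShutdownAuthSketch {
  private final byte[] expectedSecret;

  ShutdownAuthSketch(String instanceSecret) {
    this.expectedSecret = instanceSecret.getBytes(StandardCharsets.UTF_8);
  }

  // Constant-time comparison, so the check doesn't leak timing info;
  // the RPC handler would reject the shutdown call when this is false.
  boolean isAuthorized(byte[] presentedSecret) {
    return MessageDigest.isEqual(expectedSecret, presentedSecret);
  }
}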
