ZOOKEEPER-4541 Ephemeral znode owned by closed session visible in 1 of 3 servers #1925

Closed
wants to merge 16 commits into from
Changes from 3 commits
@@ -883,7 +883,7 @@ public synchronized void shutdown(boolean fullyShutDown) {
// * If we fetch a new snapshot from leader, the zkDb will be
// cleared anyway before loading the snapshot
try {
//This will fast forward the database to the latest recorded transactions
// This will fast-forward the database to the latest recorded transactions
zkDb.fastForwardDataBase();
} catch (IOException e) {
LOG.error("Error updating DB", e);
@@ -155,11 +155,11 @@ protected void unregisterMetrics() {
}

@Override
public synchronized void shutdown() {
public synchronized void shutdown(boolean fullyShutDown) {

I'm a little worried that the modification here changes the call chain.

Before modification: Leader.shutdown(String) -> LeaderZooKeeperServer.shutdown() -> ZooKeeperServer.shutdown()
After modification: Leader.shutdown() -> ZooKeeperServer.shutdown()

LeaderZooKeeperServer.shutdown is skipped and containerManager does not stop.

Contributor Author

ZooKeeperServer.shutdown() only calls shutdown(false). LeaderZooKeeperServer overrides shutdown(boolean), and that override stops the containerManager. shutdown() itself is no longer overridden anywhere.
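
The dispatch argument in this reply can be sketched with a minimal, self-contained example. The classes `BaseServer` and `LeaderServer` are hypothetical stand-ins for ZooKeeperServer and LeaderZooKeeperServer, and marking `shutdown()` as `final` is an assumption made only to illustrate "not overridden anywhere"; this is not the project's actual code.

```java
import java.util.ArrayList;
import java.util.List;

public class ShutdownDispatchSketch {
    // Hypothetical stand-in for ZooKeeperServer: the no-arg overload only delegates.
    static class BaseServer {
        final List<String> events = new ArrayList<>();

        // Assumption: final here merely models "no subclass overrides shutdown()".
        public final void shutdown() {
            shutdown(false); // virtual call: resolves to the most specific override
        }

        public synchronized void shutdown(boolean fullyShutDown) {
            events.add("base.shutdown(" + fullyShutDown + ")");
        }
    }

    // Hypothetical stand-in for LeaderZooKeeperServer: overrides only the boolean overload.
    static class LeaderServer extends BaseServer {
        @Override
        public synchronized void shutdown(boolean fullyShutDown) {
            events.add("containerManager.stop()"); // subclass cleanup runs first
            super.shutdown(fullyShutDown);
        }
    }

    public static void main(String[] args) {
        BaseServer server = new LeaderServer();
        server.shutdown(); // enters via the no-arg overload
        System.out.println(server.events);
        // prints [containerManager.stop(), base.shutdown(false)]
    }
}
```

Because the no-arg overload calls the boolean overload virtually, a caller that enters through `shutdown()` still reaches the subclass cleanup, which is the point the reply makes about containerManager.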

if (containerManager != null) {
containerManager.stop();
}
super.shutdown();
super.shutdown(fullyShutDown);
}

@Override
@@ -152,23 +152,31 @@ protected void unregisterJMX(Learner peer) {
}

@Override
public synchronized void shutdown() {
public synchronized void shutdown(boolean fullyShutDown) {
if (!canShutdown()) {
LOG.debug("ZooKeeper server is not running, so not proceeding to shutdown!");
return;
}
LOG.info("Shutting down");
try {
super.shutdown();
} catch (Exception e) {
LOG.warn("Ignoring unexpected exception during shutdown", e);
else {
LOG.info("Shutting down");
try {
if (syncProcessor != null) {
// Shutting down the syncProcessor here, first, ensures queued transactions here are written to
// permanent storage, which ensures that crash recovery data is consistent with what is used for a
// leader election immediately following shutdown, because of the old leader going down; and also
// that any state on its way to being written is also loaded in the potential call to
// fast-forward-from-edits, in super.shutdown(...), so we avoid getting a DIFF from the new leader
// that contains entries we have already written to our transaction log.
syncProcessor.shutdown();
}
}
catch (Exception e) {
LOG.warn("Ignoring unexpected exception in syncprocessor shutdown", e);
}
}
try {
if (syncProcessor != null) {
syncProcessor.shutdown();
}
super.shutdown(fullyShutDown);
} catch (Exception e) {
LOG.warn("Ignoring unexpected exception in syncprocessor shutdown", e);
LOG.warn("Ignoring unexpected exception during shutdown", e);
}
}
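
The ordering described in the code comment of this hunk (shut down the syncProcessor first, then let super.shutdown(...) fast-forward the database) can be illustrated with a toy model. Everything below is hypothetical: `TxnLog`, `SyncQueue`, and `Database` are simplified stand-ins invented for this sketch and do not reflect ZooKeeper's real classes or APIs.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;

public class FlushBeforeFastForwardSketch {
    // Hypothetical stand-in for the on-disk transaction log.
    static final class TxnLog {
        final List<Long> persistedZxids = new ArrayList<>();
    }

    // Hypothetical stand-in for a sync processor: buffers txns and writes them out on shutdown.
    static final class SyncQueue {
        private final Deque<Long> pending = new ArrayDeque<>();
        private final TxnLog log;

        SyncQueue(TxnLog log) {
            this.log = log;
        }

        void submit(long zxid) {
            pending.add(zxid);
        }

        void shutdown() {
            // Drain everything still queued into the log.
            while (!pending.isEmpty()) {
                log.persistedZxids.add(pending.poll());
            }
        }
    }

    // Hypothetical stand-in for the in-memory database.
    static final class Database {
        long lastProcessedZxid;

        void fastForward(TxnLog log) {
            // Replay whatever the log contains at this moment.
            for (long zxid : log.persistedZxids) {
                lastProcessedZxid = Math.max(lastProcessedZxid, zxid);
            }
        }
    }

    // Returns the zxid the database reports after shutdown, for either ordering.
    static long shutdownAndReport(boolean flushFirst) {
        TxnLog log = new TxnLog();
        SyncQueue sync = new SyncQueue(log);
        Database db = new Database();
        log.persistedZxids.add(1L); // already on disk
        sync.submit(2L);            // still queued
        sync.submit(3L);            // still queued
        if (flushFirst) {
            sync.shutdown();
            db.fastForward(log);
        } else {
            db.fastForward(log);
            sync.shutdown();
        }
        return db.lastProcessedZxid;
    }

    public static void main(String[] args) {
        System.out.println(shutdownAndReport(true));  // prints 3
        System.out.println(shutdownAndReport(false)); // prints 1
    }
}
```

In the second ordering the log ends up containing zxids 2 and 3 while the in-memory state reports 1, which is the kind of mismatch between persisted and reported state that the code comment says the new ordering avoids.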

@@ -73,7 +73,7 @@ public Learner getLearner() {
* @param request
*/
public void commitRequest(Request request) {
if (syncRequestProcessorEnabled) {
if (syncProcessor != null) {
// Write to txnlog and take periodic snapshot
syncProcessor.processRequest(request);
}
@@ -107,6 +107,9 @@ protected void setupRequestProcessors() {
syncProcessor = new SyncRequestProcessor(this, null);
syncProcessor.start();
}
else {
syncProcessor = null;
Contributor

syncProcessor, as an ObserverZooKeeperServer field, should already default to null.
Does setting it to null here make a difference?

Contributor Author

No, I'm just used to always assigning (to final fields). This can be removed again.

}
}

/*
@@ -127,18 +130,6 @@ public String getState() {
return "observer";
}

@Override
public synchronized void shutdown() {
if (!canShutdown()) {
LOG.debug("ZooKeeper server is not running, so not proceeding to shutdown!");
return;
}
super.shutdown();
if (syncRequestProcessorEnabled && syncProcessor != null) {
syncProcessor.shutdown();
}
}

@Override
public void dumpMonitorValues(BiConsumer<String, Object> response) {
super.dumpMonitorValues(response);
@@ -190,23 +190,24 @@ public long getServerId() {
}

@Override
public synchronized void shutdown() {
public synchronized void shutdown(boolean fullyShutDown) {
if (!canShutdown()) {
super.shutdown(fullyShutDown);
LOG.debug("ZooKeeper server is not running, so not proceeding to shutdown!");
return;
}
shutdown = true;
unregisterJMX(this);

// set peer's server to null
self.setZooKeeperServer(null);
// clear all the connections
self.closeAllConnections();
else {
shutdown = true;
unregisterJMX(this);

self.adminServer.setZooKeeperServer(null);
// set peer's server to null
self.setZooKeeperServer(null);
// clear all the connections
self.closeAllConnections();

self.adminServer.setZooKeeperServer(null);
}
// shutdown the server itself
super.shutdown();
super.shutdown(fullyShutDown);
}

@Override
@@ -20,6 +20,8 @@

import java.io.Flushable;
import java.io.IOException;
import java.net.Socket;

import org.apache.zookeeper.ZooDefs.OpCode;
import org.apache.zookeeper.server.Request;
import org.apache.zookeeper.server.RequestProcessor;
@@ -64,7 +66,8 @@ public void flush() throws IOException {
} catch (IOException e) {
LOG.warn("Closing connection to leader, exception during packet send", e);
try {
if (!learner.sock.isClosed()) {
Socket socket = learner.sock;
if ( socket != null && ! learner.sock.isClosed()) {
Contributor

Should probably use socket in the second condition too? In case it changes after the first check.

Contributor Author

Thanks, that was of course the intention :) Fixed!
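
The pattern suggested in this thread (read the field once into a local, then use only the local) could look roughly like the sketch below. `Peer` is a hypothetical stand-in for the learner, and declaring the field `volatile` is an assumption of this sketch, not something the diff shows.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.net.Socket;

public class SocketSnapshotSketch {
    // Hypothetical stand-in for the learner: another thread may set sock to null at any time.
    static final class Peer {
        volatile Socket sock; // volatile is an assumption made for this sketch
    }

    // Reads the field exactly once; both checks and the close use the same local reference.
    static boolean closeIfOpen(Peer peer) {
        Socket socket = peer.sock;
        if (socket != null && !socket.isClosed()) {
            try {
                socket.close();
            } catch (IOException e) {
                throw new UncheckedIOException(e);
            }
            return true;
        }
        return false;
    }

    public static void main(String[] args) {
        Peer peer = new Peer();
        System.out.println(closeIfOpen(peer)); // prints false: field is null
        peer.sock = new Socket();              // unconnected socket, not yet closed
        System.out.println(closeIfOpen(peer)); // prints true: socket was open
        System.out.println(closeIfOpen(peer)); // prints false: socket is already closed
    }
}
```

Re-reading `peer.sock` in the second condition would reopen the window in which the field becomes null between the two checks, which is what the reviewer pointed out.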

Hi jonmv, I have read ZOOKEEPER-4541 and I am confused. ZK1 does not send an ACK to the leader, yet ZK1 receives a COMMIT from the leader. That does not seem to conform to the ZAB protocol. Could you help me understand this? Thanks.

learner.sock.close();
}
} catch (IOException e1) {