Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Remove zookeeper watches when entries are removed from ZooCache #5154

Open
keith-turner opened this issue Dec 9, 2024 · 2 comments · May be fixed by #5183
Open

Remove zookeeper watches when entries are removed from ZooCache #5154

keith-turner opened this issue Dec 9, 2024 · 2 comments · May be fixed by #5183
Milestone

Comments

@keith-turner
Copy link
Contributor

keith-turner commented Dec 9, 2024

When something is removed from zoocache also remove the watch. Currently when an entry is removed from zoocache explicitly its watch is just left there. This is follow on work from #5143.

Originally posted by @keith-turner in #5143 (comment)

@keith-turner keith-turner added this to the 4.0.0 milestone Dec 9, 2024
@dlmarion
Copy link
Contributor

@keith-turner - did you see that WatchTheWatchCountIT is also failing in main? I added the following to see what was being watched. It seems as if every ZooKeeper session (Accumulo process?) has a Watcher set on every tserver. Seemed odd to me.

diff --git a/minicluster/src/main/java/org/apache/accumulo/miniclusterImpl/MiniAccumuloClusterImpl.java b/minicluster/src/main/java/org/apache/accumulo/miniclusterImpl/MiniAccumuloClusterImpl.java
index e8a55081c3..2308e774ee 100644
--- a/minicluster/src/main/java/org/apache/accumulo/miniclusterImpl/MiniAccumuloClusterImpl.java
+++ b/minicluster/src/main/java/org/apache/accumulo/miniclusterImpl/MiniAccumuloClusterImpl.java
@@ -291,7 +291,7 @@ public class MiniAccumuloClusterImpl implements AccumuloCluster {
       zooCfg.setProperty("clientPort", config.getZooKeeperPort() + "");
       zooCfg.setProperty("maxClientCnxns", "1000");
       zooCfg.setProperty("dataDir", config.getZooKeeperDir().getAbsolutePath());
-      zooCfg.setProperty("4lw.commands.whitelist", "ruok,wchs");
+      zooCfg.setProperty("4lw.commands.whitelist", "ruok,wchs,wchp");
       zooCfg.setProperty("admin.enableServer", "false");
       zooCfg.store(fileWriter, null);
 
diff --git a/test/src/main/java/org/apache/accumulo/test/functional/WatchTheWatchCountIT.java b/test/src/main/java/org/apache/accumulo/test/functional/WatchTheWatchCountIT.java
index b90d8b7c6e..fd89e21612 100644
--- a/test/src/main/java/org/apache/accumulo/test/functional/WatchTheWatchCountIT.java
+++ b/test/src/main/java/org/apache/accumulo/test/functional/WatchTheWatchCountIT.java
@@ -88,6 +88,15 @@ public class WatchTheWatchCountIT extends ConfigurableMacBase {
       log.info("Number of watchers to be expected in range: ({}, {}), currently have {}", MIN, MAX,
           total);
 
+      if (total >= MAX) {
+        try (Socket socket = new Socket(hostAndPort.getHost(), hostAndPort.getPort())) {
+          socket.getOutputStream().write("wchp\n".getBytes(UTF_8), 0, 5);
+          byte[] buffer = new byte[10485760];
+          int n = socket.getInputStream().read(buffer);
+          String response = new String(buffer, 0, n, UTF_8);
+          System.out.println(response);
+        }        
+      }
       assertTrue(total > MIN && total < MAX, "Expected number of watchers to be contained in ("
           + MIN + ", " + MAX + "), but actually was " + total);
 

@keith-turner
Copy link
Contributor Author

@keith-turner - did you see that WatchTheWatchCountIT is also failing in main? I added the following to see what was being watched. It seems as if every ZooKeeper session (Accumulo process?) has a Watcher set on every tserver. Seemed odd to me.

No I had not seen that, I can look into it.

dlmarion added a commit to dlmarion/accumulo that referenced this issue Dec 13, 2024
This commit removes most of the places where ZooCache
instances were being created in favor of re-using the
ZooCache from the ClientContext. Additionally, this
commit does not place a Watcher on each node that is
cached and instead places a single persistent
recursive Watcher at the paths in which the caching
is taking place.

This change roughly reduces the Watchers reported in
WatchTheWatchCountIT by 50%. While reducing the number
of Watchers, this commit could reduce ZooKeeper server
performance in two ways:

  1. There is a note in the ZooKeeper javadoc for the
     AddWatchMode enum that states there is a small
     performance decrease when using recursive watchers
     as all of the segments of ZNode paths need to be
     checked for watch triggering.

  2. Because a Watcher is not set on each node this
     commit modified the ZooCache.ZCacheWatcher to
     remove the parent of the triggered node, the
     triggered node, and all of its siblings from the
     cache. This overmatching may mean increased
     lookups in ZooKeeper.

Related to apache#5134
Closes apache#5154, apache#5157
@dlmarion dlmarion linked a pull request Dec 13, 2024 that will close this issue
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging a pull request may close this issue.

2 participants