Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Moved getLock() to AbstractServer, added ServiceLock verification thread #5145

Merged
merged 9 commits into from
Dec 7, 2024

Conversation

dlmarion
Copy link
Contributor

@dlmarion dlmarion commented Dec 5, 2024

Moved getLock() from TabletHostingServer to AbstractServer. Added method
to ServiceLock to verify the lock is being held in ZooKeeper versus
relying on the Watcher. Added method to start verification thread in
AbstractServer, which is called from the Manager and TabletServer.

Closes #5132

Moved getLock() from TabletHostingServer to AbstractServer. Added method
to ServiceLock to verfify the lock is being held in ZooKeeper versus
relying on the Watcher. Added method to start verification thread in
AbstractServer, which is called from the Manager and TabletServer.

Closes apache#5132
@dlmarion dlmarion added this to the 2.1.4 milestone Dec 5, 2024
@dlmarion dlmarion self-assigned this Dec 5, 2024
Copy link
Member

@ctubbsii ctubbsii left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Overall seems like a good approach

Comment on lines +51 to +52
private volatile Thread serverThread;
private volatile Thread verificationThread;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do these need to be volatile? I'm not sure what that's getting us here. Should these be final and the Thread created in the constructor, and only run in the runServer method?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm assuming so, but honestly not 100% sure. serverThread.isAlive is being called from the verificationThread and verificationThread is checked for a null reference in the main thread when stopping.

*
* @return lock ServiceLock or null
*/
public abstract ServiceLock getLock();
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think you could avoid making this abstract and needing to implement it in the child classes if each child class called the super constructor and provided a Supplier<ServiceLock>, then this impl would just be:

Suggested change
public abstract ServiceLock getLock();
public ServiceLock getLock() {
serviceLockSupplier.get();
}

That might also avoid issues with handling null here, because the supplier would likely be blocking.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

But wouldn't that require an AtomicReference or something so that the Supplier can just call AtomicReference.get() and the server call AtomicReference.set(). I'm not sure that's any cleaner.

@dlmarion dlmarion merged commit be9fa22 into apache:2.1 Dec 7, 2024
8 checks passed
@dlmarion dlmarion deleted the 5132-service-lock-verification branch December 7, 2024 15:24
@dlmarion dlmarion linked an issue Dec 7, 2024 that may be closed by this pull request
@dlmarion
Copy link
Contributor Author

dlmarion commented Dec 7, 2024

Merged up to main. Changed property default value to 2m in 3.1 branch.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Watching the server lock differently
3 participants