Fixing the CPU Intensive RemoveAll with Lists in Sticky & Load Based Partition Assignment Strategies #965

Merged


shrinandthakkar
Collaborator

Summary

  • The coordinator thread should be able to finish any event in less than the configured heartbeat period (default 1 minute). Lately, on large clusters with ~500K partitions per datastream, every partition assignment event has been observed to take more than about 1.5 minutes to complete.

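  • The issue seems to be related to this code, where the thread is stuck in a removeAll call in which one of the collections is a list. Because a membership check on a list is a linear scan, such a call can cost time proportional to the product of the two collection sizes, which may result in higher CPU usage.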

  • This has been confirmed with thread dumps and logs from a partition-heavy cluster.
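
For illustration only, here is a minimal sketch of the general pattern, not the actual change in this PR. In the JDK, `ArrayList.removeAll(c)` calls `c.contains(...)` for every element, and `AbstractSet.removeAll(c)` does the same when the set is not larger than `c`. If `c` is a list, each of those checks is a linear scan. Copying the argument into a `HashSet` first makes each check roughly constant time. The class name `RemoveAllSketch`, the helper `removeAllFast`, and the partition names are hypothetical and do not come from the Brooklin code base.

```java
import java.util.ArrayList;
import java.util.Collection;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

public class RemoveAllSketch {

    // Hypothetical helper (not part of Brooklin): removes every element of
    // toRemove from target using hash lookups instead of list scans.
    static <T> void removeAllFast(Collection<T> target, Collection<? extends T> toRemove) {
        // One pass to build the lookup set, then one pass over target.
        Set<T> lookup = new HashSet<>(toRemove);
        target.removeIf(lookup::contains);
    }

    public static void main(String[] args) {
        // Assumed, made-up partition names for the demo.
        List<String> unassigned = new ArrayList<>();
        List<String> alreadyAssigned = new ArrayList<>();
        for (int i = 0; i < 10; i++) {
            unassigned.add("partition-" + i);
            if (i < 5) {
                alreadyAssigned.add("partition-" + i);
            }
        }

        // Slow pattern on large inputs: unassigned.removeAll(alreadyAssigned)
        // scans alreadyAssigned once per element of unassigned.
        removeAllFast(unassigned, alreadyAssigned);

        System.out.println(unassigned.size()); // prints 5
    }
}
```

The trade-off in this sketch is one extra `HashSet` allocation per call in exchange for avoiding the repeated list scans, which should matter most when both collections are large.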



@shrinandthakkar shrinandthakkar merged commit 0faec8e into linkedin:master Oct 27, 2023
1 check passed
3 participants