add circuit breaking #447

Open
wants to merge 1 commit into
base: master


@vench vench commented Jul 19, 2024

Description

This pull request introduces circuit breaking functionality to Chproxy, enhancing its resilience and reliability when interacting with ClickHouse clusters. Circuit breaking is a critical feature for preventing cascading failures in distributed systems: it halts operations once certain failure thresholds are reached.

Please check the type of change your PR introduces:

  • Bugfix
  • Feature
  • Code style update (formatting, renaming)
  • Refactoring (no functional changes, no api changes)
  • Build related changes
  • Documentation content changes
  • Other (please describe):

Checklist

  • Linter passes correctly
  • Add tests which fail without the change (if possible)
  • All tests passing
  • Extended the README / documentation, if necessary

@sigua-cs
Collaborator

sigua-cs commented Jul 21, 2024

Hello @vench
Thank you for your pull request introducing the circuit breaking functionality to Chproxy. To better understand the implementation and ensure its effectiveness, could you please provide additional details on the following points?

  1. Failure Scenarios:
    Could you provide examples of failure scenarios that the circuit breaker is designed to handle? For instance, what types of failures should trigger the circuit breaker?

  2. Expected Behavior:
    What specific behavior should we expect when the circuit breaker is triggered? How should it affect ongoing operations and future requests?

  3. Test Cases:
    In proxy_test.go, there is a test case for "error with breaker on." Could you explain the expected responses and status codes? Additionally, could you provide a few more test cases to demonstrate the circuit breaker’s behavior under different scenarios?

Your detailed insights will greatly help us understand and validate the introduced feature.
Thank you for your contribution!

@vench
Author

vench commented Jul 23, 2024

Good day.

In our project, under heavy load, the ClickHouse database runs short of resources to process requests. To reduce the load, we would like to break the chain: if we already know that a large number of errors has occurred, we do not go to the database but immediately return an error for some time. This gives the database time to process the existing requests.

The errors we detect are therefore lack of memory, exceeded limits, and insufficient threads (ClickHouse error code 439).

Soon, I will conduct more tests on this functionality. Thank you for your response.
