Pending requests: measure time spent waiting on a full queue #97
@@ -29,6 +29,31 @@ def test_h1(self):
     self.assertEqual(counters["upstream_rq_total"], 25)
     self.assertEqual(len(counters), 9)
 
+  def mini_stress_test_h1(self, args):
+    # Run a test with more rps than we can handle, and a very small client-side queue.
+    # We should observe both lots of successful requests and time spent in blocking mode.
+    parsed_json = self.runNighthawkClient(args)
+    counters = self.getNighthawkCounterMapFromJson(parsed_json)
+    self.assertGreater(counters["benchmark.http_2xx"], 1000)
+    self.assertEqual(counters["upstream_cx_http1_total"], 1)
+    global_histograms = self.getNighthawkGlobalHistogramsbyIdFromJson(parsed_json)
+    self.assertGreater(int(global_histograms["sequencer.blocking"]["count"]), 1000)
+    self.assertGreater(
+        int(global_histograms["benchmark_http_client.request_to_response"]["count"]), 1000)
+    return counters
+
+  def test_h1_mini_stress_test_with_client_side_queueing(self):
+    counters = self.mini_stress_test_h1([
+        self.getTestServerRootUri(), "--rps", "999999", "--max-pending-requests", "10",
+        "--duration 2"
+    ])
+    self.assertGreater(counters["upstream_rq_pending_total"], 100)
+
+  def test_h1_mini_stress_test_without_client_side_queueing(self):
+    counters = self.mini_stress_test_h1(
+        [self.getTestServerRootUri(), "--rps", "999999", "--duration 2"])
+    self.assertEqual(counters["upstream_rq_pending_total"], 1)
+
   def test_h2(self):
     parsed_json = self.runNighthawkClient(["--h2", self.getTestServerRootUri()])
     counters = self.getNighthawkCounterMapFromJson(parsed_json)
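The stress tests above push far more requests per second than the server drains, with a small bounded client-side queue, and then assert that the wait shows up in the "sequencer.blocking" histogram. As a minimal, self-contained sketch of that measurement idea (a plain Python `queue.Queue` stands in for Nighthawk's client-side request queue; all names here are illustrative, not Nighthawk APIs):

```python
import queue
import threading
import time


def run_producer(q: queue.Queue, attempts: int, results: dict) -> None:
    # Try to enqueue `attempts` requests. q.put() blocks while the queue is
    # full; we accumulate that blocked time, analogous to what the
    # "sequencer.blocking" histogram captures in the tests above.
    blocked = 0.0
    for _ in range(attempts):
        t0 = time.monotonic()
        q.put(object())
        blocked += time.monotonic() - t0
    results["started"] = attempts
    results["blocked_seconds"] = blocked


# Small queue, like --max-pending-requests 10 in the test.
q = queue.Queue(maxsize=10)
results: dict = {}
producer = threading.Thread(target=run_producer, args=(q, 200, results))
producer.start()

# The "server" drains the queue slowly, so the producer spends time blocked.
for _ in range(200):
    q.get()
    time.sleep(0.001)
producer.join()
```

Because the consumer is deliberately slower than the producer, `results["blocked_seconds"]` ends up positive, which is the open-loop signal the PR wants surfaced instead of a pile of overflow counters.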
@@ -189,16 +189,19 @@ void BenchmarkClientHttpImpl::setRequestHeader(absl::string_view key, absl::stri
 }
 
 bool BenchmarkClientHttpImpl::tryStartOne(std::function<void()> caller_completion_callback) {
-  // When no client side queueing is specified (via max_pending_requests_), we are in closed loop
-  // mode. In closed loop mode we want to be able to control the pacing as exactly as possible. In
-  // open-loop mode we probably want to skip this. NOTE(oschaaf): We can't consistently rely on
-  // resourceManager()::requests() because that isn't used for h/1 (it is used in tcp and h2
-  // though).
-  if (max_pending_requests_ == 1 &&
-      (!cluster_->resourceManager(Envoy::Upstream::ResourcePriority::Default)
-            .pendingRequests()
-            .canCreate() ||
-       ((requests_initiated_ - requests_completed_) >= connection_limit_))) {
+  // When we allow client-side queueing, we want to have a sense of the time spent waiting on
+  // that queue. So we return false here to indicate we couldn't initiate a new request.
+  if (!cluster_->resourceManager(Envoy::Upstream::ResourcePriority::Default)
+           .pendingRequests()
+           .canCreate()) {
+    return false;
+  }
+  // When client-side queueing is disabled (max_pending_requests_ equals 1) we control the
+  // pacing as exactly as possible here.
+  // NOTE: We can't consistently rely on resourceManager()::requests() because that isn't
+  // used for h/1 (it is used in tcp and h2 though).
+  if (max_pending_requests_ == 1 &&
+      (requests_initiated_ - requests_completed_) >= connection_limit_) {
     return false;
   }

Review comment: Not sure I see how this diff relates to the PR description.
Reply: Without this change, we would see the overflow stats rise excessively. With it, we end up reporting the time spent not being able to queue up a new request as time spent blocking (we indicate we could not by returning false). Hope that clarifies this?
Review comment: I think so, but can you add comments and a test?
Reply: Done, and done. The test is implemented in python, wdyt?
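The rewritten tryStartOne() reads as two independent gates: first the circuit breaker's pending-request resource, then the closed-loop pacing check. A small Python model of that decision logic (the parameter names mirror the C++ members but are otherwise hypothetical; this sketches the control flow, it is not Nighthawk code):

```python
def try_start_one(can_create_pending: bool,
                  max_pending_requests: int,
                  requests_initiated: int,
                  requests_completed: int,
                  connection_limit: int) -> bool:
    """Return True if a new request may be initiated right now."""
    # Gate 1: the pending-request resource is exhausted. Returning False
    # lets the caller account the wait as blocking time instead of
    # bumping an overflow counter.
    if not can_create_pending:
        return False
    # Gate 2: with client-side queueing disabled (max_pending_requests == 1),
    # pace in closed loop: never exceed the connection limit in flight.
    if (max_pending_requests == 1 and
            (requests_initiated - requests_completed) >= connection_limit):
        return False
    return True
```

Splitting the original compound condition this way means the queue-full case always reports as "could not start" (and hence as blocked time), regardless of whether client-side queueing is enabled.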
Review comment: This (and below) are coupled with the call site; maybe try to have one source of truth for the number of reqs.
Reply: Done.
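The "one source of truth" suggestion amounts to deriving the in-flight count in a single place instead of repeating the `initiated - completed` subtraction at every call site. A hypothetical sketch of that refactor shape (class and method names are invented for illustration, not taken from the Nighthawk codebase):

```python
class BenchmarkClient:
    """Toy model: keep one derived in_flight() instead of repeating the
    subtraction wherever pacing is checked."""

    def __init__(self, connection_limit: int):
        self.requests_initiated = 0
        self.requests_completed = 0
        self.connection_limit = connection_limit

    def in_flight(self) -> int:
        # Single source of truth for the number of outstanding requests.
        return self.requests_initiated - self.requests_completed

    def at_connection_limit(self) -> bool:
        return self.in_flight() >= self.connection_limit


client = BenchmarkClient(connection_limit=1)
client.requests_initiated = 3
client.requests_completed = 2
```

Call sites then ask `client.at_connection_limit()` rather than re-deriving the count, which is the coupling the reviewer flagged.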