Pending requests: measure time spent waiting on a full queue #97

Merged · 32 commits · Jul 15, 2019
Commits
1695b23
Update Envoy + Python integration testing
oschaaf Jun 27, 2019
792c61e
CI image update, Py3, Integration test sync
oschaaf Jun 27, 2019
b2d3cbb
Update CI image version & formatting
oschaaf Jun 27, 2019
be4b41f
Bump bazel-compilation-database to 0.3.5
oschaaf Jun 27, 2019
8d36fab
Sanitize requirements.txt
oschaaf Jun 27, 2019
c21ac98
Merge remote-tracking branch 'upstream/master' into py3-and-sync-inte…
oschaaf Jun 27, 2019
03816ad
Add cluster, pool and tls control options
oschaaf Jun 27, 2019
0462810
Merge remote-tracking branch 'upstream/master' into cluster-configura…
oschaaf Jun 28, 2019
4566492
Back out Sequencer::cancel()
oschaaf Jun 28, 2019
2c08ff3
Clean up literal string in options_impl.cc
oschaaf Jun 28, 2019
6b592b5
Test update
oschaaf Jun 28, 2019
b58796e
Fix format
oschaaf Jun 28, 2019
eb47562
Back out accidentally committed .bazelrc change
oschaaf Jun 28, 2019
21ccb95
NOLINT clang-tidy warning originating from TCLAP
oschaaf Jun 28, 2019
261cfd5
do_ci.sh: clean diff, revert whitespace changes
oschaaf Jul 2, 2019
d98b80d
Update README.md
oschaaf Jul 2, 2019
bbd1155
benchmark_client_impl.cc: clean up diff
oschaaf Jul 2, 2019
ad8b8e1
Another cleanup round
oschaaf Jul 2, 2019
ca64cfe
Make sure we catch exception when parsing json
oschaaf Jul 2, 2019
e84cc0b
Fix uint64_t's that ought to be uint32_t
oschaaf Jul 2, 2019
698ede0
Add & use TCLAP_SET_IF_SPECIFIED macro
oschaaf Jul 2, 2019
7bfa2fc
Process review feedback
oschaaf Jul 3, 2019
a1cf135
Tidy up macro definition
oschaaf Jul 3, 2019
94c5ffb
Add test for uint32_t TCLAP parsing range
oschaaf Jul 3, 2019
8cdbb73
Open- & closed-loop: measure blocking
oschaaf Jul 8, 2019
7db2a8c
Merge remote-tracking branch 'upstream/master' into open-loop-and-blo…
oschaaf Jul 9, 2019
08ba684
Update comments
oschaaf Jul 11, 2019
3eff871
Add integration test
oschaaf Jul 11, 2019
204f333
Python formatting
oschaaf Jul 11, 2019
f4dc3de
Add PyDoc comments, add TODO + more expectations
oschaaf Jul 11, 2019
f7e3cda
Define a constant for reuse across expectations
oschaaf Jul 11, 2019
a25af07
Amend formatting issues
oschaaf Jul 11, 2019
6 changes: 6 additions & 0 deletions integration/integration_test_fixtures.py
@@ -80,6 +80,12 @@ def getNighthawkCounterMapFromJson(self, parsed_json):
for counter in parsed_json["results"][0]["counters"]
}

def getNighthawkGlobalHistogramsbyIdFromJson(self, parsed_json):
"""
Utility method to get the global histograms from the json indexed by id.
"""
return {statistic["id"]: statistic for statistic in parsed_json["results"][0]["statistics"]}

def getTestServerRootUri(self, https=False):
"""
Utility for getting the http://host:port/ that can be used to query the server we started in setUp()
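
For context, here is a minimal sketch (not part of the diff) of the JSON shape the new helper assumes and how its return value gets used. The field names come from the code above; the sample values are invented:

```python
# Illustrative fragment of Nighthawk's output JSON, shaped like the fields
# referenced by getNighthawkGlobalHistogramsbyIdFromJson (values made up).
parsed_json = {
    "results": [{
        "statistics": [
            {"id": "sequencer.blocking", "count": "1500"},
            {"id": "benchmark_http_client.request_to_response", "count": "1500"},
        ],
    }]
}

# Same expression as the helper: index the statistics list by "id".
global_histograms = {
    statistic["id"]: statistic for statistic in parsed_json["results"][0]["statistics"]
}
assert int(global_histograms["sequencer.blocking"]["count"]) > 1000
```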
25 changes: 25 additions & 0 deletions integration/test_integration_basics.py
@@ -29,6 +29,31 @@ def test_h1(self):
self.assertEqual(counters["upstream_rq_total"], 25)
self.assertEqual(len(counters), 9)

def mini_stress_test_h1(self, args):
# Run a test with more rps than we can handle, and a very small client-side queue.
# We should observe lots of successful requests as well as time spent in blocking mode.
parsed_json = self.runNighthawkClient(args)
counters = self.getNighthawkCounterMapFromJson(parsed_json)
self.assertGreater(counters["benchmark.http_2xx"], 1000)
Member: This (and below) are coupled with the call site, maybe try and have one source of truth for the number of reqs.

Member Author: Done

self.assertEqual(counters["upstream_cx_http1_total"], 1)
global_histograms = self.getNighthawkGlobalHistogramsbyIdFromJson(parsed_json)
self.assertGreater(int(global_histograms["sequencer.blocking"]["count"]), 1000)
self.assertGreater(
int(global_histograms["benchmark_http_client.request_to_response"]["count"]), 1000)
return counters

def test_h1_mini_stress_test_with_client_side_queueing(self):
counters = self.mini_stress_test_h1([
Member: For all the actual test_*, can you add PyDoc-style comments describing what each one does? This is pretty standard in Google Python style.

Member Author: Done

self.getTestServerRootUri(), "--rps", "999999", "--max-pending-requests", "10",
"--duration 2"
])
self.assertGreater(counters["upstream_rq_pending_total"], 100)

def test_h1_mini_stress_test_without_client_side_queueing(self):
counters = self.mini_stress_test_h1(
[self.getTestServerRootUri(), "--rps", "999999", "--duration 2"])
self.assertEqual(counters["upstream_rq_pending_total"], 1)

def test_h2(self):
parsed_json = self.runNighthawkClient(["--h2", self.getTestServerRootUri()])
counters = self.getNighthawkCounterMapFromJson(parsed_json)
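
To make the difference between the two mini stress tests concrete, here is an illustrative sketch (not part of the PR) of how the upstream_rq_pending_total counter separates the two modes. The helper function below and its threshold are assumptions for illustration; only the counter name and the expected magnitudes come from the assertions above:

```python
def classify_queueing_mode(counters):
    # With --max-pending-requests set above 1, requests queue up client-side and
    # upstream_rq_pending_total grows well past 1; without that flag the client
    # runs closed-loop and only a single pending request is ever counted.
    pending = counters.get("upstream_rq_pending_total", 0)
    return "client-side queueing" if pending > 1 else "closed-loop (no queueing)"

# Made-up counter values mirroring the two test assertions above.
assert classify_queueing_mode({"upstream_rq_pending_total": 250}) == "client-side queueing"
assert classify_queueing_mode({"upstream_rq_pending_total": 1}) == "closed-loop (no queueing)"
```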
23 changes: 13 additions & 10 deletions source/client/benchmark_client_impl.cc
@@ -189,16 +189,19 @@ void BenchmarkClientHttpImpl::setRequestHeader(absl::string_view key, absl::stri
}

bool BenchmarkClientHttpImpl::tryStartOne(std::function<void()> caller_completion_callback) {
// When no client side queueing is specified (via max_pending_requests_), we are in closed loop
// mode. In closed loop mode we want to be able to control the pacing as exactly as possible. In
// open-loop mode we probably want to skip this. NOTE(oschaaf): We can't consistently rely on
// resourceManager()::requests() because that isn't used for h/1 (it is used in tcp and h2
// though).
if (max_pending_requests_ == 1 &&
(!cluster_->resourceManager(Envoy::Upstream::ResourcePriority::Default)
.pendingRequests()
.canCreate() ||
((requests_initiated_ - requests_completed_) >= connection_limit_))) {
// When we allow client-side queuing, we want to have a sense of time spent waiting on that queue.
// So we return false here to indicate we couldn't initiate a new request.
if (!cluster_->resourceManager(Envoy::Upstream::ResourcePriority::Default)
Member: Not sure I see how this diff relates to the PR description.

Member Author: Without this change, we would see the overflow stats rise excessively. With it, we end up reporting the time we spend not being able to queue up a new request as time spent blocking (we indicate we could not by returning false). Hope that clarifies this?

Member: I think so, but can you add comments and a test?
/wait

Member Author: Done, and done. The test is implemented in Python, wdyt?

.pendingRequests()
.canCreate()) {
return false;
}
// When client-side queueing is disabled (max_pending_requests_ equals 1), we control the pacing as
// exactly as possible here.
// NOTE: We can't consistently rely on resourceManager()::requests()
// because that isn't used for h/1 (it is used in tcp and h2 though).
if ((max_pending_requests_ == 1 &&
(requests_initiated_ - requests_completed_) >= connection_limit_)) {
return false;
}

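
To summarize the new control flow, here is a simplified Python model of the two early-return checks in tryStartOne above. This is illustrative only; the parameter names are loose stand-ins for the C++ members and the resource manager call, not part of the actual implementation:

```python
def try_start_one(can_create_pending_request, max_pending_requests,
                  requests_initiated, requests_completed, connection_limit):
    # First check: the resource manager says no new pending request can be
    # created (the client-side queue is full). Returning False lets the caller
    # account for the time spent waiting on that queue as "blocking".
    if not can_create_pending_request:
        return False
    # Second check: with client-side queueing disabled (max pending requests of 1),
    # pace precisely by never exceeding the connection limit in flight.
    if max_pending_requests == 1 and \
            (requests_initiated - requests_completed) >= connection_limit:
        return False
    return True  # OK to initiate a new request.

# Queue full: blocked. Closed-loop with a free connection: proceed.
assert try_start_one(False, 10, 5, 0, 1) is False
assert try_start_one(True, 1, 3, 3, 1) is True
```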