stats: Remember recent lookups and display them in an admin endpoint (envoyproxy#8116)

* stats: Remember recent lookups and display them in an admin endpoint 

Signed-off-by: Joshua Marantz <jmarantz@google.com>
jmarantz authored and nandu-vinodan committed Oct 17, 2019
1 parent 0072503 commit 9566230
Showing 19 changed files with 524 additions and 114 deletions.
2 changes: 2 additions & 0 deletions docs/root/intro/version_history.rst
@@ -9,6 +9,8 @@ Version history
* access log: reintroduce :ref:`filesystem <filesystem_stats>` stats and added the `write_failed` counter to track failed log writes
* admin: added ability to configure listener :ref:`socket options <envoy_api_field_config.bootstrap.v2.Admin.socket_options>`.
* admin: added config dump support for Secret Discovery Service :ref:`SecretConfigDump <envoy_api_msg_admin.v2alpha.SecretsConfigDump>`.
* admin: added :http:get:`/stats/recentlookups`, :http:post:`/stats/recentlookups/clear`,
:http:post:`/stats/recentlookups/disable`, and :http:post:`/stats/recentlookups/enable` endpoints.
* api: added :ref:`set_node_on_first_message_only <envoy_api_field_core.ApiConfigSource.set_node_on_first_message_only>` option to omit the node identifier from the subsequent discovery requests on the same stream.
* buffer filter: the buffer filter populates content-length header if not present, behavior can be disabled using the runtime feature `envoy.reloadable_features.buffer_filter_populate_content_length`.
* config: added support for :ref:`delta xDS <arch_overview_dynamic_config_delta>` (including ADS) delivery
51 changes: 50 additions & 1 deletion docs/root/operations/admin.rst
@@ -200,7 +200,7 @@ modify different aspects of the server:
.. http:post:: /reset_counters
Reset all counters to zero. This is useful along with :http:get:`/stats` during debugging. Note
- that this does not drop any data sent to statsd. It just effects local output of the
+ that this does not drop any data sent to statsd. It just affects local output of the
:http:get:`/stats` command.

.. http:get:: /server_info
@@ -354,6 +354,55 @@ modify different aspects of the server:
Envoy has updated (counters incremented at least once, gauges changed at least once,
and histograms added to at least once)

.. http:get:: /stats/recentlookups
This endpoint helps Envoy developers debug potential contention
issues in the stats system. Initially, only the count of StatName
lookups is accumulated, not the specific names that are being looked
up. To see specific recent requests, you must enable the feature by
POSTing to `/stats/recentlookups/enable`. Enabling it may add
approximately 40-100 nanoseconds of overhead per lookup.

When enabled, this endpoint emits a table of stat names that were
recently accessed as strings by Envoy. Ideally, strings should be
converted into StatNames, counters, gauges, and histograms by Envoy
code only during startup or when receiving a new configuration via
xDS. This is because when stats are looked up as strings they must
take a global symbol table lock. During startup this is acceptable,
but in response to user requests on high core-count machines, this
can cause performance issues due to mutex contention.

This admin endpoint requires Envoy to be started with option
`--use-fake-symbol-table 0`.

See :repo:`source/docs/stats.md` for more details.

Note also that actual mutex contention can be tracked via :http:get:`/contention`.
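
The advice above (resolve names at startup or on a config update, not per request) can be illustrated with a small sketch. Everything here is hypothetical: `ToyLockingTable`, `ToyFilter`, and `onRequestSlow` are stand-ins invented for this example and are not Envoy's classes. The sketch only assumes what the text states, namely that a string lookup takes a global lock.

```cpp
#include <cassert>
#include <cstdint>
#include <mutex>
#include <string>
#include <unordered_map>

// Hypothetical stand-in for a symbol table: every string lookup takes one
// global lock, which is the cost the documentation warns about.
class ToyLockingTable {
public:
  uint32_t lookup(const std::string& name) {
    std::lock_guard<std::mutex> guard(mutex_);
    ++lock_acquisitions_;
    auto it = symbols_.find(name);
    if (it == symbols_.end()) {
      it = symbols_.emplace(name, static_cast<uint32_t>(symbols_.size())).first;
    }
    return it->second;
  }

  // Number of times lookup() has taken the lock so far.
  uint64_t lockAcquisitions() const {
    std::lock_guard<std::mutex> guard(mutex_);
    return lock_acquisitions_;
  }

private:
  mutable std::mutex mutex_;
  std::unordered_map<std::string, uint32_t> symbols_;
  uint64_t lock_acquisitions_ = 0;
};

// Preferred pattern: resolve the name once, at construction, so the request
// path never touches the lock.
class ToyFilter {
public:
  explicit ToyFilter(ToyLockingTable& table)
      : requests_symbol_(table.lookup("http.requests")) {}
  uint32_t onRequest() const { return requests_symbol_; }

private:
  const uint32_t requests_symbol_;
};

// Anti-pattern: look the name up as a string on every request.
uint32_t onRequestSlow(ToyLockingTable& table) { return table.lookup("http.requests"); }
```

In this toy model a thousand calls to `ToyFilter::onRequest` cost one lock acquisition in total, while a thousand calls to `onRequestSlow` cost a thousand. String lookups of the second kind are what the recent-lookups table is meant to surface.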

.. http:post:: /stats/recentlookups/enable
Turns on collection of recent lookups of stat names, thus enabling
`/stats/recentlookups`.

See :repo:`source/docs/stats.md` for more details.

.. http:post:: /stats/recentlookups/disable
Turns off collection of recent lookups of stat names, thus disabling
`/stats/recentlookups`. It also clears the list of lookups. However,
the total count, visible as stat `server.stats_recent_lookups`, is
not cleared, and continues to accumulate.

See :repo:`source/docs/stats.md` for more details.

.. http:post:: /stats/recentlookups/clear
Clears all recent-lookup data, including the total count. Collection
continues if it is enabled.

See :repo:`source/docs/stats.md` for more details.
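
The enable, disable, and clear semantics described above can be modelled with a short sketch. `ToyRecentLookups` is a hypothetical class written for this example, not the `RecentLookups` class from the commit. Treating "enable" as setting a non-zero capacity and "disable" as setting capacity to zero is an assumption, suggested by the `setRecentLookupCapacity` method but not confirmed by the text shown here.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <deque>
#include <string>

// Hypothetical model of the documented behavior:
//  - the total count always accumulates, even while name collection is off;
//  - names are only recorded while capacity is non-zero;
//  - disabling (capacity 0) drops recorded names but keeps the total;
//  - clear() drops both the names and the total, and leaves capacity alone.
class ToyRecentLookups {
public:
  void lookup(const std::string& name) {
    ++total_;
    if (capacity_ == 0) {
      return; // Collection disabled: count only.
    }
    if (names_.size() == capacity_) {
      names_.pop_front(); // Evict the oldest entry to stay within capacity.
    }
    names_.push_back(name);
  }

  // Assumed mapping: enable == non-zero capacity, disable == zero capacity.
  void setCapacity(std::size_t capacity) {
    capacity_ = capacity;
    while (names_.size() > capacity_) {
      names_.pop_front();
    }
  }

  void clear() {
    names_.clear();
    total_ = 0;
  }

  uint64_t total() const { return total_; }
  std::size_t recorded() const { return names_.size(); }

private:
  std::deque<std::string> names_;
  std::size_t capacity_ = 0;
  uint64_t total_ = 0;
};
```

Under this model, lookups made before enabling still increase `total()`, disabling empties the name list without resetting the total, and `clear()` resets both while leaving collection on.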

.. _operations_admin_interface_runtime:

.. http:get:: /runtime
26 changes: 26 additions & 0 deletions include/envoy/stats/symbol_table.h
@@ -146,6 +146,32 @@ class SymbolTable {
virtual void callWithStringView(StatName stat_name,
const std::function<void(absl::string_view)>& fn) const PURE;

using RecentLookupsFn = std::function<void(absl::string_view, uint64_t)>;

/**
* Calls the provided function with the name of the most recently looked-up
* symbols, including lookups on any StatNameSets, and with a count of
* the recent lookups on that symbol.
*
* @param iter the function to call for every recent item.
* @return the total number of lookups recorded since tracking was last cleared.
*/
virtual uint64_t getRecentLookups(const RecentLookupsFn& iter) const PURE;

/**
* Clears the recent-lookups structures.
*/
virtual void clearRecentLookups() PURE;

/**
* Sets the recent-lookup capacity.
*/
virtual void setRecentLookupCapacity(uint64_t capacity) PURE;

/**
* @return The configured recent-lookup tracking capacity.
*/
virtual uint64_t recentLookupCapacity() const PURE;

/**
* Creates a StatNameSet.
*
1 change: 1 addition & 0 deletions source/common/stats/BUILD
@@ -156,6 +156,7 @@ envoy_cc_library(
hdrs = ["symbol_table_impl.h"],
external_deps = ["abseil_base"],
deps = [
":recent_lookups_lib",
"//include/envoy/stats:symbol_table_interface",
"//source/common/common:assert_lib",
"//source/common/common:logger_lib",
4 changes: 4 additions & 0 deletions source/common/stats/fake_symbol_table_impl.h
@@ -130,6 +130,10 @@ class FakeSymbolTableImpl : public SymbolTable {
return StatNameSetPtr(new StatNameSet(*this, name));
}
void forgetSet(StatNameSet&) override {}
uint64_t getRecentLookups(const RecentLookupsFn&) const override { return 0; }
void clearRecentLookups() override {}
void setRecentLookupCapacity(uint64_t) override {}
uint64_t recentLookupCapacity() const override { return 0; }

private:
absl::string_view toStringView(const StatName& stat_name) const {
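
The fake symbol table above implements the new interface methods as no-ops that report zero lookups. A plausible reading, offered here as an inference and not something the commit states, is that this is why the admin documentation asks for `--use-fake-symbol-table 0`. The sketch below shows how a caller might consume such an interface. `ToyTableInterface`, `ToyFakeTable`, and `render` are hypothetical names, and the output format is invented for illustration.

```cpp
#include <cassert>
#include <cstdint>
#include <functional>
#include <string>
#include <vector>

// Simplified callback type, mirroring the shape of RecentLookupsFn but using
// std::string so the sketch has no dependency outside the standard library.
using ToyLookupsFn = std::function<void(const std::string&, uint64_t)>;

// Hypothetical, reduced version of the interface.
class ToyTableInterface {
public:
  virtual ~ToyTableInterface() = default;
  virtual uint64_t getRecentLookups(const ToyLookupsFn& iter) const = 0;
};

// No-op implementation, analogous to the fake symbol table: it never calls
// the callback and reports a total of zero.
class ToyFakeTable : public ToyTableInterface {
public:
  uint64_t getRecentLookups(const ToyLookupsFn&) const override { return 0; }
};

// A caller that turns the callback invocations into text lines. The format
// is made up for this example.
std::vector<std::string> render(const ToyTableInterface& table) {
  std::vector<std::string> lines;
  const uint64_t total =
      table.getRecentLookups([&lines](const std::string& name, uint64_t count) {
        lines.push_back(std::to_string(count) + " " + name);
      });
  lines.push_back("total: " + std::to_string(total));
  return lines;
}
```

With the no-op table, `render` produces only the `total: 0` line, since the callback is never invoked.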
104 changes: 100 additions & 4 deletions source/common/stats/symbol_table_impl.cc
@@ -126,6 +126,7 @@ void SymbolTableImpl::addTokensToEncoding(const absl::string_view name, Encoding
// ref-counts in this.
{
Thread::LockGuard lock(lock_);
recent_lookups_.lookup(name);
for (auto& token : tokens) {
symbols.push_back(toSymbol(token));
}
@@ -207,16 +208,94 @@ void SymbolTableImpl::free(const StatName& stat_name) {
}
}

- StatNameSetPtr SymbolTableImpl::makeSet(absl::string_view name) {
+ uint64_t SymbolTableImpl::getRecentLookups(const RecentLookupsFn& iter) const {
uint64_t total = 0;
absl::flat_hash_map<std::string, uint64_t> name_count_map;

// We don't want to hold stat_name_set_mutex while calling the iterator, so
// buffer lookup_data.
{
Thread::LockGuard lock(stat_name_set_mutex_);
for (StatNameSet* stat_name_set : stat_name_sets_) {
total +=
stat_name_set->getRecentLookups([&name_count_map](absl::string_view str, uint64_t count) {
name_count_map[std::string(str)] += count;
});
}
}

// We also don't want to hold lock_ while calling the iterator, but we need it
// to access recent_lookups_.
{
Thread::LockGuard lock(lock_);
recent_lookups_.forEach(
[&name_count_map](absl::string_view str, uint64_t count)
NO_THREAD_SAFETY_ANALYSIS { name_count_map[std::string(str)] += count; });
total += recent_lookups_.total();
}

// Now we have the collated name-count map data: we need to vectorize and
// sort. We define the pair with the count first as std::pair::operator<
// prioritizes its first element over its second.
using LookupCount = std::pair<uint64_t, absl::string_view>;
std::vector<LookupCount> lookup_data;
lookup_data.reserve(name_count_map.size());
for (const auto& iter : name_count_map) {
lookup_data.emplace_back(LookupCount(iter.second, iter.first));
}
std::sort(lookup_data.begin(), lookup_data.end());
for (const LookupCount& lookup_count : lookup_data) {
iter(lookup_count.second, lookup_count.first);
}
return total;
}

void SymbolTableImpl::setRecentLookupCapacity(uint64_t capacity) {
{
Thread::LockGuard lock(stat_name_set_mutex_);
for (StatNameSet* stat_name_set : stat_name_sets_) {
stat_name_set->setRecentLookupCapacity(capacity);
}
}

{
Thread::LockGuard lock(lock_);
recent_lookups_.setCapacity(capacity);
}
}

void SymbolTableImpl::clearRecentLookups() {
{
Thread::LockGuard lock(stat_name_set_mutex_);
for (StatNameSet* stat_name_set : stat_name_sets_) {
stat_name_set->clearRecentLookups();
}
}
{
Thread::LockGuard lock(lock_);
recent_lookups_.clear();
}
}

uint64_t SymbolTableImpl::recentLookupCapacity() const {
Thread::LockGuard lock(lock_);
- // make_unique does not work with private ctor, even though FakeSymbolTableImpl is a friend.
return recent_lookups_.capacity();
}

StatNameSetPtr SymbolTableImpl::makeSet(absl::string_view name) {
const uint64_t capacity = recentLookupCapacity();
// make_unique does not work with private ctor, even though SymbolTableImpl is a friend.
StatNameSetPtr stat_name_set(new StatNameSet(*this, name));
- stat_name_sets_.insert(stat_name_set.get());
stat_name_set->setRecentLookupCapacity(capacity);
{
Thread::LockGuard lock(stat_name_set_mutex_);
stat_name_sets_.insert(stat_name_set.get());
}
return stat_name_set;
}

void SymbolTableImpl::forgetSet(StatNameSet& stat_name_set) {
- Thread::LockGuard lock(lock_);
+ Thread::LockGuard lock(stat_name_set_mutex_);
stat_name_sets_.erase(&stat_name_set);
}
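
`getRecentLookups` in the hunk above follows a common pattern: copy the data while holding the lock, release the lock, then sort and invoke the caller's callback. Putting the count first in the pair lets the default `std::pair` ordering sort by count. The sketch below reproduces that pattern with standard-library types only. `ToyLookupTable` is a hypothetical class with a single mutex, so it is a simplification of the commit's code, which collates from two separately locked sources.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <functional>
#include <map>
#include <mutex>
#include <string>
#include <utility>
#include <vector>

using ToyIterFn = std::function<void(const std::string&, uint64_t)>;

class ToyLookupTable {
public:
  void record(const std::string& name) {
    std::lock_guard<std::mutex> guard(mutex_);
    ++counts_[name];
    ++total_;
  }

  uint64_t getRecentLookups(const ToyIterFn& fn) const {
    // Count goes first so std::pair's operator< orders entries by count,
    // with the name as the tie-breaker.
    std::vector<std::pair<uint64_t, std::string>> snapshot;
    uint64_t total = 0;
    {
      // Hold the lock only long enough to copy the data out.
      std::lock_guard<std::mutex> guard(mutex_);
      snapshot.reserve(counts_.size());
      for (const auto& entry : counts_) {
        snapshot.emplace_back(entry.second, entry.first);
      }
      total = total_;
    }
    // The lock is released here, so the callback may safely call back into
    // this object without deadlocking.
    std::sort(snapshot.begin(), snapshot.end());
    for (const auto& item : snapshot) {
      fn(item.second, item.first);
    }
    return total;
  }

private:
  mutable std::mutex mutex_;
  std::map<std::string, uint64_t> counts_;
  uint64_t total_ = 0;
};
```

Because the sort is ascending, the least frequently looked-up names are reported first, as in the commit's code. Copying into a snapshot costs an allocation per call, which seems acceptable for a debugging endpoint that is not on the request path.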

@@ -484,10 +563,27 @@ StatName StatNameSet::getDynamic(absl::string_view token) {
Stats::StatName& stat_name_ref = dynamic_stat_names_[token];
if (stat_name_ref.empty()) { // Note that builtin_stat_names_ already has one for "".
stat_name_ref = pool_.add(token);
recent_lookups_.lookup(token);
}
return stat_name_ref;
}
}

uint64_t StatNameSet::getRecentLookups(const RecentLookups::IterFn& iter) const {
absl::MutexLock lock(&mutex_);
recent_lookups_.forEach(iter);
return recent_lookups_.total();
}

void StatNameSet::clearRecentLookups() {
absl::MutexLock lock(&mutex_);
recent_lookups_.clear();
}

void StatNameSet::setRecentLookupCapacity(uint64_t capacity) {
absl::MutexLock lock(&mutex_);
recent_lookups_.setCapacity(capacity);
}

} // namespace Stats
} // namespace Envoy
28 changes: 26 additions & 2 deletions source/common/stats/symbol_table_impl.h
@@ -18,6 +18,7 @@
#include "common/common/stack_array.h"
#include "common/common/thread.h"
#include "common/common/utility.h"
#include "common/stats/recent_lookups.h"

#include "absl/container/flat_hash_map.h"
#include "absl/strings/str_join.h"
@@ -161,6 +162,10 @@ class SymbolTableImpl : public SymbolTable {

StatNameSetPtr makeSet(absl::string_view name) override;
void forgetSet(StatNameSet& stat_name_set) override;
uint64_t getRecentLookups(const RecentLookupsFn&) const override;
void clearRecentLookups() override;
void setRecentLookupCapacity(uint64_t capacity) override;
uint64_t recentLookupCapacity() const override;

private:
friend class StatName;
@@ -176,6 +181,9 @@ class SymbolTableImpl : public SymbolTable {
// This must be held during both encode() and free().
mutable Thread::MutexBasicLockable lock_;

// This must be held while updating stat_name_sets_.
mutable Thread::MutexBasicLockable stat_name_set_mutex_;

/**
* Decodes a vector of symbols back into its period-delimited stat name. If
* decoding fails on any part of the symbol_vec, we release_assert and crash
@@ -241,7 +249,9 @@ class SymbolTableImpl : public SymbolTable {
// TODO(ambuc): There might be an optimization here relating to storing ranges of freed symbols
// using an Envoy::IntervalSet.
std::stack<Symbol> pool_ GUARDED_BY(lock_);
- absl::flat_hash_set<StatNameSet*> stat_name_sets_ GUARDED_BY(lock_);
+ RecentLookups recent_lookups_ GUARDED_BY(lock_);
+
+ absl::flat_hash_set<StatNameSet*> stat_name_sets_ GUARDED_BY(stat_name_set_mutex_);
};

/**
@@ -709,19 +719,33 @@ class StatNameSet {
return pool_.add(str);
}

/**
* Clears recent lookups.
*/
void clearRecentLookups();

/**
* Sets the number of names recorded in the recent-lookups set.
*
* @param capacity the capacity to configure.
*/
void setRecentLookupCapacity(uint64_t capacity);

private:
friend class FakeSymbolTableImpl;
friend class SymbolTableImpl;

StatNameSet(SymbolTable& symbol_table, absl::string_view name);
uint64_t getRecentLookups(const RecentLookups::IterFn& iter) const;

const std::string name_;
Stats::SymbolTable& symbol_table_;
Stats::StatNamePool pool_ GUARDED_BY(mutex_);
- absl::Mutex mutex_;
+ mutable absl::Mutex mutex_;
using StringStatNameMap = absl::flat_hash_map<std::string, Stats::StatName>;
StringStatNameMap builtin_stat_names_;
StringStatNameMap dynamic_stat_names_ GUARDED_BY(mutex_);
RecentLookups recent_lookups_ GUARDED_BY(mutex_);
};

} // namespace Stats
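
The header change from `absl::Mutex mutex_` to `mutable absl::Mutex mutex_` goes together with the new `const` method `StatNameSet::getRecentLookups`, which has to lock the mutex. A minimal sketch of why `mutable` is needed, using `std::mutex` and a hypothetical `ToyCounter` class in place of the Abseil and Envoy types:

```cpp
#include <cassert>
#include <cstdint>
#include <mutex>

class ToyCounter {
public:
  void increment() {
    std::lock_guard<std::mutex> guard(mutex_);
    ++value_;
  }

  // A const reader still has to lock. Locking modifies the mutex, so this
  // compiles only because mutex_ is declared mutable.
  uint64_t value() const {
    std::lock_guard<std::mutex> guard(mutex_);
    return value_;
  }

private:
  mutable std::mutex mutex_;
  uint64_t value_ = 0;
};
```

Removing `mutable` from `mutex_` makes `value()` fail to compile, because `std::lock_guard` needs a non-const reference to the mutex.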