stats: Add fake symbol table as an intermediate state to move to SymbolTable API without taking locks. #5414

Merged
merged 23 commits into from
Jan 30, 2019
Commits
b1fcd49
catch up with symtab-read-lock and to-string-on-symtab.
jmarantz Dec 18, 2018
121ec75
refactor toString
jmarantz Dec 22, 2018
a54a2bc
Merge branch 'master' into fake-symbol-table
jmarantz Dec 22, 2018
4b9570c
virtualize symbol-table.
jmarantz Dec 23, 2018
0e9303b
use virtual interface in tests.
jmarantz Dec 23, 2018
4fa1eb2
all tests working.
jmarantz Dec 24, 2018
8a6aec1
Fix asan failures, add comments, cleanup.
jmarantz Dec 25, 2018
2209115
clang-tidy fixes.
jmarantz Dec 25, 2018
adf956e
Merge branch 'master' into fake-symbol-table
jmarantz Jan 14, 2019
a92121d
Merge branch 'master' into fake-symbol-table
jmarantz Jan 17, 2019
c5e25e1
Merge branch 'master' into fake-symbol-table
jmarantz Jan 22, 2019
a5a112e
Sink Storage type nicknames into SymbolTable class.
jmarantz Jan 22, 2019
b37dc32
comment cleanup.
jmarantz Jan 23, 2019
9140665
Merge branch 'master' into fake-symbol-table
jmarantz Jan 26, 2019
11fd3c0
Privatize SymbolTable::free and incRefCount, friending helper classes…
jmarantz Jan 27, 2019
675b9d6
Improve comments, fix nits, typos, etc.
jmarantz Jan 28, 2019
35709b3
Remove 2-arg form of join().
jmarantz Jan 28, 2019
392a0be
Merge branch 'master' into fake-symbol-table
jmarantz Jan 29, 2019
e9f2b50
Review style nits and actually test for zero contentions in fake symb…
jmarantz Jan 29, 2019
71df963
Only start tracking the contentions right before doing the accesses.
jmarantz Jan 30, 2019
ecb1b88
Add missing include for vector.
jmarantz Jan 30, 2019
586103c
Merge branch 'master' into fake-symbol-table
jmarantz Jan 30, 2019
d804664
Merge branch 'master' into fake-symbol-table
jmarantz Jan 30, 2019
159 changes: 156 additions & 3 deletions include/envoy/stats/symbol_table.h
@@ -1,13 +1,166 @@
#pragma once

#include <memory>
#include <vector>

#include "envoy/common/pure.h"

#include "absl/strings/string_view.h"

namespace Envoy {
namespace Stats {

// Interface for referencing a stat name.
/**
* Runtime representation of an encoded stat name. This is predeclared only in
* the interface without abstract methods, because (a) the underlying class
* representation is common to both implementations of SymbolTable, and (b)
* we do not want or need the overhead of a vptr per StatName. The common
* declaration for StatName is in source/common/stats/symbol_table_impl.h
*/
class StatName;

// Interface for managing symbol tables.
class SymbolTable;
/**
* Intermediate representation for a stat-name. This helps store multiple names
* in a single packed allocation. First we encode each desired name, then sum
* their sizes for the single packed allocation. This is used to store
* MetricImpl's tags and tagExtractedName. Like StatName, we don't want to pay
* a vptr overhead per object, and the representation is shared between the
* SymbolTable implementations, so this is just a pre-declare.
*/
class SymbolEncoding;

/**
* SymbolTable manages a namespace optimized for stat names, exploiting their
* typical composition from "."-separated tokens, with a significant overlap
* between the tokens. The interface is designed to balance optimal storage
* at scale with hiding details from users. We seek to provide the most abstract
* interface possible that avoids adding per-stat overhead or taking locks in
* the hot path.
*/
class SymbolTable {
public:
/**
* Efficient byte-encoded storage of an array of tokens. The most common
* tokens are typically < 128, and are represented directly. Tokens >= 128
* spill into the next byte, allowing for tokens of arbitrary numeric value to
* be stored. As long as the most common tokens are low-valued, the
* representation is space-efficient. This scheme is similar to UTF-8. The
* token ordering is dependent on the order in which stat-names are encoded
* into the SymbolTable, which will not be optimal, but in practice appears
* to be pretty good.
*
* This is exposed in the interface for the benefit of join(), which is
* used in the hot-path to append two stat-names into a temp without taking
* locks. This is used then in thread-local cache lookup, so that once warm,
* no locks are taken when looking up stats.
*/
using Storage = uint8_t[];
using StoragePtr = std::unique_ptr<Storage>;

virtual ~SymbolTable() = default;

/**
* Encodes a stat name using the symbol table, returning a SymbolEncoding. The
* SymbolEncoding is not intended for long-term storage, but is used to help
* allocate a StatName with the correct amount of storage.
*
* When a name is encoded, it bumps reference counts held in the table for
* each symbol. The caller is responsible for creating a StatName using this
* SymbolEncoding and ultimately disposing of it by calling
* SymbolTable::free(). Users are protected from leaking symbols into the pool
* by ASSERTions in the SymbolTable destructor.
*
* @param name The name to encode.
* @return SymbolEncoding the encoded symbols.
*/
virtual SymbolEncoding encode(absl::string_view name) PURE;

/**
* @return uint64_t the number of symbols in the symbol table.
*/
virtual uint64_t numSymbols() const PURE;

/**
* Decodes a StatName back into its period-delimited stat name. If
* decoding fails on any part of the stat_name, we release_assert and crash,
* since this should never happen, and we don't want to continue running
* with a corrupt stats set.
*
* @param stat_name the stat name.
* @return std::string stringified stat_name.
*/
virtual std::string toString(const StatName& stat_name) const PURE;

/**
* Determines whether one StatName lexically precedes another. Note that
* the lexical order may not exactly match the lexical order of the
* elaborated strings. For example, stat-name of "-.-" would lexically
* sort after "---" but when encoded as a StatName would come lexically
* earlier. In practice this is unlikely to matter as those are not
* reasonable names for Envoy stats.
*
* Note that this operation has to be performed with the context of the
* SymbolTable so that the individual Symbol objects can be converted
* into strings for lexical comparison.
*
* @param a the first stat name
* @param b the second stat name
* @return bool true if a lexically precedes b.
*/
virtual bool lessThan(const StatName& a, const StatName& b) const PURE;

/**
* Joins two or more StatNames. For example if we have StatNames for {"a.b",
* "c.d", "e.f"} then the joined stat-name matches "a.b.c.d.e.f". The
* advantage of using this representation is that it avoids having to
* decode/encode into the elaborated form, and does not require locking the
* SymbolTable.
*
* The caveat is that this representation does not bump reference counts on
* the referenced Symbols in the SymbolTable, so it's only valid for
* the lifetime of the joined StatNames.
*
* This is intended for use doing cached name lookups of scoped stats, where
* the scope prefix and the names to combine it with are already in StatName
* form. Using this class, they can be combined without accessing the
* SymbolTable or, in particular, taking its lock.
*
* @param stat_names the names to join.
* @return Storage allocated for the joined name.
*/
virtual StoragePtr join(const std::vector<StatName>& stat_names) const PURE;

#ifndef ENVOY_CONFIG_COVERAGE
virtual void debugPrint() const PURE;
#endif

private:
friend class StatNameStorage;
friend class StatNameList;

/**
* Since SymbolTable does manual reference counting, a client of SymbolTable
* must manually call free(stat_name) when it is freeing the backing store
* for a StatName. This way, the symbol table will grow and shrink
* dynamically, instead of being write-only.
*
* @param stat_name the stat name.
*/
virtual void free(const StatName& stat_name) PURE;

/**
* StatName backing-store can be managed by callers in a variety of ways
* to minimize overhead. But any persistent reference to a StatName needs
* to hold onto its own reference-counts for all symbols. This method
* helps callers ensure the symbol-storage is maintained for the lifetime
* of a reference.
*
* @param stat_name the stat name.
*/
virtual void incRefCount(const StatName& stat_name) PURE;
};

using SharedSymbolTable = std::shared_ptr<SymbolTable>;

} // namespace Stats
} // namespace Envoy
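
The Storage comment above describes a byte encoding in which small token values occupy one byte and larger values spill into following bytes. The following is a minimal standalone sketch of one such scheme, assuming a 7-bits-per-byte layout with the high bit as a continuation flag. The bit layout and the names `encodeTokens` and `decodeTokens` are assumptions for illustration only and are not taken from this PR or from SymbolTableImpl.

```cpp
#include <cstdint>
#include <vector>

// Hypothetical helper, not part of Envoy: packs each token into one or more
// bytes. The low 7 bits of each byte carry data; the high bit is set when
// more bytes follow for the same token.
std::vector<uint8_t> encodeTokens(const std::vector<uint32_t>& tokens) {
  std::vector<uint8_t> out;
  for (uint32_t token : tokens) {
    do {
      uint8_t byte = token & 0x7f;
      token >>= 7;
      if (token != 0) {
        byte |= 0x80; // Continuation flag: the token spills into the next byte.
      }
      out.push_back(byte);
    } while (token != 0);
  }
  return out;
}

// Hypothetical inverse of encodeTokens(): reassembles tokens from the bytes.
std::vector<uint32_t> decodeTokens(const std::vector<uint8_t>& bytes) {
  std::vector<uint32_t> tokens;
  uint32_t current = 0;
  int shift = 0;
  for (uint8_t byte : bytes) {
    current |= static_cast<uint32_t>(byte & 0x7f) << shift;
    if ((byte & 0x80) != 0) {
      shift += 7; // More bytes belong to this token.
    } else {
      tokens.push_back(current);
      current = 0;
      shift = 0;
    }
  }
  return tokens;
}
```

Under this assumed layout, a stat name whose tokens are all below 128 costs one byte per token, which is why the comment notes that the representation is space-efficient when the most common tokens are low-valued.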
6 changes: 6 additions & 0 deletions source/common/stats/BUILD
@@ -135,6 +135,12 @@ envoy_cc_library(
],
)

envoy_cc_library(
name = "fake_symbol_table_lib",
hdrs = ["fake_symbol_table_impl.h"],
deps = [":symbol_table_lib"],
)

envoy_cc_library(
name = "stats_options_lib",
hdrs = ["stats_options_impl.h"],
98 changes: 98 additions & 0 deletions source/common/stats/fake_symbol_table_impl.h
@@ -0,0 +1,98 @@
#pragma once

#include <algorithm>
#include <cstring>
#include <memory>
#include <stack>
#include <string>
#include <unordered_map>
#include <vector>

#include "envoy/common/exception.h"
#include "envoy/stats/symbol_table.h"

#include "common/common/assert.h"
#include "common/common/hash.h"
#include "common/common/lock_guard.h"
#include "common/common/non_copyable.h"
#include "common/common/thread.h"
#include "common/common/utility.h"
#include "common/stats/symbol_table_impl.h"

#include "absl/strings/str_join.h"
#include "absl/strings/str_split.h"

namespace Envoy {
namespace Stats {

/**
* Implements the SymbolTable interface without taking locks or saving memory.
* This implementation is intended as a transient state for the Envoy codebase
* to allow incremental conversion of Envoy stats call-sites to use the
* SymbolTable interface, pre-allocating symbols during construction time for
* all stats tokens.
*
* Once all stat tokens are symbolized at construction time, this
* FakeSymbolTable implementation can be deleted, and real-symbol tables can be
* used, thereby reducing memory and improving stat construction time.
*
* Note that it is not necessary to pre-allocate all elaborated stat names
* because multiple StatNames can be joined together without taking locks,
* even in SymbolTableImpl.
*
* This implementation simply stores the characters directly in the uint8_t[]
* that backs each StatName, so there is no sharing or memory savings, but also
* no state associated with the SymbolTable, and thus no locks needed.
*
* TODO(jmarantz): delete this class once SymbolTable is fully deployed in the
* Envoy codebase.
*/
class FakeSymbolTableImpl : public SymbolTable {
public:
SymbolEncoding encode(absl::string_view name) override { return encodeHelper(name); }

std::string toString(const StatName& stat_name) const override {
return std::string(toStringView(stat_name));
}
uint64_t numSymbols() const override { return 0; }
bool lessThan(const StatName& a, const StatName& b) const override {
return toStringView(a) < toStringView(b);
}
void free(const StatName&) override {}
void incRefCount(const StatName&) override {}
SymbolTable::StoragePtr join(const std::vector<StatName>& names) const override {
std::vector<absl::string_view> strings;
for (StatName name : names) {
absl::string_view str = toStringView(name);
if (!str.empty()) {
strings.push_back(str);
}
}
return stringToStorage(absl::StrJoin(strings, "."));
}

#ifndef ENVOY_CONFIG_COVERAGE
void debugPrint() const override {}
#endif

private:
SymbolEncoding encodeHelper(absl::string_view name) const {
SymbolEncoding encoding;
encoding.addStringForFakeSymbolTable(name);
return encoding;
}

absl::string_view toStringView(const StatName& stat_name) const {
return {reinterpret_cast<const char*>(stat_name.data()), stat_name.dataSize()};
}

SymbolTable::StoragePtr stringToStorage(absl::string_view name) const {
SymbolEncoding encoding = encodeHelper(name);
auto bytes = std::make_unique<uint8_t[]>(encoding.bytesRequired());
encoding.moveToStorage(bytes.get());
return bytes;
}
};

} // namespace Stats
} // namespace Envoy
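
To illustrate the behavior of FakeSymbolTableImpl::join() above without the Envoy or Abseil dependencies, here is a minimal standalone sketch. It assumes C++17 and uses `std::string_view` in place of `StatName` and `absl::StrJoin`; the function name `joinNames` is hypothetical. Like the fake implementation, it skips empty names and joins the rest with "." without touching any shared state, so no lock is needed.

```cpp
#include <string>
#include <string_view>
#include <vector>

// Hypothetical stand-in for FakeSymbolTableImpl::join(): each name is plain
// characters, so joining is string concatenation with "." separators.
std::string joinNames(const std::vector<std::string_view>& names) {
  std::string joined;
  for (std::string_view name : names) {
    if (name.empty()) {
      continue; // Empty names are dropped so no stray "." appears.
    }
    if (!joined.empty()) {
      joined += '.';
    }
    joined.append(name.data(), name.size());
  }
  return joined;
}
```

For example, joining "a.b", "c.d" and "e.f" yields "a.b.c.d.e.f", matching the example in the join() documentation in symbol_table.h. The real class returns a StoragePtr rather than a std::string; that difference is omitted here for brevity.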