Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

stats: Add fake symbol table as an intermediate state to move to SymbolTable API without taking locks. #5414

Merged
merged 23 commits into from
Jan 30, 2019

Conversation

jmarantz
Copy link
Contributor

@jmarantz jmarantz commented Dec 25, 2018

Description: Adds an abstract interface for SymbolTable and alternate implementation FakeSymbolTableImpl, which doesn't take locks. Once all stat tokens are symbolized at construction time, this FakeSymbolTable implementation can be deleted, and real-symbol tables can be used, thereby reducing memory and improving stat construction time per #3585 and #4980 . Note that it is not necessary to pre-allocate all elaborated stat names because multiple StatNames can be joined together without taking locks, even in SymbolTableImpl.

This implementation simply stores the characters directly in the uint8_t[] that backs each StatName, so there is no sharing or memory savings, but also no state associated with the SymbolTable, and thus no locks needed.

Risk Level: low
Testing: //test/common/stats/...
Docs Changes: n/a
Release Notes: n/a

Signed-off-by: Joshua Marantz <jmarantz@google.com>
Signed-off-by: Joshua Marantz <jmarantz@google.com>
Signed-off-by: Joshua Marantz <jmarantz@google.com>
Signed-off-by: Joshua Marantz <jmarantz@google.com>
Signed-off-by: Joshua Marantz <jmarantz@google.com>
Signed-off-by: Joshua Marantz <jmarantz@google.com>
Signed-off-by: Joshua Marantz <jmarantz@google.com>
Signed-off-by: Joshua Marantz <jmarantz@google.com>
@jmarantz
Copy link
Contributor Author

@mattklein123 any comments on this idea in general? TL;DR by switching to the SymbollTable API without changing the underlying rep for now, we can incrementally convert subsystems in Envoy to symbolize their stat names on construction with small-scope PRs. Then make the switch in SymbolTable impls after there are no hot runtime string-based stat lookups.

@mattklein123
Copy link
Member

@jmarantz if I understand the proposal at a high level this makes sense to me as a stepping stone. However, I do wonder if we should focus on getting rid of shared memory stats first? I'm once again having a really hard time keeping track of all the permutations we need to support and I feel that we would a) get much better prod/test coverage if everyone was using the same stats and b) it would be vastly easier to understand. Thoughts on priority?

@jmarantz
Copy link
Contributor Author

Linking #4974 for context. I was reluctant to take that as a blocker for reducing stat memory as it's hard for me to anticipate the operational differences with the current hot-restart method, especially as we don't actually use hot-restart ourselves.

But I think it's a fair point that we need to be confident in every change that takes us toward a leaner stats architecture, and the three parallel implementations (hot-restart, non-hot-restart with fake symbol table, non-hot-restart with real symbol table) make that harder.

@stale
Copy link

stale bot commented Jan 4, 2019

This pull request has been automatically marked as stale because it has not had activity in the last 7 days. It will be closed in 7 days if no further activity occurs. Please feel free to give a status update now, ping for review, or re-open when it's ready. Thank you for your contributions!

@stale stale bot added the stale stalebot believes this issue/PR has not been touched recently label Jan 4, 2019
@jmarantz
Copy link
Contributor Author

jmarantz commented Jan 4, 2019

/wait

@stale stale bot removed the stale stalebot believes this issue/PR has not been touched recently label Jan 4, 2019
@stale
Copy link

stale bot commented Jan 11, 2019

This pull request has been automatically marked as stale because it has not had activity in the last 7 days. It will be closed in 7 days if no further activity occurs. Please feel free to give a status update now, ping for review, or re-open when it's ready. Thank you for your contributions!

@stale stale bot added the stale stalebot believes this issue/PR has not been touched recently label Jan 11, 2019
@jmarantz
Copy link
Contributor Author

/wait

@stale stale bot removed the stale stalebot believes this issue/PR has not been touched recently label Jan 11, 2019
Signed-off-by: Joshua Marantz <jmarantz@google.com>
Signed-off-by: Joshua Marantz <jmarantz@google.com>
@jmarantz
Copy link
Contributor Author

/wait

per DM with Matt we can go ahead with this in parallel with the hot-restart via RPC mechanism that @fredlas is working on.

Copy link
Member

@htuch htuch left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Before diving into review, I'm curious if we have back-of-envelope SWAG for the potential savings here. I realize this PR isn't going to provide the measurable outcomes, but I'm curious what we're shooting for in eventual savings via symbol table. Do you have some thoughts @jmarantz?

@jmarantz
Copy link
Contributor Author

Yes, from where we are now ~20k/cluster savings. I have a completed integration in #4980 that has potential lock-contention risk, but shows the memory savings.

Copy link
Member

@htuch htuch left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I realize this is moving some existing interfaces around, but my main initial feedback is to make the public header for symbol_table.h to be linearly consumable and explain the concepts to the reader as if they've never seen the implementation and have no history of all the stats work. This is for both selfish reasons (as a reviewer who has been out of loop on the stats work, I don't have this context on hand) and also for the benefit of future folks who work in this part of the code.

include/envoy/stats/symbol_table.h Outdated Show resolved Hide resolved
include/envoy/stats/symbol_table.h Outdated Show resolved Hide resolved
include/envoy/stats/symbol_table.h Outdated Show resolved Hide resolved
include/envoy/stats/symbol_table.h Outdated Show resolved Hide resolved
include/envoy/stats/symbol_table.h Outdated Show resolved Hide resolved
include/envoy/stats/symbol_table.h Outdated Show resolved Hide resolved
include/envoy/stats/symbol_table.h Outdated Show resolved Hide resolved
include/envoy/stats/symbol_table.h Show resolved Hide resolved
include/envoy/stats/symbol_table.h Outdated Show resolved Hide resolved
include/envoy/stats/symbol_table.h Outdated Show resolved Hide resolved
Signed-off-by: Joshua Marantz <jmarantz@google.com>
Signed-off-by: Joshua Marantz <jmarantz@google.com>
Copy link
Contributor Author

@jmarantz jmarantz left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I hear you...I was hesitant about making SymbolTable into an abstract interface, as for performance/memory reasons it needs to expose some concepts to callers. The only reason I made it an interface here is to be able to have this intermediate state where we are using the SymbolTable API but with a fake implementation underneath that will never take locks.

Once we are fully transitioned to using the API, no locks will be taken in the hot-path. We could then remove the abstract interface.

Maybe a better approach is to skip the abstract API and just use more templating in the test. WDYT?

include/envoy/stats/symbol_table.h Outdated Show resolved Hide resolved
include/envoy/stats/symbol_table.h Outdated Show resolved Hide resolved
include/envoy/stats/symbol_table.h Outdated Show resolved Hide resolved
ambuc
ambuc previously approved these changes Jan 22, 2019
Copy link
Contributor

@ambuc ambuc left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this looks like a good path forward. As always, the stats system is becoming more and more complex. Would it make sense to try and separate the inline docs into two portions: one inline function-level documentation about /how/ to use the stats system as it is, and one README-level design doc (maybe colocated in /stats) about /why/ the stats system is designed the way it is? That way a new developer can start running with the stats system at any time without needing to understand a larger roadmap for it.

include/envoy/stats/symbol_table.h Outdated Show resolved Hide resolved
source/common/stats/symbol_table_impl.h Show resolved Hide resolved
Signed-off-by: Joshua Marantz <jmarantz@google.com>
Copy link
Member

@htuch htuch left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@jmarantz abstract API is fine, my ask around linear docs I think would be good either way, also +1 on what @ambuc suggests. I think this is a good move, just needs to be made clearer what is going on for folks who aren't living in the implementation day-to-day.

include/envoy/stats/symbol_table.h Outdated Show resolved Hide resolved
include/envoy/stats/symbol_table.h Outdated Show resolved Hide resolved
include/envoy/stats/symbol_table.h Outdated Show resolved Hide resolved
include/envoy/stats/symbol_table.h Outdated Show resolved Hide resolved
include/envoy/stats/symbol_table.h Outdated Show resolved Hide resolved
Signed-off-by: Joshua Marantz <jmarantz@google.com>
… that need access.

Signed-off-by: Joshua Marantz <jmarantz@google.com>
@jmarantz
Copy link
Contributor Author

@ambuc the README you are thinking of is in source/docs/stats.md . The stats system is still not using SymbolTable in any way, so I did not edit that doc in this PR.

include/envoy/stats/symbol_table.h Show resolved Hide resolved
include/envoy/stats/symbol_table.h Outdated Show resolved Hide resolved
source/common/stats/symbol_table_impl.cc Outdated Show resolved Hide resolved
source/common/stats/symbol_table_impl.cc Outdated Show resolved Hide resolved
source/common/stats/symbol_table_impl.h Outdated Show resolved Hide resolved
source/common/stats/symbol_table_impl.cc Show resolved Hide resolved
source/common/stats/symbol_table_impl.h Show resolved Hide resolved
Signed-off-by: Joshua Marantz <jmarantz@google.com>
Signed-off-by: Joshua Marantz <jmarantz@google.com>
include/envoy/stats/symbol_table.h Outdated Show resolved Hide resolved
include/envoy/stats/symbol_table.h Show resolved Hide resolved
source/common/stats/symbol_table_impl.cc Outdated Show resolved Hide resolved
source/common/stats/symbol_table_impl.cc Outdated Show resolved Hide resolved
//
// The maximum size of the list is 255 elements, so the length can fit in a
// byte. It would not be difficult to increase this, but there does not appear
// to be a current need.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As an aside, I keep thinking that we shouldn't have to reinvent the byte packed design here, I'm sure this is done elsewhere for efficient string packing, but I don't see anything in STD or absl..

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Understood; I'm sure this has been implemented somewhere, notably utf-8 encoders, but if it isn't in an already-used library it seems like it's easier to just have a robust tested impl here. WDYT?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd also be willing to factor this out and make it a separate tested packed-int abstraction in source/common/common, but I'd prefer to do that in a separate PR, as that code is already stable in the system and this PR is just abstrating the API to it.

test/common/stats/symbol_table_impl_test.cc Show resolved Hide resolved
Signed-off-by: Joshua Marantz <jmarantz@google.com>
…ol table case.

Signed-off-by: Joshua Marantz <jmarantz@google.com>
Signed-off-by: Joshua Marantz <jmarantz@google.com>
Signed-off-by: Joshua Marantz <jmarantz@google.com>
Signed-off-by: Joshua Marantz <jmarantz@google.com>
Signed-off-by: Joshua Marantz <jmarantz@google.com>
Copy link
Member

@htuch htuch left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Great, nice PR!

@htuch htuch merged commit 69964ba into envoyproxy:master Jan 30, 2019
@jmarantz jmarantz deleted the fake-symbol-table branch January 30, 2019 23:42
danzh2010 pushed a commit to danzh2010/envoy that referenced this pull request Jan 31, 2019
…olTable API without taking locks. (envoyproxy#5414)

Adds an abstract interface for SymbolTable and alternate implementation FakeSymbolTableImpl, which doesn't take locks. Once all stat tokens are symbolized at construction time, this FakeSymbolTable implementation can be deleted, and real-symbol tables can be used, thereby reducing memory and improving stat construction time per envoyproxy#3585 and envoyproxy#4980 . Note that it is not necessary to pre-allocate all elaborated stat names because multiple StatNames can be joined together without taking locks, even in SymbolTableImpl.

This implementation simply stores the characters directly in the uint8_t[] that backs each StatName, so there is no sharing or memory savings, but also no state associated with the SymbolTable, and thus no locks needed.

Risk Level: low
Testing: //test/common/stats/...

Signed-off-by: Joshua Marantz <jmarantz@google.com>
fredlas pushed a commit to fredlas/envoy that referenced this pull request Mar 5, 2019
…olTable API without taking locks. (envoyproxy#5414)

Adds an abstract interface for SymbolTable and alternate implementation FakeSymbolTableImpl, which doesn't take locks. Once all stat tokens are symbolized at construction time, this FakeSymbolTable implementation can be deleted, and real-symbol tables can be used, thereby reducing memory and improving stat construction time per envoyproxy#3585 and envoyproxy#4980 . Note that it is not necessary to pre-allocate all elaborated stat names because multiple StatNames can be joined together without taking locks, even in SymbolTableImpl.

This implementation simply stores the characters directly in the uint8_t[] that backs each StatName, so there is no sharing or memory savings, but also no state associated with the SymbolTable, and thus no locks needed.

Risk Level: low
Testing: //test/common/stats/...

Signed-off-by: Joshua Marantz <jmarantz@google.com>
Signed-off-by: Fred Douglas <fredlas@google.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants