stats: add new TextReadout stat type #10639

Merged
merged 21 commits into from
Apr 17, 2020

Conversation

efimki
Contributor

@efimki efimki commented Apr 3, 2020

Description: Add the TextReadout stat type, for holding strings.
Currently ignored (not preserved) during hot restart.
If importing text readouts is desired in the future, that could be controlled by a mode bit, as it is for gauges.
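
For illustration only, here is a minimal sketch of what a string-holding stat could look like. `TextReadoutSketch`, its methods, and the stat name used below are hypothetical stand-ins and are not the class added by this PR; the mutex is an assumption about how a string value might be protected, since a string cannot be updated with a single atomic operation the way a counter or gauge can.

```cpp
#include <cassert>
#include <mutex>
#include <string>
#include <utility>

// Hypothetical, simplified stand-in for a text-readout stat (not Envoy's class).
class TextReadoutSketch {
public:
  explicit TextReadoutSketch(std::string name) : name_(std::move(name)) {}

  // Replaces the stored string under a lock.
  void set(std::string value) {
    std::lock_guard<std::mutex> lock(mutex_);
    value_ = std::move(value);
  }

  // Returns a copy so callers never observe a partially written string.
  std::string value() const {
    std::lock_guard<std::mutex> lock(mutex_);
    return value_;
  }

  const std::string& name() const { return name_; }

private:
  const std::string name_;
  mutable std::mutex mutex_;
  std::string value_;
};

// Usage: the readout starts empty and reflects the most recent set().
inline void textReadoutExample() {
  TextReadoutSketch readout("example.text_readout"); // made-up stat name
  assert(readout.value().empty());
  readout.set("first");
  readout.set("second");
  assert(readout.value() == "second");
}
```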

Risk Level: low
Testing: roughly duplicated every bit of test logic involving the word "gauge".

Fixes #5790 @fredlas @jmarantz

@repokitteh-read-only

CC @envoyproxy/api-shepherds: Your approval is needed for changes made to api/.

🐱

Caused by: #10639 was synchronize by efimki.

see: more, trace.

@efimki efimki force-pushed the textstats branch 2 times, most recently from 6773a4e to 9ec2e88 Compare April 6, 2020 16:52
efimki added 3 commits April 6, 2020 14:30
Based on envoyproxy#5844 by @fredlas.

Signed-off-by: Misha Efimov <mef@google.com>
Signed-off-by: Misha Efimov <mef@google.com>
Signed-off-by: Misha Efimov <mef@google.com>
Contributor

@jmarantz jmarantz left a comment

looks great; left a few minor comments.

One larger question: what do we want the hot-restart behavior to be? Do we transfer text-readout values from parent to child?

We might need to have an ImportMode like Gauge, because if parent and child have conflicting text-readout values it won't be obvious how to combine them.

include/envoy/stats/scope.h (outdated, resolved)
source/common/stats/allocator_impl.cc (outdated, resolved)
source/common/stats/allocator_impl.cc (resolved)
source/common/stats/isolated_store_impl.h (outdated, resolved)
@jmarantz
Contributor

jmarantz commented Apr 7, 2020

More thoughts:

  • How should text-readout be displayed in the admin /stats endpoint? Please be sure to properly HTML-escape output from the text, to avoid XSS (or just weird formatting in the output).
  • How should text-readout be sent to Prometheus?
  • It's probably worth auditing all calls to counters() and gauges() in source/... to see if we need similar handling for text-readouts.
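
To make the escaping concern in the first bullet concrete, here is a minimal sketch of HTML-escaping a text value before writing it into a page. `htmlEscape` is a hypothetical helper written for this example; it is not claimed to be the utility Envoy's admin handler uses.

```cpp
#include <cassert>
#include <string>

// Hypothetical helper: replaces the five characters that are significant in
// HTML text and attribute contexts with their entity equivalents.
std::string htmlEscape(const std::string& raw) {
  std::string out;
  out.reserve(raw.size());
  for (char c : raw) {
    switch (c) {
    case '&':
      out += "&amp;";
      break;
    case '<':
      out += "&lt;";
      break;
    case '>':
      out += "&gt;";
      break;
    case '"':
      out += "&quot;";
      break;
    case '\'':
      out += "&#39;";
      break;
    default:
      out += c;
      break;
    }
  }
  return out;
}

// Usage: markup embedded in a stat value is rendered inert.
inline void htmlEscapeExample() {
  assert(htmlEscape("<b>&</b>") == "&lt;b&gt;&amp;&lt;/b&gt;");
}
```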

@htuch
Member

htuch commented Apr 7, 2020

Please merge master to pick up #10672. We no longer accept changes to v2 (without explicit exception), so any API modifications should happen in v3. If this PR is adding a new proto, please follow the updated instructions in https://github.com/envoyproxy/envoy/blob/master/api/STYLE.md#adding-an-extension-configuration-to-the-api.

Signed-off-by: Misha Efimov <mef@google.com>
Contributor Author

@efimki efimki left a comment

@jmarantz thanks a lot for your comments!
I don't have a good answer for an excellent question about hot-restart behavior. I guess it would depend on the use case.
I don't think any value preservation is needed for my current use case, which IIUC also avoids the question about conflicting values.
WDYT?

source/common/stats/isolated_store_impl.h (outdated, resolved)
Signed-off-by: Misha Efimov <mef@google.com>
@jmarantz
Contributor

jmarantz commented Apr 7, 2020

Simply not importing them during hot restart is fine for now. You could leave a comment stating that if importing text-readout is desired in the future, that could be a mode bit like gauge's, and that it could be added later.
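
As a rough sketch of what such a mode bit might look like if it were added later: `TextImportMode`, its two values, and `resolveOnHotRestart` are all hypothetical names invented for this example. Only the "never import" behavior corresponds to what is agreed for this PR; the second mode is just one conceivable way to sidestep conflicting parent and child values.

```cpp
#include <cassert>
#include <string>

// Hypothetical mode bit, loosely inspired by the gauge import mode.
enum class TextImportMode {
  NeverImport,   // Behavior agreed for now: the child ignores the parent's value.
  ImportIfUnset, // Conceivable future mode: adopt the parent's value only if
                 // the child has not set one, so there is never a conflict.
};

// Returns the value the child process would report after hot restart.
std::string resolveOnHotRestart(TextImportMode mode, const std::string& parent_value,
                                const std::string& child_value) {
  if (mode == TextImportMode::ImportIfUnset && child_value.empty()) {
    return parent_value;
  }
  return child_value;
}

// Usage: with NeverImport the parent's value is dropped.
inline void hotRestartExample() {
  assert(resolveOnHotRestart(TextImportMode::NeverImport, "parent", "").empty());
  assert(resolveOnHotRestart(TextImportMode::ImportIfUnset, "parent", "") == "parent");
}
```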

Signed-off-by: Misha Efimov <mef@google.com>
@efimki
Contributor Author

efimki commented Apr 7, 2020

FYI, I think "efimki requested review from alyssawilk, htuch, lizan, PiotrSikora and snowp as code owners yesterday" was auto-requested by repokitteh bot that noticed some spurious commits pulled from upstream and unintentionally included into this PR due to my github workflow clumsiness.

@jmarantz
Contributor

jmarantz commented Apr 7, 2020

Looks good to me generally, but:

  • You need to do a master merge to resolve stats_integration_test. When you do that, you should manually re-test with "-c opt --test_env=ENVOY_MEMORY_TEST_EXACT=true" to iterate toward the correct regold.
  • We need to decide what to do about admin /stats and audit other uses of .counters() and .gauges().

efimki added 2 commits April 8, 2020 17:38
Signed-off-by: Misha Efimov <mef@google.com>
Signed-off-by: Misha Efimov <mef@google.com>
@efimki
Contributor Author

efimki commented Apr 9, 2020

@jmarantz - thanks for your review and suggestions.
I've re-run the test with "-c opt --test_env=ENVOY_MEMORY_TEST_EXACT=true" and updated numbers.

With regard to escaping text readout values, is there a common pattern that I could follow?

I'm wondering whether it makes sense to replace the .value() getter with safeHtmlEscapedValue() and unsafeRawStringValue() getters, to make it clear that the stored value is not escaped.

I think admin /stats will NOT be affected by this PR, as it has to explicitly iterate TextReadout stats. I'll be happy to modify admin /stats in this PR, but I'm not sure whether it is the right thing to do. WDYT?
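
A small sketch of why an existing handler would be unaffected: if text readouts live in their own collection, code that only walks counters and gauges never sees them. `StoreSketch`, `dumpNumericStats`, and `dumpAllStats` are hypothetical names for this example, and the real Envoy store interfaces differ.

```cpp
#include <cassert>
#include <cstdint>
#include <map>
#include <string>

// Hypothetical, simplified store with one collection per stat type.
struct StoreSketch {
  std::map<std::string, uint64_t> counters;
  std::map<std::string, uint64_t> gauges;
  std::map<std::string, std::string> text_readouts;
};

// Stands in for a pre-existing handler that knows only counters and gauges.
std::string dumpNumericStats(const StoreSketch& store) {
  std::string out;
  for (const auto& kv : store.counters) {
    out += kv.first + ": " + std::to_string(kv.second) + "\n";
  }
  for (const auto& kv : store.gauges) {
    out += kv.first + ": " + std::to_string(kv.second) + "\n";
  }
  return out;
}

// A handler has to opt in to text readouts explicitly.
// Escaping of the value is deliberately omitted in this sketch.
std::string dumpAllStats(const StoreSketch& store) {
  std::string out = dumpNumericStats(store);
  for (const auto& kv : store.text_readouts) {
    out += kv.first + ": \"" + kv.second + "\"\n";
  }
  return out;
}

// Usage: the numeric-only dump silently omits the text readout.
inline void dumpExample() {
  StoreSketch store;
  store.counters["requests"] = 3;
  store.text_readouts["label"] = "abc";
  assert(dumpNumericStats(store) == "requests: 3\n");
  assert(dumpAllStats(store) == "requests: 3\nlabel: \"abc\"\n");
}
```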

@jmarantz
Contributor

jmarantz commented Apr 9, 2020

I'm OK with a follow-up to add visibility to the stats. But I would definitely like to have that follow-up, since Envoy developers who see a new stat type would generally expect to be able to get at the value through the UI or their favorite stats monitoring system (e.g. Prometheus). I'm not sure what support those systems would have for a text-readout type, but at a minimum it should be exposed on the admin /stats endpoint.

It's a fair argument that you don't need to become an expert in every stats sink to add a new stats type, but you should definitely sprinkle some TODOs around for others to note that there is data that is collected and not exposed.

I don't think it's necessary for you to handle the HTML escaping or rename the methods in the stats type. HTML is just one of many possible output formats that might need some kind of escaping, another being JSON. You don't need to comment about all possible outputs in the value() method here.
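
To illustrate the point that escaping belongs to each output format rather than to the stat itself, here is a minimal sketch of a JSON string escaper that a JSON-emitting sink could apply to the raw value. `jsonEscape` is a hypothetical helper for this example, not an existing Envoy function, and it handles only quotes, backslashes, and ASCII control characters.

```cpp
#include <cassert>
#include <string>

// Hypothetical helper: escapes a raw string for use inside a JSON string
// literal. Bytes >= 0x20 other than '"' and '\\' are passed through as-is.
std::string jsonEscape(const std::string& raw) {
  static const char kHex[] = "0123456789abcdef";
  std::string out;
  out.reserve(raw.size());
  for (unsigned char c : raw) {
    switch (c) {
    case '"':
      out += "\\\"";
      break;
    case '\\':
      out += "\\\\";
      break;
    case '\n':
      out += "\\n";
      break;
    case '\r':
      out += "\\r";
      break;
    case '\t':
      out += "\\t";
      break;
    default:
      if (c < 0x20) {
        // Remaining control characters use the \u00XX form.
        out += "\\u00";
        out += kHex[c >> 4];
        out += kHex[c & 0x0f];
      } else {
        out += static_cast<char>(c);
      }
      break;
    }
  }
  return out;
}

// Usage: the same raw value would need different treatment for HTML output.
inline void jsonEscapeExample() {
  assert(jsonEscape("say \"hi\"") == "say \\\"hi\\\"");
}
```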

Signed-off-by: Misha Efimov <mef@google.com>
mattklein123
mattklein123 previously approved these changes Apr 15, 2020
Member

@mattklein123 mattklein123 left a comment

Thanks!

@efimki
Contributor Author

efimki commented Apr 15, 2020

Hrm, the CircleCI failure seems unrelated and not actionable.
Any ideas how to fix this?
Should I try to kick-ci again?

[ RUN      ] Protocols/BufferIntegrationTest.RouteOverride/IPv4_Http2Downstream_HttpUpstream
#0 Envoy::SignalAction::sigHandler() [0x10dd7cb4]
#1 __restore_rt [0x7fe30341d390]
#2 Envoy::BaseIntegrationTest::createEnvoy() [0xefa4ad7]
#3 Envoy::BaseIntegrationTest::initialize() [0xefa36ba]
#4 __llvm_coverage_mapping [0x9ad1263]
#5 testing::internal::HandleSehExceptionsInMethodIfSupported<>() [0x12215686]
#6 testing::internal::HandleExceptionsInMethodIfSupported<>() [0x12201c51]
#7 testing::Test::Run() [0x121ec522]
#8 testing::TestInfo::Run() [0x121ed028]
#9 testing::TestSuite::Run() [0x121ed799]
#10 testing::internal::UnitTestImpl::RunAllTests() [0x121fa7a6]
#11 testing::internal::HandleSehExceptionsInMethodIfSupported<>() [0x1221a526]
#12 testing::internal::HandleExceptionsInMethodIfSupported<>() [0x12204a71]
#13 testing::UnitTest::Run() [0x121fa1f0]
#14 RUN_ALL_TESTS() [0x1076b235]
#15 Envoy::TestRunner::RunTests() [0x1076a936]
#16 main [0x10768c67]
#17 __libc_start_main [0x7fe303062830]
external/bazel_tools/tools/test/collect_coverage.sh: line 131: 16404 Aborted                 (core dumped) "$@"

@mattklein123
Member

The clang-tidy errors are legit. Can you fix them? Coverage is a flake and we can rerun if needed. There are known issues there that are being worked on.

/wait

Signed-off-by: Misha Efimov <mef@google.com>
Signed-off-by: Misha Efimov <mef@google.com>
Member

@mattklein123 mattklein123 left a comment

Thanks!

@efimki
Contributor Author

efimki commented Apr 16, 2020

Hrm, "macOS was canceled" after 6 hours.
What's a good next step? kick-ci?

@jmarantz
Contributor

/azp run

@azure-pipelines

Azure Pipelines successfully started running 1 pipeline(s), but failed to run 2 pipeline(s).

@mattklein123
Member

/retest

@repokitteh-read-only

🐴 hold your horses - no failures detected, yet.

🐱

Caused by: a #10639 (comment) was created by @mattklein123.

see: more, trace.

@mattklein123
Member

Please push an empty commit to kick CircleCI; I don't know what is broken there.

Signed-off-by: Misha Efimov <mef@google.com>
@mattklein123
Member

Sigh, CircleCI. I don't know what is broken. I will try closing and reopening. If that doesn't work, you will probably need to make a fresh PR and we can get this merged. I just want to make sure API/docs/etc. are all passing.

@efimki
Contributor Author

efimki commented Apr 17, 2020

Looks like the macOS pipeline timed out after 6 hours, but AFAICT the tests that were executed have passed:

https://dev.azure.com/cncf/envoy/_build/results?buildId=37491&view=logs&j=a5e52b91-c83f-5429-4a68-c246fc63a4f7&t=5852bf5a-5a02-52f3-bee8-4fdc90cda9d0

/azp run

@efimki
Contributor Author

efimki commented Apr 17, 2020

/azp run

@azure-pipelines

Commenter does not have sufficient privileges for PR 10639 in repo envoyproxy/envoy

@jmarantz jmarantz merged commit a9c2333 into envoyproxy:master Apr 17, 2020
@efimki
Contributor Author

efimki commented Apr 17, 2020

Woo-hoo, thanks Joshua!

@efimki efimki deleted the textstats branch April 17, 2020 16:32
penguingao pushed a commit to penguingao/envoy that referenced this pull request Apr 22, 2020
* stats: add new TextReadout stat type
Based on envoyproxy#5844 by @fredlas.

Signed-off-by: Misha Efimov <mef@google.com>
Signed-off-by: pengg <pengg@google.com>

Successfully merging this pull request may close these issues.

stats: New type(s) of Metric
4 participants