Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Export expvar metrics of badger to the metricsFactory #1704

Merged
merged 10 commits into from
Aug 7, 2019

Conversation

burmanm
Copy link
Contributor

@burmanm burmanm commented Aug 1, 2019

Which problem is this PR solving?

Badger's internal metrics are not available through jaeger's metrics interface. This adds them so they can be monitored.

Short description of the changes

Copy the metric definitions from expvar and then sync them manually in a background thread. I have a nagging feeling this can't be the best solution though, but I didn't find metricsFactory to have "external collectors" that would fetch the data from there.

Signed-off-by: Michael Burman <yak@iki.fi>
@@ -150,7 +161,8 @@ func (f *Factory) CreateDependencyReader() (dependencystore.Reader, error) {

// Close Implements io.Closer and closes the underlying storage
func (f *Factory) Close() error {
f.maintenanceDone <- true
f.maintenanceDone <- true // maintenance close
f.maintenanceDone <- true // metrics close
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Shouldn't it be metricsTicker?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No, the ticker is closed when the select ends.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why are we writing twice to the same channel? If it's only meant to be read-once for exit signal, then I think the usual practice for that is to close the channel.

…ld be every 5m not 5s and default table loading mode should stick to memory mapping

Signed-off-by: Michael Burman <yak@iki.fi>
waiter := func(previousValue int64) int64 {
sleeps := 0
_, gs := mFactory.Snapshot()
for gs["adger_memtable_gets_total"] == previousValue && sleeps < 8 {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why "adger"?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Typo. It always sleeps the maximum with that typo, so the test functions correctly - but might take too long. I'll fix that.

@yurishkuro
Copy link
Member

so badger code does not use any abstraction for metrics, only writes directly to expvar?

@burmanm
Copy link
Contributor Author

burmanm commented Aug 3, 2019

Yes, the badger's metric definitions are here:

https://github.com/dgraph-io/badger/blob/master/y/metrics.go

And they're written like this:

https://github.com/dgraph-io/badger/blob/master/db.go#L1009

@@ -150,7 +161,8 @@ func (f *Factory) CreateDependencyReader() (dependencystore.Reader, error) {

// Close Implements io.Closer and closes the underlying storage
func (f *Factory) Close() error {
f.maintenanceDone <- true
f.maintenanceDone <- true // maintenance close
f.maintenanceDone <- true // metrics close
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why are we writing twice to the same channel? If it's only meant to be read-once for exit signal, then I think the usual practice for that is to close the channel.

mapVal.Do(func(innerKv expvar.KeyValue) {
// The metrics we're interested in have only a single inner key (dynamic name)
// and we're only interested in its value
if intVal, ok := innerKv.Value.(*expvar.Int); ok {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this is catching the metrics declared like this?

NumLSMGets = expvar.NewMap("badger_lsm_level_gets_total")

Why does Badger use a map in this case? It looks like it could have more than one entry in the map, but we're collapsing all under the map's name.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

btw, if multiple values are possible, we could put that value into a tag on the metric.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Currently it is storing in the map the directory of the LSM or Value. We have that information in the Factory properties also, but it might indeed make sense to export that information somewhere. Unless there's a place to view properties already in Jaeger?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

With the tags it would look like this:

# HELP jaeger_badger_lsm_size_bytes badger_lsm_size_bytes
# TYPE jaeger_badger_lsm_size_bytes gauge
jaeger_badger_lsm_size_bytes{directory="/tmp/badger043433444"} 0

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

as long as there's a guarantee that map has only one entry, it makes more sense to go without the dir name in the label (since it might be different after restart)

g := metricsFactory.Gauge(metrics.Options{Name: kv.Key})
f.metrics.badgerMetrics[kv.Key] = g
} else if mapVal, ok := kv.Value.(*expvar.Map); ok {
mapVal.Do(func(innerKv expvar.KeyValue) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

wouldn't this map always be empty at this point?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No, since badger.Open() is called before the metrics are fetched. And badger.Open() registers the expvars (it calculates the initial sizes during Open()).

mFactory := metricstest.NewFactory(0)
f.Initialize(mFactory, zap.NewNop())
assert.NotNil(t, f.metrics.badgerMetrics)
_, found := f.metrics.badgerMetrics["badger_memtable_gets_total"]
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

would be good to see a test for the expvar.Map metrics as well.

… map inner key for metric tag

Signed-off-by: Michael Burman <yak@iki.fi>
@codecov
Copy link

codecov bot commented Aug 5, 2019

Codecov Report

❗ No coverage uploaded for pull request base (master@4d698ef). Click here to learn what that means.
The diff coverage is 100%.

Impacted file tree graph

@@            Coverage Diff            @@
##             master    #1704   +/-   ##
=========================================
  Coverage          ?   98.51%           
=========================================
  Files             ?      193           
  Lines             ?     9314           
  Branches          ?        0           
=========================================
  Hits              ?     9176           
  Misses            ?      110           
  Partials          ?       28
Impacted Files Coverage Δ
plugin/storage/badger/factory.go 98.09% <100%> (ø)
plugin/storage/badger/options.go 100% <100%> (ø)

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 4d698ef...b23abb3. Read the comment docs.

@yurishkuro yurishkuro merged commit dcaea9d into jaegertracing:master Aug 7, 2019
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants