[Heartbeat] Add service_name option for APM integration (elastic#19932) (elastic#20034)

Adds a new standard `service_name` option to the Heartbeat config file. While this was already possible with custom `fields`, adding it as a first-class option encourages use of this important field for the APM integration.

First step toward elastic/uptime#220

This PR also refactors some internals where we were already passing too many parameters, and adding `service_name` would have made that worse. We now pass a single struct (`stdfields.StdMonitorFields`) for the common monitor options, which cleans up a lot of the code.
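
For reference, the call sites in the diff below imply a struct along the lines of this sketch; it is not the verbatim source — the exact definition lives in `heartbeat/monitors/stdfields`, and the `ServiceName` field is inferred from the feature being added rather than shown in this diff:

----
// Sketch of the consolidated options struct (assumed shape, reconstructed
// from the wrappers.WrapCommon call sites in the diff below).
package stdfields

import (
	"time"

	"github.com/elastic/beats/v7/heartbeat/scheduler/schedule"
)

// StdMonitorFields groups the standard per-monitor options that were
// previously passed to wrappers.WrapCommon as separate parameters.
type StdMonitorFields struct {
	ID          string
	Name        string
	Type        string
	ServiceName string // assumed: added by this PR for the APM integration
	Schedule    *schedule.Schedule
	Timeout     time.Duration
}
----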

(cherry picked from commit 6197850)
andrewvc authored Jul 29, 2020
1 parent c6ffc58 commit 573cff9
Showing 21 changed files with 174 additions and 102 deletions.
1 change: 1 addition & 0 deletions CHANGELOG.next.asciidoc
@@ -263,6 +263,7 @@ field. You can revert this change by configuring tags for the module and omittin
- Fixed excessive memory usage introduced in 7.5 due to over-allocating memory for HTTP checks. {pull}15639[15639]
- Fixed scheduler shutdown issues which would in rare situations cause a panic due to semaphore misuse. {pull}16397[16397]
- Fixed TCP TLS checks to properly validate hostnames, this broke in 7.x and only worked for IP SANs. {pull}17549[17549]
- Add support for new `service_name` option to all monitors. {pull}19932[19932]

*Journalbeat*

3 changes: 3 additions & 0 deletions heartbeat/_meta/config/beat.reference.yml.tmpl
@@ -31,6 +31,9 @@ heartbeat.monitors:
# Human readable display name for this service in Uptime UI and elsewhere
name: my-icmp-monitor

# Name of corresponding APM service, if Elastic APM is in use for the monitored service.
# service_name: my-apm-service-name

# Enable/Disable monitor
#enabled: true

2 changes: 2 additions & 0 deletions heartbeat/_meta/config/beat.yml.tmpl
@@ -32,6 +32,8 @@ heartbeat.monitors:
schedule: '@every 10s'
# Total test connection and data exchange timeout
#timeout: 16s
# Name of corresponding APM service, if Elastic APM is in use for the monitored service.
#service_name: my-apm-service-name

{{header "Elasticsearch template setting"}}

17 changes: 13 additions & 4 deletions heartbeat/docs/getting-started.asciidoc
@@ -43,7 +43,7 @@ include::{libbeat-dir}/tab-widgets/install-widget.asciidoc[]
==== Other installation options

* <<setup-repositories,APT or YUM>>
* https://www.elastic.co/downloads/beats/{beatname_lc}[Download page]
* <<running-on-docker,Docker>>

[float]
@@ -58,7 +58,7 @@ include::{libbeat-dir}/shared/connecting-to-es.asciidoc[]

Heartbeat provides monitors to check the status of hosts at set intervals.
Heartbeat currently provides monitors for ICMP, TCP, and HTTP (see
<<heartbeat-overview>> for more about these monitors).

You configure each monitor individually. In +{beatname_lc}.yml+, specify the
list of monitors that you want to enable. Each item in the list begins with a
@@ -71,10 +71,19 @@ heartbeat.monitors:
- type: icmp
schedule: '*/5 * * * * * *' <1>
hosts: ["myhost"]
id: my-icmp-service
name: My ICMP Service
- type: tcp
schedule: '@every 5s' <2>
hosts: ["myhost:12345"]
mode: any <3>
id: my-tcp-service
- type: http
schedule: '@every 5s'
urls: ["http://example.net"]
service_name: apm-service-name <4>
id: my-http-service
name: My HTTP Service
----------------------------------------------------------------------
<1> The `icmp` monitor is scheduled to run exactly every 5 seconds (10:00:00,
10:00:05, and so on). The `schedule` option uses a cron-like syntax based on
@@ -83,7 +92,7 @@ https://github.com/gorhill/cronexpr#implementation[this `cronexpr` implementatio
was started. Heartbeat adds the `@every` keyword to the syntax provided by the
`cronexpr` package.
<3> The `mode` specifies whether to ping one IP (`any`) or all resolvable IPs
(`all`).
<4> The `service_name` field can be used to integrate Heartbeat with Elastic APM via the Uptime UI.
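
In practice the difference between the two schedule styles is alignment: cron-like expressions fire on wall-clock boundaries, while `@every` intervals count from the moment Heartbeat starts. An illustrative comparison (hosts and IDs are placeholders):

----
heartbeat.monitors:
- type: icmp
  id: aligned-check
  hosts: ["myhost"]
  schedule: '*/5 * * * * * *'   # fires at :00, :05, :10, ... of every minute
- type: icmp
  id: interval-check
  hosts: ["myhost"]
  schedule: '@every 5s'         # fires 5s after startup, then every 5s
----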

include::{libbeat-dir}/shared/config-check.asciidoc[]

@@ -106,7 +115,7 @@ include::{libbeat-dir}/tab-widgets/setup-widget.asciidoc[]
`-e` is optional and sends output to standard error instead of the configured log output.

This step loads the recommended {ref}/indices-templates.html[index template] for writing to {es}.
It does not install {beatname_uc} dashboards. Heartbeat dashboards and
installation steps are available in the
https://github.com/elastic/uptime-contrib[uptime-contrib] GitHub repository.

1 change: 1 addition & 0 deletions heartbeat/docs/heartbeat-options.asciidoc
@@ -38,6 +38,7 @@ heartbeat.monitors:
- type: http
id: service-status
name: Service Status
service_name: my-apm-service-name
hosts: ["http://localhost:80/service/status"]
check.response.status: [200]
schedule: '@every 5s'
8 changes: 8 additions & 0 deletions heartbeat/docs/monitors/monitor-common-options.asciidoc
@@ -32,6 +32,14 @@ it is recommended that you set this manually.
Optional human readable name for this monitor. This value appears in the <<exported-fields,exported fields>>
as `monitor.name`.


[float]
[[service-name]]
==== `service_name`

Optional APM service name for this monitor. Corresponds to the `service.name` ECS field. Set this when monitoring an app
that also uses APM, to enable integration between Uptime and APM data in Kibana.
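
An illustrative snippet (the values are placeholders; `service_name` should match the `service.name` reported by the app's APM agent):

----
heartbeat.monitors:
- type: http
  id: my-app-health
  name: My App Health
  service_name: my-apm-service-name
  urls: ["http://localhost:8080/healthz"]
  schedule: '@every 10s'
----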

[float]
[[monitor-enabled]]
==== `enabled`
3 changes: 3 additions & 0 deletions heartbeat/heartbeat.reference.yml
@@ -31,6 +31,9 @@ heartbeat.monitors:
# Human readable display name for this service in Uptime UI and elsewhere
name: my-icmp-monitor

# Name of corresponding APM service, if Elastic APM is in use for the monitored service.
# service_name: my-apm-service-name

# Enable/Disable monitor
#enabled: true

2 changes: 2 additions & 0 deletions heartbeat/heartbeat.yml
@@ -32,6 +32,8 @@ heartbeat.monitors:
schedule: '@every 10s'
# Total test connection and data exchange timeout
#timeout: 16s
# Name of corresponding APM service, if Elastic APM is in use for the monitored service.
#service_name: my-apm-service-name

# ======================= Elasticsearch template setting =======================

3 changes: 3 additions & 0 deletions heartbeat/monitors.d/sample.http.yml.disabled
@@ -10,6 +10,9 @@
# Human readable display name for this service in Uptime UI and elsewhere
name: My HTTP Monitor

# Name of corresponding APM service, if Elastic APM is in use for the monitored service.
#service_name: my-apm-service-name

# Enable/Disable monitor
#enabled: true

3 changes: 3 additions & 0 deletions heartbeat/monitors.d/sample.icmp.yml.disabled
@@ -10,6 +10,9 @@
# Human readable display name for this service in Uptime UI and elsewhere
name: My ICMP Monitor

# Name of corresponding APM service, if Elastic APM is in use for the monitored service.
#service_name: my-apm-service-name

# Enable/Disable monitor
#enabled: true

4 changes: 2 additions & 2 deletions heartbeat/monitors.d/sample.tcp.yml.disabled
@@ -12,8 +12,8 @@
# Human readable display name for this service in Uptime UI and elsewhere
name: My TCP monitor

# Monitor name used for job name and document type
#name: tcp
# Name of corresponding APM service, if Elastic APM is in use for the monitored service.
#service_name: my-apm-service-name

# Enable/Disable monitor
#enabled: true
11 changes: 6 additions & 5 deletions heartbeat/monitors/active/http/http_test.go
@@ -35,6 +35,7 @@ import (
"github.com/stretchr/testify/require"

"github.com/elastic/beats/v7/heartbeat/hbtest"
"github.com/elastic/beats/v7/heartbeat/monitors/stdfields"
"github.com/elastic/beats/v7/heartbeat/monitors/wrappers"
"github.com/elastic/beats/v7/heartbeat/scheduler/schedule"
"github.com/elastic/beats/v7/libbeat/beat"
@@ -78,8 +79,8 @@ func sendTLSRequest(t *testing.T, testURL string, useUrls bool, extraConfig map[
jobs, endpoints, err := create("tls", config)
require.NoError(t, err)

sched, _ := schedule.Parse("@every 1s")
job := wrappers.WrapCommon(jobs, "tls", "", "http", sched, time.Duration(0))[0]
sched := schedule.MustParse("@every 1s")
job := wrappers.WrapCommon(jobs, stdfields.StdMonitorFields{ID: "tls", Type: "http", Schedule: sched, Timeout: 1})[0]

event := &beat.Event{}
_, err = job(event)
@@ -318,7 +319,7 @@ func TestLargeResponse(t *testing.T) {
require.NoError(t, err)

sched, _ := schedule.Parse("@every 1s")
job := wrappers.WrapCommon(jobs, "test", "", "http", sched, time.Duration(0))[0]
job := wrappers.WrapCommon(jobs, stdfields.StdMonitorFields{ID: "test", Type: "http", Schedule: sched, Timeout: 1})[0]

event := &beat.Event{}
_, err = job(event)
@@ -514,7 +515,7 @@ func TestRedirect(t *testing.T) {
require.NoError(t, err)

sched, _ := schedule.Parse("@every 1s")
job := wrappers.WrapCommon(jobs, "test", "", "http", sched, time.Duration(0))[0]
job := wrappers.WrapCommon(jobs, stdfields.StdMonitorFields{ID: "test", Type: "http", Schedule: sched, Timeout: 1})[0]

// Run this test multiple times since in the past we had an issue where the redirects
// list was added onto by each request. See https://github.com/elastic/beats/pull/15944
@@ -561,7 +562,7 @@ func TestNoHeaders(t *testing.T) {
require.NoError(t, err)

sched, _ := schedule.Parse("@every 1s")
job := wrappers.WrapCommon(jobs, "test", "", "http", sched, time.Duration(0))[0]
job := wrappers.WrapCommon(jobs, stdfields.StdMonitorFields{ID: "test", Type: "http", Schedule: sched, Timeout: 1})[0]

event := &beat.Event{}
_, err = job(event)
3 changes: 2 additions & 1 deletion heartbeat/monitors/active/icmp/icmp_test.go
@@ -28,6 +28,7 @@ import (
"github.com/elastic/beats/v7/heartbeat/hbtest"
"github.com/elastic/beats/v7/heartbeat/look"
"github.com/elastic/beats/v7/heartbeat/monitors"
"github.com/elastic/beats/v7/heartbeat/monitors/stdfields"
"github.com/elastic/beats/v7/heartbeat/monitors/wrappers"
"github.com/elastic/beats/v7/heartbeat/scheduler/schedule"
"github.com/elastic/beats/v7/libbeat/beat"
@@ -69,7 +70,7 @@ func execTestICMPCheck(t *testing.T, cfg Config) (mockLoop, *beat.Event) {
require.Equal(t, 1, endpoints)
e := &beat.Event{}
sched, _ := schedule.Parse("@every 1s")
wrapped := wrappers.WrapCommon(j, "test", "", "icmp", sched, time.Duration(0))
wrapped := wrappers.WrapCommon(j, stdfields.StdMonitorFields{ID: "test", Type: "icmp", Schedule: sched, Timeout: 1})
wrapped[0](e)
return tl, e
}
7 changes: 3 additions & 4 deletions heartbeat/monitors/active/tcp/helpers_test.go
@@ -22,13 +22,12 @@ import (
"net/http"
"net/http/httptest"
"testing"
"time"

"github.com/pkg/errors"

"github.com/stretchr/testify/require"

"github.com/elastic/beats/v7/heartbeat/hbtest"
"github.com/elastic/beats/v7/heartbeat/monitors/stdfields"
"github.com/elastic/beats/v7/heartbeat/monitors/wrappers"
"github.com/elastic/beats/v7/heartbeat/scheduler/schedule"
"github.com/elastic/beats/v7/libbeat/beat"
@@ -42,8 +41,8 @@ func testTCPConfigCheck(t *testing.T, configMap common.MapStr, host string, port
jobs, endpoints, err := create("tcp", config)
require.NoError(t, err)

sched, _ := schedule.Parse("@every 1s")
job := wrappers.WrapCommon(jobs, "test", "", "tcp", sched, time.Duration(0))[0]
sched := schedule.MustParse("@every 1s")
job := wrappers.WrapCommon(jobs, stdfields.StdMonitorFields{ID: "test", Type: "tcp", Schedule: sched, Timeout: 1})[0]

event := &beat.Event{}
_, err = job(event)
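Note that this helper also swaps `schedule.Parse`, whose error was being silently discarded, for `schedule.MustParse`. A minimal self-contained sketch of that pattern, using stand-in types (the real implementation lives in heartbeat/scheduler/schedule):

----
// Sketch of the Parse/MustParse pair with stand-in types. MustParse panics
// on a bad expression, which is appropriate in test setup where the
// schedule string is a constant and an error should fail loudly instead of
// being dropped via `sched, _ := Parse(...)`.
package schedule

import "fmt"

// Schedule is a stand-in for the real parsed-schedule type.
type Schedule struct{ expr string }

// Parse is a stand-in parser; here it only rejects empty expressions.
func Parse(expr string) (*Schedule, error) {
	if expr == "" {
		return nil, fmt.Errorf("empty schedule expression")
	}
	return &Schedule{expr: expr}, nil
}

// MustParse wraps Parse and panics on error.
func MustParse(expr string) *Schedule {
	s, err := Parse(expr)
	if err != nil {
		panic(err)
	}
	return s
}
----
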
6 changes: 3 additions & 3 deletions heartbeat/monitors/active/tcp/tls_test.go
@@ -27,8 +27,8 @@ import (
"os"
"strconv"
"testing"
"time"

"github.com/elastic/beats/v7/heartbeat/monitors/stdfields"
"github.com/elastic/beats/v7/heartbeat/monitors/wrappers"
"github.com/elastic/beats/v7/heartbeat/scheduler/schedule"
"github.com/elastic/beats/v7/libbeat/beat"
@@ -187,8 +187,8 @@ func testTLSTCPCheck(t *testing.T, host string, port uint16, certFileName string
jobs, endpoints, err := createWithResolver(config, resolver)
require.NoError(t, err)

sched, _ := schedule.Parse("@every 1s")
job := wrappers.WrapCommon(jobs, "test", "", "tcp", sched, time.Duration(0))[0]
sched := schedule.MustParse("@every 1s")
job := wrappers.WrapCommon(jobs, stdfields.StdMonitorFields{ID: "test", Type: "tcp", Schedule: sched, Timeout: 1})[0]

event := &beat.Event{}
_, err = job(event)
32 changes: 15 additions & 17 deletions heartbeat/monitors/monitor.go
@@ -23,6 +23,8 @@ import (
"fmt"
"sync"

"github.com/elastic/beats/v7/heartbeat/monitors/stdfields"

"github.com/mitchellh/hashstructure"
"github.com/pkg/errors"

@@ -38,9 +40,7 @@
// Monitor represents a configured recurring monitoring configuredJob loaded from a config file. Starting it
// will cause it to run with the given scheduler until Stop() is called.
type Monitor struct {
id string
name string
typ string
stdFields stdfields.StdMonitorFields
pluginName string
config *common.Config
registrar *pluginsReg
@@ -68,7 +68,7 @@
// String prints a description of the monitor in a threadsafe way. It is important that this use threadsafe
// values because it may be invoked from another thread in cfgfile/runner.
func (m *Monitor) String() string {
return fmt.Sprintf("Monitor<pluginName: %s, enabled: %t>", m.name, m.enabled)
return fmt.Sprintf("Monitor<pluginName: %s, enabled: %t>", m.stdFields.Name, m.enabled)
}

func checkMonitorConfig(config *common.Config, registrar *pluginsReg, allowWatches bool) error {
@@ -120,20 +120,18 @@
// Extract just the Id, Type, and Enabled fields from the config
// We'll parse things more precisely later once we know what exact type of
// monitor we have
mpi, err := pluginInfo(config)
stdFields, err := stdfields.ConfigToStdMonitorFields(config)
if err != nil {
return nil, err
}

monitorPlugin, found := registrar.get(mpi.Type)
monitorPlugin, found := registrar.get(stdFields.Type)
if !found {
return nil, fmt.Errorf("monitor type %v does not exist, valid types are %v", mpi.Type, registrar.monitorNames())
return nil, fmt.Errorf("monitor type %v does not exist, valid types are %v", stdFields.Type, registrar.monitorNames())
}

m := &Monitor{
id: mpi.ID,
name: mpi.Name,
typ: mpi.Type,
stdFields: stdFields,
pluginName: monitorPlugin.name,
scheduler: scheduler,
configuredJobs: []*configuredJob{},
@@ -144,22 +142,22 @@
stats: monitorPlugin.stats,
}

if m.id != "" {
if m.stdFields.ID != "" {
// Ensure we don't have duplicate IDs
if _, loaded := uniqueMonitorIDs.LoadOrStore(m.id, m); loaded {
return m, ErrDuplicateMonitorID{m.id}
if _, loaded := uniqueMonitorIDs.LoadOrStore(m.stdFields.ID, m); loaded {
return m, ErrDuplicateMonitorID{m.stdFields.ID}
}
} else {
// If there's no explicit ID generate one
hash, err := m.configHash()
if err != nil {
return m, err
}
m.id = fmt.Sprintf("auto-%s-%#X", m.typ, hash)
m.stdFields.ID = fmt.Sprintf("auto-%s-%#X", m.stdFields.Type, hash)
}

rawJobs, endpoints, err := monitorPlugin.create(config)
wrappedJobs := wrappers.WrapCommon(rawJobs, m.id, m.name, m.typ, mpi.Schedule, mpi.Timeout)
wrappedJobs := wrappers.WrapCommon(rawJobs, m.stdFields)
m.endpoints = endpoints

if err != nil {
Expand All @@ -181,7 +179,7 @@ func newMonitorUnsafe(
return m, ErrWatchesDisabled
}

logp.Info(`Obsolete option 'watch.poll_file' declared. This will be removed in a future release.
See https://www.elastic.co/guide/en/beats/heartbeat/current/configuration-heartbeat-options.html for more info`)
}

@@ -330,5 +328,5 @@ func (m *Monitor) Stop() {

func (m *Monitor) freeID() {
// Free up the monitor ID for reuse
uniqueMonitorIDs.Delete(m.id)
uniqueMonitorIDs.Delete(m.stdFields.ID)
}
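
The duplicate-ID guard above relies on `sync.Map.LoadOrStore`, which makes check-and-reserve a single atomic step. A standalone illustration of the pattern (names here are illustrative, not from the source):

----
// Illustration of atomic ID reservation with sync.Map.LoadOrStore, the
// pattern uniqueMonitorIDs uses above.
package main

import (
	"fmt"
	"sync"
)

var registry sync.Map

// register reserves id atomically; a second caller with the same id gets
// an error instead of overwriting the first registration.
func register(id string, v interface{}) error {
	if _, loaded := registry.LoadOrStore(id, v); loaded {
		return fmt.Errorf("duplicate monitor ID: %s", id)
	}
	return nil
}

func main() {
	fmt.Println(register("my-monitor", 1)) // <nil>
	fmt.Println(register("my-monitor", 2)) // duplicate monitor ID: my-monitor
}
----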