Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Heartbeat autodiscover #8415

Merged
merged 23 commits into from
Oct 5, 2018
Merged

Heartbeat autodiscover #8415

merged 23 commits into from
Oct 5, 2018

Conversation

andrewvc
Copy link
Contributor

@andrewvc andrewvc commented Sep 24, 2018

This is a complete PR for heartbeat automatic reload aside from tests. Once those are added it will be ready for a true review.

This shares the adapter used by metricbeat, and moves that to libbeat to keep things DRY.

@andrewvc andrewvc added enhancement in progress Pull request is currently in progress. Heartbeat labels Sep 24, 2018
@andrewvc andrewvc requested a review from exekias September 24, 2018 17:05
Copy link
Contributor

@exekias exekias left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is looking great!

# docker.container.image: redis
# config:
# - type: tcp
# hosts: ["localhost:${data.port}"]
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Probably "${data.host}:${data.port}" is better?

@@ -26,20 +26,20 @@ import (
"github.com/elastic/beats/libbeat/common/bus"
)

// AutodiscoverAdapter for Metricbeat modules
type AutodiscoverAdapter struct {
// FactoryAdapter for Metricbeat modules
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

you will need to update the docs here (Metricbeat references), probably explain a little bit how this takes a factory and give you an autodiscover adapter

}

if err := bt.scheduler.Start(); err != nil {
return err
}
defer bt.scheduler.Stop()
defer func() {

}()
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

spurious defer

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oops, good catch.

@andrewvc
Copy link
Contributor Author

Thanks for the review @exekias ! I'll ping you once I've had the chance to add tests.

Copy link
Contributor

@ruflin ruflin left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm really excited to see autodiscovery in Heartbeat.

@@ -21,6 +21,8 @@ import (
"fmt"
"time"

"github.com/elastic/beats/libbeat/autodiscover"
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

import mess :-)

@@ -26,20 +26,20 @@ import (
"github.com/elastic/beats/libbeat/common/bus"
)

// AutodiscoverAdapter for Metricbeat modules
type AutodiscoverAdapter struct {
// FactoryAdapter is an adapter that works with any cfgfile.RunnerFactory.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good to see we generalised it.

@andrewvc andrewvc force-pushed the autodiscover branch 2 times, most recently from 3e0724b to ac43c34 Compare September 27, 2018 21:31
@andrewvc
Copy link
Contributor Author

Tests added & rebased off master. Will move out of WIP once CI passes

@andrewvc
Copy link
Contributor Author

I believe the test failures are unrelated, but there's one for metricbeat autodiscovery, so I'll run the suite again.

@andrewvc
Copy link
Contributor Author

jenkins, retest this please

@ruflin
Copy link
Contributor

ruflin commented Sep 28, 2018

On travis the metricbeat autodiscovery fails which I assume is related.

@andrewvc
Copy link
Contributor Author

@ruflin Yep. I've been spending today tracking it down, the metricbeat changes should have been a noop, but clearly that is not the case.

Narrowing down the cause, at the moment it's looking like somehow the hint builder is not being registered somehow.

@andrewvc
Copy link
Contributor Author

OK, just pushed d74b291 which I believe should fix it.

@andrewvc andrewvc changed the title WIP Heartbeat autodiscover Heartbeat autodiscover Sep 28, 2018
@andrewvc andrewvc added review and removed in progress Pull request is currently in progress. labels Sep 28, 2018
@andrewvc
Copy link
Contributor Author

andrewvc commented Sep 28, 2018

@ruflin this is basically done.

The lack of a docker-compose + Dockerfile was breaking this (and #8475 ). I opened #8491 to fix this as a singular PR, so that should be a blocker for this.

I just rebased this onto that PR. Techincally, #8491 is a blocker however.

Copy link
Contributor

@ruflin ruflin left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Left some comments for the testing setup. Please ping me if you have further questions.

environment:
- REDIS_HOST=redis
- REDIS_PORT=6379
- KIBANA_HOST=kibana
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Remove kibana

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, this are complicated now because they're split across this and #8491. This is already fixed there, let's close that one out and I'll rebase here.

)

proc = self.start_beat()
docker_client.images.pull('redis:3.2.4')
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If we set TEST_ENVIRONMENT to true, these containers should already be running.

There are currently two models of testing against containers we have:

  • Start everything at the beginning: Libbeat and others
  • Start on request and stop afterwards: Metricbeat modules

We seem to mix the two here. For simplicity I would start with setting TEST_ENVIRONMENT to true and not handle container start / stop in here.

kibana: { condition: service_healthy }
redis: { condition: service_healthy }

elasticsearch:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As the output in the system tests below is not Elasticsearch, do we need elasticsearch?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yep, we can get rid of that as well. should happen in the #8491 merge.

@andrewvc
Copy link
Contributor Author

andrewvc commented Oct 1, 2018

@ruflin let's put this on pause till #8491 is merged, since there's overlap on the docker stuff.

@andrewvc
Copy link
Contributor Author

andrewvc commented Oct 2, 2018

Rebased off current master with #8491 merged in

Copy link
Contributor

@ruflin ruflin left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code LGTM. Do we plan to backport this?

Changelog is missing.

@exekias Could you have a look?
@andrewvc There will be docs needed and I think also the reference config file should potentially be updated. Will you cover this in a follow up PR?

In general I'm surprised how little change was needed to enable autodiscovery 🎉

import docker
docker_client = docker.from_env()

self.render_config_template(
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For later: As we have ES running anyways, we could add some additional tests for http.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Discussed this on slack with @ruflin , agreed that adding a second test was not appropriate due to the fact that we're testing autodiscovery, not reloading.

@andrewvc
Copy link
Contributor Author

andrewvc commented Oct 3, 2018

Hmm, I borked the tests with some finagling of the AD tests. Not sure exactly how, but I think it's the looser condition I'm trying

@andrewvc
Copy link
Contributor Author

andrewvc commented Oct 3, 2018

jenkins, retest this please

@@ -1,3 +1,5 @@
logging.level: debug
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This should not have any impact as -e is used to capture the logs from stdout.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@andrewvc I would suggest to remove the line but don't want to break your tests (it shouldn't ;-) ). I worry that this might cause confusion in the future.

self.wait_until(lambda: self.log_contains(re.compile('autodiscover.+Got a start event:', re.I)))

self.wait_until(lambda: self.output_count(lambda x: x >= 1))
container.stop()
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This seems the line it fails. As redis is started outside, no need to stop the container here. I assume varialbe is not initialised?

@andrewvc
Copy link
Contributor Author

andrewvc commented Oct 4, 2018

Fixed some bugs, but this exposed a flimsy way of enumerating over docker hosts in the autodiscover tests in python in the tests. This needs to apply the same filter there.

@andrewvc
Copy link
Contributor Author

andrewvc commented Oct 5, 2018

hmmm, these pass locally but not remotely... will take another look tomorrow

@andrewvc andrewvc force-pushed the autodiscover branch 4 times, most recently from df9584d to 92407f0 Compare October 5, 2018 17:03
@andrewvc
Copy link
Contributor Author

andrewvc commented Oct 5, 2018

OK, phew tests finally passed, the issue was that locally the dict with docker network attrs had the IP duplicated to an extra field. network_settings['Networks'].values()[0]['IPAddress'] is the right magic to get it.

@andrewvc andrewvc merged commit cff3e40 into elastic:master Oct 5, 2018
@andrewvc andrewvc added v6.5.0 and removed needs_backport PR is waiting to be backported to other branches. labels Oct 5, 2018
andrewvc added a commit to andrewvc/beats that referenced this pull request Oct 16, 2018
Autodiscover support for Heartbeat + tests

(cherry picked from commit cff3e40)
andrewvc added a commit that referenced this pull request Oct 16, 2018
Autodiscover support for Heartbeat + tests

(cherry picked from commit cff3e40)
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants