-
Notifications
You must be signed in to change notification settings - Fork 4.9k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Heartbeat autodiscover #8415
Heartbeat autodiscover #8415
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is looking great!
heartbeat/heartbeat.reference.yml
Outdated
# docker.container.image: redis | ||
# config: | ||
# - type: tcp | ||
# hosts: ["localhost:${data.port}"] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Probably "${data.host}:${data.port}"
is better?
@@ -26,20 +26,20 @@ import ( | |||
"github.com/elastic/beats/libbeat/common/bus" | |||
) | |||
|
|||
// AutodiscoverAdapter for Metricbeat modules | |||
type AutodiscoverAdapter struct { | |||
// FactoryAdapter for Metricbeat modules |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
you will need to update the docs here (Metricbeat references), probably explain a little bit how this takes a factory and give you an autodiscover adapter
heartbeat/beater/heartbeat.go
Outdated
} | ||
|
||
if err := bt.scheduler.Start(); err != nil { | ||
return err | ||
} | ||
defer bt.scheduler.Stop() | ||
defer func() { | ||
|
||
}() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
spurious defer
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oops, good catch.
013def8
to
ede6ed5
Compare
Thanks for the review @exekias ! I'll ping you once I've had the chance to add tests. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm really excited to see autodiscovery in Heartbeat.
heartbeat/beater/heartbeat.go
Outdated
@@ -21,6 +21,8 @@ import ( | |||
"fmt" | |||
"time" | |||
|
|||
"github.com/elastic/beats/libbeat/autodiscover" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
import mess :-)
@@ -26,20 +26,20 @@ import ( | |||
"github.com/elastic/beats/libbeat/common/bus" | |||
) | |||
|
|||
// AutodiscoverAdapter for Metricbeat modules | |||
type AutodiscoverAdapter struct { | |||
// FactoryAdapter is an adapter that works with any cfgfile.RunnerFactory. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good to see we generalised it.
3e0724b
to
ac43c34
Compare
Tests added & rebased off master. Will move out of WIP once CI passes |
I believe the test failures are unrelated, but there's one for metricbeat autodiscovery, so I'll run the suite again. |
jenkins, retest this please |
On travis the metricbeat autodiscovery fails which I assume is related. |
@ruflin Yep. I've been spending today tracking it down, the metricbeat changes should have been a noop, but clearly that is not the case. Narrowing down the cause, at the moment it's looking like somehow the hint builder is not being registered somehow. |
OK, just pushed d74b291 which I believe should fix it. |
ae95acb
to
f0b6912
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Left some comments for the testing setup. Please ping me if you have further questions.
heartbeat/docker-compose.yml
Outdated
environment: | ||
- REDIS_HOST=redis | ||
- REDIS_PORT=6379 | ||
- KIBANA_HOST=kibana |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Remove kibana
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, this are complicated now because they're split across this and #8491. This is already fixed there, let's close that one out and I'll rebase here.
) | ||
|
||
proc = self.start_beat() | ||
docker_client.images.pull('redis:3.2.4') |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If we set TEST_ENVIRONMENT to true, these containers should already be running.
There are currently two models of testing against containers we have:
- Start everything at the beginning: Libbeat and others
- Start on request and stop afterwards: Metricbeat modules
We seem to mix the two here. For simplicity I would start with setting TEST_ENVIRONMENT to true and not handle container start / stop in here.
heartbeat/docker-compose.yml
Outdated
kibana: { condition: service_healthy } | ||
redis: { condition: service_healthy } | ||
|
||
elasticsearch: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As the output in the system tests below is not Elasticsearch, do we need elasticsearch?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yep, we can get rid of that as well. should happen in the #8491 merge.
Rebased off current master with #8491 merged in |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Code LGTM. Do we plan to backport this?
Changelog is missing.
@exekias Could you have a look?
@andrewvc There will be docs needed and I think also the reference config file should potentially be updated. Will you cover this in a follow up PR?
In general I'm surprised how little change was needed to enable autodiscovery 🎉
import docker | ||
docker_client = docker.from_env() | ||
|
||
self.render_config_template( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
For later: As we have ES running anyways, we could add some additional tests for http.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Discussed this on slack with @ruflin , agreed that adding a second test was not appropriate due to the fact that we're testing autodiscovery, not reloading.
Hmm, I borked the tests with some finagling of the AD tests. Not sure exactly how, but I think it's the looser condition I'm trying |
jenkins, retest this please |
@@ -1,3 +1,5 @@ | |||
logging.level: debug |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This should not have any impact as -e
is used to capture the logs from stdout.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@andrewvc I would suggest to remove the line but don't want to break your tests (it shouldn't ;-) ). I worry that this might cause confusion in the future.
self.wait_until(lambda: self.log_contains(re.compile('autodiscover.+Got a start event:', re.I))) | ||
|
||
self.wait_until(lambda: self.output_count(lambda x: x >= 1)) | ||
container.stop() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This seems the line it fails. As redis is started outside, no need to stop the container here. I assume varialbe is not initialised?
Fixed some bugs, but this exposed a flimsy way of enumerating over docker hosts in the autodiscover tests in python in the tests. This needs to apply the same filter there. |
hmmm, these pass locally but not remotely... will take another look tomorrow |
df9584d
to
92407f0
Compare
OK, phew tests finally passed, the issue was that locally the dict with docker network attrs had the IP duplicated to an extra field. |
Autodiscover support for Heartbeat + tests (cherry picked from commit cff3e40)
This is a complete PR for heartbeat automatic reload aside from tests. Once those are added it will be ready for a true review.
This shares the adapter used by metricbeat, and moves that to libbeat to keep things DRY.