-
Notifications
You must be signed in to change notification settings - Fork 4.9k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add add_docker_metadata
processor
#4352
Conversation
6df9110
to
c9cdf74
Compare
I think we should consider using the top level vendor dir and removing the |
|
||
watcher, err := watcherConstructor(config.Host, config.TLS) | ||
if err != nil { | ||
panic(err) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why not return an error?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
wow, I can believe I left that there, sorry
} | ||
|
||
func (d *addDockerMetadata) String() string { | ||
return "add_docker_metadata" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think it is customary to include config info in the string representation (assuming no secrets are included). I don't believe there is a standard format.
jenkins, test it |
1 similar comment
jenkins, test it |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That is an addition we will really need in the future and become quite powerful. It especially is becoming interesting with docker logs and having the id in the pat (how are we going to extract it ;-) ) and then attach the info to it.
You used in this PR a new push API which I think we don't use yet. This makes a lot of sense in this context but would also be great to have in the docker module (or filebeat?)
I suggest two split this PR up in 3 parts:
- Move docker dependencies to the top vendor directory. Be aware that we use our own fork of the docker project (see vendor.json on docker module)
- Add processor
- Decide where the add event as metricset or prospector
filebeat/filebeat.full.yml
Outdated
#processors: | ||
#- add_docker_metadata: | ||
# match_fields: ["system.process.cgroup.id"] | ||
# #host: "unix:///var/run/docker.sock" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It seems like indentation would be off if uncommented, as this is commented twice
meta.Put("container.labels", labels) | ||
} | ||
|
||
meta.Put("container.id", container.ID) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I assume this matches the "fields" with have in metricbeat for the docker containers?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yes, they do
value, err := event.GetValue(field) | ||
if err == nil { | ||
if strValue, ok := value.(string); ok { | ||
cid = strValue |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
fields is a list but in the end you only have one cid?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
there is no way to match several fields into several containers, that should not happen. Fields is a list cause you may have different fields you want to match, not all of them present in all events
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I kind of assume, the different events will come from different modules or prospectors. Means in this case each prospector should define it's own processor. So there would not be a need to have an array of fields I think.
var cid string | ||
for _, field := range d.fields { | ||
value, err := event.GetValue(field) | ||
if err == nil { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I recommend using here the default pattern:
if err != nil {
continue
}
...
Makes it easier to read in my opinion.
for _, c := range containers { | ||
w.containers[c.ID] = &Container{ | ||
ID: c.ID, | ||
Name: c.Names[0][1:], // Strip '/' from container names |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Are we doing that also in the container metricset?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is an spurious char given by this API call & version, doesn't seem to be the case for our other calls
WATCH: | ||
for { | ||
select { | ||
case event := <-events: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So this is actually a docker event listener instead of polling the data. Very similar to what we have now in kubernetes? Should we also have this in the docker module as a metricset?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It could make sense yes
# | ||
#processors: | ||
#- add_docker_metadata: | ||
# match_fields: ["system.process.cgroup.id"] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I was first confused by this and had to read the code to understand what it does. It takes this field and checks if the value of this field matches to any container id. If yes, it will add the container meta data to the event.
In the code, it can currently only match 1 field I think, but here it is plural and you use an array. How is this going to work exactly?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
BTW I'm not too sure about the example used here. In case someone uses cgroups, doesn't that kind of mean he doesn't want to use docker?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It checks all the fields until one of them matches
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As for the example, you may want to have docker info on top of the cgroup id, so you can filter by, for instance, docker image
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So the config option is kind of field_that_matches_container_id: ...
. What is the use case of having multiple fields defined?
For the cgroup docker part: I see that this is what it's doing, but why would someone then not just use the docker module instead as he will rely on the docker api.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We have several metrics matching cgroup id in system (maybe something we have to look into): system.process.cgroup.cpu.id, system.process.cgroup.memory.id, system.process.cgroup.blkio.id
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
But I assume per config you only want to match to one?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would like to match all of them, perhaps in the future it would be better to merge those fields into a global system.cgroup.id
, with different meanings depending on the metricset, but as for now I would need to match any possible field there.
what do you think @andrewkroh, does this make any sense?
Ok I'll play with vendors and open a PR, then rebase this once that is merged |
46aaef3
to
bfebcef
Compare
Updated vendor folders, all details in PR description |
bfebcef
to
f17843b
Compare
Merging as the tests passed before the rebase, which only fixed the changelog. |
This processor will enrich events matching the desired configuration with Docker info:
This PR adds
docker/client
dependency and cleans up common vendors with currentgo-dockerclient
frommetricbeat/module/docker
.I'm also using a patch for
docker/client
(exekias/moby@83d94aa) to ensure the client works in netbsd, will check with upstream if I can push that to master.Important files to check here:
vendor.json:
Everything under
add_docker_metadata
(78f9943)