Add experimental Docker json-file prospector #5402
57d4f33
to
c7379f4
Compare
I keep going back and forth on whether we should have config options to support JSON in JSON or a prospector type. The reason I'm leaning toward the prospector type, as you did here, is that it allows us to change the implementation in the future and add more magic without affecting the user config. +1 on this :-)
@exekias Could you add some more details to the PR description on what this prospector does? An example of what the input log data looks like and what the prospector makes of it would be great.
filebeat/harvester/util.go
Outdated
LogType: {},
RedisType: {},
UdpType: {},
StdinType: {},
@kvch Not related to this PR, but as we register the prospectors now I'm curious if we still need this?
`ValidType` is still used in the `log` prospector for config validation and opening the right input, but there is no other check or use of it. The check now happens when a factory is retrieved from the registry using `GetFactory`. So it is not required, but this part needs further refactoring to make `ValidType` `log`-specific. For the sake of consistency and to make refactoring easier, it is acceptable to add docker to the list.
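To illustrate the point about validation moving into the registry lookup, here is a minimal sketch of a factory registry. Only the name `GetFactory` comes from this thread; the `Factory` type, the `Register` function, and all signatures are assumptions made for illustration and are not the actual Filebeat code.

```go
package main

import "fmt"

// Factory is a hypothetical constructor type; the real signature is not
// shown in this thread.
type Factory func() string

// registry maps a prospector type name to its factory.
var registry = map[string]Factory{}

// Register adds a factory under the given type name and rejects duplicates.
// (Hypothetical helper, assumed for this sketch.)
func Register(name string, factory Factory) error {
	if _, exists := registry[name]; exists {
		return fmt.Errorf("prospector type %q is already registered", name)
	}
	registry[name] = factory
	return nil
}

// GetFactory looks up a factory by type name. An unknown type fails here,
// so no separate list of valid types is needed for this check.
func GetFactory(name string) (Factory, error) {
	factory, ok := registry[name]
	if !ok {
		return nil, fmt.Errorf("unknown prospector type %q", name)
	}
	return factory, nil
}

func main() {
	_ = Register("log", func() string { return "log prospector" })
	_ = Register("docker", func() string { return "docker prospector" })

	if factory, err := GetFactory("docker"); err == nil {
		fmt.Println(factory())
	}
	if _, err := GetFactory("syslog"); err != nil {
		fmt.Println(err)
	}
}
```

With a lookup like this, a type that was never registered is rejected at retrieval time, which is why a separate valid-types map becomes redundant for everything except the `log`-specific checks mentioned above.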
I pushed #5537 to remove this part.
Let's move this forward to play around with it, but mark it experimental for now. Can you add an experimental log message and add it to the CHANGELOG?
I would like Filebeat to harvest only from currently active Docker containers. I was wondering whether this would be an appropriate thing to add to this docker prospector, or whether it should be added somewhere else.
c7379f4
to
643cdd6
Compare
LogType = "log"
StdinType = "stdin"
RedisType = "redis"
UdpType = "udp"
const UdpType should be UDPType
filebeat/harvester/reader/message.go
Outdated
@@ -42,3 +42,17 @@ func (msg *Message) AddFields(fields common.MapStr) {
	}
	msg.Fields.Update(fields)
}

func (msg *Message) RemoveFields(fields ...string) {
exported method Message.RemoveFields should have comment or be unexported
receiver name msg should be consistent with previous receiver name m for Message
0d6a2ca
to
cdf12cb
Compare
filebeat/harvester/reader/message.go
Outdated
@@ -42,3 +43,18 @@ func (msg *Message) AddFields(fields common.MapStr) {
	}
	msg.Fields.Update(fields)
}

// RemoveFields removes given fields from the message
func (msg *Message) RemoveFields(fields ...string) {
receiver name msg should be consistent with previous receiver name m for Message
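As a sketch of what the two lint comments ask for (a doc comment on the exported method and a receiver name consistent with the rest of the type), the following uses a plain `map[string]interface{}` as a stand-in for `common.MapStr`. The `Message` struct here is a simplified assumption, not the actual Filebeat type.

```go
package main

import "fmt"

// Message is a simplified stand-in for the reader message type; the real
// type uses common.MapStr for its fields.
type Message struct {
	Fields map[string]interface{}
}

// AddFields merges the given fields into the message.
func (m *Message) AddFields(fields map[string]interface{}) {
	if m.Fields == nil {
		m.Fields = map[string]interface{}{}
	}
	for key, value := range fields {
		m.Fields[key] = value
	}
}

// RemoveFields removes the given fields from the message.
// Deleting a key that is absent is a no-op.
func (m *Message) RemoveFields(fields ...string) {
	for _, key := range fields {
		delete(m.Fields, key)
	}
}

func main() {
	msg := &Message{}
	msg.AddFields(map[string]interface{}{"stream": "stdout", "time": "raw"})
	msg.RemoveFields("time")
	fmt.Println(msg.Fields)
}
```

Both methods use the receiver name `m`, which is the consistency the linter is asking for.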
This should be ready for a second look, @ruflin. I implemented timestamp parsing, put the message in the right place, and added some tests.
cdf12cb
to
3a05f29
Compare
jenkins, retest this please
LGTM. Left some minor comments.
}

message.AddFields(common.MapStr{
	"stream": line.Stream,
Could you give some details on what values `stream` will have? Should we add this to a `fields.yml`?
I see below this is `stdin`, for example. If we know this field will exist, it is probably worth adding as a keyword to `fields.yml`.
Just pushed it, thanks!
// Parse timestamp
ts, err := time.Parse(time.RFC3339, line.Timestamp)
if err != nil {
	return message, errors.Wrap(err, "parsing docker timestamp")
Perhaps worth adding the given timestamp to the error. Probably already part of `err`?
parse error looks pretty verbose: https://golang.org/src/time/format.go?s=23626:23672#L669
filebeat/prospector/docker/config.go
Outdated
}

type config struct {
	ContainersIDs []string `config:"containers_ids"`
Could we make this `container.ids` and `container.path`, or `containers.ids` and `containers.path`?
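For context on why a path and a list of IDs belong together in the config, here is a rough sketch of how the two settings could combine into the glob patterns the PR description mentions (`<containers path>/<id>/*.log`). The function name and the placeholder IDs are made up for illustration; the actual prospector code is not shown in this thread.

```go
package main

import (
	"fmt"
	"path"
)

// containerLogGlobs is a hypothetical helper: it builds one glob pattern
// per container ID under the given containers directory.
func containerLogGlobs(containersPath string, ids []string) []string {
	globs := make([]string, 0, len(ids))
	for _, id := range ids {
		globs = append(globs, path.Join(containersPath, id, "*.log"))
	}
	return globs
}

func main() {
	// "abc123" and "def456" are placeholder IDs, not real containers.
	for _, glob := range containerLogGlobs("/var/lib/docker/containers", []string{"abc123", "def456"}) {
		fmt.Println(glob)
	}
}
```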
@@ -83,6 +83,9 @@ type config struct {
	MaxBytes  int                     `config:"max_bytes" validate:"min=0,nonzero"`
	Multiline *reader.MultilineConfig `config:"multiline"`
	JSON      *reader.JSONConfig      `config:"json"`

	// Hidden on purpose, used by the docker prospector:
	DockerJSON bool `config:"docker-json"`
Kind of hacky but don't have a better solution.
@urso recommended something like this to avoid repeating code or a major refactor, works for me 😇
@@ -515,6 +517,11 @@ func (h *Harvester) newLogFileReader() (reader.Reader, error) {
		return nil, err
	}

	if h.config.DockerJSON {
It's interesting that from an implementation point of view it just comes down to adding a reader in the harvester (simplified). It seems like the log harvester is pretty generic and perhaps should be extracted. Prospectors could then configure "readers" on the harvester. (Just thinking out loud for later; @urso was also talking about similar stuff in the past.)
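The idea of stacking readers can be sketched as a simple decorator chain. Everything below is a simplified assumption: the `Reader` interface, `sliceReader`, `dockerJSONReader`, and `newReader` are hypothetical names and do not reflect the actual `reader.Reader` interface in Filebeat. The sketch only shows the shape of the `if h.config.DockerJSON` branch, where a decoding reader wraps the underlying line reader when the flag is set.

```go
package main

import (
	"encoding/json"
	"errors"
	"fmt"
)

// Reader is a hypothetical, simplified stand-in for a harvester reader.
type Reader interface {
	Next() (string, error)
}

var errDone = errors.New("no more lines")

// sliceReader yields pre-loaded lines, standing in for a file line reader.
type sliceReader struct{ lines []string }

func (r *sliceReader) Next() (string, error) {
	if len(r.lines) == 0 {
		return "", errDone
	}
	line := r.lines[0]
	r.lines = r.lines[1:]
	return line, nil
}

// dockerJSONReader wraps another reader and unwraps the "log" key of each
// json-file line.
type dockerJSONReader struct{ in Reader }

func (r *dockerJSONReader) Next() (string, error) {
	raw, err := r.in.Next()
	if err != nil {
		return "", err
	}
	var line struct {
		Log string `json:"log"`
	}
	if err := json.Unmarshal([]byte(raw), &line); err != nil {
		return "", fmt.Errorf("decoding docker JSON: %w", err)
	}
	return line.Log, nil
}

// newReader adds the decoding layer only when dockerJSON is set.
func newReader(in Reader, dockerJSON bool) Reader {
	if dockerJSON {
		return &dockerJSONReader{in: in}
	}
	return in
}

func main() {
	lines := &sliceReader{lines: []string{`{"log":"hello\n","stream":"stdout"}`}}
	r := newReader(lines, true)
	msg, err := r.Next()
	fmt.Printf("%q %v\n", msg, err)
}
```

If prospectors could pass such wrappers to a generic harvester, the hidden `docker-json` config flag would no longer be needed.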
3a05f29
to
13a4c02
Compare
13a4c02
to
98641f7
Compare
rebased after master changes
@@ -23,6 +23,12 @@
      description: >
        The content of the line read from the log file.

    - name: stream
Could we call this `log.stream` or something similar? `log.source`? Source is already overloaded.
@weltenwort, do you have some ideas for a better name?
Merged this one, as we can still have the field naming discussion in a follow-up PR.
@exekias Please remember to add docs for this one before the 6.1 GA date.
This adds docs for elastic#5402
This adds docs for elastic#5402 (cherry picked from commit c37cc31)
Removing needs_docs label because it looks like docs were added in #5752
This adds docs for elastic#5402 (cherry picked from commit b9ce598)
This pull request adds an experimental `docker` prospector. Docker supports different ways of outputting logs; this prospector implements json-file.

By offering the prospector we abstract the format from the user, so there is no need to manually configure JSON decoding. We can also get the timestamp from the JSON file, something that is not possible as of now with the existing settings.

This would read all logs:

This also abstracts the specifics of the chosen provider; for this case it would create a file prospector for `/var/lib/docker/containers/c3ec7a0bd9640151a768663b7e78c115d5b1a7f87fba572666bacd8065893d41/*.log`.

In the future, we can extend this to support other logging drivers/formats, like the CRI format, or a plugin implementation. I would add a conf key like `mode: json-file` once that happens.

TODO:
`container_id` parameter.
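
To make the json-file format concrete, here is a minimal sketch of decoding one log line into a message, a stream name, and a timestamp. The `log`, `stream`, and `time` keys are the ones Docker's json-file driver writes; the struct and function names are hypothetical and this is not the code from this pull request.

```go
package main

import (
	"encoding/json"
	"fmt"
	"time"
)

// dockerLine models one line of a json-file log. The struct name is
// hypothetical; the JSON keys are those written by the json-file driver.
type dockerLine struct {
	Log    string `json:"log"`
	Stream string `json:"stream"`
	Time   string `json:"time"`
}

// decodeDockerLine returns the message text, the stream name, and the
// parsed timestamp for a single json-file line.
func decodeDockerLine(raw []byte) (string, string, time.Time, error) {
	var line dockerLine
	if err := json.Unmarshal(raw, &line); err != nil {
		return "", "", time.Time{}, fmt.Errorf("decoding docker JSON: %w", err)
	}
	ts, err := time.Parse(time.RFC3339, line.Time)
	if err != nil {
		return "", "", time.Time{}, fmt.Errorf("parsing docker timestamp: %w", err)
	}
	return line.Log, line.Stream, ts, nil
}

func main() {
	// Sample input made up for this sketch.
	raw := []byte(`{"log":"hello world\n","stream":"stdout","time":"2017-10-25T13:54:30.123456789Z"}`)
	msg, stream, ts, err := decodeDockerLine(raw)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Printf("%q %s %s\n", msg, stream, ts.Format(time.RFC3339Nano))
}
```

The user never sees this wrapping: the event carries the unwrapped message, the stream as a field, and the timestamp taken from the line itself rather than from the read time.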