-
Notifications
You must be signed in to change notification settings - Fork 4.9k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add short ID matching to add_docker_metadata #6172
Conversation
Since this is a community submitted pull request, a Jenkins build has not been kicked off automatically. Can an Elastic organization member please verify the contents of this patch and then kick off a build manually? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you for contributing @ripcurld0! This is an elegant solution to #6092 🎉
Could you please add some tests with long IDs?
Also please add a CHANGELOG.asciidoc entry
@exekias sure! That's what I meant to do (hence "WIP" in the title). |
Awesome, I totally missed the WIP part 😇 |
libbeat/common/docker/watcher.go
Outdated
@@ -125,7 +128,7 @@ func NewWatcherWithClient(client Client, cleanupTimeout time.Duration) (*watcher | |||
// Container returns the running container with the given ID or nil if unknown | |||
func (w *watcher) Container(ID string) *Container { | |||
w.RLock() | |||
container := w.containers[ID] | |||
container := w.containers[ID[:truncateLen]] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I wonder if we should make this a config option or check always for the truncated id. I assume also with 12 digits the likelyhood of overlap is very low.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's an interesting point, in theory, chances of collision are around 0.000059%, which makes it very unlikely. That said, someone reported one here: moby/moby#28260 (comment)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That's really interesting. I didn't think about it, because I thought it was very-very-very-very uncommon to get a collision. I guess we should consider printing a warning message if we get more than one containers for a specific ID. Thanks for noticing this.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
My initial thought was that by default we should use the full id
as there are probably reason the id is that long. The shorter id my assumption is more for user consumption but as it seems it's also around it some other places.
If a short id is found / provided it should be best effort to match it to a long one with something like contains
. The problem is that this could become pretty inefficient as it's not a direct match in the array which makes the implementation you did pretty nice.
It would be great if we could find an efficient method to internally store the full id but still match the short id when needed.
Thought: What if we store both?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You can have another map map[string]*Container
that contains short ID as a key.
When we detect that the given ID is of length 12 then we use that map otherwise use the original one.
Or actually, you can store both the short-ID and the long one at the same map and reduce the if len(id) > 12
check.
The disadvantage here is obviously a space-time trade-off.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What about keeping current code and make truncate length configurable? That would support the short id use case, but you have to enable it
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
fine by me. @ruflin WDYT?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If the short IDs are consistently 12 characters then adding a secondary entry to the map SGTM. But if we need to do a prefix search where the prefixes are of arbitrary length then maybe consider a different data structure.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@andrewkroh there are going to be only two kind of IDs:
- short ID (length between 1 and 64 theoretically)
- ID (length of 64)
In other words, the length of a key in the map is 64
or 12
. You won't find a key that its length is 23
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would also just add a second entry to the map so on the lookup side it still works as before. I would not expect too much overhead as this map will probably not contain more then 1000 entries?
If it becomes an overhead, I would introduce a config option to enable short id's and have it not store in the map if it's off, but still use the same map if it's on. But we can do that if it's causes any issue.
libbeat/common/docker/watcher.go
Outdated
// For example, by default container's hostname is set to the truncated container ID. | ||
// To allow matching by truncated and non-truncated container IDs the map keys | ||
// are the truncated version. | ||
w.containers[event.Actor.ID[:truncateLen]] = container |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If we store both, I assume this line would be changed to:
w.containers[event.Actor.ID] = container
w.containers[event.Actor.ID[:truncateLen]] = container
@ripcurld0 Do you have any time on your end to push this forward? I think it would be a great feature to have in. |
@ruflin I am going to push changes today. |
libbeat/common/docker/watcher.go
Outdated
} | ||
|
||
func NewWatcherWithClient(client Client, cleanupTimeout time.Duration) (*watcher, error) { | ||
func NewWatcherWithClient(client Client, cleanupTimeout time.Duration, storeShortID bool) (*watcher, error) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
exported function NewWatcherWithClient should have comment or be unexported
exported func NewWatcherWithClient returns unexported type *docker.watcher, which can be annoying to use
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is actually true, we might want to return "Watcher" (the interface) but I am afraid there are function that are not included in the interface which other packages would like to use.
libbeat/common/docker/watcher.go
Outdated
@@ -77,10 +81,10 @@ type Client interface { | |||
Events(ctx context.Context, options types.EventsOptions) (<-chan events.Message, <-chan error) | |||
} | |||
|
|||
type WatcherConstructor func(host string, tls *TLSConfig) (Watcher, error) | |||
type WatcherConstructor func(host string, tls *TLSConfig, storeShortID bool) (Watcher, error) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
exported type WatcherConstructor should have comment or be unexported
libbeat/common/docker/watcher.go
Outdated
}, nil | ||
} | ||
|
||
// Container returns the running container with the given ID or nil if unknown | ||
func (w *watcher) Container(ID string) *Container { | ||
w.RLock() | ||
container := w.containers[ID] | ||
_, ok := w.deleted[ID] | ||
if container == nil { | ||
return nil |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
😱 thanks!
I ran some manual tests on my computer. preparations
In Lets start with the following
This should read all log files in
or
Because in the first one it will pipe it to my machine /tmp/logs/${HOSTNAME} and in the second one although it will put it in the container's /tmp/logs the hostname will be my machine hostname. Thus, I had to build an image.
testing
The results:
Notice I am not enabling the
But to be sure this is because the container ID
and I do get the following:
that means that everything works like expected in
As you can see here no error is shown on Notesif you have other ideas, feel free to let me know. I think this is now more than (If we can have more eyes on this, that will be super awesome) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM 🎉, could you update docs and add a changelog entry?
ok to test |
sure @exekias |
container short ID are a common way to represent containers. The short ID length is 12 characters while the long one is 64 which makes it more human readable. Previous to this patch, if the folder name was set to the containers' short IDs the match would have failed. For example, putting the container logs in ./container/${HOSTNAME} since ${HOSTNAME} inside the container is set to its short ID by default then the docker metadata won't be added to the log lines. The user has to explicetly specify short ID match by adding setting "match_short_id" to true in the configuration file. If "match_short_id" is not given in the configuration or is set to false, this feature is disabled. Signed-off-by: Boaz Shuster <ripcurld.github@gmail.com>
This is fantastic! Thanks @ripcurld0 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM, thank you for taking the time to implement this!
Thanks a lot @exekias @ruflin @chrisregnier @andrewkroh for your review. |
closes #6092