Add host metadata processor #5968

Merged
merged 2 commits into from
Mar 16, 2018

ruflin
Contributor

@ruflin ruflin commented Dec 29, 2017

This adds a processor that attaches information about the host machine to each event, very similar to what we have for the cloud metadata processor. The idea is to allow better filtering based on additional information about the host if needed. This information partially overlaps with what we have in beat.hostname but adds additional fields. Other fields, such as the IP address(es), could also be added here.

@ruflin ruflin added :Processors discuss Issue needs further discussion. libbeat labels Dec 29, 2017
Contributor

@exekias exekias left a comment

+1 to this. Would it make sense to always send this info by default instead of adding a processor?


func (p addHostMetadata) Run(event *beat.Event) (*beat.Event, error) {

info, err := host.Info()
Contributor

Do you have any benchmark on this? It would make sense to move it to the constructor and cache the result; we just need to document that a hostname change won't get reflected here.

Contributor Author

I changed it to be instantiated in the constructor. I'm pretty sure later a request for expiring the info is coming up :-)
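
The change described above, collecting the host information once in the constructor instead of on every Run call, might look roughly like the sketch below. It is an illustration only and not the PR's code: the names hostInfo, processor, and newProcessor are hypothetical, events are modelled as plain maps, and os.Hostname stands in for the real data source.

```go
package main

import (
	"fmt"
	"os"
)

// hostInfo is a hypothetical stand-in for the host data the processor collects.
type hostInfo struct {
	Hostname string
}

// processor holds the host info gathered once at construction time.
type processor struct {
	info hostInfo
}

// newProcessor reads the host info a single time, so Run never has to.
// A hostname change after construction is not reflected in later events.
func newProcessor() (*processor, error) {
	name, err := os.Hostname()
	if err != nil {
		return nil, err
	}
	return &processor{info: hostInfo{Hostname: name}}, nil
}

// Run enriches an event with the cached info and makes no system calls.
func (p *processor) Run(event map[string]interface{}) map[string]interface{} {
	event["host"] = map[string]interface{}{"hostname": p.info.Hostname}
	return event
}

func main() {
	p, err := newProcessor()
	if err != nil {
		fmt.Println("could not read host info:", err)
		return
	}
	fmt.Println(p.Run(map[string]interface{}{"message": "hello"}))
}
```

The trade-off is the one raised in the review: the per-event cost drops to a map write, but the data goes stale until the processor is recreated.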

@andrewkroh
Member

I was kind of working on something related over the break. It can (or will be able to) fetch some of the information that you want to add with this processor. I am working on this because I wanted to log (#5946) some of this information and also make this data available for xpack monitoring.

https://github.com/andrewkroh/go-sysinfo

@tsg
Contributor

tsg commented Jan 2, 2018

Big +1 to this. I think a processor makes sense to mirror add_cloud_metadata and co, but we can put it into the default configuration file.

@ruflin
Contributor Author

ruflin commented Jan 3, 2018

@andrewkroh Very nice. Did not look into your code yet. Did you use a third party lib or implement it yourself?

@monicasarbu
Contributor

+1 on adding add_host_metadata, that will also add cloud metadata in the future.

@ruflin
Contributor Author

ruflin commented Feb 26, 2018

This will close #3480

"hostname": p.info.Hostname,
"id": p.info.UniqueID,
"os": common.MapStr{
"architecture": p.info.Architecture,
Contributor Author

@andrewkroh Seeing that you have Architecture outside of info, I have second thoughts on putting it under os, as I agree it belongs more to the host than to the os itself.

Also thinking about putting os info on the top level.

package add_host_metadata

import (
sysinfo "github.com/elastic/go-sysinfo"
Contributor Author

@andrewkroh Did you have to add a - to the lib name? :-D

Member

With Go you don't need to add a package alias, just import plain old "github.com/elastic/go-sysinfo" and use sysinfo.XYZ in the code.

Contributor Author

It was mainly my IDE that complained :-)

@ruflin ruflin changed the title [Discuss] Add host metadata processor Add host metadata processor Feb 27, 2018
@@ -0,0 +1,26 @@
package add_host_metadata

don't use an underscore in package name

@andrewkroh
Member

I'm glad you are trying out go-sysinfo. I'll take a closer look at this next week. I've been sitting on a bunch of updates to go-sysinfo that I haven't had time to clean up. I plan on getting the lib into a better place so that this can make 6.3.

@ruflin
Contributor Author

ruflin commented Mar 6, 2018

@andrewkroh Good timing. I plan to put some work into this next week again. But also plan to start with a very basic version first which we then can expand on. The current fields are probably already enough for a first version.

@elastic elastic deleted a comment from houndci-bot Mar 7, 2018
@elastic elastic deleted a comment from houndci-bot Mar 7, 2018
Member

@andrewkroh andrewkroh left a comment

This looks like it's just about ready. 👍

}

func (p addHostMetadata) String() string {
return "add_host_metadata="
Member

How about add_host_metadata=[] to indicate an empty config?

Contributor Author

done
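
For reference, the suggested String() form could be sketched as below. The struct is reduced to an empty placeholder for the example; only the method name and the returned text come from the thread.

```go
package main

import "fmt"

// addHostMetadata is reduced to an empty placeholder struct for this sketch.
type addHostMetadata struct{}

// String reports the processor name with an empty config list,
// signalling that the processor takes no configuration options.
func (p addHostMetadata) String() string {
	return "add_host_metadata=[]"
}

func main() {
	// Any fmt.Stringer is printed through its String method.
	var s fmt.Stringer = addHostMetadata{}
	fmt.Println(s)
}
```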

return nil, err
}
return &addHostMetadata{
info: h.Info(),
Member

I think we will want to refresh this data every couple of minutes (imagine that the user installs an OS update). And given that we are caching this data, we might as well cache the common.MapStr containing it.

I'd probably add a time.Time value to track when the value was collected. Then refresh the value from Run after it becomes stale.

Contributor Author

I cached the mapstr and set the cache expiry to 5 minutes. We can still make it a config later if needed.

"os": common.MapStr{
"platform": p.info.OS.Platform,
"version": p.info.OS.Version,
"family": p.info.OS.Family,
Member

Did you want codename and build?

Contributor Author

I added it.

Timestamp: time.Now(),
}
p, err := newHostMetadataProcessor(nil)
assert.NoError(t, err)
Member

This should expect to receive sysinfo/types.ErrNotImplemented for any OS other than windows, linux, and darwin.

Contributor Author

done

data := common.MapStr{
"host": common.MapStr{
"hostname": p.info.Hostname,
"id": p.info.UniqueID,
Member

@andrewkroh andrewkroh Mar 7, 2018

info.UniqueID is optional (as signified by ,omitempty in the json tag). You should add handling for this. Older versions of Linux don't have /etc/machine-id.

Contributor Author

done
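
One way to handle the optional ID is to add the field only when it is non-empty, as in the sketch below. The hostInfo struct and buildHostFields function are hypothetical and use plain maps instead of common.MapStr; only the field names hostname and id are taken from the diff.

```go
package main

import "fmt"

// hostInfo is a hypothetical struct with the two fields from the diff above.
type hostInfo struct {
	Hostname string
	UniqueID string // may be empty, e.g. when /etc/machine-id does not exist
}

// buildHostFields always sets host.hostname and sets host.id only when a
// unique ID is available, so events never carry an empty id.
func buildHostFields(info hostInfo) map[string]interface{} {
	host := map[string]interface{}{"hostname": info.Hostname}
	if info.UniqueID != "" {
		host["id"] = info.UniqueID
	}
	return map[string]interface{}{"host": host}
}

func main() {
	fmt.Println(buildHostFields(hostInfo{Hostname: "example-host"}))
	fmt.Println(buildHostFields(hostInfo{Hostname: "example-host", UniqueID: "abc123"}))
}
```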

@ruflin ruflin force-pushed the add_host_metadata branch from 9da8b6a to ec4e196 Compare March 13, 2018 11:59
@@ -0,0 +1,80 @@
package add_host_metadata

don't use an underscore in package name

@andrewkroh
Member

I think we no longer need to keep github.com/elastic/procfs in vendor/. I'm pretty sure any changes got pushed upstream. Can you try removing references to github.com/elastic/procfs and replacing them with github.com/prometheus/procfs so we don't have a duplicate dep under two names?

@ruflin
Contributor Author

ruflin commented Mar 14, 2018

github.com/elastic/procfs is still used in the system/socket and system/raid metricsets. I suggest making this change in a follow-up PR and not mixing it with this one.

@ruflin ruflin force-pushed the add_host_metadata branch from b3a7396 to 7fbe357 Compare March 14, 2018 10:05
@ruflin ruflin added review and removed discuss Issue needs further discussion. labels Mar 14, 2018
@ruflin
Contributor Author

ruflin commented Mar 14, 2018

Here is a separate PR that changes the dependency. We can merge the other one first and rebase on top of it: #6548

@ruflin ruflin force-pushed the add_host_metadata branch from 7fbe357 to 08123ea Compare March 14, 2018 13:38
@ruflin
Contributor Author

ruflin commented Mar 15, 2018

This PR currently fails because of a crosscompile issue with darwin 32 bit: elastic/go-sysinfo#3

@andrewkroh
Member

I believe elastic/go-sysinfo#4 should fix the cross-compile issue seen here. The sysinfo code was lacking cgo tags to prevent the darwin provider from being included when cross-compiling on Linux (without a C x-compiler).

@andrewkroh andrewkroh merged commit 1fbeb65 into elastic:master Mar 16, 2018
@TimWardOrigami

Is the host's FQDN included in the output (eg "hostname -f" output)? If not should/can it be?

@ruflin ruflin deleted the add_host_metadata branch April 3, 2018 11:59
@ruflin
Contributor Author

ruflin commented Apr 3, 2018

So far the code used to create the hostname is os.Hostname(), which is the same as under beat.hostname. Now that we have a processor for the data, we should add a config option for it to be the FQDN if it is not already. I assume that to fetch the FQDN we will need to do some lookups. @TimWardOrigami Want to add a feature request for this?

@andrewkroh WDYT on where this belongs? Should we have this in go-sysinfo?

@TimWardOrigami

@ruflin How do I "add a feature request" please?

@ruflin
Contributor Author

ruflin commented Apr 3, 2018

@TimWardOrigami Just open a new GitHub issue in this repo and briefly explain what it is about and what it is needed for. I will then label it accordingly.

@TimWardOrigami

TimWardOrigami commented Apr 3, 2018 via email
