Launch between apps fails due to filebeat and metricbeat defaults #15521

Closed
DanRoscigno opened this issue Jan 13, 2020 · 14 comments
Labels
bug Team:Integrations Label for the Integrations team Team:obs-ds-hosted-services Label for the Observability Hosted Services team

Comments

@DanRoscigno
Contributor

Launch between Uptime, Logs, and Metrics depends on the field host.ip being populated in metricbeat and filebeat records. By default this is disabled. I opened a PR with the changes to the configs: #15347

Default config:

processors:
  - add_host_metadata: ~

Necessary config:

processors:
  - add_host_metadata:
      netinfo.enabled: true
  • Version: 7.5.1
  • Steps to reproduce:
    • run with default configs
    • attempt to launch from Uptime to Logs or Metrics
    • note the filter that is applied
    • check whether the field in the filter (host.ip) is being populated
    • modify filebeat.yml or metricbeat.yml and enable netinfo
    • restart filebeat or metricbeat
    • notice that the filter now succeeds
@exekias
Contributor

exekias commented Jan 14, 2020

Thank you for opening this, @DanRoscigno! Pinging @andrewvc && @jasonrhodes as I don't know much about how links from Uptime work. Is this expected?

@exekias exekias added Team:obs-ds-hosted-services Label for the Observability Hosted Services team bug Team:Integrations Label for the Integrations team labels Jan 14, 2020
@elasticmachine
Collaborator

Pinging @elastic/uptime (:uptime)

@andrewvc
Contributor

Yes, it is expected; IPs are really the only way we can correlate. Is there any opposition to changing this default?

@blakerouse
Contributor

Wouldn't hostname be a better correlation, as IP addresses can move around with VIPs and DHCP leases?

@DanRoscigno
Contributor Author

> Wouldn't hostname be a better correlation, as IP addresses can move around with VIPs and DHCP leases?

Yes, I think that would be a good idea in the future, and I would be glad to open an issue for that change. I would still like this fixed soon, as the integrations are broken.

@andrewvc
Contributor

Maybe? I mean it depends on the setup. I think what we need here is to broaden the scope a bit and think about how we want this integration to work. Is automatic guessing of fields even the right approach?

@drewpost would appreciate your input here.

@DanRoscigno
Contributor Author

@andrewvc, that is exactly the right question.

In my experience, ops wants to "see everything for my service". So, not the logs for a particular host, but all of the logs for a particular service. Going back to the reason I opened the issue: I was looking at Uptime for a service. From there I launched into Logs. The way the integration is written today, the filter is for all of the logs from the host that Heartbeat is testing. What I would want is to have all of the logs from:

  • all of the hosts related to that service being monitored with heartbeat
  • all of the related containers / pods
  • all of the serverless functions
  • a few things I have not thought of yet, but in summary: All of the logs for my service

One thing I don't want is to have this hard-coded by a vendor.

Here is how I have done this in the past:

  • Add a field (in ECS I would use service.id) and enrich all of my data with a service.id.

The design is more complicated now than it was when I did this. A few years ago, the concatenation of IPADDR and port number was a pretty great unique identifier. I think we are now in a serverless world, and therefore a world without IPADDRs.

I would love to see us work on a design where we use the enrich processor.
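
As a rough illustration of the service.id idea above (a sketch, not something proposed verbatim in this thread), a Beat could tag every event it ships by using the add_fields processor. The service name `checkout` is a hypothetical placeholder, and this only covers the Beats side; the enrich processor mentioned above would be configured separately in Elasticsearch.

```yaml
# filebeat.yml or metricbeat.yml (sketch only)
processors:
  # Populates host.ip and host.mac, which the current Uptime links filter on
  - add_host_metadata:
      netinfo.enabled: true
  # Tags every event from this Beat with the ECS field service.id.
  # "checkout" is a hypothetical service name; use whatever identifies your service.
  - add_fields:
      target: service
      fields:
        id: checkout
```

With the same service.id set on every Beat that ships data for a service, a filter such as service.id: checkout would match logs and metrics from all of its hosts rather than from a single IP address.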

@exekias
Contributor

exekias commented Jan 15, 2020

> Is there any opposition to changing this default?

Not really, but if we want links to work using this info I think we should be more opinionated and always send it. The proposed change only enables it through config defaults, so it only affects new installations. A short-term fix would be either enabling this in the code defaults or using the hostname on the Heartbeat side.

@DanRoscigno
Contributor Author

I would do these four things:

  • change the default in the Beat configs to fix for new users
  • update the docs to fix for existing users (probably in the Beats docs, and the Kibana docs)
  • discuss this in the monthly Best practices for ingest webinar (already added to slides) to fix for existing users
  • start the discussion on how best to correlate IT Ops data. I would include services, support, and the Observability folks. I don't think hostname or IPADDR is the right filter

@andrewvc
Contributor

andrewvc commented Jan 15, 2020

I really am skeptical of using hostnames extensively. For that to work the hostname of the box would have to align with the URL being monitored. That's a big maybe.
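
To make the concern concrete, here is a hypothetical Heartbeat monitor (the monitor name, URL, and host names are invented for illustration). The URL points at a load-balanced service name, while the machines behind it would report their own hostnames through Filebeat and Metricbeat, so a hostname-based filter would match nothing.

```yaml
# heartbeat.yml (sketch; all names below are hypothetical)
heartbeat.monitors:
  - type: http
    name: shop-frontend
    schedule: '@every 10s'
    # Public, load-balanced URL. The hostname here is "shop.example.com" ...
    urls: ["https://shop.example.com/health"]
# ... but the servers behind the load balancer would report host.name values
# such as "web-01" and "web-02", which never equal "shop.example.com".
```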

I agree with adding the netinfo stuff by default, that's a clear win (unless there's a reason not to do it), but past that I really think we need to talk about correlation at more of a service level.

Pinging @roncohen, @alvarolobato, and @tbragin; would love feedback here.

@tbragin
Contributor

tbragin commented Jan 15, 2020

@cyrille-leclerc FYI - a discussion on cross-linking.

@andrewvc I agree with the tactical answer, but the bigger-picture item may need broader discussion in which I'd like to involve @drewpost and @cyrille-leclerc. Happy to take it offline and update the issue later.

@andrewvc
Contributor

@tbragin @DanRoscigno any updates here? If not I'd rather close this issue since there's not a specific action item we can take here.

@jsoriano did merge #16077, which enables netinfo metadata by default, and that does help some here.

@DanRoscigno
Contributor Author

I will grab a snapshot and test it.

Personally, I would love to know that there is a discussion about correlation beyond IPADDR (I think we all agree that IPADDR was a decent MVP, but not the goal, right?). Should I open a new issue for that and close this one, @tbragin?

@andrewvc
Contributor

andrewvc commented Apr 2, 2020

@DanRoscigno do you mind if we close this?

Feel free to open a new issue for this, but I do think we need a bigger rethink of how this feature works. I'd be curious to see how @cyrille-leclerc thinks we should approach this issue. I'm not sure that looking piecemeal at Uptime is the right approach here.

6 participants