Reusable/common snippets of filebeat input config #16084

Closed
Mekk opened this issue Feb 4, 2020 · 12 comments
Labels
Team:Elastic-Agent Label for the Agent team

Comments

@Mekk

Mekk commented Feb 4, 2020

My filebeat input configs are becoming very copy-pasted.
They usually look like this:

- type: log
  enabled: true
  paths:
    - /some/path/*.log
    - /or/maybe/two.*.log
  exclude_files:
    - _a_few.log$
    - excluded$
    - ^thins
  encoding: latin2
  tags: ["some", "tags"]
  fields:
      some:
          custom: fields
  fields_under_root: true

  ignore_older: 48h
  max_bytes: 10485760
  close_inactive: 24h
  scan_frequency: 5s
  harvester_limit: 1200

  multiline.pattern: …entry pattern …
  multiline.negate: true
  multiline.match: after
  multiline.flush_pattern: …finish pattern ·
  multiline.max_lines: 100000
  multiline.timeout: 5s

- another-section

- and another

- and another

- and another (in record case I have 27 of them and may have 81 soon)

where sections differ only by paths and (some) fields (less frequently encoding or harvester_limit), and the remaining settings stay the same. Copy. Paste. Copy. Paste. Copy. Paste.

So it would be great if I could write common settings somewhere somehow.

@Mekk
Author

Mekk commented Feb 4, 2020

In general, I'd probably like it most if I could write something like

- type: config-group
  name: common-bluhbluh-settings
  any: settings
  wha:
    - te
    - ver

- type: log
  import: common-bluhbluh-settings
  paths:
   - this/section/path/*.log
  fields:
     this:
        section: fields

- type: log
  …and so on, and so on…

with the general rule that the imported group is simply blended into the importing section. Whether a conflict between an imported setting and something written in place generates an error or is resolved in favor of the specific setting, I have no strong opinion; for fields, it would be nice if they were merged and conflicted only when the same field has mismatched values.

This or a similar approach would let me define the 2-3 predefined sets I need (for 2-3 log kinds, differing in multiline pattern) and keep my sections compact.

Or maybe it could be done per file (wherever I use two or three log kinds, I usually end up with an inputs.d directory holding a separate .conf for each kind), something like:

- type: common
  some: settings
  which: are applied to all logs from this .conf file

Or something else. Some kind of define-macro/use-macro mechanism (with Go templates inside) could also be nice.

I considered templating externally, but this is troublesome considering people happen to patch filebeat configs manually in their final resting place…

@botelastic

botelastic bot commented Jan 4, 2021

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.

@botelastic botelastic bot added Stalled needs_team Indicates that the issue/PR needs a Team:* label labels Jan 4, 2021
@Mekk
Author

Mekk commented Jan 6, 2021

Bah, but how do I add such a label?

@botelastic botelastic bot removed the Stalled label Jan 6, 2021
@jsoriano jsoriano added the Team:Elastic-Agent Label for the Agent team label Feb 16, 2021
@elasticmachine
Collaborator

Pinging @elastic/agent (Team:Agent)

@botelastic botelastic bot removed the needs_team Indicates that the issue/PR needs a Team:* label label Feb 16, 2021
@jsoriano
Member

Hey @Mekk,

I'd say that a feature like this belongs more in the realm of configuration management. I don't think Beats should include general-purpose templating or anything like it.

But there are already some features that help manage complex configurations:

  • YAML anchors can be leveraged to reuse some parts of the configuration (see for example its use here in the "How to test" section: Add cloudfoundry module to metricbeat #16671)
  • Environment variables can be used in Beats configuration.
  • Variables can be retrieved from keystores.
  • Modules can be imported from different files.
  • Autodiscover can be used in cases where a provider exists (docker, kubernetes...).
  • Agent and Fleet can also help to manage complex configurations from Kibana.
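
As a rough illustration of the first bullet, the repeated block from the original report could be shared through a YAML anchor and merge key. This is only a sketch under assumptions: the top-level key `x-common-log` and the anchor name `common_log` are invented for this example, it assumes Filebeat's YAML parser honors `<<` merge keys and tolerates the unknown top-level key, and the paths, field values, and multiline pattern are placeholders.

```yaml
# Hypothetical holder for the shared settings; the key name is an assumption.
x-common-log: &common_log
  type: log
  enabled: true
  encoding: latin2
  fields_under_root: true
  ignore_older: 48h
  close_inactive: 24h
  scan_frequency: 5s
  multiline.pattern: '^\['   # placeholder pattern, not from the report
  multiline.negate: true
  multiline.match: after
  multiline.timeout: 5s

filebeat.inputs:
  # Each input pulls in the shared keys, then adds what differs.
  - <<: *common_log
    paths:
      - /var/log/app-a/*.log
    fields:
      app: a
  - <<: *common_log
    paths:
      - /var/log/app-b/*.log
    fields:
      app: b
```

Keys written directly in an input take precedence over merged ones, so a single input can still override, say, `encoding` without touching the shared block.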

I am going to close this issue as I don't think that anything more general is going to be implemented for Beats config.

I'd suggest trying the existing options, and/or opening a topic in https://discuss.elastic.co/c/elastic-stack/beats/28 to look for suggestions from the community.

@Mekk
Author

Mekk commented Feb 17, 2021

Most of the suggestions above don't seem related to the problem I describe at all. It's not about consistency of individual params (which could be improved by reusing an env variable or referring to a keystore) and not about configuration management. It's simply that I have long sections (like the multiline pattern) which are always the same but must be repeated. As a result, filebeat configs become long, difficult to read, troublesome to edit when something changes, etc. Of course it can be templated externally. But that makes using Ansible, Fleet, or something similar a necessity, and fixing configs manually impossible.

Maybe YAML anchors could do. I learned to deeply hate them some time ago (the syntax is very confusing, errors, when they happen, are difficult to comprehend, and an „accidental admin” is unlikely to understand the syntax). And they can't merge things (like merging fields).
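
The merging limitation follows from how YAML merge keys behave: `<<` copies top-level keys only, and a key written locally replaces the merged one wholesale. A small generic sketch (anchor names, keys, and values are invented for illustration, not taken from any real config):

```yaml
common_fields: &common_fields
  env: prod
  team: core

common: &common
  scan_frequency: 5s
  fields: *common_fields

inputs:
  # The local `fields` replaces the shared mapping entirely,
  # so this entry ends up with fields == {app: a}.
  - <<: *common
    fields:
      app: a

  # Workaround: merge the nested mapping explicitly, one level down,
  # giving fields == {env: prod, team: core, app: b}.
  - <<: *common
    fields:
      <<: *common_fields
      app: b
```

Even with the workaround, a clash between a shared and a local field is resolved silently in favor of the local value; nothing reports the conflict.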

I think that sth like „log kind / log class”, which would define common features of many logs would be much nicer.

Well…

@jsoriano
Member

@Mekk yes, I only wanted to mention some of the features Beats already has for complex configurations, in case some of them can help you 🙂
Adding more features like general templating would increase the complexity of configs, and I think that most of the use cases for this are better covered by config management tools.

I think that sth like „log kind / log class”, which would define common features of many logs would be much nicer.

This "log class" makes me think that perhaps you could create a custom Filebeat module for your case.

Filebeat modules are implemented with config-like text files, and allow a high level of reusability. They can include fully templatized configurations, for example the config in the Filebeat module for MySQL includes a loop, multiline and some processors:

type: log
paths:
{{ range $i, $path := .paths }}
 - {{$path}}
{{ end }}
exclude_files: [".gz$"]
multiline:
  # Consider lines without timestamp part of the previous message
  pattern: '^([0-9]{4}-[0-9]{2}-[0-9]{2}|[0-9]{6})'
  negate: true
  match: after
processors:
- add_locale: ~
- add_fields:
    target: ''
    fields:
      ecs.version: 1.8.0

Then in the filebeat config you can "instantiate" this "log class" or module several times with something like:

filebeat.modules:
- module: mysql
  error:
    var.paths:
      - /var/log/mysql1/mysql.err*
- module: mysql
  error:
    var.paths:
      - /var/log/mysql2/mysql.err*

There are some docs about creating new modules: https://www.elastic.co/guide/en/beats/devguide/7.11/filebeat-modules-devguide.html
Or you can use some of the existing ones as base examples: https://github.com/elastic/beats/tree/465750c91d4ee476a1b2954c5bb7b81bea90ed58/filebeat/module
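
For readers weighing this route, a fileset manifest for such a "log class" might look roughly like the sketch below. The module name `myapp`, the fileset name `applog`, the `app` variable, and all file paths are hypothetical; the layout mirrors what the bundled modules appear to use (a `manifest.yml` declaring `var` entries plus pointers to an input template and an ingest pipeline), so check the developer guide before relying on any detail.

```yaml
# module/myapp/applog/manifest.yml  (hypothetical module and fileset names)
module_version: "1.0"

var:
  # Overridable per instance via var.paths, as in the MySQL example.
  - name: paths
    default:
      - /var/log/myapp/*.log
  # Hypothetical extra variable for the per-directory label.
  - name: app
    default: unknown

# Paths are relative to the fileset directory; both files are assumed to exist.
ingest_pipeline: ingest/pipeline.yml
input: config/applog.yml
```

The input template (`config/applog.yml` here) would then hold the shared multiline and tuning settings once, and could presumably reference the extra variable as `{{ .app }}`, the same way the MySQL template references `.paths`.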

@Mekk
Author

Mekk commented Feb 17, 2021

That's closer, I will take a look, thank you for the hint.

Still, IMHO the generic problem (I have plenty of logs which differ in some minor detail, like a label, but share a large common set of settings) isn't that rare, and requiring custom templating here raises the bar (not to mention the non-obviousness of that solution). Something like what I suggested above ( #16084 (comment) , maybe better named, say a log class referenced by class: or something like that) would simply be much easier to use … and to find (and could even be naturally documented with „if you need more, migrate to a module”, creating a natural learning and progress path).

@jsoriano
Member

jsoriano commented Feb 17, 2021

Yep, agree that perhaps we should add some documentation about recommendations on how to manage extensive configurations.

Out of curiosity, to better understand your use case, are these logs generated by multiple deployments of some custom application?

@Mekk
Author

Mekk commented Feb 17, 2021

Regarding docs (I was just writing about that): currently they leave a strong impression that modules are an advanced feature mostly intended to be used, not created. Small touches would help a lot: adding a sentence like „If you reuse the same multiline pattern in many cases, you may consider -Creating module- to simplify your config” to the chapter about multiline inputs, some similar remarks, plus an example chapter showing how to make a simple module (say, on this very example: reusing a custom multiline rule).

Regarding my case: I parse numerous logs created by plenty of custom applications built around some common frameworks (so they emit a few consistent log kinds). So I have not-that-trivial multiline patterns, fields which classify log formats/types before logstash processing, and such. And frequently I have 15 directories with logs on some machine which are more or less identical (the format is the same), except each needs a custom field or label to classify the log's project or app (otherwise I'd have one entry with many paths). So I end up with many similar entries.

Think just that: a complex multiline rule, 8 common fields or labels, common tech fields like scan frequency … and the need to add 1-2 specific fields/labels per directory (or sometimes per narrower pattern). And 10 such directories.

@Mekk
Author

Mekk commented Feb 17, 2021

… and I suppose modules are indeed a good answer to my needs, and I am going to try them.

But note that I hadn't even considered using them, despite using filebeat for 2 years and reading the docs many times.

@Mekk
Author

Mekk commented Feb 20, 2023

(note-after-some-time)

I finally wrote „my” modules and they work nicely, so this is indeed the correct way to proceed.

But there is a problem (apart from the documentation remarks above): modules must be installed to /usr/share/filebeat/module (or wherever path.home points), and it is easy to lose those files during upgrades. And even if not, installing my code inside the filebeat directory seems somewhat „dirty”.

It would be nice if filebeat also handled a module directory outside its normal directories (maybe /usr/local/filebeat/module, maybe /etc/filebeat/custom-modules, maybe something else, maybe a --path.custom_modules=… option; simply some place outside the normal filebeat directories). As an additional source, not a replacement (so I can use standard modules and my custom modules simultaneously).


3 participants