Improve documentation of Logstash pipelines vs Filebeat ingest nodes #9095

NiceGuyIT · 2018-02-05T15:22:31Z

I asked about the difference between "filebeat setup" and "filebeat --setup", which led to the realization that once you add logstash to the equation, the setup and maintenance of filebeat ingest nodes becomes significantly harder. I was asked to open an issue against both repos to improve the documentation.

Background

Filebeat ingest nodes in ES work great when filebeat sends data to an Elasticsearch node. Once logstash is introduced to the environment, the ingest nodes are no longer used. The documentation hints that ingest nodes or logstash can be used to process data but doesn't explain how to use both or the consequences of implicitly not using ingest nodes when logstash is used.

Logstash without filebeat ingest nodes

The usual setup is to not use the ingest nodes created by filebeat. Here are the steps needed to achieve the same processing.

Run filebeat setup to set up the environment in ES and Kibana. This loads the index templates, Kibana dashboards and ML jobs. This doesn't have to be done on the same server as logstash.
Use the ingest converter to convert the ingest node to a logstash pipeline. The ingest node can be found in several locations.
1. Elasticsearch - Enable the filebeat module, run filebeat --setup with filebeat connected to ES and not logstash, wait for filebeat to create the ingest node by sending events to ES, then "GET /_ingest/pipeline/" in Kibana Console to see and download the ingest node.
2. Filesystem - The ingest JSON ships with the filebeat modules, under /usr/share/filebeat/module on openSUSE (RPM based). This option implies filebeat is installed on the same server as logstash.
3. Online - Download the latest JSON from the beats repo.
4. Examples - Download the configuration examples. Being examples, they might not work as expected.
Modify the logstash pipeline as necessary.
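
To make option 1 above concrete, here is a minimal Python sketch of pulling the ingest definitions out of Elasticsearch so they can be fed to the ingest converter. It assumes ES is reachable at http://localhost:9200 without authentication and that the filebeat-created IDs start with "filebeat-"; the function names are made up for illustration.

```python
import json
import urllib.request


def filebeat_pipelines(response: dict) -> dict:
    """Keep only the entries whose ID starts with 'filebeat-'.

    `response` is the parsed body of GET /_ingest/pipeline, which is a
    JSON object keyed by pipeline ID.
    """
    return {pid: body for pid, body in response.items()
            if pid.startswith("filebeat-")}


def fetch_pipelines(es_url: str = "http://localhost:9200") -> dict:
    """Call GET /_ingest/pipeline (URL is an assumption, adjust as needed)."""
    with urllib.request.urlopen(f"{es_url}/_ingest/pipeline") as resp:
        return json.load(resp)


def save_pipelines(pipelines: dict) -> None:
    """Write each definition to <id>.json in the current directory."""
    for pid, body in pipelines.items():
        with open(f"{pid}.json", "w") as fh:
            json.dump(body, fh, indent=2)
```

Something like save_pipelines(filebeat_pipelines(fetch_pipelines())) would then leave one JSON file per ingest definition, ready to be converted.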

Logstash with filebeat ingest nodes

Using the filebeat ingest nodes with logstash is significantly harder because you need to match each incoming message with the correct ingest node. With multiple message types and multiple ingest nodes, the number of conditionals to maintain grows quickly.

Run filebeat --setup to set up the environment in ES and Kibana. This loads the index templates, Kibana dashboards and ML jobs. Wait for filebeat to create the ingest node by sending events to ES, then kill filebeat. This doesn't have to be done on the same server as logstash.
Use the pipeline option in the elasticsearch output block to specify the ingest node to use.
1. pipeline is a string and can accept only 1 value.
2. The elasticsearch block does not accept conditionals.
3. Due to items 1 and 2, conditionals need to be used to match each message to exactly 1 elasticsearch block that specifies the pipeline to use. If you have 5 ingest nodes, you'll need 5 elasticsearch blocks, one for each ingest node, and possibly a 6th one as a default.
4. By default, the filebeat ingest node name contains the beat name, beat version, module name and some other identifiers. For example, "filebeat-6.1.2-nginx-access-default". An environment with multiple filebeat versions means even more conditionals to match the message to the ingest node.
5. If the pipeline doesn't exist, an error is thrown and the message is not added to ES.
6. Variables can be used to ease the configuration but if something goes wrong, an error is logged and the message is thrown away.
7. The elasticsearch filter plugin can query ES for the pipeline and possibly make the above bearable. The downside is that you're querying ES for every message.
The filebeat version changes upon upgrade.
1. If you don't run filebeat --setup to update the ingest node, the configuration mentioned above cannot rely on the filebeat version.
2. If you run filebeat --setup, the configuration will need to be updated to match the version. This should be done for every major version to make sure the ingest node matches the filebeat version.
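
To show how quickly this grows, here is a rough Python sketch that generates the logstash output section described above: one elasticsearch block per ingest node, wrapped in conditionals, plus a default block. It rests on several assumptions: that events carry [fileset][module] and [fileset][name] fields to match on, that ES is at localhost:9200, and that the ID follows the pattern of the one example given above. The last segment ("default") in particular may differ per fileset, so check "GET /_ingest/pipeline/" for the real IDs. The function names are made up.

```python
def pipeline_id(version: str, module: str, fileset: str,
                suffix: str = "default") -> str:
    # Pattern inferred from the single example
    # "filebeat-6.1.2-nginx-access-default"; the suffix is an assumption.
    return f"filebeat-{version}-{module}-{fileset}-{suffix}"


def render_output(pipelines: dict, hosts: str = '["localhost:9200"]') -> str:
    """Render a logstash output section.

    `pipelines` maps (module, fileset) tuples to ingest pipeline IDs.
    Each entry becomes one conditional with its own elasticsearch block;
    a final block without a pipeline acts as the default.
    """
    lines = ["output {"]
    for i, ((module, fileset), pid) in enumerate(pipelines.items()):
        keyword = "if" if i == 0 else "} else if"
        lines.append(f'  {keyword} [fileset][module] == "{module}" '
                     f'and [fileset][name] == "{fileset}" {{')
        lines.append("    elasticsearch {")
        lines.append(f"      hosts => {hosts}")
        lines.append(f'      pipeline => "{pid}"')
        lines.append("    }")
    if pipelines:
        lines.append("  } else {")
    lines += ["    elasticsearch {", f"      hosts => {hosts}", "    }"]
    if pipelines:
        lines.append("  }")
    lines.append("}")
    return "\n".join(lines)


if __name__ == "__main__":
    version = "6.1.2"  # has to be bumped by hand after every filebeat upgrade
    print(render_output({
        ("nginx", "access"): pipeline_id(version, "nginx", "access"),
    }))
```

Every additional fileset adds another branch, and every additional filebeat version in the environment multiplies the branches again, since the version is baked into the ID.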

Conclusion

Using filebeat modules without logstash is a breeze. Using filebeat with logstash requires additional setup, but the documentation does not explain what that setup is. The logstash documentation has a section on working with Filebeat Modules but doesn't elaborate on how or why the examples are important. As a user, it was very frustrating trying to understand why the ingest nodes weren't working when I was using Logstash.

Versions

This issue is specific to filebeat since the other beats do not use ingest pipelines.

openSUSE Leap 42.3
Filebeat 6.1.2
Logstash 6.1.2
Elasticsearch 6.1.2

NiceGuyIT · 2018-02-05T15:25:20Z

Corresponding Filebeat issue: elastic/beats#6280

dedemorton · 2019-02-21T00:13:09Z

Closed by #10438 and elastic/beats#10859

NiceGuyIT mentioned this issue Feb 5, 2018

Improve documentation of Logstash pipelines vs Filebeat ingest nodes elastic/beats#6280

Closed

andrewvc added the docs label Apr 9, 2018

andrewvc assigned andrewvc and karenzone and unassigned andrewvc Apr 9, 2018

dedemorton closed this as completed Feb 21, 2019
