Document load balancing modes in filebeat #2780
Conversation
Let's wait for @urso to review this one.
Some more details on how load balancing works internally: https://github.com/elastic/beats/blob/master/libbeat/outputs/mode/lb/lb.go#L13
[[load-balancing]]
== Configuring Load Balancing

//REVIEWERS: Which outputs support load balancing? I only see the loadbalance option described under Logstash and Redis. Are any other outputs supported?
redis, logstash, and elasticsearch.
Note: Kafka internally uses the same load-balancer module, but only for asynchronous error handling. Load balancing for Kafka is handled internally (a little differently) by the client library used.
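For reference, enabling the option on one of the supported outputs might look like the following sketch (untested; hostnames are placeholders, and the option names follow the 5.x Redis output docs):

```yaml
output.redis:
  # Hypothetical hosts; replace with your Redis endpoints.
  hosts: ["redis1:6379", "redis2:6379"]
  # Distribute published events across the listed hosts.
  loadbalance: true
```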
* **Send events to one host after another:**
+
By default, when you configure Filebeat to send events to multiple hosts
(`load_balance: true`), Filebeat will send the events to one host after
s/load_balance/loadbalance/
plus:
filebeat.publish_async: false
filebeat.spool_size = output.X.bulk_max_size
The host chosen for the next batch is somewhat random.
This is not really load balancing from the client's point of view, but it can still balance out network usage somewhat if multiple Filebeat instances are connected to one cluster.
A better option might be to set `loadbalance: false`, though. In that mode Filebeat picks a Logstash instance at random, and there is a chance that different Filebeat instances choose different Logstash instances in a Logstash cluster, still balancing some of the load.
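The alternative @urso describes might be sketched like this (untested; hostnames are placeholders):

```yaml
# Sketch: with loadbalance: false, Filebeat picks one of the
# listed hosts and fails over to the others only on error.
output.logstash:
  hosts: ["logstash1:5044", "logstash2:5044"]
  loadbalance: false
```

Since each Filebeat instance picks its host independently, a fleet of Filebeat instances tends to spread across a Logstash cluster even without per-client load balancing.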
@urso I'm not sure what you want me to change or add here. I don't think I need to mention `publish_async` until later because the default is `false`. Also, I'm not sure it makes sense to recommend changing `spool_size` until later in this topic. I've changed the intro to say, "Filebeat can send events in a few different modes" since it sounds like the first approach is not truly load balancing (even though the `loadbalance` option is enabled). I'll push my changes in a bit, and maybe you can tell me specifically what I need to add or change to make the section right.
You can configure Filebeat to send events to `N` hosts in lock-step by setting
`spool_size = N * bulk_max_size`. This mode requires more memory and CPU usage
than the previous mode.
+
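The lock-step rule above can be made concrete with a hedged sketch (untested; assumes `N = 2` hosts and the 5.x config layout):

```yaml
# Sketch: lock-step delivery to N = 2 hosts.
# spool_size = N * bulk_max_size = 2 * 2048 = 4096
filebeat.spool_size: 4096
output.logstash:
  hosts: ["localhost:5044", "localhost:5045"]
  loadbalance: true
  bulk_max_size: 2048
```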
Third mode:
Set `filebeat.publish_async: true` and Filebeat will try to push batches and wait for ACKs asynchronously. In this mode, Filebeat tries to fully load-balance the batches collected by the spooler. Logstash instances that are slower may handle fewer batches. This mode should achieve the highest throughput at the cost of the highest memory usage: batches are kept sorted in memory until ACKed by the output. ACKs might arrive out of order, but cleanup is done in order, potentially keeping ACKed batches in memory until older, still-active batches are ACKed.
@urso I'm not sure what you want me to change (vs. what you are offering here as explanation). My inclination is not to talk about how we manage batches in memory because that's an internal implementation detail that users don't really want to know about (and we might want to change it someday). Best not to document it unless users can configure it in some way.
I got a little confused by the asciidoc. I thought the third mode was a separate section and didn't see it initially. My bad.
  hosts: ["localhost:5044", "localhost:5045"]
  loadbalance: true
  bulk_max_size: 2048
-------------------------------------------------------------------------------
The load balancer supports multiple workers per host. By default `worker: 1`, but if the number of workers is increased, more network connections will be in use.
E.g. one can configure a Logstash host with `worker: 2`:

output.logstash:
  hosts: ["localhost:5044"]
  loadbalance: true
  worker: 2
  bulk_max_size: 2048
@urso I'm adding this to the section where you mention it, but I'm wondering if it should go in the first section about enabling `loadbalance`.
Yes, agree it should go in the first section about enabling `loadbalance`. The total number of workers participating in load balancing is `#hosts * worker`.
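The `#hosts * worker` arithmetic can be illustrated with a hedged sketch (untested; hosts are placeholders):

```yaml
# Sketch: 2 hosts * worker: 2 = 4 workers total
# participating in load balancing (4 network connections).
output.logstash:
  hosts: ["localhost:5044", "localhost:5045"]
  loadbalance: true
  worker: 2
```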
//REVIEWERS: Please confirm the accuracy of the config examples here. I'm guessing at the config and haven't tested these examples. It would probably be better to use a more realistic path for /var/log/*.log (please suggest one). I'd like to show the full config because it helps users understand where the settings fit into the overall yaml file.
`/var/log/*.log` is even our default and should be fine on Linux.
//REVIEWERS: I'm a bit confused in the above example because bulk_max_size does not appear under output.logstash in filebeat.full.yml. Is this intentional? It does show up as an option in https://www.elastic.co/guide/en/beats/filebeat/5.0/logstash-output.html, so I assume it's valid, correct?
`bulk_max_size` should be added to libbeat/_meta/config*.yml. There is another ticket for checking for potentially missing output settings. Only the docs have been synced most recently.
- input_type: log
  paths:
    - /var/log/*.log
filebeat.publish_async: false
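Since the reviewer asked for a full config showing where the settings fit, here is a hedged sketch of a complete file for the asynchronous mode (untested; section names follow the 5.x layout):

```yaml
# Sketch of a full filebeat.yml for the async mode.
filebeat.prospectors:
- input_type: log
  paths:
    - /var/log/*.log
filebeat.publish_async: true
output.logstash:
  hosts: ["localhost:5044", "localhost:5045"]
  loadbalance: true
```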
s/false/true

Oh, this is the third mode, hehe.

@urso I forgot to remove the in-progress label for the PR. Is this one good to go?
+
By default, when you configure Filebeat to send events to multiple hosts
(`loadbalance: true`), Filebeat will send the events to one host after
another. This mode requires the least memory and CPU usage.
Maybe add a note about this not really being load balancing?
Added to resolve #1852