-
Notifications
You must be signed in to change notification settings - Fork 4.9k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[Filebeat] Inefficient Grok expression in Postgresql module causes Cluster to Stop Responding #7201
Comments
ph
added a commit
to ph/beats
that referenced
this issue
Jun 5, 2018
The postgresql ingest pipeline was not performing so well. This PR use the following rules to improve the situation. - Anchor the Regular expression at the begining of the string. - Merge the multiple statements into a single RE - Do not use back reference for user/host delimiter. Fixes: elastic#7201
ruflin
pushed a commit
that referenced
this issue
Jun 12, 2018
The postgresql ingest pipeline was not performing so well. This PR use the following rules to improve the situation. - Anchor the Regular expression at the begining of the string. - Merge the multiple statements into a single RE - Do not use back reference for user/host delimiter. Fixes: #7201 ## Note to reviewers: Master branch to ingest 18 documents 26ms, this PR 11 ms. Not I did not use dissect, since there are variations in the tokens.
ph
added a commit
to ph/beats
that referenced
this issue
Jun 15, 2018
The postgresql ingest pipeline was not performing so well. This PR use the following rules to improve the situation. - Anchor the Regular expression at the begining of the string. - Merge the multiple statements into a single RE - Do not use back reference for user/host delimiter. Fixes: elastic#7201 Master branch to ingest 18 documents 26ms, this PR 11 ms. Not I did not use dissect, since there are variations in the tokens. (cherry picked from commit 4a41587)
ruflin
pushed a commit
that referenced
this issue
Jun 18, 2018
The postgresql ingest pipeline was not performing so well. This PR use the following rules to improve the situation. - Anchor the Regular expression at the begining of the string. - Merge the multiple statements into a single RE - Do not use back reference for user/host delimiter. Fixes: #7201 Master branch to ingest 18 documents 26ms, this PR 11 ms. Not I did not use dissect, since there are variations in the tokens. (cherry picked from commit 4a41587)
leweafan
pushed a commit
to leweafan/beats
that referenced
this issue
Apr 28, 2023
The postgresql ingest pipeline was not performing so well. This PR use the following rules to improve the situation. - Anchor the Regular expression at the begining of the string. - Merge the multiple statements into a single RE - Do not use back reference for user/host delimiter. Fixes: elastic#7201 Master branch to ingest 18 documents 26ms, this PR 11 ms. Not I did not use dissect, since there are variations in the tokens. (cherry picked from commit e801aaa)
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
The postgresql module for filebeat contains a grok expression for parsing the log lines https://github.com/elastic/beats/blob/master/filebeat/module/postgresql/log/ingest/pipeline.json#L8-L13
This expression works but is extremely inefficient. Even on moderate loads (100 docs/sec) the indexing latency increases and causes the ES queue to fill - thus preventing any indexing on the cluster. ES hot_threads highlight the hot spot:
Testing indicates the https://github.com/elastic/beats/blob/master/filebeat/module/postgresql/log/ingest/pipeline.json#L18
is responsible.
On Postgresql 9.6.2 this isn't actually necessary and the following is sufficient.
I believe the above can be further refactored - possibly to a single expression. @ph for further action
The text was updated successfully, but these errors were encountered: