
[Filebeat] Inefficient Grok expression in Postgresql module causes Cluster to Stop Responding #7201

Closed
gingerwizard opened this issue May 29, 2018 · 0 comments

  • Version: 6.2.4
  • Operating System: Linux (Kubernetes on GCP)
  • PostgreSQL: 9.6.2

The postgresql module for Filebeat contains a grok expression for parsing log lines: https://github.com/elastic/beats/blob/master/filebeat/module/postgresql/log/ingest/pipeline.json#L8-L13

This expression works but is extremely inefficient. Even under moderate load (100 docs/sec), indexing latency increases and the ES bulk queue fills, preventing any further indexing on the cluster. ES hot_threads highlights the hot spot:

97.4% (486.9ms out of 500ms) cpu usage by thread 'elasticsearch[instance-0000000005][bulk][T#1]'
     3/10 snapshots sharing following 16 elements
       org.joni.ByteCodeMachine.matchAt(ByteCodeMachine.java:203)
       org.joni.Matcher.matchCheck(Matcher.java:304)
       org.joni.Matcher.searchInterruptible(Matcher.java:457)
       org.joni.Matcher.search(Matcher.java:318)
       org.elasticsearch.ingest.common.Grok.captures(Grok.java:206)
       org.elasticsearch.ingest.common.GrokProcessor.execute(GrokProcessor.java:65)
       org.elasticsearch.ingest.CompoundProcessor.execute(CompoundProcessor.java:100)
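The stack trace shows the time going into the joni regex matcher inside the grok processor. A minimal sketch of why this hurts (a simplified model, not the actual Elasticsearch implementation): grok tries each pattern in order, and an unanchored pattern that fails is effectively attempted at every start offset in the line before the next pattern gets a chance.

```python
import re

# Simplified model of a multi-pattern grok processor. The patterns here
# are illustrative stand-ins, not the module's real grok expressions.
patterns = [
    re.compile(r"duration: (\d+\.\d+) ms"),  # only matches slow-query lines
    re.compile(r"(LOG|ERROR|FATAL):"),       # matches most log lines
]

def first_match(line):
    for p in patterns:
        m = p.search(line)  # unanchored: scans the whole line on failure
        if m:
            return m.group(0)
    return None

line = "2018-05-29 10:00:00 UTC [123] LOG:  checkpoint starting"
print(first_match(line))  # -> LOG:
```

For an ordinary log line, every slow-query pattern listed before the matching one is scanned to exhaustion first, which is where the per-document CPU cost accumulates.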

Testing indicates that the pattern at https://github.com/elastic/beats/blob/master/filebeat/module/postgresql/log/ingest/pipeline.json#L18 is responsible.

On PostgreSQL 9.6.2 this isn't actually necessary, and the following is sufficient:

{
  "grok": {
    "field": "message",
    "ignore_missing": true,
    "patterns": [
      "^%{LOCALDATETIME:postgresql.log.timestamp} %{WORD:postgresql.log.timezone} \\[%{NUMBER:postgresql.log.thread_id}\\] %{USERNAME:postgresql.log.user}@%{HOSTNAME:postgresql.log.database} %{WORD:postgresql.log.level}:  duration: %{NUMBER:postgresql.log.duration} ms  statement: %{GREEDYDATA:postgresql.log.query}",
      "^%{LOCALDATETIME:postgresql.log.timestamp} %{WORD:postgresql.log.timezone} \\[%{NUMBER:postgresql.log.thread_id}\\] \\[%{USERNAME:postgresql.log.user}\\]@\\[%{HOSTNAME:postgresql.log.database}\\] %{WORD:postgresql.log.level}:  duration: %{NUMBER:postgresql.log.duration} ms  statement: %{GREEDYDATA:postgresql.log.query}",
      "^%{LOCALDATETIME:postgresql.log.timestamp} %{WORD:postgresql.log.timezone} \\[%{NUMBER:postgresql.log.thread_id}\\] %{USERNAME:postgresql.log.user}@%{HOSTNAME:postgresql.log.database} %{WORD:postgresql.log.level}:  ?%{GREEDYDATA:postgresql.log.message}",
      "^%{LOCALDATETIME:postgresql.log.timestamp} %{WORD:postgresql.log.timezone} \\[%{NUMBER:postgresql.log.thread_id}\\] \\[%{USERNAME:postgresql.log.user}\\]@\\[%{HOSTNAME:postgresql.log.database}\\] %{WORD:postgresql.log.level}:  ?%{GREEDYDATA:postgresql.log.message}",
      "^%{LOCALDATETIME:postgresql.log.timestamp} %{WORD:postgresql.log.timezone} \\[%{NUMBER:postgresql.log.thread_id}\\] %{WORD:postgresql.log.level}:  ?%{GREEDYDATA:postgresql.log.message}"
    ],
    "pattern_definitions": {
      "LOCALDATETIME": "[-0-9]+ %{TIME}"
    }
  }
}
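To check the shape of these patterns, here is a rough Python translation of the first one against a sample 9.6-style log line. The sub-patterns (`LOCALDATETIME`, user, database, etc.) are simplified stand-ins for the grok macro definitions, and the sample line is illustrative, not taken from the affected cluster.

```python
import re

# Simplified stand-in for the LOCALDATETIME grok definition "[-0-9]+ %{TIME}"
LOCALDATETIME = r"[-0-9]+ \d{2}:\d{2}:\d{2}(?:\.\d+)?"

# Rough translation of the first proposed grok pattern (slow-query form)
PATTERN = re.compile(
    rf"^(?P<timestamp>{LOCALDATETIME}) (?P<timezone>\w+) "
    r"\[(?P<thread_id>\d+)\] (?P<user>[\w.-]+)@(?P<database>[\w.-]+) "
    r"(?P<level>\w+):  duration: (?P<duration>[\d.]+) ms  "
    r"statement: (?P<query>.*)"
)

line = ("2017-04-03 22:32:14.322 CEST [4976] postgres@postgres "
        "LOG:  duration: 37.118 ms  statement: SELECT * FROM users;")
m = PATTERN.match(line)
print(m.group("duration"), m.group("query"))
```

Because the pattern is anchored with `^` and uses `re.match`, a non-matching line fails immediately at the first token instead of being rescanned from every offset.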

I believe the above can be further refactored, possibly into a single expression. @ph for further action

@ph ph added Filebeat Filebeat module labels May 29, 2018
@ph ph self-assigned this May 29, 2018
@ph ph added the bug label May 29, 2018
ph added a commit to ph/beats that referenced this issue Jun 5, 2018
The postgresql ingest pipeline was not performing well.
This PR uses the following rules to improve the situation:

- Anchor the regular expression at the beginning of the string.
- Merge the multiple statements into a single regular expression.
- Do not use a back reference for the user/host delimiter.

Fixes: elastic#7201
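The three rules in the commit can be sketched as follows (an assumed illustration of the approach, not the shipped pipeline): one anchored expression, with the bracketed `[user]@[db]` and plain `user@db` forms merged via optional bracket literals rather than a back reference or separate patterns.

```python
import re

# Hypothetical merged, anchored pattern illustrating the commit's rules.
# Optional \[? ... \]? literals cover both "user@db" and "[user]@[db]"
# without a back reference; the whole user@db segment is optional so
# system messages without it still match.
MERGED = re.compile(
    r"^(?P<ts>[-0-9]+ [\d:.]+) (?P<tz>\w+) \[(?P<tid>\d+)\] "
    r"(?:\[?(?P<user>[\w.-]+)\]?@\[?(?P<db>[\w.-]+)\]? )?"
    r"(?P<level>\w+):"
)

for line in [
    "2017-04-03 22:32:14.322 CEST [4976] postgres@postgres LOG:  ...",
    "2017-04-03 22:32:14.322 CEST [4976] [unknown]@[unknown] LOG:  ...",
    "2017-04-03 22:32:14.322 CEST [4976] LOG:  checkpoint starting",
]:
    m = MERGED.match(line)
    print(m.group("user"), m.group("level"))
```

A single anchored expression means one pass over each line; the five-pattern list above could require up to five scans per document.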
ruflin pushed a commit that referenced this issue Jun 12, 2018
The same pipeline fix as above.

## Note to reviewers:

On the master branch, ingesting 18 documents takes 26 ms; with this PR, 11 ms.
Note: I did not use dissect, since there are variations in the tokens.
ph added a commit to ph/beats that referenced this issue Jun 15, 2018
(cherry picked from commit 4a41587)
ruflin pushed a commit that referenced this issue Jun 18, 2018
(cherry picked from commit 4a41587)
leweafan pushed a commit to leweafan/beats that referenced this issue Apr 28, 2023
(cherry picked from commit e801aaa)