FURUHASHI Sadayuki edited this page Nov 26, 2013 · 32 revisions

Filter/Copy syntax and nested match

We have several candidate ideas and need feedback! See discussion.

## Ideas
  • Idea 1: Add <filter> syntax and filter_ plugins.
  • Idea 2: No filter_ plugins (keep using out_* plugins). Users nest <match> sections to filter/convert records. <source> can't have nested <match> sections.
  • Idea 3: Similar to 2, but <source> outputs records ONLY to its nested <match> sections (<source> must have nested <match> sections).
  • Idea 4: Similar to 2, but <source> can't be inside <label> (it must be top-level). <source> and <match> can have a to_label attribute.
  • Idea 5: <source> is not allowed to have nested matches. Multiple <match> sections at the same nesting level each always receive a copy of the events. A match can emit events only to its nested child matches. The configuration syntax is also extended for labels.
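One practical difference between the ideas is whether sibling <match> sections compete for an event or each receive a copy. The following is a minimal Python sketch of those two readings, not Fluentd code. It assumes that plain sibling <match> sections keep the existing first-match-wins behaviour (which is why ideas 1 to 4 need an explicit copy), and that idea 5 delivers the event to every matching sibling. The function names and the rule representation are hypothetical.

```python
def route_first_match(tag, rules):
    """First-match-wins: only the first rule whose pattern matches gets the event."""
    for matches, output in rules:
        if matches(tag):
            return [output]
    return []


def route_copy_all(tag, rules):
    """Idea 5 reading (assumption): every matching sibling rule gets a copy."""
    return [output for matches, output in rules if matches(tag)]


def match_all(tag):
    # Stand-in for the "**" pattern: accepts every tag.
    return True


# Two sibling <match **> sections, as in Example A.
rules = [
    (match_all, "groupcounter -> mongodb"),
    (match_all, "s3"),
]

print(route_first_match("app.access", rules))  # ['groupcounter -> mongodb']
print(route_copy_all("app.access", rules))     # ['groupcounter -> mongodb', 's3']
```

Under the first reading the archive output never sees the event unless something copies it explicitly. Under the second reading both outputs receive it without any extra syntax.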

Feel free to add more examples to this page!

## Example A

Scenario:

  • Read data from TCP
  • Streaming aggregation to MongoDB
  • Archive to S3

Idea 1

<source>
  type forward
</source>

<filter **>
  type copy

  # real-time aggregation
  <filter>
    type groupcounter
  </filter>
 
  <match>
    type mongodb
  </match>
</filter>

# archive
<match **>
  type s3
</match>

Idea 2

<source>
  type forward
</source>

# real-time aggregation
<copy **>
  type groupcounter
  <match>
    type mongodb
  </match>
</copy>

# archive
<match **>
  type s3
</match>

Idea 3

<source>
  type forward

  # real-time aggregation
  <copy **>
    type groupcounter
    <match>
      type mongodb
    </match>
  </copy>

  # archive
  <match **>
    type s3
  </match>
</source>

Idea 4

(Same as idea 2)

Idea 5

<source => @aggregate_and_archive>
  type forward
</source>

<label @aggregate_and_archive>
  # real-time aggregation
  <match **>
    type groupcounter
    <match>
      type mongodb
    </match>
  </match>

  # archive
  <match **>
    type s3
  </match>
</label>

## Example B

Scenario:

  • Read from TCP and do streaming aggregation
  • Read from HTTP
  • Archive all logs to S3

Idea 1

<source>
  type forward
  # only the forward source copies to groupcounter + forward

  <filter>
    type copy

    <filter>
      type groupcounter
    </filter>
 
    <match>
      type forward
    </match>
  </filter>
</source>

<source>
  type http
  # http goes to s3 directly
</source>

# archive
<match>
  type s3
</match>

Idea 2

<label @aggregate_and_archive>
  # forwarded data go to aggregate, then to archive
  <source>
    type forward
    # source can't have nested match
  </source>

  <copy>
    type groupcounter
    <match>
      type forward
    </match>
  </copy>

  <match>
    type redirect
    to_label @archive
  </match>
</label>

<label @archive>
  # http data go to archive directly
  <source>
    type http
  </source>

  <match>
    type s3
  </match>
</label>

Idea 3

<source>
  type forward
  # source must have nested match

  <copy>
    type groupcounter
    <match>
      type forward
    </match>
  </copy>

  <match>
    type redirect
    label archive
  </match>
</source>

<source>
  type http

  <match>
    type redirect
    label archive
  </match>
</source>

<label archive>
  <match>
    type s3
  </match>
</label>

Idea 4

# forwarded data go to aggregate, then to archive
<source>
  type forward
  to_label @aggregate_and_archive
</source>

# http data go to archive directly
<source>
  type http
  to_label @archive
</source>

<label @aggregate_and_archive>
  <copy>
    type groupcounter
    <match>
      type forward
    </match>
    # groupcounter needs to have nested <match> OR to_label
  </copy>

  <match>
    type redirect
    to_label @archive
  </match>
</label>

<label @archive>
  <match>
    type s3
  </match>
</label>

Idea 5

# forwarded data go to aggregate, then to archive
<source => @aggregate>
  type forward
</source>

# http data go to archive directly
<source => @archive>
  type http
</source>

<label @aggregate>
  <match>
    type groupcounter
    <match>
      type forward
    </match>
    # groupcounter needs to have nested <match> OR " => @label"
  </match>

  # sibling matches at the same level always receive a copy
  <match>
    type redirect
    to_label @archive
  </match>
</label>

<label @archive>
  <match>
    type s3
  </match>
</label>

## Example C

Scenario:

  • Read system, app, or other logs from TCP or UNIX socket
  • Do streaming aggregation only on system and app logs
  • Archive all logs to S3
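Every idea below selects the system and app logs with the tag pattern `{syslog,applog}.**`. As a rough illustration of what that pattern is meant to select, here is a minimal Python sketch that translates such a pattern into a regular expression. It is a simplification, not Fluentd's actual matcher: it assumes `*` matches exactly one tag part, `**` matches zero or more tag parts, and `{a,b}` matches either alternative. The function names are hypothetical.

```python
import re


def pattern_to_regex(pattern):
    """Translate a simplified Fluentd-style tag pattern into a compiled regex."""
    out = []
    i = 0
    while i < len(pattern):
        if pattern.startswith(".**", i):
            # ".**" matches zero or more further dot-separated tag parts.
            out.append(r"(?:\.[^.]+)*")
            i += 3
        elif pattern.startswith("**", i):
            # A bare "**" matches anything, including nothing.
            out.append(r".*")
            i += 2
        elif pattern[i] == "*":
            # "*" matches exactly one tag part.
            out.append(r"[^.]+")
            i += 1
        elif pattern[i] == "{":
            # "{a,b}" matches any one of the listed alternatives.
            end = pattern.index("}", i)
            alternatives = pattern[i + 1:end].split(",")
            out.append("(?:" + "|".join(re.escape(a) for a in alternatives) + ")")
            i = end + 1
        else:
            out.append(re.escape(pattern[i]))
            i += 1
    return re.compile("".join(out) + r"\Z")


def tag_matches(pattern, tag):
    """Return True if the tag is selected by the pattern."""
    return pattern_to_regex(pattern).match(tag) is not None


for tag in ["syslog.auth", "applog", "nginx.access"]:
    print(tag, tag_matches("{syslog,applog}.**", tag))
```

With this reading, `syslog.auth` and `applog` are sent to aggregation, while `nginx.access` is only picked up by the catch-all archive match.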

Idea 1

<source>
  type forward
</source>

<source>
  type unix
</source>

<filter {syslog,applog}.**>
  type copy

  <filter>
    type groupcounter
  </filter>
  <match>
    type forward
  </match>
</filter>

<match>
  type s3
</match>

Idea 2

<source>
  type forward
</source>

<source>
  type unix
</source>

<copy {syslog,applog}.**>
  <match>
    type groupcounter
    <match>
      type forward
    </match>
  </match>
</copy>

<match>
  type s3
</match>

Idea 3

<source>
  type forward
  
  <match>
    type redirect
    label aggregate
  </match>
</source>

<source>
  type unix

  <match>
    type redirect
    label aggregate
  </match>
</source>

<label aggregate>
  <copy {syslog,applog}.**>
    <match>
      type groupcounter
      <match>
        type forward
      </match>
    </match>
  </copy>

  <match>
    type s3
  </match>
</label>

Idea 4

<source>
  type forward
</source>

<source>
  type unix
</source>

<copy {syslog,applog}.**>
  <match>
    type groupcounter
    <match>
      type forward
    </match>
  </match>
</copy>

<match>
  type s3
</match>

Idea 4 with "route" built-in plugin

<source>
  type forward
</source>

<source>
  type unix
</source>

<match>
  type route

  <copy {syslog,applog}.**>
    to_label @aggregate
  </copy>

  <match>
    to_label @archive
  </match>
</match>

<label @aggregate>
  <match>
    type groupcounter
    <match>
      type forward
    </match>
  </match>
</label>

<label @archive>
  <match>
    type s3
  </match>
</label>

Idea 5

# forwarded data are routed by their tags
<source => @split>
  type forward
</source>

# data passed from the unix socket are routed by their tags
<source => @split>
  type unix
</source>

<label @split>
  <match>
    type redirect
    <case {syslog,applog}.** => @aggregate />
    <else => @archive />
  </match>
</label>

<label @aggregate>
  <match>
    type groupcounter
    <match>
      type forward
    </match>
    # groupcounter needs to have nested <match> OR " => @label"
  </match>

  # sibling matches at the same level always receive a copy
  <match>
    type redirect
    to_label @archive
  </match>
</label>

<label @archive>
  <match>
    type s3
  </match>
</label>

Idea 5 (shorter configuration: same rule)

# forwarded data go to aggregate, then to archive
<source => @out>
  type forward
</source>

# data passed from the unix socket are routed by their tags
<source => @out>
  type unix
</source>

<label @out>
  <match {syslog,applog}.**>
    type groupcounter
    <match>
      type forward
    </match>
    # groupcounter needs to have nested <match> OR " => @label"
  </match>
  # sibling matches at the same level always receive a copy
  <match **>
    type s3
  </match>
</label>