Init support for filebeat multiline #570

Merged
merged 1 commit into from
Dec 29, 2015
1 change: 1 addition & 0 deletions CHANGELOG.asciidoc
@@ -42,6 +42,7 @@ https://github.com/elastic/beats/compare/1.0.0...master[Check the HEAD diff]
- group all cpu usage per core statistics and export them optionally if cpu_per_core is configured {pull}496[496]

*Filebeat*
- Add multiline support for combining multiple related lines into one event. {issue}461[461]

*Winlogbeat*

17 changes: 13 additions & 4 deletions filebeat/config/config.go
@@ -67,9 +67,19 @@ type HarvesterConfig struct {
BackoffFactor int `yaml:"backoff_factor"`
MaxBackoff string `yaml:"max_backoff"`
MaxBackoffDuration time.Duration
ForceCloseFiles bool `yaml:"force_close_files"`
ExcludeLines []string `yaml:"exclude_lines"`
IncludeLines []string `yaml:"include_lines"`
MaxBytes *int `yaml:"max_bytes"`
Multiline *MultilineConfig `yaml:"multiline"`
}

type MultilineConfig struct {
Pattern string `yaml:"pattern"`
Negate bool `yaml:"negate"`
Match string `yaml:"match"`
MaxLines *int `yaml:"max_lines"`
Timeout string `yaml:"timeout"`
}
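For illustration, a prospector entry exercising these new fields might look like the following sketch (field names follow the yaml tags in the struct above; paths and values are made-up examples, not defaults from this PR):

```yaml
filebeat:
  prospectors:
    - paths:
        - /var/log/app/*.log
      # drop everything after 10 MB of a single event
      max_bytes: 10485760
      multiline:
        pattern: '^\['
        negate: true
        match: after
        max_lines: 500
        timeout: 5s
```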

const (
@@ -157,5 +167,4 @@ func (config *Config) FetchConfigs() {
if len(config.Filebeat.Prospectors) == 0 {
log.Fatalf("No paths given. What files do you want me to watch?")
}

}
4 changes: 3 additions & 1 deletion filebeat/crawler/prospector.go
@@ -152,6 +152,8 @@ func (p *Prospector) logRun(spoolChan chan *input.FileEvent) {
// Seed last scan time
p.lastscan = time.Now()

logp.Debug("prospector", "exclude_files: %s", p.ProspectorConfig.ExcludeFiles)

// Now let's do one quick scan to pick up new files
for _, path := range p.ProspectorConfig.Paths {
p.scan(path, spoolChan)
@@ -242,7 +244,7 @@ func (p *Prospector) isFileExcluded(file string) bool {
func (p *Prospector) scan(path string, output chan *input.FileEvent) {

logp.Debug("prospector", "scan path %s", path)
logp.Debug("prospector", "exclude_files: %s", p.ProspectorConfig.ExcludeFiles)

// Evaluate the path as a wildcards/shell glob
matches, err := filepath.Glob(path)
if err != nil {
47 changes: 46 additions & 1 deletion filebeat/docs/configuration.asciidoc
@@ -93,7 +93,7 @@ If both `include_lines` and `exclude_lines` are defined, then include_lines is c

===== exclude_files

A list of regular expressions to match the files to be ignored. By default no file is excluded.

[source,yaml]
-------------------------------------------------------------------------------------
@@ -148,6 +148,51 @@ document. The default value is `log`.

The buffer size every harvester uses when fetching the file. The default is 16384.

===== max_bytes

Maximum number of bytes a single log event can have. All bytes after max_bytes are discarded and not sent.
This is especially useful for multiline log messages which can get large. The default is 10MB (10485760).
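As a rough sketch of the discard behaviour described here (an illustration, not Filebeat's code; the function name is made up):

```go
package main

import "fmt"

// truncateEvent drops every byte after maxBytes, mirroring the documented
// max_bytes behaviour: the tail of an oversized event is discarded, not sent.
func truncateEvent(event []byte, maxBytes int) []byte {
	if len(event) > maxBytes {
		return event[:maxBytes]
	}
	return event
}

func main() {
	event := []byte("a long multiline stack trace ...")
	fmt.Println(len(truncateEvent(event, 10))) // prints 10
}
```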

===== multiline

Multiline can be used for log messages spanning multiple lines. This is common for Java stack traces
or C line continuations. In the following example, all lines that do not start with a `[` are appended
to the preceding line that does, as in a Java stack trace.

[source,yaml]
-------------------------------------------------------------------------------------
multiline:
pattern: ^\[
match: after
negate: true
-------------------------------------------------------------------------------------

====== pattern

The regexp pattern that has to be matched. The example pattern matches all lines starting with `[`.

====== negate

Defines whether the pattern should be negated. Default is false.

====== match

[Review comment from a maintainer] match is required, right?

[Author] Yes, match and pattern are both required.

Match must be set to "after" or "before". It defines whether consecutive lines are appended after
or prepended before the line that, depending on negate, matched or did not match the pattern.

NOTE: `after` is equivalent to `previous` and `before` is equivalent to `next` in https://www.elastic.co/guide/en/logstash/current/plugins-codecs-multiline.html[Logstash].

====== max_lines

The maximum number of lines that are combined into one event. If there are more than max_lines,
the additional lines are discarded. Default is 500.

====== timeout

After the defined timeout, a multiline event is sent even if no new line was found that starts a
new event. Default is 5s.
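The interaction of pattern, negate, and match described above can be sketched in Go for the `match: after` case (an illustrative reimplementation, not Filebeat's actual code; max_lines and timeout handling are omitted, and the function name is made up):

```go
package main

import (
	"fmt"
	"regexp"
	"strings"
)

// combineAfter sketches the multiline semantics for match: after. A line
// starts a new event when its pattern match, adjusted by negate, says so;
// every other line is appended to the event that is currently open.
func combineAfter(lines []string, pattern string, negate bool) []string {
	re := regexp.MustCompile(pattern)
	var events []string
	var current []string

	for _, line := range lines {
		matched := re.MatchString(line)
		// With negate: false a matching line continues the open event;
		// with negate: true a matching line starts a new one.
		startsNew := matched == negate
		if startsNew && len(current) > 0 {
			events = append(events, strings.Join(current, "\n"))
			current = nil
		}
		current = append(current, line)
	}
	if len(current) > 0 {
		events = append(events, strings.Join(current, "\n"))
	}
	return events
}

func main() {
	lines := []string{
		"[2015-12-29 10:00:00] exception thrown",
		"  at com.example.Foo.bar(Foo.java:42)",
		"[2015-12-29 10:00:01] next event",
	}
	events := combineAfter(lines, `^\[`, true)
	fmt.Println(len(events)) // prints 2
}
```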


===== tail_files

33 changes: 31 additions & 2 deletions filebeat/etc/beat.yml
@@ -32,11 +32,11 @@ filebeat:

# Exclude lines. A list of regular expressions to match. It drops the lines that are
# matching any regular expression from the list. The include_lines is called before
# exclude_lines. By default, no lines are dropped.
# exclude_lines: ["^DBG"]

# Include lines. A list of regular expressions to match. It exports the lines that are
# matching any regular expression from the list. The include_lines is called before
# exclude_lines. By default, all the lines are exported.
# include_lines: ["^ERR", "^WARN"]
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not related, but we should remove here and for exclude_lines the space in front.


@@ -73,6 +73,35 @@ filebeat:
# Defines the buffer size every harvester uses when fetching the file
#harvester_buffer_size: 16384

# Maximum number of bytes a single log event can have
# All bytes after max_bytes are discarded and not sent. The default is 10MB.
# This is especially useful for multiline log messages which can get large.
#max_bytes: 10485760

# Multiline can be used for log messages spanning multiple lines. This is common
# for Java stack traces or C line continuations
#multiline:

# The regexp pattern that has to be matched. The example pattern matches all lines starting with [
#pattern: ^\[

# Defines if the pattern set under pattern should be negated or not. Default is false.
#negate: false

# Match can be set to "after" or "before". It is used to define if lines should be appended to a pattern
# that was (not) matched before or after, or as long as a pattern is not matched based on negate.
# Note: after is equivalent to previous and before is equivalent to next in Logstash
#match: after

# The maximum number of lines that are combined into one event.
# In case there are more than max_lines, the additional lines are discarded.
# Default is 500
#max_lines: 500

# After the defined timeout, a multiline event is sent even if no new pattern was found to start a new event
# Default is 5s.
#timeout: 5s

# Setting tail_files to true means filebeat starts reading new files at the end
# instead of the beginning. If this is used in combination with log rotation
# this can mean that the first entries of a new file are skipped.
33 changes: 31 additions & 2 deletions filebeat/etc/filebeat.yml
@@ -32,11 +32,11 @@ filebeat:

# Exclude lines. A list of regular expressions to match. It drops the lines that are
# matching any regular expression from the list. The include_lines is called before
# exclude_lines. By default, no lines are dropped.
# exclude_lines: ["^DBG"]

# Include lines. A list of regular expressions to match. It exports the lines that are
# matching any regular expression from the list. The include_lines is called before
# exclude_lines. By default, all the lines are exported.
# include_lines: ["^ERR", "^WARN"]

@@ -73,6 +73,35 @@ filebeat:
# Defines the buffer size every harvester uses when fetching the file
#harvester_buffer_size: 16384

# Maximum number of bytes a single log event can have
# All bytes after max_bytes are discarded and not sent. The default is 10MB.
# This is especially useful for multiline log messages which can get large.
#max_bytes: 10485760

# Multiline can be used for log messages spanning multiple lines. This is common
# for Java stack traces or C line continuations
#multiline:

# The regexp pattern that has to be matched. The example pattern matches all lines starting with [
#pattern: ^\[

# Defines if the pattern set under pattern should be negated or not. Default is false.
#negate: false

# Match can be set to "after" or "before". It is used to define if lines should be appended to a pattern
# that was (not) matched before or after, or as long as a pattern is not matched based on negate.
# Note: after is equivalent to previous and before is equivalent to next in Logstash
#match: after

# The maximum number of lines that are combined into one event.
# In case there are more than max_lines, the additional lines are discarded.
# Default is 500
#max_lines: 500

# After the defined timeout, a multiline event is sent even if no new pattern was found to start a new event
# Default is 5s.
#timeout: 5s

# Setting tail_files to true means filebeat starts reading new files at the end
# instead of the beginning. If this is used in combination with log rotation
# this can mean that the first entries of a new file are skipped.
2 changes: 0 additions & 2 deletions filebeat/harvester/harvester.go
@@ -17,7 +17,6 @@ import (
"io"
"os"
"regexp"
"time"

"github.com/elastic/beats/filebeat/config"
"github.com/elastic/beats/filebeat/harvester/encoding"
@@ -33,7 +32,6 @@ type Harvester struct {
SpoolerChan chan *input.FileEvent
encoding encoding.EncodingFactory
file FileSource /* the file being watched */
backoff time.Duration
ExcludeLinesRegexp []*regexp.Regexp
IncludeLinesRegexp []*regexp.Regexp
}