Remove document_type from Filebeat #4204

Merged 2 commits on May 9, 2017. Showing changes from 1 commit.

Remove document_type from Filebeat
The `_type` field was removed in Elasticsearch 6.0. The initial intention of `document_type` was to define different `_type` values. As `_type` no longer exists, the config option was removed. It is recommended to use `fields` instead to add specific fields to a prospector.

* Adjust tests accordingly
ruflin committed May 5, 2017
commit 1f3659f0ed37b791418e3e774ed7751d534c4de8
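
The commit message above recommends `fields` as the replacement for `document_type`. A minimal sketch of that migration, assuming the prospector config has already been loaded into a plain Python 3 dict: the helper name `migrate_prospector` and the target key `log_type` are hypothetical and not part of Filebeat.

```python
import copy


def migrate_prospector(prospector, field_name="log_type"):
    """Return a copy of a prospector config with `document_type` moved
    into `fields` under `field_name` (a hypothetical key name)."""
    migrated = copy.deepcopy(prospector)
    doc_type = migrated.pop("document_type", None)
    if doc_type is not None:
        # `fields` may be absent or empty in the original config.
        fields = migrated.get("fields") or {}
        # Keep a value the user already set explicitly.
        fields.setdefault(field_name, doc_type)
        migrated["fields"] = fields
    return migrated


old_prospector = {
    "input_type": "log",
    "paths": ["/var/log/apache2/httpd-*.log"],
    "document_type": "apache",
}
new_prospector = migrate_prospector(old_prospector)
print(new_prospector)
```

Whether the new key should also be placed at the top level of the event depends on the existing `fields_under_root` option, which this sketch leaves untouched.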
1 change: 1 addition & 0 deletions CHANGELOG.asciidoc
@@ -29,6 +29,7 @@ https://github.com/elastic/beats/compare/v5.1.1...master[Check the HEAD diff]
 - Remove deprecated config options force_close_files and close_older. {pull}3768[3768]
 - Change clean_removed behaviour to also remove states for files which cannot be found anymore under the same name. {pull}3827[3827]
 - Add Icinga module. {pull}3904[3904]
+- Remove `document_type` config option. Use `fields` instead. {pull}4204[4204]

 *Heartbeat*
 - Event format and field naming changes in Heartbeat and sample Dashboard. {pull}4091[4091]
5 changes: 0 additions & 5 deletions filebeat/_meta/common.full.p2.yml
@@ -67,11 +67,6 @@ filebeat.prospectors:
   # Time strings like 2h (2 hours), 5m (5 minutes) can be used.
   #ignore_older: 0

-  # Type to be published in the 'type' field. For Elasticsearch output,
-  # the type defines the document type these entries should be stored
-  # in. Default: log
-  #document_type: log
-
   # How often the prospector checks for new files in the paths that are specified
   # for harvesting. Specify 1s to scan the directory as frequently as possible
   # without causing Filebeat to scan too frequently. Default: 10s.
5 changes: 0 additions & 5 deletions filebeat/_meta/fields.common.yml
@@ -23,11 +23,6 @@
       description: >
         The content of the line read from the log file.

-    - name: type
-      required: true
-      description: >
-        The name of the log event. This field is set to the value specified for the `document_type` option in the prospector section of the Filebeat config file.
-
     - name: input_type
       required: true
       description: >
8 changes: 0 additions & 8 deletions filebeat/docs/fields.asciidoc
@@ -767,14 +767,6 @@ required: True
 The content of the line read from the log file.


-[float]
-=== type
-
-required: True
-
-The name of the log event. This field is set to the value specified for the `document_type` option in the prospector section of the Filebeat config file.
-
-
 [float]
 === input_type

13 changes: 3 additions & 10 deletions filebeat/docs/migration.asciidoc
@@ -4,7 +4,7 @@
 [partintro]
 --
 Filebeat is based on the Logstash Forwarder source code and replaces Logstash Forwarder as the method
-to use for tailing log files and forwarding them to Logstash.
+to use for tailing log files and forwarding them to Logstash.

 Filebeat introduces the following major changes:

@@ -139,25 +139,20 @@ filebeat.prospectors:
   paths:
     - /var/log/messages
     - /var/log/*.log
-  document_type: syslog <1>
   fields:
     service: apache
     zone: us-east-1
   fields_under_root: true
 - input_type: stdin <2>
-  document_type: stdin
 - input_type: log
   paths:
     - /var/log/apache2/httpd-*.log
-  document_type: apache
 -------------------------------------------------------------------------------------

-<1> The `document_type` option controls the output `type` field, which is used by the
-Elasticsearch output to determine the document type.
-<2> The explicit `input_type` option was introduced to differentiate between normal files and
+<1> The explicit `input_type` option was introduced to differentiate between normal files and
 stdin. In the future, additional types might be supported.

-As you can see, apart from the new `document_type` and `input_type` options,
+As you can see, apart from the new `input_type` options,
 which were before implicitly defined via the `type` custom field, the remaining
 options can be migrated mechanically.

@@ -287,7 +282,6 @@ filebeat.prospectors:
 - input_type: log
   paths:
     - /var/log/*.log
-  document_type: syslog
   fields:
     service: test01
 output.elasticsearch:
@@ -375,7 +369,6 @@ filebeat.prospectors:
 - input_type: log
   paths:
     - /var/log/*.log
-  document_type: syslog
   fields:
     service: test01
   fields_under_root: true
7 changes: 0 additions & 7 deletions [file path not captured]
@@ -14,7 +14,6 @@ filebeat.prospectors:
 - input_type: log
   paths:
     - /var/log/apache/httpd-*.log
-  document_type: apache

 - input_type: log
   paths:
@@ -303,12 +302,6 @@ If you require log lines to be sent in near real time do not use a very low `sca
 The default setting is 10s.

 [[filebeat-document-type]]
-===== document_type
-
-The event type to use for published lines read by harvesters. For Elasticsearch
-output, the value that you specify here is used to set the `type` field in the output
-document. The default value is `log`.
-
 ===== harvester_buffer_size

 The size in bytes of the buffer that each harvester uses when fetching a file. The default is 16384.
5 changes: 0 additions & 5 deletions filebeat/filebeat.full.yml
@@ -236,11 +236,6 @@ filebeat.prospectors:
   # Time strings like 2h (2 hours), 5m (5 minutes) can be used.
   #ignore_older: 0

-  # Type to be published in the 'type' field. For Elasticsearch output,
-  # the type defines the document type these entries should be stored
-  # in. Default: log
-  #document_type: log
-
   # How often the prospector checks for new files in the paths that are specified
   # for harvesting. Specify 1s to scan the directory as frequently as possible
   # without causing Filebeat to scan too frequently. Default: 10s.
2 changes: 0 additions & 2 deletions filebeat/harvester/config.go
@@ -26,7 +26,6 @@ var (
 		CloseRenamed: false,
 		CloseEOF: false,
 		CloseTimeout: 0,
-		DocumentType: "log",
 		CleanInactive: 0,
 	}
 )
@@ -49,7 +48,6 @@ type harvesterConfig struct {
 	MaxBytes int `config:"max_bytes" validate:"min=0,nonzero"`
 	Multiline *reader.MultilineConfig `config:"multiline"`
 	JSON *reader.JSONConfig `config:"json"`
-	DocumentType string `config:"document_type"`
 	CleanInactive time.Duration `config:"clean_inactive" validate:"min=0"`
 	Pipeline string `config:"pipeline"`
 	Module string `config:"_module_name"` // hidden option to set the module name
1 change: 0 additions & 1 deletion filebeat/harvester/log.go
@@ -150,7 +150,6 @@ func (h *Harvester) Harvest(r reader.Reader) {
 				"@timestamp": common.Time(message.Ts),
 				"source": state.Source,
 				"offset": state.Offset, // Offset here is the offset before the starting char.
-				"type": h.config.DocumentType,
 				"input_type": h.config.InputType,
 			}
 			data.Event.DeepUpdate(message.Fields)
10 changes: 4 additions & 6 deletions filebeat/tests/system/test_json.py
@@ -196,7 +196,6 @@ def test_timestamp_in_message(self):
         output = self.read_output()
         assert len(output) == 5
         assert all(isinstance(o["@timestamp"], basestring) for o in output)
-        assert all(isinstance(o["type"], basestring) for o in output)
         assert output[0]["@timestamp"] == "2016-04-05T18:47:18.444Z"

         assert output[1]["@timestamp"] != "invalid"
@@ -239,14 +238,13 @@ def test_type_in_message(self):
         output = self.read_output()
         assert len(output) == 3
         assert all(isinstance(o["@timestamp"], basestring) for o in output)
-        assert all(isinstance(o["type"], basestring) for o in output)
         assert output[0]["type"] == "test"

-        assert output[1]["type"] == "log"
+        assert "type" not in output[1]
         assert output[1]["json_error"] == \
             "type not overwritten (not string)"

-        assert output[2]["type"] == "log"
+        assert "type" not in output[2]
         assert output[2]["json_error"] == \
             "type not overwritten (not string)"

@@ -283,7 +281,7 @@ def test_with_generic_filtering(self):
         proc.check_kill_and_wait()

         output = self.read_output(
-            required_fields=["@timestamp", "type"],
+            required_fields=["@timestamp"],
         )
         assert len(output) == 1
         o = output[0]
@@ -327,7 +325,7 @@ def test_with_generic_filtering_remove_headers(self):
         proc.check_kill_and_wait()

         output = self.read_output(
-            required_fields=["@timestamp", "type"],
+            required_fields=["@timestamp"],
         )
         assert len(output) == 1
         o = output[0]
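
The `test_type_in_message` hunk above changes what is expected when a decoded JSON message carries a non-string `type`: the event used to fall back to the default `log`, and now has no `type` key at all. The following is only a rough Python 3 model of the behaviour those assertions describe, not Filebeat's actual Go implementation; the function name `apply_json_type` and the sample inputs are made up for illustration.

```python
def apply_json_type(event, decoded):
    """Model of the asserted behaviour: copy a string `type` from the
    decoded JSON into the event, otherwise record an error and leave
    `type` unset (there is no longer a default to fall back to)."""
    if "type" not in decoded:
        return event
    value = decoded["type"]
    if isinstance(value, str):
        event["type"] = value
    else:
        event["json_error"] = "type not overwritten (not string)"
    return event


# Hypothetical decoded messages: one valid string type, two invalid ones.
samples = [{"type": "test"}, {"type": 5}, {"type": {"nested": "value"}}]
events = [apply_json_type({"input_type": "log"}, decoded) for decoded in samples]

for event in events:
    print(event)
```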
8 changes: 4 additions & 4 deletions filebeat/tests/system/test_processors.py
@@ -28,7 +28,7 @@ def test_dropfields(self):
         filebeat.check_kill_and_wait()

         output = self.read_output(
-            required_fields=["@timestamp", "type"],
+            required_fields=["@timestamp"],
         )[0]
         assert "beat.name" not in output
         assert "message" in output
@@ -53,7 +53,7 @@ def test_include_fields(self):
         filebeat.check_kill_and_wait()

         output = self.read_output(
-            required_fields=["@timestamp", "type"],
+            required_fields=["@timestamp"],
         )[0]
         assert "beat.name" not in output
         assert "message" in output
@@ -81,7 +81,7 @@ def test_drop_event(self):
         filebeat.check_kill_and_wait()

         output = self.read_output(
-            required_fields=["@timestamp", "type"],
+            required_fields=["@timestamp"],
         )[0]
         assert "beat.name" in output
         assert "message" in output
@@ -110,7 +110,7 @@ def test_condition(self):
         filebeat.check_kill_and_wait()

         output = self.read_output(
-            required_fields=["@timestamp", "type"],
+            required_fields=["@timestamp"],
         )[0]
         assert "beat.name" in output
         assert "message" in output
4 changes: 2 additions & 2 deletions filebeat/tests/system/test_prospector.py
@@ -648,7 +648,7 @@ def test_prospector_filter_dropfields(self):
         filebeat.check_kill_and_wait()

         output = self.read_output(
-            required_fields=["@timestamp", "type"],
+            required_fields=["@timestamp"],
         )[0]
         assert "offset" not in output
         assert "message" in output
@@ -673,7 +673,7 @@ def test_prospector_filter_includefields(self):
         filebeat.check_kill_and_wait()

         output = self.read_output(
-            required_fields=["@timestamp", "type"],
+            required_fields=["@timestamp"],
         )[0]
         assert "message" not in output
         assert "offset" in output
2 changes: 1 addition & 1 deletion libbeat/tests/system/beat/beat.py
@@ -11,7 +11,7 @@
 import yaml
 from datetime import datetime, timedelta

-BEAT_REQUIRED_FIELDS = ["@timestamp", "type",
+BEAT_REQUIRED_FIELDS = ["@timestamp",
                         "beat.name", "beat.hostname", "beat.version"]

 INTEGRATION_TESTS = os.environ.get('INTEGRATION_TESTS', False)
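
With `type` dropped from `BEAT_REQUIRED_FIELDS`, an event without a `type` key is no longer treated as incomplete by the test helpers. A small Python 3 sketch of such a check, assuming nested events are flattened to dotted names like `beat.name` (as the test assertions suggest): `flatten` and `missing_fields` are hypothetical helpers written for this illustration, not the functions in `beat.py`, and the sample event values are invented.

```python
BEAT_REQUIRED_FIELDS = ["@timestamp",
                        "beat.name", "beat.hostname", "beat.version"]


def flatten(obj, prefix=""):
    """Flatten nested dicts into a single dict with dotted key names."""
    flat = {}
    for key, value in obj.items():
        name = prefix + key
        if isinstance(value, dict):
            flat.update(flatten(value, name + "."))
        else:
            flat[name] = value
    return flat


def missing_fields(event, required=BEAT_REQUIRED_FIELDS):
    """Return the required field names that the event does not contain."""
    flat = flatten(event)
    return [name for name in required if name not in flat]


# Hypothetical event as published after this change: no `type` key.
event = {
    "@timestamp": "2017-05-05T00:00:00.000Z",
    "beat": {"name": "host1", "hostname": "host1", "version": "6.0.0"},
    "input_type": "log",
    "message": "hello",
}
print(missing_fields(event))
print(missing_fields(event, BEAT_REQUIRED_FIELDS + ["type"]))
```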