Merge remote-tracking branch 'upstream/master' into feature/beats-tes…
…ter-commit

* upstream/master:
  Add new licence status: expired (elastic#22180)
  [filebeat][okta] Make cursor optional for okta and update docs (elastic#22091)
  Add documentation of filestream input (elastic#21615)
  [Ingest Manager] Skip flaky gateway tests elastic#22177
v1v committed Oct 27, 2020
2 parents 2d8d17d + f0da681 commit fd7f48f
Showing 12 changed files with 750 additions and 12 deletions.
2 changes: 2 additions & 0 deletions CHANGELOG.next.asciidoc
@@ -188,6 +188,7 @@ https://github.com/elastic/beats/compare/v7.0.0-alpha2...master[Check the HEAD d
- The `o365input` and `o365` module now recover from an authentication problem or other fatal errors, instead of terminating. {pull}21259[21258]
- Orderly close processors when processing pipelines are not needed anymore to release their resources. {pull}16349[16349]
- Fix memory leak and events duplication in docker autodiscover and add_docker_metadata. {pull}21851[21851]
- Fix parsing of expired licences. {issue}21112[21112] {pull}22180[22180]

*Auditbeat*

@@ -640,6 +641,7 @@ https://github.com/elastic/beats/compare/v7.0.0-alpha2...master[Check the HEAD d
- Adding support for FIPS in s3 input {pull}21446[21446]
- Add SSL option to checkpoint module {pull}19560[19560]
- Add max_number_of_messages config into s3 input. {pull}21993[21993]
- Update Okta documentation for new stateful restarts. {pull}22091[22091]

*Heartbeat*

2 changes: 2 additions & 0 deletions filebeat/docs/filebeat-options.asciidoc
@@ -94,6 +94,8 @@ include::inputs/input-container.asciidoc[]

include::inputs/input-docker.asciidoc[]

include::inputs/input-filestream.asciidoc[]

include::../../x-pack/filebeat/docs/inputs/input-google-pubsub.asciidoc[]

include::../../x-pack/filebeat/docs/inputs/input-http-endpoint.asciidoc[]
394 changes: 394 additions & 0 deletions filebeat/docs/inputs/input-filestream-file-options.asciidoc

Large diffs are not rendered by default.

143 changes: 143 additions & 0 deletions filebeat/docs/inputs/input-filestream-reader-options.asciidoc
@@ -0,0 +1,143 @@
//////////////////////////////////////////////////////////////////////////
//// This content is shared by Filebeat inputs that use the input
//// but do not process files (the options for managing files
//// on disk are not relevant)
//// If you add IDs to sections, make sure you use attributes to create
//// unique IDs for each input that includes this file. Use the format:
//// [id="{beatname_lc}-input-{type}-option-name"]
//////////////////////////////////////////////////////////////////////////

[float]
===== `encoding`

The file encoding to use for reading data that contains international
characters. See the encoding names http://www.w3.org/TR/encoding/[recommended by
the W3C for use in HTML5].

Valid encodings:

* `plain`: plain ASCII encoding
* `utf-8` or `utf8`: UTF-8 encoding
* `gbk`: simplified Chinese characters
* `iso8859-6e`: ISO8859-6E, Latin/Arabic
* `iso8859-6i`: ISO8859-6I, Latin/Arabic
* `iso8859-8e`: ISO8859-8E, Latin/Hebrew
* `iso8859-8i`: ISO8859-8I, Latin/Hebrew
* `iso8859-1`: ISO8859-1, Latin-1
* `iso8859-2`: ISO8859-2, Latin-2
* `iso8859-3`: ISO8859-3, Latin-3
* `iso8859-4`: ISO8859-4, Latin-4
* `iso8859-5`: ISO8859-5, Latin/Cyrillic
* `iso8859-6`: ISO8859-6, Latin/Arabic
* `iso8859-7`: ISO8859-7, Latin/Greek
* `iso8859-8`: ISO8859-8, Latin/Hebrew
* `iso8859-9`: ISO8859-9, Latin-5
* `iso8859-10`: ISO8859-10, Latin-6
* `iso8859-13`: ISO8859-13, Latin-7
* `iso8859-14`: ISO8859-14, Latin-8
* `iso8859-15`: ISO8859-15, Latin-9
* `iso8859-16`: ISO8859-16, Latin-10
* `cp437`: IBM CodePage 437
* `cp850`: IBM CodePage 850
* `cp852`: IBM CodePage 852
* `cp855`: IBM CodePage 855
* `cp858`: IBM CodePage 858
* `cp860`: IBM CodePage 860
* `cp862`: IBM CodePage 862
* `cp863`: IBM CodePage 863
* `cp865`: IBM CodePage 865
* `cp866`: IBM CodePage 866
* `ebcdic-037`: IBM CodePage 037
* `ebcdic-1040`: IBM CodePage 1140
* `ebcdic-1047`: IBM CodePage 1047
* `koi8r`: KOI8-R, Russian (Cyrillic)
* `koi8u`: KOI8-U, Ukrainian (Cyrillic)
* `macintosh`: Macintosh encoding
* `macintosh-cyrillic`: Macintosh Cyrillic encoding
* `windows1250`: Windows1250, Central and Eastern European
* `windows1251`: Windows1251, Russian, Serbian (Cyrillic)
* `windows1252`: Windows1252, Legacy
* `windows1253`: Windows1253, Modern Greek
* `windows1254`: Windows1254, Turkish
* `windows1255`: Windows1255, Hebrew
* `windows1256`: Windows1256, Arabic
* `windows1257`: Windows1257, Estonian, Latvian, Lithuanian
* `windows1258`: Windows1258, Vietnamese
* `windows874`: Windows874, ISO/IEC 8859-11, Latin/Thai
* `utf-16-bom`: UTF-16 with required BOM
* `utf-16be-bom`: big endian UTF-16 with required BOM
* `utf-16le-bom`: little endian UTF-16 with required BOM

The `plain` encoding is special, because it does not validate or transform any input.
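
As a minimal sketch, an input that reads little-endian UTF-16 files carrying a
BOM could set the option as follows. The `utf-16le-bom` value is taken from the
list above; the choice of encoding is only an illustration.

["source","yaml",subs="attributes"]
----
{beatname_lc}.inputs:
- type: {type}
  ...
  # Illustrative choice; use the encoding your files are actually written in.
  encoding: utf-16le-bom
----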

[float]
[id="{beatname_lc}-input-{type}-exclude-lines"]
===== `exclude_lines`

A list of regular expressions to match the lines that you want {beatname_uc} to
exclude. {beatname_uc} drops any lines that match a regular expression in the
list. By default, no lines are dropped. Empty lines are ignored.

The following example configures {beatname_uc} to drop any lines that start with
`DBG`.

["source","yaml",subs="attributes"]
----
{beatname_lc}.inputs:
- type: {type}
  ...
  exclude_lines: ['^DBG']
----

See <<regexp-support>> for a list of supported regexp patterns.

[float]
[id="{beatname_lc}-input-{type}-include-lines"]
===== `include_lines`

A list of regular expressions to match the lines that you want {beatname_uc} to
include. {beatname_uc} exports only the lines that match a regular expression in
the list. By default, all lines are exported. Empty lines are ignored.

The following example configures {beatname_uc} to export any lines that start
with `ERR` or `WARN`:

["source","yaml",subs="attributes"]
----
{beatname_lc}.inputs:
- type: {type}
  ...
  include_lines: ['^ERR', '^WARN']
----

NOTE: If both `include_lines` and `exclude_lines` are defined, {beatname_uc}
executes `include_lines` first and then executes `exclude_lines`. The order in
which the two options appear in the config file doesn't matter.

The following example exports all log lines that contain `sometext`,
except for lines that begin with `DBG` (debug messages):

["source","yaml",subs="attributes"]
----
{beatname_lc}.inputs:
- type: {type}
  ...
  include_lines: ['sometext']
  exclude_lines: ['^DBG']
----

See <<regexp-support>> for a list of supported regexp patterns.

[float]
===== `buffer_size`

The size in bytes of the buffer that each harvester uses when fetching a file.
The default is 16384.

[float]
===== `message_max_bytes`

The maximum number of bytes that a single log message can have. All bytes after
`message_max_bytes` are discarded and not sent. The default is 10MB (10485760).
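
For illustration, both limits can be set on the same input. The values below
are arbitrary examples, not recommendations; they only show where the options
go and that each one is given as a number of bytes.

["source","yaml",subs="attributes"]
----
{beatname_lc}.inputs:
- type: {type}
  ...
  buffer_size: 32768          # example value, in bytes (default: 16384)
  message_max_bytes: 1048576  # example value, in bytes (default: 10485760)
----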
165 changes: 165 additions & 0 deletions filebeat/docs/inputs/input-filestream.asciidoc
@@ -0,0 +1,165 @@
:type: filestream

[id="{beatname_lc}-input-{type}"]
=== filestream input

experimental[]

++++
<titleabbrev>filestream</titleabbrev>
++++

Use the `filestream` input to read lines from active log files. It is the
new, improved alternative to the `log` input. However, a few features are
still missing from it, for example `multiline` and other special parsing
capabilities. These missing options are likely to be added later. We strive to
achieve feature parity with the `log` input, if possible.

To configure this input, specify a list of glob-based <<filestream-input-paths,`paths`>>
that must be crawled to locate and fetch the log lines.

Example configuration:

["source","yaml",subs="attributes"]
----
{beatname_lc}.inputs:
- type: filestream
  paths:
    - /var/log/messages
    - /var/log/*.log
----


You can apply additional
<<{beatname_lc}-input-{type}-options,configuration settings>> (such as `fields`,
`include_lines`, `exclude_lines` and so on) to the lines harvested
from these files. The options that you specify are applied to all the files
harvested by this input.

To apply different configuration settings to different files, you need to define
multiple input sections:

["source","yaml",subs="attributes"]
----
{beatname_lc}.inputs:
- type: filestream <1>
  paths:
    - /var/log/system.log
    - /var/log/wifi.log
- type: filestream <2>
  paths:
    - "/var/log/apache2/*"
  fields:
    apache: true
----

<1> Harvests lines from two files: `system.log` and
`wifi.log`.
<2> Harvests lines from every file in the `apache2` directory, and uses the
`fields` configuration option to add a field called `apache` to the output.


[[filestream-file-identity]]
==== Reading files on network shares and cloud providers

WARNING: Filebeat does not support reading from network shares and cloud providers.

However, one of the limitations of these data sources can be mitigated
if you configure Filebeat appropriately.

By default, {beatname_uc} identifies files based on their inodes and
device IDs. However, on network shares and cloud providers these
values might change during the lifetime of the file. If this happens,
{beatname_uc} treats the file as new and resends its whole content.
To solve this problem, you can configure the `file_identity` option. Possible
values besides the default `inode_deviceid` are `path` and `inode_marker`.

Selecting `path` instructs {beatname_uc} to identify files based on their
paths. This is a quick way to avoid rereading files if inode and device IDs
might change. However, keep in mind that if the files are rotated (renamed),
they will be reread and resubmitted.
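
A minimal sketch of this setting, assuming the `path` strategy takes no
sub-options and is therefore written with an empty (`~`) value, and using a
hypothetical log location:

["source","yaml",subs="attributes"]
----
{beatname_lc}.inputs:
- type: filestream
  paths:
    - /mnt/share/app/*.log   # hypothetical network share path
  file_identity.path: ~      # assumed form: identify files by path only
----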

The option `inode_marker` can be used if the inodes stay the same even if
the device ID changes. If your files are rotated, choose this method instead
of `path` where possible. You have to configure a marker file readable by
{beatname_uc} and set its path in the `path` option of `inode_marker`.

The content of this file must be unique to the device. You can put the
UUID of the device or mountpoint where the input is stored. Do not use this
option on Windows, as file identifiers might be more volatile there. The
following example one-liner generates a hidden marker file for the selected
mountpoint `/logs`:

["source","sh",subs="attributes"]
----
$ lsblk -o MOUNTPOINT,UUID | grep /logs | awk '{print $2}' >> /logs/.filebeat-marker
----

To set the generated file as a marker for `file_identity`, configure the input
as follows:

["source","yaml",subs="attributes"]
----
{beatname_lc}.inputs:
- type: filestream
  paths:
    - /logs/*.log
  file_identity.inode_marker.path: /logs/.filebeat-marker
----


[[filestream-rotating-logs]]
==== Reading from rotating logs

When dealing with file rotation, avoid harvesting symlinks. Instead
use the <<filestream-input-paths>> setting to point to the original file, and specify
a pattern that matches the file you want to harvest and all of its rotated
files. Also make sure your log rotation strategy prevents lost or duplicate
messages. For more information, see <<file-log-rotation>>.

Furthermore, to avoid duplicates of rotated log messages, do not use the
`path` method for `file_identity`, or exclude the rotated files with the
`exclude_files` option.

[id="{beatname_lc}-input-{type}-options"]
==== Prospector options

The `filestream` input supports the following configuration options plus the
<<{beatname_lc}-input-{type}-common-options>> described later.

[float]
[[filestream-input-paths]]
===== `paths`

A list of glob-based paths that will be crawled and fetched. All patterns
supported by https://golang.org/pkg/path/filepath/#Glob[Go Glob] are also
supported here. For example, to fetch all files from a predefined level of
subdirectories, the following pattern can be used: `/var/log/*/*.log`. This
fetches all `.log` files from the subfolders of `/var/log`. It does not
fetch log files from the `/var/log` folder itself.
It is possible to recursively fetch all files in all subdirectories of a directory
using the optional <<filestream-recursive-glob,`recursive_glob`>> setting.

{beatname_uc} starts a harvester for each file that it finds under the specified
paths. You can specify one path per line. Each line begins with a dash (-).

[float]
[[filestream-recursive-glob]]
===== `prospector.scanner.recursive_glob`

Enable expanding `**` into recursive glob patterns. With this feature enabled,
the rightmost `**` in each path is expanded into a fixed number of glob
patterns. For example: `/foo/**` expands to `/foo`, `/foo/*`, `/foo/*/*`, and so
on. When enabled, it expands a single `**` into an 8-level deep `*` pattern.

This feature is enabled by default. Set `prospector.scanner.recursive_glob` to false to
disable it.
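
For example, a sketch of an input that turns the expansion off. The path is
illustrative; only the option name comes from the text above.

["source","yaml",subs="attributes"]
----
{beatname_lc}.inputs:
- type: filestream
  paths:
    - /var/log/*/*.log   # illustrative path without `**`
  prospector.scanner.recursive_glob: false
----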

include::../inputs/input-filestream-reader-options.asciidoc[]

include::../inputs/input-filestream-file-options.asciidoc[]

[id="{beatname_lc}-input-{type}-common-options"]
include::../inputs/input-common-options.asciidoc[]

:type!:
15 changes: 9 additions & 6 deletions filebeat/docs/modules/okta.asciidoc
@@ -32,12 +32,6 @@ the logs while honoring any
https://developer.okta.com/docs/reference/rate-limits/[rate-limiting] headers
sent by Okta.

NOTE: This module does not persist the timestamp of the last read event in
order to facilitate resuming on restart. This feature will be coming in a future
version. When you restart the module will read events from the beginning of the
log. To minimize duplicates documents the module uses the event's Okta UUID
value as the Elasticsearch `_id`.

This is an example configuration for the module.

[source,yaml]
@@ -99,6 +93,15 @@ information.
supported_protocols: [TLSv1.2]
----

*`var.initial_interval`*::

An initial interval can be defined. The first time the module starts, it will fetch events from the current moment minus the initial interval value. Subsequent restarts will fetch events starting from the last event read. It defaults to `24h`.
+
[source,yaml]
----
var.initial_interval: 24h # will fetch events starting 24h ago.
----

[float]
=== Example dashboard

@@ -162,6 +162,7 @@ func wrapStrToResp(code int, body string) *http.Response {
}

func TestFleetGateway(t *testing.T) {
	t.Skip("Flaky when CI is slower")

	agentInfo := &testAgentInfo{}
	settings := &fleetGatewaySettings{