diff --git a/filebeat/docs/inputs/input-common-harvester-options.asciidoc b/filebeat/docs/inputs/input-common-harvester-options.asciidoc index b9fa634b4cb..8f161006416 100644 --- a/filebeat/docs/inputs/input-common-harvester-options.asciidoc +++ b/filebeat/docs/inputs/input-common-harvester-options.asciidoc @@ -14,11 +14,59 @@ The file encoding to use for reading data that contains international characters. See the encoding names http://www.w3.org/TR/encoding/[recommended by the W3C for use in HTML5]. -Here are some sample encodings from W3C recommendation: - - * plain, latin1, utf-8, utf-16be-bom, utf-16be, utf-16le, big5, gb18030, - gbk, hz-gb-2312, - * euc-kr, euc-jp, iso-2022-jp, shift-jis, and so on +Valid encodings: + + * `plain`: plain ASCII encoding + * `utf-8` or `utf8`: UTF-8 encoding + * `gbk`: simplified Chinese charaters + * `iso8859-6e`: ISO8859-6E, Latin/Arabic + * `iso8859-6i`: ISO8859-6I, Latin/Arabic + * `iso8859-8e`: ISO8859-8E, Latin/Hebrew + * `iso8859-8i`: ISO8859-8I, Latin/Hebrew + * `iso8859-1`: ISO8859-1, Latin-1 + * `iso8859-2`: ISO8859-2, Latin-2 + * `iso8859-3`: ISO8859-3, Latin-3 + * `iso8859-4`: ISO8859-4, Latin-4 + * `iso8859-5`: ISO8859-5, Latin/Cyrillic + * `iso8859-6`: ISO8859-6, Latin/Arabic + * `iso8859-7`: ISO8859-7, Latin/Greek + * `iso8859-8`: ISO8859-8, Latin/Hebrew + * `iso8859-9`: ISO8859-9, Latin-5 + * `iso8859-10`: ISO8859-10, Latin-6 + * `iso8859-13`: ISO8859-13, Latin-7 + * `iso8859-14`: ISO8859-14, Latin-8 + * `iso8859-15`: ISO8859-15, Latin-9 + * `iso8859-16`: ISO8859-16, Latin-10 + * `cp437`: IBM CodePage 437 + * `cp850`: IBM CodePage 850 + * `cp852`: IBM CodePage 852 + * `cp855`: IBM CodePage 855 + * `cp858`: IBM CodePage 858 + * `cp860`: IBM CodePage 860 + * `cp862`: IBM CodePage 862 + * `cp863`: IBM CodePage 863 + * `cp865`: IBM CodePage 865 + * `cp866`: IBM CodePage 866 + * `ebcdic-037`: IBM CodePage 037 + * `ebcdic-1040`: IBM CodePage 1140 + * `ebcdic-1047`: IBM CodePage 1047 + * `koi8r`: KOI8-R, Russian (Cyrillic) + * `koi8u`: KOI8-U, Ukranian (Cyrillic) + * `macintosh`: Macintosh encoding + * `macintosh-cyrillic`: Macintosh Cyrillic encoding + * `windows1250`: Windows1250, Central and Eastern European + * `windows1251`: Windows1251, Russian, Serbian (Cyrillic) + * `windows1252`: Windows1252, Legacy + * `windows1253`: Windows1253, Modern Greek + * `windows1254`: Windows1254, Turkish + * `windows1255`: Windows1255, Hebrew + * `windows1256`: Windows1256, Arabic + * `windows1257`: Windows1257, Estonian, Latvian, Lithuanian + * `windows1258`: Windows1258, Vietnamese + * `windows874`: Windows874, ISO/IEC 8859-11, Latin/Thai + * `utf-16-bom`: UTF-16 with required BOM + * `utf-16be-bom`: big endian UTF-16 with required BOM + * `utf-16le-bom`: little endian UTF-16 with required BOM The `plain` encoding is special, because it does not validate or transform any input.