Docs: Document how to rebuild analyzers #30498
Adds documentation for how to rebuild all the built-in analyzers, plus tests for that documentation, using the mechanism added in elastic#29535. Closes elastic#29499
Pinging @elastic/es-search-aggs
LGTM, except for a few minor changes.
recreate it as a `custom` analyzer and modify it, usually by adding
token filters. Usually, you should prefer the
<<keyword, Keyword type>> when you want strings that are not split
into tokens, but just in case you need it, this his would recreate
"this his" -> "this" ?
"tokenizer": {
"split_on_non_word": {
"type": "pattern",
"stopwords": "\\W+" <1>
Should it be `pattern` instead of `stopwords`?
Yes! Now I have to figure out how the tests passed when it was wrong....
I believe it is because `\\W+` is the default, and because we don't complain if you pass extra stuff here.
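For reference, here is a sketch of the tokenizer fragment with the key renamed as suggested above. Only the `split_on_non_word` definition comes from the quoted diff. The enclosing `settings` and `analysis` nesting is an assumption based on the usual index-settings layout, and the AsciiDoc `<1>` callout is left out so the fragment stays valid JSON.

```json
{
  "settings": {
    "analysis": {
      "tokenizer": {
        "split_on_non_word": {
          "type": "pattern",
          "pattern": "\\W+"
        }
      }
    }
  }
}
```

With `pattern` set explicitly, the snippet no longer relies on the default value or on the unknown `stopwords` key being silently accepted.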
[float]
=== Definition

The `simple` anlzyer consists of:
"anlzyer" -> "analyzer"
If you need to customize the `pattern` analyzer beyond the configuration
parameters then you need to recreate it as a `custom` analyzer and modify
it, usually by adding token filters. This would recreate the built in
"built in" -> "built-in" in this place and in all other places
Thanks @mayya-sharipova! I've fixed it up as you requested and merged and backported.
* 6.x:
  Revert "Silence IndexUpgradeIT test failures. (#30430)"
  [DOCS] Remove references to changelog and to highlights
  Revert "Mute ML upgrade test (#30458)"
  [ML] Fix BWC version for backport of #30125
  [Docs] Improve section detailing translog usage (#30573)
  [Tests] Relax allowed delta in extended_stats aggregation (#30569)
  Fail if reading from closed KeyStoreWrapper (#30394)
  [ML] Reverse engineer Grok patterns from categorization results (#30125)
  Derive max composite buffers from max content len
  Update build file due to doc file rename
  SQL: Extract SQL request and response classes (#30457)
  Remove the changelog (#30593)
  Revert "Add deprecation warning for default shards (#30587)"
  Silence IndexUpgradeIT test failures. (#30430)
  Add deprecation warning for default shards (#30587)
  [DOCS] Adds 6.4.0 release highlight pages
  [DOCS] Adds release highlight pages (#30590)
  Docs: Document how to rebuild analyzers (#30498)
  [DOCS] Fixes title capitalization in security content
  LLRest: Add equals and hashcode tests for Request (#30584)
  [DOCS] Fix realm setting names (#30499)
  [DOCS] Fix path info for various security files (#30502)
  Docs: document precision limitations of geo_bounding_box (#30540)
  Fix non existing javadocs link in RestClientTests
  Auto-expand replicas only after failing nodes (#30553)