-
Notifications
You must be signed in to change notification settings - Fork 24.9k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Disable _all by default, disallow configuring _all on 6.0+ indices #22144
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just left some minor things that need clarifications.
Other than that this change looks good. I am surprised that it did not require more changes in the tests but I guess that the all_fields
mode covers most of the use case.
@@ -270,9 +272,6 @@ public void testRandom() throws Exception { | |||
booleanOptionList.add(new Tuple<>("store_term_vectors", tv_stored = randomBoolean())); | |||
} | |||
if (randomBoolean()) { | |||
booleanOptionList.add(new Tuple<>("enabled", enabled = randomBoolean())); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Since it was random before why do you need to force enabled
to true ?
@@ -2435,6 +2435,7 @@ public void testPostingsHighlighterCommonTermsQuery() throws IOException { | |||
|
|||
private static XContentBuilder type1PostingsffsetsMapping() throws IOException { | |||
return XContentFactory.jsonBuilder().startObject().startObject("type1") | |||
.startObject("_all").field("enabled", true).endObject() | |||
.startObject("properties") | |||
.startObject("field1").field("type", "text").field("index_options", "offsets").endObject() | |||
.startObject("field2").field("type", "text").field("index_options", "offsets").endObject() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you explain very briefly why it fails the test with _all disabled ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
testPostingsHighlighter
has a test that queries on the _all
field and highlights other fields. Without this, it would fail.
I can remove the particular clause if you think that would be a better solution?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ok thanks for the explanation. BTW are we planning to add support for all_fields
in the multi match query ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
are we planning to add support for
all_fields
in themulti_match
query ?
I don't know, do you think we should? Someone could just as easily use "fields": ["*"]
with lenient: true
since they already have to specify fields
in the request? I'm totally open to adding it though.
assertAcked(prepareCreate("test").addMapping("type1", "nested1", "type=nested").addMapping("type2", "nested1", "type=nested")); | ||
assertAcked(prepareCreate("test") | ||
.addMapping("type1", "nested1", "type=nested", "_all", "enabled=true") | ||
.addMapping("type2", "nested1", "type=nested", "_all", "enabled=true")); | ||
ensureGreen(); | ||
|
||
// check on no data, see it works |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
And here ;)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There is an actual check, see a few lines down for:
// check that _all is working on nested docs
Which requires _all
for the test to work.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Makes sense. Thanks
@@ -252,8 +252,7 @@ public void testSimpleQueryStringFlags() throws ExecutionException, InterruptedE | |||
searchResponse = client() | |||
.prepareSearch() | |||
.setQuery( | |||
simpleQueryStringQuery("baz | egg*").defaultOperator(Operator.AND).flags(SimpleQueryStringFlag.WHITESPACE, | |||
SimpleQueryStringFlag.PREFIX)).get(); | |||
simpleQueryStringQuery("blah | egg*").flags(SimpleQueryStringFlag.WHITESPACE, SimpleQueryStringFlag.PREFIX)).get(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why do you need to remove this .defaultOperator(Operator.AND)
. It looks like a test use case.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'll change this to be a bit more transparent of a test. It's not a very good check right now.
==== The `_all` meta field is now disabled by default | ||
|
||
On new mappings, the `_all` meta field that contains a copy of the text from | ||
each field is not disabled by default. The `query_string` and |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
is not enabled by default ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
whoops, "not" should be "now". I'll fix.
check if `_all` is enabled/disabled and switch to executing the query across all | ||
fields if `_all` is disabled. `_all` may be re-enabled by setting `enabled`: | ||
`true` in the mapping configuration for `_all`. | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe add a sentence that says that a reindex is required when _all
is re-enabled ?
"query": { | ||
"query_string": { | ||
"query": "post_date:foo", | ||
"lenient": false |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
false
is still the default value, so why is this needed ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
query_string
defaults to all-fields mode if _all
is disabled, which turns on lenient
automatically, so this is needed in order to make the query non-valid.
"query_string": { | ||
"query": "post_date:foo", | ||
"lenient": false | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
same here
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Same as above :)
I pushed some commits, thanks for looking @jimczi! |
I don't like having a feature that has such a large impact on the code base disabled by default, as it means the maintenance cost remains the same (=high), while the benefit to our users decreases significantly, as most of them will not opt in. Two possible paths that I can think of are either to remove it entirely (ie. forbid 6.x indices from enabling it and removing in 7), or keep |
The impact is pretty contained and should not change the OOB significantly. Though I agree that keeping the feature is costly so I think that forbids in 6 and removing in 7 is a good plan.
Not sure it would be useful if we don't copy to it by default and if we always use the |
Alright, sounds like the consensus is to disallow in 6.0+ and we can eventually remove it in 7.0, I'll update the PR and title. |
fcf70fd
to
28d8ab0
Compare
retest this please |
81330bc
to
79f4cd4
Compare
each field is now disabled by default. The `query_string` and | ||
`simple_query_string` queries that previously used `_all` to search will now | ||
check if `_all` is enabled/disabled and switch to executing the query across all | ||
fields if `_all` is disabled. `_all` may be re-enabled by setting `enabled`: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
switch to executing the query across all fields if
_all
is disabled
I hope not all fields, but only the fields that I specified the include_in_all: True option? Can you please clarify?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If you don't specify anything in the fields
parameter, then simple_query_string
and query_string
will automatically try to execute the search across all searchable fields (see: #20925). If you want to limit it, then you can specify the "fields"
parameter in the query.
Additionally, you could use copy_to
and create your own all-type field as a composite of whatever fields you want, and then set that as the default_field
for the index.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for your prompt response and the link to #20925. Currently in elasticsearch 5.0 I use include_in_all: False
on everything except on text fields, so what I'll have to do with this change is indeed to create my own field with copy_to
, or specify all the text fields I want in fields
when I do the query.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks @dakrone
This looks good. I can imagine how painful it was to understand why all this tests were relying on _all and how to remove it without loosing the test case.
Though I think we should at least keep some test around regarding the AllFieldType and maybe the copyToAll functionality. It will surely be removed but in the mean time we may add things that breaks it.
IMO we should completely disallow enabling _all
even if the index was created < 6.x. We don't want to create more _all
users ;)
assertWarnings("Deprecated field [template] used, replaced by [index_patterns]"); | ||
} | ||
|
||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We can maybe keep this test around and just remove _all from the templates ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sure, though then it won't be the real templates from Logstash and Beats, but I guess it is still testing the other features.
`true` in the mapping configuration for `_all`. If `_all` is enabled on an | ||
existing index data that was previously indexed will need to be reindexed in | ||
order to populate the `_all` field. | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is complicated and maybe not needed. If we don't accept new mapping with _all enabled maybe we should also not allow to enable _all
on an index that was created in older versions ? So we would completely disallow enabling _all
but continue to handle indices created in older versions with _all
already indexed ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We do disallow this for 5.0 indices, see UpdateMappingOnClusterIT
which tests this
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ok good then you can remove this sentence ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Definitely, good catch!
@jimczi I re-added some tests based on your feedback, thanks for taking a look!
Can you describe what you mean here? Do you mean just that fields are copied to the |
Yes and the AllFieldTypeTest. Why is it removed ? |
My accidental over-zealousness, I've re-added it :) As far as testing that fields are copied, the |
retest this please |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
This change disables the _all meta field by default. Now that we have the "all-fields" method of query execution, we can save both indexing time and disk space by disabling it. _all can no longer be configured for indices created after 6.0. Relates to elastic#20925 and elastic#21341 Resolves elastic#19784
bf80834
to
7a18bb5
Compare
ES 6.0 deprecates `_all`, see elastic/elasticsearch#22144. `*.template-es6x.json` has been added in case Elasticsearch 6.X is detected
ES 6.0 deprecates `_all`, see elastic/elasticsearch#22144. `*.template-es6x.json` has been added in case Elasticsearch 6.X is detected
If I want to index documents that have In other words, will it be possible to index documents like the following one OOB? {
"foo": "bar",
"_all": "dummy"
} |
We strongly recommend not naming user-created fields started with an You could, however, name it |
In 6.0 `_all` is not configurable and disabled by default. Relates to elastic#22144
* Add deprecation logging when _all is enabled In 6.0 `_all` is not configurable and disabled by default. Relates to #22144 * Log deprecation regardless of whether _all is enabled or disabled * Add expected warnings in REST tests * Assert warnings in unit tests * Add warning check to various documentation tests * Add assertWarnings if randomBoolean() is true for enabled
* Add deprecation logging when _all is enabled In 6.0 `_all` is not configurable and disabled by default. Relates to #22144 * Log deprecation regardless of whether _all is enabled or disabled * Add expected warnings in REST tests * Assert warnings in unit tests * Add warning check to various documentation tests * Add assertWarnings if randomBoolean() is true for enabled
This change disables the
_all
meta field by default.Now that we have the "all-fields" method of query execution, we can save both
indexing time and disk space by disabling it.
_all
can no longer be configured for indices created after 6.0.Relates to #20925 and #21341
Resolves #19784