Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Make typeless APIs usable with indices whose type name is different from _doc #35790

Merged

Conversation

jpountz
Copy link
Contributor

@jpountz jpountz commented Nov 21, 2018

This commit makes document, update, explain, termvectors and mapping
typeless APIs work on indices that have a type whose name is not _doc.
Unfortunately, this needs to be a bit of a hack since I didn't want calls with
random type names to see documents with the type name that the user had chosen
upon type creation.

The explain and termvectors do not support being called without a type for
now so the test is just using _doc as a type for now, we will need to fix
tests later but this shouldn't require further changes server-side since passing
_doc as a type name is what typeless APIs do internally anyway.

Relates #35190

…rom `_doc`.

This commit makes `document`, `update`, `explain`, `termvectors` and `mapping`
typeless APIs work on indices that have a type whose name is not `_doc`.
Unfortunately, this needs to be a bit of a hack since I didn't want calls with
random type names to see documents with the type name that the user had chosen
upon type creation.

The `explain` and `termvectors` do not support being called without a type for
now so the test is just using `_doc` as a type for now, we will need to fix
tests later but this shouldn't require further changes server-side since passing
`_doc` as a type name is what typeless APIs do internally anyway.

Relates elastic#35190
@jpountz jpountz added >feature :Search Foundations/Mapping Index mappings, including merging and defining field types v7.0.0 labels Nov 21, 2018
@jpountz jpountz requested a review from jtibshirani November 21, 2018 14:50
@elasticmachine
Copy link
Collaborator

Pinging @elastic/es-search

Copy link
Contributor

@jtibshirani jtibshirani left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @jpountz for working on this! Some comments:

I also thought it would be good if someone from the distributed team could take a look at the PR, as these changes are a bit subtle, and there may be other special cases to consider.

@@ -3497,4 +3500,57 @@ public void testResetEngine() throws Exception {
public Settings threadPoolSettings() {
return Settings.builder().put(super.threadPoolSettings()).put("thread_pool.estimated_time_interval", "5ms").build();
}

public void testDeleteOnWrongType() throws IOException {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Small comment: maybe this test and the next should have consistent names (like testTypelessDelete and testTypelessGet)?

@@ -302,6 +302,11 @@ public Query buildFilteredQuery(Query query) {

private Query createTypeFilter(String[] types) {
if (types != null && types.length >= 1) {
if (Arrays.asList(types).contains(MapperService.SINGLE_MAPPING_NAME)) {
Copy link
Contributor

@jtibshirani jtibshirani Nov 22, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this was added to deal with 'explain' requests, but for me this makes the overall search behavior a bit unintuitive. If a user explicitly adds a (deprecated) type filter on _doc in a search, I think they would expect the filter to be applied instead of being ignored. From a user's perspective the search APIs differ from the document CRUD APIs, where the _doc is instead seen as part of the endpoint (and not an explicit filter).

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good point.

* has no mappings.
* @deprecated Use {@link #mapping()} instead now that indices have a single type
*/
@Deprecated
public ImmutableOpenMap<String, MappingMetaData> getMappings() {
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Note that we should remove this one too in the future but decided to keep it for a follow-up PR as this PR would otherwise be much larger.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍

@jpountz
Copy link
Contributor Author

jpountz commented Nov 23, 2018

Thanks for the comments @jtibshirani, I pushed more changes to address the issues that you found.

Copy link
Contributor

@jtibshirani jtibshirani left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks good to me, just a couple small comments. Do you think someone else should look it over as well, who's very familiar with some of these core classes? This PR also reminded me that we need to remove support for the _default_ mapping type.

* has no mappings.
* @deprecated Use {@link #mapping()} instead now that indices have a single type
*/
@Deprecated
public ImmutableOpenMap<String, MappingMetaData> getMappings() {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍

@@ -0,0 +1,33 @@
---
"mtermvectors without types on an index that has types":
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Small comment: inconsistent spacing in this test (extra newline in mtermvectors.body.docs, spaces before :).

@@ -104,15 +107,19 @@ protected void asyncShardOperation(ExplainRequest request, ShardId shardId,

@Override
protected ExplainResponse shardOperation(ExplainRequest request, ShardId shardId) throws IOException {
String[] types;
if (MapperService.SINGLE_MAPPING_NAME.equals(request.type())) { // typeless explain call
types = null;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Small comment, but I think an empty array would be more consistent with the default value of types in ShardSearchRequest, QueryShardContext, etc.?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Agreed.

@jpountz
Copy link
Contributor Author

jpountz commented Nov 27, 2018

The _default_ mapping is already disallowed on 7.x indices:

if (indexSettings.getIndexVersionCreated().onOrAfter(Version.V_7_0_0_alpha1)) {
throw new IllegalArgumentException("The [default] mapping cannot be updated on index [" + index().getName() +
"]: defaults mappings are not useful anymore now that indices can have at most one type.");

@jpountz
Copy link
Contributor Author

jpountz commented Nov 27, 2018

@javanna Would you mind having a look at this change?

@jpountz jpountz requested a review from javanna November 27, 2018 09:33
Copy link
Contributor

@jtibshirani jtibshirani left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @jpountz, I had missed the _default_ mapping removal.

@jtibshirani
Copy link
Contributor

@javanna just a friendly ping on this (I have a couple PRs that are dependent on this change).

Copy link
Member

@javanna javanna left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I left some questions, nothing major though I think.

body: { foo: bar }

- do:
catch: bad_request
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I get where this comes from, yet the error is a bit weird, specifically because it mentions mapping update while we are deleting a document, which may confuse users:

Rejecting mapping update to [index] as the final mapping would have more than 1 type: [not_doc, some_random_type]

Not sure whether this was previously discussed and/or can be improved. I was even wondering if this should have been a 404 - doc not found or 400 - request is wrong as the type is not there...

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's a bit unrelated to this PR, the behavior was already like this before. I agree it's not great but at the same time it's a bit tricky given that DELETE calls are write calls and that there are expectations regarding versioning if you delete a non-existing document and create a new one soon after with the same id.

indices.refresh: {}

- do:
catch: missing
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this makes more sense to me compared to the 400 above. Yet I think - besides which error code is better - the two scenarios (explain and delete) should return the same error code when referring to a doc belonging to a type that's not there?

delete:
index: index
type: some_random_type
id: 1
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

maybe add some assert for the error message that comes back too?

- match: { _type: "_doc" }
- match: { _id: "1"}
- match: { _version: 1}
- match: { _source: { foo: bar }}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

do we also need to check whether the doc comes back when specifying not_doc as a type?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I didn't add a test for it because this is already tested by get/11_basic_with_types.yml?

- match: { _index: "index" }
- match: { _type: "_doc" }
- match: { _id: "1"}
- match: { _version: 2}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

do we also need to check whether the doc gets deleted when specifying not_doc as a type? Or maybe that is already covered by existing tests?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

indeed this is tested by delete/13_basic_with_types.yml

- match: { _type: "_doc" }
- match: { _id: "1"}
- is_true: matched
- match: { explanation.value: 1 }
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

do we also need to check whether the doc is found when specifying not_doc as a type?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

we have one already

type: some_random_type
id: 1

- is_false: found
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

do we also need to check what happens when providing not_doc as a type?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

we have one already

if (uidTerm == null) {
return new ExplainResponse(shardId.getIndexName(), request.type(), request.id(), false);
}
// No need to check the type, IndexShard#get does it for is
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

for is ?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

for us :) thanks for catching I'll fix

@markharwood
Copy link
Contributor

If I have a mapping with custom type my_type and I index or update using index/_doc/id endpoint shouldn't the response show "_type":"my_type"? It currently returns "_type":"_doc" and I thought we were asking users to think of _doc in the url as a verb rather than a doc type. A search still returns the "_type":"my_type" which contradicts the index/update response.

@jpountz
Copy link
Contributor Author

jpountz commented Dec 4, 2018

@markharwood This is done consciously: by always returning _doc as a value for the _type field, we hope that users will stop reading its value, which is in-turn expected to decrease chances of breakage once we entirely remove it from responses in the future.

@jpountz
Copy link
Contributor Author

jpountz commented Dec 4, 2018

@elasticmachine test this please

@markharwood
Copy link
Contributor

by always returning _doc as a value for the _type field,

Are search responses planning to do this too then?

@jpountz
Copy link
Contributor Author

jpountz commented Dec 4, 2018

@markharwood I'm not 100% sure about this one, the _search API is indeed more subtle. The work that @nik9000 is doing to allow us to change the format based on the version that the client is expecting would hopefully help here.

@jpountz jpountz merged commit d24b40f into elastic:master Dec 4, 2018
@jpountz jpountz deleted the enhancement/typeless_apis_work_with_any_type branch December 4, 2018 18:22
mayya-sharipova added a commit to mayya-sharipova/elasticsearch that referenced this pull request Dec 16, 2018
- deprecate type-based constructors of IndexRequest
- update tests to use typeless IndexRequest constructors
- no yaml tests as they have been already added in elastic#35790

Relates to elastic#35190
mayya-sharipova added a commit that referenced this pull request Dec 18, 2018
* Deprecate types in index API

- deprecate type-based constructors of IndexRequest
- update tests to use typeless IndexRequest constructors
- no yaml tests as they have been already added in #35790

Relates to #35190
dnhatn added a commit to dnhatn/elasticsearch that referenced this pull request Dec 21, 2018
We introduce a typeless API in elastic#35790 where we translate the default
docType "_doc" to the user-defined docType. However, we still use
docType from the source rather the translated type (i.e, type of the
Mapper) when parsing a document. This leads to a situation where we have
two translog operations for the same document with different types:

- prvOp [Index{id='9LCpwGcBkJN7eZxaB54L', type='_doc', seqNo=1, primaryTerm=1, version=1, autoGeneratedIdTimestamp=1545125562123}]
- newOp [Index{id='9LCpwGcBkJN7eZxaB54L', type='not_doc', seqNo=1, primaryTerm=1, version=1, autoGeneratedIdTimestamp=-1}]

Closes elastic#36769
dnhatn added a commit that referenced this pull request Dec 23, 2018
We introduce a typeless API in #35790 where we translate the default
docType "_doc" to the user-defined docType. However, we do not rewrite
the SourceToParse with the resolved docType. This leads to a situation
where we have two translog operations for the same document with
different types:

- prvOp [Index{id='9LCpwGcBkJN7eZxaB54L', type='_doc', seqNo=1,
  primaryTerm=1, version=1, autoGeneratedIdTimestamp=1545125562123}]

- newOp [Index{id='9LCpwGcBkJN7eZxaB54L', type='not_doc', seqNo=1,
  primaryTerm=1, version=1, autoGeneratedIdTimestamp=-1}]

Closes #36769
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
>feature :Search Foundations/Mapping Index mappings, including merging and defining field types v7.0.0-beta1
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants