Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Prototype an approach to field aliases that creates a new top-level mapper type. #1

Closed
wants to merge 6 commits into from

Conversation

jtibshirani
Copy link
Owner

@jtibshirani jtibshirani commented May 29, 2018

Note that this a just prototype to explore a possible approach -- it is incomplete and is not meant for careful review.

Overall approach:

  • Create a new top-level Mapper type called FieldAliasMapper (so in particular, alias mappers do not inherit from FieldMapper).
  • Make sure to keep track of aliases in MapperService so that when field aliases are supplied to MapperService#fullName, the correct concrete field type is returned. The bulk of this logic resides in FieldTypeLookup.

Nice properties of this approach:

  • Supports alias fields when querying, sorting, aggregating, requesting suggestions, requesting docvalue_fields, and referencing fields in scripts. Aliases can also be used when fetching stored_fields (with some caveats mentioned below). Additionally, aliases can be used in the field capabilities API.
  • A concrete field can be added in one mappings update, and an alias added in a separate update.
  • If a concrete field type is updated, any aliases it has will always refer to the latest version.
  • The complexity around resolving aliases is largely scoped to FieldTypeLookup.
  • Consumers don't need to remember to call something like MappedFieldType#fieldTypeForIndex to get the underlying field type. Unfortunately it's often necessary to work with the concrete field type, as we sometimes compare against the class, or MappedFieldType#typeName.

Disadvantages:

  • Supporting highlighting will require some work, as it's not immediately straightforward to add in this approach. The highlighter APIs expect a FieldMapper as opposed to a FieldType, so we can't just resolve the alias + pass in the concrete field type as usual. It looks like all known highlighter implementations only need the information on FieldType, but this is API is exposed to plugins and may be difficult to change. See Highlighter#canHighlight for an example of where the API expects a FieldMapper, when it really just needs a MappedFieldType.
  • If an error occurs after we've already grabbed aMappedFieldType, we don't display the original alias name in the error message. This actually seems okay to me so far -- I think users have a general mental model of a field alias being resolved to a concrete type at some point during the request. So it's not too surprising to see an error message that references the concrete type when, for example, field types are being validated.

Current limitations (that also apply to the original approach of creating AliasFieldType):

  • When executing a search, we must sometimes take care to refer to MappedFieldType#name as opposed to the field name provided originally. For an example, see the classes SuggestionBuilder and FetchPhase in either PR. This could be easy to miss when writing new code, as there's nothing in terms of type safety, etc., to stop you from using the original field name.
  • Some work needs to be done around fetching stored_fields. First, when a field name is specified as a wildcard, alias names are not matched. In addition, when requesting stored_fields for an alias, the correct data comes back, but the response refers to the concrete field name: "fields" : { "original_field" : [ "value" ]}.
  • When combining results from multiple indices, some data structures whose field names used to always line up are now no longer guaranteed to match. For example, we may now compare two SortField instances with different field names, since they can refer to different concrete fields depending on the index. It looks like this works out fine in the relevant cases like sorting + aggregations, but it makes me a bit nervous, as I think we're changing something that was previously an invariant.

@jtibshirani jtibshirani force-pushed the prototype/field-aliases branch from 9f6dc30 to b25f395 Compare May 30, 2018 00:45
@mP1
Copy link

mP1 commented May 30, 2018

@jtibshirani

Few comments...

Make sure to keep track of aliases in MapperService so that when field aliases are supplied to MapperService#fullName, the correct concrete field type is returned. The bulk of this logic resides in FieldTypeLookup.

It was mentioned that a weakness of my approach was that one needed to know whether to pick "nameForIndex" or "nameForMessage", this approach seems to be the same, in that programmers need to know whether they can use FieldType.name() or need to use MapperService.fullName.

A lot of call sites just use FieldType.name.

Im going to guess that to support all the use cases that my code does, will result in a lot more changes, and you may encounter cases where you dont have a MapperService.

I believe my approach is simpler because code only needs to pick two methods (after making #name pkg private), on the same class, but with this approach they need to know about an extra class and whether to just use FieldType#name or call MapperService.

JT: Supporting highlighting will require some work, as it's not immediately straightforward to add in this approach. The highlighter APIs expect a FieldMapper as opposed to a FieldType, so we can't just resolve the alias + pass in the concrete field type as usual. It looks like all known highlighter implementations only need the information on FieldType, but this is API is exposed to plugins and may be difficult to change. See Highlighter#canHighlight for an example of where the API expects a FieldMapper, when it really just needs a MappedFieldType.

MP: This example is precisely why I had to make AliasFieldType. I believe the list of in disadv is incomplete and there will be plenty more that my changes work and your approach will be significantly more complex and require more changes, simply because you dont have a MapperService around, and all you have is the FieldType.

In my way highlighting just a few one liner changes..

https://github.com/mP1/elasticsearch/commit/ebf1c8a1f545ce5f091d4b21f9b64816dd4bc871#diff-754d0b632fad605a00ede85ff888dde7

Current limitations (that also apply to the original approach of creating AliasFieldType):

JT: When executing a search, we must sometimes take care to refer to MappedFieldType#name as opposed to the field name provided originally. For an example, see the classes SuggestionBuilder and FetchPhase in either PR. This could be easy to miss when writing new code, as there's nothing in terms of type safety, etc., to stop you from using the original field name.

MP: Actually i think this limitation for my approach can be easily fixed, by simply making MappedFIeldType#name package private, which basically forces all callers to now be fixed and updated to call either MappedFieldType#nameForIndex or MappedFieldType#nameForMessages.

Im not sure if JT approach can be easily fixed in the same way, because MappedFieldType#name remains public and theres nothing stopping callers from calling.

Im not saying my solution is perfect but at least the scope is significantly smaller.

JT: Some work needs to be done around fetching stored_fields. First, when a field name is specified as a wildcard, alias names are not matched

MP: Im not sure if this is a ref to my stuff, it might just be something i have failed to complete or missed.

JT: When combining results from multiple indices, some data structures whose field names used to always line up are now no longer guaranteed to match. For example, we may now compare two SortField instances with different field names, since they can refer to different concrete fields depending on the index. It looks like this works out fine in the relevant cases like sorting + aggregations, but it makes me a bit nervous, as I think we're changing something that was previously an invariant.

MP: I think we need a real concrete example here, so both JT and MP approach can actually be tested, mind experiements for something complex like this arent safe to assume as correct.

@mP1
Copy link

mP1 commented May 30, 2018

JT: Disadvantages:

Supporting highlighting will require some work, as it's not immediately straightforward to add in this approach. The highlighter APIs expect a FieldMapper as opposed to a FieldType, so we can't just resolve the alias + pass in the concrete field type as usual. It looks like all known highlighter implementations only need the information on FieldType, but this is API is exposed to plugins and may be difficult to change. See Highlighter#canHighlight for an example of where the API expects a FieldMapper, when it really just needs a MappedFieldType.
If an error occurs after we've already grabbed aMappedFieldType, we don't display the original alias name in the error message. This actually seems okay to me so far -- I think users have a general mental model of a field alias being resolved to a concrete type at some point during the request. So it's not too surprising to see an error message that references the concrete type when, for example, field types are being validated.

MP: There is no mention in your text whether this use case (elastic#30230 (comment)) currently works in JT approach.

A similar problem exists in other/multiple use cases for the same reason, can we have clarification if JT's approach has solved this problem ?

JT: If an error occurs after we've already grabbed aMappedFieldType, we don't display the original alias name in the error message. This actually seems okay to me so far -- I think users have a general mental model of a field alias being resolved to a concrete type at some point during the request. So it's not too surprising to see an error message that references the concrete type when, for example, field types are being validated

The same is true of preparing the results for aggregations and other places, for the same reasons as preparing messages doesnt know whether to pick the alias or the target field name.

If it only has the FieldType, JT approach will be guessing when it tries to pick the alias or the target field name.

There are other similar examples when preparing other results, they need to be mentioned as solved, unsupported and so on, currently they are not mentioned at all.

@mP1
Copy link

mP1 commented May 30, 2018

The text needs to mention whether ANY call site that interacts with the "name" of the field actually knows the original field name (be it alias or target(true) name) is known so they can do the right thing.

Sorry if im making a big deal out of this but the comment prepared by JT fails to make a clear statement about this, except for a reference that "error" messages cant definitely pick the original field name, which is the crux of what my solution can do.

if (path == null) {
throw new IllegalArgumentException("The [path] property must be specified.");
}
return builder.path(path);
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Im going to guess this is incomplete, but im going to guess here, that it wont be able to provide all the proper checking im performing on the path, eg:

  • does the path point to a real field.
  • does the path actually exist

etc.

Copy link
Owner Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Here's where I was thinking we would check if the alias was valid: https://github.com/jtibshirani/elasticsearch/pull/1/files#diff-c35d5ab4eb379e531c33305a916faceaR403. Other validation is already performed in MapperService/ FieldTypeLookup, so there is some precedent for this.

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@jtibshirani

image

Seem strange your comment says check here, and a few lines below it seems the state is updated ( is it ??) again.

Copy link

@mP1 mP1 May 30, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actually precent is really my concern, my concern is whether the entire state of mappings is complete by the stage you wish to perform your checks.

Is it possible for "updates" to happen "after" this spot, which will result in a mappings that have not all been verified ? Basically is it possible that the target of an alias, might point to one thing when 403 is executed but after that the target field is something else, and never gets checked.

@jtibshirani
Copy link
Owner Author

jtibshirani commented May 30, 2018

My apologies for the confusion over what is supported -- I will update the PR comment to clarify. I ran through the script you provided, and everything except highlighting works (although more solid integration tests would certainly be needed before we had full confidence in the approach).

@jtibshirani
Copy link
Owner Author

@mP1 thanks for going over this prototype, and for your responses.

I updated the PR description to specify what's supported:

'Supports alias fields when querying, sorting, aggregating, requesting suggestions, requesting docvalue_fields, and referencing fields in scripts. Aliases can also be used when fetching stored_fields (with some caveats mentioned below). Additionally, aliases can be used in the field capabilities API.'

First, I'm a bit unclear on the concern mentioned here:

It was mentioned that a weakness of my approach was that one needed to know whether to pick "nameForIndex" or "nameForMessage", this approach seems to be the same, in that programmers need to know whether they can use FieldType.name() or need to use MapperService.fullName.

This approach is a bit different in that there is no AliasFieldType. So you only ever have access to the field name mentioned in a request, or the concrete MappedFieldType that contains a concrete field name. The idea is that when executing a search request, we always eventually get ahold of the MappedFieldType. Once you have this field type, you can be certain it represents a concrete type in the index, and there are no further choices to make around which name to use, etc.

Next, in response to these comments...

Im going to guess that to support all the use cases that my code does, will result in a lot more changes, and you may encounter cases where you dont have a MapperService.

The same is true of preparing the results for aggregations and other places, for the same reasons as preparing messages doesnt know whether to pick the alias or the target field name.

Now that I've clarified what is supported, would you be able to give a few examples of requests where this approach doesn't work? I could certainly be overlooking something, as I haven't written thorough tests for this prototype.

Lastly, I agree that it's not as straightforward to support highlighting with this approach.

@mP1
Copy link

mP1 commented May 31, 2018

On my PR you will find a large comment filled with requests. Can you take each one and one, and create a new comment here, that shows request and response, with a one line about whether it does the right thing or has a weakness etc.

I think that will really help speed up this conversation, to see cold hard results ...

@mP1
Copy link

mP1 commented May 31, 2018

Sorry was using wrong commit will try again...

@mP1
Copy link

mP1 commented May 31, 2018

Found a few weird failings, cant really show you failings without a working...

PUT /index-xyz
{
	"mappings": {
		"document": {
			"properties": {
				"integer1": {
					"type": "integer"
				},
				"integer2": {
					"type": "integer"
				},
				"alias3": {
					"type": "alias",
					"path": "integer2"
				},
        "text4" : {
					"type": "text",
					"fielddata": true,
					"term_vector": "with_positions_offsets",
					"store": true
				},
				"alias5": {
				  "type": "alias",
				  "path": "text4"
				},
        "text6" : {
					"type": "text",
					"fielddata": true,
					"term_vector": "with_positions_offsets",
					"store": true
				}
			}
		}
	}
}

PUT /index-xyz/document/1
{
    "integer1": 11,
    "integer2": 101,
    "text4": "abc",
    "text6": "qqq"
}

PUT /index-xyz/document/2
{
    "integer1": 21,
    "integer2": 201,
    "text4": "def",
    "text6": "rrr"
}

PUT /index-xyz/document/3
{
    "integer1": 39,
    "integer2": 309,
    "text4": "abc",
    "text6": "sss"
}

PUT /index-xyz/document/4
{
    "text4": "abc",
    "text6": "sss"
}

image

@mP1
Copy link

mP1 commented May 31, 2018

Not sure why aggs for the same eventual field give different results...

(same mappings+docs as previous comment). Note the second agg, does have the "correct" aggs at the bottom but it shows all the hits unnecessarily(the first results dont include). My PR shows only the aggs no hits for both.

NB alias3 -> integer2

image

@mP1
Copy link

mP1 commented May 31, 2018

will leave it at that...for now.highlighting appears broken on mine atm(im sure it was working before), will fix and get back to you

@jpountz
Copy link

jpountz commented May 31, 2018

I like this approach in general. It looks like no matter which approach we decide on in the end, some cleanups are required, for instance so that highlighters take a MappedFieldType rather than a FieldMapper.

For example, we may now compare two SortField instances with different field names, since they can refer to different concrete fields depending on the index. It looks like this works out fine in the relevant cases like sorting + aggregations, but it makes me a bit nervous, as I think we're changing something that was previously an invariant.

I'm not worried about this. Let's add some tests for confidence, but it should work out of the box.

@jtibshirani
Copy link
Owner Author

@mP1 thanks for taking a look. In your first example, it looks like the search endpoint is misspelled (_searchx), so that's why it's failing. For the other example, the first aggregation request has size=0, whereas the second one doesn't, so we would expect there to only be search hits for the second. I ran through the script previously and everything noted as 'supported' above looked like it worked fine. Instead of manually comparing the output in comments, I think our time would be well-spent writing integration tests for the feature.

@jpountz okay, great. It seems like with a bunch of tests, this approach could be viable.

@mP1
Copy link

mP1 commented May 31, 2018

@jtibshirani

JT: I will take a crack at the highlighter fix in a separate PR.

I think you should try to get at least one of the several highlighting use cases to work correctly with either an alias or the target that has an alias but is referenced by the target.

This is the real weakness that i have been concentrating on, the forward processing is the easier part, the post processing of a response from "lucene" is the part that i believe highlights the weaknesses of your approach.

By doing this in two steps, what happens if you hit a brick wall getting highlighting to work and it cant work, and the "first" half is in master ?

At the very least i believe the best approach is do one highlight use case, as mental experiments are not safe, and the only proof is seeing a working example.

@jtibshirani
Copy link
Owner Author

By doing this in two steps, what happens if you hit a brick wall getting highlighting to work and it cant work, and the "first" half is in master ?

I tried to clarify my comment on the original PR -- moving highlighters to use MappedFieldType instead of FieldMapper will be a nice refactor in itself, so we aren't too worried about it going in separately.

At the very least i believe the best approach is do one highlight use case, as mental experiments are not safe, and the only proof is seeing a working example.

Thanks for the suggestion, I'll work on adding highlighting to this prototype in parallel.

@jtibshirani jtibshirani force-pushed the prototype/field-aliases branch from 9cdc591 to 24ff1d8 Compare June 1, 2018 22:52
@jtibshirani
Copy link
Owner Author

jtibshirani commented Jun 1, 2018

I've opened the following PR that moves highlighters to use field types: elastic#31039

I've added that refactor to this prototype, then added a small commit to get highlighting working. As I mentioned in the 'current limitations' section, I'm not super happy with lines like the following, but it works and is not too drastic a change: 24ff1d8#diff-a170a995d8fbdb81e218c83aad1641e3R116

@jpountz
Copy link

jpountz commented Jun 4, 2018

I'm not super happy with lines like the following, but it works and is not too drastic a change

It looks totally fine to me.

@jtibshirani
Copy link
Owner Author

Closing this prototype, and will open a new PR against elastic:master.

@jtibshirani jtibshirani closed this Jun 5, 2018
@jtibshirani jtibshirani deleted the prototype/field-aliases branch June 5, 2018 00:59
jtibshirani pushed a commit that referenced this pull request Jun 13, 2018
The `requires_replica` yaml test feature hasn't worked for years. This
is what happens if you try to use it:
```
   > Throwable #1: java.lang.NullPointerException
   >    at __randomizedtesting.SeedInfo.seed([E6602FB306244B12:6E341069A8D826EA]:0)
   >    at org.elasticsearch.test.rest.yaml.Features.areAllSupported(Features.java:58)
   >    at org.elasticsearch.test.rest.yaml.section.SkipSection.skip(SkipSection.java:144)
   >    at org.elasticsearch.test.rest.yaml.ESClientYamlSuiteTestCase.test(ESClientYamlSuiteTestCase.java:321)
```

None of our tests use it.
jtibshirani pushed a commit that referenced this pull request Jun 22, 2018
The `requires_replica` yaml test feature hasn't worked for years. This
is what happens if you try to use it:
```
   > Throwable #1: java.lang.NullPointerException
   >    at __randomizedtesting.SeedInfo.seed([E6602FB306244B12:6E341069A8D826EA]:0)
   >    at org.elasticsearch.test.rest.yaml.Features.areAllSupported(Features.java:58)
   >    at org.elasticsearch.test.rest.yaml.section.SkipSection.skip(SkipSection.java:144)
   >    at org.elasticsearch.test.rest.yaml.ESClientYamlSuiteTestCase.test(ESClientYamlSuiteTestCase.java:321)
```

None of our tests use it.
jtibshirani pushed a commit that referenced this pull request Nov 7, 2018
Error was thrown if leader index had no soft deletes enabled, but it then continued creating the follower index.

The test caught this bug, but very rarely due to timing issue.

Build failure instance:

```
1> [2018-11-05T20:29:38,597][INFO ][o.e.x.c.LocalIndexFollowingIT] [testDoNotCreateFollowerIfLeaderDoesNotHaveSoftDeletes] before test
  1> [2018-11-05T20:29:38,599][INFO ][o.e.c.s.ClusterSettings  ] [node_s_0] updating [cluster.remote.local.seeds] from [[]] to [["127.0.0.1:9300"]]
  1> [2018-11-05T20:29:38,599][INFO ][o.e.c.s.ClusterSettings  ] [node_s_0] updating [cluster.remote.local.seeds] from [[]] to [["127.0.0.1:9300"]]
  1> [2018-11-05T20:29:38,609][INFO ][o.e.c.m.MetaDataCreateIndexService] [node_s_0] [leader-index] creating index, cause [api], templates [random-soft-deletes-templat
e, one_shard_index_template], shards [2]/[0], mappings []
  1> [2018-11-05T20:29:38,628][INFO ][o.e.c.r.a.AllocationService] [node_s_0] Cluster health status changed from [YELLOW] to [GREEN] (reason: [shards started [[leader-
index][0]] ...]).
  1> [2018-11-05T20:29:38,660][INFO ][o.e.x.c.a.TransportPutFollowAction] [node_s_0] [follower-index] creating index, cause [ccr_create_and_follow], shards [2]/[0]
  1> [2018-11-05T20:29:38,675][INFO ][o.e.c.s.ClusterSettings  ] [node_s_0] updating [cluster.remote.local.seeds] from [["127.0.0.1:9300"]] to [[]]
  1> [2018-11-05T20:29:38,676][INFO ][o.e.c.s.ClusterSettings  ] [node_s_0] updating [cluster.remote.local.seeds] from [["127.0.0.1:9300"]] to [[]]
  1> [2018-11-05T20:29:38,678][INFO ][o.e.x.c.LocalIndexFollowingIT] [testDoNotCreateFollowerIfLeaderDoesNotHaveSoftDeletes] after test
  1> [2018-11-05T20:29:38,678][INFO ][o.e.x.c.LocalIndexFollowingIT] [testDoNotCreateFollowerIfLeaderDoesNotHaveSoftDeletes] [LocalIndexFollowingIT#testDoNotCreateFoll
owerIfLeaderDoesNotHaveSoftDeletes]: cleaning up after test
  1> [2018-11-05T20:29:38,678][INFO ][o.e.c.m.MetaDataDeleteIndexService] [node_s_0] [follower-index/TlWlXp0JSVasju2Kr_hksQ] deleting index
  1> [2018-11-05T20:29:38,678][INFO ][o.e.c.m.MetaDataDeleteIndexService] [node_s_0] [leader-index/FQ6EwIWcRAKD8qvOg2eS8g] deleting index
FAILURE 0.23s J0 | LocalIndexFollowingIT.testDoNotCreateFollowerIfLeaderDoesNotHaveSoftDeletes <<< FAILURES!
   > Throwable #1: java.lang.AssertionError:
   > Expected: <false>
   >      but: was <true>
   >    at __randomizedtesting.SeedInfo.seed([7A3C89DA3BCA17DD:65C26CBF6FEF0B39]:0)
   >    at org.hamcrest.MatcherAssert.assertThat(MatcherAssert.java:20)
   >    at org.elasticsearch.xpack.ccr.LocalIndexFollowingIT.testDoNotCreateFollowerIfLeaderDoesNotHaveSoftDeletes(LocalIndexFollowingIT.java:83)
   >    at java.lang.Thread.run(Thread.java:748)
```

Build failure: https://elasticsearch-ci.elastic.co/job/elastic+elasticsearch+6.5+intake/46/console
jtibshirani pushed a commit that referenced this pull request Mar 5, 2019
Today this test catches an exception and asserts that its proximate cause has
message `Random IOException` but occasionally this exception is wrapped two
layers deep, causing the test to fail. This commit adjusts the test to look at
the root cause of the exception instead.

      1> [2019-02-25T12:31:50,837][INFO ][o.e.s.SharedClusterSnapshotRestoreIT] [testSnapshotFileFailureDuringSnapshot] --> caught a top level exception, asserting what's expected
      1> org.elasticsearch.snapshots.SnapshotException: [test-repo:test-snap/e-hn_pLGRmOo97ENEXdQMQ] Snapshot could not be read
      1> 	at org.elasticsearch.snapshots.SnapshotsService.snapshots(SnapshotsService.java:212) ~[main/:?]
      1> 	at org.elasticsearch.action.admin.cluster.snapshots.get.TransportGetSnapshotsAction.masterOperation(TransportGetSnapshotsAction.java:135) ~[main/:?]
      1> 	at org.elasticsearch.action.admin.cluster.snapshots.get.TransportGetSnapshotsAction.masterOperation(TransportGetSnapshotsAction.java:54) ~[main/:?]
      1> 	at org.elasticsearch.action.support.master.TransportMasterNodeAction.masterOperation(TransportMasterNodeAction.java:127) ~[main/:?]
      1> 	at org.elasticsearch.action.support.master.TransportMasterNodeAction$AsyncSingleAction$2.doRun(TransportMasterNodeAction.java:208) ~[main/:?]
      1> 	at org.elasticsearch.common.util.concurrent.ThreadContext$ContextPreservingAbstractRunnable.doRun(ThreadContext.java:751) ~[main/:?]
      1> 	at org.elasticsearch.common.util.concurrent.AbstractRunnable.run(AbstractRunnable.java:37) ~[main/:?]
      1> 	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) ~[?:1.8.0_202]
      1> 	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) ~[?:1.8.0_202]
      1> 	at java.lang.Thread.run(Thread.java:748) [?:1.8.0_202]
      1> Caused by: org.elasticsearch.snapshots.SnapshotException: [test-repo:test-snap/e-hn_pLGRmOo97ENEXdQMQ] failed to get snapshots
      1> 	at org.elasticsearch.repositories.blobstore.BlobStoreRepository.getSnapshotInfo(BlobStoreRepository.java:564) ~[main/:?]
      1> 	at org.elasticsearch.snapshots.SnapshotsService.snapshots(SnapshotsService.java:206) ~[main/:?]
      1> 	... 9 more
      1> Caused by: java.io.IOException: Random IOException
      1> 	at org.elasticsearch.snapshots.mockstore.MockRepository$MockBlobStore$MockBlobContainer.maybeIOExceptionOrBlock(MockRepository.java:275) ~[test/:?]
      1> 	at org.elasticsearch.snapshots.mockstore.MockRepository$MockBlobStore$MockBlobContainer.readBlob(MockRepository.java:317) ~[test/:?]
      1> 	at org.elasticsearch.repositories.blobstore.ChecksumBlobStoreFormat.readBlob(ChecksumBlobStoreFormat.java:101) ~[main/:?]
      1> 	at org.elasticsearch.repositories.blobstore.BlobStoreFormat.read(BlobStoreFormat.java:90) ~[main/:?]
      1> 	at org.elasticsearch.repositories.blobstore.BlobStoreRepository.getSnapshotInfo(BlobStoreRepository.java:560) ~[main/:?]
      1> 	at org.elasticsearch.snapshots.SnapshotsService.snapshots(SnapshotsService.java:206) ~[main/:?]
      1> 	... 9 more

    FAILURE 0.59s J0 | SharedClusterSnapshotRestoreIT.testSnapshotFileFailureDuringSnapshot <<< FAILURES!
       > Throwable #1: java.lang.AssertionError:
       > Expected: a string containing "Random IOException"
       >      but: was "[test-repo:test-snap/e-hn_pLGRmOo97ENEXdQMQ] failed to get snapshots"
       > 	at __randomizedtesting.SeedInfo.seed([B73CA847D4B4F52D:884E042D2D899330]:0)
       > 	at org.hamcrest.MatcherAssert.assertThat(MatcherAssert.java:20)
       > 	at org.elasticsearch.snapshots.SharedClusterSnapshotRestoreIT.testSnapshotFileFailureDuringSnapshot(SharedClusterSnapshotRestoreIT.java:821)
       > 	at java.lang.Thread.run(Thread.java:748)
jtibshirani pushed a commit that referenced this pull request May 6, 2019
Today this test catches an exception and asserts that its proximate cause has
message `Random IOException` but occasionally this exception is wrapped two
layers deep, causing the test to fail. This commit adjusts the test to look at
the root cause of the exception instead.

      1> [2019-02-25T12:31:50,837][INFO ][o.e.s.SharedClusterSnapshotRestoreIT] [testSnapshotFileFailureDuringSnapshot] --> caught a top level exception, asserting what's expected
      1> org.elasticsearch.snapshots.SnapshotException: [test-repo:test-snap/e-hn_pLGRmOo97ENEXdQMQ] Snapshot could not be read
      1> 	at org.elasticsearch.snapshots.SnapshotsService.snapshots(SnapshotsService.java:212) ~[main/:?]
      1> 	at org.elasticsearch.action.admin.cluster.snapshots.get.TransportGetSnapshotsAction.masterOperation(TransportGetSnapshotsAction.java:135) ~[main/:?]
      1> 	at org.elasticsearch.action.admin.cluster.snapshots.get.TransportGetSnapshotsAction.masterOperation(TransportGetSnapshotsAction.java:54) ~[main/:?]
      1> 	at org.elasticsearch.action.support.master.TransportMasterNodeAction.masterOperation(TransportMasterNodeAction.java:127) ~[main/:?]
      1> 	at org.elasticsearch.action.support.master.TransportMasterNodeAction$AsyncSingleAction$2.doRun(TransportMasterNodeAction.java:208) ~[main/:?]
      1> 	at org.elasticsearch.common.util.concurrent.ThreadContext$ContextPreservingAbstractRunnable.doRun(ThreadContext.java:751) ~[main/:?]
      1> 	at org.elasticsearch.common.util.concurrent.AbstractRunnable.run(AbstractRunnable.java:37) ~[main/:?]
      1> 	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) ~[?:1.8.0_202]
      1> 	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) ~[?:1.8.0_202]
      1> 	at java.lang.Thread.run(Thread.java:748) [?:1.8.0_202]
      1> Caused by: org.elasticsearch.snapshots.SnapshotException: [test-repo:test-snap/e-hn_pLGRmOo97ENEXdQMQ] failed to get snapshots
      1> 	at org.elasticsearch.repositories.blobstore.BlobStoreRepository.getSnapshotInfo(BlobStoreRepository.java:564) ~[main/:?]
      1> 	at org.elasticsearch.snapshots.SnapshotsService.snapshots(SnapshotsService.java:206) ~[main/:?]
      1> 	... 9 more
      1> Caused by: java.io.IOException: Random IOException
      1> 	at org.elasticsearch.snapshots.mockstore.MockRepository$MockBlobStore$MockBlobContainer.maybeIOExceptionOrBlock(MockRepository.java:275) ~[test/:?]
      1> 	at org.elasticsearch.snapshots.mockstore.MockRepository$MockBlobStore$MockBlobContainer.readBlob(MockRepository.java:317) ~[test/:?]
      1> 	at org.elasticsearch.repositories.blobstore.ChecksumBlobStoreFormat.readBlob(ChecksumBlobStoreFormat.java:101) ~[main/:?]
      1> 	at org.elasticsearch.repositories.blobstore.BlobStoreFormat.read(BlobStoreFormat.java:90) ~[main/:?]
      1> 	at org.elasticsearch.repositories.blobstore.BlobStoreRepository.getSnapshotInfo(BlobStoreRepository.java:560) ~[main/:?]
      1> 	at org.elasticsearch.snapshots.SnapshotsService.snapshots(SnapshotsService.java:206) ~[main/:?]
      1> 	... 9 more

    FAILURE 0.59s J0 | SharedClusterSnapshotRestoreIT.testSnapshotFileFailureDuringSnapshot <<< FAILURES!
       > Throwable #1: java.lang.AssertionError:
       > Expected: a string containing "Random IOException"
       >      but: was "[test-repo:test-snap/e-hn_pLGRmOo97ENEXdQMQ] failed to get snapshots"
       > 	at __randomizedtesting.SeedInfo.seed([B73CA847D4B4F52D:884E042D2D899330]:0)
       > 	at org.hamcrest.MatcherAssert.assertThat(MatcherAssert.java:20)
       > 	at org.elasticsearch.snapshots.SharedClusterSnapshotRestoreIT.testSnapshotFileFailureDuringSnapshot(SharedClusterSnapshotRestoreIT.java:821)
       > 	at java.lang.Thread.run(Thread.java:748)
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants