Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Request index not exist handling #169

Merged
merged 2 commits into from
Nov 21, 2023

Conversation

penghuo
Copy link
Collaborator

@penghuo penghuo commented Nov 19, 2023

Description

  • In Flint.createIndex, create query_execution_request index if not exist
  • Check query_execution_request, before write to **query_execution_request **index, if index not exist, throw IllegalStateException, emit metrics QueryExecutionRequestIndexNotFound.
  • When write
    • In translog state update, **query_execution_request **index, service not available / write block, throw IllegalStateException. Emit metrics QueryExecutionRequestIndexWriteFail
    • heartbeat update, **query_execution_request **index, service not available / write block, catch exception, Emit metrics QueryExecutionRequestIndexWriteFail. After 5 times, Monitor system detect the job lastUpdateTime is stale. if index is green, Monitor system retry EMR-S job. if index is RED / write block. Monitor system should cut ticket.
  • When read
    • In translog state init***, query_execution_request*** index, read block, throw IllegalStateException, Emit metrics QueryExecutionRequestIndexReadFail.
    • heartbeat update, **query_execution_request **index, service not available / write block, catch exception, Emit metrics QueryExecutionRequestIndexReadFail. After 5 times, Monitor system detect the job lastUpdateTime is stale. if index is green, Monitor system retry EMR-S job. if index is read block, Monitor system should cut ticket.

Detail info in #166

Issues Resolved

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.
For more information on following Developer Certificate of Origin and signing off your commits, please check here.

@@ -28,6 +33,7 @@ public void upsert(String id, String doc) {
// also, failure to close the client causes the job to be stuck in the running state as the client resource
// is not released.
try (RestHighLevelClient client = flintClient.createClient()) {
assertIndexExist(client, indexName);
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@kaituo any other test cases we need to cover when write to query_execution_request index?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We may also need to check request indexing mapping as we did in the result index mapping. This can be done once during REPL lifetime.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Track in #172

@penghuo penghuo marked this pull request as ready for review November 20, 2023 16:01
@penghuo penghuo self-assigned this Nov 20, 2023
@penghuo penghuo added 0.1.1 enhancement New feature or request labels Nov 20, 2023
@penghuo penghuo marked this pull request as draft November 20, 2023 18:00
Signed-off-by: Peng Huo <penghuo@gmail.com>
@penghuo penghuo force-pushed the handleNotIndexFound branch from 4d77ba4 to 6055e1a Compare November 21, 2023 00:32
@@ -46,7 +42,7 @@ object FlintREPL extends Logging with FlintJobExecutor {
private val HEARTBEAT_INTERVAL_MILLIS = 60000L
private val DEFAULT_INACTIVITY_LIMIT_MILLIS = 10 * 60 * 1000
private val MAPPING_CHECK_TIMEOUT = Duration(1, MINUTES)
private val DEFAULT_QUERY_EXECUTION_TIMEOUT = Duration(10, MINUTES)
private val DEFAULT_QUERY_EXECUTION_TIMEOUT = Duration(30, MINUTES)
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Increase query execution timeout to 30mins.

@penghuo penghuo marked this pull request as ready for review November 21, 2023 01:39
Comment on lines +141 to +144
| "lastUpdateTime": {
| "type": "date",
| "format": "strict_date_time||epoch_millis"
| },
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Previously this was Long type? And should we add jobStartTime here too?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@@ -44,6 +48,7 @@ public void upsert(String id, String doc) {

public void update(String id, String doc) {
try (RestHighLevelClient client = flintClient.createClient()) {
assertIndexExist(client, indexName);
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So this check is not required for Flint metadata log entry update?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I found I didn't set refresh policy in create/update doc in FlintOpenSearchMetadataLog. Could you help check if we should do .setRefreshPolicy(WriteRequest.RefreshPolicy.WAIT_UNTIL) there as well?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

} else {
String errorMsg = "Metadata log index not found " + metaLogIndexName;
LOG.warning(errorMsg);
throw new IllegalStateException(errorMsg);
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

where do you emit metrics QueryExecutionRequestIndexNotFound ?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

will add after metrics sink ready.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

track in here. #117

@@ -28,6 +33,7 @@ public void upsert(String id, String doc) {
// also, failure to close the client causes the job to be stuck in the running state as the client resource
// is not released.
try (RestHighLevelClient client = flintClient.createClient()) {
assertIndexExist(client, indexName);
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We may also need to check request indexing mapping as we did in the result index mapping. This can be done once during REPL lifetime.

Signed-off-by: Peng Huo <penghuo@gmail.com>
@penghuo penghuo merged commit e201f09 into opensearch-project:main Nov 21, 2023
4 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
0.1.1 enhancement New feature or request
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants