Indexing: add Doc status counter #8716

Merged: 1 commit, Sep 28, 2023

@r1walz r1walz commented Jul 17, 2023

Description

Currently, OpenSearch returns a 200 OK response code for a Bulk API call even when some or all of the items in the request fail end to end. To track these failures, clients have to parse the response themselves and interpret each item's status. A view of how the counts of the different REST status codes grow at the item level can provide insight into how the indexing engine is performing.
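
To make the idea of item-level status tracking concrete, here is a minimal sketch of grouping per-item statuses by class (2xx, 4xx, 5xx). It is not the code from this PR: `BulkStatusTally`, `statusClass`, and `tally` are hypothetical names, and the statuses are passed in as plain integers rather than read from a real bulk response.

```java
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

/** Hypothetical helper: tallies the per-item statuses of one bulk response by status class. */
final class BulkStatusTally {

    /** Maps an HTTP status code such as 201 or 429 to its class label ("2xx", "4xx", ...). */
    static String statusClass(int status) {
        if (status < 100 || status > 599) {
            throw new IllegalArgumentException("not an HTTP status code: " + status);
        }
        return (status / 100) + "xx";
    }

    /** Counts how many items fall into each status class; keys are sorted for stable output. */
    static Map<String, Long> tally(List<Integer> itemStatuses) {
        Map<String, Long> counts = new TreeMap<>();
        for (int status : itemStatuses) {
            counts.merge(statusClass(status), 1L, Long::sum);
        }
        return counts;
    }

    public static void main(String[] args) {
        // Assumed item statuses inside a single bulk response whose top-level code is 200 OK.
        List<Integer> itemStatuses = List.of(201, 201, 409, 429, 200, 503);
        System.out.println(tally(itemStatuses)); // prints {2xx=3, 4xx=2, 5xx=1}
    }
}
```

Keeping such counts on the server side means a client does not have to repeat this parsing for every response just to see the trend.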

Related Issues

Resolves #4562

Check List

  • New functionality includes testing.
    • All tests pass
  • New functionality has been documented.
    • New functionality has javadoc added
  • Commits are signed per the DCO using --signoff
  • Commit changes are listed out in CHANGELOG.md file (See: Changelog)
Benchmarks!

DocStatusCounter Benchmarking

Configuration:

  • 3 Data Nodes: r6g.4xlarge
  • 3 Manager Nodes: c6g.2xlarge
  • Heap: 64 GB
  • Indexed 160,000,000 (160M) docs using opensearch-benchmark
  • CPU Utilization > 90% each run

| Metric | Unit | Task | String Keys | Baseline | Short Keys | ThreadLocalMap | AtomicLong[] |
|---|---|---|---|---|---|---|---|
| Cumulative indexing time of primary shards | min | | 185.26433 | 173.42567 | 176.12467 | 176.21833 | 169.801 |
| Min cumulative indexing time across primary shards | min | | 6.67E-05 | 55.43687 | 55.61673 | 56.04497 | 55.348 |
| Median cumulative indexing time across primary shards | min | | 59.04873 | 57.10593 | 58.50937 | 58.2993 | 55.5361 |
| Max cumulative indexing time across primary shards | min | | 67.16667 | 60.88293 | 61.9988 | 61.87413 | 58.917 |
| Cumulative indexing throttle time of primary shards | min | | 0 | 0 | 0 | 0 | 0 |
| Min cumulative indexing throttle time across primary shards | min | | 0 | 0 | 0 | 0 | 0 |
| Median cumulative indexing throttle time across primary shards | min | | 0 | 0 | 0 | 0 | 0 |
| Max cumulative indexing throttle time across primary shards | min | | 0 | 0 | 0 | 0 | 0 |
| Cumulative merge time of primary shards | min | | 53.6561 | 53.96897 | 51.88953 | 55.65487 | 47.0504 |
| Cumulative merge count of primary shards | | | 113.33333 | 107.33333 | 105.66667 | 107.66667 | 105 |
| Min cumulative merge time across primary shards | min | | 0 | 16.16687 | 16.6941 | 16.01773 | 12.6753 |
| Median cumulative merge time across primary shards | min | | 17.10093 | 18.38877 | 16.9021 | 18.06377 | 17.0686 |
| Max cumulative merge time across primary shards | min | | 19.45427 | 19.41333 | 18.2934 | 21.57333 | 17.3066 |
| Cumulative merge throttle time of primary shards | min | | 14.49533 | 16.94427 | 14.9402 | 16.39743 | 12.676 |
| Min cumulative merge throttle time across primary shards | min | | 0 | 4.72383 | 4.03366 | 4.01268 | 3.78305 |
| Median cumulative merge throttle time across primary shards | min | | 4.3005 | 5.36746 | 5.16321 | 5.4102 | 4.10685 |
| Max cumulative merge throttle time across primary shards | min | | 5.89435 | 6.85298 | 5.74337 | 6.97451 | 4.78612 |
| Cumulative refresh time of primary shards | min | | 4.53589 | 4.42749 | 4.32285 | 4.26004 | 4.27837 |
| Cumulative refresh count of primary shards | | | 143 | 126.66667 | 126.33333 | 128.66667 | 120 |
| Min cumulative refresh time across primary shards | min | | 0.00018 | 1.39173 | 1.35457 | 1.32311 | 1.41253 |
| Median cumulative refresh time across primary shards | min | | 1.48191 | 1.43236 | 1.46172 | 1.4063 | 1.41697 |
| Max cumulative refresh time across primary shards | min | | 1.57189 | 1.6034 | 1.50656 | 1.53063 | 1.44887 |
| Cumulative flush time of primary shards | min | | 2.12778 | 2.24693 | 2.20361 | 2.09975 | 2.38865 |
| Cumulative flush count of primary shards | | | 31.33333 | 30.66667 | 28.66667 | 29.33333 | 28 |
| Min cumulative flush time across primary shards | min | | 0.00015 | 0.65804 | 0.62011 | 0.62618 | 0.75745 |
| Median cumulative flush time across primary shards | min | | 0.68019 | 0.77672 | 0.75957 | 0.67874 | 0.7731 |
| Max cumulative flush time across primary shards | min | | 0.76725 | 0.81217 | 0.82393 | 0.79483 | 0.8581 |
| Total Young Gen GC time | s | | 38.75267 | 34.86833 | 36.069 | 33.21967 | 36.447 |
| Total Young Gen GC count | | | 502 | 486.33333 | 495.66667 | 447 | 542 |
| Total Old Gen GC time | s | | 0 | 0 | 0 | 0 | 0 |
| Total Old Gen GC count | | | 0 | 0 | 0 | 0 | 0 |
| Store size | GB | | 63.48167 | 52.1122 | 57.52483 | 52.37343 | 72.8811 |
| Translog size | GB | | 5.63E-07 | 4.61E-07 | 4.61E-07 | 4.61E-07 | 4.61E-07 |
| Heap used for segments | MB | | 0 | 0 | 0 | 0 | 0 |
| Heap used for doc values | MB | | 0 | 0 | 0 | 0 | 0 |
| Heap used for terms | MB | | 0 | 0 | 0 | 0 | 0 |
| Heap used for norms | MB | | 0 | 0 | 0 | 0 | 0 |
| Heap used for points | MB | | 0 | 0 | 0 | 0 | 0 |
| Heap used for stored fields | MB | | 0 | 0 | 0 | 0 | 0 |
| Segment count | | | 86 | 89.33333 | 88.33333 | 85.33333 | 73 |
| Min Throughput | docs/s | index | 188963 | 190996.66667 | 188393.66667 | 202766 | 168622 |
| Mean Throughput | docs/s | index | 202721.66667 | 215656.33333 | 212218 | 213985 | 219714 |
| Median Throughput | docs/s | index | 199066.33333 | 214909 | 210353 | 211633.66667 | 220747 |
| Max Throughput | docs/s | index | 243293.66667 | 241402.66667 | 240322 | 247813 | 229924 |
| 50th percentile latency | ms | index | 4956.15667 | 4532.03 | 4584.96667 | 4653.31333 | 4191.51 |
| 90th percentile latency | ms | index | 7005.88667 | 6453.49333 | 6649.78 | 6524.09667 | 6210.21 |
| 99th percentile latency | ms | index | 9844.68333 | 9437.39333 | 9771.28 | 9136.64667 | 9669.73 |
| 99.9th percentile latency | ms | index | 11722.83333 | 11428.73333 | 11919.23333 | 10853.76667 | 12189.8 |
| 100th percentile latency | ms | index | 13378.36667 | 13292.16667 | 13863.46667 | 11733.63333 | 12971.5 |
| 50th percentile service time | ms | index | 4956.15667 | 4532.03 | 4584.96667 | 4653.31333 | 4191.51 |
| 90th percentile service time | ms | index | 7005.88667 | 6453.49333 | 6649.78 | 6524.09667 | 6210.21 |
| 99th percentile service time | ms | index | 9844.68333 | 9437.39333 | 9771.28 | 9136.64667 | 9669.73 |
| 99.9th percentile service time | ms | index | 11722.83333 | 11428.73333 | 11919.23333 | 10853.76667 | 12189.8 |
| 100th percentile service time | ms | index | 13378.36667 | 13292.16667 | 13863.46667 | 11733.63333 | 12971.5 |
| error rate | % | index | 0 | 0 | 0 | 0 | 0 |
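
The benchmark column names suggest that several ways of holding the counters were compared, among them an `AtomicLong[]`. The PR's actual code is not shown on this page, so the following is only a sketch of what an array-backed counter could look like, assuming one slot per status class (1xx through 5xx). `DocStatusCounterSketch`, `increment`, and `count` are hypothetical names.

```java
import java.util.concurrent.atomic.AtomicLong;

/** Hypothetical sketch: one lock-free counter per REST status class, held in a fixed array. */
final class DocStatusCounterSketch {

    // Slot 0 holds 1xx, slot 1 holds 2xx, ..., slot 4 holds 5xx.
    private final AtomicLong[] counters = new AtomicLong[5];

    DocStatusCounterSketch() {
        for (int i = 0; i < counters.length; i++) {
            counters[i] = new AtomicLong();
        }
    }

    /** Converts a status class digit (1..5) to an array index, rejecting anything else. */
    private static int slotFor(int statusClass) {
        if (statusClass < 1 || statusClass > 5) {
            throw new IllegalArgumentException("unsupported status class: " + statusClass);
        }
        return statusClass - 1;
    }

    /** Records one document-level result, e.g. increment(201) bumps the 2xx slot. */
    void increment(int status) {
        counters[slotFor(status / 100)].incrementAndGet();
    }

    /** Returns the current count for a status class digit, e.g. count(4) for 4xx. */
    long count(int statusClass) {
        return counters[slotFor(statusClass)].get();
    }

    public static void main(String[] args) throws InterruptedException {
        DocStatusCounterSketch counter = new DocStatusCounterSketch();
        Thread[] workers = new Thread[4];
        for (int t = 0; t < workers.length; t++) {
            // Each worker reports 900 successes (201) and 100 rejections (429).
            workers[t] = new Thread(() -> {
                for (int i = 0; i < 1000; i++) {
                    counter.increment(i % 10 == 0 ? 429 : 201);
                }
            });
            workers[t].start();
        }
        for (Thread worker : workers) {
            worker.join();
        }
        System.out.println("2xx=" + counter.count(2) + " 4xx=" + counter.count(4)); // prints 2xx=3600 4xx=400
    }
}
```

A fixed array indexed by status class avoids both hashing and string keys on the indexing path. Whether that accounts for the differences in the table above is not something this sketch can show.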

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.
For more information on following Developer Certificate of Origin and signing off your commits, please check here.

@r1walz r1walz force-pushed the ra/idx-axn-cntr branch from c70381f to 884d313 Compare July 17, 2023 08:29
@r1walz r1walz force-pushed the ra/idx-axn-cntr branch from 884d313 to 877a5e2 Compare July 17, 2023 08:58
@r1walz r1walz force-pushed the ra/idx-axn-cntr branch from 877a5e2 to bf4cdcb Compare July 17, 2023 09:22
@r1walz r1walz force-pushed the ra/idx-axn-cntr branch from bf4cdcb to e658274 Compare July 17, 2023 19:01
Currently, Opensearch returns a 200 OK response code for a Bulk API
call, even though there can be partial/complete failures within the
request E2E. Tracking these failures requires client to parse the
response on their side and make sense of them. But, a general idea
around trend in growth of different rest status codes at item level
can provide insights on how indexing engine is performing.

Signed-off-by: Rohit Ashiwal <rashiwal@amazon.com>

@shwetathareja shwetathareja left a comment

Thanks @r1walz . LGTM

@r1walz r1walz requested a review from mgodwan September 28, 2023 12:26
@shwetathareja shwetathareja merged commit d656e3d into opensearch-project:main Sep 28, 2023
15 checks passed
@shwetathareja shwetathareja added the backport 2.x Backport to 2.x branch label Sep 28, 2023
@r1walz r1walz deleted the ra/idx-axn-cntr branch September 28, 2023 13:04
opensearch-trigger-bot bot pushed a commit that referenced this pull request Sep 28, 2023
Currently, Opensearch returns a 200 OK response code for a Bulk API
call, even though there can be partial/complete failures within the
request E2E. This provides doc level stats with respect to the rest status code as 2xx, 4xx, 5xx etc.

Signed-off-by: Rohit Ashiwal <rashiwal@amazon.com>
(cherry picked from commit d656e3d)
Signed-off-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>
shwetathareja pushed a commit that referenced this pull request Sep 29, 2023
* Indexing: add Doc Status Counter (#8716)

Currently, Opensearch returns a 200 OK response code for a Bulk API
call, even though there can be partial/complete failures within the
request E2E. This provides doc level stats with respect to the rest status code as 2xx, 4xx, 5xx etc.

Signed-off-by: Rohit Ashiwal <rashiwal@amazon.com>
(cherry picked from commit d656e3d)
Signed-off-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>
r1walz added a commit to r1walz/OpenSearch that referenced this pull request Sep 29, 2023
[Backport 2.x] Indexing: add Doc status counter (opensearch-project#10267)

* Indexing: add Doc Status Counter (opensearch-project#8716)

Currently, Opensearch returns a 200 OK response code for a Bulk API
call, even though there can be partial/complete failures within the
request E2E. This provides doc level stats with respect to the rest status code as 2xx, 4xx, 5xx etc.

Signed-off-by: Rohit Ashiwal <rashiwal@amazon.com>
(cherry picked from commit d656e3d)
Signed-off-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>
(cherry picked from commit 94173e3)
Signed-off-by: Rohit Ashiwal <rashiwal@amazon.com>
msfroh pushed a commit that referenced this pull request Sep 30, 2023
* [Search latency - Coordinator] Changing version check to 2.11

Signed-off-by: Sagar Upadhyaya <sagar.upadhyaya.121@gmail.com>

* [Port main] update version check as per v2.11.0

[Backport 2.x] Indexing: add Doc status counter (#10267)

* Indexing: add Doc Status Counter (#8716)

Currently, Opensearch returns a 200 OK response code for a Bulk API
call, even though there can be partial/complete failures within the
request E2E. This provides doc level stats with respect to the rest status code as 2xx, 4xx, 5xx etc.

Signed-off-by: Rohit Ashiwal <rashiwal@amazon.com>
(cherry picked from commit d656e3d)
Signed-off-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>
(cherry picked from commit 94173e3)
Signed-off-by: Rohit Ashiwal <rashiwal@amazon.com>
Signed-off-by: Sagar Upadhyaya <sagar.upadhyaya.121@gmail.com>

---------

Signed-off-by: Sagar Upadhyaya <sagar.upadhyaya.121@gmail.com>
Signed-off-by: Rohit Ashiwal <rashiwal@amazon.com>
Signed-off-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>
Labels

  • backport 2.x: Backport to 2.x branch
  • Clients: Clients within the Core repository such as High level Rest client and low level client
  • distributed framework
  • enhancement: Enhancement or improvement to existing feature or request
Development

Successfully merging this pull request may close these issues.

Add Bulk Item Failure Count
5 participants