Query Insights API spec #625

Merged: 4 commits, Oct 23, 2024
Changes from 3 commits
1 change: 1 addition & 0 deletions CHANGELOG.md
@@ -103,6 +103,7 @@ Inspired from [Keep a Changelog](https://keepachangelog.com/en/1.0.0/)
- Added ability to pass `InlineScript` as a simple string ([#605](https://github.com/opensearch-project/opensearch-api-specification/pull/605))
- Added `config_id` and `config_id_list` to `/_plugins/_notifications/configs` query parameters ([#594](https://github.com/opensearch-project/opensearch-api-specification/pull/594))
- Added a release workflow triggered on a tag ([#635](https://github.com/opensearch-project/opensearch-api-specification/pull/635))
- Added API spec for query insights plugin ([#625](https://github.com/opensearch-project/opensearch-api-specification/pull/625))

### Changed

34 changes: 34 additions & 0 deletions spec/namespaces/insights.yaml
@@ -0,0 +1,34 @@
openapi: 3.1.0
info:
  title: Query Insights API
  description: API to retrieve top queries based on latency, CPU, or memory usage.
  version: 1.0.0
paths:
  /_insights/top_queries:
    get:
      operationId: insights.top_queries.0
      x-operation-group: insights.top_queries
      x-version-added: '1.0'
      description: Retrieves the top queries based on the given metric type (latency, CPU, or memory).
      parameters:
        - $ref: '#/components/parameters/insights.top_queries::query.type'
      responses:
        '200':
          $ref: '#/components/responses/insights.top_queries@200'

components:
  parameters:
    insights.top_queries::query.type:
      name: type
      in: query
      required: true
      description: Get top n queries by a specific metric.
      schema:
        type: string
        enum: [cpu, latency, memory]
  responses:
    insights.top_queries@200:
      content:
        application/json:
          schema:
            $ref: '../schemas/insights._common.yaml#/components/schemas/TopQueriesResponse'
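
As an illustration of the parameter defined above, here is a minimal Python sketch that builds a request URL for `/_insights/top_queries` and rejects any `type` value outside the spec's enum. The function name `build_top_queries_url` and the base URL are hypothetical and not part of this PR; only the path, the `type` parameter, and its allowed values come from the spec.

```python
from urllib.parse import urlencode

# Allowed values copied from the `enum` of insights.top_queries::query.type.
ALLOWED_TYPES = ("cpu", "latency", "memory")


def build_top_queries_url(base_url: str, metric_type: str) -> str:
    """Return the GET URL for the top queries endpoint.

    Hypothetical helper: it mirrors the spec's enum check on the client
    side so an invalid metric is caught before any request is sent.
    """
    if metric_type not in ALLOWED_TYPES:
        raise ValueError(
            f"type must be one of {ALLOWED_TYPES}, got {metric_type!r}"
        )
    query = urlencode({"type": metric_type})
    return f"{base_url.rstrip('/')}/_insights/top_queries?{query}"
```

A call such as `build_top_queries_url("http://localhost:9200", "latency")` could then be passed to any HTTP client; the host and port here match the compose file in this PR but are otherwise an assumption.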
224 changes: 224 additions & 0 deletions spec/schemas/insights._common.yaml
@@ -0,0 +1,224 @@
openapi: 3.1.0
info:
  title: Schemas of query insights
  description: Schemas of query insights
  version: 1.0.0
paths: {}
components:
  schemas:
    TopQueriesResponse:
      type: object
      properties:
        top_queries:
          type: array
          items:
            type: object
            $ref: '#/components/schemas/TopQuery'
      required:
        - top_queries
    TopQuery:
      type: object
      properties:
        timestamp:
          type: integer
          description: The timestamp of the query execution.
        total_shards:
          type: integer
          description: The total number of shards involved in the query.
        task_resource_usages:
          type: array
          items:
            type: object
            $ref: '#/components/schemas/TaskResourceUsages'
        labels:
          type: object
          description: Additional labels for the query.
        search_type:
          type: string
          description: The search query type (e.g., query_then_fetch).
        source:
          type: object
          $ref: '#/components/schemas/Source'
        node_id:
          type: string
          description: The node ID associated with the query.
        indices:
          type: array
          items:
            type: string
          description: The indices involved in the query.
        phase_latency_map:
          type: object
        measurements:
          type: object
          $ref: '#/components/schemas/Measurements'
    TaskResourceUsages:
      type: object
      properties:
        action:
          type: string
          description: The action type of the task.
        taskId:
          type: integer
          description: The task ID.
        parentTaskId:
          type: integer
          description: The parent task ID.
        nodeId:
          type: string
          description: The node ID where the task was executed.
        taskResourceUsage:
          type: object
          $ref: '#/components/schemas/TaskResourceUsage'
    TaskResourceUsage:
      type: object
      properties:
        cpu_time_in_nanos:
          type: integer
          description: The CPU time used in nanoseconds.
        memory_in_bytes:
          type: integer
          description: The memory usage in bytes.
    Source:
      type: object
      properties:
        aggregations:
          description: Defines the aggregations that are run as part of the search request.
          type: object
        collapse:
          $ref: '_core.search.yaml#/components/schemas/FieldCollapse'
        explain:
          description: If true, returns detailed information about score computation as part of a hit.
          type: boolean
        ext:
          description: Configuration of search extensions defined by OpenSearch plugins.
          type: object
          additionalProperties:
            type: object
        from:
          description: |-
            Starting document offset.
            Needs to be non-negative.
            By default, you cannot page through more than 10,000 hits using the `from` and `size` parameters.
            To page through more hits, use the `search_after` parameter.
          type: number
        highlight:
          $ref: '_core.search.yaml#/components/schemas/Highlight'
        track_total_hits:
          $ref: '_core.search.yaml#/components/schemas/TrackHits'
        indices_boost:
          description: Boosts the _score of documents from specified indices.
          type: array
          items:
            type: object
            additionalProperties:
              type: number
        docvalue_fields:
          description: |-
            Array of wildcard (`*`) patterns.
            The request returns doc values for field names matching these patterns in the `hits.fields` property of the response.
          type: array
          items:
            $ref: '_common.query_dsl.yaml#/components/schemas/FieldAndFormat'
        min_score:
          description: |-
            Minimum `_score` for matching documents.
            Documents with a lower `_score` are not included in the search results.
          type: number
        post_filter:
          $ref: '_common.query_dsl.yaml#/components/schemas/QueryContainer'
        profile:
          description: |-
            Set to `true` to return detailed timing information about the execution of individual components in a search request.
            NOTE: This is a debugging tool and adds significant overhead to search execution.
          type: boolean
        query:
          $ref: '_common.query_dsl.yaml#/components/schemas/QueryContainer'
        script_fields:
          description: Retrieve a script evaluation (based on different fields) for each hit.
          type: object
          additionalProperties:
            $ref: '_common.yaml#/components/schemas/ScriptField'
        search_after:
          $ref: '_common.yaml#/components/schemas/SortResults'
        size:
          description: |-
            The number of hits to return.
            By default, you cannot page through more than 10,000 hits using the `from` and `size` parameters.
            To page through more hits, use the `search_after` parameter.
          type: number
        slice:
          $ref: '_common.yaml#/components/schemas/SlicedScroll'
        sort:
          $ref: '_common.yaml#/components/schemas/Sort'
        _source:
          $ref: '_core.search.yaml#/components/schemas/SourceConfig'
        fields:
          description: |-
            Array of wildcard (`*`) patterns.
            The request returns values for field names matching these patterns in the `hits.fields` property of the response.
          type: array
          items:
            $ref: '_common.query_dsl.yaml#/components/schemas/FieldAndFormat'
        suggest:
          $ref: '_core.search.yaml#/components/schemas/Suggester'
        terminate_after:
          description: |-
            Maximum number of documents to collect for each shard.
            If a query reaches this limit, OpenSearch terminates the query early.
            OpenSearch collects documents before sorting.
            Use with caution.
            OpenSearch applies this parameter to each shard handling the request.
            When possible, let OpenSearch perform early termination automatically.
            Avoid specifying this parameter for requests that target data streams with backing indices across multiple data tiers.
            If set to `0` (default), the query does not terminate early.
          type: integer
          format: int32
        timeout:
          description: |-
            Specifies the period of time to wait for a response from each shard.
            If no response is received before the timeout expires, the request fails and returns an error.
            Defaults to no timeout.
          type: string
        track_scores:
          description: If true, calculate and return document scores, even if the scores are not used for sorting.
          type: boolean
        version:
          description: If true, returns document version as part of a hit.
          type: boolean
        seq_no_primary_term:
          description: If `true`, returns sequence number and primary term of the last modification of each hit.
          type: boolean
        stored_fields:
          $ref: '_common.yaml#/components/schemas/Fields'
        pit:
          $ref: '_core.search.yaml#/components/schemas/PointInTimeReference'
        stats:
          description: |-
            Stats groups to associate with the search.
            Each group maintains a statistics aggregation for its associated searches.
            You can retrieve these stats using the indices stats API.
          type: array
          items:
            type: string
    Measurement:
      type: object
      properties:
        number:
          type: integer
        count:
          type: integer
        aggregationType:
          type: string
    Measurements:
      type: object
      properties:
        latency:
          type: object
          $ref: '#/components/schemas/Measurement'
        cpu:
          type: object
          $ref: '#/components/schemas/Measurement'
        memory:
          type: object
          $ref: '#/components/schemas/Measurement'
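
To show how a client might read a payload shaped like `TopQueriesResponse`, here is a small Python sketch. The helper name `summarize_top_queries` is hypothetical. It assumes only what the schema states: `top_queries` is required, while every field on `TopQuery`, `Measurements`, and `Measurement` is optional and may be absent.

```python
from typing import Any


def summarize_top_queries(
    response: dict[str, Any], metric: str = "latency"
) -> list[dict[str, Any]]:
    """Extract node, indices, and one measurement value per top query.

    Hypothetical helper: fields other than `top_queries` are read with
    defaults because the schema does not mark them as required.
    """
    if "top_queries" not in response:
        raise KeyError("top_queries is required by TopQueriesResponse")

    rows = []
    for query in response["top_queries"]:
        # Measurements maps a metric name to a Measurement object.
        measurement = query.get("measurements", {}).get(metric, {})
        rows.append(
            {
                "node_id": query.get("node_id"),
                "indices": query.get("indices", []),
                "value": measurement.get("number"),
            }
        )
    return rows
```

Reading with defaults keeps the helper tolerant of records that omit `measurements` or `indices`, which the schema permits.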
15 changes: 15 additions & 0 deletions tests/plugins/query_insights/docker-compose.yml
@@ -0,0 +1,15 @@
version: '3'

services:
  opensearch-cluster:
    image: ${OPENSEARCH_DOCKER_HUB_PROJECT:-opensearchproject}/opensearch:${OPENSEARCH_VERSION:-latest}${OPENSEARCH_DOCKER_REF}
    ports:
      - 9200:9200
      - 9600:9600
    environment:
      - OPENSEARCH_INITIAL_ADMIN_PASSWORD=${OPENSEARCH_PASSWORD:-myStrongPassword123!}
      - OPENSEARCH_JAVA_OPTS=${OPENSEARCH_JAVA_OPTS}
      - discovery.type=single-node
      - search.insights.top_queries.cpu.enabled=true
      - search.insights.top_queries.latency.enabled=true
      - search.insights.top_queries.memory.enabled=true
99 changes: 99 additions & 0 deletions tests/plugins/query_insights/insights/top_queries.yaml
@@ -0,0 +1,99 @@
$schema: ../../../../json_schemas/test_story.schema.yaml

description: Test top n queries API in the Query Insights plugin.
version: '>= 2.15'
prologues:
  - path: /movies
    method: PUT
    request:
      payload:
        mappings:
          properties:
            director:
              type: text
              fielddata: true
              fields:
                raw:
                  type: keyword
  - path: /_bulk
    method: POST
    parameters:
      refresh: true
    request:
      content_type: application/x-ndjson
      payload:
        - {create: {_index: movies}}
        - {director: Bennett Miller, title: Moneyball}
        - {create: {_index: movies}}
        - {director: Bennett Miller, title: The Cruise}
        - {create: {_index: movies}}
        - {director: Nicolas Winding Refn, title: Drive}

  - path: /{index}/_search
    parameters:
      index: movies
    method: GET
    request:
      payload:
        size: 0
        aggregations:
          directors:
            terms:
              field: director.raw

chapters:
  - synopsis: Retrieve default top queries.
    path: /_insights/top_queries
    retry:
      count: 2
      wait: 5000
    method: GET
    response:
      status: 200
      content_type: application/json
      payload:
        top_queries: []
  - synopsis: Retrieve top queries by latency.
    path: /_insights/top_queries
    retry:
      count: 2
      wait: 5000
    parameters:
      type: latency
    method: GET
    response:
      status: 200
      content_type: application/json
      payload:
        top_queries: []
  - synopsis: Retrieve top queries by cpu usage.
    path: /_insights/top_queries
    retry:
      count: 2
      wait: 5000
    parameters:
      type: cpu
    method: GET
    response:
      status: 200
      content_type: application/json
      payload:
        top_queries: []
  - synopsis: Retrieve top queries by memory usage.
    path: /_insights/top_queries
    retry:
      count: 2
      wait: 5000
    parameters:
      type: memory
    method: GET
    response:
      status: 200
      content_type: application/json
      payload:
        top_queries: []

epilogues:
  - path: /movies
    method: DELETE
    status: [200, 404]
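
Each chapter above carries a `retry` block with `count: 2` and `wait: 5000`. The Python sketch below shows one plausible reading of those settings; it is not the test runner's actual implementation. It assumes that `count` is the number of additional attempts after the first failure and that `wait` is in milliseconds. The name `call_with_retry` and the injectable `sleep` argument are hypothetical.

```python
import time
from typing import Callable, TypeVar

T = TypeVar("T")


def call_with_retry(
    attempt: Callable[[], T],
    count: int = 2,
    wait_ms: int = 5000,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Run `attempt`, retrying up to `count` more times on any exception.

    Assumption: `count` counts retries (not total attempts) and `wait_ms`
    is the pause between attempts in milliseconds.
    """
    for retries_left in range(count, -1, -1):
        try:
            return attempt()
        except Exception:
            if retries_left == 0:
                # Out of retries: surface the last failure to the caller.
                raise
            sleep(wait_ms / 1000)
    raise ValueError("count must be non-negative")
```

Passing `sleep` as a parameter lets a caller substitute a no-op in unit tests instead of waiting the full interval.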