Update API specifications for k-NN 2.17 changes #588

Merged 7 commits on Sep 24, 2024
5 changes: 3 additions & 2 deletions .github/workflows/test-spec.yml
@@ -37,11 +37,12 @@ jobs:
           - version: 2.16.0
             tests: snapshot
           - version: 2.17.0
+          - version: 2.18.0
             hub: opensearchstaging
-            ref: '@sha256:1273489ebbedcb470ea13563dae4c6dc6b2ed431e87e686ed931ae0733034b25'
+            ref: '@sha256:4445e195c53992038891519dc3be0d273cdaad1b047943d68921168ed243e7e9'
           - version: 3.0.0
             hub: opensearchstaging
-            ref: '@sha256:06af2ba4037f8423dc1a4ed3cd29108a1912774e7c659e73f0fac09e1bb2b63d'
+            ref: '@sha256:cf07c0ffa7d9e8a3e7fdb58041caae3bb81f1123260431b99d0ebf4a52c3d9a3'

     name: test-opensearch-spec (version=${{ matrix.entry.version }}, hub=${{ matrix.entry.hub || 'opensearchproject' }}, tests=${{ matrix.entry.tests || 'default' }})
     runs-on: ubuntu-latest
1 change: 1 addition & 0 deletions CHANGELOG.md
@@ -158,6 +158,7 @@ Inspired from [Keep a Changelog](https://keepachangelog.com/en/1.0.0/)
 - Fixed tasks namespace schemas ([#520](https://github.com/opensearch-project/opensearch-api-specification/pull/520))
 - Fixed `/_plugins/_transform/_preview` ([#568](https://github.com/opensearch-project/opensearch-api-specification/pull/568))
 - Fixed create/delete/index operation in `_bulk` ([#582](https://github.com/opensearch-project/opensearch-api-specification/pull/582))
+- Add `mode` and `compression` to k-NN index creation and search, and add `rescore` and `oversample_factor` to k-NN search ([#588](https://github.com/opensearch-project/opensearch-api-specification/pull/588))

 ### Security

18 changes: 16 additions & 2 deletions spec/namespaces/knn.yaml
@@ -268,11 +268,16 @@ components:
                format: int32
              description:
                type: string
              mode:
                type: string
              compression_level:
                type: string
              method:
                type: string
              spaceType:
                type: string
            required:
              - dimension
              - method
              - training_field
              - training_index
      required: true
@@ -281,7 +286,16 @@ components:
     knn.get_model@200: {}
     knn.search_models@200: {}
     knn.stats@200: {}
-    knn.train_model@200: {}
+    knn.train_model@200:
+      content:
+        application/json:
+          schema:
+            type: object
+            properties:
+              model_id:
+                type: string
+            required:
+              - model_id
     knn.warmup@200: {}
   parameters:
     knn.delete_model::path.model_id:
8 changes: 8 additions & 0 deletions spec/schemas/_common.mapping.yaml
@@ -1141,6 +1141,14 @@ components:
           properties:
             dimension:
               type: number
+            space_type:
+              type: string
+            data_type:
+              type: string
+            mode:
+              type: string
+            compression_level:
+              type: string
             method:
               $ref: '#/components/schemas/KnnVectorMethod'
           required:
10 changes: 10 additions & 0 deletions spec/schemas/_common.yaml
@@ -656,6 +656,16 @@ components:
         boost:
           description: Boost value to apply to kNN scores
           type: number
+        method_parameters:
+          type: object
+          x-version-added: '2.16'
+          additionalProperties:
+            type: number
+        rescore:
+          type: object
+          x-version-added: '2.17'
+          additionalProperties:
+            type: number
Comment on lines +659 to +668

If I understand this correctly, `additionalProperties` makes each of these a map whose values are always of type number. Is that correct?

If so, my worry is that we may add more parameters going forward, and those parameters might not be numbers. Can we use a type such as object, or anything else that lets us specify numbers or strings as additional parameter values in the future?

@dblock thoughts?

Member Author

If we add them in the future, we can update the `additionalProperties` value to:

anyOf:
  - type: number
  - type: string
  # etc.

Member

@navneet1v you're correct, and @ryanbogan is also correct: if the schema today is a number, leave it as such. It can be extended later, once the value can actually be something else.

Sure, makes sense. If we update this in the future, I am hoping the changes will be backward compatible (BWC). Can you please confirm that, @dblock?

Member

Yes.
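
The backward-compatibility point in this thread can be sketched with a toy check in Python. This is only an illustration under stated assumptions: `value_allowed` and `validate_map` are hypothetical helpers written for this note (not the repository's validation tooling), they handle only `type: number`, `type: string`, and `anyOf`, and the `strategy` key is an invented future parameter. The sketch shows that every map accepted by the number-only schema is still accepted once the schema is widened with `anyOf`.

```python
# Toy model of an `additionalProperties` value schema (hypothetical helpers,
# covering only `type: number`, `type: string`, and `anyOf`).

def value_allowed(value, schema):
    """Return True if `value` satisfies the (very small) schema subset."""
    if "anyOf" in schema:
        return any(value_allowed(value, option) for option in schema["anyOf"])
    if schema["type"] == "number":
        # bool is a subclass of int in Python, so exclude it explicitly.
        return isinstance(value, (int, float)) and not isinstance(value, bool)
    if schema["type"] == "string":
        return isinstance(value, str)
    raise ValueError(f"unsupported type in sketch: {schema['type']}")


def validate_map(params, additional_properties):
    """Every value in the map must satisfy the additionalProperties schema."""
    return all(value_allowed(v, additional_properties) for v in params.values())


# Schema as merged in this PR: values must be numbers.
NUMBER_ONLY = {"type": "number"}
# Possible later widening discussed in the thread.
WIDENED = {"anyOf": [{"type": "number"}, {"type": "string"}]}

today = {"oversample_factor": 10}
# `strategy` is an invented parameter, used only to exercise the string branch.
future = {"oversample_factor": 10, "strategy": "exact"}

for name, payload in (("today", today), ("future", future)):
    print(name, validate_map(payload, NUMBER_ONLY), validate_map(payload, WIDENED))
```

Widening a request schema this way only enlarges the set of accepted payloads, which is why existing requests keep validating.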

       required:
         - vector
     QueryVector:
71 changes: 71 additions & 0 deletions tests/default/_core/search/knn/on_disk.yaml
@@ -0,0 +1,71 @@
$schema: ../../../../../json_schemas/test_story.schema.yaml

description: Test search endpoint with knn query.
version: '>= 2.17'

prologues:
  - method: PUT
    path: /movies
    request:
      payload:
        settings:
          index:
            knn: true
        mappings:
          properties:
            recommendation_vector:
              type: knn_vector
              dimension: 8
              space_type: l2
              data_type: float
              mode: on_disk
              compression_level: 16x
    status: [200]
  - method: POST
    path: /_bulk
    request:
      content_type: application/x-ndjson
      payload:
        - {index: {_index: movies, _id: '1'}}
        - {recommendation_vector: [1.5, 1.5, 1.5, 1.5, 1.5, 1.5, 1.5, 1.5], duration: 12.2}
        - {index: {_index: movies, _id: '2'}}
        - {recommendation_vector: [2.5, 2.5, 2.5, 2.5, 2.5, 2.5, 2.5, 2.5], duration: 7.1}
        - {index: {_index: movies, _id: '3'}}
        - {recommendation_vector: [3.5, 3.5, 3.5, 3.5, 3.5, 3.5, 3.5, 3.5], duration: 12.9}
        - {index: {_index: movies, _id: '4'}}
        - {recommendation_vector: [4.5, 4.5, 4.5, 4.5, 4.5, 4.5, 4.5, 4.5], duration: 1.2}
        - {index: {_index: movies, _id: '5'}}
        - {recommendation_vector: [5.5, 5.5, 5.5, 5.5, 5.5, 5.5, 5.5, 5.5], duration: 3.7}
        - {index: {_index: movies, _id: '6'}}
        - {recommendation_vector: [6.5, 6.5, 6.5, 6.5, 6.5, 6.5, 6.5, 6.5], duration: 10.3}
        - {index: {_index: movies, _id: '7'}}
        - {recommendation_vector: [7.5, 7.5, 7.5, 7.5, 7.5, 7.5, 7.5, 7.5], duration: 5.5}
        - {index: {_index: movies, _id: '8'}}
        - {recommendation_vector: [8.5, 8.5, 8.5, 8.5, 8.5, 8.5, 8.5, 8.5], duration: 4.4}
        - {index: {_index: movies, _id: '9'}}
        - {recommendation_vector: [9.5, 9.5, 9.5, 9.5, 9.5, 9.5, 9.5, 9.5], duration: 8.9}
    status: [200]
epilogues:
  - path: /movies
    method: DELETE
    status: [200, 404]

chapters:
  - synopsis: Test k-NN disk-based search.
    method: POST
    path: /{index}/_search
    parameters:
      index: movies
    request:
      payload:
        query:
          knn:
            recommendation_vector:
              vector: [1.5, 2.5, 3.5, 4.5, 5.5, 6.5, 7.5, 8.5]
              k: 5
              method_parameters:
                ef_search: 512
              rescore:
                oversample_factor: 10
    response:
      status: 200
@@ -1,4 +1,4 @@
-$schema: ../../../../json_schemas/test_story.schema.yaml
+$schema: ../../../../../json_schemas/test_story.schema.yaml

 description: Test search endpoint with knn query.
 version: '>= 1.2'
73 changes: 73 additions & 0 deletions tests/default/knn/train_model.yaml
@@ -0,0 +1,73 @@
$schema: ../../../json_schemas/test_story.schema.yaml

description: Test training k-NN model with disk-based parameters.
version: '>= 2.17'

prologues:
  - method: PUT
    path: /movies
    request:
      payload:
        settings:
          index:
            knn: true
        mappings:
          properties:
            recommendation_vector:
              type: knn_vector
              dimension: 8
    status: [200]
  - method: POST
    path: /_bulk
    request:
      content_type: application/x-ndjson
      payload:
        - {index: {_index: movies, _id: '1'}}
        - {recommendation_vector: [1.5, 1.5, 1.5, 1.5, 1.5, 1.5, 1.5, 1.5], duration: 12.2}
        - {index: {_index: movies, _id: '2'}}
        - {recommendation_vector: [2.5, 2.5, 2.5, 2.5, 2.5, 2.5, 2.5, 2.5], duration: 7.1}
        - {index: {_index: movies, _id: '3'}}
        - {recommendation_vector: [3.5, 3.5, 3.5, 3.5, 3.5, 3.5, 3.5, 3.5], duration: 12.9}
        - {index: {_index: movies, _id: '4'}}
        - {recommendation_vector: [4.5, 4.5, 4.5, 4.5, 4.5, 4.5, 4.5, 4.5], duration: 1.2}
        - {index: {_index: movies, _id: '5'}}
        - {recommendation_vector: [5.5, 5.5, 5.5, 5.5, 5.5, 5.5, 5.5, 5.5], duration: 3.7}
        - {index: {_index: movies, _id: '6'}}
        - {recommendation_vector: [6.5, 6.5, 6.5, 6.5, 6.5, 6.5, 6.5, 6.5], duration: 10.3}
        - {index: {_index: movies, _id: '7'}}
        - {recommendation_vector: [7.5, 7.5, 7.5, 7.5, 7.5, 7.5, 7.5, 7.5], duration: 5.5}
        - {index: {_index: movies, _id: '8'}}
        - {recommendation_vector: [8.5, 8.5, 8.5, 8.5, 8.5, 8.5, 8.5, 8.5], duration: 4.4}
        - {index: {_index: movies, _id: '9'}}
        - {recommendation_vector: [9.5, 9.5, 9.5, 9.5, 9.5, 9.5, 9.5, 9.5], duration: 8.9}
    status: [200]
epilogues:
  - path: /movies
    method: DELETE
    status: [200, 404]
  - path: /_plugins/_knn/models/{model_id}
    parameters:
      model_id: ${train_model.test_model_id}
    method: DELETE
    status: [200, 404]

chapters:
  - synopsis: Test training a model with disk-based parameters.
    id: train_model
    method: POST
    path: /_plugins/_knn/models/_train
    request:
      payload:
        training_index: movies
        training_field: recommendation_vector
        dimension: 8
        max_training_vector_count: 1200
        search_size: 100
        description: Test model
        mode: on_disk
        compression_level: 32x
        spaceType: l2
    response:
      status: 200
    output:
      test_model_id: payload.model_id