Add where clause support for covering index #85

Merged


dai-chen
Collaborator

@dai-chen dai-chen commented Oct 19, 2023

Description

  1. Support a WHERE clause in the CREATE INDEX statement for partial indexing. Only source data that matches the given filtering condition will be processed.
  2. Add an empty implementation for the create skipping index statement. Filtering condition support may affect skipping index query rewrite (covering index doesn't have this feature), so it needs more time. The grammar is added now so the PPL plugin can include it first.

Example

spark-sql> CREATE INDEX client_status_idx ON ds_tables.http_logs
         > (clientip, status)
         > WHERE status != 200
         > WITH (
         >   auto_refresh = true
         > );
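
As a rough illustration of the partial indexing semantics described above (not the actual Flint implementation), the effect of the WHERE clause can be sketched in plain Python: rows are filtered by the condition first, then projected to the indexed columns. The helper name `build_partial_covering_index`, the `size` column, and the `10.0.0.1` row are hypothetical and exist only for this sketch.

```python
# Hypothetical sample rows standing in for ds_tables.http_logs.
# The "size" column and the 10.0.0.1 row are made up for illustration.
http_logs = [
    {"clientip": "107.8.3.0", "status": 304, "size": 0},
    {"clientip": "10.0.0.1", "status": 200, "size": 512},
    {"clientip": "250.126.5.0", "status": 304, "size": 0},
]


def build_partial_covering_index(rows, indexed_columns, predicate):
    """Keep only rows matching the predicate, projected to the indexed columns."""
    return [
        {col: row[col] for col in indexed_columns}
        for row in rows
        if predicate(row)
    ]


# Mirrors: CREATE INDEX ... (clientip, status) WHERE status != 200
index_docs = build_partial_covering_index(
    http_logs,
    indexed_columns=["clientip", "status"],
    predicate=lambda row: row["status"] != 200,
)

for doc in index_docs:
    print(doc)
```

Under these assumptions, the row with status 200 never reaches the index, which matches the search hits shown further down where every document has a non-200 status.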

GET flint_myglue_ds_tables_http_logs_client_status_idx_index/_mapping
{
  "flint_myglue_ds_tables_http_logs_client_status_idx_index": {
    "mappings": {
      "_meta": {
        "kind": "covering",
        "indexedColumns": [
          {
            "columnType": "string",
            "columnName": "clientip"
          },
          {
            "columnType": "int",
            "columnName": "status"
          }
        ],
        "name": "client_status_idx",
        "options": {
          "auto_refresh": "true"
        },
        "source": "myglue.ds_tables.http_logs",
        "version": "0.1.0",
        "properties": {
          "filterCondition": "status != 200"
        }
      },
      "properties": {
        "clientip": {
          "type": "keyword"
        },
        "status": {
          "type": "integer"
        }
      }
    }
  }
}
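
The mapping above stores the filter under `_meta.properties.filterCondition`. A minimal sketch of reading it back from such a response is shown below, assuming the response has already been fetched as JSON text; the helper name `filter_condition` is hypothetical, and the embedded document is a trimmed copy of the mapping above.

```python
import json

# Trimmed copy of the mapping response shown above.
mapping_response = json.loads("""
{
  "flint_myglue_ds_tables_http_logs_client_status_idx_index": {
    "mappings": {
      "_meta": {
        "kind": "covering",
        "name": "client_status_idx",
        "source": "myglue.ds_tables.http_logs",
        "properties": {
          "filterCondition": "status != 200"
        }
      }
    }
  }
}
""")


def filter_condition(mapping, index_name):
    """Return the stored filter condition, or None if the index has none."""
    meta = mapping[index_name]["mappings"]["_meta"]
    return meta.get("properties", {}).get("filterCondition")


index_name = "flint_myglue_ds_tables_http_logs_client_status_idx_index"
print(filter_condition(mapping_response, index_name))
```

Returning `None` when the key is absent is a design choice of this sketch, so that an index created without a WHERE clause can be handled by the same code path.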

GET flint_myglue_ds_tables_http_logs_client_status_idx_index/_search
{
    ......
    "hits": [
      {
        "_index": "flint_myglue_ds_tables_http_logs_client_status_idx_index",
        "_id": "kyUjaIsBWdGHpYUCCTv-",
        "_score": 1,
        "_source": {
          "clientip": "107.8.3.0",
          "status": 304
        }
      },
      {
        "_index": "flint_myglue_ds_tables_http_logs_client_status_idx_index",
        "_id": "liUjaIsBWdGHpYUCCTv-",
        "_score": 1,
        "_source": {
          "clientip": "250.126.5.0",
          "status": 304
        }
      },

Issues Resolved

#23, #89

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.
For more information on following Developer Certificate of Origin and signing off your commits, please check here.

Signed-off-by: Chen Dai <daichen@amazon.com>
Signed-off-by: Chen Dai <daichen@amazon.com>
@dai-chen dai-chen added the enhancement New feature or request label Oct 19, 2023
@dai-chen dai-chen self-assigned this Oct 19, 2023
@dai-chen dai-chen changed the title Add where clause syntax for partial indexing support Add where clause support for covering index Oct 24, 2023
Signed-off-by: Chen Dai <daichen@amazon.com>
@dai-chen dai-chen marked this pull request as ready for review October 25, 2023 16:16
@dai-chen dai-chen merged commit 1f8dda9 into opensearch-project:main Oct 25, 2023
4 checks passed
@dai-chen dai-chen deleted the add-filtering-condition-empty-syntax branch October 25, 2023 17:56