Fix catalog name missing in Flint index name #48

@dai-chen dai-chen commented Sep 27, 2023

Description

  1. Add indexName and tableName rules to the Flint grammar for the convenience of the PPL plugin.
  2. Use Spark utility methods to parse and qualify the table identifier.
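
To make "qualify" concrete, here is a minimal Python sketch of expanding a partial table name into a full catalog.database.table name. It is an illustration only: the PR itself relies on Spark utility methods, which are not shown here. The helper name qualify_table_name and its default arguments are hypothetical, and the naive split on "." does not handle backtick-quoted identifiers.

```python
def qualify_table_name(name, current_catalog="spark_catalog", current_database="default"):
    """Hypothetical helper: expand a 1-, 2-, or 3-part table name
    into catalog.database.table. Not the actual Flint/Spark code."""
    # Naive split: assumes no backtick-quoted parts containing dots.
    parts = name.split(".")
    if len(parts) == 1:
        # Only the table was given: prepend catalog and database.
        parts = [current_catalog, current_database] + parts
    elif len(parts) == 2:
        # database.table was given: prepend the current catalog.
        parts = [current_catalog] + parts
    elif len(parts) != 3:
        raise ValueError(f"Unsupported table name: {name!r}")
    return ".".join(parts)


if __name__ == "__main__":
    # Default catalog, as in the first example below.
    print(qualify_table_name("stream.lineitem_tiny"))
    # Custom default catalog, as set via spark.sql.defaultCatalog=myglue.
    print(qualify_table_name("stream.lineitem_tiny", current_catalog="myglue"))
```

An already fully qualified name such as myglue.stream.lineitem_tiny passes through unchanged, which matches the DROP INDEX statement in the custom catalog example.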

TODO

  1. [BUG] Find correct catalog name in query rewriter #54
  2. Try to add an integration test (IT) with FlintDelegatingSessionCatalog in the Spark application module

Examples:

Default Catalog Name (spark_catalog)

CREATE INDEX test ON stream.lineitem_tiny (l_shipdate);

GET flint_spark_catalog_stream_lineitem_tiny_test_index/_mapping
{
  "flint_spark_catalog_stream_lineitem_tiny_test_index": {
    "mappings": {
      "_meta": {
        "name": "test",
        "options": {},
        "source": "spark_catalog.stream.lineitem_tiny",
        "kind": "covering",
        "indexedColumns": [
          {
            "columnType": "date",
            "columnName": "l_shipdate"
          }
        ]
      },
      "properties": {
        "l_shipdate": {
          "type": "date",
          "format": "strict_date"
        }
      }
    }
  }
}

Custom Catalog Name (myglue)

spark-sql ... \
  --conf spark.sql.defaultCatalog=myglue \
  --conf spark.sql.catalog.myglue=org.opensearch.sql.FlintDelegatingSessionCatalog

spark-sql> CREATE INDEX test ON stream.lineitem_tiny (l_shipdate) WITH (auto_refresh=true);

GET flint_myglue_stream_lineitem_tiny_test_index/_search
{
    ...
    "hits": [
      {
        "_index": "flint_myglue_stream_lineitem_tiny_test_index",
        "_id": "Bx4R44oBWdGHpYUCRVK4",
        "_score": 1,
        "_source": {
          "l_shipdate": "1997-08-16"
        }
      },
      {
        "_index": "flint_myglue_stream_lineitem_tiny_test_index",
        "_id": "CR4R44oBWdGHpYUCRVK4",
        "_score": 1,
        "_source": {
          "l_shipdate": "1997-08-16"
        }
      },

spark-sql> DROP INDEX test ON myglue.stream.lineitem_tiny;
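
The naming pattern visible in the two examples above can be sketched in Python as follows. The pattern is inferred only from the index names shown in this PR (flint_ prefix, qualified table name with dots replaced by underscores, index name, _index suffix); the function flint_index_name is a hypothetical illustration, not the Flint implementation, which may apply further rules not visible here.

```python
def flint_index_name(index_name, qualified_table):
    """Hypothetical helper: build the OpenSearch index name the way the
    examples in this PR suggest. Inferred from the examples only."""
    # Dots in catalog.database.table become underscores.
    flattened = qualified_table.replace(".", "_")
    return f"flint_{flattened}_{index_name}_index"


if __name__ == "__main__":
    # Default catalog example.
    print(flint_index_name("test", "spark_catalog.stream.lineitem_tiny"))
    # Custom catalog example.
    print(flint_index_name("test", "myglue.stream.lineitem_tiny"))
```

With the catalog included, the same stream.lineitem_tiny table under two different catalogs maps to two distinct index names, which is the collision this PR is meant to avoid.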

Issues Resolved

#43

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.
For more information on following Developer Certificate of Origin and signing off your commits, please check here.

Signed-off-by: Chen Dai <daichen@amazon.com>
@dai-chen dai-chen added the bug Something isn't working label Sep 27, 2023
@dai-chen dai-chen self-assigned this Sep 27, 2023
@dai-chen dai-chen merged commit e3210f0 into opensearch-project:main Oct 2, 2023
4 checks passed
@dai-chen dai-chen deleted the encode-catalog-name-in-flint-index-name branch October 2, 2023 19:48