[FEATURE] Add new data type for text #1038
Comments
Does the OpenSearch storage engine support the TEXT data type? Do you propose adding a TEXT data type to the core engine?
Yes and Yes.
Aggregating on text is disabled by default because of performance implications. The recommended approach is to keep a copy of the raw string. JDBC has @kylepbit, is there another use case besides aggregation that did not work as expected?
Another problem I just ran into, caused by the same issue: SELECT COUNT(*) FROM account WHERE address LIKE '% Street'; returns 0. See sql/integ-test/src/test/resources/indexDefinitions/account_index_mapping.json, lines 8 to 11 in b56edc7.
Update: SELECT address, address LIKE '% Street' FROM account; works and returns valid results. This is very confusing.
The crux of the problem is that we expose OpenSearch text fields to BI tools as VARCHARs, which makes them indistinguishable from keyword fields, because those are also exposed as VARCHARs. But the two field types behave differently and produce different results.
The It needs to be well documented which cases are flexible (and convert automatically) and which cases expect the function to work with a specific field type (see: #1032).
Answering my own question posted in #1038 (comment): SELECT count(*) FROM account WHERE address LIKE '%Street'; Nothing there related to
A full explanation is available on hivemind: https://stackoverflow.com/a/72084568.
RFC for
JDBC type | ExprCoreType | OpenSearchDataType | OpenSearch type
---|---|---|---
VARCHAR/CHAR | STRING | -- | keyword
LONGVARCHAR/TEXT | STRING | OpenSearchTextType | text
Is your feature request related to a problem?
The SQL plugin doesn't distinguish between the text and keyword data types. OpenSearch supports aggregation on keyword fields and on text fields with fielddata and/or fields.
It is possible to aggregate on keyword or text (conditions apply).
But it is impossible to aggregate on general text.
Existing mapping
JDBC type | ExprCoreType | OpenSearchDataType | OpenSearch type
---|---|---|---
VARCHAR | STRING | OPENSEARCH_TEXT_KEYWORD | keyword
VARCHAR | STRING | OPENSEARCH_TEXT | text
See OpenSearch mapping samples available for aggregation:
sql/integ-test/src/test/resources/correctness/opensearch_dashboards_sample_data_flights.json
Lines 25 to 27 in b56edc7
sql/integ-test/src/test/resources/correctness/opensearch_dashboards_sample_data_flights.json
Lines 61 to 69 in b56edc7
sql/integ-test/src/test/resources/indexDefinitions/account_index_mapping.json
Lines 12 to 21 in b56edc7
Not available for aggregation:
sql/integ-test/src/test/resources/indexDefinitions/bank_with_null_values_index_mapping.json
Lines 16 to 18 in b56edc7
What solution would you like?
Have 2 different data types which are mapped to different JDBC/ODBC types.
JDBC type | ExprCoreType | OpenSearchDataType | OpenSearch type
---|---|---|---
VARCHAR/CHAR | STRING | OPENSEARCH_KEYWORD | keyword; text with fielddata; text with fields
LONGVARCHAR/TEXT | TEXT | OPENSEARCH_TEXT | text without fielddata and fields
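To make the proposed split concrete, here is a minimal Java sketch of the decision the table describes. It is only an illustration under assumptions: the class name TextTypeMappingSketch, the method proposedJdbcType, and its boolean parameters are hypothetical and do not exist in the plugin. Only the java.sql.Types constants are real.

```java
import java.sql.Types;

/** Hypothetical illustration of the proposed mapping; not the plugin's actual code. */
final class TextTypeMappingSketch {

    /**
     * Picks the JDBC type proposed in this issue for an OpenSearch string field.
     *
     * @param openSearchType "keyword" or "text", as written in the index mapping
     * @param fielddata      true if the text field has fielddata enabled
     * @param hasSubfields   true if the text field declares "fields" (multi-fields)
     * @return a java.sql.Types constant
     */
    static int proposedJdbcType(String openSearchType, boolean fielddata, boolean hasSubfields) {
        switch (openSearchType) {
            case "keyword":
                // Keywords can always be aggregated on.
                return Types.VARCHAR;
            case "text":
                // Text is treated like a keyword only when fielddata or fields make
                // aggregation possible; otherwise it is reported as a distinct type.
                return (fielddata || hasSubfields) ? Types.VARCHAR : Types.LONGVARCHAR;
            default:
                throw new IllegalArgumentException("Not a string type: " + openSearchType);
        }
    }

    public static void main(String[] args) {
        // Plain text without fielddata or fields is the only case that changes.
        System.out.println(proposedJdbcType("text", false, false) == Types.LONGVARCHAR);
    }
}
```

With a split like this, a BI tool reading the JDBC metadata could tell from the column type alone whether a field is safe to aggregate on.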
What alternatives have you considered?
N/A
Do you have any additional context?
Opened on behalf of @kylepbit
Ref:
sql/sql-jdbc/src/main/java/org/opensearch/jdbc/types/OpenSearchType.java
Lines 59 to 61 in b56edc7
sql/opensearch/src/main/java/org/opensearch/sql/opensearch/data/type/OpenSearchDataType.java
Lines 25 to 46 in b56edc7
sql/core/src/main/java/org/opensearch/sql/data/type/ExprCoreType.java
Lines 44 to 47 in b56edc7