Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add Highlight In SQL #96

Merged
merged 27 commits into from
Aug 1, 2022
Merged

Conversation

forestmvey
Copy link

Description

Support highlighting function in SQL query engine.

Issues Resolved

Github Issue: 636

Check List

  • New functionality includes testing.
    • All tests pass, including unit test, integration test and doctest
  • New functionality has been documented.
    • New functionality has javadoc added
    • New functionality has user manual doc added
  • Commits are signed per the DCO using --signoff

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.
For more information on following Developer Certificate of Origin and signing off your commits, please check here.

MaxKsyunz and others added 20 commits July 26, 2022 09:29
Signed-off-by: MaxKsyunz <maxk@bitquilltech.com>
Signed-off-by: MaxKsyunz <maxk@bitquilltech.com>
Signed-off-by: MaxKsyunz <maxk@bitquilltech.com>
Signed-off-by: MaxKsyunz <maxk@bitquilltech.com>
Signed-off-by: MaxKsyunz <maxk@bitquilltech.com>
Signed-off-by: MaxKsyunz <maxk@bitquilltech.com>
In particular, highlightField is an expression now.

Signed-off-by: MaxKsyunz <maxk@bitquilltech.com>
HighlightFunction is converted to LogicalHighlight logical plan.

Signed-off-by: MaxKsyunz <maxk@bitquilltech.com>
Signed-off-by: MaxKsyunz <maxk@bitquilltech.com>
Signed-off-by: MaxKsyunz <maxk@bitquilltech.com>
Signed-off-by: MaxKsyunz <maxk@bitquilltech.com>
Signed-off-by: MaxKsyunz <maxk@bitquilltech.com>
Signed-off-by: MaxKsyunz <maxk@bitquilltech.com>
Signed-off-by: forestmvey <forestv@bitquilltech.com>
Signed-off-by: forestmvey <forestv@bitquilltech.com>
Signed-off-by: forestmvey <forestv@bitquilltech.com>
Signed-off-by: forestmvey <forestv@bitquilltech.com>
Signed-off-by: forestmvey <forestv@bitquilltech.com>
Signed-off-by: forestmvey <forestv@bitquilltech.com>
Signed-off-by: forestmvey <forestv@bitquilltech.com>
Comment on lines 40 to 67
/**
* Return String or Map value matching highlight field.
* @param valueEnv : Dataset to parse value from.
* @return : String or Map value of highlight fields.
*/
@Override
public ExprValue valueOf(Environment<Expression, ExprValue> valueEnv) {
String refName = "_highlight";
if (!getHighlightField().toString().contains("*")) {
refName += "." + StringUtils.unquoteText(getHighlightField().toString());
}
ExprValue retVal = valueEnv.resolve(DSL.ref(refName, ExprCoreType.STRING));
ImmutableMap.Builder<String, ExprValue> builder = new ImmutableMap.Builder<>();

if (retVal.isMissing() || retVal.type() != ExprCoreType.STRUCT) {
return retVal;
}

var hlBuilder = ImmutableMap.<String, ExprValue>builder();
hlBuilder.putAll(retVal.tupleValue());

for (var entry : retVal.tupleValue().entrySet()) {
String entryKey = "highlight(" + getHighlightField() + ")." + entry.getKey();
builder.put(entryKey, ExprValueUtils.stringValue(entry.getValue().toString()));
}

return ExprTupleValue.fromExprValueMap(builder.build());
}

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is it worth returning always a map?

Copy link
Author

@forestmvey forestmvey Jul 27, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Line 55 returns only one field in the instance there aren't multiple highlight fields returned.

ExprValue retVal = valueEnv.resolve(DSL.ref(refName, ExprCoreType.STRING));
ImmutableMap.Builder<String, ExprValue> builder = new ImmutableMap.Builder<>();

if (retVal.isMissing() || retVal.type() != ExprCoreType.STRUCT) {

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

When this condition is true?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If there was no highlight value parsed from the return, or only a single highlight field is found.

Comment on lines 19 to 21
public class HighlightOperator extends PhysicalPlan {
@NonNull
private final PhysicalPlan input;

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why do you store a parent class instance there?
I think you can call super.next() and super.hasNext() instead of input.next() and input.hasNext() and so on.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In line with Peng's comment in the POC, this class ended up not doing anything and can be removed.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Seems to be standard considering other operators also store the parent class instance like this.

@@ -274,7 +292,7 @@ public void project_values() {
AstDSL.project(
AstDSL.values(ImmutableList.of(AstDSL.intLiteral(123))),
AstDSL.alias("123", AstDSL.intLiteral(123)),
AstDSL.alias("hello", AstDSL.stringLiteral("hello")),
AstDSL.alias("hello", stringLiteral("hello")),

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I believe this was an accidental change done in a previous draft PR. Should be fixed under 57ec0691b36f1e0010c034df1a59ef6b364feaf4.

@@ -694,7 +712,7 @@ public void parse_relation() {
AstDSL.parse(
AstDSL.relation("schema"),
AstDSL.field("string_value"),
AstDSL.stringLiteral("(?<group>.*)")),
stringLiteral("(?<group>.*)")),

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I believe this was an accidental change done in a previous draft PR. Should be fixed under 57ec0691b36f1e0010c034df1a59ef6b364feaf4.

Signed-off-by: forestmvey <forestv@bitquilltech.com>
@forestmvey forestmvey force-pushed the dev-highlight-in-sql branch from 9c777fc to 6c17cd7 Compare July 27, 2022 14:02
@forestmvey forestmvey requested a review from a team July 27, 2022 16:43
@forestmvey forestmvey force-pushed the dev-highlight-in-sql branch from 57ec069 to 7e1eab3 Compare July 27, 2022 21:49
import org.opensearch.sql.planner.logical.LogicalPlan;

@RequiredArgsConstructor
public class HighlightAnalyzer extends AbstractNodeVisitor<LogicalPlan, AnalysisContext> {

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Class needs a javadoc

refName += "." + StringUtils.unquoteText(getHighlightField().toString());
}
ExprValue retVal = valueEnv.resolve(DSL.ref(refName, ExprCoreType.STRING));
ImmutableMap.Builder<String, ExprValue> builder = new ImmutableMap.Builder<>();

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You can declare this after the is statement.

return retVal;
}

var hlBuilder = ImmutableMap.<String, ExprValue>builder();

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

mapBuilder or highlightMapBuilder?

Signed-off-by: forestmvey <forestv@bitquilltech.com>
…ing coverage for OpenSearchIndexScan

Signed-off-by: forestmvey <forestv@bitquilltech.com>
public class HighlightOperatorTest extends PhysicalPlanTestBase {

@Test
public void highlight_all_test() {

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

what is this testing? that highlight all doesn't break anything?

Copy link
Author

@forestmvey forestmvey Jul 29, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I suppose this should be renamed to valid_highlight_operator_test. It is required for HighlightOperator coverage. The * isn't necessary for the highlight field.

sql/src/main/antlr/OpenSearchSQLParser.g4 Show resolved Hide resolved
…light acceptance as a string literal as well as qualified name.

Signed-off-by: forestmvey <forestv@bitquilltech.com>
@forestmvey forestmvey force-pushed the dev-highlight-in-sql branch from c0869c4 to 3ee3491 Compare July 29, 2022 18:51
Comment on lines 19 to 21
public class HighlightOperator extends PhysicalPlan {
@NonNull
private final PhysicalPlan input;

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In line with Peng's comment in the POC, this class ended up not doing anything and can be removed.

@MaxKsyunz MaxKsyunz self-requested a review July 29, 2022 20:01
@MitchellGale
Copy link

@forestmvey looks like a test is failing now fyi.

    org.opentest4j.AssertionFailedError: expected: org.opensearch.sql.ast.expression.HighlightFunction@9443150<HighlightFunction(highlightField=fieldA)> but was: org.opensearch.sql.ast.expression.HighlightFunction@60bdba16<HighlightFunction(highlightField=fieldA)>
        at app//org.junit.jupiter.api.AssertionUtils.fail(AssertionUtils.java:55)
        at app//org.junit.jupiter.api.AssertionUtils.failNotEqual(AssertionUtils.java:62)
        at app//org.junit.jupiter.api.AssertEquals.assertEquals(AssertEquals.java:182)
        at app//org.junit.jupiter.api.AssertEquals.assertEquals(AssertEquals.java:177)
        at app//org.junit.jupiter.api.Assertions.assertEquals(Assertions.java:1124)
        at app//org.opensearch.sql.sql.parser.AstExpressionBuilderTest.canBuildHighlighFunction(AstExpressionBuilderTest.java:314)

Copy link

@MaxKsyunz MaxKsyunz left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Once HighlightOperator goes away (or we can explain what it's for), other comments are addressed, and the CI tasks pass, I think it's good to go.

… PR.

Signed-off-by: forestmvey <forestv@bitquilltech.com>
Signed-off-by: forestmvey <forestv@bitquilltech.com>
@codecov
Copy link

codecov bot commented Jul 30, 2022

Codecov Report

Merging #96 (373353b) into integ_highlight-in-sql (5c6fd72) will increase coverage by 0.02%.
The diff coverage is 100.00%.

@@                     Coverage Diff                      @@
##             integ_highlight-in-sql      #96      +/-   ##
============================================================
+ Coverage                     97.74%   97.77%   +0.02%     
- Complexity                     2858     2883      +25     
============================================================
  Files                           273      276       +3     
  Lines                          7020     7088      +68     
  Branches                        442      450       +8     
============================================================
+ Hits                           6862     6930      +68     
  Misses                          157      157              
  Partials                          1        1              
Flag Coverage Δ
sql-engine 97.77% <100.00%> (+0.02%) ⬆️

Flags with carried forward coverage won't be shown. Click here to find out more.

Impacted Files Coverage Δ
...ensearch/sql/planner/physical/PhysicalPlanDSL.java 100.00% <ø> (ø)
.../sql/planner/physical/PhysicalPlanNodeVisitor.java 100.00% <ø> (ø)
...ain/java/org/opensearch/sql/analysis/Analyzer.java 100.00% <100.00%> (ø)
...rg/opensearch/sql/analysis/ExpressionAnalyzer.java 100.00% <100.00%> (ø)
...org/opensearch/sql/analysis/HighlightAnalyzer.java 100.00% <100.00%> (ø)
...ensearch/sql/expression/ExpressionNodeVisitor.java 100.00% <100.00%> (ø)
...opensearch/sql/expression/HighlightExpression.java 100.00% <100.00%> (ø)
...h/sql/expression/function/BuiltinFunctionName.java 100.00% <100.00%> (ø)
...h/sql/expression/function/OpenSearchFunctions.java 100.00% <100.00%> (ø)
...ensearch/sql/planner/logical/LogicalHighlight.java 100.00% <100.00%> (ø)
... and 7 more

Help us with your feedback. Take ten seconds to tell us how you rate us.

Signed-off-by: forestmvey <forestv@bitquilltech.com>
@acarbonetto
Copy link

Once HighlightOperator goes away (or we can explain what it's for), other comments are addressed, and the CI tasks pass, I think it's good to go.

At least, the two failing actions would be fixed by merging from upstream:main (opensearch-project#704) - I believe

@forestmvey forestmvey merged commit 1101db4 into integ_highlight-in-sql Aug 1, 2022
@Yury-Fridlyand Yury-Fridlyand deleted the dev-highlight-in-sql branch January 12, 2023 19:46
andy-k-improving pushed a commit that referenced this pull request Nov 16, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants