Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

DRAFT - Add support for highlight in PPL #107

Closed
wants to merge 29 commits into from

Conversation

forestmvey
Copy link

Description
Add PPL support for highlight in SQL plugin.

Issues Resolved
Issue: github-636

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.
For more information on following Developer Certificate of Origin and signing off your commits, please check here.

@forestmvey forestmvey requested a review from a team August 17, 2022 23:04
@codecov
Copy link

codecov bot commented Aug 17, 2022

Codecov Report

Merging #107 (c656026) into draft-highlight-in-ppl (ce15448) will increase coverage by 0.02%.
The diff coverage is 100.00%.

❗ Current head c656026 differs from pull request most recent head 9d554d0. Consider uploading reports for the commit 9d554d0 to get more accurate results

@@                     Coverage Diff                      @@
##             draft-highlight-in-ppl     #107      +/-   ##
============================================================
+ Coverage                     94.85%   94.88%   +0.02%     
- Complexity                     2913     2926      +13     
============================================================
  Files                           288      289       +1     
  Lines                          7839     7879      +40     
  Branches                        571      573       +2     
============================================================
+ Hits                           7436     7476      +40     
  Misses                          349      349              
  Partials                         54       54              
Flag Coverage Δ
query-workbench 62.76% <ø> (ø)
sql-engine 97.81% <100.00%> (+0.01%) ⬆️

Flags with carried forward coverage won't be shown. Click here to find out more.

Impacted Files Coverage Δ
...rg/opensearch/sql/analysis/ExpressionAnalyzer.java 100.00% <ø> (ø)
...ensearch/storage/script/core/ExpressionScript.java 100.00% <ø> (ø)
...pensearch/sql/protocol/response/format/Format.java 100.00% <ø> (ø)
...ain/java/org/opensearch/sql/analysis/Analyzer.java 100.00% <100.00%> (ø)
...search/sql/planner/physical/HighlightOperator.java 100.00% <100.00%> (ø)
.../sql/planner/physical/PhysicalPlanNodeVisitor.java 100.00% <100.00%> (ø)
...ecutor/protector/OpenSearchExecutionProtector.java 100.00% <100.00%> (ø)
...search/sql/opensearch/storage/OpenSearchIndex.java 100.00% <100.00%> (ø)
...java/org/opensearch/sql/ppl/parser/AstBuilder.java 100.00% <100.00%> (ø)
...pensearch/sql/ppl/parser/AstExpressionBuilder.java 100.00% <100.00%> (ø)

📣 We’re building smart automated test selection to slash your CI/CD build times. Learn more

* Bump checkstyle version and fix violations

Signed-off-by: Chen Dai <daichen@amazon.com>

* Fix checkstyle violations in test code

Signed-off-by: Chen Dai <daichen@amazon.com>

Signed-off-by: Chen Dai <daichen@amazon.com>
*/
@Override
public LogicalPlan visitHighlight(Highlight node, AnalysisContext context) {
LogicalPlan child = node.getChild().get(0).accept(this, context);

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just to confirm that .get(0) won't crash if there is no such element.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't believe this case would be possible. As well the returning object at index 0 is standard for visit functions in the Analyzer.

Comment on lines +260 to 266
public T visitHighlightFunction(HighlightFunction node, C context) {
return visitChildren(node, context);
}

public T visitHighlight(Highlight node, C context) {
return visitChildren(node, context);
}

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What is the difference?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

visitHighlightFunction takes a HighlightFunction node and has an overriding function in ExpressionAnalyzer, while visitHighlight takes a highlight node and has an overriding function in Analyzer. This is because for PPL I was unable to resolve a highlight field as an expression like I did in SQL. Instead I use HighlightOperator and use the Analyzer to get around this issue. This issue arose because of the way the parser was implemented, the function highlight syntax of not having highlight in a fields command, and the parser implementation of AllFields without a fields command.

We may want to consider updating SQL to use the highlightOperator rather than HighlightExpression...

@MaxKsyunz
Copy link

@forestmvey I didn't realize you recreated the PR.

Please have a look at my comments to the previous PR here.

@@ -79,4 +79,8 @@ public R visitMLCommons(PhysicalPlan node, C context) {
public R visitAD(PhysicalPlan node, C context) {
return visitNode(node, context);
}

public R visitHighlight(PhysicalPlan node, C context) {

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What's the difference between visitHighlight and visitAD?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

AD and highlight operators need their own visiting functions to be called from their derived classes. You can find the overriding functions in OpenSearchExecutionProtector. The nodes passed into these functions will be for their respective HighlightOperator and ADOperator. This way the operator nodes are passed in, we can successfully visit the input for each node.

@@ -607,6 +609,26 @@ public void testParseCommand() {
));
}

@Test
public void testStringLiteralHighlightCommand() {
assertEqual("source=t | highlight('FieldA')",

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can double quotes be used in the highlight parameter? If so, should it be tested?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes double quotes in highlight parameter are supported. Both single and double quoted parameters will be interpreted as string literals at this parsing level. I don't think it's overly necessary to add a test for it at this level.

@@ -25,7 +25,6 @@
import static org.opensearch.sql.ast.dsl.AstDSL.exprList;
import static org.opensearch.sql.ast.dsl.AstDSL.field;
import static org.opensearch.sql.ast.dsl.AstDSL.filter;
import static org.opensearch.sql.ast.dsl.AstDSL.floatLiteral;

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Was this just an unused import?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes this was just an un-used import.

@forestmvey forestmvey force-pushed the dev-highlight-in-ppl branch 2 times, most recently from 875bbb7 to 978eb62 Compare August 22, 2022 20:09
MaxKsyunz and others added 21 commits August 23, 2022 08:31
Signed-off-by: MaxKsyunz <maxk@bitquilltech.com>
HighlightFunction is converted to LogicalHighlight logical plan.

Signed-off-by: MaxKsyunz <maxk@bitquilltech.com>
Signed-off-by: MaxKsyunz <maxk@bitquilltech.com>
Signed-off-by: MaxKsyunz <maxk@bitquilltech.com>
Signed-off-by: MaxKsyunz <maxk@bitquilltech.com>
Signed-off-by: forestmvey <forestv@bitquilltech.com>
Signed-off-by: forestmvey <forestv@bitquilltech.com>
…Improved test coverage and fixing checkstyle errors.

Signed-off-by: forestmvey <forestv@bitquilltech.com>
Signed-off-by: forestmvey <forestv@bitquilltech.com>
…ing coverage for OpenSearchIndexScan

Signed-off-by: forestmvey <forestv@bitquilltech.com>
…light acceptance as a string literal as well as qualified name.

Signed-off-by: forestmvey <forestv@bitquilltech.com>
…ht call in PhysicalPlanNodeVisitor..

Signed-off-by: forestmvey <forestv@bitquilltech.com>
…roject.

Signed-off-by: forestmvey <forestv@bitquilltech.com>
Signed-off-by: forestmvey <forestv@bitquilltech.com>
Signed-off-by: forestmvey <forestv@bitquilltech.com>
…quotes.

Signed-off-by: forestmvey <forestv@bitquilltech.com>
Signed-off-by: forestmvey <forestv@bitquilltech.com>
Signed-off-by: forestmvey <forestv@bitquilltech.com>
Signed-off-by: forestmvey <forestv@bitquilltech.com>
Signed-off-by: forestmvey <forestv@bitquilltech.com>
Signed-off-by: forestmvey <forestv@bitquilltech.com>
Signed-off-by: forestmvey <forestv@bitquilltech.com>
…rchExecutionrotector. Refine HighlightOperator logic.

Signed-off-by: forestmvey <forestv@bitquilltech.com>
…tFunction in Analyzer as this case should not be reachable.

Signed-off-by: forestmvey <forestv@bitquilltech.com>
…ations from HighlightOperator.

Signed-off-by: forestmvey <forestv@bitquilltech.com>
Signed-off-by: forestmvey <forestv@bitquilltech.com>
Signed-off-by: forestmvey <forestv@bitquilltech.com>
@forestmvey forestmvey force-pushed the dev-highlight-in-ppl branch from e00dba1 to c656026 Compare August 23, 2022 15:35
Signed-off-by: forestmvey <forestv@bitquilltech.com>
@forestmvey forestmvey closed this Aug 23, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants