Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Implement EXCLUDE in EvaluatingCompiler #1249

Merged
merged 6 commits into from
Dec 6, 2023
Merged

Conversation

alancai98
Copy link
Member

@alancai98 alancai98 commented Oct 18, 2023

Relevant Issues

partiql/partiql-lang#27 (still experimental until the RFC is merged).

Description

Implementation of EXCLUDE operator (partiql/partiql-lang#27) in the EvaluatingCompiler. A future PR will implement EXCLUDE in the PhysicalPlanCompiler, targeting the partiql-eval branch.

Other Information

License Information

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.

@alancai98 alancai98 self-assigned this Oct 18, 2023
@github-actions
Copy link

github-actions bot commented Oct 18, 2023

Conformance comparison report

Base (9ad3599) e6ccf41 +/-
% Passing 92.33% 92.33% 0.00%
✅ Passing 5372 5372 0
❌ Failing 446 446 0
🔶 Ignored 0 0 0
Total Tests 5818 5818 0

Number passing in both: 5372

Number failing in both: 446

Number passing in Base (9ad3599) but now fail: 0

Number failing in Base (9ad3599) but now pass: 0

@codecov-commenter
Copy link

codecov-commenter commented Oct 18, 2023

Codecov Report

Attention: 16 lines in your changes are missing coverage. Please review.

Comparison is base (9ad3599) 72.03% compared to head (f27fe71) 72.20%.
Report is 3 commits behind head on main.

Files Patch % Lines
...kotlin/org/partiql/lang/eval/EvaluatingCompiler.kt 90.30% 6 Missing and 10 partials ⚠️
Additional details and impacted files
@@             Coverage Diff              @@
##               main    #1249      +/-   ##
============================================
+ Coverage     72.03%   72.20%   +0.16%     
- Complexity     2058     2100      +42     
============================================
  Files           221      222       +1     
  Lines         15869    16019     +150     
  Branches       2856     2891      +35     
============================================
+ Hits          11432    11567     +135     
- Misses         3630     3635       +5     
- Partials        807      817      +10     
Flag Coverage Δ
CLI 13.43% <ø> (ø)
EXAMPLES 80.28% <ø> (ø)
LANG 80.57% <90.75%> (+0.10%) ⬆️

Flags with carried forward coverage won't be shown. Click here to find out more.

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

@alancai98
Copy link
Member Author

Based on some upcoming plans to deprecate EvaluatingCompiler, I'll look to port this reference implementation over to PhysicalPlanCompiler.

@alancai98 alancai98 changed the title WIP Implement EXCLUDE in EvaluatingCompiler Implement EXCLUDE in EvaluatingCompiler Nov 28, 2023
/**
* Represents all the exclusions at the current level and other nested levels.
*/
internal data class RemoveAndOtherSteps(val remove: Set<PartiqlAst.ExcludeStep>, val steps: Map<PartiqlAst.ExcludeStep, RemoveAndOtherSteps>) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could you please help me understand how the steps are modeled? Why is there a set of steps and then a map from steps to more? This seems like an odd way to model what looks to be a cluster of trees

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Understanding this will help be better see what's going on in compileExcludeClause

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

To give more context, the EXCLUDE paths are associative (order in which they're specified shouldn't matter) so all of the exclude paths can be evaluated in parallel. Associative exclude paths proved to be less confusing especially when dealing with collection index exclude steps. For example, if we have EXCLUDE t.a[1], t.a[2].field, t.a's 1st index and the field attribute of t.a's 2nd index will be removed in parallel rather than iteratively removing t.a[1] and then t.a[2].field after removing t.a[1].

I used RemoveAndOtherSteps to make it easier to process these exclude paths in parallel. At each level of the exclude paths (index of exclude path step), we remove some tuple attributes and collection indexes corresponding to the exclude paths that end at the current level. This is modeled by the remove set.

There may also be some exclude paths that have additional steps (so their final steps are at a deeper level). This is modeled using a map of exclude steps to the other RemoveAndOtherSteps to group all the exclude paths that have the same exclude step at the current level.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For example, let's say we have exclude paths (ignoring the exclude path root) of

      a.b,     -- exclude path 1
      x.y.z1,  -- exclude path 2
      x.y.z2   -- exclude path 3
      ^ ^ ^
Level 1 2 3

These exclude paths would be converted to the following in RemoveAndOtherSteps.

// For demonstration purposes, the syntax '<string>' corresponds to the exclude tuple attribute step of <string>
RemoveAndOtherSteps(                   // Level 1 (no exclusions at level 1)
    remove = emptySet(),
    steps = mapOf(
        'a' to RemoveAndOtherSteps(    // Level 2 for paths that have `'a'` in Level 1
            remove = setOf('b'),       // path `a.b` has final step at level 2
            steps = emptyMap()
        ),
        'x' to RemoveAndOtherSteps(    // Level 2 for paths that have `'x'` in Level 1
            remove = emptySet(),
            steps = mapOf(
                'y' to RemoveAndOtherSteps(     // Level 3 for paths that have `'y'` in Level 2 and `'x'` in Level 1
                    remove = setOf('z1', 'z2'), // paths `x.y.z1` and `x.y.z2` have final step at level 3
                    steps = emptyMap()
                )
            )
        ),
    )
)

Some of the other EXCLUDE path merging/subsumption tests shows the relationship between EXCLUDE paths and this data structure: https://github.com/partiql/partiql-lang-kotlin/blob/2d05d8a13237ed0a0ed1e1d5cec53171ca8d8b28/partiql-lang/src/test/kotlin/org/partiql/lang/eval/EvaluatingCompilerExcludeTests.kt#L841-L1074C14.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I see this is just trees where you've marked leaves. It's fine as is, just that on the surface the naming and factoring is more complicated than it need be so kind of confused me.

Say a "red" node means it gets excluded, this is all you need

graphviz

Source

digraph G {
  b[color=red]
  z1[color=red]
  z2[color=red]
  a -> b
  x -> y
  y -> z1
  y -> z2
}

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree that the naming/factoring of RemoveAndOtherSteps could be initially confusing. Since this is currently internal, I'll leave it as it is. I plan to take another look at refactoring this as part of porting EXCLUDE to the plan compiler in #1280.

/**
* Represents all the exclusions at the current level and other nested levels.
*/
internal data class RemoveAndOtherSteps(val remove: Set<PartiqlAst.ExcludeStep>, val steps: Map<PartiqlAst.ExcludeStep, RemoveAndOtherSteps>) {
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

To give more context, the EXCLUDE paths are associative (order in which they're specified shouldn't matter) so all of the exclude paths can be evaluated in parallel. Associative exclude paths proved to be less confusing especially when dealing with collection index exclude steps. For example, if we have EXCLUDE t.a[1], t.a[2].field, t.a's 1st index and the field attribute of t.a's 2nd index will be removed in parallel rather than iteratively removing t.a[1] and then t.a[2].field after removing t.a[1].

I used RemoveAndOtherSteps to make it easier to process these exclude paths in parallel. At each level of the exclude paths (index of exclude path step), we remove some tuple attributes and collection indexes corresponding to the exclude paths that end at the current level. This is modeled by the remove set.

There may also be some exclude paths that have additional steps (so their final steps are at a deeper level). This is modeled using a map of exclude steps to the other RemoveAndOtherSteps to group all the exclude paths that have the same exclude step at the current level.

/**
* Represents all the exclusions at the current level and other nested levels.
*/
internal data class RemoveAndOtherSteps(val remove: Set<PartiqlAst.ExcludeStep>, val steps: Map<PartiqlAst.ExcludeStep, RemoveAndOtherSteps>) {
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For example, let's say we have exclude paths (ignoring the exclude path root) of

      a.b,     -- exclude path 1
      x.y.z1,  -- exclude path 2
      x.y.z2   -- exclude path 3
      ^ ^ ^
Level 1 2 3

These exclude paths would be converted to the following in RemoveAndOtherSteps.

// For demonstration purposes, the syntax '<string>' corresponds to the exclude tuple attribute step of <string>
RemoveAndOtherSteps(                   // Level 1 (no exclusions at level 1)
    remove = emptySet(),
    steps = mapOf(
        'a' to RemoveAndOtherSteps(    // Level 2 for paths that have `'a'` in Level 1
            remove = setOf('b'),       // path `a.b` has final step at level 2
            steps = emptyMap()
        ),
        'x' to RemoveAndOtherSteps(    // Level 2 for paths that have `'x'` in Level 1
            remove = emptySet(),
            steps = mapOf(
                'y' to RemoveAndOtherSteps(     // Level 3 for paths that have `'y'` in Level 2 and `'x'` in Level 1
                    remove = setOf('z1', 'z2'), // paths `x.y.z1` and `x.y.z2` have final step at level 3
                    steps = emptyMap()
                )
            )
        ),
    )
)

Some of the other EXCLUDE path merging/subsumption tests shows the relationship between EXCLUDE paths and this data structure: https://github.com/partiql/partiql-lang-kotlin/blob/2d05d8a13237ed0a0ed1e1d5cec53171ca8d8b28/partiql-lang/src/test/kotlin/org/partiql/lang/eval/EvaluatingCompilerExcludeTests.kt#L841-L1074C14.

* Represents all the compiled `EXCLUDE` paths that start with the same [CompiledExcludeExpr.root]. Notably,
* redundant paths (i.e. exclude paths that exclude values already excluded by other paths) will be removed.
*/
internal data class CompiledExcludeExpr(val root: PartiqlAst.Identifier, val exclusions: RemoveAndOtherSteps)
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

From offline discussion, some of this PR's code in EvaluatingCompiler can be reused when porting EXCLUDE to the physical plan compiler. Some logic here will be refactored in an upcoming PR that adds EXCLUDE to the other compiler.

/**
* Represents all the exclusions at the current level and other nested levels.
*/
internal data class RemoveAndOtherSteps(val remove: Set<PartiqlAst.ExcludeStep>, val steps: Map<PartiqlAst.ExcludeStep, RemoveAndOtherSteps>) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I see this is just trees where you've marked leaves. It's fine as is, just that on the surface the naming and factoring is more complicated than it need be so kind of confused me.

Say a "red" node means it gets excluded, this is all you need

graphviz

Source

digraph G {
  b[color=red]
  z1[color=red]
  z2[color=red]
  a -> b
  x -> y
  y -> z1
  y -> z2
}

@alancai98 alancai98 merged commit f8f6b2a into main Dec 6, 2023
10 checks passed
@alancai98 alancai98 deleted the exclude-in-evaluator-poc branch December 6, 2023 19:51
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants