Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Further clarification GROUP BY grouping key for MISSING values exclusively #15

Open
alancai98 opened this issue Apr 6, 2023 · 1 comment

Comments

@alancai98
Copy link
Member

alancai98 commented Apr 6, 2023

PartiQL spec defines the behavior when a grouping key results in NULL and MISSING. See spec section 11.1.1:

Notice that PartiQL will group together the NULL and the MISSING grouping expressions, since any grouping
expression resulting to MISSING has been coerced into NULL before eqg does comparisons for grouping.

That would seem to imply that in the output binding tuple created by GROUP BY, NULL and MISSING would be grouped together as illustrated in example 41.

However in the case in which a group key is only MISSING, the spec could be more clear on this behavior.

E.g.

SELECT a FROM
[
    {'a': 1},
    {'a': 2},
    {'a': MISSING}
] AS tbl
GROUP BY tbl.a AS a

Should the output binding tuple from GROUP BY be

-- case 1
<<
    < a: 1 >,
    < a: 2 >,
    < a: NULL>
>>

or

-- case 2
<<
    < a: 1 >,
    < a: 2 >,
    < a: MISSING>
>>

Because of the coercion that's mentioned in spec section 11.1.1, it would make more sense for case 1 to be the defined behavior (i.e. group key in binding would be NULL). Also considering the order in which the tuples are processed affecting the group key value (e.g. MISSING then NULL), case 2 would add additional implementation complexities.

After discussing with @almann, we believe that case 1 should be the spec behavior. In any case, additional spec clarification would be helpful.

@alancai98
Copy link
Member Author

alancai98 commented Apr 6, 2023

For reference, the Kotlin implementation preserves the attribute as MISSING (case 2) rather than doing the coercion to NULL (case 1):

-- CLI run with v0.9.3 including the `GROUP AS` grouping variable
PartiQL> SELECT a, g FROM
   | [
   |    {'a': 1},
   |    {'a': 2},
   |    {'a': MISSING}
   | ] AS tbl
   | GROUP BY tbl.a AS a GROUP AS g;
==='
<<
  {
    'g': <<
      {
        'tbl': {}
      }
    >>
},
  {
    'a': 1,
    'g': <<
      {
        'tbl': {
          'a': 1
        }
      }
    >>
  },
  {
    'a': 2,
    'g': <<
      {
        'tbl': {
          'a': 2
        }
      }
    >>
  }
>>

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

1 participant