from should not have a singleton record type unless it ends with a singleton record yield #159

Closed
julianhyde opened this issue Jul 26, 2022 · 0 comments

If a from expression has one variable named i of type int, should the type of the returned elements be int or {i: int}? (We call int a scalar, {i: int} a singleton record type, yield {i = i} a singleton record yield, and yield {i = j, j = k} a renaming yield.)

After this change, from will have a singleton record type only if it ends with a singleton record yield. If it does not end in yield, the type depends on N, the number of pipeline variables, and will be a scalar if N = 1 and a record with N fields if N != 1.

Before this change, that would depend on whether there was a singleton record yield (such as yield {i} or yield {i = i} or yield {i = j} or yield {i = j + 2}) somewhere in the pipeline, as follows:

- from i in [1,2];                        # 1
val it = [1,2] : int list
- from i in [1,2] yield {i};              # 2
val it = [{i=1},{i=2}] : {i:int} list
- from i in [1,2] yield i;                # 3
val it = [1,2] : int list
- from i in [1,2] yield i + 3;            # 4
val it = [4,5] : int list
- from i in [1,2] where i > 1;            # 5
val it = [2] : int list
- from i in [1,2] yield {i} where i > 1;  # 6
val it = [{i=2}] : {i:int} list
- from i in [1,2] where i > 1 yield {i};  # 7
val it = [{i=2}] : {i:int} list
- from i in [1,2] yield {j = i} where j > 1; # 8
val it = [{j=2}] : {j:int} list
- from i in [1,2] order i desc;           # 9
val it = [2,1] : int list
- from i in [1,2] yield {j=i} order j desc; # 10
val it = [{j=2},{j=1}] : {j:int} list
- from i in [1,2] yield {j=i} order j desc yield {j}; # 11
val it = [{j=2},{j=1}] : {j:int} list
- from i in [1,2] yield {j=i} order j desc yield j; # 12
val it = [2,1] : int list

This behavior is unsatisfactory. It requires that a pipeline remember whether there is a singleton record yield somewhere in the pipeline.
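
To make that bookkeeping concrete, here is a minimal Python sketch of the old rule as it can be inferred from the twelve examples above. It is a toy model, not Morel's implementation: the step names (`scan`, `where`, `order`, `yield_record`, `yield_scalar`) and the function `old_result_kind` are hypothetical, and it covers only single-variable pipelines.

```python
def old_result_kind(steps):
    """Classify a single-variable pipeline under the pre-change rule.

    `steps` is a list of hypothetical step names. The result is "record"
    or "scalar"; field names are deliberately not modeled.
    """
    # State carried along the whole pipeline: was the most recent yield a
    # singleton record yield? This is the memory the issue objects to.
    saw_record_yield = False
    for step in steps:
        if step == "yield_record":
            saw_record_yield = True
        elif step == "yield_scalar":
            saw_record_yield = False
        # "scan", "where" and "order" leave the flag untouched.
    return "record" if saw_record_yield else "scalar"


# Example 6: from i in [1,2] yield {i} where i > 1
print(old_result_kind(["scan", "yield_record", "where"]))  # prints "record"
```

The flag has to survive every `where` and `order` step, which is why a yield far upstream could still decide the final element type.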

After this change, the only thing that counts is whether the last step is a yield. To return singleton records, the last step must be a singleton record yield: for example yield {i = i} (or the shorthand yield {i}), a rename such as yield {j = i}, or an expression such as yield {k = i + j + 3}.

If the last step is not a yield, the result is scalar if there is one variable, a record otherwise. Here are the above expressions after the change:

- from i in [1,2];                        # 1
val it = [1,2] : int list
- from i in [1,2] yield {i};              # 2
val it = [{i=1},{i=2}] : {i:int} list
- from i in [1,2] yield i;                # 3
val it = [1,2] : int list
- from i in [1,2] yield i + 3;            # 4
val it = [4,5] : int list
- from i in [1,2] where i > 1;            # 5
val it = [2] : int list
- from i in [1,2] yield {i} where i > 1;  # 6 (changed)
val it = [2] : int list
- from i in [1,2] where i > 1 yield {i};  # 7
val it = [{i=2}] : {i:int} list
- from i in [1,2] yield {j = i} where j > 1; # 8 (changed)
val it = [2] : int list
- from i in [1,2] order i desc;           # 9
val it = [2,1] : int list
- from i in [1,2] yield {j=i} order j desc; # 10 (changed)
val it = [2,1] : int list
- from i in [1,2] yield {j=i} order j desc yield {j}; # 11
val it = [{j=2},{j=1}] : {j:int} list
- from i in [1,2] yield {j=i} order j desc yield j; # 12
val it = [2,1] : int list

The pipelines whose types have changed (6, 8, 10) are those that contain a yield but do not end in yield.
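
The new rule can be written down as a small function that looks only at the last step and at the number of variables in scope. The Python below is a sketch under stated assumptions, not Morel code: the `(kind, names)` step encoding and the function `new_result_type` are hypothetical, a scalar yield is allowed only as the last step, and "scalar" stands in for whatever type the yielded expression has.

```python
def new_result_type(steps):
    """Return "scalar" or a record type such as "{i}" for a pipeline.

    Each step is a (kind, names) pair, where names lists the variables a
    scan binds or the fields a record yield produces (hypothetical encoding).
    """
    variables = []
    for position, (kind, names) in enumerate(steps):
        if kind == "scan":
            variables.extend(names)
        elif kind == "yield_record":
            variables = list(names)  # a record yield rebinds the variables
        elif kind == "yield_scalar" and position != len(steps) - 1:
            raise ValueError("sketch allows a scalar yield only as the last step")

    last_kind, last_names = steps[-1]
    if last_kind == "yield_scalar":
        return "scalar"
    if last_kind == "yield_record":
        return "{" + ", ".join(sorted(last_names)) + "}"
    # No trailing yield: only the number of variables matters.
    if len(variables) == 1:
        return "scalar"
    return "{" + ", ".join(sorted(variables)) + "}"


# Example 8: from i in [1,2] yield {j = i} where j > 1
example_8 = [("scan", ["i"]), ("yield_record", ["j"]), ("where", [])]
print(new_result_type(example_8))  # prints "scalar"
```

Unlike the old behavior, nothing here depends on a yield that occurred earlier in the pipeline except through the set of variables it leaves in scope.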

As example 8 shows, you can now use a singleton yield (yield {j = i}) to rename the variable without forcing the result to be a record type. There is no reason to end a pipeline with a singleton yield purely to rename a variable, because no downstream steps remain to use the new name.

As part of this change, we introduce a new class FromBuilder to safely build Core.From pipelines. It performs micro-optimizations as it goes, such as removing where true steps. FromBuilder can also inline nested from expressions:

from i in (from j in [1, 2, 3]
    where j > 1)
where i < 3

becomes

from j in [1, 2, 3]
where j > 1
yield {i = j}
where i < 3

Note the use of yield {i = j} to handle the variable name change caused by inlining.
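
The two behaviors described above (dropping `where true` and inlining a nested `from`) can be illustrated with a toy builder. This is a Python sketch only: the class `FromSketch` and its methods are hypothetical and do not reflect the real `FromBuilder` API, conditions are plain strings, and inlining is attempted only when the nested pipeline is the first scan and has exactly one variable.

```python
class FromSketch:
    """Toy pipeline builder; a hypothetical stand-in, not Morel's FromBuilder."""

    def __init__(self):
        self.steps = []

    def scan(self, var, source):
        if isinstance(source, FromSketch) and not self.steps:
            # Inline the nested pipeline's steps instead of nesting it,
            # then rename its variable if the outer name differs.
            inner_var = source.single_var()
            self.steps.extend(source.steps)
            if inner_var != var:
                self.steps.append(("yield", {var: inner_var}))
        else:
            self.steps.append(("scan", var, source))
        return self

    def where(self, condition):
        if condition != "true":  # micro-optimization: drop `where true`
            self.steps.append(("where", condition))
        return self

    def single_var(self):
        """Return the one variable in scope, or raise ValueError."""
        var = None
        for step in self.steps:
            if step[0] == "scan":
                if var is not None:
                    raise ValueError("more than one variable in scope")
                var = step[1]
            elif step[0] == "yield":
                (var,) = step[1].keys()  # fails unless exactly one field
        if var is None:
            raise ValueError("no variable in scope")
        return var

    def to_text(self):
        parts = []
        for step in self.steps:
            if step[0] == "scan":
                src = step[2]
                shown = "(" + src.to_text() + ")" if isinstance(src, FromSketch) else repr(src)
                parts.append(f"from {step[1]} in {shown}")
            elif step[0] == "where":
                parts.append(f"where {step[1]}")
            else:
                fields = ", ".join(f"{k} = {v}" for k, v in step[1].items())
                parts.append("yield {" + fields + "}")
        return " ".join(parts)


inner = FromSketch().scan("j", [1, 2, 3]).where("j > 1")
outer = FromSketch().scan("i", inner).where("i < 3").where("true")
print(outer.to_text())
# prints: from j in [1, 2, 3] where j > 1 yield {i = j} where i < 3
```

Because a non-terminal singleton yield no longer forces a record type, the inserted `yield {i = j}` renames the variable without changing the type of the overall expression.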

julianhyde added a commit to julianhyde/morel that referenced this issue Jul 28, 2022
…ds with a singleton record yield

If a `from` expression has one variable named `i` of type
`int`, should the type of the returned elements be `int` or
`{i: int}`? (We call `int` a scalar, `{i: int}` a singleton
record type, `yield {i = i}` a singleton record yield, and
`yield {i = j, j = k}` a renaming yield.)

After this change, `from` will have a singleton record type
only if it ends with a singleton record yield. If it does not
end in yield, the type depends on N, the number of pipeline
variables, and will be a scalar if N = 1 and a record with N
fields if N != 1.

As part of this change, we introduce a new `class FromBuilder`
to safely build `Core.From` pipelines. It performs
micro-optimizations as it goes, such as removing `where true`,
empty `order` and trivial `yield` steps. `FromBuilder` can
also inline nested from expressions:

  from i in (from j in [1, 2, 3]
      where j > 1)
    where i < 3

becomes

  from j in [1, 2, 3]
    where j > 1
    yield {i = j}
    where i < 3

Note the use of `yield {i = j}` to handle the variable name
change caused by inlining.

Ensure that `from d in scott.depts` is not printed as
'<relation>' even if it is optimized to `scott.depts`. If we
write `scott.depts` in the shell, the shell prints
'<relation>' because `depts` may be a large table.
`FromBuilder` now simplifies `from d in scott.depts` to
`scott.depts`, but we want the shell to print the rows, not
'<relation>'. Therefore the shell now calls `Code.wrap` to
tag whether an expression would be treated as a relation or
as a query.

Fixes hydromatic#159
julianhyde added a commit to julianhyde/morel that referenced this issue Sep 25, 2022
…ds with a singleton record yield
