Allow multiple yield steps in from #52

Closed
julianhyde opened this issue Jun 24, 2021 · 1 comment

Comments

@julianhyde
Collaborator

Allow multiple yield steps in from.

In #21 we allowed from to contain multiple group and where clauses, in any order, but yield had to come last. Since yield is analogous to the Project relational operator, it makes sense to allow it before or after the other steps, and more than once.

So, the grammar will change from

from fromSource1 , ... , fromSources
      (fromFilter | fromGroup | fromOrder )*
      [ yield exp ]

to

from fromSource1 , ... , fromSources
      (fromFilter | fromGroup | fromOrder | yield exp)*

But we have to impose a side condition: if the yield expression is not a record, it has to come last. This is due to language scoping rules, not relational algebra. The from clause, and each subsequent step, must bind some variable names that are visible to the next step. A yield with a record expression binds the fields of the record; a yield with a scalar expression binds nothing, so there would be no way for the next step to reference the incoming element.

Example:

from e in scott.emp,
  d in scott.dept
 (* at this point, we have e and d *)
where e.deptno = d.deptno
 (* we still have e and d, because where propagates *)
group x = d.deptno compute c as count, s as sum of e.sal
 (* we now have x, c and s, because group returns its keys and aggregates *)
yield {p = x + c, q = s}
 (* we now have p and q *)
where p < 20
 (* we still have p and q, because where propagates *)
yield p / q
 (* this has to be the last step, because p / q is a scalar, and no variables are bound *)
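
The scoping rule in the example above can be modelled with a small Python sketch. This is only an illustration, not Morel's implementation: the `Step` class and `trace_scopes` function are hypothetical names, and treating order like where (as a step that propagates names unchanged) is an assumption. The sketch tracks which names are visible after each step and rejects a non-record yield that is not last.

```python
from dataclasses import dataclass
from typing import List, Optional, Set


@dataclass
class Step:
    """One step of a from pipeline (hypothetical model, not Morel's Core classes)."""
    kind: str                          # "where", "order", "group" or "yield"
    binds: Optional[List[str]] = None  # names bound by a group or a record-valued yield
    record: bool = True                # for "yield": is the expression a record?


def trace_scopes(sources: List[str], steps: List[Step]) -> List[Set[str]]:
    """Return the visible names after the sources and after each step.

    Raises ValueError if a non-record yield is followed by another step.
    """
    scope = set(sources)
    scopes = [set(scope)]
    for i, step in enumerate(steps):
        if step.kind in ("where", "order"):
            pass  # assumed to propagate the incoming names unchanged
        elif step.kind == "group":
            scope = set(step.binds or [])  # keys and aggregates
        elif step.kind == "yield":
            if step.record:
                scope = set(step.binds or [])  # fields of the record
            else:
                if i != len(steps) - 1:
                    raise ValueError("a non-record yield must be the last step")
                scope = set()  # a scalar yield binds nothing
        else:
            raise ValueError(f"unknown step kind: {step.kind}")
        scopes.append(set(scope))
    return scopes


# The example query: sources e and d, then where, group, yield, where, yield.
example = [
    Step("where"),
    Step("group", binds=["x", "c", "s"]),
    Step("yield", binds=["p", "q"]),
    Step("where"),
    Step("yield", record=False),
]
scopes = trace_scopes(["e", "d"], example)
```

Moving the scalar yield anywhere but the end makes `trace_scopes` raise, which mirrors the side condition: the following step would have no names to refer to.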

In the Core language (classes From, FromStep and its sub-classes Where, Group, Order, Yield, all of which are inner classes of class Core) we allow similar patterns. A Yield must be the last step if its expression is not a record.

But suppose you have

from e in emp
yield e.empno + e.deptno

and you want to add another step, say yield $incomingValue$ * 2. You can do this by wrapping the previous expression in a record with a single field:

from e in emp
yield {v0 = e.empno + e.deptno}
yield v0 * 2

This technique works for other step types too (group, record-valued yield, order). We intend to use this technique in our fix for #45, to deal with the case where there are nested calls to the map function.
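
As a rough illustration of the wrapping technique, the Python sketch below rewrites a pipeline that ends in a scalar yield so that another step can follow it. It is not the code used in Morel: the tuple representation of steps, the function names, and the check for a leading "{" to detect a record are all simplifying assumptions made for this example.

```python
from typing import Callable, List, Tuple

# (kind, expression text), e.g. ("yield", "e.empno + e.deptno"); hypothetical model
Step = Tuple[str, str]


def append_after_scalar_yield(steps: List[Step],
                              next_step: Callable[[str], Step],
                              field: str = "v0") -> List[Step]:
    """Wrap a trailing scalar yield in a one-field record, then append a step.

    next_step receives the field name and returns the step to append.
    """
    if not steps or steps[-1][0] != "yield":
        raise ValueError("expected the pipeline to end with a yield")
    _, expr = steps[-1]
    # Crude assumption: an expression starting with "{" is a record.
    if expr.lstrip().startswith("{"):
        raise ValueError("trailing yield is already a record")
    wrapped = ("yield", "{" + field + " = " + expr + "}")
    return steps[:-1] + [wrapped, next_step(field)]


def unparse(source: str, steps: List[Step]) -> str:
    """Render the pipeline as text, one step per line."""
    return "\n".join(["from " + source] + [kind + " " + expr for kind, expr in steps])


pipeline = [("yield", "e.empno + e.deptno")]
rewritten = append_after_scalar_yield(pipeline, lambda v: ("yield", v + " * 2"))
text = unparse("e in emp", rewritten)
```

Here `text` contains the three-line query shown above, with v0 standing in for the incoming value.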

julianhyde added a commit that referenced this issue Aug 3, 2021
See #52

Previously, we allowed from to contain multiple group and
where clauses, in any order, but yield had to come last.
Since yield is analogous to the Project relational
operator, it makes sense to allow it to be before and after
the other operators, and repeated.

So, the grammar changes from

  from fromSource1 , ... , fromSources
        (fromFilter | fromGroup | fromOrder )*
        [ yield exp ]

to

  from fromSource1 , ... , fromSources
        (fromFilter | fromGroup | fromOrder | yield exp)*

But we have to impose a side condition: if the yield
expression is not a record, it has to come last. This is
due to language scoping rules, not relational algebra. The
from clause, and each subsequent step, must bind some
variable names that are visible to the next step. A yield
with a record expression binds the fields of the record; a
yield with a scalar expression binds nothing, so there would
be no way for the next step to reference the incoming element.

Example:

  from e in scott.emp,
    d in scott.dept
     (* at this point, we have e and d *)
  where e.deptno = d.deptno
    (* we still have e and d, because where propagates *)
  group x = d.deptno compute c as count, s as sum of e.sal
    (* we now have x, c and s, because group returns its
       keys and aggregates *)
  yield {p = x + c, q = s}
    (* we now have p and q *)
  where p < 20
    (* we still have p and q, because where propagates *)
  yield p / q
     (* this has to be the last step, because p / q is a
        scalar, and no variables are bound *)

In the Core language (classes From, FromStep and its
sub-classes Where, Group, Order, Yield, all of which are
inner classes of class Core) we allow similar patterns. A
Yield must be the last step if its expression is not a
record.

But suppose you have

  from e in emp
    yield e.empno + e.deptno

and you want to add another step, say (in pseudo-code)

  yield $incomingValue$ * 2

You can do this by wrapping the previous expression in
a record with a single field:

  from e in emp
    yield {v0 = e.empno + e.deptno}
    yield v0 * 2

This technique works for other step types too (group,
record-valued yield, order). We use this technique in
our fix for [MOREL-45], to deal with the case where there
are nested calls to the map function.

Flatten nested from:
  from x in (from y ...) steps
becomes
  from y ... yield {x = E} steps

Improve unparse:
 * Give 'in' the same precedence as '=', which forces
   parentheses in 'from e in (from ...)';
 * Treat structure '$' in BuiltIn as if it were null;
 * Unparse Core.Tuple as a record if its type is RecordType.
julianhyde added a commit that referenced this issue Aug 3, 2021
See #52
@julianhyde
Collaborator Author

Fixed in 3413702.
