Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat!: Require parentheses around pipelines #4775

Merged
merged 13 commits into from
Jul 24, 2024

Conversation

max-sixty
Copy link
Member

Whether pipelines required parentheses was ambiguous before — does select (invoice_date | date.to_text "%d/%m/%Y") require its parentheses? How about in a tuple like select {(invoice_date | date.to_text "%d/%m/%Y")}?

This changes the parser so parentheses are required. This creates much better errors in some cases — for example a missing comma from a tuple isn't parsed as a pipeline:

derive {
  x=y
  a=b
}

...is no longer parsed as derive {x=(y | a=b)}, which is really confusing. Or similarly derive {x=y a=b} also parses into that.

It also simplifies the parser code slightly; unifying pipeline functions.

(tests fail on parsing one thing from the std lib, I need to fix)

@snth
Copy link
Member

snth commented Jul 23, 2024

It does make a lot of sense for those parsing error examples that you show. It's a pity though for the common use cases like sum or text.lower, e.g. derive low = (title | text.lower), where it adds overhead.

Would it be possible to require it only in cases where there's a \n involved? In Python you have

>>> s = 'a' 'b'
>>> s
'ab'
>>> s = ('a'
... 'b')
>>> s
'ab'

@snth
Copy link
Member

snth commented Jul 23, 2024

Or if one conceptually separated the two uses of |, maybe one could disallow \n as composition operator for the expression language?

What I mean is, assume one used |> or >> for piping relations to transforms and | for piping arguments to functions in column expressions, e.g. from customers >> derive upper_title = title | text.upper. While it's sometimes useful to use >> as above, we usually write this as

from customers
derive upper_title = title | text.upper

While I usually want to use \n instead of >>, I struggle to imagine scenarios where I would want to use \n instead of |. Even in cases where the line becomes very long and I want to split it over two lines, I'd be ok with requiring a mandatory | as well as () to suppress the conversion of \n to >>. So

from customers
derive upper_title = (title | ... | ... |
    text.upper)

or

from customers
derive upper_title = (title 
    | text.upper)

but never

from customers
derive upper_title = (title 
    text.upper)

or

from customers
derive upper_title = title 
    text.upper

@max-sixty
Copy link
Member Author

I think any deviation away from "| is the same as a newline in a pipeline" is a huge change and would need a really compelling rationale. Similarly anything making a pipeline anything other than applying functions in turn...

max-sixty added a commit to max-sixty/prql that referenced this pull request Jul 24, 2024
@max-sixty
Copy link
Member Author

max-sixty commented Jul 24, 2024

Need to add some parentheses docs but otherwise this is ready to go!

Edit: actually the docs were already up-to-date... the code behavior was overly permissive and we didn't check for an error

@max-sixty max-sixty enabled auto-merge (squash) July 24, 2024 07:48
@max-sixty max-sixty merged commit f01789a into PRQL:main Jul 24, 2024
36 of 37 checks passed
@max-sixty max-sixty deleted the pipeline-parens branch July 24, 2024 07:50
max-sixty added a commit to max-sixty/prql that referenced this pull request Jul 24, 2024
max-sixty added a commit that referenced this pull request Jul 24, 2024
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants