Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

docs: RFC modules #2129

Merged
merged 12 commits into from
Apr 20, 2023
Merged

docs: RFC modules #2129

merged 12 commits into from
Apr 20, 2023

Conversation

aljazerzen
Copy link
Member

@aljazerzen aljazerzen commented Mar 12, 2023

A crude draft on module and multiple queries per file. It's much more concrete than the type system one.

@aljazerzen aljazerzen added the language-design Changes to PRQL-the-language label Mar 12, 2023
@aljazerzen aljazerzen requested a review from snth March 12, 2023 19:29
```
mod my_module {
let bangers = (from tracks | take 10)
}
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Syntax is similar to current discussion on tuples/arrays, but I don't think it will clash on the parser-level. That's because this is on the "level" of the statements and those two are both on the "level" of expressions.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I was going to ask about that. Are modules not also structs where the keys are names and the values are either (sub)modules, relations or functions?

Shouldn't we therefore be consistent in the use of symbols and use [] possibly rather than {}? Your current proposal reads a lot more naturally ofc but curious about the consistency angle.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No, semantically modules are not equivalent to structs. They don't have a type, cannot be referred to as a value and have different parsing rules.

So I think that here it's ok to choose the syntax that looks best, instead of seeking consistency with structs.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(also moot if we use just files / #2129 (comment))

}
```

## File loading
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This design is heavily plagiarized from how Rust modules work. Thank god for opensource!

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think Rust is poor choice to draw inspiration from for a module system. In my opinion it is very confusing and one of the worst parts of Rust.
I think the module system should aim to be as simple and predictable as possible.
I think it would be better to look at PHP and it's include statement which is dead simple.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I tried to redesign it to come closer to PHP's include() or Common.js's require() and I've realized that this is a very close already.

The main distinctions are:

  • we have module hello declaration instead of a function include("hello.prql"),
  • we have more restrictions on importing that make the module structure more "normalized" (see goal 4),
  • these restrictions also allow cyclic references between modules,
  • child modules don't have to re-import modules that are already imported into parent modules:
    # parent.prql
    module child1
    module child2
    
    # child1.prql
    
    # no need to re-import child2, as it is already in the module structure
    let _ = (from child2.some_decl | ...)
    

Comment on lines 86 to 88
mod default_db {
# all inferred tables and defined CTEs
}
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

To be honest I don't remember why this module is needed. Maybe we could get away by having the tables on the top-level.

It would simplify the compiler a bit.

This comment was marked as outdated.

.gitignore Show resolved Hide resolved
Copy link
Member

@max-sixty max-sixty left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Overall looks great!


I found this quite insightful: https://matklad.github.io/2021/11/27/notes-on-module-system.html. It recommends some simplifications to the module system:

  • No mod {} — is it worth the complication, as opposed to just files?
  • No mod.rs — instead foo/foo.rs (it actually has _foo.rs, but why the leading underscore?) — I like this, it reduces the number of mod.rs files

Are you thinking that imports would be implicit? Or would that be a separate discussion?

When compilation is invoked compiler will search for the root of the module
structure of the project:

1. attempt to load `./project.prql`. If successful use it as the root,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What do you see as the difference between a project and a module? If we don't need a distinction, we can avoid having one at the moment, and just use mod for both...

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My intention was for project to represent the root of the module tree.

This may not be needed if we would also have super (a reference to parent module), which would be nice to have in any case.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does there need to be a "roof of the module"?
Shouldn't one module always be just one file?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree with @max-sixty about not having mod {} and mod.rs, but I think just foo.rs might be enough and not have it's own directory such as foo/foo.rs.
One module would always be just one file.

@aljazerzen
Copy link
Member Author

No mod {} — is it worth the complication, as opposed to just files?

Good idea, it does not bring much to the table. We can also add it later without breaking changes.

No mod.rs — instead foo/foo.rs (it actually has _foo.rs, but why the leading underscore?) — I like this, it reduces the number of mod.rs files

I also like this. The pain with mod.rs is searching because you have to look at folder names instead of just file name.

The underscore is here so it's first alphabetically.

Are you thinking that imports would be implicit? Or would that be a separate discussion?

(We can discuss here)

I'm not exactly sure what do you mean:

Implicit use foo.*. That is problematic because it will cause broken code just by adding a module. We do it now it std module and it's already causing bugs.

Implicit mod foo. I'd like to avoid automatic file discovery, because would allow attacks that place malicious mod.prql in common locations like Downloads/ and hope you try to compile a .prql file in same directory. It's not a huge risk now, but it would be backward incompatible to remove later.

@aljazerzen
Copy link
Member Author

Thinking about it I realize that similar attacks could be possible by injecting project.prql with current proposal.

Huh. I'll do another pass on the proposal.

@max-sixty
Copy link
Member

Implicit use foo.*. That is problematic because it will cause broken code just by adding a module. We do it now it std module and it's already causing bugs.

So are you thinking that we'd have use foo as a way of importing? (Currently I don't see this in the RFC — ofc no prob if we do it later)

The underscore is here so it's first alphabetically.

Hmmm, I don't love the underscore aesthetically, given it's the "main" file, but only aesthetically, low weight.

@aljazerzen
Copy link
Member Author

Implicit use foo.*. That is problematic because it will cause broken code just by adding a module. We do it now it std module and it's already causing bugs.

So are you thinking that we'd have use foo as a way of importing? (Currently I don't see this in the RFC — ofc no prob if we do it later)

We don't have to add it now, because you can use longer idents: foo.bar.baz. But yes, ultimately, use will be needed.

The underscore is here so it's first alphabetically.

Hmmm, I don't love the underscore aesthetically, given it's the "main" file, but only aesthetically, low weight.

I agree, it doesn't look nice. We could do without it, but that:

  • makes it hard to find the "parent" file in a directory,
  • has a conflict when you want to have modules foo and foo.foo (because both want to have a file at foo/foo.prql). I'd be fine with forbidding that all together.

@max-sixty
Copy link
Member

has a conflict when you want to have modules foo and foo.foo (because both want to have a file at foo/foo.prql). I'd be fine with forbidding that all together.

lol, this is technically possible, it just requires doing foo/foo/foo.prql!


We don't have to add it now, because you can use longer idents: foo.bar.baz. But yes, ultimately, use will be needed.

OK, so we do have implicit imports — i.e. someone can call foo.bar without having a use foo statement?

Copy link
Member

@snth snth left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Overall I found this a bit less clear than the other RFC and feel that it could be fleshed out a bit more. I had to read a few sections two or three times to try and ensure that I understood it correctly.

web/book/src/internals/modules.md Outdated Show resolved Hide resolved
web/book/src/internals/modules.md Outdated Show resolved Hide resolved
```
mod my_module {
let bangers = (from tracks | take 10)
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I was going to ask about that. Are modules not also structs where the keys are names and the values are either (sub)modules, relations or functions?

Shouldn't we therefore be consistent in the use of symbols and use [] possibly rather than {}? Your current proposal reads a lot more naturally ofc but curious about the consistency angle.

@vanillajonathan
Copy link
Collaborator

The documention doesn't mention how to load a module. I expected to find a include "foo.prql", import Foo, using AcmeCorp.Foo, or use Foo statement.

@vanillajonathan
Copy link
Collaborator

I agree, it doesn't look nice. We could do without it, but that:

  • makes it hard to find the "parent" file in a directory,
  • has a conflict when you want to have modules foo and foo.foo (because both want to have a file at foo/foo.prql). I'd be fine with forbidding that all together.

Do we even really needed modules in paths like that?
Maybe only foo should be allowed, not foo.foo with dot.

@aljazerzen
Copy link
Member Author

The documention doesn't mention how to load a module. I expected to find a include "foo.prql", import Foo, using AcmeCorp.Foo, or use Foo statement.

It does, but perhaps not clearly enough. You just use module foo and that will import foo.prql.

Do we even really needed modules in paths like that?
Maybe only foo should be allowed, not foo.foo with dot.

I don't think we can afford that. It would be a ruff arbitrary prohibition that may lead to IDE's overwriting your source files when you rename some module. And we (almost) run into such case in prql-compiler: we have a module with path tests::test. https://github.com/PRQL/prql/tree/main/prql-compiler/src/tests

Comment on lines 71 to 89
If an identifier cannot be resolved relative to the current module, it tries to
resolve relative to the parent module. This is repeated, stepping up the module
hierarchy until a match is found or root of the module structure is reached.

```prql_no_test
module my_playlists {
let decl_1 = ...

module soundtracks {
let decl_2 = ...
}

module upbeat_rock {
let decl_3 = ...

from decl_1 | join soundtracks.decl2 | join decl_3
}
}
```
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure about this.

Upsides:

  • the std module can be defined just in the root module and be accessible
    everywhere under name std,

Downsides:

  • we need to find the root of the module structure,

Copy link
Member

@max-sixty max-sixty Apr 8, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(edited after reading more)

When does this know when to stop going up the file tree?

Copy link
Member

@max-sixty max-sixty Apr 8, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There's some possible modules-as-inheritance approaches from the implicit shadowing.

But also possible some quite confusing error messages when a parent changes or adds a definition!

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

When does this know when to stop going up the file tree?

Well, at the root? So std can be defined in the root only.

But also possible some quite confusing error messages when a parent changes or adds a definition!

If we give declarations in child priority over declarations in parent, no change in parent will be able to overshadow the child declaration.

But it could happen that a module would be referring to a grand-parent module and a change in parent would break the child.

 grandparent -> decl
      parent -> we add decl here
      child  -> ref to decl now resolves to parent instead of grandparent

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

When does this know when to stop going up the file tree?

Well, at the root? So std can be defined in the root only.

This is moot if we have a file that can only be at the root, such as _project.prql or project.prql — we can stop there

But otherwise we need some way of knowing when to stop searching upwards, so we don't keep going to /. We could do that when we no longer find a match of $dir containing _$dir.prql...

I also think this "keep searching up" would be fine to push into a later version if it's not trivial to implement — very reasonable to adjust at implementation time. Though we'd need a way to reference a parent module, such as module ..foo.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh, this is a tad different thing I was talking about here.

Here I meant name resolution only, which happens after the module structure has been read from files, parsed and resolved.

When deciding what files to load, the compiler should get the path to the root module and then find references to other files which need to be loaded from there.

@aljazerzen
Copy link
Member Author

I've updated the RFC with all the suggestions. I hope I've also made it more readable @snth :D

Co-authored-by: Maximilian Roos <5635139+max-sixty@users.noreply.github.com>
}
```

## Example
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Re all the confusion above — sorry — the example is actually the most anchored thing to discuss from; I should have started here.

My main question is:

Why place the module foo declarations in the directory modules, rather than have import statements near the code that uses them?

  • It doesn't supply that much new information — just that we should use the file that's already there. We're replicating the file structure in declarations that are arguably duplicative
  • Instead, the import statement can just look up the file — same as python.
  • How do we discriminate between from foo.bar where foo is a DB schema vs. a module (assuming I eventually solve Resolve opaqueness of idents #1535)? If we have use foo in the file itself, it's explicit.

So by moving from "declarations in directory modules" to "import statements at the call site", we'd have something like:

# _project.prql

# We can have code here, but there's no need for `module` declarations
# This is identical to below
# employees.prql
let employees = (...)
let salaries = (...)
let departments = (...)
# sales/orders.prql
use ../employees  # or some variant of `../` that shows it's a level above

let current_year = (...)
let archived = (...)
let by_employee = (from orders | join employees.employees ...)

WDYT? I hope I'm making more sense now...

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree with that it should just directly load a file with the same name instead of look inside a subdirectory.

I am a bit skeptical against having imports from parent directories though (as you mention with ../). I lean towards that files should only be able to use other files in the same directory and child directories but not in parent directories.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I lean towards that files should only be able to use other files in the same directory and child directories but not in parent directories.

Surely this would be super restrictive — any sort of utils package wouldn't be possible, unless every file was in the same path? Are there any other languages that do this? (Or am I misunderstanding the point?)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why place the module foo declarations in the directory modules, rather than have import statements near the code that uses them?

Because with this proposal:

  • you only "import"/"declare" a module once and then it's in the module structure where you can use it simply by referencing that name, no use or module statements required,
  • you don't need a import ../foo (i.e. import parent) statement, because modules are imported/declared from root down (but you can still reference parent modules after they are imported/declared),
  • refactoring is simpler, since there is a single import/declaration of a module,

How do we discriminate between from foo.bar where foo is a DB schema vs. a module (assuming I eventually solve
Resolve opaqueness of idents #1535)? If we have use foo in the file itself, it's explicit.

Let's discuss that after this RFC is done. I have an idea how to do it on top of modules and declarations.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Surely this would be super restrictive — any sort of utils package wouldn't be possible, unless every file was in the same path? Are there any other languages that do this? (Or am I misunderstanding the point?)

C# and Java imports using namespaces so nothing can be from a parent directory. PHP lets you include from any file path including relative paths which may be parent directories. JavaScript (Node.js) lets you import from parent directories. I am not sure about Python, but I don't think it is common. I also worry about the security impact of importing from parent directories.

In the PHP world when using Composer it is common that imports from your own code is in your your_project/src/ directory, and third-party code from a your_project/vendor/ directory.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Because with this proposal:

* you only "import"/"declare" a module once and then it's in the module structure where you can use it simply by referencing that name, no `use` or `module` statements required,

* you don't need a `import ../foo` (i.e. import parent) statement, because modules are imported/declared from root down (but you can still reference parent modules after they are imported/declared),

The main issue I see is that the action is quite far from the consequence. When we're missing an item in python, it's fairly obvious where the item should be:

from ..utils import read_file

"Couldn't find read_file in utils" — OK great

But in this design, what do we say if something isn't found? There are so many places we might be missing a declaration. The user needs to understand the module system well enough to then go and check each parent in the hierarchy for the missing item.

* refactoring is simpler, since there is a single import/declaration of a module,

Indeed — I can imagine refactorings being quite complicated, because the compiler can no longer tell you "we can't find this item here". I'm fine pushing the conversation re "DB schema vs. a module", but this will further confuse things, since from foo.bar will also be legitimate SQL even when foo isn't available as a module.

Or are my fears overblown? I can definitely see that it's an elegant system when everything is ideal — but I can also imagine it being quite confusing with newer users or messier projects.


My secondary — much more minor — concern is that the module declarations aren't doing that much. IIUC they can only refer to a single file location. Is requiring them (relative to just finding them in the file tree) going to make things clearer?


To put this into the example, I would do something more like python, with a touch of rust, and do:

# _project.prql
# employees.prql
let employees = (...)
let salaries = (...)
let departments = (...)
# sales/_sales.prql
let revenue_by_source = (...)
# sales/orders.prql
use ..employees

let current_year = (...)
let archived = (...)
let by_employee = (from orders | join employees.employees ...)
# sales/projections.prql
use ..orders

let orders_2023 = (from orders.current_year | append orders.archived)
let orders_2024 = (...)
# util.prql
func pretty_print_num col -> (...)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's a fair point, declarations are far from where they are then used. In essence, the problem with my proposal is that "the module hierarchy" is a whole concept you have to know about as opposed to just knowing how to pull a declaration from one file into another.

I've purposefully made the module system repr in filesystem normalized, so we can have very good hints about where to put your module declarations:

select [sales.utils.my_func my_column]
              -----
                 \___ name utils does not exist
Hint: add `module utils` to `sales/_sales.prql`
module orders
       ------
          \__ cannot load sales/orders.prql or sales/orders/_orders.prql

My point is that knowing about "the module hierarchy" is not a requirement for anyone, because we can provide hints to guide users until it makes sense to them.


My fear is that it will end up having way-too much uses with the extreme case being:

# _project.prql
# employees.prql
use ..sales
use ..util

let employees = (...)
let salaries = (...)
let departments = (...)
use projections
use orders
use ..employees
use ..utils

# sales/_sales.prql
let revenue_by_source = (...)
# sales/orders.prql
use ..
use ..projections
use ...employees
use ...utils


let current_year = (...)
let archived = (...)
let by_employee = (from orders | join employees.employees ...)
# sales/projections.prql
use ..
use ..orders
use ...employees
use ...utils

let orders_2023 = (from orders.current_year | append orders.archived)
let orders_2024 = (...)
# util.prql
func pretty_print_num col -> (...)

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've purposefully made the module system repr in filesystem normalized, so we can have very good hints about where to put your module declarations:

select [sales.utils.my_func my_column]
              -----
                 \___ name utils does not exist
Hint: add `module utils` to `sales/_sales.prql`

I agree this is a very good hint. But given how we would search upwards for a module, if sales isn't found — i.e. a "root" item, there could be lots of places that it could be.

One option would be to do completely automatic imports — no need for module, no need for use. I might even prefer that to only having module — which seems to ask for more than gives, at least until we had non-file modules.

My synthesis after thinking about this is:

  • Just using a file structure is simpler than offering non-file modules (we agree)
  • use statements are explicit, reduce confusion, not that costly
  • Given the language goals & users, we should favor simplicity. And imports are often pits of confusion.
  • I do think your example of the hint is quite good, but I'm still worried about the ambiguity because of the hierarchy search, and whether the hint is so good that the language should infer it itself

WDYT?

Any thoughts from others? @eitsupi & @snth ? (Note — please read the comparative examples carefully, don't make the same mistake I did...)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nobody wants to weigh-in here :(

I do think your example of the hint is quite good, but I'm still worried about the ambiguity because of the hierarchy search, and whether the hint is so good that the language should infer it itself

Good point, it may be more confusing than helpful.

One option would be to do completely automatic imports — no need for module, no need for use. I might even prefer that to only having module — which seems to ask for more than gives, at least until we had non-file modules.

That's probably the safest way to go here. It's also the easiest to implement, so would't be wasting much work if we decide to change later.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nobody wants to weigh-in here :(

Yeah, one issue with these big discussions is that it's more difficult for folks to weigh in on some smaller but important issues — there are 114 comments — and it's difficult to know how much context is required for weighing in. I don't know a great way around it...

That's probably the safest way to go here. It's also the easiest to implement, so would't be wasting much work if we decide to change later.

Great, very much agree.

I also think that approach is the best way of getting more feedback — merge this with a path forward, and once we start building, folks will raise anything sufficient important...

aljazerzen and others added 2 commits April 12, 2023 09:06
Co-authored-by: Maximilian Roos <5635139+max-sixty@users.noreply.github.com>
}
```

## Example
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@vanillajonathan starting a new thread for the "import parent" discussion:

Surely this would be super restrictive — any sort of utils package wouldn't be possible, unless every file was in the same path? Are there any other languages that do this? (Or am I misunderstanding the point?)

C# and Java imports using namespaces so nothing can be from a parent directory. PHP lets you include from any file path including relative paths which may be parent directories. JavaScript (Node.js) lets you import from parent directories. I am not sure about Python, but I don't think it is common. I also worry about the security impact of importing from parent directories.

In the PHP world when using Composer it is common that imports from your own code is in your your_project/src/ directory, and third-party code from a your_project/vendor/ directory.

Ah, so I think that's where the misunderstanding is — I agree it might be dangerous to allow importing from a parent outside of the package.

Python allows for importing parents within the package — e.g. from ... import utils, because the path is relative to the current file. IIUC Java allows for import utils in a file src/foo/bar/baz.java from a file src/utils.java, because the path is relative to the package root.

So given this, I agree possibly we should downweight being able to import from a parent outside the package, which might require a definition of what a package is. (Though I'm not sure, let's see how the other discussions shake out; possibly it's a nice feature to be less tethered to the concept of a package)

```

## Example

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@max-sixty Another thread for "Opaqueness of idents".


(Bear with me, I need to explain all this to so I can explain how I'd change it)

Right now, when we infer that a table exists, we add its declaration to module default_db (this declaration is updated when we discover new columns).

Using a made-up syntax:

module default_db {
    table foo = ...
}

If a table name is segmented (i.e. my_db.my_schema.foo), we infer this:

module default_db {
    table `my_db.my_schema.foo` = ...
}

... essentially, the table name contains the dots, which is then problematic when actual dots are inserted into the name segments (i.e. my_db.`my.schema`.foo).


Instead, I propose we infer this:

module default_db {
    module my_db {
        module my_schema {
            table foo = ...
        }
    }
}

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, very much agree. that's what I'd been trying to do in my branch for #1535 — when it finds a table type of my_schema.foo, it would create a module my_schema in the root (though should actually be in default_db, good point).

My branch is super-messy, but maybe I push it and you can criticize / guide it towards something helpful. I haven't worked on it in a couple of weeks now.

This comment was marked as off-topic.

This comment was marked as off-topic.

This comment was marked as off-topic.

Comment on lines 93 to 94
If the final declaration in a module is a variable declaration that is named
`main`, then the leading `let main = ` can be omitted and expressed only by the
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looking at the example — should this instead be "The final statement in a module is implicitly named main"? In the example there is no variable declaration named main...

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That may be written poorly, I wanted to say that:

  • if the last declaration in a module is a variable declaration,
  • and it is named main,
  • it may omit the leading let main = .

It can also be named something else, but it needs to have the let x = in this case.

Another way to say it, would be that:

Last unnamed pipeline in a module is implicitly named `main` 

... but this implies that we can have unnamed pipelines elsewhere in a module.

(ps: I've been using term "declaration" as a cover term for variable, function and type declarations. We can switch that to "statement" if you feel it is more appropriate or future-proof)

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

  • if the last declaration in a module is a variable declaration,

    • and it is named main,

    • it may omit the leading let main = .

I think this is the sticking point... it seems like the conclusion is required for the premise, since it can't be named main if it omits the leading let main =...
(sorry if I'm misunderstanding — my guess is that this is actually a small point of writing rather than some fundamental disagreement)

Last unnamed pipeline in a module is implicitly named `main` 

... but this implies that we can have unnamed pipelines elsewhere in a module.

How about a "An unnamed pipeline as the module's final declaration is implicitly named main" ?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh, I see the problem! In my mind, the last pipeline remains named main even if you remove the leading let main =. That may be confusing, yes.

How about a "An unnamed pipeline as the module's final declaration is implicitly named main" ?

That is still implying there are unnamed pipelines.

Last variable declaration in a module can omit the leading let main = and acquire an implicit name main.

(yes this appears to be a minor point in writing)

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Last variable declaration in a module can omit the leading let main = and acquire an implicit name main.

Perfect

(nitty: I'd use "final" rather than "last" as it's less ambiguous, though here it's fairly clear given the "in a module" modifier)

@aljazerzen
Copy link
Member Author

I think that we have a consensus on the outline for modules. There are things left to discuss, but we can get to them in separate issues.

@aljazerzen aljazerzen enabled auto-merge (squash) April 20, 2023 13:53
@aljazerzen aljazerzen merged commit 31aa015 into main Apr 20, 2023
@aljazerzen aljazerzen deleted the rfc-modules branch April 20, 2023 13:58
@max-sixty
Copy link
Member

Great work!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
language-design Changes to PRQL-the-language
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants