Language proposal 0.1 #33

Closed
LPeter1997 opened this issue Feb 23, 2022 · 12 comments
Labels
Design document This one came out from an idea but considers many cases and tries to prove the usability

@LPeter1997
Member

LPeter1997 commented Feb 23, 2022

Edit: At the suggestion of @WhiteBlackGoose, we are reducing the scope of this proposal even further; notably, we are taking out modules and visibility. For now we assume a single-file system.

This is a proposal, not a specification. Changes can still happen to almost any extent, and constructive discussion is encouraged.

Goal of the document

The goal of this document is to establish features for the first prototype compiler. The reason to make such an early design document is to end the era of "design by random cloud of ideas popping up" and actually start going and expanding in a given direction. I believe this will induce more directed and more fruitful discussions down the line.

Scope of the features

The feature set does not have to be big, but it has to be stable enough not to change too much in the future. The features listed here are not the full planned feature set of the language; they just serve as a bare-bones skeleton to fill in with further proposals.

The proposal implicitly defines the initial syntax.

A note on semicolons

I know that some are not huge fans of semicolon terminators. For now, we use them in the syntax, because they make many grammatical rules simpler. This doesn't mean that the final language will have to use semicolons; there are languages that got rid of them down the line without any breaking change or issue.

Primitive types

The following primitive types are supported:

  • int, which is equivalent to int32
  • uint, which is equivalent to uint32
  • int8
  • uint8
  • int16
  • uint16
  • int32
  • uint32
  • int64
  • uint64
  • bool
  • float32
  • float64
  • unit, which is roughly equivalent to the void type in C#, but is a true type in the type system, not a marker for no-return (meaning that you can for example use it as a generic parameter, or create a variable of type unit)

The naming of these types gets rid of the C heritage, which is very inconsistent among the C family. The explicit sizes mean we don't have to look up docs to know integer sizes. The convenience aliases int and uint are there for "casual" use, when the explicit size is mostly irrelevant for the developer.
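As a rough illustration of what the explicit sizes pin down, the following Python sketch computes the value range of each proposed integer type. The `INTEGER_TYPES` table and the `int_range` helper are hypothetical names made up for this example, and two's-complement representation is assumed; none of this is part of the proposal itself.

```python
# Hypothetical table: proposed type name -> (bit width, is signed).
INTEGER_TYPES = {
    "int8": (8, True), "uint8": (8, False),
    "int16": (16, True), "uint16": (16, False),
    "int32": (32, True), "uint32": (32, False),
    "int64": (64, True), "uint64": (64, False),
}
# The convenience aliases resolve to the 32-bit types.
INTEGER_TYPES["int"] = INTEGER_TYPES["int32"]
INTEGER_TYPES["uint"] = INTEGER_TYPES["uint32"]


def int_range(type_name):
    """Return the inclusive (min, max) range, assuming two's complement."""
    bits, signed = INTEGER_TYPES[type_name]
    if signed:
        return -(2 ** (bits - 1)), 2 ** (bits - 1) - 1
    return 0, 2 ** bits - 1


for name in ("int", "uint", "int8", "uint64"):
    print(name, int_range(name))
```

Because int and uint are plain aliases in this sketch, `int_range("int")` and `int_range("int32")` give the same answer.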

Comments

Single line comments are supported with the usual starting sequence //, ending at the end of the line.

Note: Originally, comments right above a construct were considered documentation comments. For now we have removed that, as we started to feel that this could cause an "accidental" information/doc leak. Documentation comments will be added later and will have a different syntax (probably something like ///).

Functions

The language supports free functions defined at the top level. The general syntax is func <name>(<arg1>: <type1>, <arg2>: <type2>, ...): <return type> { ... }, for example:

func fib(n: int): int {
    // ...
}

For one-liner functions returning a single expression, the syntax can be shortened:

func times_two(n: int): int = n * 2;

Functions can return a value using the conventional return statement. Since blocks can return a value (see later), the following is valid and is very similar to Rust implicit returns:

func foo() = {
    bar();
    1 + 2
};

We have decided that for now this will suffice; we can make function blocks do implicit returns later, if we decide to.

Operators and precedence

The following is the precedence table for the supported operators:

| Operator | Description | Associativity | Notes |
|---|---|---|---|
| expr(args...), expr[indices...] | Function call, Indexing | - | |
| expr.member | Member access | Left-to-right | |
| +expr, -expr | Positive, Negative | - | |
| expr * expr, expr / expr, expr mod expr, expr rem expr | Multiplication, Division, Modulo, Remainder | Left-to-right | Hopefully the keywords instead of the made up % helps disambiguate and avoid bikeshedding syntax arguments in the future. |
| expr + expr, expr - expr | Addition, Subtraction | Left-to-right | |
| expr in expr, expr not in expr, expr < expr, expr > expr, expr <= expr, expr >= expr, expr == expr, expr != expr | Containment, Does not contain, Less-than, Greater-than, Less or equal, Greater or equal, Equals, Not equals | Left-to-right | These operators can be chained arbitrarily, like in Python. x < y >= z in foo is equivalent to x < y and y >= z and z in foo, all expressions evaluated at most once, short-circuiting on the first falsy value. The elements in the chain can not be parenthesized, (x < y) == (y < x) is not equivalent to x < y == y < x! |
| not | Logical not | | The placement has changed from the usual C-way. |
| and | Logical and | Left-to-right | |
| or | Logical or | Left-to-right | |
| =, @= | Assignment, Any compound assignment | Right-to-left | In @= the @ stands for the usual symbols allowed for compound assignment. |

TODO: Missing all bitwise and shift operators.
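The table lists both mod and rem, but the proposal does not spell out how they differ. One common convention, assumed here purely for illustration, is that rem truncates toward zero (the result takes the sign of the dividend, like C#'s %), while mod floors (the result takes the sign of the divisor, like Python's %). Under that assumption, a minimal Python sketch of the two looks like this; the function names simply mirror the proposed keywords:

```python
def mod(a, b):
    """Floored modulo: the result has the sign of the divisor (Python's %)."""
    return a % b


def rem(a, b):
    """Truncated remainder: the result has the sign of the dividend."""
    r = abs(a) % abs(b)
    return -r if a < 0 else r


# The two only disagree when the operands have different signs.
for a, b in [(7, 3), (-7, 3), (7, -3), (-7, -3)]:
    print(a, b, mod(a, b), rem(a, b))
```

If the language ends up picking different definitions (for example Euclidean modulo), the sketch would need to change accordingly.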

A small addition to the relational operators is that (x < y) == (z < w) would evaluate as "is the result of x < y the same as z < w", while x < y == z < w is "is x < y and y == z and z < w".
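Since the chaining is modeled on Python, the behavior can be tried out there directly. The sketch below uses a hypothetical `operand` helper to record evaluation order; it shows that each operand is evaluated at most once, that the chain short-circuits, and that parenthesizing changes the meaning.

```python
evaluated = []


def operand(name, value):
    """Record that an operand was evaluated, then return its value."""
    evaluated.append(name)
    return value


# Chained: means x < y and y == z and z < w, with y and z evaluated once each.
chained = operand("x", 1) < operand("y", 2) == operand("z", 2) < operand("w", 3)
chained_order = list(evaluated)

# Short-circuit: x < y is already false, so z is never evaluated.
evaluated.clear()
short = operand("x", 5) < operand("y", 2) == operand("z", 2)
short_order = list(evaluated)

# Parenthesizing changes the meaning: two boolean results are compared.
x, y, z, w = 3, 2, 2, 1
as_chain = x < y == z < w          # x < y fails, so the chain is False
as_bools = (x < y) == (z < w)      # False == False, so this is True

print(chained, chained_order)
print(short, short_order)
print(as_chain, as_bools)
```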

in and not in would translate to a .Contains or .ContainsKey call, depending on the argument type.
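Python resolves its own in operator in a comparable way, which may help picture the translation: `in` calls the container's `__contains__` method, and for a dictionary it tests the keys, much like ContainsKey. The `Range` class below is a hypothetical type invented for this sketch; how the proposed language would pick between .Contains and .ContainsKey is not specified here.

```python
class Range:
    """Hypothetical half-open integer range [start, end)."""

    def __init__(self, start, end):
        self.start = start
        self.end = end

    def __contains__(self, value):
        # Invoked by `value in self` and, negated, by `value not in self`.
        return self.start <= value < self.end


digits = Range(0, 10)
ages = {"alice": 30}

print(5 in digits)        # dispatches to Range.__contains__
print(12 not in digits)   # negation of the same call
print("alice" in ages)    # for a dict, `in` tests the keys
print(30 in ages)         # values are not searched
```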

The allowed list of compound operators: +, -, *, /.

The precedence of the not operator has changed compared to what C does, for example. Note that the relative precedence of the logical operators has not changed: not still has the highest precedence among and, or and not. The rationale for this change is that since not isn't sticky like ! anymore, we might as well put it alongside the rest of the logical operators, and simplify expressions like !(start <= point && point < end) - a bounds check - to be something like not start <= point < end.
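Python places not at the same spot (below the comparisons, above and and or), so the bounds check from the paragraph above can be written there without parentheses. The function name `out_of_bounds` is made up for this sketch.

```python
def out_of_bounds(start, point, end):
    # Parsed as not (start <= point < end): `not` binds looser than comparisons.
    return not start <= point < end


print(out_of_bounds(0, 5, 10))    # inside the half-open range
print(out_of_bounds(0, 10, 10))   # the end itself is excluded
print(out_of_bounds(0, -1, 10))   # below the start
```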

Scoping rules

Lexical scoping is followed, meaning that variables defined in a scope will be visible only in that scope, or scopes nested inside that scope. By #13, arbitrary variable shadowing is allowed, meaning that function-local variables can shadow each other, within the rules of lexical scoping.

Control flow structures

By default all control-flow structures are expressions, meaning they return a value. The most basic such structure is a nested block. A block returns the last expression in it that is not terminated by a semicolon. If there's no such expression, the return value is unit.

// Evaluates to 3
{
    foo();
    bar();
    1 + 2
}

// Evaluates to unit
{
    foo();
    bar();
}

The two branches of the if-else expression have to evaluate to the same type, and the condition has to be of type bool. If the else branch is missing, an empty one is assumed, returning the unit type.

if (foo()) bar() else baz()
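For a feel of if-else used as an expression, Python's conditional expression is a loose analogue. In the sketch below, `describe` and `log_if` are hypothetical helpers, and Python's None stands in for the proposal's unit value; that mapping is an assumption of this example, not something the proposal states.

```python
def describe(n):
    # The whole conditional is an expression, so it can be returned directly.
    return "even" if n % 2 == 0 else "odd"


def log_if(flag, sink):
    # Analogue of a missing else branch: the other arm yields None,
    # standing in for unit. list.append also returns None, so both
    # arms have the same "type", as the proposal requires.
    return sink.append("logged") if flag else None


messages = []
print(describe(4), describe(7))
print(log_if(True, messages), log_if(False, messages), messages)
```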

While loops always return unit type.

while (foo()) bar()

Variables

The variable declaration syntax is var|val <name>: <type> = <value>;. The keyword var defines a mutable variable, and val defines an immutable one. For var, both the type specification part : <type> and the value assignment = <value> are optional; for val, the value assignment is required. This gives 4 possibilities:

  • Type specified, value specified: The specified value's type has to be assignable to the specified type
  • Only type specified: No extra checks
  • Only value specified: The type is immediately inferred from the specified value
  • Nothing specified: The type will be inferred from the first use.
@LPeter1997 LPeter1997 added the Design document This one came out from an idea but considers many cases and tries to prove the usability label Feb 23, 2022
@WhiteBlackGoose
Member

Module/namespace being inherited from path is a good idea, but I think there's no need to add the upper-casing of the first letter. So aaa/bbb/ccc.fr is aaa.bbb.ccc, and Aaa/Bbb/Ccc.fr is Aaa.Bbb.Ccc.fr. My point is that it's not needed to upper-case the first letter artificially, imo.

@LPeter1997
Member Author

LPeter1997 commented Feb 23, 2022

> Module/namespace being inherited from path is a good idea, but I think there's no need to add the upper-casing of the first letter. So aaa/bbb/ccc.fr is aaa.bbb.ccc, and Aaa/Bbb/Ccc.fr is Aaa.Bbb.Ccc.fr. My point is that it's not needed to upper-case the first letter artificially, imo.

@WhiteBlackGoose Fair point, it feels less magic. Edited the example to emphasize this.

@Happypig375

  • not belongs with unary + and -
  • missing decimal/intn/uintn primitives
  • # instead of // for comments
  • // for explicit integer division while keeping / as float division
  • ^ as the exponentiation operator (both int and float allowed)
  • import a single function/type inside a module
  • =/= instead of !=, as ! should be used for something other than not

@WhiteBlackGoose
Member

/ should be consistent with the rest of .NET. Apparently integer division is used very often, so there is no need to prioritize floating point division over it.

About decimal and native integers - it's a very initial essential set of features, it's far from a finished one.

About ^ - absolutely agree, @Happypig375. But please, open an issue for it, so that it could be tracked.

@LPeter1997
Member Author

LPeter1997 commented Feb 23, 2022

About / I'll agree with @WhiteBlackGoose, we will not change that.

We could introduce exponentiation as a right associative ^ later, please open an issue about that.

I will extend import later today, it's a fair point that we want to be selective with that. Edit: For now we are not sure if we need partial imports.

I'm open to discuss the != operator syntax, but I'm not sure how we could make a good decision on that.

About the remaining primitive types, please open an issue; they can be part of a later proposal.

@Happypig375

Also

func foo() { // Consider me as documentation for foo() as well
    ...
}

@LPeter1997
Member Author

@Happypig375 That raises the question of whether we ever want to suggest writing documentation comments there. I see no reason for that; I usually see TODOs and FIXMEs written there.

@Happypig375

But that's also consistent with comments in relation to statements e.g. i += 1 // Add 1 to I

@LPeter1997
Member Author

Statements have no documentation. Only elements that are publicly exposed in an API (functions, types, modules, globals) do.

@LPeter1997 LPeter1997 pinned this issue Mar 4, 2022
@LPeter1997 LPeter1997 changed the title [WIP] Language proposal 0.1 Language proposal 0.1 Mar 9, 2022
@WhiteBlackGoose WhiteBlackGoose added this to the 0.1 milestone Jun 2, 2022
@hez2010

hez2010 commented Jun 4, 2022

I would recommend including the newly introduced types float16 (System.Half), int128 (System.Int128) and uint128 (System.UInt128),
as well as nint (System.IntPtr) and nuint (System.UIntPtr) for native interop.

@WhiteBlackGoose
Member

@hez2010 that's a good suggestion. But for 0.1 we want the very very basic features. It's not even a release, it's more of an internal milestone.

@LPeter1997
Member Author

@hez2010 Great suggestion, I agree that we will need these types eventually. We likely won't include them in the 0.1 proposal, but they will be part of the first language release. Feel free to open an issue so we can keep track of it.


4 participants