-
Notifications
You must be signed in to change notification settings - Fork 67
Commit
This commit does not belong to any branch on this repository, and may belong to a fork outside of the repository.
EEP 71: clarification of type specs & generics
- Loading branch information
1 parent
68a0a1b
commit b22417e
Showing
1 changed file
with
152 additions
and
0 deletions.
There are no files selected for viewing
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,152 @@ | ||
Author: John Högberg <john(at)erlang(dot)org>, Ilya Klyuchnikov <ilya(dot)klyuchnikov(at)gmail(dot)com> | ||
Status: Draft | ||
Type: Standards Track | ||
Created: 7-Aug-2024 | ||
Erlang-Version: OTP-28 | ||
Post-History: | ||
**** | ||
EEP 71: Clarify the type documentation, especially regarding type variables | ||
---- | ||
|
||
Abstract | ||
======== | ||
|
||
Type variables are an often-used yet very misunderstood part of the type | ||
language. As `dialyzer` has been lax in enforcing their defined semantics owing | ||
to a few bugs, many users have been led to believe that they are simple generics | ||
as in most other languages. As a significant amount of code has been written | ||
with this in mind, OTP cannot fix the bugs that made `dialyzer` lax by accident | ||
without breaking the analysis of a lot of code. | ||
|
||
While that would be fine if `dialyzer` were the only tool out there, all the | ||
other code analyzers we know of interpret type variables as generics which would | ||
fragment the ecosystem should we elect to fix the bugs. One application might | ||
have been written with `eqWAlizer` in mind and another with `dialyzer`, | ||
preventing either tool from being applied to the whole release. | ||
|
||
As `dialyzer` is simply broken in this regard, this EEP aims to solve the | ||
problem by changing the definition of type variables from _equality constraints_ | ||
to _generics_ (parametric polymorphism). We also propose several other changes | ||
to make the type language more useful and less ambiguous. | ||
|
||
Rationale | ||
========= | ||
|
||
As currently documented, variables in `-spec`s and `-type`s act just like in | ||
ordinary code. That is, if the variable `V` appears several times it has the | ||
same _value_ in all places. In a language like C++ or Java where said value is a | ||
type, this works pretty well: | ||
|
||
template<typename V> | ||
V foo(V a, V b) { | ||
return a + b; | ||
} | ||
|
||
For example, `foo(12, 34)` works because both arguments have the same type `int` | ||
(yielding `int` for `T`), but `foo(12, 34.0)` doesn't because `int` is not | ||
`double`. | ||
|
||
However, as the documentation for the Erlang type language says that repeated | ||
variables refer to the same _value at runtime_, tools like `gradualizer` and | ||
`eqWAlizer` are not free to treat them as generics. For example: | ||
|
||
-spec my_add(T, T) -> T. | ||
my_add(A, B) -> A + B. | ||
|
||
Here, the only possible values of `T` that will work are the various | ||
representations of zero: `foo(12, 34)` will fail because `12` is not `34`, and | ||
`foo(1, 1)` will fail because the result `2` is not `1`. | ||
|
||
This makes it impossible to declare that a function takes a value of a certain | ||
type and returns something of the same type (but not necessarily _same value_). | ||
This makes sense in `dialyzer` as values and types are the same thing in its | ||
paradigm[^1], but restricts what other type checkers can do; while the | ||
documentation states that the "extra information" given by type variables may be | ||
ignored, it leaves no room for other interpretations when they are taken into | ||
account. | ||
|
||
[^1]: In an untyped language we cannot simply say that the type of `1` is | ||
`integer()` since `1 | foo` is just as much of a superset of `1` as | ||
`integer()` is. The type of `1` is simply `1`. | ||
|
||
Changes | ||
---------------- | ||
|
||
To solve this issue and make the type language less surprising, we propose the | ||
following changes: | ||
|
||
1. Redefine `::` as alias instead of "subtype of." | ||
|
||
All currently known tools implement this interpretation[^2]. Should bounded | ||
quantification be needed in the future, the "subtype of" operator can be | ||
introduced later under the more widely used `<:` or `=<` syntaxes. | ||
|
||
Multiple aliases under the same name are rejected, making the following illegal: | ||
|
||
-spec multiple(X :: integer(), X :: integer()) -> atom(). | ||
-spec when_multiple(X) -> atom() when X :: integer(), X :: number(). | ||
|
||
To prevent breaking existing code as a result of this change, the compiler | ||
will _not_ enforce the above, leaving that to the type checking tools (or | ||
optionally a compiler flag). | ||
|
||
Check failure on line 92 in eeps/eep-0071.md GitHub Actions / markdownlintTrailing spaces
Check failure on line 92 in eeps/eep-0071.md GitHub Actions / markdownlintTrailing spaces
|
||
2. Normalize types in the parser, substituting aliases and treating "annotated | ||
Check failure on line 93 in eeps/eep-0071.md GitHub Actions / markdownlintSpaces after list markers
Check failure on line 93 in eeps/eep-0071.md GitHub Actions / markdownlintSpaces after list markers
|
||
types" the same as `when`. This makes the following signatures equivalent: | ||
|
||
-spec xyzzy(A) -> term() when A :: number(). | ||
-spec waldo(B :: number()) -> term(). | ||
-spec fred(number()) -> term(). | ||
|
||
Where `B :: number()` is mere shorthand for `B ... when B :: number()`, and | ||
all variants declare that the first argument is a `number()`. | ||
|
||
3. Treat type variables as generics rather than as equality constraints. | ||
|
||
Example: | ||
|
||
The current `lists:append` type spec is: | ||
|
||
-spec append(List1, List2) -> List3 when List1 :: [T], List2 :: [T], List3 :: [T], T :: term(). | ||
|
||
With the defined semantics, `T :: term()` makes the `append` function to | ||
refer to a top type. Instead, the following is preferred and more in line | ||
with current type systems: | ||
|
||
-spec append(List1, List2) -> List3 when List1 :: [T], List2 :: [U], List3 :: [T | U]. | ||
|
||
4. Have function contracts (`-spec`) describe argument types as they go in. | ||
|
||
A very surprising aspect of function contracts is that they describe the | ||
types of the arguments _when the function returns successfully_, and do not | ||
affect argument types _within_ the function. For example: | ||
|
||
-spec foo(integer()) -> integer(). | ||
foo(X) -> | ||
bar(X), %% X is not necessarily an integer here! | ||
X + 1. | ||
|
||
Most people take the above to mean that the type passed on to `bar/1` is | ||
`integer()`, yet it can be whatever is passed to `foo/1` including values | ||
that cause `foo/1` to fail. This often hides the fact that there is | ||
unreachable code in callees such as `bar/1` here, and sometimes results in | ||
cryptic warnings describing something the user believes could never happen. | ||
|
||
This also makes it much easier for a user to see what the type of a variable | ||
is at any point: just start with what it says in the `-spec`. At present, | ||
the type of `X` depends entirely on how `X` got there which is not always | ||
trivial to figure out. | ||
|
||
[^2]: `dialyzer` also does this as a consequence of the bugs mentioned in the abstract. | ||
|
||
Reference Implementation | ||
------------------------ | ||
|
||
We will make a PR with changes in the documentation to reflect | ||
above once it is accepted | ||
|
||
Copyright | ||
========= | ||
|
||
This document is placed in the public domain or under the CC0-1.0-Universal | ||
license, whichever is more permissive. | ||
|