Design proposal for initialization. #5142

Merged
merged 14 commits into from
Sep 27, 2024


csyonghe
Collaborator

@csyonghe csyonghe commented Sep 23, 2024

Closes #5149.

@kaizhangNV
Contributor

kaizhangNV commented Sep 24, 2024

The proposal file does not have a *.md extension, so GitHub cannot render it as a formatted doc.
Can you update that?

Background
----------

Slang has introduced several different syntax around initialization to provide syntactic compatibility with HLSL/C++. As the language evolve, there aree many corners where
Contributor

@kaizhangNV kaizhangNV Sep 24, 2024

typo: aree -> are

Slang has introduced several different syntax around initialization to provide syntactic compatibility with HLSL/C++. As the language evolve, there aree many corners where
the semantics around initialization are not well-defined, and causing confusion or leading to surprising behaviors.

This proposal attempts to provide a design on where we want to language to be in turns of how initialization is handled in all different places.
Contributor

@kaizhangNV kaizhangNV Sep 24, 2024

want to language -> want the language

Contributor

"in turns of" ?

is "in terms of"?

int x = int();

struct S { int x; int y; }
S s; // s will be default initialized to {0, 0} because `S` is default-initializable.
Contributor

@kaizhangNV kaizhangNV Sep 24, 2024

Can we add a few sentences to make it clear that

S s;
is equivalent to
S s = S();

And I have a question:
Is S s = S() equivalent to S s = {}?

Contributor

Never mind, I see you covered this question later.

## Automatic Synthesis of Default-Initializer

If a `struct` type is determined to be default-initializable but a default constructor isn't explicitly provided by the user, the Slang compiler should
synthesize such a constructor for the type. The synthesis logic should be recursively invoke defualt initialization on all members.
Contributor

typo: defualt -> default.

Suggestion:
"defualt initialization" => "default initializer"

```csharp
S obj = {};
// equivalent to:
S obj = S();
Contributor

Given this, I think the PR makes a mistake, because it treats these two differently.

That is why both $init(This) and $defaultInit(This) functions are synthesized.

- It is a sized-array type where the element type is default-initializable.
- It is a tuple type where all element types are default-initializable.
- It is an `Optional<T>` type for any `T`.

Contributor

How about a type used inside a shader resource declaration, e.g.:

cbuffer Uniforms
{
default-initializable-type a;
}

Is `a` still considered default-initializable? I think it is.

Collaborator Author

It is considered default-initializable, although that has no practical meaning because the variable lives in a read-only location.

S obj = S();
```

If the above code passes type check, then it will be used as the way to initialize `obj`.
Collaborator

Wouldn't it be neater from a typing point of view to just synthesize a constructor which performs this? Then we wouldn't need to special-case any initializer list application, and it would always resolve to a constructor call.

Collaborator

@expipiplus1 expipiplus1 left a comment

This doesn't mention out parameters, which at the moment are at risk of being uninitialized at their declaration site. I would be all for requiring such parameters to be default initializable (and making it easier to return multiple values from a function, to cover situations where this isn't possible).

@kaizhangNV
Contributor

This doesn't mention static variables in a class. Are they default-initializable?

@csyonghe
Collaborator Author

Static variables are a tricky case because there is currently no efficient way to initialize them on D3D and Vulkan. So the answer for now has to be that they won't be default initialized.

@csyonghe
Collaborator Author

I think for out parameters, we also don't want to default initialize them, because an out parameter must come from a location that is either already default initialized because it is a var, or can't be efficiently default initialized because it is a static global var.

@tangent-vector
Contributor

To follow up on the discussion of out parameters: the existence of out in Slang should be another factor pushing us to have a model where the semantic rule is not that it is an error to fail to initialize a variable at its declaration point, and instead should have a rule that it is an error to use a variable at a point where it might not be (fully) initialized.

We should ideally treat use of a variable (or field of a struct etc.) as an argument for an out parameter as equivalent to an assignment for the purposes of deciding whether a variable has been initialized, and where.

@tangent-vector
Contributor

If we do make the rules be that the initialization point of a variable need not be its declaration, that would also allow us to have variables declared with let be initialized via assignment or being used as an out argument:

let x : Int; // okay: just not initialized yet
if(something) x = 1; else x = 2;

This is only a slight extension of the rules needed for the var case:

  • For any read of a variable, the variable must be fully initialized along every control-flow path that reaches that read

  • For any write of a let variable, the variable must be fully uninitialized along every control-flow path that reaches that write
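The two rules above amount to a small forward dataflow analysis. The following is a minimal sketch of that idea in C++ (not Slang, and not taken from the Slang compiler). `Op`, `Block`, `State`, `checkInitialization`, and `letAssignedInBothBranches` are hypothetical names introduced only for illustration, and the sketch tracks a single variable as a whole rather than per-field initialization.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical mini-IR: each basic block lists reads/writes of ONE variable.
enum class Op { Read, Write };

struct Block {
    std::vector<Op> ops;
    std::vector<std::size_t> preds; // indices of predecessor blocks
};

// What may be true of the variable on at least one path reaching a point.
struct State {
    bool maybeInit;
    bool maybeUninit;
};

// Assumption: block 0 is the function entry, where the variable is uninitialized.
std::vector<std::string> checkInitialization(const std::vector<Block>& cfg, bool isLet) {
    std::vector<State> out(cfg.size(), State{false, false});

    // Entry state of a block: union of the exit states of its predecessors.
    auto entryState = [&](std::size_t b) {
        State s{false, b == 0};
        for (std::size_t p : cfg[b].preds) {
            s.maybeInit = s.maybeInit || out[p].maybeInit;
            s.maybeUninit = s.maybeUninit || out[p].maybeUninit;
        }
        return s;
    };

    // Iterate to a fixed point over the block exit states.
    for (bool changed = true; changed;) {
        changed = false;
        for (std::size_t b = 0; b < cfg.size(); ++b) {
            State s = entryState(b);
            for (Op op : cfg[b].ops)
                if (op == Op::Write) s = State{true, false};
            if (s.maybeInit != out[b].maybeInit || s.maybeUninit != out[b].maybeUninit) {
                out[b] = s;
                changed = true;
            }
        }
    }

    // Apply the two rules at every read and write.
    std::vector<std::string> errors;
    for (std::size_t b = 0; b < cfg.size(); ++b) {
        State s = entryState(b);
        for (Op op : cfg[b].ops) {
            if (op == Op::Read && s.maybeUninit)
                errors.push_back("read of possibly uninitialized variable in block " + std::to_string(b));
            if (op == Op::Write) {
                if (isLet && s.maybeInit)
                    errors.push_back("write to possibly initialized `let` in block " + std::to_string(b));
                s = State{true, false};
            }
        }
    }
    return errors;
}

// The `let x` example from the comment above: both branches assign, then x is read.
std::vector<Block> letAssignedInBothBranches() {
    return {
        Block{{}, {}},             // 0: let x : Int;
        Block{{Op::Write}, {0}},   // 1: x = 1;
        Block{{Op::Write}, {0}},   // 2: x = 2;
        Block{{Op::Read}, {1, 2}}, // 3: first use of x after the if/else
    };
}
```

Calling `checkInitialization(letAssignedInBothBranches(), true)` reports no errors, while removing the write from either branch makes the read in block 3 an error. A real implementation would also need to treat passing a variable as an `out` argument as a write, as suggested earlier in the thread.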

```csharp
S obj = {};
// equivalent to:
S obj = S();
Contributor

I just want to be very clear that

S obj = {};

won't zero-initialize any members of S. It's not a C-style zero initializer list anymore.

Collaborator Author

No, it should not. If S defines a default ctor, S obj = {} should call that default ctor.

Collaborator Author

However, if S is a C-style struct, then S s = {} will still fall back to C-style initialization logic according to this proposal, because the ctor match will fail. In that case, we will still zero initialize.

```csharp
void foo()
{
MyType t; // t is considered uninitialized.
Contributor

Can you clarify that there is no __init() defined for MyType?

Collaborator Author

The general principle is never to initialize t. The fact that we still need to initialize t if MyType has a default __init is just for backward compatibility, and we may no longer do so in the future. If the user writes this in modern syntax such as var t : MyType, we will never initialize it, whether or not MyType has __init.

Collaborator Author

So: if MyType has a default __init(), then:

  1. MyType s; will default initialize now, but we need to think about a migration path so we don't default initialize it in the future.
  2. var s : MyType will not initialize.
  3. In a modern module, i.e. a module defined with module moduleName;, MyType s; will not default initialize.

Contributor

Yes, I understand the rules here.

But what I'm trying to suggest is to change this example to

struct MyType1 {
  int x;
}
void foo() {
  MyType t1; // `t1` is initialized with a call to `__init`.
}

That way we know why it's different from the example in lines 59-65: there is no default __init() defined in MyType1.

Collaborator Author

Why? MyType1 shouldn't default initialize because it doesn't even have a user defined ctor.

{
CLike c0; // `c0` is uninitialized.
CLike c1 = {}; // initialized with legacy initializer list logic, `c1` is now `{0,0}`.
CLike c2 = {1}; // initialized with legacy initializer list logic, `c2` is now `{1,0}`.
Contributor

Looking at this example, so we still zero-initialize members?

Contributor

And we still want to support the legacy initializer list?

Collaborator Author

Yes, that is still the fallback if the ctor match fails.

Contributor

I can verify that this is already supported by current slang.
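For comparison, this is the C++ aggregate-initialization behavior that the legacy fallback discussed above is described as mirroring. It is a small C++ sketch, not Slang, and `checkAggregateInit` is a hypothetical helper name. It only demonstrates the `{}` and `{1}` cases; it deliberately does not read an uninitialized `CLike`.

```cpp
#include <cassert>

// A plain C-style struct: no constructors, no member initializers.
struct CLike {
    int x;
    int y;
};

// Checks the values that C++ aggregate initialization produces.
void checkAggregateInit() {
    CLike c1 = {};  // every member is zero-initialized
    CLike c2 = {1}; // x takes the listed value, the trailing member y is zeroed

    assert(c1.x == 0 && c1.y == 0);
    assert(c2.x == 1 && c2.y == 0);
}
```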

// translates to:
// MyType t = MyType.__init(1);
// which is not
// MyType t = MyType(t)
Contributor

@kaizhangNV kaizhangNV Sep 27, 2024

I don't quite understand this example. Why is it special? How is it different from a multi-argument initializer list?

Is MyType t = MyType.__init(1); equivalent to

MyType t = MyType(1);

Collaborator Author

MyType.__init(1) calls the ctor directly.

MyType(1) means (MyType)1, which will first try a set of builtin coercion rules to convert 1 to MyType. If the builtin rules don't apply, it falls back to calling MyType.__init(1).

Contributor

Oh, I see: we want to differentiate this from the
"Single argument constructor call" sub-section above?

Collaborator Author

Yes.

MyType s = {1}
Always calls the ctor.

MyType(1) goes through type cast.

- It does not contain any explicit constructors defined by the user.
- All its members have higher or equal visibility than the type.
- All its members are legacy C-Style structs or arrays of legacy C-style structs.
In such case, we perform a legacy "read data" style consumption of the initializer list, so that the following behavior is valid:
Contributor

Can a "legacy C-style struct" have member init expressions?
e.g.

struct S
{
      int a = 5;
}

Collaborator Author

Yes, it can.

Contributor

If this is the case, you may want to update this example

struct DefaultMember {
  int x = 0;
  int y = 1;
}
void test3()
{
  DefaultMember m; // `m` is uninitialized.
  DefaultMember m1 = {}; // calls `__init()`, initialized to `{0,1}`.
  DefaultMember m2 = {1}; // calls `__init(1)`, initialized to `{1,1}`.
  DefaultMember m3 = {1,2}; // calls `__init(1,2)`, initialized to `{1,2}`.
}

because it looks like it will go to the legacy initializer list logic

tangent-vector
tangent-vector previously approved these changes Sep 27, 2024

A type X is default initializable if:
- It explicitly declares that `X` implements `IDefaultInitializable`.
- It explicitly provides a default constructor `X::__init()` that takes no arguments, in which case we treat the type as implementing `IDefaultInitializable` even if
Contributor

I am very wary of treating types that do not declare a conformance as if they have it. Are you saying that if I had a user-defined generic:

T myFunc< T : IDefaultInitializable >() { return T(); }

I would be able to call this for any user-defined type that has an explicit zero-parameter __init() even if it didn't declare conformance?

Contributor

To be clear, I am not against us having a built-in IDefaultable or similar interface that can be opted into by user-defined types. I would also support having built-in types like arrays and tuples have conditional conformances for IDefaultable when their elements are defaultable.

All I am objecting to is making user-defined types automatically conform.

Comment on lines 59 to 60
- It is a struct type where all its members are default-initializable. A member is considered default-initializable if the type of the member is default-initializable,
or if the member has an initialization expression that defines its default value.
Contributor

This actually highlights the kind of reason why implicit conformance would be subtly dangerous. Adding a single private field to a struct type can change whether or not it is default-initializable, even though the public API of the type doesn't appear to have changed. Thus a (public) conformance can be removed from a type by editing the (private) implementation details of that type, without any error or warning on the struct declaration itself.

If the user had to explicitly make their struct conform to IDefaultInitializable, then it would be easy to diagnose an error on the struct declaration itself when implementation details change to make the conformance no longer valid.
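C++ has a close analogue of this hazard, because default-constructibility there is also inferred from the members. The sketch below is C++, not Slang; `Handle`, `WidgetV1`, `WidgetV2`, and `WidgetV3` are hypothetical types, and the `static_assert` next to `WidgetV3` only stands in for an explicit conformance declaration.

```cpp
#include <type_traits>

// Hypothetical member type with no zero-argument constructor.
struct Handle {
    explicit Handle(int id) : id(id) {}
    int id;
};

// Version 1: every member can be default-constructed, so the type can too.
struct WidgetV1 {
    int size = 0;
};

// Version 2: a private implementation detail was added. The public data looks
// unchanged, yet the implicit default constructor is now deleted, and nothing
// on the struct declaration itself reports that.
struct WidgetV2 {
    int size = 0;
    int cacheId() const { return cache.id; }
private:
    Handle cache;
};

static_assert(std::is_default_constructible<WidgetV1>::value,
              "V1 is default-constructible");
static_assert(!std::is_default_constructible<WidgetV2>::value,
              "V2 silently lost the property");

// Opt-in analogue: stating the requirement next to the declaration makes a
// later implementation change fail right here instead of at a distant use.
struct WidgetV3 {
    int size = 0;
};
static_assert(std::is_default_constructible<WidgetV3>::value,
              "WidgetV3 promises to stay default-constructible");
```

With the implicit rule, the breakage in `WidgetV2` only surfaces wherever some caller tries to default-construct it. With the explicit check, the error points at the type whose implementation changed.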

Contributor

It might be that we need two different kinds of rules for default-initializability: one for within a single module, and one for types exported from the module.

Comment on lines 71 to 79
If the type of a local variable is default-initializable, then its default initializer will be invoked implicitly at its declaration site to initialize its value:
```c++
int x; // x will be default initialized to 0 because `int` is default-initializable.
// The above is equivalent to:
int x = int();

struct S { int x; int y; }
S s; // s will be default initialized to {0, 0} because `S` is default-initializable.
```
Contributor

I strongly dislike making this our default behavior for what will end up being nearly all the built-in and user-defined types. Is the motivation for this that we need to be compatible with the C++ semantics for default construction of local variables?

Comment on lines 81 to 86
If a type is not default-initializable, and the declaration site does not provide an initial value for the variable, the compiler should generate an error:
```csharp
struct V { int[] arr; }

V v; // error: `v` must be initialized at declaration.
```
Contributor

We may need/want our rules for what counts as properly initialized to be more subtle than this. Basically, so long as every variable is not read at a point where it could potentially be only partially initialized, then code is safe.

Code like the following should be fine:

void okayFunc( V p )
{
    V x;
    doSomethingThatDoesntUseX();
    x = p;
    nowUseX(x);
}

The variable x is clearly fully initialized before the point where it is used, so there is no error, even if it was not initialized as part of its declaration.

Comment on lines 92 to 98
A generic type parameter is not considered default-initializable by default. As a result, the following code should produce an error:
```csharp
void foo<T>()
{
T t; // error, `t` is uninitialized.
}
```
Contributor

Strong agree on this.

Comment on lines 123 to 130
As a special case, an empty initializer list will translate into a default-initialization:
```csharp
S obj = {};
// equivalent to:
S obj = S();
```

If the above code passes type check, then it will be used as the way to initialize `obj`.
Contributor

I want to note that there ends up being a subtlety here because of how casts and constructor-call syntax are related, such that the intuitive description might not be one we want to have work in the one-argument case:

S obj = {  x };
// might not be equivalent to:
S obj = S( x );

The reason for this is that when we see S( x ) we must always treat this as an attempt to coerce/cast x to type S, so that it is semantically equivalent to (S) x. In practice, the semantics of a cast will often bottom out by performing overload resolution for a single-argument constructor call on S, but not always. One notable exceptional case is when x is already of type S.

The syntax S obj = { x }; should almost certainly not be treated as equivalent to S obj = (S) x;, so it should not be exactly equivalent to S obj = S(x);, and should instead be equivalent to directly invoking a single-argument constructor on S.

Comment on lines 133 to 139
A type is a "legacy C-Style struct" iff:
- It is a struct type.
- It is a basic scalar, vector or matrix type, e.g. `int`, `float4x4`.
- It does not define any explicit constructors
- It does not define any initialization expressions on its members.
- All its members are legacy C-Style structs or arrays of legacy C-style structs.
In such case, we perform a legacy "read data" style consumption of the initializer list, so that the following behavior is valid:
Contributor

We should probably also make these rules require that the type come from the same module as where the initializer-list expression appears.

And we should make it explicit that this should be a warning, with a "fixit" hint indicating where the user should add additional {} to make the intention explicit.

Comment on lines 157 to 160
The signature for the synthesized initializer for type `T` is:
```csharp
V T.__init(member0: typeof(member0) = default(member0), member1 : typeof(member1) = default(member1), ...)
```
Contributor

Okay, it seems like you are already covering the more general case of constructor synthesis here. I'm not sure why the no-argument case is being described separately above.

is the value defined by the initialization expression in `member0` if it exists, or the default value of `member0`'s type.
If `member0`'s type is not default initializable and the member doesn't provide an initial value, then the parameter will not have a default value.

The body of the constructor will initialize each member with the value coming from the corresponding constructor argument if such argument exists,
Contributor

There's some subtlety here that I would personally want to give ourselves freedom to explore down the road, around allowing the initial-value expression for one field to depend on another:

struct BitAndMask
{
    int bitIndex;
    int mask = 1 << bitIndex;
}

The desired semantics of this type are quite clear, but it would violate the current rules for initialization, and would also make it impossible to synthesize a constructor, because the default value for mask depends on this.bitIndex, which is not accessible in the context of a caller to BitAndMask.__init().
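As a point of comparison, C++ supports this pattern because a default member initializer is evaluated inside the constructor, after earlier members are set, rather than as a caller-side default argument. The sketch below is C++, not Slang, and shows only one possible direction; `checkBitAndMask` is a hypothetical helper name.

```cpp
#include <cassert>

struct BitAndMask {
    int bitIndex;
    // Evaluated inside the constructor, after bitIndex has been initialized,
    // because members are initialized in declaration order.
    int mask = 1 << bitIndex;

    explicit BitAndMask(int index) : bitIndex(index) {}
};

// A caller-side default argument, by contrast, cannot see the object:
//     BitAndMask(int index, int mask = 1 << bitIndex); // does not compile
// In C++ a non-static data member cannot be named in a default argument,
// which is the same obstacle described above for a synthesized `__init`.

void checkBitAndMask() {
    BitAndMask b(3);
    assert(b.bitIndex == 3);
    assert(b.mask == 8); // 1 << 3
}
```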

Comment on lines 175 to 177
One important decision point is whether or not Slang should allow variables to be left in an uninitialized state after their declaration, as is allowed in C++.
Our opinion is that this is not what we want to have in the long term and Slang should take the opportunity as a new language to not inherit from this
undesired C++ legacy behavior.
Contributor

I agree on this point: allowing variables to be used while in a potentially-uninitialized state is a Bad Idea.

Languages like C# have historically tried to solve this kind of thing with the pervasive big hammer of making most types have default values and automatically initializing uninitialized variables to those defaults.

History has shown that the C#-style approach has drawbacks too. It is quite easy for a programmer to fail to notice that they didn't explicitly initialize a field or variable that they meant to initialize, and end up with a program that compiles (because the compiler was "helpful" and set things to zero/null), but is semantically incorrect. Programmers would often rather be told about the potential mistake, and being forced to explicitly ask for a default-initialized variable when you want one is not that great of a burden compared to the potential costs of failing to notice an uninitialized (or incorrectly default-initialized) variable.

@csyonghe csyonghe merged commit 2321638 into shader-slang:master Sep 27, 2024
Development

Successfully merging this pull request may close these issues.

Design how initialization works in slang.
4 participants