Skip to content

Commit

Permalink
Spec local initialization (WebAssembly#67)
Browse files Browse the repository at this point in the history
  • Loading branch information
rossberg authored Aug 2, 2022
1 parent f831cff commit 6194785
Show file tree
Hide file tree
Showing 14 changed files with 387 additions and 133 deletions.
106 changes: 83 additions & 23 deletions document/core/appendix/algorithm.rst
Original file line number Diff line number Diff line change
Expand Up @@ -21,7 +21,10 @@ The algorithm is expressed in typed pseudo code whose semantics is intended to b
Data Structures
~~~~~~~~~~~~~~~

Types are representable as a set of enumerations.
Types
.....

Value types are representable as a set of enumerations.

.. code-block:: pseudo
Expand Down Expand Up @@ -65,37 +68,64 @@ Equivalence and subtyping checks can be defined on these types.
return matches_heap(t1.heap, t2.heap) && matches_null(t1.null, t2.null)
func matches(t1 : val_type, t2 : val_type) : bool =
return
(is_num t1 && is_num t2 && t1 = t2) ||
(is_vec t1 && is_vec t2 && t1 = t2) ||
(is_ref t1 && is_ref t2 && matches_ref(t1, t2)) ||
t1 = Bot
switch (t1, t2)
case (Ref(_), Ref(_))
return matches_ref(t1, t2)
case (Bot, _)
return true
case (_, _)
return t1 = t2
Context
.......

Validation requires a :ref:`context <context>` for checking uses of :ref:`indices <syntax-index>`.
For the purpose of presenting the algorithm, it is maintained in a set of global variables:

.. code-block:: pseudo
var locals : array(val_type)
var locals_init : array(bool)
var globals : array(global_type)
var funcs : array(func_type)
var tables : array(table_type)
var mems : array(mem_type)
This assumes suitable representations for the various :ref:`types <syntax-type>` besides :code:`val_type`, which are omitted here.

For locals, there is an additional array recording the initialization status of each local.

Stacks
......

The algorithm uses two separate stacks: the *value stack* and the *control stack*.
The former tracks the :ref:`types <syntax-valtype>` of operand values on the :ref:`stack <stack>`,
the latter surrounding :ref:`structured control instructions <syntax-instr-control>` and their associated :ref:`blocks <syntax-instr-control>`.
The algorithm uses three separate stacks: the *value stack*, the *control stack*, and the *initialization stack*.
The value stack tracks the :ref:`types <syntax-valtype>` of operand values on the :ref:`stack <stack>`.
The control stack tracks surrounding :ref:`structured control instructions <syntax-instr-control>` and their associated :ref:`blocks <syntax-instr-control>`.
The initialization stack records all :ref:`locals <syntax-local>` that have been initialized since the beginning of the function.

.. code-block:: pseudo
type val_stack = stack(val_type)
type init_stack = stack(u32)
type ctrl_stack = stack(ctrl_frame)
type ctrl_frame = {
opcode : opcode
start_types : list(val_type)
end_types : list(val_type)
height : nat
val_height : nat
init_height : nat
unreachable : bool
}
For each value, the value stack records its :ref:`value type <syntax-valtype>`.
For each entered block, the control stack records a *control frame* with the originating opcode, the types on the top of the operand stack at the start and end of the block (used to check its result as well as branches), the height of the operand stack at the start of the block (used to check that operands do not underflow the current block), and a flag recording whether the remainder of the block is unreachable (used to handle :ref:`stack-polymorphic <polymorphism>` typing after branches).
For each entered block, the control stack records a *control frame* with the originating opcode, the types on the top of the operand stack at the start and end of the block (used to check its result as well as branches), the height of the operand stack at the start of the block (used to check that operands do not underflow the current block), the height of the initialization stack at the start of the block (used to reset initialization status at the end of the block), and a flag recording whether the remainder of the block is unreachable (used to handle :ref:`stack-polymorphic <polymorphism>` typing after branches).

For the purpose of presenting the algorithm, the operand and control stacks are simply maintained as global variables:
For the purpose of presenting the algorithm, these stacks are simply maintained as global variables:

.. code-block:: pseudo
var vals : val_stack
var inits : init_stack
var ctrls : ctrl_stack
However, these variables are not manipulated directly by the main checking function, but through a set of auxiliary functions:
Expand Down Expand Up @@ -123,7 +153,7 @@ However, these variables are not manipulated directly by the main checking funct
func pop_ref() : ref_type =
let actual = pop_val()
error_if(not is_ref(actual))
if actual = Bot then return Ref(Bot, false)
if (actual = Bot) return Ref(Bot, false)
return actual
func push_vals(types : list(val_type)) = foreach (t in types) push_val(t)
Expand All @@ -150,25 +180,49 @@ Finally, there are accumulative functions for pushing or popping multiple operan
so that, e.g., :code:`ctrls[0]` accesses the element pushed last.


The initialization stack and the initialization status of locals is manipulated through the following functions:

.. code-block:: pseudo
func get_local(idx : u32) =
error_if(not (locals_init[idx] || ctrls[0].unreachable))
func set_local(idx : u32) =
if (not locals_init[idx])
inits.push(idx)
locals_init[idx] := true
func reset_locals(height : nat) =
while (inits.size() > height)
locals_init[inits.pop()] := false
Getting a local verifies that it is either known to be initialized, or that that the operation is unreachable in the current stack frame.
When a local is set that was not set already,
then its initialization status is updated and the change is recorded in the initialization stack.
Thus, the initialization status of all locals can be reset to a previous state by denoting a specific height in the initialization stack.

The size of the initialization stack is bounded by the number of (non-defaultable) locals in a function, so can be preallocated by an algorithm.

The control stack is likewise manipulated through auxiliary functions:

.. code-block:: pseudo
func push_ctrl(opcode : opcode, in : list(val_type), out : list(val_type)) =
let frame = ctrl_frame(opcode, in, out, vals.size(), false)
let frame = ctrl_frame(opcode, in, out, vals.size(), inits.size(), false)
ctrls.push(frame)
push_vals(in)
func pop_ctrl() : ctrl_frame =
error_if(ctrls.is_empty())
let frame = ctrls[0]
pop_vals(frame.end_types)
error_if(vals.size() =/= frame.height)
error_if(vals.size() =/= frame.val_height)
reset_locals(frame.init_height)
ctrls.pop()
return frame
func label_types(frame : ctrl_frame) : list(val_types) =
return (if frame.opcode == loop then frame.start_types else frame.end_types)
return (if (frame.opcode = loop) frame.start_types else frame.end_types)
func unreachable() =
vals.resize(ctrls[0].height)
Expand All @@ -180,6 +234,7 @@ It allocates a new frame record recording them along with the current height of
Popping a frame first checks that the control stack is not empty.
It then verifies that the operand stack contains the right types of values expected at the end of the exited block and pops them off the operand stack.
Afterwards, it checks that the stack has shrunk back to its initial height.
Finally, it undoes all changes to the initialization status of locals that happend inside the block.

The type of the :ref:`label <syntax-label>` associated with a control frame is either that of the stack at the start or the end of the frame, determined by the opcode that it originates from.

Expand All @@ -192,6 +247,7 @@ In that case, all existing operand types are purged from the value stack, in ord
However, a polymorphic stack cannot underflow, but instead generates :code:`Bot` types as needed.



.. index:: opcode

Validation of Opcode Sequences
Expand All @@ -200,10 +256,6 @@ Validation of Opcode Sequences
The following function shows the validation of a number of representative instructions that manipulate the stack.
Other instructions are checked in a similar manner.

.. note::
Various instructions not shown here will additionally require the presence of a validation :ref:`context <context>` for checking uses of :ref:`indices <syntax-index>`.
That is an easy addition and therefore omitted from this presentation.

.. code-block:: pseudo
func validate(opcode) =
Expand All @@ -222,7 +274,7 @@ Other instructions are checked in a similar manner.
let t2 = pop_val()
error_if(not (is_num(t1) && is_num(t2) || is_vec(t1) && is_vec(t2)))
error_if(t1 =/= t2 && t1 =/= Bot && t2 =/= Bot)
push_val(if (t1 == Bot) t2 else t1)
push_val(if (t1 = Bot) t2 else t1)
case (select t)
pop_val(I32)
Expand All @@ -238,6 +290,14 @@ Other instructions are checked in a similar manner.
let rt = pop_ref()
push_val(Ref(rt.heap, false))
case (local.get x)
get_local(x)
push_val(locals[x])
case (local.set x)
pop_val(locals[x])
set_local(x)
case (unreachable)
unreachable()
Expand Down Expand Up @@ -296,7 +356,7 @@ Other instructions are checked in a similar manner.
let rt = pop_ref()
if (rt.heap =/= Bot)
error_if(not is_def(rt.heap))
let ft = lookup_func_type(rt.heap.idx) // TODO
let ft = funcs[rt.heap.idx]
pop_vals(ft.params)
push_vals(ft.results)
Expand Down
11 changes: 11 additions & 0 deletions document/core/appendix/changes.rst
Original file line number Diff line number Diff line change
Expand Up @@ -138,6 +138,17 @@ Added vector type and instructions that manipulate multiple numeric values in pa
* New injection/projection :ref:`vector instructions <syntax-instr-vec>`: :math:`\K{i}\!N\!\K{x}\!M\!\K{.splat}`, :math:`\K{f}\!N\!\K{x}\!M\!\K{.splat}`, :math:`\K{i}\!N\!\K{x}\!M\!\K{.bitmask}`


Release 2.1
~~~~~~~~~~~

.. index:: reference

Typeful References
..................

.. todo:: describe


.. [#proposal-signext]
https://github.com/WebAssembly/spec/tree/main/proposals/sign-extension-ops/
Expand Down
19 changes: 18 additions & 1 deletion document/core/appendix/index-rules.rst
Original file line number Diff line number Diff line change
Expand Up @@ -14,8 +14,14 @@ Typing of Static Constructs
Construct Judgement
=============================================== ===============================================================================
:ref:`Limits <valid-limits>` :math:`C \vdashlimits \limits : k`
:ref:`Numeric type <valid-numtype>` :math:`C \vdashnumtype \numtype \ok`
:ref:`Vector type <valid-vectype>` :math:`C \vdashvectype \vectype \ok`
:ref:`Heap type <valid-heaptype>` :math:`C \vdashheaptype \heaptype \ok`
:ref:`Reference type <valid-reftype>` :math:`C \vdashreftype \reftype \ok`
:ref:`Value type <valid-valtype>` :math:`C \vdashvaltype \valtype \ok`
:ref:`Function type <valid-functype>` :math:`C \vdashfunctype \functype \ok`
:ref:`Block type <valid-blocktype>` :math:`C \vdashblocktype \blocktype \ok`
:ref:`Block type <valid-blocktype>` :math:`C \vdashblocktype \blocktype : \instrtype`
:ref:`Instruction type <valid-instrtype>` :math:`C \vdashinstrtype \instrtype \ok`
:ref:`Table type <valid-tabletype>` :math:`C \vdashtabletype \tabletype \ok`
:ref:`Memory type <valid-memtype>` :math:`C \vdashmemtype \memtype \ok`
:ref:`Global type <valid-globaltype>` :math:`C \vdashglobaltype \globaltype \ok`
Expand All @@ -24,6 +30,7 @@ Construct Judgement
:ref:`Instruction sequence <valid-instr-seq>` :math:`S;C \vdashinstrseq \instr^\ast : \functype`
:ref:`Expression <valid-expr>` :math:`C \vdashexpr \expr : \resulttype`
:ref:`Function <valid-func>` :math:`C \vdashfunc \func : \functype`
:ref:`Local <valid-local>` :math:`C \vdashlocal \local : \localtype`
:ref:`Table <valid-table>` :math:`C \vdashtable \table : \tabletype`
:ref:`Memory <valid-mem>` :math:`C \vdashmem \mem : \memtype`
:ref:`Global <valid-global>` :math:`C \vdashglobal \global : \globaltype`
Expand Down Expand Up @@ -66,6 +73,16 @@ Construct Judgement
=============================================== ===============================================================================


Defaultability
~~~~~~~~~~~~~~

================================================= ===============================================================================
Construct Judgement
================================================= ===============================================================================
:ref:`Defaultable value type <valid-defaultable>` :math:`C \vdashvaltypedefaultable \valtype \defaultable`
================================================= ===============================================================================


Constantness
~~~~~~~~~~~~

Expand Down
10 changes: 5 additions & 5 deletions document/core/binary/modules.rst
Original file line number Diff line number Diff line change
Expand Up @@ -405,15 +405,15 @@ denoting *count* locals of the same value type.
\X{size}{:}\Bu32~~\X{code}{:}\Bfunc
&\Rightarrow& \X{code} & (\iff \X{size} = ||\Bfunc||) \\
\production{function} & \Bfunc &::=&
(t^\ast)^\ast{:}\Bvec(\Blocals)~~e{:}\Bexpr
&\Rightarrow& \concat((t^\ast)^\ast), e
& (\iff |\concat((t^\ast)^\ast)| < 2^{32}) \\
(\local^\ast)^\ast{:}\Bvec(\Blocals)~~e{:}\Bexpr
&\Rightarrow& \concat((\local^\ast)^\ast), e
& (\iff |\concat((\local^\ast)^\ast)| < 2^{32}) \\
\production{locals} & \Blocals &::=&
n{:}\Bu32~~t{:}\Bvaltype &\Rightarrow& t^n \\
n{:}\Bu32~~t{:}\Bvaltype &\Rightarrow& \{ \LTYPE~t \}^n \\
\end{array}
Here, :math:`\X{code}` ranges over pairs :math:`(\valtype^\ast, \expr)`.
The meta function :math:`\concat((t^\ast)^\ast)` concatenates all sequences :math:`t_i^\ast` in :math:`(t^\ast)^\ast`.
The meta function :math:`\concat((\local^\ast)^\ast)` concatenates all sequences :math:`\local_i^\ast` in :math:`(\local^\ast)^\ast`.
Any code for which the length of the resulting sequence is out of bounds of the maximum size of a :ref:`vector <syntax-vec>` is malformed.

.. note::
Expand Down
19 changes: 10 additions & 9 deletions document/core/exec/instructions.rst
Original file line number Diff line number Diff line change
Expand Up @@ -1043,7 +1043,7 @@ Variable Instructions

1. Let :math:`F` be the :ref:`current <exec-notation-textual>` :ref:`frame <syntax-frame>`.

2. Assert: due to :ref:`validation <valid-local.get>`, :math:`F.\ALOCALS[x]` exists.
2. Assert: due to :ref:`validation <valid-local.get>`, :math:`F.\ALOCALS[x]` exists and is non-empty.

3. Let :math:`\val` be the value :math:`F.\ALOCALS[x]`.

Expand Down Expand Up @@ -3007,23 +3007,21 @@ Invocation of :ref:`function address <syntax-funcaddr>` :math:`a`

3. Let :math:`[t_1^n] \to [t_2^m]` be the :ref:`function type <syntax-functype>` :math:`f.\FITYPE`.

4. Let :math:`t^\ast` be the list of :ref:`value types <syntax-valtype>` :math:`f.\FICODE.\FLOCALS`.
4. Let :math:`\local^\ast` be the list of :ref:`locals <syntax-local>` :math:`f.\FICODE.\FLOCALS`.

5. Let :math:`\instr^\ast~\END` be the :ref:`expression <syntax-expr>` :math:`f.\FICODE.\FBODY`.

6. Assert: due to :ref:`validation <valid-call>`, :math:`n` values are on the top of the stack.

7. Pop the values :math:`\val^n` from the stack.

8. Let :math:`\val_0^\ast` be the list of zero values of types :math:`t^\ast`.
8. Let :math:`F` be the :ref:`frame <syntax-frame>` :math:`\{ \AMODULE~f.\FIMODULE, \ALOCALS~\val^n~(\default_t)^\ast \}`.

9. Let :math:`F` be the :ref:`frame <syntax-frame>` :math:`\{ \AMODULE~f.\FIMODULE, \ALOCALS~\val^n~(\default_t)^\ast \}`.
9. Push the activation of :math:`F` with arity :math:`m` to the stack.

10. Push the activation of :math:`F` with arity :math:`m` to the stack.
10. Let :math:`L` be the :ref:`label <syntax-label>` whose arity is :math:`m` and whose continuation is the end of the function.

11. Let :math:`L` be the :ref:`label <syntax-label>` whose arity is :math:`m` and whose continuation is the end of the function.

12. :ref:`Enter <exec-instr-seq-enter>` the instruction sequence :math:`\instr^\ast` with label :math:`L`.
11. :ref:`Enter <exec-instr-seq-enter>` the instruction sequence :math:`\instr^\ast` with label :math:`L`.

.. math::
~\\[-1ex]
Expand All @@ -3035,11 +3033,14 @@ Invocation of :ref:`function address <syntax-funcaddr>` :math:`a`
\begin{array}[t]{@{}r@{~}l@{}}
(\iff & S.\SFUNCS[a] = f \\
\wedge & f.\FITYPE = [t_1^n] \to [t_2^m] \\
\wedge & f.\FICODE = \{ \FTYPE~x, \FLOCALS~t^k, \FBODY~\instr^\ast~\END \} \\
\wedge & f.\FICODE = \{ \FTYPE~x, \FLOCALS~\{\LTYPE~t\}^k, \FBODY~\instr^\ast~\END \} \\
\wedge & F = \{ \AMODULE~f.\FIMODULE, ~\ALOCALS~\val^n~(\default_t)^k \})
\end{array} \\
\end{array}
.. note::
For non-defaultable types, the respective local is left uninitialized by these rules.


.. _exec-invoke-exit:

Expand Down
11 changes: 7 additions & 4 deletions document/core/exec/runtime.rst
Original file line number Diff line number Diff line change
Expand Up @@ -51,14 +51,16 @@ or *external references* pointing to an uninterpreted form of :ref:`extern addre

.. _default-val:

Each :ref:`value type <syntax-valtype>` has an associated *default value*;
it is the respective value :math:`0` for :ref:`number types <syntax-numtype>`, :math:`0` for :ref:`vector types <syntax-vectype>`, and null for :ref:`reference types <syntax-reftype>`.
:ref:`Value types <syntax-valtype>` can have an associated *default value*;
it is the respective value :math:`0` for :ref:`number types <syntax-numtype>`, :math:`0` for :ref:`vector types <syntax-vectype>`, and null for nullable :ref:`reference types <syntax-reftype>`.
For other references, no default value is defined, :math:`\default_t` hence is an optional value :math:`\val^?`.

.. math::
\begin{array}{lcl@{\qquad}l}
\default_t &=& t{.}\CONST~0 & (\iff t = \numtype) \\
\default_t &=& t{.}\CONST~0 & (\iff t = \vectype) \\
\default_t &=& \REFNULL~t & (\iff t = (\REF~\NULL~\heaptype)) \\
\default_t &=& \epsilon & (\iff t = (\REF~\heaptype)) \\
\end{array}
Expand Down Expand Up @@ -493,10 +495,11 @@ and a reference to the function's own :ref:`module instance <syntax-moduleinst>`
\production{(activation)} & \X{activation} &::=&
\FRAME_n\{\frame\} \\
\production{(frame)} & \frame &::=&
\{ \ALOCALS~\val^\ast, \AMODULE~\moduleinst \} \\
\{ \ALOCALS~(\val^?)^\ast, \AMODULE~\moduleinst \} \\
\end{array}
The values of the locals are mutated by respective :ref:`variable instructions <syntax-instr-variable>`.
Locals may be uninitialized, in which case they are empty.
Locals are mutated by respective :ref:`variable instructions <syntax-instr-variable>`.


.. _exec-expand:
Expand Down
4 changes: 3 additions & 1 deletion document/core/syntax/modules.rst
Original file line number Diff line number Diff line change
Expand Up @@ -142,7 +142,9 @@ The |MFUNCS| component of a module defines a vector of *functions* with the foll
.. math::
\begin{array}{llll}
\production{function} & \func &::=&
\{ \FTYPE~\typeidx, \FLOCALS~\vec(\valtype), \FBODY~\expr \} \\
\{ \FTYPE~\typeidx, \FLOCALS~\vec(\local), \FBODY~\expr \} \\
\production{local} & \local &::=&
\{ \LTYPE~\valtype \} \\
\end{array}
The |FTYPE| of a function declares its signature by reference to a :ref:`type <syntax-type>` defined in the module.
Expand Down
Loading

0 comments on commit 6194785

Please sign in to comment.