From 0719c11341264205e88cddd113653e6eb5969085 Mon Sep 17 00:00:00 2001
From: Amanieu d'Antras <amanieu@gmail.com>
Date: Tue, 9 Nov 2021 01:00:38 +0000
Subject: [PATCH 01/13] Add inline assembly to the reference

---
 src/SUMMARY.md         |   2 +
 src/inline-assembly.md | 462 +++++++++++++++++++++++++++++++++++++++++
 2 files changed, 464 insertions(+)
 create mode 100644 src/inline-assembly.md
diff --git a/src/SUMMARY.md b/src/SUMMARY.md
index 783c647b4..fbbd43673 100644
--- a/src/SUMMARY.md
+++ b/src/SUMMARY.md
@@ -114,6 +114,8 @@
 
 - [Linkage](linkage.md)
 
+- [Inline assembly](inline-assembly.md)
+
 - [Unsafety](unsafety.md)
     - [Unsafe functions](unsafe-functions.md)
     - [Unsafe blocks](unsafe-blocks.md)
diff --git a/src/inline-assembly.md b/src/inline-assembly.md
new file mode 100644
index 000000000..b42af734b
--- /dev/null
+++ b/src/inline-assembly.md
@@ -0,0 +1,462 @@
+# Inline assembly
+
+Rust provides support for inline assembly via the `asm!` and `global_asm!` macros.
+It can be used to embed handwritten assembly in the assembly output generated by the compiler.
+
+The following ABNF specifies the general syntax:
+
+```text
+dir_spec := "in" / "out" / "lateout" / "inout" / "inlateout"
+reg_spec := <register class> / "<explicit register>"
+operand_expr := expr / "_" / expr "=>" expr / expr "=>" "_"
+reg_operand := dir_spec "(" reg_spec ")" operand_expr
+operand := reg_operand / "const" const_expr / "sym" path
+clobber_abi := "clobber_abi(" <abi> *["," <abi>] [","] ")"
+option := "pure" / "nomem" / "readonly" / "preserves_flags" / "noreturn" / "nostack" / "att_syntax" / "raw"
+options := "options(" option *["," option] [","] ")"
+asm := "asm!(" format_string *("," format_string) *("," [ident "="] operand) *("," clobber_abi) *("," options) [","] ")"
+global_asm := "global_asm!(" format_string *("," format_string) *("," [ident "="] operand) *("," options) [","] ")"
+```
+
+Inline assembly is currently supported on the following architectures:
+- x86 and x86-64
+- ARM
+- AArch64
+- RISC-V
+- NVPTX
+- PowerPC
+- Hexagon
+- MIPS
+- WebAssembly
+- BPF
+- SPIR-V
+
+Support for more targets may be added in the future. The compiler will emit an error if `asm!` is used on an unsupported target.
+
+## Scope
+
+Inline assembly can be used in one of two ways.
+
+With the `asm!` macro, the assembly code is emitted in a function scope and integrated into the compiler-generated assembly code of a function. This assembly code must obey [strict rules](#rules-for-inline-assembly) to avoid undefined behavior. Note that in some cases the compiler may choose to emit the assembly code as a separate function and generate a call to it.
+
+With the `global_asm!` macro, the assembly code is emitted in a global scope, outside a function. This can be used to hand-write entire functions using assembly code, and generally provides much more freedom to use arbitrary registers and assembler directives.
+
+## Template string arguments
+
+The assembler template uses the same syntax as [format strings][format-syntax] (i.e. placeholders are specified by curly braces). The corresponding arguments are accessed in order, by index, or by name. However, implicit named arguments (introduced by [RFC #2795][rfc-2795]) are not supported.
+
+An `asm!` invocation may have one or more template string arguments; an `asm!` with multiple template string arguments is treated as if all the strings were concatenated with a `\n` between them. The expected usage is for each template string argument to correspond to a line of assembly code. All template string arguments must appear before any other arguments.
+
+As with format strings, named arguments must appear after positional arguments. Explicit register operands must appear at the end of the operand list, after named arguments if any.
+
+Explicit register operands cannot be used by placeholders in the template string. All other named and positional operands must appear at least once in the template string, otherwise a compiler error is generated.
+
+The exact assembly code syntax is target-specific and opaque to the compiler except for the way operands are substituted into the template string to form the code passed to the assembler.
+
+Currently, all supported targets follow the assembly code syntax used by LLVM's internal assembler, which usually corresponds to that of the GNU assembler (GAS). On x86, the `.intel_syntax noprefix` mode of GAS is used by default. On ARM, the `.syntax unified` mode is used. These targets impose an additional restriction on the assembly code: any assembler state (e.g. the current section which can be changed with `.section`) must be restored to its original value at the end of the asm string. Assembly code that does not conform to the GAS syntax will result in assembler-specific behavior.
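+
+As an illustration of the placeholder rules above, the following is a minimal sketch, assuming an x86-64 target and a toolchain that exposes the macro as `std::arch::asm`. The function name `double_then_add` is hypothetical. It passes two template strings (one per instruction), refers to one operand by index and to another by name, and lists the positional operand before the named one.
+
+```rust
+use std::arch::asm;
+
+/// Computes `a * 2 + b` (wrapping) with two template strings.
+#[cfg(target_arch = "x86_64")]
+fn double_then_add(a: u64, b: u64) -> u64 {
+    let mut acc = a;
+    unsafe {
+        asm!(
+            // The two strings are joined with `\n`, one instruction per string.
+            "add {0}, {0}",      // positional operand 0, used twice
+            "add {0}, {addend}", // named operand
+            inout(reg) acc,      // positional operands come first
+            addend = in(reg) b,  // named operands follow
+            options(pure, nomem, nostack),
+        );
+    }
+    acc
+}
+
+/// Portable fallback so the sketch still builds on other targets.
+#[cfg(not(target_arch = "x86_64"))]
+fn double_then_add(a: u64, b: u64) -> u64 {
+    a.wrapping_mul(2).wrapping_add(b)
+}
+```
+
+Calling `double_then_add(20, 2)` is expected to yield `42`. Both operands appear in the template, which is required for non-explicit-register operands.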
+
+[format-syntax]: https://doc.rust-lang.org/std/fmt/#syntax
+[rfc-2795]: https://github.com/rust-lang/rfcs/pull/2795
+
+## Operand type
+
+Several types of operands are supported:
+
+* `in(<reg>) <expr>`
+  - `<reg>` can refer to a register class or an explicit register. The allocated register name is substituted into the asm template string.
+  - The allocated register will contain the value of `<expr>` at the start of the asm code.
+  - The allocated register must contain the same value at the end of the asm code (except if a `lateout` is allocated to the same register).
+* `out(<reg>) <expr>`
+  - `<reg>` can refer to a register class or an explicit register. The allocated register name is substituted into the asm template string.
+  - The allocated register will contain an undefined value at the start of the asm code.
+  - `<expr>` must be a (possibly uninitialized) place expression, to which the contents of the allocated register are written at the end of the asm code.
+  - An underscore (`_`) may be specified instead of an expression, which will cause the contents of the register to be discarded at the end of the asm code (effectively acting as a clobber).
+* `lateout(<reg>) <expr>`
+  - Identical to `out` except that the register allocator can reuse a register allocated to an `in`.
+  - You should only write to the register after all inputs are read, otherwise you may clobber an input.
+* `inout(<reg>) <expr>`
+  - `<reg>` can refer to a register class or an explicit register. The allocated register name is substituted into the asm template string.
+  - The allocated register will contain the value of `<expr>` at the start of the asm code.
+  - `<expr>` must be a mutable initialized place expression, to which the contents of the allocated register are written at the end of the asm code.
+* `inout(<reg>) <in expr> => <out expr>`
+  - Same as `inout` except that the initial value of the register is taken from the value of `<in expr>`.
+  - `<out expr>` must be a (possibly uninitialized) place expression, to which the contents of the allocated register are written at the end of the asm code.
+  - An underscore (`_`) may be specified instead of an expression for `<out expr>`, which will cause the contents of the register to be discarded at the end of the asm code (effectively acting as a clobber).
+  - `<in expr>` and `<out expr>` may have different types.
+* `inlateout(<reg>) <expr>` / `inlateout(<reg>) <in expr> => <out expr>`
+  - Identical to `inout` except that the register allocator can reuse a register allocated to an `in` (this can happen if the compiler knows the `in` has the same initial value as the `inlateout`).
+  - You should only write to the register after all inputs are read, otherwise you may clobber an input.
+* `const <expr>`
+  - `<expr>` must be an integer constant expression.
+  - The value of the expression is formatted as a string and substituted directly into the asm template string.
+* `sym <path>`
+  - `<path>` must refer to a `fn` or `static`.
+  - A mangled symbol name referring to the item is substituted into the asm template string.
+  - The substituted string does not include any modifiers (e.g. GOT, PLT, relocations, etc).
+  - `<path>` is allowed to point to a `#[thread_local]` static, in which case the asm code can combine the symbol with relocations (e.g. `@plt`, `@TPOFF`) to read from thread-local data.
+
+Operand expressions are evaluated from left to right, just like function call arguments. After the `asm!` has executed, outputs are written to in left to right order. This is significant if two outputs point to the same place: that place will contain the value of the rightmost output.
+
+Since `global_asm!` exists outside a function, only `const` and `sym` operands can be used with it.
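+
+The operand forms listed above can be sketched as follows. This is an illustrative example only, assuming an x86-64 target and `std::arch::asm`. The function name `operand_demo` is hypothetical. It uses `in`, `lateout`, `inout`, and a discarded `out(reg) _` scratch register. `const` and `sym` operands are left out to keep the sketch small.
+
+```rust
+use std::arch::asm;
+
+/// Returns `(x + 5, x * 6)` using wrapping arithmetic.
+#[cfg(target_arch = "x86_64")]
+fn operand_demo(x: u64) -> (u64, u64) {
+    let sum: u64;
+    let mut product = x;
+    unsafe {
+        // `lateout` may share a register with `src`. That is safe here
+        // because the single instruction reads `src` before writing `dst`.
+        asm!(
+            "lea {dst}, [{src} + 5]",
+            src = in(reg) x,
+            dst = lateout(reg) sum,
+            options(pure, nomem, nostack, preserves_flags),
+        );
+        // `inout` both reads and writes `product`. `tmp` is a scratch
+        // register whose final contents are discarded via `_`.
+        asm!(
+            "mov {tmp}, {val}",
+            "shl {tmp}, 1",     // tmp = 2 * val
+            "shl {val}, 2",     // val = 4 * val
+            "add {val}, {tmp}", // val = 6 * original val
+            val = inout(reg) product,
+            tmp = out(reg) _,
+            options(pure, nomem, nostack),
+        );
+    }
+    (sum, product)
+}
+
+/// Portable fallback so the sketch still builds on other targets.
+#[cfg(not(target_arch = "x86_64"))]
+fn operand_demo(x: u64) -> (u64, u64) {
+    (x.wrapping_add(5), x.wrapping_mul(6))
+}
+```
+
+`tmp` is declared with `out` rather than `lateout` because it is written while `val` is still live as an input, so the two must not share a register.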
+
+## Register operands
+
+Input and output operands can be specified either as an explicit register or as a register class from which the register allocator can select a register. Explicit registers are specified as string literals (e.g. `"eax"`) while register classes are specified as identifiers (e.g. `reg`). Using string literals for register names enables support for architectures that use special characters in register names, such as MIPS (`$0`, `$1`, etc).
+
+Note that explicit registers treat register aliases (e.g. `r14` vs `lr` on ARM) and smaller views of a register (e.g. `eax` vs `rax`) as equivalent to the base register. It is a compile-time error to use the same explicit register for two input operands or two output operands. It is likewise a compile-time error to use overlapping registers (e.g. ARM VFP) in input operands or in output operands.
+
+Only the following types are allowed as operands for inline assembly:
+- Integers (signed and unsigned)
+- Floating-point numbers
+- Pointers (thin only)
+- Function pointers
+- SIMD vectors (structs defined with `#[repr(simd)]` and which implement `Copy`). This includes architecture-specific vector types defined in `std::arch` such as `__m128` (x86) or `int8x16_t` (ARM).
+
+Here is the list of currently supported register classes:
+
+| Architecture | Register class | Registers | LLVM constraint code |
+| ------------ | -------------- | --------- | -------------------- |
+| x86 | `reg` | `ax`, `bx`, `cx`, `dx`, `si`, `di`, `bp`, `r[8-15]` (x86-64 only) | `r` |
+| x86 | `reg_abcd` | `ax`, `bx`, `cx`, `dx` | `Q` |
+| x86-32 | `reg_byte` | `al`, `bl`, `cl`, `dl`, `ah`, `bh`, `ch`, `dh` | `q` |
+| x86-64 | `reg_byte`\* | `al`, `bl`, `cl`, `dl`, `sil`, `dil`, `bpl`, `r[8-15]b` | `q` |
+| x86 | `xmm_reg` | `xmm[0-7]` (x86) `xmm[0-15]` (x86-64) | `x` |
+| x86 | `ymm_reg` | `ymm[0-7]` (x86) `ymm[0-15]` (x86-64) | `x` |
+| x86 | `zmm_reg` | `zmm[0-7]` (x86) `zmm[0-31]` (x86-64) | `v` |
+| x86 | `kreg` | `k[1-7]` | `Yk` |
+| x86 | `x87_reg` | `st([0-7])` | Only clobbers |
+| x86 | `mmx_reg` | `mm[0-7]` | Only clobbers |
+| AArch64 | `reg` | `x[0-30]` | `r` |
+| AArch64 | `vreg` | `v[0-31]` | `w` |
+| AArch64 | `vreg_low16` | `v[0-15]` | `x` |
+| AArch64 | `preg` | `p[0-15]`, `ffr` | Only clobbers |
+| ARM | `reg` | `r[0-12]`, `r14` | `r` |
+| ARM (Thumb) | `reg_thumb` | `r[0-7]` | `l` |
+| ARM (ARM) | `reg_thumb` | `r[0-12]`, `r14` | `l` |
+| ARM | `sreg` | `s[0-31]` | `t` |
+| ARM | `sreg_low16` | `s[0-15]` | `x` |
+| ARM | `dreg` | `d[0-31]` | `w` |
+| ARM | `dreg_low16` | `d[0-15]` | `t` |
+| ARM | `dreg_low8` | `d[0-7]` | `x` |
+| ARM | `qreg` | `q[0-15]` | `w` |
+| ARM | `qreg_low8` | `q[0-7]` | `t` |
+| ARM | `qreg_low4` | `q[0-3]` | `x` |
+| MIPS | `reg` | `$[2-25]` | `r` |
+| MIPS | `freg` | `$f[0-31]` | `f` |
+| NVPTX | `reg16` | None\* | `h` |
+| NVPTX | `reg32` | None\* | `r` |
+| NVPTX | `reg64` | None\* | `l` |
+| RISC-V | `reg` | `x1`, `x[5-7]`, `x[9-15]`, `x[16-31]` (non-RV32E) | `r` |
+| RISC-V | `freg` | `f[0-31]` | `f` |
+| RISC-V | `vreg` | `v[0-31]` | Only clobbers |
+| Hexagon | `reg` | `r[0-28]` | `r` |
+| PowerPC | `reg` | `r[0-31]` | `r` |
+| PowerPC | `reg_nonzero` | `r[1-31]` | `b` |
+| PowerPC | `freg` | `f[0-31]` | `f` |
+| PowerPC | `cr` | `cr[0-7]`, `cr` | Only clobbers |
+| PowerPC | `xer` | `xer` | Only clobbers |
+| wasm32 | `local` | None\* | `r` |
+| BPF | `reg` | `r[0-10]` | `r` |
+| BPF | `wreg` | `w[0-10]` | `w` |
+
+> **Notes**:
+> - On x86 we treat `reg_byte` differently from `reg` because the compiler can allocate `al` and `ah` separately whereas `reg` reserves the whole register.
+>
+> - On x86-64 the high byte registers (e.g. `ah`) are not available in the `reg_byte` register class.
+>
+> - NVPTX doesn't have a fixed register set, so named registers are not supported.
+>
+> - WebAssembly doesn't have registers, so named registers are not supported.
+>
+> - Some register classes are marked as "Only clobbers" which means that they cannot be used for inputs or outputs, only clobbers of the form `out("reg") _` or `lateout("reg") _`.
+
+Additional register classes may be added in the future based on demand (e.g. MMX, x87, etc).
+
+Each register class has constraints on which value types it can be used with. This is necessary because the way a value is loaded into a register depends on its type. For example, on big-endian systems, loading an `i32x4` and an `i8x16` into a SIMD register may result in different register contents even if the byte-wise memory representation of both values is identical. The availability of supported types for a particular register class may depend on what target features are currently enabled.
+
+| Architecture | Register class | Target feature | Allowed types |
+| ------------ | -------------- | -------------- | ------------- |
+| x86-32 | `reg` | None | `i16`, `i32`, `f32` |
+| x86-64 | `reg` | None | `i16`, `i32`, `f32`, `i64`, `f64` |
+| x86 | `reg_byte` | None | `i8` |
+| x86 | `xmm_reg` | `sse` | `i32`, `f32`, `i64`, `f64`, <br> `i8x16`, `i16x8`, `i32x4`, `i64x2`, `f32x4`, `f64x2` |
+| x86 | `ymm_reg` | `avx` | `i32`, `f32`, `i64`, `f64`, <br> `i8x16`, `i16x8`, `i32x4`, `i64x2`, `f32x4`, `f64x2` <br> `i8x32`, `i16x16`, `i32x8`, `i64x4`, `f32x8`, `f64x4` |
+| x86 | `zmm_reg` | `avx512f` | `i32`, `f32`, `i64`, `f64`, <br> `i8x16`, `i16x8`, `i32x4`, `i64x2`, `f32x4`, `f64x2` <br> `i8x32`, `i16x16`, `i32x8`, `i64x4`, `f32x8`, `f64x4` <br> `i8x64`, `i16x32`, `i32x16`, `i64x8`, `f32x16`, `f64x8` |
+| x86 | `kreg` | `avx512f` | `i8`, `i16` |
+| x86 | `kreg` | `avx512bw` | `i32`, `i64` |
+| x86 | `mmx_reg` | N/A | Only clobbers |
+| x86 | `x87_reg` | N/A | Only clobbers |
+| AArch64 | `reg` | None | `i8`, `i16`, `i32`, `f32`, `i64`, `f64` |
+| AArch64 | `vreg` | `fp` | `i8`, `i16`, `i32`, `f32`, `i64`, `f64`, <br> `i8x8`, `i16x4`, `i32x2`, `i64x1`, `f32x2`, `f64x1`, <br> `i8x16`, `i16x8`, `i32x4`, `i64x2`, `f32x4`, `f64x2` |
+| AArch64 | `preg` | N/A | Only clobbers |
+| ARM | `reg` | None | `i8`, `i16`, `i32`, `f32` |
+| ARM | `sreg` | `vfp2` | `i32`, `f32` |
+| ARM | `dreg` | `vfp2` | `i64`, `f64`, `i8x8`, `i16x4`, `i32x2`, `i64x1`, `f32x2` |
+| ARM | `qreg` | `neon` | `i8x16`, `i16x8`, `i32x4`, `i64x2`, `f32x4` |
+| MIPS32 | `reg` | None | `i8`, `i16`, `i32`, `f32` |
+| MIPS32 | `freg` | None | `f32`, `f64` |
+| MIPS64 | `reg` | None | `i8`, `i16`, `i32`, `i64`, `f32`, `f64` |
+| MIPS64 | `freg` | None | `f32`, `f64` |
+| NVPTX | `reg16` | None | `i8`, `i16` |
+| NVPTX | `reg32` | None | `i8`, `i16`, `i32`, `f32` |
+| NVPTX | `reg64` | None | `i8`, `i16`, `i32`, `f32`, `i64`, `f64` |
+| RISC-V32 | `reg` | None | `i8`, `i16`, `i32`, `f32` |
+| RISC-V64 | `reg` | None | `i8`, `i16`, `i32`, `f32`, `i64`, `f64` |
+| RISC-V | `freg` | `f` | `f32` |
+| RISC-V | `freg` | `d` | `f64` |
+| RISC-V | `vreg` | N/A | Only clobbers |
+| Hexagon | `reg` | None | `i8`, `i16`, `i32`, `f32` |
+| PowerPC | `reg` | None | `i8`, `i16`, `i32` |
+| PowerPC | `reg_nonzero` | None | `i8`, `i16`, `i32` |
+| PowerPC | `freg` | None | `f32`, `f64` |
+| PowerPC | `cr` | N/A | Only clobbers |
+| PowerPC | `xer` | N/A | Only clobbers |
+| wasm32 | `local` | None | `i8`, `i16`, `i32`, `i64`, `f32`, `f64` |
+| BPF | `reg` | None | `i8`, `i16`, `i32`, `i64` |
+| BPF | `wreg` | `alu32` | `i8`, `i16`, `i32` |
+
+> **Note**: For the purposes of the above table, pointers, function pointers and `isize`/`usize` are treated as the equivalent integer type (`i16`/`i32`/`i64` depending on the target).
+
+If a value is smaller than the register it is allocated in, then the upper bits of that register will have an undefined value for inputs and will be ignored for outputs. The only exception is the `freg` register class on RISC-V, where `f32` values are NaN-boxed in an `f64` as required by the RISC-V architecture.
+
+When separate input and output expressions are specified for an `inout` operand, both expressions must have the same type. The only exception is if both operands are pointers or integers, in which case they are only required to have the same size. This restriction exists because the register allocators in LLVM and GCC sometimes cannot handle tied operands with different types.
+
+## Register names
+
+Some registers have multiple names. These are all treated by the compiler as identical to the base register name. Here is the list of all supported register aliases:
+
+| Architecture | Base register | Aliases |
+| ------------ | ------------- | ------- |
+| x86 | `ax` | `eax`, `rax` |
+| x86 | `bx` | `ebx`, `rbx` |
+| x86 | `cx` | `ecx`, `rcx` |
+| x86 | `dx` | `edx`, `rdx` |
+| x86 | `si` | `esi`, `rsi` |
+| x86 | `di` | `edi`, `rdi` |
+| x86 | `bp` | `bpl`, `ebp`, `rbp` |
+| x86 | `sp` | `spl`, `esp`, `rsp` |
+| x86 | `ip` | `eip`, `rip` |
+| x86 | `st(0)` | `st` |
+| x86 | `r[8-15]` | `r[8-15]b`, `r[8-15]w`, `r[8-15]d` |
+| x86 | `xmm[0-31]` | `ymm[0-31]`, `zmm[0-31]` |
+| AArch64 | `x[0-30]` | `w[0-30]` |
+| AArch64 | `x29` | `fp` |
+| AArch64 | `x30` | `lr` |
+| AArch64 | `sp` | `wsp` |
+| AArch64 | `xzr` | `wzr` |
+| AArch64 | `v[0-31]` | `b[0-31]`, `h[0-31]`, `s[0-31]`, `d[0-31]`, `q[0-31]` |
+| ARM | `r[0-3]` | `a[1-4]` |
+| ARM | `r[4-9]` | `v[1-6]` |
+| ARM | `r9` | `rfp` |
+| ARM | `r10` | `sl` |
+| ARM | `r11` | `fp` |
+| ARM | `r12` | `ip` |
+| ARM | `r13` | `sp` |
+| ARM | `r14` | `lr` |
+| ARM | `r15` | `pc` |
+| RISC-V | `x0` | `zero` |
+| RISC-V | `x1` | `ra` |
+| RISC-V | `x2` | `sp` |
+| RISC-V | `x3` | `gp` |
+| RISC-V | `x4` | `tp` |
+| RISC-V | `x[5-7]` | `t[0-2]` |
+| RISC-V | `x8` | `fp`, `s0` |
+| RISC-V | `x9` | `s1` |
+| RISC-V | `x[10-17]` | `a[0-7]` |
+| RISC-V | `x[18-27]` | `s[2-11]` |
+| RISC-V | `x[28-31]` | `t[3-6]` |
+| RISC-V | `f[0-7]` | `ft[0-7]` |
+| RISC-V | `f[8-9]` | `fs[0-1]` |
+| RISC-V | `f[10-17]` | `fa[0-7]` |
+| RISC-V | `f[18-27]` | `fs[2-11]` |
+| RISC-V | `f[28-31]` | `ft[8-11]` |
+| Hexagon | `r29` | `sp` |
+| Hexagon | `r30` | `fr` |
+| Hexagon | `r31` | `lr` |
+| BPF | `r[0-10]` | `w[0-10]` |
+
+Some registers cannot be used for input or output operands:
+
+| Architecture | Unsupported register | Reason |
+| ------------ | -------------------- | ------ |
+| All | `sp` | The stack pointer must be restored to its original value at the end of an asm code block. |
+| All | `bp` (x86), `x29` (AArch64), `x8` (RISC-V), `fr` (Hexagon), `$fp` (MIPS) | The frame pointer cannot be used as an input or output. |
+| ARM | `r7` or `r11` | On ARM the frame pointer can be either `r7` or `r11` depending on the target. The frame pointer cannot be used as an input or output. |
+| All | `si` (x86-32), `bx` (x86-64), `r6` (ARM), `x19` (AArch64), `r19` (Hexagon), `x9` (RISC-V) | This is used internally by LLVM as a "base pointer" for functions with complex stack frames. |
+| x86 | `k0` | This is a constant zero register which can't be modified. |
+| x86 | `ip` | This is the program counter, not a real register. |
+| x86 | `mm[0-7]` | MMX registers are not currently supported (but may be in the future). |
+| x86 | `st([0-7])` | x87 registers are not currently supported (but may be in the future). |
+| AArch64 | `xzr` | This is a constant zero register which can't be modified. |
+| ARM | `pc` | This is the program counter, not a real register. |
+| ARM | `r9` | This is a reserved register on some ARM targets. |
+| MIPS | `$0` or `$zero` | This is a constant zero register which can't be modified. |
+| MIPS | `$1` or `$at` | Reserved for assembler. |
+| MIPS | `$26`/`$k0`, `$27`/`$k1` | OS-reserved registers. |
+| MIPS | `$28`/`$gp` | Global pointer cannot be used as inputs or outputs. |
+| MIPS | `$ra` | Return address cannot be used as inputs or outputs. |
+| RISC-V | `x0` | This is a constant zero register which can't be modified. |
+| RISC-V | `gp`, `tp` | These registers are reserved and cannot be used as inputs or outputs. |
+| Hexagon | `lr` | This is the link register which cannot be used as an input or output. |
+
+In some cases LLVM will allocate a "reserved register" for `reg` operands even though this register cannot be explicitly specified. Assembly code making use of reserved registers should be careful since `reg` operands may alias with those registers. The reserved registers are:
+- The frame pointer and LLVM base pointer on all architectures.
+- `r9` on ARM.
+- `x18` on AArch64.
+
+## Template modifiers
+
+The placeholders can be augmented by modifiers which are specified after the `:` in the curly braces. These modifiers do not affect register allocation, but change the way operands are formatted when inserted into the template string. Only one modifier is allowed per template placeholder.
+
+The supported modifiers are a subset of LLVM's (and GCC's) [asm template argument modifiers][llvm-argmod], but do not use the same letter codes.
+
+| Architecture | Register class | Modifier | Example output | LLVM modifier |
+| ------------ | -------------- | -------- | -------------- | ------------- |
+| x86-32 | `reg` | None | `eax` | `k` |
+| x86-64 | `reg` | None | `rax` | `q` |
+| x86-32 | `reg_abcd` | `l` | `al` | `b` |
+| x86-64 | `reg` | `l` | `al` | `b` |
+| x86 | `reg_abcd` | `h` | `ah` | `h` |
+| x86 | `reg` | `x` | `ax` | `w` |
+| x86 | `reg` | `e` | `eax` | `k` |
+| x86-64 | `reg` | `r` | `rax` | `q` |
+| x86 | `reg_byte` | None | `al` / `ah` | None |
+| x86 | `xmm_reg` | None | `xmm0` | `x` |
+| x86 | `ymm_reg` | None | `ymm0` | `t` |
+| x86 | `zmm_reg` | None | `zmm0` | `g` |
+| x86 | `*mm_reg` | `x` | `xmm0` | `x` |
+| x86 | `*mm_reg` | `y` | `ymm0` | `t` |
+| x86 | `*mm_reg` | `z` | `zmm0` | `g` |
+| x86 | `kreg` | None | `k1` | None |
+| AArch64 | `reg` | None | `x0` | `x` |
+| AArch64 | `reg` | `w` | `w0` | `w` |
+| AArch64 | `reg` | `x` | `x0` | `x` |
+| AArch64 | `vreg` | None | `v0` | None |
+| AArch64 | `vreg` | `v` | `v0` | None |
+| AArch64 | `vreg` | `b` | `b0` | `b` |
+| AArch64 | `vreg` | `h` | `h0` | `h` |
+| AArch64 | `vreg` | `s` | `s0` | `s` |
+| AArch64 | `vreg` | `d` | `d0` | `d` |
+| AArch64 | `vreg` | `q` | `q0` | `q` |
+| ARM | `reg` | None | `r0` | None |
+| ARM | `sreg` | None | `s0` | None |
+| ARM | `dreg` | None | `d0` | `P` |
+| ARM | `qreg` | None | `q0` | `q` |
+| ARM | `qreg` | `e` / `f` | `d0` / `d1` | `e` / `f` |
+| MIPS | `reg` | None | `$2` | None |
+| MIPS | `freg` | None | `$f0` | None |
+| NVPTX | `reg16` | None | `rs0` | None |
+| NVPTX | `reg32` | None | `r0` | None |
+| NVPTX | `reg64` | None | `rd0` | None |
+| RISC-V | `reg` | None | `x1` | None |
+| RISC-V | `freg` | None | `f0` | None |
+| Hexagon | `reg` | None | `r0` | None |
+| PowerPC | `reg` | None | `0` | None |
+| PowerPC | `reg_nonzero` | None | `3` | `b` |
+| PowerPC | `freg` | None | `0` | None |
+
+> **Notes**:
+> - on ARM `e` / `f`: this prints the low or high doubleword register name of a NEON quad (128-bit) register.
+> - on x86: our behavior for `reg` with no modifiers differs from what GCC does. GCC will infer the modifier based on the operand value type, while we default to the full register size.
+> - on x86 `xmm_reg`: the `x`, `t` and `g` LLVM modifiers are not yet implemented in LLVM (they are supported by GCC only), but this should be a simple change.
+
+As stated in the previous section, passing an input value smaller than the register width will result in the upper bits of the register containing undefined values. This is not a problem if the inline asm only accesses the lower bits of the register, which can be done by using a template modifier to use a subregister name in the asm code (e.g. `ax` instead of `rax`). Since this is an easy pitfall, the compiler will emit a warning suggesting a template modifier to use where appropriate given the input type. If all references to an operand already have modifiers then the warning is suppressed for that operand.
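+
+A small sketch of a template modifier in use, assuming an x86-64 target and `std::arch::asm`. The function name `add_u16` is hypothetical. Both operands are `u16` values in the `reg` class, so each placeholder carries the `x` modifier to print the 16-bit register name (such as `ax`) and the instruction never touches the undefined upper bits.
+
+```rust
+use std::arch::asm;
+
+/// Adds two `u16` values with 16-bit wrapping arithmetic.
+#[cfg(target_arch = "x86_64")]
+fn add_u16(a: u16, b: u16) -> u16 {
+    let mut acc = a;
+    unsafe {
+        asm!(
+            // `{0:x}` and `{1:x}` expand to 16-bit names, e.g. `add ax, cx`.
+            "add {0:x}, {1:x}",
+            inout(reg) acc,
+            in(reg) b,
+            options(pure, nomem, nostack),
+        );
+    }
+    acc
+}
+
+/// Portable fallback so the sketch still builds on other targets.
+#[cfg(not(target_arch = "x86_64"))]
+fn add_u16(a: u16, b: u16) -> u16 {
+    a.wrapping_add(b)
+}
+```
+
+Because every reference to each operand has a modifier, the warning described above should not fire for this block.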
+
+[llvm-argmod]: http://llvm.org/docs/LangRef.html#asm-template-argument-modifiers
+
+## ABI clobbers
+
+The `clobber_abi` keyword can be used to apply a default set of clobbers to an `asm!` block. This will automatically insert the necessary clobber constraints as needed for calling a function with a particular calling convention: if the calling convention does not fully preserve the value of a register across a call then a `lateout("reg") _` is implicitly added to the operands list.
+
+Generic register class outputs are disallowed by the compiler when `clobber_abi` is used: all outputs must specify an explicit register. Explicit register outputs have precedence over the implicit clobbers inserted by `clobber_abi`: a clobber will only be inserted for a register if that register is not used as an output.
+The following ABIs can be used with `clobber_abi`:
+
+| Architecture | ABI name | Clobbered registers |
+| ------------ | -------- | ------------------- |
+| x86-32 | `"C"`, `"system"`, `"efiapi"`, `"cdecl"`, `"stdcall"`, `"fastcall"` | `ax`, `cx`, `dx`, `xmm[0-7]`, `mm[0-7]`, `k[1-7]`, `st([0-7])` |
+| x86-64 | `"C"`, `"system"` (on Windows), `"efiapi"`, `"win64"` | `ax`, `cx`, `dx`, `r[8-11]`, `xmm[0-31]`, `mm[0-7]`, `k[1-7]`, `st([0-7])` |
+| x86-64 | `"C"`, `"system"` (on non-Windows), `"sysv64"` | `ax`, `cx`, `dx`, `si`, `di`, `r[8-11]`, `xmm[0-31]`, `mm[0-7]`, `k[1-7]`, `st([0-7])` |
+| AArch64 | `"C"`, `"system"`, `"efiapi"` | `x[0-17]`, `x30`, `v[0-31]`, `p[0-15]`, `ffr` |
+| ARM | `"C"`, `"system"`, `"efiapi"`, `"aapcs"` | `r[0-3]`, `r12`, `r14`, `s[0-15]`, `d[0-7]`, `d[16-31]` |
+| RISC-V | `"C"`, `"system"`, `"efiapi"` | `x1`, `x[5-7]`, `x[10-17]`, `x[28-31]`, `f[0-7]`, `f[10-17]`, `f[28-31]`, `v[0-31]` |
+
+The list of clobbered registers for each ABI is updated in rustc as architectures gain new registers: this ensures that `asm!` clobbers will continue to be correct when LLVM starts using these new registers in its generated code.
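+
+The following sketch shows `clobber_abi("C")` around a call made from inline assembly. It assumes a non-Windows x86-64 target, where the first integer argument is passed in `rdi` and the result is returned in `rax`. The names `triple` and `call_triple` are hypothetical.
+
+```rust
+use std::arch::asm;
+
+/// Hypothetical callee using the C calling convention.
+extern "C" fn triple(x: i32) -> i32 {
+    x * 3
+}
+
+/// Calls `triple` from inline assembly (assumes the System V x86-64 ABI).
+#[cfg(all(target_arch = "x86_64", not(windows)))]
+fn call_triple(arg: i32) -> i32 {
+    let target = triple as extern "C" fn(i32) -> i32;
+    let result: i32;
+    unsafe {
+        asm!(
+            "call {f}",
+            f = in(reg) target, // function pointer to call
+            in("rdi") arg,      // first argument
+            out("rax") result,  // return value, explicit register output
+            // Every register the "C" ABI does not preserve is marked
+            // as clobbered, except those already listed as outputs.
+            clobber_abi("C"),
+        );
+    }
+    result
+}
+
+/// Fallback for targets where the register assignment above does not apply.
+#[cfg(not(all(target_arch = "x86_64", not(windows))))]
+fn call_triple(arg: i32) -> i32 {
+    triple(arg)
+}
+```
+
+The `nostack` option is deliberately absent: the call pushes a return address, and without `nostack` the stack pointer is suitably aligned for a function call on entry to the block.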
+
+## Options
+
+Flags are used to further influence the behavior of the inline assembly block.
+Currently the following options are defined:
+- `pure`: The `asm!` block has no side effects, and its outputs depend only on its direct inputs (i.e. the values themselves, not what they point to) or values read from memory (unless the `nomem` option is also set). This allows the compiler to execute the `asm!` block fewer times than specified in the program (e.g. by hoisting it out of a loop) or even eliminate it entirely if the outputs are not used.
+- `nomem`: The `asm!` block does not read or write to any memory. This allows the compiler to cache the values of modified global variables in registers across the `asm!` block since it knows that they are not read or written to by the `asm!`.
+- `readonly`: The `asm!` block does not write to any memory. This allows the compiler to cache the values of unmodified global variables in registers across the `asm!` block since it knows that they are not written to by the `asm!`.
+- `preserves_flags`: The `asm!` block does not modify the flags register (defined in the rules below). This allows the compiler to avoid recomputing the condition flags after the `asm!` block.
+- `noreturn`: The `asm!` block never returns, and its return type is defined as `!` (never). Behavior is undefined if execution falls through past the end of the asm code. A `noreturn` asm block behaves just like a function which doesn't return; notably, local variables in scope are not dropped before it is invoked.
+- `nostack`: The `asm!` block does not push data to the stack, or write to the stack red-zone (if supported by the target). If this option is *not* used then the stack pointer is guaranteed to be suitably aligned (according to the target ABI) for a function call.
+- `att_syntax`: This option is only valid on x86, and causes the assembler to use the `.att_syntax prefix` mode of the GNU assembler. Register operands are substituted in with a leading `%`.
+- `raw`: This causes the template string to be parsed as a raw assembly string, with no special handling for `{` and `}`. This is primarily useful when including raw assembly code from an external file using `include_str!`.
+
+The compiler performs some additional checks on options:
+- The `nomem` and `readonly` options are mutually exclusive: it is a compile-time error to specify both.
+- The `pure` option must be combined with either the `nomem` or `readonly` options, otherwise a compile-time error is emitted.
+- It is a compile-time error to specify `pure` on an asm block with no outputs or only discarded outputs (`_`).
+- It is a compile-time error to specify `noreturn` on an asm block with outputs.
+
+`global_asm!` only supports the `att_syntax` and `raw` options. The remaining options are not meaningful for global-scope inline assembly.
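+
+The option combinations above can be sketched as follows, assuming an x86-64 target and `std::arch::asm`. The function names `add_att` and `load_u64` are hypothetical. The first block combines `att_syntax` with `pure` and `nomem`. The second reads memory, so it uses `readonly` in place of `nomem`.
+
+```rust
+use std::arch::asm;
+
+/// Wrapping add written in AT&T syntax (source operand first).
+#[cfg(target_arch = "x86_64")]
+fn add_att(a: u64, b: u64) -> u64 {
+    let mut acc = a;
+    unsafe {
+        asm!(
+            // Registers are substituted with a leading `%`.
+            "addq {1}, {0}",
+            inout(reg) acc,
+            in(reg) b,
+            options(att_syntax, pure, nomem, nostack),
+        );
+    }
+    acc
+}
+
+/// Loads a `u64` through a pointer. The block only reads memory.
+#[cfg(target_arch = "x86_64")]
+fn load_u64(value: &u64) -> u64 {
+    let ptr: *const u64 = value; // thin raw pointer operand
+    let loaded: u64;
+    unsafe {
+        asm!(
+            "mov {out}, qword ptr [{p}]",
+            p = in(reg) ptr,
+            out = lateout(reg) loaded,
+            // `mov` leaves the flags untouched, hence `preserves_flags`.
+            options(pure, readonly, nostack, preserves_flags),
+        );
+    }
+    loaded
+}
+
+/// Portable fallbacks so the sketch still builds on other targets.
+#[cfg(not(target_arch = "x86_64"))]
+fn add_att(a: u64, b: u64) -> u64 {
+    a.wrapping_add(b)
+}
+
+#[cfg(not(target_arch = "x86_64"))]
+fn load_u64(value: &u64) -> u64 {
+    *value
+}
+```
+
+Both blocks have a non-discarded output and pair `pure` with either `nomem` or `readonly`, so they pass the compile-time checks listed above.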
+
+## Rules for inline assembly
+
+To avoid undefined behavior, these rules must be followed when using function-scope inline assembly (`asm!`):
+
+- Any registers not specified as inputs will contain an undefined value on entry to the asm block.
+  - An "undefined value" in the context of inline assembly means that the register can (non-deterministically) have any one of the possible values allowed by the architecture. Notably it is not the same as an LLVM `undef` which can have a different value every time you read it (since such a concept does not exist in assembly code).
+- Any registers not specified as outputs must have the same value upon exiting the asm block as they had on entry, otherwise behavior is undefined.
+  - This only applies to registers which can be specified as an input or output. Other registers follow target-specific rules.
+  - Note that a `lateout` may be allocated to the same register as an `in`, in which case this rule does not apply. Code should not rely on this however since it depends on the results of register allocation.
+- Behavior is undefined if execution unwinds out of an asm block.
+  - This also applies if the assembly code calls a function which then unwinds.
+- The set of memory locations that assembly code is allowed to read and write are the same as those allowed for an FFI function.
+  - Refer to the unsafe code guidelines for the exact rules.
+  - If the `readonly` option is set, then only memory reads are allowed.
+  - If the `nomem` option is set then no reads or writes to memory are allowed.
+  - These rules do not apply to memory which is private to the asm code, such as stack space allocated within the asm block.
+- The compiler cannot assume that the instructions in the asm are the ones that will actually end up executed.
+  - This effectively means that the compiler must treat the `asm!` as a black box and only take the interface specification into account, not the instructions themselves.
+  - Runtime code patching is allowed, via target-specific mechanisms.
+- Unless the `nostack` option is set, asm code is allowed to use stack space below the stack pointer.
+  - On entry to the asm block the stack pointer is guaranteed to be suitably aligned (according to the target ABI) for a function call.
+  - You are responsible for making sure you don't overflow the stack (e.g. use stack probing to ensure you hit a guard page).
+  - You should adjust the stack pointer when allocating stack memory as required by the target ABI.
+  - The stack pointer must be restored to its original value before leaving the asm block.
+- If the `noreturn` option is set then behavior is undefined if execution falls through to the end of the asm block.
+- If the `pure` option is set then behavior is undefined if the `asm!` has side-effects other than its direct outputs. Behavior is also undefined if two executions of the `asm!` code with the same inputs result in different outputs.
+  - When used with the `nomem` option, "inputs" are just the direct inputs of the `asm!`.
+  - When used with the `readonly` option, "inputs" comprise the direct inputs of the `asm!` and any memory that the `asm!` block is allowed to read.
+- These flags registers must be restored upon exiting the asm block if the `preserves_flags` option is set:
+  - x86
+    - Status flags in `EFLAGS` (CF, PF, AF, ZF, SF, OF).
+    - Floating-point status word (all).
+    - Floating-point exception flags in `MXCSR` (PE, UE, OE, ZE, DE, IE).
+  - ARM
+    - Condition flags in `CPSR` (N, Z, C, V).
+    - Saturation flag in `CPSR` (Q).
+    - Greater than or equal flags in `CPSR` (GE).
+    - Condition flags in `FPSCR` (N, Z, C, V).
+    - Saturation flag in `FPSCR` (QC).
+    - Floating-point exception flags in `FPSCR` (IDC, IXC, UFC, OFC, DZC, IOC).
+  - AArch64
+    - Condition flags (`NZCV` register).
+    - Floating-point status (`FPSR` register).
+  - RISC-V
+    - Floating-point exception flags in `fcsr` (`fflags`).
+    - Vector extension state (`vtype`, `vl`, `vcsr`).
+- On x86, the direction flag (DF in `EFLAGS`) is clear on entry to an asm block and must be clear on exit.
+  - Behavior is undefined if the direction flag is set on exiting an asm block.
+- The requirement of restoring the stack pointer and non-output registers to their original value only applies when exiting an `asm!` block.
+  - This means that `asm!` blocks that never return (even if not marked `noreturn`) don't need to preserve these registers.
+  - When returning to a different `asm!` block than you entered (e.g. for context switching), these registers must contain the value they had upon entering the `asm!` block that you are *exiting*.
+    - You cannot exit an `asm!` block that has not been entered. Neither can you exit an `asm!` block that has already been exited.
+    - You are responsible for switching any target-specific state (e.g. thread-local storage, stack bounds).
+    - The set of memory locations that you may access is the intersection of those allowed by the `asm!` blocks you entered and exited.
+- You cannot assume that an `asm!` block will appear exactly once in the output binary. The compiler is allowed to instantiate multiple copies of the `asm!` block, for example when the function containing it is inlined in multiple places.
+
+> **Note**: As a general rule, the flags covered by `preserves_flags` are those which are *not* preserved when performing a function call.
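
Reviewer note (not part of the patch): a minimal sketch of how some of the rules above might look in practice on x86-64.
The function names `mul_wide` and `push_pop` are hypothetical, and the non-x86-64 fallbacks exist only so the sketch compiles elsewhere.
`mul_wide` declares every register that `mul` modifies as an output and relies on `pure`, `nomem` and `nostack`; `push_pop` omits `nostack` because it uses stack space, and restores the stack pointer before leaving the block.

```rust
#[cfg(target_arch = "x86_64")]
use std::arch::asm;

/// 64x64 -> 128-bit multiply. `mul` implicitly writes `rax` and `rdx`,
/// so both are declared as outputs; no other register is modified.
#[cfg(target_arch = "x86_64")]
pub fn mul_wide(a: u64, b: u64) -> u128 {
    let lo: u64;
    let hi: u64;
    // SAFETY: the asm touches no memory, has no side effects besides its
    // outputs, and yields the same outputs for the same inputs.
    unsafe {
        asm!(
            "mul {}",
            in(reg) b,
            // `rax` is read as an input and written late as an output.
            inlateout("rax") a => lo,
            // `lateout` may share a register with an `in`; `mul` reads its
            // operand before writing `rdx`, so that is fine here.
            lateout("rdx") hi,
            options(pure, nomem, nostack),
        );
    }
    ((hi as u128) << 64) | (lo as u128)
}

/// Round-trips a value through the stack. `nostack` is deliberately not
/// set, and the `pop` restores the stack pointer to its original value.
#[cfg(target_arch = "x86_64")]
pub fn push_pop(x: u64) -> u64 {
    let y: u64;
    // SAFETY: only stack space private to the asm block is accessed.
    unsafe {
        asm!(
            "push {x}",
            "pop {y}",
            x = in(reg) x,
            y = out(reg) y,
            // `push` and `pop` do not modify the status flags.
            options(preserves_flags),
        );
    }
    y
}

// Portable fallbacks (assumption: only needed to build on other targets).
#[cfg(not(target_arch = "x86_64"))]
pub fn mul_wide(a: u64, b: u64) -> u128 {
    (a as u128) * (b as u128)
}

#[cfg(not(target_arch = "x86_64"))]
pub fn push_pop(x: u64) -> u64 {
    x
}
```

Neither function sets `preserves_flags` unless the instructions really leave the flags alone: `mul` modifies `EFLAGS`, so `mul_wide` leaves the option out.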

From c41b359723f2459b387d690153050fbb1f3c9939 Mon Sep 17 00:00:00 2001
From: Amanieu d'Antras <amanieu@gmail.com>
Date: Mon, 29 Nov 2021 15:30:42 +0000
Subject: [PATCH 02/13] Remove references to unstable parts of asm!

---
 src/inline-assembly.md | 80 ++++--------------------------------------
 1 file changed, 6 insertions(+), 74 deletions(-)

diff --git a/src/inline-assembly.md b/src/inline-assembly.md
index b42af734b..2ec72d9d3 100644
--- a/src/inline-assembly.md
+++ b/src/inline-assembly.md
@@ -10,7 +10,7 @@ dir_spec := "in" / "out" / "lateout" / "inout" / "inlateout"
 reg_spec := <register class> / "<explicit register>"
 operand_expr := expr / "_" / expr "=>" expr / expr "=>" "_"
 reg_operand := dir_spec "(" reg_spec ")" operand_expr
-operand := reg_operand / "const" const_expr / "sym" path
+operand := reg_operand
 clobber_abi := "clobber_abi(" <abi> *["," <abi>] [","] ")"
 option := "pure" / "nomem" / "readonly" / "preserves_flags" / "noreturn" / "nostack" / "att_syntax" / "raw"
 options := "options(" option *["," option] [","] ")"
@@ -18,18 +18,11 @@ asm := "asm!(" format_string *("," format_string) *("," [ident "="] operand) *("
 global_asm := "global_asm!(" format_string *("," format_string) *("," [ident "="] operand) *("," options) [","] ")"
 ```
 
-Inline assembly is currently supported on the following architectures:
+Support for inline assembly is stable on the following architectures:
 - x86 and x86-64
 - ARM
 - AArch64
 - RISC-V
-- NVPTX
-- PowerPC
-- Hexagon
-- MIPS
-- WebAssembly
-- BPF
-- SPIR-V
 
 Support for more targets may be added in the future. The compiler will emit an error if `asm!` is used on an unsupported target.
 
@@ -86,22 +79,14 @@ Several types of operands are supported:
 * `inlateout(<reg>) <expr>` / `inlateout(<reg>) <in expr> => <out expr>`
   - Identical to `inout` except that the register allocator can reuse a register allocated to an `in` (this can happen if the compiler knows the `in` has the same initial value as the `inlateout`).
   - You should only write to the register after all inputs are read, otherwise you may clobber an input.
-* `const <expr>`
-  - `<expr>` must be an integer constant expression.
-  - The value of the expression is formatted as a string and substituted directly into the asm template string.
-* `sym <path>`
-  - `<path>` must refer to a `fn` or `static`.
-  - A mangled symbol name referring to the item is substituted into the asm template string.
-  - The substituted string does not include any modifiers (e.g. GOT, PLT, relocations, etc).
-  - `<path>` is allowed to point to a `#[thread_local]` static, in which case the asm code can combine the symbol with relocations (e.g. `@plt`, `@TPOFF`) to read from thread-local data.
 
 Operand expressions are evaluated from left to right, just like function call arguments. After the `asm!` has executed, outputs are written to in left to right order. This is significant if two outputs point to the same place: that place will contain the value of the rightmost output.
 
-Since `global_asm!` exists outside a function, only `const` and `sym` operands can be used with it.
+Since `global_asm!` exists outside a function, it cannot use input/output operands.
 
 ## Register operands
 
-Input and output operands can be specified either as an explicit register or as a register class from which the register allocator can select a register. Explicit registers are specified as string literals (e.g. `"eax"`) while register classes are specified as identifiers (e.g. `reg`). Using string literals for register names enables support for architectures that use special characters in register names, such as MIPS (`$0`, `$1`, etc).
+Input and output operands can be specified either as an explicit register or as a register class from which the register allocator can select a register. Explicit registers are specified as string literals (e.g. `"eax"`) while register classes are specified as identifiers (e.g. `reg`).
 
 Note that explicit registers treat register aliases (e.g. `r14` vs `lr` on ARM) and smaller views of a register (e.g. `eax` vs `rax`) as equivalent to the base register. It is a compile-time error to use the same explicit register for two input operands or two output operands. Additionally, it is also a compile-time error to use overlapping registers (e.g. ARM VFP) in input operands or in output operands.
 
@@ -141,33 +126,15 @@ Here is the list of currently supported register classes:
 | ARM | `qreg` | `q[0-15]` | `w` |
 | ARM | `qreg_low8` | `q[0-7]` | `t` |
 | ARM | `qreg_low4` | `q[0-3]` | `x` |
-| MIPS | `reg` | `$[2-25]` | `r` |
-| MIPS | `freg` | `$f[0-31]` | `f` |
-| NVPTX | `reg16` | None\* | `h` |
-| NVPTX | `reg32` | None\* | `r` |
-| NVPTX | `reg64` | None\* | `l` |
 | RISC-V | `reg` | `x1`, `x[5-7]`, `x[9-15]`, `x[16-31]` (non-RV32E) | `r` |
 | RISC-V | `freg` | `f[0-31]` | `f` |
 | RISC-V | `vreg` | `v[0-31]` | Only clobbers |
-| Hexagon | `reg` | `r[0-28]` | `r` |
-| PowerPC | `reg` | `r[0-31]` | `r` |
-| PowerPC | `reg_nonzero` | | `r[1-31]` | `b` |
-| PowerPC | `freg` | `f[0-31]` | `f` |
-| PowerPC | `cr` | `cr[0-7]`, `cr` | Only clobbers |
-| PowerPC | `xer` | `xer` | Only clobbers |
-| wasm32 | `local` | None\* | `r` |
-| BPF | `reg` | `r[0-10]` | `r` |
-| BPF | `wreg` | `w[0-10]` | `w` |
 
 > **Notes**:
 > - On x86 we treat `reg_byte` differently from `reg` because the compiler can allocate `al` and `ah` separately whereas `reg` reserves the whole register.
 >
 > - On x86-64 the high byte registers (e.g. `ah`) are not available in the `reg_byte` register class.
 >
-> - NVPTX doesn't have a fixed register set, so named registers are not supported.
->
-> - WebAssembly doesn't have registers, so named registers are not supported.
->
 > - Some register classes are marked as "Only clobbers" which means that they cannot be used for inputs or outputs, only clobbers of the form `out("reg") _` or `lateout("reg") _`.
 
 Additional register classes may be added in the future based on demand (e.g. MMX, x87, etc).
@@ -193,27 +160,11 @@ Each register class has constraints on which value types they can be used with.
 | ARM | `sreg` | `vfp2` | `i32`, `f32` |
 | ARM | `dreg` | `vfp2` | `i64`, `f64`, `i8x8`, `i16x4`, `i32x2`, `i64x1`, `f32x2` |
 | ARM | `qreg` | `neon` | `i8x16`, `i16x8`, `i32x4`, `i64x2`, `f32x4` |
-| MIPS32 | `reg` | None | `i8`, `i16`, `i32`, `f32` |
-| MIPS32 | `freg` | None | `f32`, `f64` |
-| MIPS64 | `reg` | None | `i8`, `i16`, `i32`, `i64`, `f32`, `f64` |
-| MIPS64 | `freg` | None | `f32`, `f64` |
-| NVPTX | `reg16` | None | `i8`, `i16` |
-| NVPTX | `reg32` | None | `i8`, `i16`, `i32`, `f32` |
-| NVPTX | `reg64` | None | `i8`, `i16`, `i32`, `f32`, `i64`, `f64` |
 | RISC-V32 | `reg` | None | `i8`, `i16`, `i32`, `f32` |
 | RISC-V64 | `reg` | None | `i8`, `i16`, `i32`, `f32`, `i64`, `f64` |
 | RISC-V | `freg` | `f` | `f32` |
 | RISC-V | `freg` | `d` | `f64` |
 | RISC-V | `vreg` | N/A | Only clobbers |
-| Hexagon | `reg` | None | `i8`, `i16`, `i32`, `f32` |
-| PowerPC | `reg` | None | `i8`, `i16`, `i32` |
-| PowerPC | `reg_nonzero` | None | `i8`, `i16`, `i32` |
-| PowerPC | `freg` | None | `f32`, `f64` |
-| PowerPC | `cr` | N/A | Only clobbers |
-| PowerPC | `xer` | N/A | Only clobbers |
-| wasm32 | `local` | None | `i8` `i16` `i32` `i64` `f32` `f64` |
-| BPF | `reg` | None | `i8` `i16` `i32` `i64` |
-| BPF | `wreg` | `alu32` | `i8` `i16` `i32` |
 
 > **Note**: For the purposes of the above table pointers, function pointers and `isize`/`usize` are treated as the equivalent integer type (`i16`/`i32`/`i64` depending on the target).
 
@@ -270,19 +221,15 @@ Some registers have multiple names. These are all treated by the compiler as ide
 | RISC-V | `f[10-17]` | `fa[0-7]` |
 | RISC-V | `f[18-27]` | `fs[2-11]` |
 | RISC-V | `f[28-31]` | `ft[8-11]` |
-| Hexagon | `r29` | `sp` |
-| Hexagon | `r30` | `fr` |
-| Hexagon | `r31` | `lr` |
-| BPF | `r[0-10]` | `w[0-10]` |
 
 Some registers cannot be used for input or output operands:
 
 | Architecture | Unsupported register | Reason |
 | ------------ | -------------------- | ------ |
 | All | `sp` | The stack pointer must be restored to its original value at the end of an asm code block. |
-| All | `bp` (x86), `x29` (AArch64), `x8` (RISC-V), `fr` (Hexagon), `$fp` (MIPS) | The frame pointer cannot be used as an input or output. |
+| All | `bp` (x86), `x29` (AArch64), `x8` (RISC-V) | The frame pointer cannot be used as an input or output. |
 | ARM | `r7` or `r11` | On ARM the frame pointer can be either `r7` or `r11` depending on the target. The frame pointer cannot be used as an input or output. |
-| All | `si` (x86-32), `bx` (x86-64), `r6` (ARM), `x19` (AArch64), `r19` (Hexagon), `x9` (RISC-V) | This is used internally by LLVM as a "base pointer" for functions with complex stack frames. |
+| All | `si` (x86-32), `bx` (x86-64), `r6` (ARM), `x19` (AArch64), `x9` (RISC-V) | This is used internally by LLVM as a "base pointer" for functions with complex stack frames. |
 | x86 | `k0` | This is a constant zero register which can't be modified. |
 | x86 | `ip` | This is the program counter, not a real register. |
 | x86 | `mm[0-7]` | MMX registers are not currently supported (but may be in the future). |
@@ -290,14 +237,8 @@ Some registers cannot be used for input or output operands:
 | AArch64 | `xzr` | This is a constant zero register which can't be modified. |
 | ARM | `pc` | This is the program counter, not a real register. |
 | ARM | `r9` | This is a reserved register on some ARM targets. |
-| MIPS | `$0` or `$zero` | This is a constant zero register which can't be modified. |
-| MIPS | `$1` or `$at` | Reserved for assembler. |
-| MIPS | `$26`/`$k0`, `$27`/`$k1` | OS-reserved registers. |
-| MIPS | `$28`/`$gp` | Global pointer cannot be used as inputs or outputs. |
-| MIPS | `$ra` | Return address cannot be used as inputs or outputs. |
 | RISC-V | `x0` | This is a constant zero register which can't be modified. |
 | RISC-V | `gp`, `tp` | These registers are reserved and cannot be used as inputs or outputs. |
-| Hexagon | `lr` | This is the link register which cannot be used as an input or output. |
 
 In some cases LLVM will allocate a "reserved register" for `reg` operands even though this register cannot be explicitly specified. Assembly code making use of reserved registers should be careful since `reg` operands may alias with those registers. Reserved registers are the frame pointer and base pointer
 - The frame pointer and LLVM base pointer on all architectures.
@@ -343,17 +284,8 @@ The supported modifiers are a subset of LLVM's (and GCC's) [asm template argumen
 | ARM | `dreg` | None | `d0` | `P` |
 | ARM | `qreg` | None | `q0` | `q` |
 | ARM | `qreg` | `e` / `f` | `d0` / `d1` | `e` / `f` |
-| MIPS | `reg` | None | `$2` | None |
-| MIPS | `freg` | None | `$f0` | None |
-| NVPTX | `reg16` | None | `rs0` | None |
-| NVPTX | `reg32` | None | `r0` | None |
-| NVPTX | `reg64` | None | `rd0` | None |
 | RISC-V | `reg` | None | `x1` | None |
 | RISC-V | `freg` | None | `f0` | None |
-| Hexagon | `reg` | None | `r0` | None |
-| PowerPC | `reg` | None | `0` | None |
-| PowerPC | `reg_nonzero` | None | `3` | `b` |
-| PowerPC | `freg` | None | `0` | None |
 
 > **Notes**:
 > - on ARM `e` / `f`: this prints the low or high doubleword register name of a NEON quad (128-bit) register.
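
Reviewer note (not part of the patch): the modifier table is easier to follow with a concrete case.
The sketch below uses the x86 `reg_abcd` class with the `l` and `h` modifiers, which come from the x86 rows of the table that this hunk does not show.
The function name `duplicate_low_byte` is hypothetical, and the fallback is only there so the sketch builds on other targets.

```rust
#[cfg(target_arch = "x86_64")]
use std::arch::asm;

/// Copies the low byte of `x` into its high byte.
/// `{0:l}` formats the operand as its low-byte register (e.g. `al`) and
/// `{0:h}` as its high-byte register (e.g. `ah`); the modifier changes
/// only the printed register name, not the register allocation.
#[cfg(target_arch = "x86_64")]
pub fn duplicate_low_byte(x: u16) -> u16 {
    let mut v = x;
    // SAFETY: a register-to-register `mov` touches no memory, no stack,
    // and no flags, and its result depends only on the input.
    unsafe {
        asm!(
            "mov {0:h}, {0:l}",
            inout(reg_abcd) v,
            options(pure, nomem, nostack, preserves_flags),
        );
    }
    v
}

// Portable fallback (assumption: only needed to build on other targets).
#[cfg(not(target_arch = "x86_64"))]
pub fn duplicate_low_byte(x: u16) -> u16 {
    ((x & 0xff) << 8) | (x & 0xff)
}
```

`reg_abcd` is used rather than `reg` because only `ax`, `bx`, `cx` and `dx` have an addressable high byte.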

From 3d7b4ff325a3d882a75332d543593f5aefb3db41 Mon Sep 17 00:00:00 2001
From: Amanieu d'Antras <amanieu@gmail.com>
Date: Mon, 29 Nov 2021 23:11:56 +0000
Subject: [PATCH 03/13] Apply review feedback

---
 src/inline-assembly.md | 13 +++++++------
 1 file changed, 7 insertions(+), 6 deletions(-)

diff --git a/src/inline-assembly.md b/src/inline-assembly.md
index 2ec72d9d3..d95ed1925 100644
--- a/src/inline-assembly.md
+++ b/src/inline-assembly.md
@@ -1,8 +1,11 @@
 # Inline assembly
 
-Rust provides support for inline assembly via the `asm!` and `global_asm!` macros.
+Support for inline assembly is provided via the [`asm!`] and [`global_asm!`] macros.
 It can be used to embed handwritten assembly in the assembly output generated by the compiler.
 
+[`asm!`]: ../core/arch/macro.asm.html
+[`global_asm!`]: ../core/arch/macro.global_asm.html
+
 The following ABNF specifies the general syntax:
 
 ```text
@@ -24,7 +27,7 @@ Support for inline assembly is stable on the following architectures:
 - AArch64
 - RISC-V
 
-Support for more targets may be added in the future. The compiler will emit an error if `asm!` is used on an unsupported target.
+The compiler will emit an error if `asm!` is used on an unsupported target.
 
 ## Scope
 
@@ -40,7 +43,7 @@ The assembler template uses the same syntax as [format strings][format-syntax] (
 
 An `asm!` invocation may have one or more template string arguments; an `asm!` with multiple template string arguments is treated as if all the strings were concatenated with a `\n` between them. The expected usage is for each template string argument to correspond to a line of assembly code. All template string arguments must appear before any other arguments.
 
-As with format strings, named arguments must appear after positional arguments. Explicit register operands must appear at the end of the operand list, after named arguments if any.
+As with format strings, named arguments must appear after positional arguments. Explicit [register operands](#register-operands) must appear at the end of the operand list, after named arguments if any.
 
 Explicit register operands cannot be used by placeholders in the template string. All other named and positional operands must appear at least once in the template string, otherwise a compiler error is generated.
 
@@ -48,7 +51,7 @@ The exact assembly code syntax is target-specific and opaque to the compiler exc
 
 Currently, all supported targets follow the assembly code syntax used by LLVM's internal assembler which usually corresponds to that of the GNU assembler (GAS). On x86, the `.intel_syntax noprefix` mode of GAS is used by default. On ARM, the `.syntax unified` mode is used. These targets impose an additional restriction on the assembly code: any assembler state (e.g. the current section which can be changed with `.section`) must be restored to its original value at the end of the asm string. Assembly code that does not conform to the GAS syntax will result in assembler-specific behavior.
 
-[format-syntax]: https://doc.rust-lang.org/std/fmt/#syntax
+[format-syntax]: ../std/fmt/#syntax
 [rfc-2795]: https://github.com/rust-lang/rfcs/pull/2795
 
 ## Operand type
@@ -137,8 +140,6 @@ Here is the list of currently supported register classes:
 >
 > - Some register classes are marked as "Only clobbers" which means that they cannot be used for inputs or outputs, only clobbers of the form `out("reg") _` or `lateout("reg") _`.
 
-Additional register classes may be added in the future based on demand (e.g. MMX, x87, etc).
-
 Each register class has constraints on which value types they can be used with. This is necessary because the way a value is loaded into a register depends on its type. For example, on big-endian systems, loading a `i32x4` and a `i8x16` into a SIMD register may result in different register contents even if the byte-wise memory representation of both values is identical. The availability of supported types for a particular register class may depend on what target features are currently enabled.
 
 | Architecture | Register class | Target feature | Allowed types |

From 480f6d43bf38e5b98b373a1d03b9558c7ecb829b Mon Sep 17 00:00:00 2001
From: Amanieu d'Antras <amanieu@gmail.com>
Date: Wed, 1 Dec 2021 23:57:23 +0000
Subject: [PATCH 04/13] Put each sentence on a separate line.

---
 src/inline-assembly.md | 134 +++++++++++++++++++++++++++++------------
 1 file changed, 95 insertions(+), 39 deletions(-)

diff --git a/src/inline-assembly.md b/src/inline-assembly.md
index d95ed1925..e8c25eb15 100644
--- a/src/inline-assembly.md
+++ b/src/inline-assembly.md
@@ -33,23 +33,36 @@ The compiler will emit an error if `asm!` is used on an unsupported target.
 
 Inline assembly can be used in one of two ways.
 
-With the `asm!` macro, the assembly code is emitted in a function scope and integrated into the compiler-generated assembly code of a function. This assembly code must obey [strict rules](#rules-for-inline-assembly) to avoid undefined behavior. Note that in some cases the compiler may choose to emit the assembly code as a separate function and generate a call to it.
+With the `asm!` macro, the assembly code is emitted in a function scope and integrated into the compiler-generated assembly code of a function.
+This assembly code must obey [strict rules](#rules-for-inline-assembly) to avoid undefined behavior.
+Note that in some cases the compiler may choose to emit the assembly code as a separate function and generate a call to it.
 
-With the `global_asm!` macro, the assembly code is emitted in a global scope, outside a function. This can be used to hand-write entire functions using assembly code, and generally provides much more freedom to use arbitrary registers and assembler directives.
+With the `global_asm!` macro, the assembly code is emitted in a global scope, outside a function.
+This can be used to hand-write entire functions using assembly code, and generally provides much more freedom to use arbitrary registers and assembler directives.
 
 ## Template string arguments
 
-The assembler template uses the same syntax as [format strings][format-syntax] (i.e. placeholders are specified by curly braces). The corresponding arguments are accessed in order, by index, or by name. However, implicit named arguments (introduced by [RFC #2795][rfc-2795]) are not supported.
+The assembler template uses the same syntax as [format strings][format-syntax] (i.e. placeholders are specified by curly braces).
+The corresponding arguments are accessed in order, by index, or by name.
+However, implicit named arguments (introduced by [RFC #2795][rfc-2795]) are not supported.
 
-An `asm!` invocation may have one or more template string arguments; an `asm!` with multiple template string arguments is treated as if all the strings were concatenated with a `\n` between them. The expected usage is for each template string argument to correspond to a line of assembly code. All template string arguments must appear before any other arguments.
+An `asm!` invocation may have one or more template string arguments; an `asm!` with multiple template string arguments is treated as if all the strings were concatenated with a `\n` between them.
+The expected usage is for each template string argument to correspond to a line of assembly code.
+All template string arguments must appear before any other arguments.
 
-As with format strings, named arguments must appear after positional arguments. Explicit [register operands](#register-operands) must appear at the end of the operand list, after named arguments if any.
+As with format strings, named arguments must appear after positional arguments.
+Explicit [register operands](#register-operands) must appear at the end of the operand list, after named arguments if any.
 
-Explicit register operands cannot be used by placeholders in the template string. All other named and positional operands must appear at least once in the template string, otherwise a compiler error is generated.
+Explicit register operands cannot be used by placeholders in the template string.
+All other named and positional operands must appear at least once in the template string, otherwise a compiler error is generated.
 
 The exact assembly code syntax is target-specific and opaque to the compiler except for the way operands are substituted into the template string to form the code passed to the assembler.
 
-Currently, all supported targets follow the assembly code syntax used by LLVM's internal assembler which usually corresponds to that of the GNU assembler (GAS). On x86, the `.intel_syntax noprefix` mode of GAS is used by default. On ARM, the `.syntax unified` mode is used. These targets impose an additional restriction on the assembly code: any assembler state (e.g. the current section which can be changed with `.section`) must be restored to its original value at the end of the asm string. Assembly code that does not conform to the GAS syntax will result in assembler-specific behavior.
+Currently, all supported targets follow the assembly code syntax used by LLVM's internal assembler which usually corresponds to that of the GNU assembler (GAS).
+On x86, the `.intel_syntax noprefix` mode of GAS is used by default.
+On ARM, the `.syntax unified` mode is used.
+These targets impose an additional restriction on the assembly code: any assembler state (e.g. the current section which can be changed with `.section`) must be restored to its original value at the end of the asm string.
+Assembly code that does not conform to the GAS syntax will result in assembler-specific behavior.
 
 [format-syntax]: ../std/fmt/#syntax
 [rfc-2795]: https://github.com/rust-lang/rfcs/pull/2795
@@ -59,11 +72,13 @@ Currently, all supported targets follow the assembly code syntax used by LLVM's
 Several types of operands are supported:
 
 * `in(<reg>) <expr>`
-  - `<reg>` can refer to a register class or an explicit register. The allocated register name is substituted into the asm template string.
+  - `<reg>` can refer to a register class or an explicit register.
+    The allocated register name is substituted into the asm template string.
   - The allocated register will contain the value of `<expr>` at the start of the asm code.
   - The allocated register must contain the same value at the end of the asm code (except if a `lateout` is allocated to the same register).
 * `out(<reg>) <expr>`
-  - `<reg>` can refer to a register class or an explicit register. The allocated register name is substituted into the asm template string.
+  - `<reg>` can refer to a register class or an explicit register.
+    The allocated register name is substituted into the asm template string.
   - The allocated register will contain an undefined value at the start of the asm code.
   - `<expr>` must be a (possibly uninitialized) place expression, to which the contents of the allocated register is written to at the end of the asm code.
   - An underscore (`_`) may be specified instead of an expression, which will cause the contents of the register to be discarded at the end of the asm code (effectively acting as a clobber).
@@ -71,7 +86,8 @@ Several types of operands are supported:
   - Identical to `out` except that the register allocator can reuse a register allocated to an `in`.
   - You should only write to the register after all inputs are read, otherwise you may clobber an input.
 * `inout(<reg>) <expr>`
-  - `<reg>` can refer to a register class or an explicit register. The allocated register name is substituted into the asm template string.
+  - `<reg>` can refer to a register class or an explicit register.
+    The allocated register name is substituted into the asm template string.
   - The allocated register will contain the value of `<expr>` at the start of the asm code.
   - `<expr>` must be a mutable initialized place expression, to which the contents of the allocated register is written to at the end of the asm code.
 * `inout(<reg>) <in expr> => <out expr>`
@@ -83,22 +99,28 @@ Several types of operands are supported:
   - Identical to `inout` except that the register allocator can reuse a register allocated to an `in` (this can happen if the compiler knows the `in` has the same initial value as the `inlateout`).
   - You should only write to the register after all inputs are read, otherwise you may clobber an input.
 
-Operand expressions are evaluated from left to right, just like function call arguments. After the `asm!` has executed, outputs are written to in left to right order. This is significant if two outputs point to the same place: that place will contain the value of the rightmost output.
+Operand expressions are evaluated from left to right, just like function call arguments.
+After the `asm!` has executed, outputs are written to in left to right order.
+This is significant if two outputs point to the same place: that place will contain the value of the rightmost output.
 
 Since `global_asm!` exists outside a function, it cannot use input/output operands.
 
 ## Register operands
 
-Input and output operands can be specified either as an explicit register or as a register class from which the register allocator can select a register. Explicit registers are specified as string literals (e.g. `"eax"`) while register classes are specified as identifiers (e.g. `reg`).
+Input and output operands can be specified either as an explicit register or as a register class from which the register allocator can select a register.
+Explicit registers are specified as string literals (e.g. `"eax"`) while register classes are specified as identifiers (e.g. `reg`).
 
-Note that explicit registers treat register aliases (e.g. `r14` vs `lr` on ARM) and smaller views of a register (e.g. `eax` vs `rax`) as equivalent to the base register. It is a compile-time error to use the same explicit register for two input operands or two output operands. Additionally, it is also a compile-time error to use overlapping registers (e.g. ARM VFP) in input operands or in output operands.
+Note that explicit registers treat register aliases (e.g. `r14` vs `lr` on ARM) and smaller views of a register (e.g. `eax` vs `rax`) as equivalent to the base register.
+It is a compile-time error to use the same explicit register for two input operands or two output operands.
+It is also a compile-time error to use overlapping registers (e.g. ARM VFP) in input operands or in output operands.
 
 Only the following types are allowed as operands for inline assembly:
 - Integers (signed and unsigned)
 - Floating-point numbers
 - Pointers (thin only)
 - Function pointers
-- SIMD vectors (structs defined with `#[repr(simd)]` and which implement `Copy`). This includes architecture-specific vector types defined in `std::arch` such as `__m128` (x86) or `int8x16_t` (ARM).
+- SIMD vectors (structs defined with `#[repr(simd)]` and which implement `Copy`).
+  This includes architecture-specific vector types defined in `std::arch` such as `__m128` (x86) or `int8x16_t` (ARM).
 
 Here is the list of currently supported register classes:
 
@@ -140,7 +162,10 @@ Here is the list of currently supported register classes:
 >
 > - Some register classes are marked as "Only clobbers" which means that they cannot be used for inputs or outputs, only clobbers of the form `out("reg") _` or `lateout("reg") _`.
 
-Each register class has constraints on which value types they can be used with. This is necessary because the way a value is loaded into a register depends on its type. For example, on big-endian systems, loading a `i32x4` and a `i8x16` into a SIMD register may result in different register contents even if the byte-wise memory representation of both values is identical. The availability of supported types for a particular register class may depend on what target features are currently enabled.
+Each register class has constraints on which value types it can be used with.
+This is necessary because the way a value is loaded into a register depends on its type.
+For example, on big-endian systems, loading an `i32x4` and an `i8x16` into a SIMD register may result in different register contents even if the byte-wise memory representation of both values is identical.
+The availability of supported types for a particular register class may depend on what target features are currently enabled.
 
 | Architecture | Register class | Target feature | Allowed types |
 | ------------ | -------------- | -------------- | ------------- |
@@ -169,13 +194,18 @@ Each register class has constraints on which value types they can be used with.
 
 > **Note**: For the purposes of the above table pointers, function pointers and `isize`/`usize` are treated as the equivalent integer type (`i16`/`i32`/`i64` depending on the target).
 
-If a value is of a smaller size than the register it is allocated in then the upper bits of that register will have an undefined value for inputs and will be ignored for outputs. The only exception is the `freg` register class on RISC-V where `f32` values are NaN-boxed in a `f64` as required by the RISC-V architecture.
+If a value is of a smaller size than the register it is allocated in then the upper bits of that register will have an undefined value for inputs and will be ignored for outputs.
+The only exception is the `freg` register class on RISC-V where `f32` values are NaN-boxed in an `f64` as required by the RISC-V architecture.
 
-When separate input and output expressions are specified for an `inout` operand, both expressions must have the same type. The only exception is if both operands are pointers or integers, in which case they are only required to have the same size. This restriction exists because the register allocators in LLVM and GCC sometimes cannot handle tied operands with different types.
+When separate input and output expressions are specified for an `inout` operand, both expressions must have the same type.
+The only exception is if both operands are pointers or integers, in which case they are only required to have the same size.
+This restriction exists because the register allocators in LLVM and GCC sometimes cannot handle tied operands with different types.
 
 ## Register names
 
-Some registers have multiple names. These are all treated by the compiler as identical to the base register name. Here is the list of all supported register aliases:
+Some registers have multiple names.
+These are all treated by the compiler as identical to the base register name.
+Here is the list of all supported register aliases:
 
 | Architecture | Base register | Aliases |
 | ------------ | ------------- | ------- |
@@ -241,14 +271,18 @@ Some registers cannot be used for input or output operands:
 | RISC-V | `x0` | This is a constant zero register which can't be modified. |
 | RISC-V | `gp`, `tp` | These registers are reserved and cannot be used as inputs or outputs. |
 
-In some cases LLVM will allocate a "reserved register" for `reg` operands even though this register cannot be explicitly specified. Assembly code making use of reserved registers should be careful since `reg` operands may alias with those registers. Reserved registers are the frame pointer and base pointer
+In some cases LLVM will allocate a "reserved register" for `reg` operands even though this register cannot be explicitly specified.
+Assembly code making use of reserved registers should be careful since `reg` operands may alias with those registers.
+Reserved registers are the frame pointer and base pointer
 - The frame pointer and LLVM base pointer on all architectures.
 - `r9` on ARM.
 - `x18` on AArch64.
 
 ## Template modifiers
 
-The placeholders can be augmented by modifiers which are specified after the `:` in the curly braces. These modifiers do not affect register allocation, but change the way operands are formatted when inserted into the template string. Only one modifier is allowed per template placeholder.
+The placeholders can be augmented by modifiers which are specified after the `:` in the curly braces.
+These modifiers do not affect register allocation, but change the way operands are formatted when inserted into the template string.
+Only one modifier is allowed per template placeholder.
 
 The supported modifiers are a subset of LLVM's (and GCC's) [asm template argument modifiers][llvm-argmod], but do not use the same letter codes.
 
@@ -290,18 +324,24 @@ The supported modifiers are a subset of LLVM's (and GCC's) [asm template argumen
 
 > **Notes**:
 > - on ARM `e` / `f`: this prints the low or high doubleword register name of a NEON quad (128-bit) register.
-> - on x86: our behavior for `reg` with no modifiers differs from what GCC does. GCC will infer the modifier based on the operand value type, while we default to the full register size.
+> - on x86: our behavior for `reg` with no modifiers differs from what GCC does.
+>   GCC will infer the modifier based on the operand value type, while we default to the full register size.
 > - on x86 `xmm_reg`: the `x`, `t` and `g` LLVM modifiers are not yet implemented in LLVM (they are supported by GCC only), but this should be a simple change.
 
-As stated in the previous section, passing an input value smaller than the register width will result in the upper bits of the register containing undefined values. This is not a problem if the inline asm only accesses the lower bits of the register, which can be done by using a template modifier to use a subregister name in the asm code (e.g. `ax` instead of `rax`). Since this an easy pitfall, the compiler will suggest a template modifier to use where appropriate given the input type. If all references to an operand already have modifiers then the warning is suppressed for that operand.
+As stated in the previous section, passing an input value smaller than the register width will result in the upper bits of the register containing undefined values.
+This is not a problem if the inline asm only accesses the lower bits of the register, which can be done by using a template modifier to use a subregister name in the asm code (e.g. `ax` instead of `rax`).
+Since this is an easy pitfall, the compiler will suggest a template modifier to use where appropriate given the input type.
+If all references to an operand already have modifiers then the warning is suppressed for that operand.
 
 [llvm-argmod]: http://llvm.org/docs/LangRef.html#asm-template-argument-modifiers
 
 ## ABI clobbers
 
-The `clobber_abi` keyword can be used to apply a default set of clobbers to an `asm!` block. This will automatically insert the necessary clobber constraints as needed for calling a function with a particular calling convention: if the calling convention does not fully preserve the value of a register across a call then a `lateout("reg") _` is implicitly added to the operands list.
+The `clobber_abi` keyword can be used to apply a default set of clobbers to an `asm!` block.
+This will automatically insert the necessary clobber constraints as needed for calling a function with a particular calling convention: if the calling convention does not fully preserve the value of a register across a call then a `lateout("reg") _` is implicitly added to the operands list.
 
-Generic register class outputs are disallowed by the compiler when `clobber_abi` is used: all outputs must specify an explicit register. Explicit register outputs have precedence over the implicit clobbers inserted by `clobber_abi`: a clobber will only be inserted for a register if that register is not used as an output.
+Generic register class outputs are disallowed by the compiler when `clobber_abi` is used: all outputs must specify an explicit register.
+Explicit register outputs have precedence over the implicit clobbers inserted by `clobber_abi`: a clobber will only be inserted for a register if that register is not used as an output.
 The following ABIs can be used with `clobber_abi`:
 
 | Architecture | ABI name | Clobbered registers |
@@ -319,14 +359,23 @@ The list of clobbered registers for each ABI is updated in rustc as architecture
 
 Flags are used to further influence the behavior of the inline assembly block.
 Currently the following options are defined:
-- `pure`: The `asm!` block has no side effects, and its outputs depend only on its direct inputs (i.e. the values themselves, not what they point to) or values read from memory (unless the `nomem` options is also set). This allows the compiler to execute the `asm!` block fewer times than specified in the program (e.g. by hoisting it out of a loop) or even eliminate it entirely if the outputs are not used.
-- `nomem`: The `asm!` blocks does not read or write to any memory. This allows the compiler to cache the values of modified global variables in registers across the `asm!` block since it knows that they are not read or written to by the `asm!`.
-- `readonly`: The `asm!` block does not write to any memory. This allows the compiler to cache the values of unmodified global variables in registers across the `asm!` block since it knows that they are not written to by the `asm!`.
-- `preserves_flags`: The `asm!` block does not modify the flags register (defined in the rules below). This allows the compiler to avoid recomputing the condition flags after the `asm!` block.
-- `noreturn`: The `asm!` block never returns, and its return type is defined as `!` (never). Behavior is undefined if execution falls through past the end of the asm code. A `noreturn` asm block behaves just like a function which doesn't return; notably, local variables in scope are not dropped before it is invoked.
-- `nostack`: The `asm!` block does not push data to the stack, or write to the stack red-zone (if supported by the target). If this option is *not* used then the stack pointer is guaranteed to be suitably aligned (according to the target ABI) for a function call.
-- `att_syntax`: This option is only valid on x86, and causes the assembler to use the `.att_syntax prefix` mode of the GNU assembler. Register operands are substituted in with a leading `%`.
-- `raw`: This causes the template string to be parsed as a raw assembly string, with no special handling for `{` and `}`. This is primarily useful when including raw assembly code from an external file using `include_str!`.
+- `pure`: The `asm!` block has no side effects, and its outputs depend only on its direct inputs (i.e. the values themselves, not what they point to) or values read from memory (unless the `nomem` option is also set).
+  This allows the compiler to execute the `asm!` block fewer times than specified in the program (e.g. by hoisting it out of a loop) or even eliminate it entirely if the outputs are not used.
+- `nomem`: The `asm!` block does not read from or write to any memory.
+  This allows the compiler to cache the values of modified global variables in registers across the `asm!` block since it knows that they are not read or written to by the `asm!`.
+- `readonly`: The `asm!` block does not write to any memory.
+  This allows the compiler to cache the values of unmodified global variables in registers across the `asm!` block since it knows that they are not written to by the `asm!`.
+- `preserves_flags`: The `asm!` block does not modify the flags register (defined in the rules below).
+  This allows the compiler to avoid recomputing the condition flags after the `asm!` block.
+- `noreturn`: The `asm!` block never returns, and its return type is defined as `!` (never).
+  Behavior is undefined if execution falls through past the end of the asm code.
+  A `noreturn` asm block behaves just like a function which doesn't return; notably, local variables in scope are not dropped before it is invoked.
+- `nostack`: The `asm!` block does not push data to the stack, or write to the stack red-zone (if supported by the target).
+  If this option is *not* used then the stack pointer is guaranteed to be suitably aligned (according to the target ABI) for a function call.
+- `att_syntax`: This option is only valid on x86, and causes the assembler to use the `.att_syntax prefix` mode of the GNU assembler.
+  Register operands are substituted in with a leading `%`.
+- `raw`: This causes the template string to be parsed as a raw assembly string, with no special handling for `{` and `}`.
+  This is primarily useful when including raw assembly code from an external file using `include_str!`.
 
 The compiler performs some additional checks on options:
 - The `nomem` and `readonly` options are mutually exclusive: it is a compile-time error to specify both.
@@ -334,17 +383,21 @@ The compiler performs some additional checks on options:
 - It is a compile-time error to specify `pure` on an asm block with no outputs or only discarded outputs (`_`).
 - It is a compile-time error to specify `noreturn` on an asm block with outputs.
 
-`global_asm!` only supports the `att_syntax` and `raw` options. The remaining options are not meaningful for global-scope inline assembly
+`global_asm!` only supports the `att_syntax` and `raw` options.
+The remaining options are not meaningful for global-scope inline assembly.
 
 ## Rules for inline assembly
 
 To avoid undefined behavior, these rules must be followed when using function-scope inline assembly (`asm!`):
 
 - Any registers not specified as inputs will contain an undefined value on entry to the asm block.
-  - An "undefined value" in the context of inline assembly means that the register can (non-deterministically) have any one of the possible values allowed by the architecture. Notably it is not the same as an LLVM `undef` which can have a different value every time you read it (since such a concept does not exist in assembly code).
+  - An "undefined value" in the context of inline assembly means that the register can (non-deterministically) have any one of the possible values allowed by the architecture.
+    Notably it is not the same as an LLVM `undef` which can have a different value every time you read it (since such a concept does not exist in assembly code).
 - Any registers not specified as outputs must have the same value upon exiting the asm block as they had on entry, otherwise behavior is undefined.
-  - This only applies to registers which can be specified as an input or output. Other registers follow target-specific rules.
-  - Note that a `lateout` may be allocated to the same register as an `in`, in which case this rule does not apply. Code should not rely on this however since it depends on the results of register allocation.
+  - This only applies to registers which can be specified as an input or output.
+    Other registers follow target-specific rules.
+  - Note that a `lateout` may be allocated to the same register as an `in`, in which case this rule does not apply.
+    Code should not rely on this, however, since it depends on the results of register allocation.
 - Behavior is undefined if execution unwinds out of an asm block.
   - This also applies if the assembly code calls a function which then unwinds.
 - The set of memory locations that assembly code is allowed to read and write are the same as those allowed for an FFI function.
@@ -361,7 +414,8 @@ To avoid undefined behavior, these rules must be followed when using function-sc
   - You should adjust the stack pointer when allocating stack memory as required by the target ABI.
   - The stack pointer must be restored to its original value before leaving the asm block.
 - If the `noreturn` option is set then behavior is undefined if execution falls through to the end of the asm block.
-- If the `pure` option is set then behavior is undefined if the `asm!` has side-effects other than its direct outputs. Behavior is also undefined if two executions of the `asm!` code with the same inputs result in different outputs.
+- If the `pure` option is set then behavior is undefined if the `asm!` has side effects other than its direct outputs.
+  Behavior is also undefined if two executions of the `asm!` code with the same inputs result in different outputs.
   - When used with the `nomem` option, "inputs" are just the direct inputs of the `asm!`.
   - When used with the `readonly` option, "inputs" comprise the direct inputs of the `asm!` and any memory that the `asm!` block is allowed to read.
 - These flags registers must be restored upon exiting the asm block if the `preserves_flags` option is set:
@@ -387,9 +441,11 @@ To avoid undefined behavior, these rules must be followed when using function-sc
 - The requirement of restoring the stack pointer and non-output registers to their original value only applies when exiting an `asm!` block.
   - This means that `asm!` blocks that never return (even if not marked `noreturn`) don't need to preserve these registers.
   - When returning to a different `asm!` block than you entered (e.g. for context switching), these registers must contain the value they had upon entering the `asm!` block that you are *exiting*.
-    - You cannot exit an `asm!` block that has not been entered. Neither can you exit an `asm!` block that has already been exited.
+    - You cannot exit an `asm!` block that has not been entered.
+      Neither can you exit an `asm!` block that has already been exited.
     - You are responsible for switching any target-specific state (e.g. thread-local storage, stack bounds).
     - The set of memory locations that you may access is the intersection of those allowed by the `asm!` blocks you entered and exited.
-- You cannot assume that an `asm!` block will appear exactly once in the output binary. The compiler is allowed to instantiate multiple copies of the `asm!` block, for example when the function containing it is inlined in multiple places.
+- You cannot assume that an `asm!` block will appear exactly once in the output binary.
+  The compiler is allowed to instantiate multiple copies of the `asm!` block, for example when the function containing it is inlined in multiple places.
 
 > **Note**: As a general rule, the flags covered by `preserves_flags` are those which are *not* preserved when performing a function call.

From 2c969f4defba395fde94d4db5e0149817699e9df Mon Sep 17 00:00:00 2001
From: Amanieu d'Antras <amanieu@gmail.com>
Date: Thu, 2 Dec 2021 00:22:14 +0000
Subject: [PATCH 05/13] Add example of inline assembly

---
 src/inline-assembly.md | 35 ++++++++++++++++++++++++++++-------
 1 file changed, 28 insertions(+), 7 deletions(-)
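
Note for reviewers: the example added by this patch uses x86 instructions, so it only assembles on x86 targets. The following is a minimal sketch of a target-gated variant that can be compiled and checked on its own. The function name `mul6` and the non-x86-64 fallback are illustrative assumptions and are not part of the patch.

```rust
#[cfg(target_arch = "x86_64")]
use std::arch::asm;

/// Multiply `x` by 6 using shifts and adds.
/// `mul6` is a hypothetical helper name, not something the patch defines.
#[cfg(target_arch = "x86_64")]
fn mul6(mut x: u64) -> u64 {
    unsafe {
        asm!(
            "mov {tmp}, {x}", // tmp = x
            "shl {tmp}, 1",   // tmp = 2 * x
            "shl {x}, 2",     // x = 4 * x
            "add {x}, {tmp}", // x = 4 * x + 2 * x
            x = inout(reg) x,
            // Scratch register: allocated by the compiler, value discarded.
            tmp = out(reg) _,
        );
    }
    x
}

/// Assumed fallback so that the sketch also builds on other targets.
#[cfg(not(target_arch = "x86_64"))]
fn mul6(x: u64) -> u64 {
    x.wrapping_mul(6)
}
```

Calling `mul6(4)` corresponds to the `assert_eq!(x, 4 * 6)` check in the example added below.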

diff --git a/src/inline-assembly.md b/src/inline-assembly.md
index e8c25eb15..4bca8eb70 100644
--- a/src/inline-assembly.md
+++ b/src/inline-assembly.md
@@ -6,6 +6,34 @@ It can be used to embed handwritten assembly in the assembly output generated by
 [`asm!`]: ../core/arch/macro.asm.html
 [`global_asm!`]: ../core/arch/macro.global_asm.html
 
+Support for inline assembly is stable on the following architectures:
+- x86 and x86-64
+- ARM
+- AArch64
+- RISC-V
+
+The compiler will emit an error if `asm!` is used on an unsupported target.
+
+## Example
+
+```rust
+// Multiply x by 6 using shifts and adds
+let mut x: u64 = 4;
+unsafe {
+    asm!(
+        "mov {tmp}, {x}",
+        "shl {tmp}, 1",
+        "shl {x}, 2",
+        "add {x}, {tmp}",
+        x = inout(reg) x,
+        tmp = out(reg) _,
+    );
+}
+assert_eq!(x, 4 * 6);
+```
+
+## Syntax
+
 The following ABNF specifies the general syntax:
 
 ```text
@@ -21,13 +49,6 @@ asm := "asm!(" format_string *("," format_string) *("," [ident "="] operand) *("
 global_asm := "global_asm!(" format_string *("," format_string) *("," [ident "="] operand) *("," options) [","] ")"
 ```
 
-Support for inline assembly is stable on the following architectures:
-- x86 and x86-64
-- ARM
-- AArch64
-- RISC-V
-
-The compiler will emit an error if `asm!` is used on an unsupported target.
 
 ## Scope
 

From cb057399cf9a92cd9939f850eae6ca61ef9c5a9d Mon Sep 17 00:00:00 2001
From: Amanieu d'Antras <amanieu@gmail.com>
Date: Tue, 7 Dec 2021 15:17:01 +0000
Subject: [PATCH 06/13] Fix minor typos

Taken from https://github.com/rust-lang/rust/pull/90086
---
 src/inline-assembly.md | 9 +++++----
 1 file changed, 5 insertions(+), 4 deletions(-)

diff --git a/src/inline-assembly.md b/src/inline-assembly.md
index 4bca8eb70..505741a55 100644
--- a/src/inline-assembly.md
+++ b/src/inline-assembly.md
@@ -101,7 +101,7 @@ Several types of operands are supported:
   - `<reg>` can refer to a register class or an explicit register.
     The allocated register name is substituted into the asm template string.
   - The allocated register will contain an undefined value at the start of the asm code.
-  - `<expr>` must be a (possibly uninitialized) place expression, to which the contents of the allocated register is written to at the end of the asm code.
+  - `<expr>` must be a (possibly uninitialized) place expression, to which the contents of the allocated register are written at the end of the asm code.
   - An underscore (`_`) may be specified instead of an expression, which will cause the contents of the register to be discarded at the end of the asm code (effectively acting as a clobber).
 * `lateout(<reg>) <expr>`
   - Identical to `out` except that the register allocator can reuse a register allocated to an `in`.
@@ -110,10 +110,10 @@ Several types of operands are supported:
   - `<reg>` can refer to a register class or an explicit register.
     The allocated register name is substituted into the asm template string.
   - The allocated register will contain the value of `<expr>` at the start of the asm code.
-  - `<expr>` must be a mutable initialized place expression, to which the contents of the allocated register is written to at the end of the asm code.
+  - `<expr>` must be a mutable initialized place expression, to which the contents of the allocated register are written at the end of the asm code.
 * `inout(<reg>) <in expr> => <out expr>`
   - Same as `inout` except that the initial value of the register is taken from the value of `<in expr>`.
-  - `<out expr>` must be a (possibly uninitialized) place expression, to which the contents of the allocated register is written to at the end of the asm code.
+  - `<out expr>` must be a (possibly uninitialized) place expression, to which the contents of the allocated register are written at the end of the asm code.
   - An underscore (`_`) may be specified instead of an expression for `<out expr>`, which will cause the contents of the register to be discarded at the end of the asm code (effectively acting as a clobber).
   - `<in expr>` and `<out expr>` may have different types.
 * `inlateout(<reg>) <expr>` / `inlateout(<reg>) <in expr> => <out expr>`
@@ -294,7 +294,8 @@ Some registers cannot be used for input or output operands:
 
 In some cases LLVM will allocate a "reserved register" for `reg` operands even though this register cannot be explicitly specified.
 Assembly code making use of reserved registers should be careful since `reg` operands may alias with those registers.
-Reserved registers are the frame pointer and base pointer
+
+These reserved registers are:
 - The frame pointer and LLVM base pointer on all architectures.
 - `r9` on ARM.
 - `x18` on AArch64.

From 1b09eb29d7863bfe04f685be202a54f85c622992 Mon Sep 17 00:00:00 2001
From: Amanieu d'Antras <amanieu@gmail.com>
Date: Thu, 9 Dec 2021 14:17:58 +0000
Subject: [PATCH 07/13] Sync with latest changes to the unstable book

---
 src/inline-assembly.md | 27 ++++++++++++++-------------
 1 file changed, 14 insertions(+), 13 deletions(-)
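
Note for reviewers: the `clobber_abi` wording touched by this patch may be easier to follow with a concrete call. The following is a minimal sketch, assuming an x86-64 target; the names `double` and `call_double`, the choice of the `"sysv64"` ABI, and the fallback for other targets are illustrative assumptions and are not part of the patch.

```rust
#[cfg(target_arch = "x86_64")]
use std::arch::asm;

/// Hypothetical callee that uses the System V x86-64 calling convention.
#[cfg(target_arch = "x86_64")]
extern "sysv64" fn double(arg: i32) -> i32 {
    arg.wrapping_mul(2)
}

/// Call `double` from inline assembly.
#[cfg(target_arch = "x86_64")]
fn call_double(arg: i32) -> i32 {
    let result: i32;
    unsafe {
        asm!(
            "call {f}",
            // Function pointer to call, in any general-purpose register.
            f = in(reg) double as extern "sysv64" fn(i32) -> i32,
            // First integer argument is passed in rdi under sysv64.
            in("rdi") arg,
            // Return value comes back in rax.
            out("rax") result,
            // Every register the callee may overwrite is marked as clobbered.
            clobber_abi("sysv64"),
        );
    }
    result
}

/// Assumed fallback so that the sketch also builds on other targets.
#[cfg(not(target_arch = "x86_64"))]
fn call_double(arg: i32) -> i32 {
    arg.wrapping_mul(2)
}
```

All outputs here name explicit registers, which matches the rule that generic register class outputs are disallowed together with `clobber_abi`. The `nostack` option is deliberately not set, so the stack pointer is suitably aligned for the call.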

diff --git a/src/inline-assembly.md b/src/inline-assembly.md
index 505741a55..0ca956313 100644
--- a/src/inline-assembly.md
+++ b/src/inline-assembly.md
@@ -42,9 +42,9 @@ reg_spec := <register class> / "<explicit register>"
 operand_expr := expr / "_" / expr "=>" expr / expr "=>" "_"
 reg_operand := dir_spec "(" reg_spec ")" operand_expr
 operand := reg_operand
-clobber_abi := "clobber_abi(" <abi> *["," <abi>] [","] ")"
+clobber_abi := "clobber_abi(" <abi> *("," <abi>) [","] ")"
 option := "pure" / "nomem" / "readonly" / "preserves_flags" / "noreturn" / "nostack" / "att_syntax" / "raw"
-options := "options(" option *["," option] [","] ")"
+options := "options(" option *("," option) [","] ")"
 asm := "asm!(" format_string *("," format_string) *("," [ident "="] operand) *("," clobber_abi) *("," options) [","] ")"
 global_asm := "global_asm!(" format_string *("," format_string) *("," [ident "="] operand) *("," options) [","] ")"
 ```
@@ -161,9 +161,8 @@ Here is the list of currently supported register classes:
 | AArch64 | `vreg` | `v[0-31]` | `w` |
 | AArch64 | `vreg_low16` | `v[0-15]` | `x` |
 | AArch64 | `preg` | `p[0-15]`, `ffr` | Only clobbers |
-| ARM | `reg` | `r[0-12]`, `r14` | `r` |
-| ARM (Thumb) | `reg_thumb` | `r[0-r7]` | `l` |
-| ARM (ARM) | `reg_thumb` | `r[0-r12]`, `r14` | `l` |
+| ARM (ARM/Thumb2) | `reg` | `r[0-12]`, `r14` | `r` |
+| ARM (Thumb1) | `reg` | `r[0-7]` | `r` |
 | ARM | `sreg` | `s[0-31]` | `t` |
 | ARM | `sreg_low16` | `s[0-15]` | `x` |
 | ARM | `dreg` | `d[0-31]` | `w` |
@@ -284,9 +283,8 @@ Some registers cannot be used for input or output operands:
 | All | `si` (x86-32), `bx` (x86-64), `r6` (ARM), `x19` (AArch64), `x9` (RISC-V) | This is used internally by LLVM as a "base pointer" for functions with complex stack frames. |
 | x86 | `k0` | This is a constant zero register which can't be modified. |
 | x86 | `ip` | This is the program counter, not a real register. |
-| x86 | `mm[0-7]` | MMX registers are not currently supported (but may be in the future). |
-| x86 | `st([0-7])` | x87 registers are not currently supported (but may be in the future). |
 | AArch64 | `xzr` | This is a constant zero register which can't be modified. |
+| AArch64 | `x18` | This is a reserved register on some AArch64 targets. |
 | ARM | `pc` | This is the program counter, not a real register. |
 | ARM | `r9` | This is a reserved register on some ARM targets. |
 | RISC-V | `x0` | This is a constant zero register which can't be modified. |
@@ -294,11 +292,7 @@ Some registers cannot be used for input or output operands:
 
 In some cases LLVM will allocate a "reserved register" for `reg` operands even though this register cannot be explicitly specified.
 Assembly code making use of reserved registers should be careful since `reg` operands may alias with those registers.
-
-These reserved registers are:
-- The frame pointer and LLVM base pointer on all architectures.
-- `r9` on ARM.
-- `x18` on AArch64.
+Reserved registers that can sometimes be allocated are the frame pointer and base pointer in the list above.
 
 ## Template modifiers
 
@@ -362,6 +356,8 @@ If all references to an operand already have modifiers then the warning is suppr
 The `clobber_abi` keyword can be used to apply a default set of clobbers to an `asm!` block.
 This will automatically insert the necessary clobber constraints as needed for calling a function with a particular calling convention: if the calling convention does not fully preserve the value of a register across a call then a `lateout("reg") _` is implicitly added to the operands list.
 
+`clobber_abi` may be specified any number of times. It will insert a clobber for all unique registers in the union of all specified calling conventions.
+
 Generic register class outputs are disallowed by the compiler when `clobber_abi` is used: all outputs must specify an explicit register.
 Explicit register outputs have precedence over the implicit clobbers inserted by `clobber_abi`: a clobber will only be inserted for a register if that register is not used as an output.
 The following ABIs can be used with `clobber_abi`:
@@ -371,10 +367,13 @@ The following ABIs can be used with `clobber_abi`:
 | x86-32 | `"C"`, `"system"`, `"efiapi"`, `"cdecl"`, `"stdcall"`, `"fastcall"` | `ax`, `cx`, `dx`, `xmm[0-7]`, `mm[0-7]`, `k[1-7]`, `st([0-7])` |
 | x86-64 | `"C"`, `"system"` (on Windows), `"efiapi"`, `"win64"` | `ax`, `cx`, `dx`, `r[8-11]`, `xmm[0-31]`, `mm[0-7]`, `k[1-7]`, `st([0-7])` |
 | x86-64 | `"C"`, `"system"` (on non-Windows), `"sysv64"` | `ax`, `cx`, `dx`, `si`, `di`, `r[8-11]`, `xmm[0-31]`, `mm[0-7]`, `k[1-7]`, `st([0-7])` |
-| AArch64 | `"C"`, `"system"`, `"efiapi"` | `x[0-17]`, `x30`, `v[0-31]`, `p[0-15]`, `ffr` |
+| AArch64 | `"C"`, `"system"`, `"efiapi"` | `x[0-17]`, `x18`\*, `x30`, `v[0-31]`, `p[0-15]`, `ffr` |
 | ARM | `"C"`, `"system"`, `"efiapi"`, `"aapcs"` | `r[0-3]`, `r12`, `r14`, `s[0-15]`, `d[0-7]`, `d[16-31]` |
 | RISC-V | `"C"`, `"system"`, `"efiapi"` | `x1`, `x[5-7]`, `x[10-17]`, `x[28-31]`, `f[0-7]`, `f[10-17]`, `f[28-31]`, `v[0-31]` |
 
+> Notes:
+> - On AArch64, `x18` is only included in the clobber list if it is not considered a reserved register on the target.
+
 The list of clobbered registers for each ABI is updated in rustc as architectures gain new registers: this ensures that `asm!` clobbers will continue to be correct when LLVM starts using these new registers in its generated code.
 
 ## Options
@@ -469,5 +468,7 @@ To avoid undefined behavior, these rules must be followed when using function-sc
     - The set of memory locations that you may access is the intersection of those allowed by the `asm!` blocks you entered and exited.
 - You cannot assume that an `asm!` block will appear exactly once in the output binary.
   The compiler is allowed to instantiate multiple copies of the `asm!` block, for example when the function containing it is inlined in multiple places.
+- On x86, inline assembly must not end with an instruction prefix (such as `LOCK`) that would apply to instructions generated by the compiler.
+  - The compiler is currently unable to detect this due to the way inline assembly is compiled, but may catch and reject this in the future.
 
 > **Note**: As a general rule, the flags covered by `preserves_flags` are those which are *not* preserved when performing a function call.

From 88f35dd05b84b5a97a13d1b129535b311c05610f Mon Sep 17 00:00:00 2001
From: Amanieu d'Antras <amanieu@gmail.com>
Date: Fri, 10 Dec 2021 01:00:09 +0000
Subject: [PATCH 08/13] Update wording for reserved registers

---
 src/inline-assembly.md | 8 +++-----
 1 file changed, 3 insertions(+), 5 deletions(-)

diff --git a/src/inline-assembly.md b/src/inline-assembly.md
index 0ca956313..6f607f43e 100644
--- a/src/inline-assembly.md
+++ b/src/inline-assembly.md
@@ -284,15 +284,13 @@ Some registers cannot be used for input or output operands:
 | x86 | `k0` | This is a constant zero register which can't be modified. |
 | x86 | `ip` | This is the program counter, not a real register. |
 | AArch64 | `xzr` | This is a constant zero register which can't be modified. |
-| AArch64 | `x18` | This is a reserved register on some AArch64 targets. |
+| AArch64 | `x18` | This is an OS-reserved register on some AArch64 targets. |
 | ARM | `pc` | This is the program counter, not a real register. |
-| ARM | `r9` | This is a reserved register on some ARM targets. |
+| ARM | `r9` | This is an OS-reserved register on some ARM targets. |
 | RISC-V | `x0` | This is a constant zero register which can't be modified. |
 | RISC-V | `gp`, `tp` | These registers are reserved and cannot be used as inputs or outputs. |
 
-In some cases LLVM will allocate a "reserved register" for `reg` operands even though this register cannot be explicitly specified.
-Assembly code making use of reserved registers should be careful since `reg` operands may alias with those registers.
-Reserved registers that can sometimes be allocated are the frame pointer and base pointer in the list above.
+The frame pointer and base pointer registers are reserved for internal use by LLVM. While `asm!` statements cannot explicitly specify the use of reserved registers, in some cases LLVM will allocate one of these reserved registers for `reg` operands. Assembly code making use of reserved registers should be careful since `reg` operands may use the same registers.
 
 ## Template modifiers
 

From 1ce1f0ec66522ccbd7af61c43f72e9fd390e58e9 Mon Sep 17 00:00:00 2001
From: Amanieu d'Antras <amanieu@gmail.com>
Date: Thu, 16 Dec 2021 05:11:45 +0000
Subject: [PATCH 09/13] Fix example

---
 src/inline-assembly.md | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/src/inline-assembly.md b/src/inline-assembly.md
index 6f607f43e..2de1b3b5e 100644
--- a/src/inline-assembly.md
+++ b/src/inline-assembly.md
@@ -17,6 +17,8 @@ The compiler will emit an error if `asm!` is used on an unsupported target.
 ## Example
 
 ```rust
+use std::arch::asm;
+
 // Multiply x by 6 using shifts and adds
 let mut x: u64 = 4;
 unsafe {

From cf3a28145e06a3294494b5ac2ac4beef9f2e52e0 Mon Sep 17 00:00:00 2001
From: Amanieu d'Antras <amanieu@gmail.com>
Date: Thu, 16 Dec 2021 05:15:26 +0000
Subject: [PATCH 10/13] Satisfy linkchecker

---
 src/inline-assembly.md | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/src/inline-assembly.md b/src/inline-assembly.md
index 2de1b3b5e..d71efd606 100644
--- a/src/inline-assembly.md
+++ b/src/inline-assembly.md
@@ -87,7 +87,7 @@ On ARM, the `.syntax unified` mode is used.
 These targets impose an additional restriction on the assembly code: any assembler state (e.g. the current section which can be changed with `.section`) must be restored to its original value at the end of the asm string.
 Assembly code that does not conform to the GAS syntax will result in assembler-specific behavior.
 
-[format-syntax]: ../std/fmt/#syntax
+[format-syntax]: ../std/fmt/index.html#syntax
 [rfc-2795]: https://github.com/rust-lang/rfcs/pull/2795
 
 ## Operand type

From b3d0eb23fb0538171f317eade10700c47ab390ab Mon Sep 17 00:00:00 2001
From: Amanieu d'Antras <amanieu@gmail.com>
Date: Wed, 29 Dec 2021 22:29:14 +0100
Subject: [PATCH 11/13] Apply suggestions from code review

Co-authored-by: Josh Triplett <josh@joshtriplett.org>
---
 src/inline-assembly.md | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/src/inline-assembly.md b/src/inline-assembly.md
index d71efd606..b4762297e 100644
--- a/src/inline-assembly.md
+++ b/src/inline-assembly.md
@@ -463,9 +463,11 @@ To avoid undefined behavior, these rules must be followed when using function-sc
   - This means that `asm!` blocks that never return (even if not marked `noreturn`) don't need to preserve these registers.
   - When returning to a different `asm!` block than you entered (e.g. for context switching), these registers must contain the value they had upon entering the `asm!` block that you are *exiting*.
     - You cannot exit an `asm!` block that has not been entered.
-      Neither can you exit an `asm!` block that has already been exited.
+      Neither can you exit an `asm!` block that has already been exited (without first entering it again).
     - You are responsible for switching any target-specific state (e.g. thread-local storage, stack bounds).
+    - You cannot jump from an address in one `asm!` block to an address in another, even within the same function or block, without treating their contexts as potentially different and requiring context switching. You cannot assume that any particular value in those contexts (e.g. current stack pointer or temporary values below the stack pointer) will remain unchanged between the two `asm!` blocks.
     - The set of memory locations that you may access is the intersection of those allowed by the `asm!` blocks you entered and exited.
+- You cannot assume that two `asm!` blocks adjacent in source code, even without any other code between them, will end up at successive addresses in the binary without any other instructions between them.
 - You cannot assume that an `asm!` block will appear exactly once in the output binary.
   The compiler is allowed to instantiate multiple copies of the `asm!` block, for example when the function containing it is inlined in multiple places.
 - On x86, inline assembly must not end with an instruction prefix (such as `LOCK`) that would apply to instructions generated by the compiler.

From ac5f793204b3c29d983a26e496beb2902983eb21 Mon Sep 17 00:00:00 2001
From: Amanieu d'Antras <amanieu@gmail.com>
Date: Wed, 29 Dec 2021 21:30:28 +0000
Subject: [PATCH 12/13] Add format_string to asm! grammar

---
 src/inline-assembly.md | 1 +
 1 file changed, 1 insertion(+)

diff --git a/src/inline-assembly.md b/src/inline-assembly.md
index b4762297e..b95f5e003 100644
--- a/src/inline-assembly.md
+++ b/src/inline-assembly.md
@@ -39,6 +39,7 @@ assert_eq!(x, 4 * 6);
 The following ABNF specifies the general syntax:
 
 ```text
+format_string := STRING_LITERAL / RAW_STRING_LITERAL
 dir_spec := "in" / "out" / "lateout" / "inout" / "inlateout"
 reg_spec := <register class> / "<explicit register>"
 operand_expr := expr / "_" / expr "=>" expr / expr "=>" "_"

From 9dc1c167edb3f4ea92415be8703044904e91392f Mon Sep 17 00:00:00 2001
From: Amanieu d'Antras <amanieu@gmail.com>
Date: Wed, 29 Dec 2021 21:33:10 +0000
Subject: [PATCH 13/13] Add inline assembly to the list of possible undefined
 behavior

---
 src/behavior-considered-undefined.md | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/src/behavior-considered-undefined.md b/src/behavior-considered-undefined.md
index 5af4a4bef..c2e6fc0cb 100644
--- a/src/behavior-considered-undefined.md
+++ b/src/behavior-considered-undefined.md
@@ -58,6 +58,8 @@ code.
 
     > **Note**: `rustc` achieves this with the unstable
     > `rustc_layout_scalar_valid_range_*` attributes.
+* Incorrect use of inline assembly. For more details, refer to the [rules] to
+  follow when writing code that uses inline assembly.
 
 **Note:** Uninitialized memory is also implicitly invalid for any type that has
 a restricted set of valid values. In other words, the only cases in which
@@ -94,3 +96,4 @@ cannot be bigger than `isize::MAX` bytes.
 [`NonZero*`]: ../core/num/index.html
 [dereference expression]: expressions/operator-expr.md#the-dereference-operator
 [place expression context]: expressions.md#place-expressions-and-value-expressions
+[rules]: inline-assembly.md#rules-for-inline-assembly