Don't pass the state and current process as arguments #617

yorickpeterse · 2023-10-16T22:03:56Z

For every compiled method, the first two arguments are the runtime state and the current process. This means that fn foo(a: Int) translates to essentially fn foo(state: Pointer[UInt8], process: Pointer[UInt8], a: Int).

This approach isn't great, as we're wasting up to two registers to pass this data around, and in many cases the data likely isn't used much.

To optimize this, I'm thinking of the following:

The state is the same for all methods and processes, so we can generate a global variable and store it in there. The runtime functions still take an explicit state argument, so the runtime doesn't need to depend on the global variable generated by the compiler.
The process could be stored as the last value at the end of the stack that we grow towards, with the usable stack range adjusted so we never allocate into that data. This way we can obtain the process easily. I'm not sure though how feasible/cross-platform this is.
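To make the first idea concrete, here is a minimal sketch of a state pointer that is written once at startup and read from anywhere afterwards. The `State` struct, its `cores` field, and the function names are all hypothetical placeholders, not the actual runtime API; the runtime routine still takes the state explicitly, as described above.

```rust
use std::ptr;
use std::sync::atomic::{AtomicPtr, Ordering};

/// Hypothetical stand-in for the runtime state.
pub struct State {
    pub cores: usize,
}

/// The global slot the generated code would read instead of receiving the
/// state as a hidden first argument.
static STATE: AtomicPtr<State> = AtomicPtr::new(ptr::null_mut());

/// Stores the state once, when the program starts up.
pub fn set_state(state: &'static State) {
    STATE.store(state as *const State as *mut State, Ordering::Release);
}

/// Returns the state, or None if it hasn't been set yet.
pub fn state() -> Option<&'static State> {
    let ptr = STATE.load(Ordering::Acquire);

    // SAFETY: the pointer is either null or derived from a `&'static State`.
    unsafe { ptr.as_ref() }
}

/// A hypothetical runtime routine: it takes the state as an explicit
/// argument, so it doesn't depend on the global.
pub fn runtime_cores(state: &State) -> usize {
    state.cores
}
```

Generated code would then call something like `runtime_cores(state().unwrap())` rather than threading the state through every method.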


yorickpeterse · 2023-10-16T22:13:26Z

Using external thread-local variables in Rust requires nightly, and probably will continue to require this for a long time: rust-lang/rust#29594

yorickpeterse · 2023-10-16T22:48:10Z

A tricky thing about using the stack is that LLVM doesn't seem to provide any intrinsics for obtaining any kind of stack information. This means we'd have to use raw assembly somehow to get the data from the stack.
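For reference, on the Rust side reading the stack pointer with raw assembly could look roughly like the sketch below. It only covers x86-64 and AArch64, and the `stack_pointer` name is made up for illustration; the compiler would have to emit the equivalent inline assembly through LLVM itself.

```rust
use std::arch::asm;

#[cfg(not(any(target_arch = "x86_64", target_arch = "aarch64")))]
compile_error!("this sketch only covers x86-64 and AArch64");

/// Returns the current value of the stack pointer register (x86-64).
#[cfg(target_arch = "x86_64")]
#[inline(always)]
pub fn stack_pointer() -> usize {
    let sp: usize;

    // Copy RSP into a general purpose register picked by the compiler.
    unsafe {
        asm!("mov {}, rsp", out(reg) sp, options(nomem, nostack, preserves_flags));
    }

    sp
}

/// Returns the current value of the stack pointer register (AArch64).
#[cfg(target_arch = "aarch64")]
#[inline(always)]
pub fn stack_pointer() -> usize {
    let sp: usize;

    // Copy SP into a general purpose register picked by the compiler.
    unsafe {
        asm!("mov {}, sp", out(reg) sp, options(nomem, nostack, preserves_flags));
    }

    sp
}
```

The returned address should sit close to the address of any local variable in the calling function, which is an easy way to sanity check it.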

yorickpeterse · 2024-02-17T02:53:01Z

For thread-local code, the following Rust code compiles to the same assembly as regular/raw thread-locals:

use std::cell::Cell;

thread_local! {
  static PTR1: Cell<*mut ()> = const { Cell::new(std::ptr::null_mut()) };
}

This can be seen at https://rust.godbolt.org/z/v16va86aq.

The problem is that I'm not sure if this is true for every platform. Some additional details are found at https://matklad.github.io/2020/10/03/fast-thread-locals-in-rust.html.
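As a sketch of how the runtime could use such a variable for the current process: the `CURRENT_PROCESS` name and the two helper functions below are hypothetical, and the process is just an opaque pointer here.

```rust
use std::cell::Cell;
use std::ptr;

thread_local! {
    // Each OS thread gets its own slot, initialised to a null pointer.
    static CURRENT_PROCESS: Cell<*mut ()> = const { Cell::new(ptr::null_mut()) };
}

/// Records the process the calling thread is about to run.
pub fn set_current_process(process: *mut ()) {
    CURRENT_PROCESS.with(|cell| cell.set(process));
}

/// Returns the process the calling thread is running, or null if none.
pub fn current_process() -> *mut () {
    CURRENT_PROCESS.with(|cell| cell.get())
}
```

A scheduler thread would call `set_current_process` right before resuming a process, and runtime routines would call `current_process` instead of taking the process as an argument.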

yorickpeterse · 2024-02-17T03:08:17Z

A quick dive through the current code reveals we don't use the current process value in all that many places, mostly to pass it as an implicit argument to methods. The few runtime routines that require it could instead just use a thread-local variable kept entirely on the runtime side of things.

The only instruction that really needs it is the Preempt instruction as it checks the process-local epoch against the global epoch. We could probably make that epoch counter a thread-local variable as well, as we only write to it when resuming the process. This would probably also reduce the process size a little bit.
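A rough sketch of what that epoch check could look like with a thread-local counter follows. All names (`GLOBAL_EPOCH`, `RUNNING_EPOCH`, and the three functions) are made up for illustration and the real Preempt instruction is emitted by the compiler, not written in Rust.

```rust
use std::cell::Cell;
use std::sync::atomic::{AtomicU32, Ordering};

/// The global epoch, bumped periodically to request preemption.
static GLOBAL_EPOCH: AtomicU32 = AtomicU32::new(0);

thread_local! {
    // The epoch at which the process running on this thread was resumed.
    static RUNNING_EPOCH: Cell<u32> = const { Cell::new(0) };
}

/// Called when a thread resumes a process: remember the current epoch.
/// This is the only place the thread-local is written to.
pub fn resume_process() {
    RUNNING_EPOCH.with(|epoch| epoch.set(GLOBAL_EPOCH.load(Ordering::Relaxed)));
}

/// Advances the global epoch.
pub fn advance_epoch() {
    GLOBAL_EPOCH.fetch_add(1, Ordering::Relaxed);
}

/// The check a Preempt instruction would perform: yield if the global
/// epoch has moved on since the process was resumed.
pub fn should_preempt() -> bool {
    RUNNING_EPOCH.with(|epoch| epoch.get()) != GLOBAL_EPOCH.load(Ordering::Relaxed)
}
```

With this layout the process itself no longer needs an epoch field, which is where the small reduction in process size would come from.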

yorickpeterse · 2024-02-18T04:14:30Z

It seems that when one uses #[no_mangle] in the thread_local! macro, mangling is still applied to the constant. This can be seen in https://rust.godbolt.org/z/qWzaq8qze where PTR1 is mangled as example::PTR1::__getit::VAL.0 but PTR2 is just PTR2.

yorickpeterse · 2024-02-18T04:15:35Z

Looking at the assembly, it also seems Rust uses LLVM's localdynamic for the thread_local! variable, while using generaldynamic for the #[thread_local] version.

The compiler generated code no longer passes the runtime state and the current process as hidden arguments. Instead, the state is stored in a global variable when the program starts up. For the current process we change the stack layout to the following:

    ╭───────────────────╮
    │ Private page      │
    ├───────────────────┤
    │ Guard page        │
    ├───────────────────┤
    │ Stack data        │ ↑ Stack grows towards the guard
    ╰───────────────────╯

The private page stores extra data, such as a pointer to the process that owns the stack and the epoch at which it started running. This entire chunk of data is then aligned to its size. This makes it possible to get a pointer to the private data page by applying a bitmask to the stack pointer. The bitmask depends on the stack size, which is runtime configurable and depends on the page size, and is loaded into a global variable at startup.

This entire approach removes the need for more expensive thread-local operations, which we can't use anyway due to Rust's "thread_local" attribute not being stable (and likely not becoming stable for another few years).

This fixes #617.

Changelog: changed
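The bitmask trick can be sketched as follows. This is a simplified illustration under a few assumptions: the page size is hard-coded to 4 KiB, the block comes from the heap allocator rather than a memory mapping, the guard page isn't actually protected, and the `Stack` and `Private` types and their fields are hypothetical.

```rust
use std::alloc::{alloc_zeroed, dealloc, Layout};

/// Assumed page size; the real value is queried at runtime.
const PAGE_SIZE: usize = 4096;

/// Hypothetical contents of the private page.
#[repr(C)]
pub struct Private {
    pub process: usize,
    pub epoch: u32,
}

/// A block of memory whose alignment equals its size.
pub struct Stack {
    base: *mut u8,
    layout: Layout,
}

impl Stack {
    pub fn new(size: usize) -> Stack {
        // A power-of-two size is what makes the mask work; three pages is
        // the minimum for private page + guard page + stack data.
        assert!(size.is_power_of_two() && size >= 3 * PAGE_SIZE);

        let layout = Layout::from_size_align(size, size).expect("invalid layout");
        let base = unsafe { alloc_zeroed(layout) };

        assert!(!base.is_null(), "allocation failed");
        Stack { base, layout }
    }

    /// The mask that clears the low bits of any address inside the block.
    pub fn mask(&self) -> usize {
        !(self.layout.size() - 1)
    }

    /// The private page sits at the lowest address of the block.
    pub fn private(&self) -> *mut Private {
        self.base as *mut Private
    }

    /// The address range of the stack data: (lowest, one past highest).
    pub fn data_range(&self) -> (usize, usize) {
        let base = self.base as usize;

        (base + 2 * PAGE_SIZE, base + self.layout.size())
    }
}

impl Drop for Stack {
    fn drop(&mut self) {
        unsafe { dealloc(self.base, self.layout) };
    }
}

/// Derives the private page from a stack pointer located inside the block.
pub fn private_from_sp(sp: usize, mask: usize) -> *mut Private {
    (sp & mask) as *mut Private
}
```

Note that the mask only works for addresses strictly inside the block: an address equal to the exclusive upper bound already belongs to the next aligned block.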

yorickpeterse added the performance and compiler labels Oct 16, 2023

yorickpeterse mentioned this issue Feb 9, 2024

Consider backing Inko processes by OS threads #690

Closed

yorickpeterse mentioned this issue Feb 17, 2024

Allow passing pointers to Inko methods to C functions #693

Closed

yorickpeterse added this to the 0.15.0 milestone Feb 17, 2024

yorickpeterse self-assigned this Feb 18, 2024

yorickpeterse closed this as completed in c9e90f5 Feb 22, 2024
