
Thread local storage #339

Merged

merged 26 commits from thread_local_storage into sunriseos:master on Aug 12, 2019
Conversation

Orycterope
Member

@Orycterope Orycterope commented Jun 17, 2019

Adding support for Thread Local Storage (TLS) in userspace.

For every thread it creates, the kernel allocates a 0x200 bytes TLS region,
and makes fs:0x0 .. fs:0x200 point to it.

For every thread it creates, the libuser allocates a TLS context,
stores its thread handle in it, and saves its address in its TLS region.

This PR defines the libuser low-level ABI to create threads,
and makes test_threads use it from now on.

ELF Thread Local Storage was initially planned as a separate PR, since this one
is required to implement mutexes and we didn't want to block that work
on full ELF TLS support.

In the end, ELF TLS comes in this PR too, as it is closely tied to the TLS region:

Userspace TLS:

For every thread it creates, the libuser copies the TLS initialization image, and makes gs:0x0..gs:0xffffffff point to it. The kernel provides a syscall for the libuser to set where the per-thread gs segment points to.

Kernel cpu-locals:

The kernel uses ELF TLS to implement cpu-local statics. For every core it detects, the kernel copies the kernel's TLS initialization image, and makes gs:0x0..gs:0xffffffff point to it. A cpu-local is therefore declared with the #[thread_local] attribute.

gs switching

As the kernel and userspace both use gs to access thread_local/cpu_local statics, we switch where gs points every time we enter and leave the kernel. This is done via two segments in the GDT, one for userspace and one for kernelspace, and we switch which segment gs uses when we enter/leave the kernel.
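In stable userland Rust, std's `thread_local!` macro provides similar per-thread statics and can illustrate the semantics the kernel gets from `#[thread_local]` backed by a per-thread `gs` segment (a self-contained sketch, not SunriseOS code):

```rust
use std::cell::Cell;

// Each thread gets its own copy of COUNTER, conceptually like a
// `#[thread_local]` static whose storage `gs` would point to.
thread_local! {
    static COUNTER: Cell<u32> = Cell::new(0);
}

fn bump() -> u32 {
    COUNTER.with(|c| {
        c.set(c.get() + 1);
        c.get()
    })
}

fn main() {
    assert_eq!(bump(), 1);
    assert_eq!(bump(), 2);
    // A fresh thread starts from its own zero-initialized copy.
    let from_other_thread = std::thread::spawn(|| bump()).join().unwrap();
    assert_eq!(from_other_thread, 1);
    println!("ok");
}
```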

Allocating a 0x200 Thread Local Storage region for every thread,
and storing a pointer to it in the ThreadStruct.
Update the `gs` segment register with the address of the
userspace TLS at every thread-switch, before returning to userspace.

In order to update the TLS segment, we need to know its offset
in the GDT at any moment.

Because the GDT was implemented as a Vec, with dynamic size in mind,
where descriptors were pushed and their indices never saved,
this was impossible to implement.

This commit drops the dynamic size property of the GDT and LDT,
for a simpler fixed-sized array of descriptors, where every
descriptor's index is fixed and public, and the full content of
the GDT is known at compile time.

There is no foreseeable situation where the GDT would need to
be dynamically sized, so a fixed-sized array seems more appropriate.
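The fixed-size-table idea can be sketched as follows. The indices and table size below are illustrative, not the actual SunriseOS layout; `GdtIndex` and `selector` are hypothetical names:

```rust
// Sketch: a fixed-size GDT where every descriptor's index is known at
// compile time, replacing the old dynamically-sized Vec.
#[allow(dead_code)]
#[repr(usize)]
#[derive(Clone, Copy)]
enum GdtIndex {
    Null = 0,
    KCode = 1,
    KData = 2,
    UCode = 3,
    UData = 4,
    // ... TSS and TLS segment entries would follow.
}

const GDT_SIZE: usize = 8;

// The whole table lives in .data; no heap, no pointer wizardry.
static GDT: [u64; GDT_SIZE] = [0; GDT_SIZE];

/// Segment selector for a fixed index: (index << 3) | RPL.
fn selector(index: GdtIndex, rpl: u16) -> u16 {
    ((index as u16) << 3) | rpl
}

fn main() {
    assert_eq!(GDT.len(), GDT_SIZE);
    // Because indices are fixed, selectors are known at compile time too.
    assert_eq!(selector(GdtIndex::KCode, 0), 0x08);
    assert_eq!(selector(GdtIndex::UCode, 3), 0x1b);
    println!("ok");
}
```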

This led to a lot of simplifications in the code:

* Because we no longer need Vecs to hold the table,
the gdt module no longer relies on the Heap.
* No more pointer wizardry, the table exists in
one and only one place: the `GDT` structure,
which lives in .data, and not on the heap.
* The GdtManager is now finally as dumb as it was meant to be:
holds two tables, and swaps them on `commit()`. That's it.
* No more reconstructing mem::forgotten tables from `sgdt`.

Because we don't (yet) have const generics, the LDT gets
the same size as the GDT, but is filled with null descriptors.
This is only 10 u64 more in the .data, so no big deal.
Changing the way the kernel passes the argument to a thread entrypoint
to the fastcall ABI, so we no longer need the entrypoint to be a naked function.
Passing its thread handle to the first thread of a process,
and changing the ABI for passing arguments to a thread.

The argument passed to a thread used to be stored in its ThreadStruct at thread creation.

First, there was no real reason for it to be kept in the ThreadStruct for the whole of the thread's lifetime.
Secondly, it forced us to already know the argument at the time we create the ThreadStruct,
which is not the case for main threads: their argument is a handle to the very ThreadStruct we haven't yet created.

This led to a redesign of the way a thread gets its argument.

The argument is now passed on the kernel stack of a new thread, just like the entrypoint
and userspace stack address.

This is done by `prepare_for_first_schedule`, which is called well after the ThreadStruct creation,
giving us more time to figure out the argument.

When a thread is first scheduled, it starts in `first_schedule`, with the argument in one of its registers.
It backs it up just like the entry_point, finishes the schedule-in, and jumps to userspace with it.
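A much-simplified model of this hand-off, with the kernel stack modeled as a plain Vec (all names and the stack layout are illustrative only, not the real `prepare_for_first_schedule`):

```rust
/// Simplified model: `prepare_for_first_schedule` pushes what the first
/// schedule-in needs onto the new thread's kernel stack, and
/// `first_schedule` pops it back. Layout is illustrative only.
struct KernelStack(Vec<usize>);

fn prepare_for_first_schedule(stack: &mut KernelStack, entry_point: usize, user_stack: usize, arg: usize) {
    // Pushed first is popped last.
    stack.0.push(entry_point);
    stack.0.push(user_stack);
    stack.0.push(arg);
}

fn first_schedule(stack: &mut KernelStack) -> (usize, usize, usize) {
    let arg = stack.0.pop().unwrap();
    let user_stack = stack.0.pop().unwrap();
    let entry_point = stack.0.pop().unwrap();
    // ...finish the schedule-in, then jump to userspace with `arg` in a register.
    (entry_point, user_stack, arg)
}

fn main() {
    let mut stack = KernelStack(Vec::new());
    prepare_for_first_schedule(&mut stack, 0xdead_0000, 0xbf00_0000, 42);
    assert_eq!(first_schedule(&mut stack), (0xdead_0000, 0xbf00_0000, 42));
    println!("ok");
}
```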
Create a special TLS context for the main thread, which lives
in the `.data`, put its thread handle there, and make its
TLS region point to it.
The `test_threads` command used to display a seemingly random,
variable-length list of As and Bs.
This was due to the fact that the `write!` macro internally makes
IPC calls, which reschedule the thread while it is still holding
the lock.

When a concurrent thread gets to run, it sees that the lock is
held, so it does nothing and skips its turn silently.

We now make sure that every thread prints exactly 10 times,
not counting the rounds that are skipped.

This closes sunriseos#224
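The fix can be sketched with std primitives: `try_lock` models the silently-skipped rounds, and each thread counts only its successful prints (a hedged illustration, not the actual test_threads code):

```rust
use std::sync::{Arc, Mutex};
use std::thread;

// Each thread "prints" its tag exactly `n` times; rounds where the
// lock is contended are skipped silently and do not count.
fn run(lock: Arc<Mutex<Vec<char>>>, tag: char, n: usize) {
    let mut printed = 0;
    while printed < n {
        if let Ok(mut out) = lock.try_lock() {
            out.push(tag);
            printed += 1;
        }
        // Lock held by someone else: skip this round and retry.
        thread::yield_now();
    }
}

fn main() {
    let lock = Arc::new(Mutex::new(Vec::new()));
    let a = { let l = lock.clone(); thread::spawn(move || run(l, 'A', 10)) };
    let b = { let l = lock.clone(); thread::spawn(move || run(l, 'B', 10)) };
    a.join().unwrap();
    b.join().unwrap();
    let out = lock.lock().unwrap();
    // Exactly 10 of each, regardless of how many rounds were skipped.
    assert_eq!(out.iter().filter(|&&c| c == 'A').count(), 10);
    assert_eq!(out.iter().filter(|&&c| c == 'B').count(), 10);
    println!("ok");
}
```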
@todo

todo bot commented Jun 17, 2019

Properly free resource after thread detach

When detaching a thread, we should ensure that the associated resources (stack, handle, context, etc...) are properly freed before the Process exits. This can be done by adding the ThreadContext to a global Vec<> of ThreadContext that gets freed when the main thread (or the last thread alive?) exits.


// todo: Properly free resource after thread detach
// body: When detaching a thread, we should ensure that the associated resources (stack,
// body: handle, context, etc...) are properly freed before the Process exits. This can be
// body: done by adding the ThreadContext to a global Vec<> of ThreadContext that gets freed
// body: when the main thread (or the last thread alive?) exits.
}
}
/// Initialisation of the main thread's thread local structures:
///
/// When a main thread starts, the kernel puts the handle of its own thread in one of its registers.


This comment was generated by todo based on a todo comment in 497f027 in #339. cc @Orycterope.

Changing the way we allocate and initialize the MAIN_TSS and DOUBLE_FAULT_TSS.

They used to be lazily allocated, and both the DOUBLE_FAULT_TSS stack and TSS were allocated by the IDT, and pushed to the GDT late.

Now the MAIN_TSS, DOUBLE_FAULT_TSS, and DOUBLE_FAULT_STACK all live in the .bss, and are fully initialized on GDT init.

The IDT only takes a segment selector pointing to the already initialized TSS for task gates.

The MAIN_TSS and its IOPB are now behind a lock. This fixes #98.

Following 7664199, the GDT is now an open fixed-sized array, addressable from other modules. We use this property to modify some segments' base addresses during process-switch.
Found out the hard way that we cannot use the TLS region as an ELF TLS TCB by tricking the compiler and putting a pointer to our actual TCB in the first word of the TLS region: for the GNU model, LLVM will generate code such as

```asm
	mov eax, gs:0xfffffffc
```

which gets the thread local variable directly from an offset from `gs`, instead of the usual

```asm
	mov eax, gs:0
	mov eax, [eax - 0x4]
```

which would have worked.

This would have required our ELF thread-local variables to live directly before the TLS region, which is allocated by the kernel, so this is impossible.

Instead we choose to use two thread-local segments:

* `fs` points to the TLS region. It is allocated by the kernel, and cannot be changed.
* `gs` points to the ELF TLS. The address it points to is controlled by userspace, through the set_thread_area syscall.

When a user thread starts, it is expected to allocate a memory region, copy its `.tdata` and `.tbss` to it, and call set_thread_area to make `gs` point to it.
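Building that memory region amounts to concatenating a copy of `.tdata` with a zero-filled `.tbss`. A sketch under hypothetical names (`make_tls_block` is not a real libuser function):

```rust
// Sketch: building a thread's ELF TLS block from the initialization image.
// `.tdata` is copied verbatim; `.tbss` is zero-filled after it. The real
// libuser would then call set_thread_area so `gs` points at the block.
fn make_tls_block(tdata: &[u8], tbss_size: usize) -> Vec<u8> {
    let mut block = Vec::with_capacity(tdata.len() + tbss_size);
    block.extend_from_slice(tdata);           // initialized thread-locals
    block.resize(tdata.len() + tbss_size, 0); // zeroed thread-locals
    block
}

fn main() {
    let tdata = [0x11u8, 0x22, 0x33];
    let block = make_tls_block(&tdata, 4);
    assert_eq!(block.len(), 7);
    assert_eq!(&block[..3], &tdata);
    assert!(block[3..].iter().all(|&b| b == 0));
    println!("ok");
}
```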

The kernel is still in charge of swapping the `gs` and `fs` segments on every thread switch.
Simplifying the panic function by taking a clear cause for the panic.

This way we can easily filter what's interesting to display, and what's not.
The caller of kernel panic shouldn't have to compute what to display itself.

Also adding Blue Screen Of Death so the user knows we have panicked.
* Context *

Exception handlers used to have no asm wrapper saving the registers (the userspace hardware context), and instead used the x86-interrupt calling convention to do the dirty work for us.
This had limitations: an x86-interrupt function does not expose the registers it has backed up. Exposing them is needed to implement a debugger, so we can inspect the state of a thread's registers when it hits a breakpoint, and possibly modify them.

This meant that the x86-interrupt exceptions had to eventually be replaced by a true asm wrapper.

* TLS switching *

Because TLS and cpu-locals both use `gs` to point to their thread-local area, we must switch where `gs` points every time we enter and leave the kernel. This happens at:

* Syscalls
* Exceptions
* Interrupts

This means we now need to add a small stub that switches `gs` at the start and end of every exception handler. Instead of doing that, we choose this time to finally implement an asm wrapper for exceptions.

This commit introduces the `generate_trap_gate_handler!` macro, which will generate an asm wrapper for exceptions, and handle all the boilerplate code and condition checking that exceptions have to perform, so that user-defined exception handlers cannot forget to perform them.

The `generate_trap_gate_handler!` macro is general enough for syscalls to use it. I'm planning to do exactly that, but in a separate commit so we can revert easily if things go wrong.

Interrupt handlers still use the x86-interrupt calling convention: interrupt handling is supposed to be dead simple, not make use of any cpu-locals, and return as fast as possible, so we don't even want the overhead of saving the userspace hardware context.
Following the previous commit, we now make syscalls and exceptions share the same asm wrapper.
Irqs used to follow the x86-interrupt calling convention.

Turns out I was wrong: irq handlers are not as simple as I thought, and we **do** access some cpu-local statics in irqs, so we need to switch `gs` every time we enter the kernel.

We now treat them the same way as exceptions, since we need to switch `gs` every time we handle one, and check that we haven't been killed when returning to userspace.

This adds the overhead of backing up registers to the ThreadStruct; however, it might become useful once we have a debugger, so we consider it tolerable.
Using #[thread_local] to implement cpu-local statics in the kernel.
Now that we support cpu-locals, make CURRENT_THREAD, the thread a core is currently executing, a cpu-local.

Since CURRENT_THREAD is used pretty much everywhere in the kernel, this is a good stress test for our cpu-local implementation.

Fixes #10
Introducing thread local statics in userspace.
@Orycterope Orycterope requested review from roblabla and marysaka July 20, 2019 23:12
@Orycterope
Member Author

What's up with xmas_elf? I'm panicking in the loading of vi since I merged master into this branch. Looks like xmas_elf is doing out-of-bounds accesses.

@marysaka
Member

marysaka commented Jul 21, 2019

It's #374

Member

@roblabla roblabla left a comment

Couple nits

Member

@marysaka marysaka left a comment

Seems okay to me, just noticed this

@Orycterope Orycterope requested a review from roblabla July 31, 2019 12:20
@Orycterope
Member Author

Hhhm, why didn't todo-bot pick up my todos?

@Orycterope Orycterope force-pushed the thread_local_storage branch from e131d84 to af56289 Compare August 1, 2019 12:41
fn tcb(&self) -> *mut u8 {
unsafe {
// safe: guaranteed to be aligned, and still in the allocation
(self.ptr as *mut u8).add(self.tcb_offset)
Member

Why not (self.ptr + self.tcb_offset) as *mut u8 ? This would remove the unsafe.

Member Author

It felt more intuitive, and the safety notice adds important information.

Member Author

Actually, I should return a ref here.
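Both spellings in this exchange compute the same address; the difference is that `.add` on a raw pointer is `unsafe` because it promises the offset stays within the same allocation, while plain `usize` arithmetic carries no such claim. A small demonstration (the buffer and offset are made up for illustration):

```rust
fn main() {
    // Both forms from the review compute the same address; `.add`
    // is `unsafe` because it asserts the result stays inside the
    // same allocation, while usize arithmetic makes no such promise.
    let buf = vec![0u8; 0x20];
    let ptr = buf.as_ptr() as usize;
    let tcb_offset = 0x10;

    let via_add = unsafe { (ptr as *mut u8).add(tcb_offset) };
    let via_usize = (ptr + tcb_offset) as *mut u8;
    assert_eq!(via_add, via_usize);
    println!("ok");
}
```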

//! | Index | Found in | Maps to | Purpose |
//! |--------------------------|----------------------------------------|--------------------------------|-------------------------------------------------------------------|
//! | [`GdtIndex::Null`] | nowhere (hopefully) | _ | _ |
//! | [`GdtIndex::KCode`] | `cs`, while in kernel code | flat: `0x00000000..0xffffffff` | kernel's code segment | |
Member

Suggested change
//! | [`GdtIndex::KCode`] | `cs`, while in kernel code | flat: `0x00000000..0xffffffff` | kernel's code segment | |
//! | [`GdtIndex::KCode`] | `cs`, while in kernel code | flat: `0x00000000..0xffffffff` | kernel's code segment |

//!
//! ### LDT segments:
//!
//! None :)
Member

Given we have to switch up two segments, I wonder if it'd be worth it to have an LDT in the ThreadStruct and just switch up the LDT instead.

Not a blocking concern, just a thought that popped up.

Member Author

Thought about it too, but the main idea of the LDT is to be per "task" (thread). Repurposing it for kernel vs. userspace isn't aligned with this.

Moreover, I don't see any real improvement from doing this, and reloading `ldtr` would add the overhead of reloading every LDT-related segment selector (I think).

Add to that the fact that we have almost no support for the LDT, and the choice of the GDT seems justified.

impl Drop for ThreadLocalStaticRegion {
/// Dropping a ThreadLocalStaticRegion deallocates it.
fn drop(&mut self) {
unsafe { dealloc(self.ptr as *mut u8, self.layout) };
Member

Safety comment

extern "fastcall" fn thread_trampoline(thread_context_addr: usize) -> ! {
debug!("starting from new thread, context at address {:#010x}", thread_context_addr);
// first save the address of our context in our TLS region
unsafe { (*get_my_tls_region()).ptr_thread_context = thread_context_addr };
Member

Safety comment.

Maybe get_my_tls_region() should return some kind of reference instead?

Member Author

No: the IPC buffer needs to be borrowed mutably, and this can't be done trivially; we would need to implement some kind of lock for it. So the only way to be safe for now is to return a *mut TLS.

However I'm making get_my_thread_context return a const reference.

Member

I mean, "we would need some kind of lock for it" holds true for a raw pointer as well - the compiler just doesn't enforce it. And we don't really need a lock, just some kind of Cell that ensures it's not borrowed multiple times.

let thread_context_addr = thread_context_addr as *mut ThreadContext;

// make gs point to our tls
let tls_elf = unsafe { &(*thread_context_addr).tls_elf };
Member

Safety comment.


// call the routine saved in the context, passing it the arg saved in the context
unsafe {
((*thread_context_addr).entry_point)((*thread_context_addr).arg)
Member

Safety comment.

Actually, when you turn thread_context_addr into a raw pointer earlier in the function, maybe you should just turn it into a reference directly at that point. That'd contain the unsafety to that one point where it turns into a reference.

Member Author

Yup. Even better: getting a const reference to the thread_context should always be safe, so I'm making get_my_thread_context do that.

// save the handle in our context
MAIN_THREAD_CONTEXT.thread_handle.call_once(|| handle);
// save the address of our context in our TLS region
unsafe { (*get_my_tls_region()).ptr_thread_context = &MAIN_THREAD_CONTEXT as *const ThreadContext as usize };
Member

safety comment.

For some reason the "i" constraint failed on i686-unknown-linux-gnu (unit tests) for hardcoding a function pointer. Weird, since there should be no real difference between i386 and i686. Looking at llvm constraints we can choose between:

* i: An integer constant (of target-specific width). Allows either a simple immediate, or a relocatable value.
* n: An integer constant – not including relocatable values.
* s: An integer constant, but allowing only relocatable values.
* X: Allows an operand of any kind, no constraint whatsoever. Typically useful to pass a label for an asm branch or call.

Apparently the compiler is ok with "s", even though "i" is supposed to be a general case of "s". Not gonna try to understand, just use "s" instead.

Also clobber memory.
@Orycterope Orycterope force-pushed the thread_local_storage branch from af56289 to aa1b5eb Compare August 5, 2019 17:04
@Orycterope Orycterope requested a review from roblabla August 6, 2019 11:17
@Orycterope Orycterope added type-feature New feature or request project-kernel Related to the kernel project-libuser Related to the libuser labels Aug 6, 2019
// memset the TLS, to clear previous owner's data.
// we do it here so we don't have to CrossProcessMap it earlier.
unsafe {
// safe: we manage this memory, ptr is aligned, and 0 is valid for every field of the TLS.
Member

Wait I'm stupid, ignore previous comment, TLS has no padding bytes. Ideally, we would use typelayout here (see #403), but absent this:

Suggested change
// safe: we manage this memory, ptr is aligned, and 0 is valid for every field of the TLS.
// safe: we manage this memory, ptr is aligned, 0 is valid for every field of the TLS, and there is no padding.
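The clear itself is a plain `write_bytes` over the region; the safety argument from the comment can be demonstrated on an owned buffer (an illustration, not the kernel's CrossProcessMap code):

```rust
fn main() {
    // Sketch of the kernel-side clear: zero a TLS-sized region with
    // write_bytes. Safe here because we own the buffer, the pointer is
    // trivially aligned for u8, 0 is a valid value for every field of
    // the TLS, and there are no padding bytes.
    let mut region = vec![0xFFu8; 0x200]; // stand-in for the previous owner's data
    let ptr = region.as_mut_ptr();
    unsafe { std::ptr::write_bytes(ptr, 0, region.len()) };
    assert!(region.iter().all(|&b| b == 0));
    println!("ok");
}
```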

let mapped_kernel_elf = multiboot::try_get_boot_information()
.and_then(|info| info.module_tags().nth(0))
.and_then(|module| map_grub_module(module).ok())
.expect("cpu_locals: cannot get kernel elf");
Member

Why not use the same system as userspace: put the TLS reference data in memory? (I suppose it's to save some memory, since the kernel isn't expected to need new "threads", but it'd be nice to mention that somewhere.)

Member Author

@Orycterope Orycterope Aug 12, 2019

No it doesn't save any memory, as the initialisation image is in .tdata/.tbss whatever you do.

This is the actual way you're supposed to do TLS: You parse your own ELF to find out where your .tdata+.tbss is. We use linker tricks in userspace because we don't want/can't access our own ELF, but in the kernel we can, so let's do it the normal way here.

CURRENT_THREAD.clone()
// if cpu_locals haven't been initialized, accessing gs:0 will triple fault,
// so don't even remotely try to access it.
if !ARE_CPU_LOCALS_INITIALIZED_YET.load(Ordering::Relaxed) {
Member

We don't do this check in any other of the places that attempt accessing CURRENT_THREAD though, like set_current_thread, is_in_schedule_queue, etc... Should we?

I just had a silly thought: What if bootstrap was responsible for setting up the main_cpu's TLS? If that's doable, it'd entirely remove the unsafety around cpu-locals, which would be nice.

Member Author

Other places would never be called in an early context.

I agree this is not ideal, and I'm considering initializing TLS in bootstrap. This is not trivial however, and if I decide to do it, I'll do it as a separate PR.

@@ -52,6 +52,10 @@ impl ModuleHeader {
pub const MAGIC: u32 = 0x30444F4D;
}

extern "C" {
pub static module_header: ModuleHeader;
Member

Document this.

pub unsafe extern fn relocate_self(aslr_base: *mut u8, module_header: *const ModuleHeader) -> u32 {
let module_header_address = module_header as *const u8;
let module_header = &(*module_header);
pub unsafe extern fn relocate_self(aslr_base: *mut u8, module_headr: *const ModuleHeader) -> u32 {
Member

Oh god what? What's the difference between the global module_header and this module_header? Also, the function doesn't use the static module_header, so why rename the variable?

Member Author

They're supposed to be the same!

Renaming just so it doesn't conflict, and for clarity.

// safe: the context will never be accessed mutably after its allocation,
// it is guaranteed to live for the whole lifetime of the current thread,
// it is guaranteed to be well-formed since we allocated it ourselves,
// => creating a ref is safe.
Member

There's another invariant that you don't talk about here: the context needs to live forever (since it's a &'static). And by forever I don't mean "until the thread dies", but until the whole program dies. This is because a reference can be passed to other threads.

This is currently OK because we don't properly Drop Thread. Maybe mention that?

/// Start this thread.
pub fn start(&self) -> Result<(), Error> {
unsafe {
syscalls::start_thread(&(*self.0).thread_handle.r#try().unwrap())
Member

Can you guess what's missing here? Yup, it's the all-famous safety comment! :D

Member Author

This isn't even unsafe 🤦

Member

@roblabla roblabla left a comment

I think I clicked the wrong button

@Orycterope Orycterope merged commit 3ec9df1 into sunriseos:master Aug 12, 2019
Orycterope added a commit to Orycterope/SunriseOS that referenced this pull request Aug 12, 2019
Because the TLS PR is so big, todobot bailed on it.

Make a PR that only touches every `// todo:` we opened in sunriseos#339, so todobot picks them up again.