Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

spin-based implementation for no_std #61

Closed
matklad opened this issue Oct 8, 2019 · 25 comments
Closed

spin-based implementation for no_std #61

matklad opened this issue Oct 8, 2019 · 25 comments

Comments

@matklad
Copy link
Owner

matklad commented Oct 8, 2019

We currently don't provide sync module in no_std, as it requires OS-level support for blocking. What lazy_static does in this case is that it has a spin feature, which replaces locking with spinning.

I think just silently replacing OS mutexes with busy waiting is the wrong approach, this is an important distinction which should be reflected in the type system.

So, we should instead add an opt-in spin module, which has the same API as sync, but is based on the spin crate. That is, with both std and spin features enabled, the user should be able to use both once_cell::sync and once_cell::spin.

@richardanaya
Copy link

I just ran into needing this tonight and felt bad having to turn to lazy static :( thanks for opening this! once_cell FTW!

@josephlr
Copy link

So notes on why we might want this (both in once_cell and eventually merged into libstd/libcore):

@matklad
Copy link
Owner Author

matklad commented Jan 1, 2020

So using spinlocks is the best bet.

I think I disagree with this. Using spin-locks if there's a real operating system around seems bad. If two threads race to run initialization, one thread enters a critical section and is scheduled out of the CPU, the other will be busy waiting for a long time. Moreover, if the first thread is a low-priority one, and the second one has a high priority, we get priority inversion!

Moreover, I don't think std uses blocking at all at the moment when getting random data? I think std only needs randomness for hash maps (is this true?) and there, it uses tls for caching:

https://github.com/rust-lang/rust/blob/e380efa5ecdef714dad72c473fc0933ff4d59283/src/libstd/collections/hash/map.rs#L2459-L2461

Could the getrandom be designed in such a way that it's the client who manages the state?

Something like this

pub struct SysRandom { ... }

impl SysRandom {
    pub fn init() -> SysRandom { ... }
    pub getrandom(&self, dest: &mut [u8]) { ... }
}

The std would then stuff it into a tls, and rand could use a global synchronized OnceCell<SysRandom>.

@matklad
Copy link
Owner Author

matklad commented Jan 1, 2020

@josephlr here's a demonstration that, in extremely unfortunate cases, the current spin-lock based implementation in getrandom leads to extremely horrible results: https://github.com/matklad/spin-of-death

@richardanaya
Copy link

richardanaya commented Jan 1, 2020 via email

@matklad
Copy link
Owner Author

matklad commented Jan 1, 2020

@richardanaya I'd love to hear more details about your use-case! I have a theory that one actually never wants a spin lock :)

Am I correct that your use-case is basically "I statically know that there always is exactly one thread, so no synchronization is necessary, and I want to use spin just to work around the annoying compiler errors, although I statically know that we'll never actually spin"?

@matklad
Copy link
Owner Author

matklad commented Jan 1, 2020

@richardanaya if my assumption sounds right, could you check if #82 fulfills your use-case?

@mark-i-m
Copy link

mark-i-m commented Jan 4, 2020

I just read the blog post. I disagree with the argument about interrupt handlers. Doing something potentially blocking in an interrupt handler is just wrong; interrupt handlers are supposed to be short and do minimal work because they are stealing time from the scheduled task. Personally, I think adding a spin-based no_std feature would be fine, and the implementation can be improved later if needed...

@RKennedy9064
Copy link

@matklad I'd also be interested in a #![no_std] implementation using spin-locks. I saw this crate recently and wanted to try it out, but without a #![no_std] feature I can't. My main use case would be to see if I could replace lazy_static! in the Rust OS tutorials here https://os.phil-opp.com/vga-text-mode/#lazy-statics.

Since it's building an OS, it's built with #![no_std]. I'm assuming my use case would be extremely small, but figured it couldn't hurt to bring it up. Let me know what you think.

@matklad
Copy link
Owner Author

matklad commented Feb 22, 2020

Interesting example! I think this is also a case of miss-use of spinlocks. We only need runtime initialization there because unsafe { &mut *(0xb8000 as *mut Buffer) } is not const evaluatable yet.

Rather than using lazy_static, I would suggest doing the following:

pub struct Writer {
    column_position: usize,
    color_code: ColorCode,
    buffer_addr: usize,
}

impl Writer {
    unsafe fn buffer(&self) -> &'static mut Buffer {
        unsafe { &mut (self.buffer_addr as *mut Buffer) }
    }
}

This is slightly less ergonomic (as you need a function call, and not a field access) but has massively simpler runtime behavior.

@RKennedy9064
Copy link

@matklad Does this still hold true if there's only ever one instance of Writer, that needs be be accessed by multiple sources and always writes to 0xb8000? The end result is eventually used to create printing macros that are used to display things from the kernel, as well as in hardware interrupts like so.

#[macro_export]
macro_rules! print {
    ($($arg:tt)*) => ($crate::vga_buffer::_print(format_args!($($arg)*)));
}

#[macro_export]
macro_rules! println {
    () => ($crate::print!("\n"));
    ($($arg:tt)*) => ($crate::print!("{}\n", format_args!($($arg)*)));
}

#[doc(hudden)]
pub fn _print(args: fmt::Arguments<'_>) {
    use core::fmt::Write;
    use x86_64::instructions::interrupts;

    interrupts::without_interrupts(|| {
        WRITER.lock().write_fmt(args).unwrap();
    });
}

#[panic_handler]
fn panic(info: &PanicInfo<'_>) -> ! {
    println!("{}", info);
    loop {}
}

extern "x86-interrupt" fn double_fault_handler(
    stack_frame: &mut InterruptStackFrame,
    _error_code: u64,
) -> ! {
    panic!("EXCEPTION: DOUBLE FAULT\n{:#?}", stack_frame);
}

Is something like this still possible without having a public static reference like so?

use spin::Mutex;

lazy_static! {
    pub static ref WRITER: Mutex<Writer> = Mutex::new(Writer {
        column_position: 0,
        color_code: ColorCode::new(Color::Yellow, Color::Black),
        buffer: unsafe { &mut *(0xb8000 as *mut Buffer) },
    });
}

Using your implementation, wouldn't you still need to create a pub static ref to the writer so that i can be accessed everywhere, including interrupts? Then wouldn't I still need lazy_static to make this possible since there's #![no_std] support?

I'm definitely open to different approaches for something like this, just didn't want to throw to many details in my first post. Let me know your thoughts/suggestions.

@matklad
Copy link
Owner Author

matklad commented Feb 24, 2020

Yeah, I still think that lazy_static ideally should not be here. Specifically,

    interrupts::without_interrupts(|| {
        WRITER.lock().write_fmt(args).unwrap();
    });

does synchronization three times:

  • first, we synchronize on lazy-static initialization
  • then, we disable interrupts (which is also synchronizatio)
  • finally, we lock a lock

I'd say that two of three synchronizations are unnecessary. They don't necessary make your program worse, but they are not the minimal solution.

Here's how I'd do this API:

/// Kernel Spin Lock.
////
/// This locks disables local interrupts and, on multiprocessor systems, 
/// additionally locks a global atomic flag.
/// This lock is safe to use in any context
/// 
/// See also https://www.kernel.org/doc/Documentation/locking/spinlocks.txt
pub struct KSpinLock<T> {
    #[cfg(has_two_cores)]
    locked: AtomicBool,
    value: T,
}

impl  KSpinLock<T> {
    pub const fn new(value: T) -> KSpinLock<T> { ... }

    pub fn lock(&self) -> KSpinLockGuard<'_, T> {
        disable_interrupts();
        #[cfg(has_two_cores)] while !self.locked.cas(false, true, Acquire) {}
        KSpinLockGuard { ... }
    }
}

pub struct KSpinLockGuard<T> {
    // this probably should store prev value of interrupt flags?
}

impl Drop for KSpinLockGuard<T> {
    fn drop(&mut self) {
        restore_interrupts();
        #[cfg(has_two_cores)] self.locked.store(false, Release)
    }
}

/// ...

pub struct Writer<'a> {
  column_position: usize,
  buffer: &'a mut Buffer,
}

// This struct we need to work-around lack of const-fn
struct WriterState {
  column_position: usize,
  buffer_addr: usize
}

pub fn with_writer(f: impl FnOnce(Writer<'_>)) {
    static WRITER_STATE: KSpinLock<WriterState> = 
        KSpinLock::new(WriterState { column_position: 0, buffer_addr: 0xb8000 }); 
    let mut state = WRITER_STATE.lock();
    let buffer: &mut Buffer = unsafe {&mut *(state.buffer_addr as *mut Buffer) };
    let writer = Writer { column_position: state.column_position, buffer };
    f(writer);
}

Specifically:

  • the global data doesn't really need any initaization, in theory it can just be there in our kernel binary
  • so we use available const-fn powers to get the binary we want, even if the Rust API is pretty (callback instead of &'static mut)
  • we also abstract synchronization into a dedicated kernel mutex, which manages both interrupts (which is also required) and global "locked" flags (which is only required if there are several CPUs).

@matklad
Copy link
Owner Author

matklad commented Feb 24, 2020

I do feel like this is a wall of code, in comparison with just using off the shelf lazy_static,mutex,etc. But I believe this wall of code has an important advantage in that it achieves the runtime behavior you really want. In particular, it doesn't do unnecessary late initialization, and it also does the exactly appropriate amount of synchronization.

@matklad
Copy link
Owner Author

matklad commented Feb 24, 2020

Hm, I guess a more elegant hack would be to do this:

#[repr(C)]
pub struct Writer<'a> {
  ...
  buffer: &'a mut Buffer,
}

[repr(C)]
struct WriterState {
  ...
  buffer_addr: usize
}

and then mem::transmute usize-state to an &'a mut state in the with_writer function.

@RKennedy9064
Copy link

@matklad Wow thanks for taking the time to come up with such a detailed example. I should hopefully have time in the next few days to test this out and see how it goes. I did have a few clarifying questions if you had the time.

If I understand correctly, is with_writer basically creating the lock, along with the writer and calling it once so that it just exists in the kernel binary? Then I could assign the writer using with_writer in order to call my write functions in various places and lock as needed?

Also, I noticed that one of the comments mentions needing the struct because of a lack of const-fn in Rust. Since I'm using nightly, would I be able to leverage #![feature(const_fn)] instead using WriterState, and if so, would it be advisable?

Again thanks for taking the time to look into this and provide detailed examples. This approach definitely looks promising and I like it more then having to rely on an untyped macro.

@tdonovan4
Copy link

Any update on this?

@matklad
Copy link
Owner Author

matklad commented Apr 6, 2020

@tdonovan4 not really, I still haven't seen a case which wouldn't become better by replacing spinlock with a more appropriate synchronization.

@tdonovan4
Copy link

@matklad thinking about it, you might be right. I believe I could adapt my case to use #82 instead once it's merged. Thanks.

@zesterer
Copy link

As a point of interest, spin-rs is now maintained again.

@richardanaya
Copy link

richardanaya commented Nov 14, 2020

Still curious about this issue :) I use spin-rs all the time in WebAssembly land. Mostly trying to get binary sizes down as much as possible.

@richardanaya
Copy link

FYI, to describe my use case, there's a lot of asynchronous code in web lang. WebAssembly is multi-entrant so I often times end up in scenarios where one function needs to have a common global mutexed-object. (e.g. like a common game state that render function and click handler can modify).

@zesterer
Copy link

spin-rs is now trying to mirror many of the APIs in std::sync so we'd definitely be happy to accept a spinning implementation.

@rockboynton
Copy link

Jump bumping this for interest

@matklad
Copy link
Owner Author

matklad commented Oct 22, 2022

Closed by #195

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

8 participants