Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Implement new global context #605

Draft
wants to merge 1 commit into
base: master
Choose a base branch
from

Conversation

tcharding
Copy link
Member

Draft for feedback please!

This is an attempt at implementing the epic multiple global contexts suggested by @Kixunil: #539 (comment)

Any deviation from the "spec" is unintentional except that I set the swap bit and active context bit in a different place.

// Stores the seed for RNG. Notably it doesn't matter that a thread may read "inconsistent"
// content because it's all random data. If the array is being overwritten while being read it
// cannot worsen entropy and the exact data doesn't matter.
static GLOBAL_SEED: [AtomicU8; 32] = init_seed_buffer();
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why does this use atomics, if the seed is just randomness then we can read and write to it willy-nilly without a problem can't we?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Unfortunately, no. This is a common misconception, we're not programming a real hardware, we're programming a virtual machine that defines it as illegal and failing to do it correctly will result in arbitrary wrong stuff ("nasal daemons") because we would be lying about what the code is doing.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't understand, if you get time can you explain further or link me to something to read please?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This one is very good: https://predr.ag/blog/falsehoods-programmers-believe-about-undefined-behavior/ It had errors when I first saw it which, judging from a quick look at the end, the author fixed (the rest was fine from the beginning).

In our case: Rust defines concurrently existing &mut references (or writes) UB, so we have to obey it. Note however that Relaxed operations should be equally fast as what you wish for on x86_64 and maybe other archs.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Rust defines concurrently existing &mut references (or writes) UB

Ah cool, this is the bit I was missing. Thanks for the link too, that was interesting.

@tcharding
Copy link
Member Author

Went over it again, did little cleanup, renamed to use the term "semaphore".

Add new sign/re-randomization API using atomics.

Includes a few examples of how we could move forward with a separate API
for "std" builds to non-std builds.
@tcharding
Copy link
Member Author

tcharding commented May 20, 2023

I had an idea on how we move forward with this PR while having not yet resolved what to do about ONCE and no-std builds. We could have two separate APIs for std/no-std builds (no-std still passes secp to all functions while std does not). I added a few examples of how that would look to the PR.

@sanket1729
Copy link
Member

@tcharding @Kixunil (whenever you are back :) ) I am not sure if this complexity is worth the re-randomization support. Upstream secp clearly states

Randomizing a copy of secp256k1_context_static did not have any effect and did not provide defense-in-depth protection against side-channel attacks. Create a new context if you want to benefit from randomization.

We can avoid a lot of complexity of seeds etc, if we just skip the re-randomization.

use crate::{ffi, Secp256k1};

struct GlobalVerifyContext {
__private: (),
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why does this have a field in it? What value does the __private provide?

@sanket1729
Copy link
Member

sanket1729 commented Sep 22, 2023

I am focussing a lot of my review into this PR as a part of getting static contexts as a part of secp 1.0.

@tcharding
Copy link
Member Author

tcharding commented Sep 22, 2023

We can avoid a lot of complexity of seeds etc, if we just skip the re-randomization.

Oh if "We can avoid a lot of complexity of seeds etc, if we just skip the re-randomization." holds then you don't need to put too much time into this PR, we can try some other approach? If now (like over this coming month) you are good to focus on this work then I can also? @apoelstra do you feel like devoting clock cycles to this work at the moment?

@apoelstra
Copy link
Member

No, I don't have time to focus heavily on this, sorry.

I note that in #388 we decided that we wanted to punt some of the decisions around "randomize after signing" to upstream since they have the ability to do partial randomizations in a much faster way that did not involve sourcing random data from somewhere.

As for "do we need to rerandomize", IIRC we decided somewhere to do a "spinlock" that responded to contention by simply not randomizing. Then they was a long discussion about how under certain usecases this would result in us never rerandomizing and I don't think we ever came to an agreement on it. And then we got sidetracked trying to come up with a perfect solution.

My vote, so we can move forward, is:

  • When we have std just use a thread-local variable and rerandomize after every signature. Have a sign_no_rerandomize or whatever that would override this behavior if you are performance sensitive.
  • When we don't have nostd, rerandomization should be a no-op. (Maybe in the future we can make it do something smarter.)
  • When we have rand we should rerandomize when creating a context. Otherwise we shouldn't.

@sanket1729
Copy link
Member

I agree with @apoelstra's suggestion. In light of ease of review and maintenance, I feel this solution works better for me. If there are better re-randomization strategies, they can always be updated without breaking the API.

Randomizing a copy of secp256k1_context_static did not have any effect and did not provide defense-in-depth protection against side-channel attacks. Create a new context if you want to benefit from randomization.

@tcharding
Copy link
Member Author

Ok, that is a lot more simple and shouldn't need nearly as much review/iterating. I rekon we can do this with only half your brain on occasions @apoelstra :)

@apoelstra
Copy link
Member

The other thing blocking this process is that it's tied to the process of overhauling the API to get rid of all the context objects.

@tcharding since we are trying to get a new release out right now, I think we should hold off on this until after that.

@tcharding
Copy link
Member Author

The other thing blocking this process is that it's tied to the process of overhauling the API to get rid of all the context objects.

At the risk of sounding like a goose that is what I though this work was about.

@tcharding since we are trying to get a new release out right now, I think we should hold off on this until after that.

Sure thing.

#[rustfmt::skip]
const fn init_seed_buffer() -> [AtomicU8; 32] {
let buf: [AtomicU8; 32] = [
AtomicU8::new(0), AtomicU8::new(0), AtomicU8::new(0), AtomicU8::new(0), AtomicU8::new(0), AtomicU8::new(0), AtomicU8::new(0), AtomicU8::new(0),
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You can simplify this by first assigning to a const:

const ZERO: AtomicU8 = AtomicU8::new(0);
let arr = [ZERO; 32];

(in latest rust you can: let arr = [const {AtomicU8::new(0)}; 32];)

@elichai
Copy link
Member

elichai commented Jul 6, 2024

About the OnceCell,
You can easily replace that with a RacyCell(like https://docs.rs/once_cell/latest/once_cell/race/struct.OnceBox.html),
That way if there's no context and 2 threads are trying to use it they race to create one instead of one of them parking

@Kixunil
Copy link
Collaborator

Kixunil commented Jul 6, 2024

We can avoid a lot of complexity of seeds etc, if we just skip the re-randomization.

That's somewhat nice to read, TBH. :D

Honestly, I hate all this context stuff. It makes the code and the API very complicated, it's hard to sensibly manage them, hard to constify... I'd love to see all contexts just go away. And it feels like to a large degree it's trying to fix a hardware problem with software. If it's proven to be unhelpful we can just pas the responsibility to someone else.

  • When we have std just use a thread-local variable and rerandomize after every signature.

From what I understand this is not better than just creating a new context on stack. We can do that efficiently in vendored version because we can statically know its size and use alloca executed on the C side in dynamically-linked system version. Then we don't need to deal with this conditional BS and poor no_std support.

@apoelstra
Copy link
Member

And it feels like to a large degree it's trying to fix a hardware problem with software.

Well, it's defense in depth.

If it's proven to be unhelpful we can just pas the responsibility to someone else.

In this case the benefit is so small that I don't think it's worth the pain to achieve (especially if we've more-or-less agreed on the API and are letting technical problems around concurrency block us from moving forward ... and I think we can fix the technical problems later, perhaps cfg-gated on a higher MSRV or something). But I don't think it's reasonable for a crypto library to "pass the responsibility" for sidechannel-freeness to somebody else. The somebody elses -- compiler writers and CPU implementors -- are actively working against us!

From what I understand this is not better than just creating a new context on stack. We can do that efficiently in vendored version because we can statically know its size and use alloca executed on the C side in dynamically-linked system version. Then we don't need to deal with this conditional BS and poor no_std support.

It's better because it doesn't show up in the API anywhere. Thread-local storage is pretty much "just creating a new context on the stack" but it lets us program as though the data were a global since it's on a thread-global stack.

@apoelstra
Copy link
Member

I propose:

  • We implement the API we want with no rerandomization support (including sign_without_rerandomization methods which we promise not to slow down in the future but which currently are the same as the regular signing methods).
  • We then figure out how to do rerandomization, maybe in two parts (std then nostd).

@Kixunil
Copy link
Collaborator

Kixunil commented Jul 7, 2024

The somebody elses -- compiler writers and CPU implementors -- are actively working against us!

True. :(

It's better because it doesn't show up in the API anywhere.

I mean ad-hoc allocate on stack. Something like this:

pub fn sign_ecdsa(self, mesage: Message) -> ecdsa::Signature {
    with_context(|ctx| {
        ctx.sign_ecdsa(self, message)
    })
}

pub(crate) fn with_context<R, F: FnOnce(&Secp256k1<AllPreallocated>) -> R>(f: F) ->R {
    #[cfg(feature = "std")]
    {
        f(&GLOBAL_CONTEXT);
        GLOBAL_CONTEXT.rerandomize();
    }
    #[cfg(all(feature = "sys", not(feature = "std")))]
    {
        let mut f = ManuallyDrop::new(f);
        let mut cb = |ctx| {
            unsafe { ManuallyDrop::take(&mut f)(ctx) }
        }
        // the function is written in C and just calls alloca and passes the pointer to the callback - see bellow
        unsafe { secp_256k1_version_with_alloca(&mut cb as *mut dyn FnMut(&mut Secp256k1<AllPreallocated>)) }
    }
    #[cfg(all(not(feature = "sys"), not(feature = "std")))]
    {
        We statically know the size here, no need for alloca.
        let mut storage = Secp256k1ContextStorage::uninit();
        let mut ctx = Secp256k1::new_preallocated(&mut storage);
        f(&mut ctx)
    }
}

#[no_mangle]
extern "C" fn secp256k1_version_with_alloca_callback(data: *mut ffi::Context, closure: &mut dyn FnMut(&mut Secp256k1<AllPreallocated>)) {
    closure(Secp256k1::from_ffi(data))
}

Or perhaps upstream might be willing to provide a function that creates a context on stack and calls a callback?

I propose:

Good idea. Do we feature gate rerandomized ones? Although we can probably at least support std - that one is easy.

@apoelstra
Copy link
Member

On my system a new unrandomized context creation costs about 2.2us, or about 10% the time of a signing operation. (Signing is way faster than in the past thanks to the recent "signed-digit multi-comb" precomputation table technique upstream.)

So I'm not thrilled with this, especially since the expensive code would be run on those embedded systems which are least able to bear the extra cost (and which may have limited stack space as well).

Given that we aren't rerandomizing in this case, why not just use the global static context? So with std on, GLOBAL_CONTEXT is a thread-local and can be rerandomized but with std off, it's an actual global (upstream exposes one we can use) and cannot be rerandomized.

@Kixunil
Copy link
Collaborator

Kixunil commented Jul 7, 2024

@apoelstra I forgot to supply an RNG. I think we can seed the RNG with global atomics (as I suggested elsewhere). Occasional reuse of entropy (if two threads happen to read it at the same tim) shouldn't be that big of a problem. Those no_std systems unable to bear the cost should cal the _no_rerand versions or something.

@apoelstra
Copy link
Member

Ok, yeah, I can get behind this. Let me ask upstream if they can help us.

@apoelstra
Copy link
Member

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants