added secret types rfc #2859

avadacatavra · 2020-01-23T17:42:33Z

Rendered

The goal is to provide primitive data types that can be used to traffic in transient secrets in code that are important to not accidentally leak. These new primitives would be used in conjunction with an in-progress LLVM RFC to ensure important security invariants like secret-independent runtime are maintained throughout the compilation process. Note that we explicitly do not want secret_isize and secret_usize, because we do not want to index based on secrets.

Lokathor · 2020-01-23T17:49:34Z

If the LLVM RFC were accepted we'd still need to move Rust to some future version of LLVM that implements that RFC before we'd even have a hope of implementing this RFC. Is that right?

avadacatavra · 2020-01-23T17:50:27Z

We could still implement this RFC, we just wouldn't have all of the guarantees yet. It's still a huge step in a good direction though.

colmmacc · 2020-01-23T18:14:46Z

text/0000-secret-types.md

+ - Confidentiality of compromise is desirable
+Therefore, it’s important to use data-invariant programming for secrets. For this reason, secret types would only allow data invariant operations at all levels of compilation.
+
+Additionally, secret types serve as an indicator to programmers that this information should be treated with care, and not mixed with non-secret data, an invariant that would be enforced by the type system.


Just some more detail on this: it's important not to mix secrets and untrusted input in a context such as compressed content. Otherwise CRIME-like (https://en.wikipedia.org/wiki/CRIME) attacks can be used to retrieve the secret information.

burdges · 2020-01-23T18:20:08Z

I'd suggest adding impl<'a, const N: usize> From<&'a [X; N]> for &'a [secret_X; N], and its mut and slice versions, that do the unsafe pointer conversion. I suppose the opposite direction could be From too, but maybe folks would prefer some explicitly named method.

Centril · 2020-01-23T19:07:13Z

text/0000-secret-types.md

+# Rationale and alternatives
+[rationale-and-alternatives]: #rationale-and-alternatives
+
+The main alternative to this design is to handle this via crates such as secret_integers [3]. However, without compiler integration, the compiler could optimize out the guarantees that have been claimed at the source level. For example, the compiler could use non constant time instructions such as divide. Using these primitives correctly from a high-level language will typically require careful management of memory allocation, flagging relevant memory as containing secret for the operating system, hardware, or other system level components.


I could not find what guarantees were claimed at the source level in this RFC.

Centril · 2020-01-23T19:11:24Z

text/0000-secret-types.md

+# Guide-level explanation
+[guide-level-explanation]: #guide-level-explanation
+
+Secret integer types are a type of restricted integer primitive. In particular, non-constant time operations like division are prohibited. Printing of secret integers directly is also prohibited--they must first be declassified into non-secret integers.


Why are these primitive types as opposed to e.g. exposed intrinsics on normal integer types for which wrappers can be built around?

Using wrapper type means you can't write let x: Secret<u8> = 123; (unless Rust has user-defined literal). Though let x = Secret::new(123_u8); doesn't look bad.

Yes, SecretU8::new(123); is much more in line with what I would expect. A relevant precedent is the NonZero* types: rather than having nonzero_i8 and friends as primitives, we have NonZeroI8 with new constructors.

If the RFC desires to insist on primitives rather than library types, I would expect a very extensive section documenting why that would be necessary; as it stands, adding a new primitive has a much higher bar than adding a new library type.

Using wrapper type means you can't write let x: Secret<u8> = 123; (unless Rust has user-defined literal). Though let x = Secret::new(123_u8); doesn't look bad.

I think that would be completely acceptable, documenting the secret nature of all such data. On the issue of primitive vs library types, LLVM will need to treat Secret types distinctively, without conditional branches and with clearing of state post-use, at least to the extent that proves to be practical. I doubt that library types can get passed to LLVM with a distinguisher that triggers those code-generation aspects.

How does this interact with NonZero* types? Can a secret type be non-zero?

How does this work with existing NonZero* types for Option and Result space optimization? Does size_of::<Option<secret_u8>>() == size_of::<Option<NonZeroU8>>()?

Does size_of::<Option<secret_u8>>() == size_of::<Option<NonZeroU8>>()?

No, secret_u8 can be 0.

I would assume secret types behave more like MaybeUninit in that any forbidden values can't be exploited and they have to be treated as a set of opaque bytes for type layout.

Ah, that is understandable, so it means it is explicitly leaving optimization on the table for security purposes. Maybe we can add this to the drawbacks such that we mention that we explicitly prevent some optimization for security purposes?

I wouldn't call that a "drawback" so much as the actual goal (a.k.a. "that's not a bug, that's a feature")

When integrated with hypothetical secret-type-aware LLVM or Cranelift (etc) backend, the purpose of using secret types is to ensure they are, at all layers of processing, optimization, and codegen, explicitly opted-out of anything which would introduce any form of data-dependent timing variability.

Anything that special cased and branched on a 0 value would be an example of that (in fact CPUs short circuiting the ALU on multiply-by-zero is a classical example of introducing cryptographically broken timing variability).

When writing constant time code, if you really wanted the equivalent of an Option<secret_u8>, you'd probably instead want to use something like subtle::CtOption<secret_u8> instead. Furthermore, it probably wouldn't make sense to even have a NonZeroSecretU8 unless you were able to instantiate it with something that returned a CtOption (as it were, I just did something like that).

Centril · 2020-01-23T19:13:42Z

text/0000-secret-types.md

+- Default
+- Not
+
+We will also need to define a trait Classify<T> for T and method declassify. Classify will take a non secret integer or boolean and return a secret integer or boolean. `declassify` will consume a secret integer returning a non-secret integer.


I think it would make for a more readable proposal if you specified the trait definitions, and the other operations in ```rust blocks.

Centril · 2020-01-23T19:14:05Z

text/0000-secret-types.md

+
+
+# Reference-level explanation
+[reference-level-explanation]: #reference-level-explanation


Please explain the rationale for including these specific operations.

It was mentioned in the guide-level explanation:

In particular, non-constant time operations like division are prohibited.

It might still be useful to re-iterate here that these are the specific operations that we know are possible to implement in constant-time.

Centril · 2020-01-23T19:15:04Z

text/0000-secret-types.md

+# Drawbacks
+[drawbacks]: #drawbacks
+
+Because secret integers prohibit the use of certain operations and compiler optimizations, there will be a performance impact on code that uses them. However, this is intentional.


This is vague. Please be as specific as possible regarding what optimizations or categories of optimizations would be prohibited, at what granularity they would be prohibited, and what the impacts of those prohibitions would be.

Please also, without referring to LLVM, specify how these prohibitions are justified in terms of a specification. If guarantees (with stability promises and all that, as opposed to best effort constructs) are sought after, such a specification should be operational (in the sense of operational semantics and abstract machines). At the moment, I do not see how this can be specified in terms of an abstract machine / interpreter like Miri.

I imagine one could re-use some ideas from this paper and more generally from the IFC (information flow control) literature. The guarantee should probably be some form of non-interference -- program executions that only differ in secrets must not be observably different, or so. One hard problem is defining what exactly the possible observations are; that can and should (IMO) be done in an operational way (a "trace of observations" that e.g. Miri could print, or so).

A type-based approach certainly sounds like a good start for me; this feels much better than ad-hoc attributes.

Thanks for the paper reference; that's a good start. Rather than just enumerate possible observations, we can also enumerate observation modalities that are out of scope. My list of the latter includes

instantaneous power fluctuations due to Hamming weight of secret data, whether on-chip or at interconnects between chips (e.g., CPU and RAM); and

secret-influenced variation in RF radiation as secret data propagates through the circuitry.

I think you made a very good point for why we need to precisely list what we do consider an observation -- everything not on that list (and that will be all sorts of weird stuff, like your two items) is not protected.

I think you can generally put all microarchitectural covert channels out-of-scope, with the possible caveat that writing constant-time code is the best known defense against them.

That said I think compiling a list of more general architecture properties is a good idea. An example of one is: the underlying architecture is assumed to be able to perform integer multiplication in constant time (e.g. doesn't short-circuit on multiply-by-0-or-1)

That said I think compiling a list of more general architecture properties is a good idea. An example of one is: the underlying architecture is assumed to be able to perform integer multiplication in constant time (e.g. doesn't short-circuit on multiply-by-0-or-1)

I don't think such a strong assumption is required. Once the information that inputs to a multiplication are secret is forwarded to the compiler, it's the compiler's job to generate constant-time code. If there is a constant-time multiplier in the target micro-architecture, that's easy. If not, what the compiler will need are constant-time runtime libraries to use for arithmetic on secret integers. The critical job for the language to take care of is to forward the information about secrecy in the first place.

If there is a constant-time multiplier in the target micro-architecture, that's easy. If not, what the compiler will need are constant-time runtime libraries to use for arithmetic on secret integers.

Even across the same microarchitecture there are CPUs that do-or-do-not perform multiply in constant time, e.g. the PowerPC 7xx CPUs used by Apple perform constant time multiplication, but many other PPC32 CPUs do not.

I agree that there is a difference between different microarchitectures (i.e., implementations of some architecture) with only some having variable-time arithmetic instructions. The situation is even worse if you take future implementations of some architecture into account, that suddenly introduce new variable-time behavior. Compilers will need a new hardware-software contract to know what subset of instructions is safe to use (see https://ts.data61.csiro.au/publications/csiro_full_text/Ge_YH_18.pdf).

However, to me this all seems like lower-level issues: what can be done on language-level is to ensure that some operations are avoided on secret data that a compiler cannot possibly protect (branching, addressing) and for all other operations leave it to the compiler and hardware to guarantee constant-time behavior.

Nice find! Here's a more recent one: https://arxiv.org/abs/1901.08338.

One of the authors of that paper also contributed to this recent attack, which shows that Intel's attempt at ad-hoc patching is useless: https://cacheoutattack.com/

RIDL is another good attack, well written paper: https://mdsattacks.com/

I imagine one could re-use some ideas from this paper and more generally from the IFC (information flow control) literature. The guarantee should probably be some form of non-interference -- program executions that only differ in secrets must not be observably different, or so. One hard problem is defining what exactly the possible observations are; that can and should (IMO) be done in an operational way (a "trace of observations" that e.g. Miri could print, or so).

Co-author of said paper here.

Cryptographic constant-time security is indeed usually stated as a non-interference property. This works by taking the operational semantics of the considered language, and instrumenting it by adding "leakages" or "observations" to step taken by instructions that can be compiled to some sort of jump or memory accesses. A program is then said to be secure if for two executions that differ only on secret inputs, then the leakages are equal.

As for impacted optimizations, surprisingly, the only thing impacted during our work on CompCert was the generation of builtin code that was not branchless, so it might be safe to consider those optimizations implemented in CompCert to not be impacted for llvm too: tailcall, inlining, CSE, constant propagation, deadcode elimination. That remains a lot less than the optimizations implemented in llvm though, I assume...

Centril · 2020-01-23T19:15:46Z

text/0000-secret-types.md

+[unresolved]: #unresolved-questions
+Out of scope: Memory zeroing (beyond implementing the drop trait) and register spilling
+
+There is an in progress RFC for LLVM that will be required for the secret type guarantees to be fully guaranteed throughout the compilation process.


What about Cranelift?

If Cranelift were to add support for this, the earlier the better.

It's going to be a monumental effort for LLVM, not because the work is complicated - it's simple and mundane, but there is going to be a lot of it because it touches every layer of code generation.

Also note that in as much as Cranelift is intended to be a backend for a WASM runtime, there are parallel efforts to add similar features to WASM, like CT-WASM:

https://github.com/PLSysSec/ct-wasm

Ideally if this does end up finding its way into WASM proper, Cranelift could use a common implementation for both Rust secret integers and WASM secret integers.

Great. This should be elaborated upon in the text. And specifically, I think guarantees are sought (which I am skeptical of, see above), I think the same guarantees must be given in Cranelift as well before this can become stable.

cc https://github.com/bytecodealliance/cranelift/issues/1327 (not exactly the same, but for the same purpose)

@bjorn3 aah nice!

I think for this a special type may be better though to also prevent any branches, non constant time multiplication and other ways of leaking it.

bjorn3 · 2020-01-23T19:40:59Z

text/0000-secret-types.md

+- ShrAssign
+- Sub
+- SubAssign
+- Sub


I think these should only be implemented when the size is at most the native pointer size to eliminate most non constant time implementations.

bjorn3 · 2020-01-23T19:41:42Z

text/0000-secret-types.md

+- Overflowing_shl
+- Overflowiing_shr
+- Overflowing_pow
+- Pow


This is never constant time I believe.

I don't seem to understand why this cannot be constant time. Can you elaborate?

It is implemented using a loop and if's: https://doc.rust-lang.org/stable/src/core/num/mod.rs.html#3667

pub fn pow(self, mut exp: u32) -> Self { let mut base = self; let mut acc = 1; while exp > 1 { if (exp & 1) == 1 { acc = acc * base; } exp /= 2; base = base * base; } // Deal with the final bit of the exponent separately, since // squaring the base afterwards is not necessary and may cause a // needless overflow. if exp == 1 { acc = acc * base; } acc }

Ah, I see what you mean. I misread this as "cannot be implemented in constant-time".

In any case, we could pow to do the computation in constant time. (example)

You need more code for constant time, like https://github.com/dalek-cryptography/curve25519-dalek/blob/master/src/backend/serial/u64/field.rs#L444 but maybe anyone using secret data already does this themselves.

I would allow base to be secret, but not exp. That way, it is trivial to do it in time that only depends on exp -- the existing pow algorithm is good enough. If secret exp is desired, you should just use a cryptography crate -- out-of-scope for std due to complexity.

bjorn3 · 2020-01-23T19:42:11Z

text/0000-secret-types.md

+- Overflowing_mul
+- Overflowing_neg
+- Overflowing_shl
+- Overflowiing_shr


*overflowing

bjorn3 · 2020-01-23T19:43:26Z

text/0000-secret-types.md

+- Default
+- Drop
+- Mul
+- MulAssign


This is often not constant time on the hardware level, right?

and / or / xor should be constant time

copy has no machine instructions it's just semantic

clone is just copy for copy types

Default just pulls pulls out a 0 for integers

Drop does nothing probably, or it could volatile write 0 or something, which is still constant time.

Add and Mul I don't know enough to say in all cases.

I should have said that I was talking about Mul and MulAssign.

Mul, and thus MulAssign, are not constant-time on ARM Cortex or Power/PowerPC. Many implementations use smaller multipliers in HW and then stitch the partial products together over multiple clock cycles. Most such implementations allow some of the clock cycles to be skipped for certain data. The most common is early termination if either input is zero. The next most common early termination case is if the upper half of either input is zero. The ARM Cortex family multipliers have an even more complex scheme; it appears to be undocumented, but testing indicates that the Cortex multipliers early terminate whenever the upper half of either input has a Hamming weight less than 2 or more than 14 (for 32x32 multiply).

Many architectures have a constant-time means of converting a status bit (usually carry) to an all-zero or all-one mask. On those architectures conditional if/then/else logic can be implemented by xor/masked-and/xor selection to run in constant time. The LLVM IR has the needed primitive, but LLVM usually realizes that primitive as a varying-execution-time conditional branch. That conversion would need to be suppressed for conditional expressions involving these secret types. Among other things, that should enable constant-time for Saturating_add and Saturating_sub.

The constant-time behaviour of multiplication can indeed be a problem on some CPUs, although it's very often safe. Much more information is at https://www.bearssl.org/ctmul.html

However, multiplication is extremely useful! Omitting it at one layer likely means that people will simulate it one layer up. (I.e. people will implement multiplication in Rust code using addition etc.) LLVM knows the CPU target, can know the behaviour of mul on that target, and is likely best placed to substitute a constant-time replacement when needed.

I think if you're using a CPU which e.g. short-circuits on multiply-by-zero or multiply-by-one, it's basically broken for cryptographic purposes (e.g. the aforementioned PPC32)

I have no expectation of Rust or LLVM to try to autodetect these CPUs or apply some sort of countermeasure, but would rather explicitly call out CPU-level assumptions about how e.g. multiplication must work, hopefully as a standard set of boilerplate which can be copied-and-pasted into Rust libraries which rely on those set of assumptions.

lovesegfault · 2020-01-23T23:54:56Z

text/0000-secret-types.md

+# Summary
+[summary]: #summary
+
+The goal is to provide primitive data types that can be used to traffic in transient secrets in code that are important to not accidentally leak. These new primitives would be used in conjunction with an in-progress LLVM RFC to ensure important security invariants like secret-independent runtime are maintained throughout the compilation process. Note that we explicitly do not want secret_isize and secret_usize, because we do not want to index based on secrets.


The LLVM RFC should be linked

lovesegfault · 2020-01-23T23:56:03Z

text/0000-secret-types.md

+```
+let x : secret_i8 = 6;
+let y : secret_i8 = 10;
+println!((x ^ y).declassify())


What does declassify() do?

Explicitly copy the secret type to a public type. As an example, think of a ciphertext, which depends on a secret key and a secret message and is thus secret, being declassified before being sent over a channel.

kennytm · 2020-01-24T06:15:33Z

🚲 The term "secret" is ambiguous. When you call these types "secret" I thought they implement zero-on-drop. Instead this RFC proposed constant-time types.

cryptojedi · 2020-01-24T07:33:24Z

bike The term "secret" is ambiguous. When you call these types "secret" I thought they implement zero-on-drop. Instead this RFC proposed constant-time types.

The term "secret" characterizes a property of the type or data. What guarantees the language can or should offer for this property is an interesting second question. Not leaking information through timing is one, but I agree that another property that (longer-term, once LLVM or cranelift have support for this) should probably be guaranteed is to zero memory and registers once the data is no longer used.

Tom-Phinney · 2020-01-24T14:32:11Z

I've been interpreting "secret" in this context as "disclosure resistant", including disclosure via timing side-channel attacks and (to the extent feasible) via data remnants in memory and registers. Note that other side-channels, such as instantaneous power fluctuations due to Hamming weight of a multiplier, are not covered.

burdges · 2020-01-24T15:36:05Z

I've no problem with the word secret here because for an individual u8, etc. any register zeroing must happen at the LLVM level too. Ideally LLVM might track register allocations across the entire crate and provide different external and internal entry points for pub fn with the external ones zeroing more and the internal ones zeroing less depending upon what they knew gets zeroed elsewhere.

We could specify this does no zeroing right now, but express an interest in register zeroing, explicitly leave stack zeroing for unclear future work, and plan that heap allocations should always deal with zeroing themselves.

@Centril If you want wrappers then maybe a trait works better:

pub trait SecretForm  {
    type Secret;
    fn to_secret(self) -> Self::Secret;
}
impl SecretForm for u8 {
    type Secret = secret_u8;
    ...
}
...
impl<'a,const N: usize> SecretForm for &'a mut [u8; N] {
    type Secret = &'a mut [secret_u8, N];
    ...
}
...
impl SecretForm for Vec<u8> {
    type Secret = SecretVec<u8>;  // Zeros on deallocation
    ...
}

egilburg · 2020-01-25T00:48:06Z

Instead of restricting division, perhaps the compiler should guarantee constant-time division (even if it means sometimes slowing down an otherwise faster computation to match the slow path?)

Also, "secret" is rather vague and could mean different things to different consumers. What is needed? Constant time operations? Guaranteed memory blanking on out-of-scope? What about other cases? For example, Ruby for a while used $SAFE for some related "secrecy" and had a bunch of interesting ideas (such as not allowing data from external sources like IO to be assigned to a "safe" variable, although that was later dropped for various technical reasons and low adoption. Could IO sanitization also be useful here?

And what if someone say builds a banking app and wants to store transactions as secrets (but requires features such as division)? They could assume secret_ is exactly what they need. For some highly secure applications, arguably many if not most variables representing customer data would qualify as "secret" if interpreted broadly. crypto_ is more specific but also more restricted in use cases.

lovesegfault · 2020-01-25T01:02:31Z

This seems relevant, as a reference if nothing else: https://www.bearssl.org/constanttime.html

burdges · 2020-01-25T02:25:58Z

I've normally seen separate manual inverse and reduce implementations, not manual division implementations. I've never seen a batch reduce operation, but batch inversion is far more efficient than one off inversions. It'd maybe resemble https://github.com/dalek-cryptography/curve25519-dalek/blob/master/src/scalar.rs#L761 so pub fn batch_invert(inputs: &mut [N]) -> N where all inputs get inverted and it returns the product of the inverses.

Yes, there is no reason for a banking app to store "transactions" as secrets because "transactions" are not usually processed often enough for details to leak. These secrets are secret keys like elliptic curve scalars or stream cipher keys, so we run the data through identical-ish computations millions of times, and a compromise does unquantifiable damage.

Ixrec · 2020-01-25T11:10:29Z

This RFC is missing an explicit discussion of all the various "secrecy properties" like constant-time operations and zeroing-on-drop, and how considering all of them leads to this particular design. I think it's suggesting they should all be covered by a single set of types because anytime you want one you want all the others, but it should explicitly argue that. And it should be much clearer about saying that zeroing-on-drop is something we want secret types to have, but it has to be a target-specific/backend-specific guarantee, although we need a surface-level type distinction to allow backends to reliably provide anything of the sort (that is true, right?). Just saying that zeroing is "out of scope" makes it sound like it's completely irrelevant to this RFC, but IIUC it's a critical part of the argument we want "secret types" at all as opposed to, for example, a bunch of functions like constant_time_add().

tarcieri · 2020-01-25T13:52:28Z

Speaking as the author of the zeroize crate and secrecy crates which provide zero-on-drop and generic secret value wrappers built on that, and also as someone who participated in the discussion that brought about this RFC... zero-on-drop for transient secrets is... tricky.

The main problem is: when do you zero? If you do it with traditional "drop" semantics, i.e. zero every intermediate secret value produced on every stack frame, you're introducing a lot of overhead to zero out values that might be erased by subsequent stack frames within the same overall secret handling computation, e.g. an implementation of a cryptographic algorithm. This has lead to some real-world performance problems with people who have tried to use zeroize for this purpose.

An alternative idea is to handle this sort of zeroization not at the level of individual values, but at the level of function calls via a "stack bleaching" approach using a special attribute which wipes the entire stack only after an algorithm (which may call other functions, potentially via FFI) has completed executing:

https://internals.rust-lang.org/t/annotations-for-zeroing-the-stack-of-sensitive-functions-which-deal-in-transient-secrets/11588

#[sensitive]
pub fn secret_key_op(key: &Key, plaintext: &[u8]) -> Vec<u8> {
    [...]
}

(note there's ample discussion about this approach on the linked rust-internals thread)

The reason why this RFC placed zeroization discussion out-of-scope for now is because secret types providing some resilience against leaking secrets via sidechannels/covert channels are useful in-and-of-themselves even without an associated zeroization story, and a language-level zeroization story is a tricky enough topic it could perhaps use its own separate RFC, with some research into the tradeoffs of value-based vs "stack bleaching" approaches.

cryptojedi · 2020-01-27T09:25:49Z

Yes, there is no reason for a banking app to store "transactions" as secrets because "transactions" are not usually processed often enough for details to leak.

This is a dangerous assumption. I don't think that programmers, in particular outside of the very specialized area of crypto code, should have to worry about /how/ secret their data is. They should have a means to express that data is "secret" and then programming language and compiler protect that data from leaking. In some cases this will come with performance penalties and in some cases those penalties will be serious, but my intuition is that introducing types for different levels of secrecy and teaching programmers what exactly those mean is much to complex and error prone.

tarcieri · 2020-01-27T14:48:46Z

In our initial discussions around this feature some interesting use cases came up, including things like rendering UIs that contain or accept secrets, such as credentials (e.g. PINs or pairing codes).

That's a pretty broad scope that encompasses much more than cryptography, and whether or not that's a good idea is debatable. I believe these should be the go-to types for anything it's desirable not to leak via sidechannels/covert channels.

burdges · 2020-01-27T23:17:48Z

We'll never have secret_char or secret_str though I think, but.. Yes, an interactive password with error correction could be processed many times, hopefully not millions, but yes enough for side channels to be a concern. We must still represent the underlying side channel resistant type though, for which the existing arguments for secret_* types still apply.

As part of stabilization, we could explore traits like CoerceUnsized and Unsize for converting from X into secret_X, so that some pub type Secret<T>(pub T); type could provide generic impl for AsRef/Borrow<[secret_u8]>, Deref<Target=[secret_u8]>, etc. for [u8] types like Vec<u8>, &mut [u8], etc. Ain't clear if this works however since negative reasoning crops up quite quickly.

Lokathor · 2020-01-27T23:30:47Z

Why, precisely, would we not ever have secret_char

burdges · 2020-01-28T00:39:57Z

You cannot iterate over the char in a str without branching, right? I'd think almost anything you do with a char requires branches or lookups, even if you magically obtain them as [char], but if not then maybe you'd choose secret_u32 anyways.

tarcieri · 2020-09-30T20:45:20Z

Apart from that: I'd be very hesitant to support accepting something like this until there's native compile support, because until that support exists, we don't know exactly what the surface area looks like, and we wouldn't want to end up with an impedance mismatch.

FYI, there's been some recent movement on a concrete LLVM proposal. I believe it will be posted to the LLVM mailing list soon.

FWIW, while I appreciate that this needs language support either way, I very much think this should be a type that uses generics: Secret<u32>, Secret<u8>, etc.

I think this would be fine from a syntactic perspective, although it would need to be inherently limited to the set of supported integer types (and effectively just sugar for a fixed set of concrete secret integer types at the LLVM/Cranelift/etc level).

joshtriplett · 2020-09-30T21:17:45Z

@tarcieri Happy to reopen this on a moment's notice, with minimal friction.

tarcieri · 2020-10-01T01:00:05Z

I'm not an expert, but I believe it should be possible to create integer types whose operations execute in constant time, without LLVM intrinsics.

What cryptographers would like in this application is a guarantee that code written with the intent of constant-time operation will generate code with this property across all LLVM backends.

While there are a number of crates that try to enforce this in pure Rust, like secret-integers and subtle, they are actively working to circumvent things like LLVM's optimizer. LLVM is free to (and does) insert branches in code which is written in a way that superficially appears constant-time, so the only present path to ensuring constant-time operation is manual inspection of the generated assembly, which varies considerably from architecture-to-architecture and depending on various compiler and optimization settings.

The only way to get this guarantee is if it's enforced by the code generator itself, be it LLVM or Cranelift, with end-to-end enforcement through every step of compilation that the constant-time property is maintained.

stef · 2020-10-14T13:27:21Z

i was wondering where is this LLVM rfc? and if there is any way to track the status of that RFC somewhere?

nikomatsakis · 2020-10-14T19:02:27Z

@rfcbot fcp reviewed -- I am interested in seeing us support this kind of thing at some point, but postponing seems like the right step for this particular RFC (I think that if the LLVM work has progressed, a project proposal might be appropriate in the future).

xFrednet · 2020-11-18T17:31:05Z

From reading the previous comments I would prefer the the syntax using a wrapper like this Secret<u32> out of a few reasons:

This enables us to use generics which can be a major advantage. This has been stated before.
Having these data types as primitives like secret_i16 could confuse newcomers to programming.
I remember using a lot of try and error to learn programming. Finding a primitive data type like that would have confused me but also lead me to use it everywhere because: "Why not?".
Having it as a wrapper type makes it clearer to me that this is a special case of that type and should just be used for a specific cases. In this case security.
If someone prefers having a single type for each secret type it's easy to simply define a type that corresponds to the secret like type secret_i16 = Secret<i16> this could however not be done the other way around. (From primitive types to a generic wrapper type). We would therefor gain flexibility by using a wrapper type.
The last and probably smallest argument is IDE support. I think it would be easier for the use auto completing if it is a separate type. This way you could just:
1. Type Sec
2. Select Secret<T>
3. And add the data type to have something like Secret<u8>

That being said I think we should look at the final LLVM implementation. Rust is still fairly close to the hardware. If LLVM actually defines new data types for each secure type than it might be better to define each type individually. In almost all other cases I would prefer the wrapper type for the stated reasons.

tarcieri · 2020-11-18T17:44:42Z

If LLVM actually defines new data types for each secure type than it might be better to define each type individually. In almost all other cases I would prefer the wrapper type for the stated reasons.

As someone who attended the original session which was the impetus for this RFC, I'm also certain that LLVM will model them all as separate types.

That said, I agree that it would feel much more idiomatic in Rust to have a Secret<_> wrapper, but it would have to have the caveat that it could only be composed with a select set of core types, at least at first anyway.

It seems like things like Secret<[u64; 4]> or Secret<(u64, u64, u64, u64)> should be easy enough. Perhaps it'd be possible to further extend it.

Also note that I am told there is a PoC of this feature in LLVM which will be published soon, although I haven't seen any definitive timeline.

pickfire · 2020-11-19T03:12:59Z

But I wonder if it will create a pitfall when someone do Secret<Option<NonZeroU8>> to do some optimization tricks within Secret, what will be the behavior?

xFrednet · 2020-11-19T13:53:36Z

I would think that such behavior has to be disallowed. That was one of the caveats @tarcieri was talking about.

[...] but it would have to have the caveat that it could only be composed with a select set of core types, at least at first anyway.

I'm a bit unsure where you would need Secret<Option<NonZeroU8>> all cases that I could think of could also work with Option<Secret<NonZeroU8>>. That is basically equivalent to the way it would be with the secret primitive types. Would anything speak against that option?

tarcieri · 2020-11-19T13:58:26Z

Option in particular is tricky. subtle handles it with CtOption:

https://docs.rs/subtle/2.3.0/subtle/struct.CtOption.html

CtOption always does the computation, substituting a Default value if need be, and then carrying a flag (modeled as a subtle::Choice, i.e. an integer ultimately passed as an argument to something like CMOV) as to whether the inner value is the equivalent of Some or None.

I'm not sure building all that machinery into Rust at the language level makes sense. Maybe. But subtle handles things well enough for now.

burdges · 2020-12-01T10:27:44Z

At Real World Crypto several years ago, there was a study arguing that crypto code should actually meter the stack when entering sensitive areas and then zero all touched stack space when done. It's normally faster to zero a bunch of stack than to zero each ephemeral [u64; 4] used internally. So how would one meter the stack like this?

xFrednet · 2020-12-01T12:12:26Z

@burdges this could be done by using a macro for the function. It could have a syntax like:

#[zero_stack]
fn do_stuff(...) {
    // Some secure stuff
}

Something like this should already be possible if I understood you correctly that the function stack gets zeroed.

This RFC is mostly about how we mark data to indicate that the operations should not be optimised. Optimized operations can enable someone to get information just from measuring the execution time of submitted data it self. Also note that LLVM is discussing to just use these types instead of zeroing the stack. :)

tarcieri · 2020-12-01T12:39:17Z

@burdges that was also discussed as part of the creation of this RFC, but considered somewhat orthogonal.

Note that the Unresolved questions section lists zeroization as explicitly out-of-scope. This RFC is explicitly about types where the compiler can guarantee constant-time operation at every pass of compilation and code generation.

That said, based on the same discussions which lead to this RFC I started a separate internals thread about that idea (a.k.a. "stack bleaching") earlier this year which, as it were, looks nearly identical to what @xFrednet proposed:

https://internals.rust-lang.org/t/annotations-for-zeroing-the-stack-of-sensitive-functions-which-deal-in-transient-secrets/11588

Perhaps there are ways these two ideas could be combined.

rfcbot · 2020-12-09T20:14:18Z

🔔 This is now entering its final comment period, as per the review above. 🔔

rfcbot · 2020-12-19T20:18:25Z

The final comment period, with a disposition to postpone, as per the review above, is now complete.

As the automated representative of the governance process, I would like to thank the author for their work and everyone else who contributed.

The RFC is now postponed.

SamB · 2023-10-28T04:06:20Z

It seems wierd to try to make this look generic when the usual numeric types aren't generic-y either: while you wouldn't actually want to write the secret operations in terms of the normal ones, it seems kind of crazy to write them as Secret when you couldn't, say, write generic QuickCheck-style checks that the normal and secret versions of operations compute corresponding values in every case. I guess you could use 1-3 macros instead of 1-3 generics?

(You guys do have a QuickCheck-alike, right?)

tarcieri · 2023-10-28T13:52:53Z

It seems wierd to try to make this look generic when the usual numeric types aren't generic-y either: while you wouldn't actually want to write the secret operations in terms of the normal ones, it seems kind of crazy to write them as Secret

Though this RFC only covers the core integer types, based on my experiences implementing Rust cryptography code it would be nice if such secret handling could potentially be extended to any core integer type including SIMD registers, as they can all potentially contain secrets.

With something like Secret<core::arch::x86_64::__m256i> for example, that's straightforward. What would you call it without generics though? core::arch::x86_64::secret___m256i?

(You guys do have a QuickCheck-alike, right?)

https://crates.io/crates/quickcheck
https://crates.io/crates/proptest

added secret types rfc

6f96744

colmmacc reviewed Jan 23, 2020

View reviewed changes

Centril added T-lang T-libs-api A-machine A-primitive A-traits-libstd A-arithmetic A-security labels Jan 23, 2020

Centril reviewed Jan 23, 2020

View reviewed changes

bjorn3 requested changes Jan 23, 2020

View reviewed changes

lovesegfault reviewed Jan 23, 2020

View reviewed changes

rfcbot added proposed-final-comment-period disposition-postpone labels Sep 30, 2020

rfcbot added the final-comment-period label Dec 9, 2020

rfcbot removed the proposed-final-comment-period label Dec 9, 2020

rfcbot added finished-final-comment-period and removed final-comment-period labels Dec 19, 2020

rfcbot added to-announce postponed and removed disposition-postpone labels Dec 19, 2020

rfcbot closed this Dec 19, 2020

tarcieri mentioned this pull request Aug 31, 2021

Compiler-level changes The-DevX-Initiative/RCIG_Coordination_Repo#55

Open

junderw mentioned this pull request Nov 23, 2021

Question: Could rust compiler optimization possibly expose side channel attacks? rust-bitcoin/rust-secp256k1#341

Open

tarcieri mentioned this pull request Dec 27, 2023

Add BitOps trait RustCrypto/crypto-bigint#507

Merged



		# Reference-level explanation
		[reference-level-explanation]: #reference-level-explanation

added secret types rfc #2859

added secret types rfc #2859

Conversation

avadacatavra commented Jan 23, 2020 • edited by XAMPPRocky Loading

Lokathor commented Jan 23, 2020

avadacatavra commented Jan 23, 2020

Choose a reason for hiding this comment

burdges commented Jan 23, 2020

Choose a reason for hiding this comment

Centril Jan 23, 2020 • edited Loading

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

tarcieri Jul 31, 2020 • edited Loading

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

RalfJung Jan 28, 2020 • edited Loading

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Centril Jan 23, 2020 • edited Loading

Choose a reason for hiding this comment

bjorn3 Jan 23, 2020 • edited Loading

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Tom-Phinney Jan 23, 2020 • edited Loading

Choose a reason for hiding this comment

Tom-Phinney Jan 23, 2020 • edited Loading

Choose a reason for hiding this comment

Choose a reason for hiding this comment

tarcieri Jan 24, 2020 • edited Loading

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

kennytm commented Jan 24, 2020

cryptojedi commented Jan 24, 2020

Tom-Phinney commented Jan 24, 2020

burdges commented Jan 24, 2020 • edited Loading

egilburg commented Jan 25, 2020 • edited Loading

lovesegfault commented Jan 25, 2020

burdges commented Jan 25, 2020

Ixrec commented Jan 25, 2020

tarcieri commented Jan 25, 2020 • edited Loading

cryptojedi commented Jan 27, 2020

tarcieri commented Jan 27, 2020

burdges commented Jan 27, 2020

Lokathor commented Jan 27, 2020

burdges commented Jan 28, 2020

tarcieri commented Sep 30, 2020 • edited Loading

joshtriplett commented Sep 30, 2020

avadacatavra commented Jan 23, 2020 •

edited by XAMPPRocky

Loading

Centril Jan 23, 2020 •

edited

Loading

tarcieri Jul 31, 2020 •

edited

Loading

RalfJung Jan 28, 2020 •

edited

Loading

Centril Jan 23, 2020 •

edited

Loading

bjorn3 Jan 23, 2020 •

edited

Loading

Tom-Phinney Jan 23, 2020 •

edited

Loading

Tom-Phinney Jan 23, 2020 •

edited

Loading

tarcieri Jan 24, 2020 •

edited

Loading

burdges commented Jan 24, 2020 •

edited

Loading

egilburg commented Jan 25, 2020 •

edited

Loading

tarcieri commented Jan 25, 2020 •

edited

Loading

tarcieri commented Sep 30, 2020 •

edited

Loading

tarcieri commented Oct 1, 2020 •

edited

Loading

tarcieri commented Nov 18, 2020 •

edited

Loading

tarcieri commented Nov 19, 2020 •

edited

Loading

tarcieri commented Dec 1, 2020 •

edited

Loading