Expose round functions under separate trait #323

Closed
shamatar opened this issue Oct 31, 2021 · 15 comments

Comments

@shamatar

shamatar commented Oct 31, 2021

It would be handy to have access to the internals of e.g. SHA-256/512, the BLAKE2 family and the SHA-3 family, so that a caller can manually update a state by "absorbing" one rate-sized block at a time. All of the hash families named above follow the same logic anyway: an internal state is updated over potentially more than one round, and each round processes only a fixed number of bytes.

To clarify: such a trait could even be made "unsafe", as the caller would have to:

  • somehow create an "empty" state (it would be great to have this as part of the trait if the "raw" internal state is not pub. By "raw" I mean e.g. the 256 bits of internal state of SHA-256, without any extra information such as how much input was processed before this point, etc.)
  • call the "round function" as many times as necessary, ideally with [u8; RATE] as input, although for e.g. the SHA-3 family [u64; RATE_IN_WORDS] may be OK in principle. Whatever the caller's reason for using this functionality, they would have to take care of all the padding!
  • take the "state" and either get some as_ref() on it, to read the inner data and produce the final hash value manually (the caller is 100% responsible), or (more convenient, but more work, with a lot of variation between hashes) have some "into_hash(state)" function that produces a hash from the state. Functions with extendable output are out of scope for this feature request, so the final state is expected to be used only once.

It may be possible to use some feature flags and e.g. just expose a raw "compress" function for sha256 and do everything else by hand (I'm not sure whether the internals of the other families are exposed to the necessary degree under feature flags), but a consistent way to do this wherever such a workflow is possible would be great, and would avoid forking and butchering a crate just to add more pub.
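The three responsibilities listed above could be sketched roughly as follows. This is only an illustration of the requested shape: the trait `RawRounds`, its method names, and the `ToyHash` type are hypothetical and are not part of any RustCrypto crate, and `ToyHash` is a deliberately trivial mixing function, not a real hash.

```rust
/// Hypothetical trait; the names are illustrative and not part of any RustCrypto crate.
pub trait RawRounds {
    /// Raw chaining state: no length counter, no buffered input.
    type State;
    /// One rate-sized input block, e.g. `[u8; 64]` for SHA-256.
    type Block;
    /// Fixed-size digest read out of the final state.
    type Output;

    /// Create the "empty" (initial) state.
    fn empty_state() -> Self::State;
    /// Run the round function over exactly one block. Padding is the caller's job.
    fn absorb_block(state: &mut Self::State, block: &Self::Block);
    /// Interpret the final state as a hash value.
    fn into_hash(state: Self::State) -> Self::Output;
}

/// Toy stand-in (NOT a real hash), used only to show the calling pattern.
pub struct ToyHash;

impl RawRounds for ToyHash {
    type State = [u32; 4];
    type Block = [u8; 16];
    type Output = [u8; 8];

    fn empty_state() -> [u32; 4] {
        [0x0123_4567, 0x89ab_cdef, 0xfedc_ba98, 0x7654_3210]
    }

    fn absorb_block(state: &mut [u32; 4], block: &[u8; 16]) {
        for (i, word) in block.chunks_exact(4).enumerate() {
            let w = u32::from_le_bytes([word[0], word[1], word[2], word[3]]);
            // XOR the input word in, then mix with a rotate and an add.
            state[i] = (state[i] ^ w).rotate_left(7).wrapping_add(state[(i + 1) % 4]);
        }
    }

    fn into_hash(state: [u32; 4]) -> [u8; 8] {
        // The "hash" is simply the first half of the state.
        let mut out = [0u8; 8];
        out[..4].copy_from_slice(&state[0].to_le_bytes());
        out[4..].copy_from_slice(&state[1].to_le_bytes());
        out
    }
}

/// Generic driver: the caller supplies blocks that are already padded.
pub fn hash_padded_blocks<H: RawRounds>(blocks: &[H::Block]) -> H::Output {
    let mut state = H::empty_state();
    for block in blocks {
        H::absorb_block(&mut state, block);
    }
    H::into_hash(state)
}
```

A caller would pad the message itself and then call something like `hash_padded_blocks::<ToyHash>(&blocks)`.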

@tarcieri
Member

The sha2 crate already has compress256 and compress512 primitives gated under the compress feature:

@newpavlov
Member

newpavlov commented Oct 31, 2021

The sha2 crate already provides compression functions behind the compress flag. The Keccak sponge function used in the sha3 crate is exposed in the keccak crate.

Also note that in digest v0.10 (see #217) we plan to expose block-level traits, though they still do not allow manual creation of a hash state or custom generation of a final hash value. The main goal is to reduce the amount of code duplication across implementation crates and to improve performance in some rare cases (e.g. in the hmac crate).

@shamatar
Author

Looks like UpdateCore is a good approximation for a start, if it is possible to peek into internals of the Buffer enough to get the hash from the state.

I'll experiment tomorrow; everything seems to depend on the concrete definitions of the associated types.

@newpavlov
Member

UpdateCore is buffer agnostic. Only the *OutputCore traits depend on a buffer type. Effectively those buffer types are used as partial block inputs with different invariants (BlockBuffer guarantees that input "slice" has length in the range of 0..BlockSize, while for LazyBuffer the range is 0..=BlockSize). You can construct those buffers manually using the new methods.
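To make the two invariants concrete, here is a minimal sketch of an eager and a lazy partial-block buffer. `ToyBuffer` and its methods are hypothetical and written only for this illustration; they are not the API of the buffer types mentioned above, and the block size is shrunk to 4 bytes for readability.

```rust
const BLOCK: usize = 4; // tiny block size to keep the example readable

/// Toy buffer: `lazy = false` keeps the pending length in 0..BLOCK,
/// `lazy = true` keeps it in 0..=BLOCK.
pub struct ToyBuffer {
    buf: [u8; BLOCK],
    len: usize,
    lazy: bool,
}

impl ToyBuffer {
    pub fn new(lazy: bool) -> Self {
        Self { buf: [0; BLOCK], len: 0, lazy }
    }

    /// Feed `data`, calling `compress` once per block that is ready to be processed.
    pub fn update(&mut self, data: &[u8], mut compress: impl FnMut(&[u8; BLOCK])) {
        for &byte in data {
            // Lazy: flush a full block only once we know more input follows it.
            if self.len == BLOCK {
                compress(&self.buf);
                self.len = 0;
            }
            self.buf[self.len] = byte;
            self.len += 1;
            // Eager: flush as soon as the block is full.
            if !self.lazy && self.len == BLOCK {
                compress(&self.buf);
                self.len = 0;
            }
        }
    }

    /// Number of bytes still waiting in the buffer.
    pub fn pending(&self) -> usize {
        self.len
    }
}
```

Holding back a full block is useful when the last block has to be processed differently from the others, because the buffer cannot know that a block is the last one until finalization.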

@shamatar
Author

shamatar commented Nov 1, 2021

I've checked what's available (not to mention that the branch doesn't build because of missing features in the dependency chain), and unfortunately it's not enough, at least for sha3. For any macro implementation over

pub struct $name {
    state: Sha3State,
}

Sha3State is not pub, and there is no getter function, or even an AsRef impl. I also found it strange that the state field is declared pub while the struct itself is only pub(crate) anyway:

pub(crate) struct Sha3State {
    pub state: [u64; PLEN],
}

@newpavlov
Member

What exactly do you want to do? If you want something custom based on the Keccak function, then use the keccak crate directly without relying on sha3.

@shamatar
Author

shamatar commented Nov 1, 2021

Sure, customizing always works, but as explained in the first post, many hashes are similar in that one can define a clear internal state and an update/absorb function. The new traits already provide access to the absorb step, but not to the raw internal state. The easiest way is to add an extra trait with a corresponding associated type. Alternatively, I can always unsafe transmute based on prior knowledge of the internal structure, but if you can suggest a good name for such a trait and a place to put it, I can make a PR for it too.

@newpavlov
Member

You are not answering the question. Why exactly do you want to have access to the raw internal state, generalized across various hashes even? Exposing internal structure seriously limits us since it becomes part of our public API. This means we will not be able to change it without making a breaking release. Granted, in the existing implementation crates internal states are relatively stable and do not depend on target arch, but I do not want to tie our hands without a really compelling reason.

@shamatar
Author

shamatar commented Nov 1, 2021

The reason is an internal need to use a hash function purely as a sequence of discrete calls to the round function, where the caller is responsible for padding the initial byte array and for computing the number of rounds. Once all the rounds have run, the caller should also be able to interpret some part of the internal state as a logical result (in most cases the hash output is just a part of the state, so it can be viewed this way).

N.B. UpdateCore in the new set of traits is public, but it is only used to provide some internal abstractions and to reduce code duplication. In a similar manner all the "Finalize"-like functions can instead be viewed as pad -> run round -> take state and use it to get the hash value.
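As an illustration of the "pad" step that the caller would own, here is a minimal sketch for SHA-256, whose padding is a 0x80 byte, zero bytes, and then the message length in bits as a big-endian 64-bit integer. The function name `sha256_pad` is made up for this example; the resulting blocks would then be fed one by one to a compression function.

```rust
/// Split `msg` into 64-byte blocks with SHA-256 padding applied:
/// a 0x80 byte, zero bytes, then the message length in bits as a big-endian u64.
pub fn sha256_pad(msg: &[u8]) -> Vec<[u8; 64]> {
    let bit_len = (msg.len() as u64).wrapping_mul(8);
    let mut padded = msg.to_vec();
    padded.push(0x80);
    // Zero-fill until exactly 8 bytes are left in the last block.
    while padded.len() % 64 != 56 {
        padded.push(0);
    }
    padded.extend_from_slice(&bit_len.to_be_bytes());
    // The total length is now a multiple of 64, so no bytes are left over.
    padded
        .chunks_exact(64)
        .map(|chunk| {
            let mut block = [0u8; 64];
            block.copy_from_slice(chunk);
            block
        })
        .collect()
}
```

A 55-byte message still fits in one block, while a 56-byte message needs a second block because the length field no longer fits after the 0x80 byte.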

@tarcieri
Member

tarcieri commented Nov 2, 2021

Perhaps have a look at this construction? https://moderncrypto.org/mail-archive/noise/2018/001876.html

@shamatar
Author

shamatar commented Nov 2, 2021

That is a completely independent proposal and not what I'm trying to achieve. For keccak256 an unsafe transmute + standard Digest is enough for all my purposes, so maybe I'll close this issue, as it is kind of going nowhere.

P.S. Do you need any feedback on the new traits? I built the branch from the PR locally and found a few places that needed fixes, such as missing features, imports and traits.

@newpavlov
Member

@shamatar
Can you draft the traits which you have in mind here in comments? I still don't quite understand why exposed compression functions and the finalize traits are not sufficient and you want to manually mess with the internal state.

In a similar manner all the "Finalize"-like functions can instead be viewed as pad -> run round -> take state and use it to get the hash value

Note that not only may you need to execute several rounds after padding, but the compression function can also differ from the one used in UpdateCore (e.g. BLAKE2 uses flags to distinguish the last block from the others). There is even the FSB hash, which calls a different hash function (Whirlpool) inside its finalization step.
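A minimal sketch of why a single "same round on every block" interface does not cover this case: in BLAKE2 the compression function takes a byte counter and a finalization flag alongside the state and the message block. The code below is a toy with that shape only; `toy_compress` and `toy_hash` are hypothetical names and the mixing is not BLAKE2.

```rust
/// Toy compression step (NOT BLAKE2) that, like BLAKE2's, takes a byte counter
/// and a last-block flag in addition to the state and the message block.
pub fn toy_compress(state: &mut [u64; 2], block: &[u8; 8], bytes_so_far: u64, is_last: bool) {
    let m = u64::from_le_bytes(*block);
    // The flag word is all ones for the final block and zero otherwise.
    let flag = if is_last { u64::MAX } else { 0 };
    state[0] = (state[0] ^ m ^ bytes_so_far).rotate_left(13).wrapping_add(state[1]);
    state[1] = (state[1] ^ flag).rotate_left(29).wrapping_add(state[0]);
}

/// Driver that has to know which block is the last one.
pub fn toy_hash(blocks: &[[u8; 8]]) -> [u64; 2] {
    let mut state: [u64; 2] = [0x6a09_e667_f3bc_c908, 0xbb67_ae85_84ca_a73b];
    let mut bytes_so_far = 0u64;
    for (i, block) in blocks.iter().enumerate() {
        bytes_so_far += 8;
        toy_compress(&mut state, block, bytes_so_far, i + 1 == blocks.len());
    }
    state
}
```

The same block produces a different state depending on `is_last`, so a caller driving the rounds by hand would need more than one uniform round function.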

@shamatar
Author

Sure, will do over the weekend.

@shamatar
Author

A full set would look like this, I'd say. I made a blake2s wireframe as a sanity check. Keccak256 could use separate absorb + apply.

https://play.rust-lang.org/?version=nightly&mode=debug&edition=2021&gist=f7d535446f8779fdaf3319c5e631d5e2

@newpavlov
Member

newpavlov commented Dec 8, 2021

@shamatar
At first glance your traits look like over-engineering to me (though I cannot say I have fully understood them). The current design is already quite far from simple, and I would like not to increase its complexity further without a really good reason. In terms of simplifying implementation crates, I don't think it improves the situation compared to digest v0.10.

And since your proposal is not driven by a concrete practical use case, I am inclined to close this issue.

In future we may introduce traits for the sponge construction, though I am not sure about exposing the full state, since the construction security relies on the hidden part of the state.

For keccak256 an unsafe transmute + standard Digest is enough for all my purposes

What exactly are you transmuting? Note that transmuting types defined in the sha3 crate to your own types is UB, even if you have simply copied their definitions. As suggested earlier, I think the best approach in your case is to build your construction on top of the keccak crate without relying on sha3.
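A rough sketch of what building directly on a permutation could look like, with the state owned by the caller so that no transmute is needed. The `absorb` function and the `toy_permute` stand-in are hypothetical; the permutation is passed in as a parameter, on the assumption that a real Keccak-f[1600] implementation with the shape `fn(&mut [u64; 25])` (which, to my knowledge, is what the keccak crate's `f1600` looks like) would be supplied in its place.

```rust
/// Absorb already-padded input into a caller-owned 1600-bit sponge state,
/// `RATE_WORDS` 64-bit lanes per block (17 lanes = 136 bytes for Keccak-256).
/// `permute` stands in for a Keccak-f[1600] implementation.
pub fn absorb<const RATE_WORDS: usize>(
    state: &mut [u64; 25],
    blocks: &[[u64; RATE_WORDS]],
    permute: impl Fn(&mut [u64; 25]),
) {
    for block in blocks {
        // XOR the block into the rate part of the state...
        for (lane, word) in state.iter_mut().zip(block.iter()) {
            *lane ^= *word;
        }
        // ...then run the permutation over the whole state.
        permute(&mut *state);
    }
}

/// Placeholder permutation for the example only (NOT Keccak-f).
pub fn toy_permute(state: &mut [u64; 25]) {
    state.rotate_left(1);
}
```

Since the caller owns the `[u64; 25]` array, it can read the digest straight out of the first lanes after the last block, and the padding remains the caller's responsibility.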
