Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Introduce NIP-44 encryption standard #715

Closed
wants to merge 1 commit into from

Conversation

staab
Copy link
Member

@staab staab commented Aug 11, 2023

A distilled version of #574 focused on the cryptography standard, and omitting DM-specific stuff. @paulmillr please feel free to incorporate this diff into the original PR and I can close this one, or close the original and we can continue from here, it's up to you.

44.md Outdated Show resolved Hide resolved
44.md Outdated Show resolved Hide resolved
44.md Outdated Show resolved Hide resolved
44.md Outdated Show resolved Hide resolved
44.md Outdated Show resolved Hide resolved
44.md Outdated Show resolved Hide resolved
@jb55
Copy link
Contributor

jb55 commented Aug 11, 2023 via email

@jb55
Copy link
Contributor

jb55 commented Aug 11, 2023

because if this is true then there should be an advisory and every client should remove it?

@staab
Copy link
Member Author

staab commented Aug 11, 2023

nip4 is broken cryptography that will expose your private keys. It should never be used.

This is an important point. Signing applications don't support the new version, so compatibility with NIP 04 could ease adoption (or may it more complicated, I'm not sure). @paulmillr can you confirm how bad it would be to enshrine NIP 04 as a deprecated version 0?

@staab
Copy link
Member Author

staab commented Aug 11, 2023

Looping in a comment from #706 (review) we have a few different proposals for the payload:

  • CSV — compact, simple parsing, might be easier on mobile for @jb55. Downside is positional and @vitorpamplona thinks it's complicated
  • JSON.stringify({ciphertext: base64.encode(ciphertext), nonce: base64.encode(nonce), v}) — currently implemented in multiple places.
  • base64.encode(JSON.stringify({ciphertext, nonce, v})) — one fewer base64 call. I'm not sure base64 is even necessary if we're encoding to json either base64 decreases transport size significantly.
  • Tags — would make @jb55's life easier, but probably make interfacing with signers harder.
  • Custom TLV — faster but more annoying to build and parse

I'm going to leave what we currently have unless anyone has strong opinions so we can put this to rest.

@paulmillr
Copy link
Contributor

base64 is useful, because the result data is binary, and representing it in hex instead of base64 would increase the size by 1.5x in my benchmarks. Saving a traffic is cool.

@vitorpamplona
Copy link
Collaborator

Just for completeness, the most performant option is a custom TLV, which we also already implement in Nostr. CSV is fast, but not as fast/simple as TLV because of sequential bytearray copy instructions. We can even do the inner gossip event as a TLV as well.

I would still go for JSON.

@paulmillr
Copy link
Contributor

paulmillr commented Aug 11, 2023

Before further discussion, I would like to remind that nostr DMs, if nostr gets a traction, would be used by all kinds of people, including dissidents in dangerous places. Would you be willing to risk someone getting killed just because you want to keep backwards compatibility with bad encryption?

can you confirm how bad it would be to enshrine NIP 04 as a deprecated version 0?

  1. Unhashed ECDH exposes some curves to Cheon's attack 1, which allows an attacker to learn private key structure when doing continuous diffie-hellman operations. Hashing converts decisional oracle into a computational oracle, which is harder.
  2. CBC IVs should not only be unrepeatable, they must be unpredictable. Predictable IV leads to Chosen Plaintext Attack. There are real exploits 2.
  3. CBC with reused key has 96-bit nonces and loses 6 bits of security for every magnitude. 10**4 (10K) messages would bring archive security of the scheme to 72 bits. 72 bits is very low. For example, BTC has been doing 2^67 hashes/sec as per early 2023. Advanced adversaries have a lot of resources to break the scheme
  4. CBC has continuously been exploited by Padding Oracle attack. We can't expect all implementations to have proper padding check
  5. AES was modeled as ideal cipher. Ideal ciphers don't care about non-random keys. However, AES is not an ideal cipher 3. Having a non-IND key, as in NIP4, exposes it to all kinds of unknown attacks, and reducing security of overall scheme

Footnotes

  1. Kozaki, Kutsuma, Matsuo; Remarks on Cheon’s Algorithms for Pairing-Related Problems

  2. https://derekwill.com/2021/01/01/aes-cbc-mode-chosen-plaintext-attack/

  3. "AES-256 can not serve as a replacement for an ideal cipher", Biryukov-Khovratovich; Related-key Cryptanalysis of the Full AES-192 and AES-256

@staab
Copy link
Member Author

staab commented Aug 11, 2023

Thank you, that's exactly the kind of answer I was looking for. I'm so glad you're here to help us devs get things right. I will remove the recommendation for NIP 04 and will not include backwards compatibility in my implementation.

Tangentially, does this mean that anyone who has sent a substantial number of NIP 04 messages is more likely to have their key brute-forced?

@paulmillr
Copy link
Contributor

anyone who has sent a substantial number of NIP 04 messages is more likely to have their key brute-forced?

Yes.

That's from my understanding, but I think any auditor of the protocol would come to similar conclusions.

@staab
Copy link
Member Author

staab commented Aug 11, 2023

Well that sucks haha

@BlowaterNostr
Copy link

Why is custom TLV fast?

44.md Outdated Show resolved Hide resolved
44.md Outdated Show resolved Hide resolved
@vitorpamplona
Copy link
Collaborator

Why is custom TLV fast?

CSV Parser has to walk the byte sequence until it finds a comma. A TLV has length before the byte array. You can just do an arraycopy(0,length) and move the pointer to the next type value.

44.md Outdated Show resolved Hide resolved
@paulmillr
Copy link
Contributor

@fiatjaf what do you think? It's just encryption here, that can be used in any part of nostr. No DMs.

@staab
Copy link
Member Author

staab commented Aug 19, 2023

Of course. Obviously. Chacha is just a stream cipher

Wasn't obvious to me, I was today years old when I learned the term "malleable".

I just think it should be mentioned.

Added that to the NIP. Making edits is no problem, if you see problems in the PR feel free to suggest changes in github and I'll merge them.

@mikedilger
Copy link
Contributor

mikedilger commented Aug 19, 2023

libsecp256k1_ecdh() already does a SHA256. The Javascript example then does another SHA256. (this was wrong)

And this affects all those test vectors.

EDIT: hang on... I don't know how getSharedSecret() maps to libsecp256k1_ecdh(). And doing a double SHA256 in my code still isn't getting me data that matches the test vectors. But running the typescript does match the test vectors.

EDIT2: OK if the JavaScript was like this (without the subarray(1,33) on the shared secret output), then the shared key matches my rust code's output (which does not explicitly run a SHA256).

export const getSharedSecret = (privkey: string, pubkey: string): Uint8Array =>
  sha256(secp256k1.getSharedSecret(privkey, '02' + pubkey))

The shared key for test vector 0 is b1c9938f01121e159887ac2c8d393a22e4476ff8212de13fe1939de2a236f0a7

@mikedilger
Copy link
Contributor

I share my testing code here: https://github.com/mikedilger/nip44tests

@paulmillr
Copy link
Contributor

paulmillr commented Aug 20, 2023

@mikedilger With regards to padding, I support your decision number 2, it's simple and no-bs.

As for ecdh:

  1. noble-curves return 33-byte output for ecdh, the same output as in ecdsa-getPublicKey: parity byte AND x coordinate. The output is not hashed for flexibility.
  2. libsecp256k1_ecdh does this: https://github.com/bitcoin-core/secp256k1/blob/6b9507adf684f088f665354608685255d0bd162f/src/modules/ecdh/main_impl.h#L13-L24. It seems like it matches noble-curves behavior, with sha256() added on top?
  3. In this PR I initially proposed stripping parity byte because I thought libsecp256k1 strips is, but it doesn't seem to do so.

There is no single ECDH standard, unfortunately. For example, this rust library returns 64-byte unhashed point. What's needed: investigating more libraries to see how they behave and if they can be simply adjusted to one way or another.

See Thomases comment on stack overflow:

Formally, ANSI X9.63 defines use of elliptic curve Diffie-Hellman, and the "shared secret" is obtained by hashing the X coordinate of the resulting point. The Bitcoin world, intent on doing things its own way with no regard for standards (after all, at its core, Bitcoin is an anarchist endeavour), decided to do things differently. Since Bitcoin is the main user of secp256k1, their way of doing things is a de facto standard

@staab

Wasn't obvious to me, I was today years old when I learned the term "malleable".

We just need to understand all the use-cases of the NIP and decide if the result message would always be signed with schnorr. If they won't, perhaps it's worth switching to poly1305.

@mikedilger
Copy link
Contributor

Ok @paulmillr I have gotten my code to match and I get it now. The rust lib secp256k1 has a SharedSecret which effectively does a SHA256(parity + X) which gives the b1c99... output. The TypeScript/noble example in the NIP effectively does a SHA256(X) without the parity which gives the 0135d... output. I don't care, I was just trying to figure out why my code didn't match the test vectors.

@staab
Copy link
Member Author

staab commented Aug 20, 2023

We just need to understand all the use-cases of the NIP and decide if the result message would always be signed with schnorr. If they won't, perhaps it's worth switching to poly1305.

I think that's a safe assumption for now, we can introduce a new version if we need it

@fiatjaf
Copy link
Member

fiatjaf commented Aug 21, 2023

Stupid question, but isn't the event id/hash also serving the same function the poly1305 thing would?

@fiatjaf
Copy link
Member

fiatjaf commented Aug 21, 2023

Implemented this on https://github.com/nbd-wtf/go-nostr/blob/cd86ee25147ac3ec26326bbac52f90ebe2fb3710/nip44/nip44.go with the test vectors.

The test vectors don't check the type of the base64 encoding though.

@paulmillr
Copy link
Contributor

paulmillr commented Aug 21, 2023

Stupid question, but isn't the event id/hash also serving the same function the poly1305 thing would?

hash can be re-calculated if the underlying data is changed.

poly1305 is like hmac. they both receive key, data instead of just data, like hash.

so, poly1305(sharedKey, changedData) would fail. hash(changedData) would simply produce new, correct hash, which an attacker could place into the hash / id field.

Again, signatures resolve all the issues here. It's not possible to forge a new message that's valid for the same signature. Unless, of course, you're using shitty malleable signature scheme. Bonus point: Poly1305 is malleable and its outputs can be forged. Bonus point 2: tweetnacl implementation of ed25519 is malleable, so was secp256k1 ecdsa some years ago.

@fiatjaf I want to pursue padding addition, so this will need another day or so. Above, the good point was made with regards to it: yes would always produce 1+24+3 bytes, which can totally be used by an attacker to deduce the result. I will update test vectors ASAP. The padding is almost free, but is important. Secure messengers use padding.

44.md Outdated Show resolved Hide resolved
07.md Outdated Show resolved Hide resolved
46.md Outdated Show resolved Hide resolved
@mikedilger
Copy link
Contributor

I think this PR needs:

  1. Padding (I like my option2 but open to others)
  2. Switch to sha256(parity + X) (33 byte input) since that is what all the libraries are doing and we are working around the library implementations currently (unless we insist on the bitcoin tradition of doing things our own way).
  3. Specify base64 uses the RFC4648 alphabet and padding.
  4. Paul's new test vectors under those changes

@staab
Copy link
Member Author

staab commented Aug 24, 2023

Check on 2 and 3, just waiting for test vectors and padding. On padding, we cover that in #686, but a lower-level version would probably be better.

Currently defined encryption algorithms:

- `0x00` - Reserved
- `0x01` - XChaCha20 with same key `sha256(parity + ecdh)` per conversation
Copy link
Contributor

@mikedilger mikedilger Aug 25, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actually the parity byte is part of the output of ecdh for most libraries, as is the sha256 part.

I would put something like ecdh(private_key, public_key) where the ecdh() function computes the shared point, and then returns the sha256 hash of the x-only component prefixed by the parity byte ('02').

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

except that the libraries implement ECDSA and not schnorr. There is no parity byte in schnorr.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I only understand a little. I thought the parity byte represented the Y coordinate and there were two options since the curve loops back around on itself. And I know that with Schnorr signatures it is always even (2) for some reason. And I thought DH just did successive multiplications to come up to the same point g^a * g^b = g^b * g^a and I don't understand where either signature algorithm (DSA or Schnorr) comes in when determining the DH shared point (and does ECDSA even have a shared point?).

All I want is for the NIP to do the default thing that libraries do, rather than a custom thing that requires lower-level operations.

@fiatjaf
Copy link
Member

fiatjaf commented Aug 25, 2023

I don't even know how to query what RFC4648 is. Is that what the default base64.encode() function does in most languages?

@paulmillr
Copy link
Contributor

paulmillr commented Aug 25, 2023

Besides everything, we need to analyze if the scheme is CCA-secure.

I think the API we provide to users should be bulletproof and not allow to shooting oneself into foot. Perhaps build encryption into event creation process signAndEncryptEvent(rawEvent).

Padding

For padding, i've did a deep dive a few days ago and padme seems to be good:

https://en.wikipedia.org/wiki/Padding_(cryptography)#Traffic_analysis_and_protection_via_padding
https://crypto.stackexchange.com/questions/102711/stream-cipher-padding
https://github.com/jedisct1/go-padme-padding/blob/master/go.mod
https://github.com/jedisct1/rust-padme-padding/blob/master/src/lib.rs

JS version is shitty, need to rewrite it https://github.com/sh-dv/padme/blob/main/index.js

ECDH

Switch to sha256(parity + X) (33 byte input) since that is what all the libraries are doing and we are working around the library implementations currently (unless we insist on the bitcoin tradition of doing things our own way).

To reiterate:

  1. ANSI X9.63 hashes X coordinate. This is ECDH standard used in NIST, P256, etc
  2. secp256k1 bitcoin hashes (y-parity + X coordinate).

Both schemes do not consider Schnorr signatures which work over X coordinates without parity bytes like ECDSA. On the other hand, there seem to be a separate development for X-only-ECDH-over-secp256k1.

We need to decide if parity + X is better than X.

@staab
Copy link
Member Author

staab commented Aug 25, 2023

Are you guys able to make suggestions on the PR? This is beyond me.

@mikedilger
Copy link
Contributor

I don't even know how to query what RFC4648 is. Is that what the default base64.encode() function does in most languages?

Section 4 of https://www.rfc-editor.org/rfc/rfc4648 specifies the alphabet and pad character. Some variants use different symbols for character 62 and 63 to be "url safe" for example. Some variants don't use padding characters (they are implied by the length). But that RFC is the original canonical base64 and if library doesn't give you options to pass into their base64() function, it is very probably the right one.

@mikedilger
Copy link
Contributor

mikedilger commented Aug 25, 2023

Besides everything, we need to analyze if the scheme is CCA-secure.

The lack of randomness in the choice of key, and then using XOR, leads me to think it is not. If you take an unknown message, XOR it with your chosen plaintext, do you get back the secret stream? That would be a serious flaw.

EDIT: Scratch that. I momentarily forgot that XChaCha has this huge random nonce. Carry on.

@mikedilger
Copy link
Contributor

I like PADMÉ it is elegant. But it presumes that if you want to send a 3-character message, using 6 characters would be a huge waste of space, and that there are probably lots of 3-character messages so no padding is needed. But my "yes"/"no" example still stands. In our case, given the overhead of every event, I think every message should be padded out to a minimum of 32 bytes. So I'm proposing MAX(padme(len), 32)

@paulmillr
Copy link
Contributor

32 bytes min seems fine. Any objections? Maybe 24?

@paulmillr
Copy link
Contributor

NIP44 is not robust against chosen-ciphertext attack. See code that proves it.

  1. Choose arbitrary $m_0 \neq m_1$
  2. Receive challenge ciphertext $C = (c, \sigma)$
  3. Generate a new keypair, sign $c$ again, ask challenger to decrypt $C' = (c, \sigma')$
  4. $\sigma'$ is valid and CCA security is broken

We need to change everything here. I don't think poly1305 would help, since it can be forged.

@paulmillr
Copy link
Contributor

@staab I will continue this in #746, there are quite a few adjustments that need to be made. Can we close this one?

@mikedilger
Copy link
Contributor

mikedilger commented Aug 28, 2023

I don't get it. So what if Mallory signs encrypted content that she doesn't understand. Bob will get the message, verify the valid signature, and it will decrypt into garbage because it was encrypted with Alice's key, not Mallory's.

If alternatively Bob gets it thinking it is Alice, the signature will not be valid.

@paulmillr
Copy link
Contributor

it will decrypt into garbage

Not certain, it may decrypt into a valid text sometimes.

So what

It's definition of CCA. It could leak the key in extreme cases.

@staab
Copy link
Member Author

staab commented Aug 28, 2023

Closed in favor of #746

@staab staab closed this Aug 28, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

9 participants