Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Switch to our own memcmp function #825

Closed

Conversation

real-or-random
Copy link
Contributor

@real-or-random real-or-random commented Sep 24, 2020

Fixes #823.

Fwiw, I considered to make this a constant-time function but didn't do it in the end. Most constant-time memcmp replacements are actually only inequality checks but we need a real memcmp which outputs which one of the arguments is larger.

If you look at crypto libraries, there are a few candidates but I found this overkill for our use case
https://github.com/jedisct1/libsodium/blob/a8fa837aacd310bc08fa72705a738fafc2513125/src/libsodium/sodium/utils.c#L239
https://cvsweb.openbsd.org/cgi-bin/cvsweb/src/lib/libc/string/timingsafe_memcmp.c?rev=1.1&content-type=text/x-cvsweb-markup

If we need a constant-time function in the future, it'll probably be only for equality/inequality, and then we can add the standard "OR of byte-wise XOR" implementation.

@@ -216,6 +216,21 @@ static SECP256K1_INLINE void memczero(void *s, size_t len, int flag) {
}
}

/** Semantics like memcmp. Variable-time.
We use this to avoid possible compiler bugs with memcmp, e.g., a bug in certain versions of GCC 9 and 10. */
int secp256k1_memcmp_var(const void *s1, const void *s2, size_t n) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Make static please.

Copy link
Contributor

@roconnor-blockstream roconnor-blockstream Oct 5, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Are we required to follow the const void * signature? Should we make it const unsigned char *?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It seems that would need more code changes, as callers in some places expect to be able to pass a pointer to other objects.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh I see. This void thing is really subversive.

@@ -216,6 +216,21 @@ static SECP256K1_INLINE void memczero(void *s, size_t len, int flag) {
}
}

/** Semantics like memcmp. Variable-time.
We use this to avoid possible compiler bugs with memcmp, e.g., a bug in certain versions of GCC 9 and 10. */
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Link to the bug? So that it's easy to re-assess in the future whether this is still needed.

@roconnor-blockstream
Copy link
Contributor

roconnor-blockstream commented Oct 5, 2020

Wasn't the plan to get rid of string.h entirely? Thus needed to replace memset as well?

@real-or-random
Copy link
Contributor Author

Wasn't the plan to get rid of string.h entirely? Thus needed to replace memset as well?

@gmaxwell said in #823:

(perhaps even memcpy too, and eliminate the string.h usage)

But this was not discussed further, so I sticked to only memcmp for this PR.

I think I'd like to see some benchmarks before we decide to get rid of more functions. memcpy is used in the hashing code, and memset will be used a lot in cleaning up secret after #636 (which I have been procrastinating on for too long, I know...). And this can make a difference, see e.g., #636 (comment).

@roconnor-blockstream
Copy link
Contributor

I don't understand what this PR is trying to achieve. You are replacing occurrences of memcmp that are not believed to be miscompiled by GCC 9 & 10.

@real-or-random
Copy link
Contributor Author

I don't understand what this PR is trying to achieve. You are replacing occurrences of memcmp that are not believed to be miscompiled by GCC 9 & 10.

I'm not the best person to answer this because I tend to agree, see #823 (comment). But since this PR does not make our code worse (no performance penalty, small amount of added code), I'm ok with working on it.

@roconnor-blockstream
Copy link
Contributor

I think the worry is about introducing code that unwittingly encounters this bug in the future (#823 (comment)). I thought maybe removing string.h might help achieving that because without the header file, it is difficult to use memcmp (though how to stop the reintroduction of the header?). Otherwise, it seems like we need a linter step or something to disallow memcmp added to this PR.

@real-or-random
Copy link
Contributor Author

real-or-random commented Oct 5, 2020

Well we could add some linter script that uses grep to check whether all instances of the memcmp string are actually secp256k1_memcmp. But that's ... weird. It doesn't understands comments, other identifiers, etc. Real linting is only possible in the compiler. It would be interesting to look into tools like clang-query which can give you a lot of insights that are normally internal to the compiler, and I'd love to use this as a linter for some other stuff, e.g., integer divisions. But this sounds like it's not in the scope of this PR. :P

@real-or-random
Copy link
Contributor Author

real-or-random commented Oct 5, 2020

It would be interesting to look into tools like clang-query which can give you a lot of insights that are normally internal to the compiler, and I'd love to use this as a linter for some other stuff, e.g., integer divisions. But this sounds like it's not in the scope of this PR. :P

If someone wants to give it a try, this looks like a nice start: https://stackoverflow.com/a/31740134 maybe it's indeed a one-liner.

edit: yes:

clang-query src/secp256k1.c -c 'match callExpr(callee(functionDecl(hasName("memcmp"))))'

@sipa
Copy link
Contributor

sipa commented Oct 5, 2020

You could also define memcmp to be a macro that just turns into an #error "Use secp256k1_memcmp_var instead please".

@real-or-random
Copy link
Contributor Author

You could also define memcmp to be a macro that just turns into an #error "Use secp256k1_memcmp_var instead please".

Indeed. I'm not sure if I like this because that's UB according to the standard. Though I don't think any sane compiler in this world will have any issues with this.

@roconnor-blockstream
Copy link
Contributor

Technically the existence of a function named memczero is also UB, so I don't know by what measure we characterize some forms of UB over others.

@real-or-random
Copy link
Contributor Author

Technically the existence of a function named memczero is also UB, so I don't know by what measure we characterize some forms of UB over others.

If I had known that this is UB when it we introduced memczero, I would have pointed this out.

@gmaxwell
Copy link
Contributor

gmaxwell commented Oct 6, 2020

Damnit, I forgot that "mem[lowercase]" was reserved, fix it-- don't just gripe.

Some kinds of "technically not proper" may be excusable when its the only way to accomplish an intentional end and also extremely unlikely to cause problems -- I'd put e.g. using a macro that is never used past definition to nuke a function that shouldn't be called into that bin -- but using a reserved pattern to name a function that we do call I wouldn't. That memczero could just as easily be secp256k1_memczero even though it has no external linkage. It's also extremely likely to cause problems, but its an entirely unforced error. If some day a sufficiently smart linter or compiler flag starts complaining about something like it would have to get fixed anyways to satisfy the linter.

@roconnor-blockstream
Copy link
Contributor

roconnor-blockstream commented Oct 6, 2020

I didn't mean to sound like I was griping. I didn't know about the "mem[lowercase]" reserved function name thing until today. ... My reaction upon reading about it (and function names beginning with "is[lowercase]" and "to[lowercase]") was C's definition of UB was simply reaching absurd proportions and that it wasn't worth fixing. From your reaction, it would seem that it isn't as absurd as I thought.

I've filed #829 to begin with.

@sipa
Copy link
Contributor

sipa commented Oct 11, 2020

PR that merges this (with some comments addressed) with #822 and #826: #830

@real-or-random
Copy link
Contributor Author

Closing in favor of #830

sipa added a commit that referenced this pull request Oct 14, 2020
c582aba Consistency improvements to the comments (Pieter Wuille)
63c6b71 Reorder comments/function around scalar_split_lambda (Pieter Wuille)
2edc514 WNAF of lambda_split output has max size 129 (Pieter Wuille)
4232e5b Rip out non-endomorphism code (Pieter Wuille)
ebad841 Check correctness of lambda split without -DVERIFY (Gregory Maxwell)
fe7fc1f Make lambda constant accessible (Pieter Wuille)
9d2f2b4 Add tests to exercise lambda split near bounds (Pieter Wuille)
9aca2f7 Add secp256k1_split_lambda_verify (Russell O'Connor)
acab934 Detailed comments for secp256k1_scalar_split_lambda (Russell O'Connor)
76ed922 Increase precision of g1 and g2 (Russell O'Connor)
6173839 Switch to our own memcmp function (Tim Ruffing)

Pull request description:

  This is a rebased/combined version of the following pull requests/commits with minor changes:
  * #825 Switch to our own memcmp function
    * Modification: `secp256k1_memcmp_var` is marked static inline
    * Modification: also replace `memcmp` with `secp256k1_memcmp_var` in exhaustive tests
    * Modification: add reference to GCC bug 95189
  * #822 Increase precision of g1 and g2
    * Modification: use the new `secp256k1_memcmp_var` function instead of `memcmp` (see #822 (comment))
    * Modification: drop the " Allow secp256k1_split_lambda_verify to pass even in the presence of GCC bug https://gcc.gnu.org/bugzilla/show_bug.cgi?id=95189." commit, as it's dealt with using `secp256k1_memcmp_var`.
    * Modification: rename secp256k1_gej_mul_lambda -> secp256k1_ge_mul_lambda
  * A new commit that moves the `lambda` constant out of `secp256k1_scalar_split_lambda` and (`_verify`).
  * The test commit suggested here: #822 (comment)
    * Modification: use the new accessible `secp256k1_const_lambda` instead of duplicating it.
  * #826 Rip out non-endomorphism code
  * A new commit that reduces the size of the WNAF output to 129, as we now have proof that the split output is always 128 bits or less.
  * A new commit to more consistently use input:`k`, integer outputs:`k1`,`k2`, modulo n outputs:`r1`,`r2`

ACKs for top commit:
  real-or-random:
    ACK c582aba code inspection, some tests, verified the new g1/g2 constants
  jonasnick:
    ACK c582aba didn't verify the proof

Tree-SHA512: 323a3ee3884b7ac4fa85c8e7b785111b5c0638d718bc1c805a38963c87411e81a746c98e9a42a3e2197ab34a874544de5cc51326955d1c4d0ea45afd418e819f
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

memcmp may be miscompiled by GCC
4 participants