Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Type mismatches in exports and headers #29

Closed
arnetheduck opened this issue Sep 16, 2020 · 8 comments
Closed

Type mismatches in exports and headers #29

arnetheduck opened this issue Sep 16, 2020 · 8 comments

Comments

@arnetheduck
Copy link

The header file that describes the function uses a different type than the actual implementation - this may lead to miscompiles / optimization errors as there may be subtle differences between bool and limb_t.

bool blst_fp12_is_equal(const blst_fp12 *a, const blst_fp12 *b);

limb_t blst_p1_affine_is_equal(const POINTonE1_affine *a,

blst/bindings/blst_aux.h:17:6: warning: type of ‘blst_p1_affine_is_equal’ does not match original declaration [-Wlto-type-mismatch]
   17 | bool blst_p1_affine_is_equal(const blst_p1_affine *a, const blst_p1_affine *b);
      |      ^
blst/src/exports.c:241:8: note: return value type mismatch
  241 | limb_t blst_p1_affine_is_equal(const POINTonE1_affine *a,
      |        ^
blst/src/exports.c:241:8: note: type ‘limb_t’ should match type ‘_Bool’
blst/src/exports.c:241:8: note: ‘blst_p1_affine_is_equal’ was previously declared here
blst/src/exports.c:241:8: note: code may be misoptimized unless ‘-fno-strict-aliasing’ is used
@dot-asm
Copy link
Collaborator

dot-asm commented Sep 25, 2020

this may lead to miscompiles / optimization errors

How? Well, given that you compile blst C source separately, as you're expected to (see below). Either way, it should be noted that formally speaking C standard is incomplete. Most notably it doesn't prescribe anything about calling convention. In other words one can't consider it in total isolation, but complement the context with something else, most notably platform-specific ABI specifications. Now, with this in mind, the assertion is that as long as the limb_t type doesn't exceed "natural" register size, and as long as its value is 1 or 0, it can be safely reinterpreted as it's currently done. On supported platforms. That's the keyword, "on supported platforms," or more specifically for relevant ABIs at hand.

Moving on. But there is something else that C standard is silent about... About how booleans are to be ... "instantiated" if you wish. More specifically it doesn't say that code evaluating (bool)x, where x is non-boolean, should be constant-time. But we have to have constant-time code! This is a non-negotiable requirement! Nor can it be left to compiler's mercy... Two possible solutions: a) implement everything in assembly; b) require separate compilation and utilize guarantees implied by relevant ABIs. Life is a trade-off, b) it is. This is why blst has to internally return non-boolean type, but it does return boolean value.

Now, you can say "but why doesn't blst.h return limb_t instead of bool?" Well, because there is another level of abstraction. Note that in blst.h limb_t is unconditionally 64-bit type, while internally it can be 32 bits. In other words, limb_t is effectively an opaque type to application. You can ask why aren't the types incomplete then? Well, because we didn't want to manage memory in the library. Why? Because we didn't want to limit application language to C...

Now, you can say "but why doesn't blst.h return say size_t then?" Yeah, but it would make interface to foreign languages less elegant. Not to mention that it might trigger casts, which, again, are not guaranteed to be constant-time...

Bottom line. It's not an oversight, and there is rationale behind it. And compile separately.

@dot-asm
Copy link
Collaborator

dot-asm commented Sep 25, 2020

Two possible solutions:

Well, formally speaking there is a 3rd option, implement the constant-time cast from non-boolean to boolean in assembly... It feels like an overdo, but if we come to the conclusion that separate compilation is not sufficient guarantee after all, it would be the way to go...

@dot-asm
Copy link
Collaborator

dot-asm commented Sep 25, 2020

Oh! I notice that you ought to come from nim, which means that you rather not compile it separately... Well, bool is not really a native C type, and it's actually defined in blst.h. So one can actually discuss custom definition. I mean if there is a macro defined that says "I come (with peace;-) from nim," we can add an additional #if to harmonize it for compile-together scenario...

@arnetheduck
Copy link
Author

Actually, not so much nim, as LTO :) We're considering disabling it for libraries like BLST though that are already well-optimized as it might interfere with, like you mention, constant time.

I guess the difference that can happen is that some compiler optimize more aggressively around bool, assuming that it takes a value of 0 or 1 exclusively which might affect code paths down the road.

Aliasing should be fine I guess since it's a return value that doesn't have an address - though in theory a function pointer could get messed up.

@dot-asm
Copy link
Collaborator

dot-asm commented Sep 25, 2020

assuming that it takes a value of 0 or 1 exclusively

And as already implied, it would be 100% correct assumption. Because all the subroutines that are declared bool in blst.h are explicitly coded to return 1 or 0 [in constant time]. They do not return a non-0 or 0.

As for LTO. I customarily argue against it for any cryptographic algorithm library. Because there is direct contradiction between the goals. I mean security countermeasures are most likely to make code slower, while optimization's goal is to produce fastest code possible. And there is no guarantee that having the complete picture it won't find a way to bypass the countermeasure.

@dot-asm
Copy link
Collaborator

dot-asm commented Nov 2, 2020

On a tangential note. More specifically on bool type in constant-time context. I've run into an article that pointed out that llvm has select IR opcode and that the corresponding machine code tends to be generated as a conditional branch. And that the select opcode tends to be emitted in response to (a & mask) | (b & ~mask) pattern. So I wondered... What if the loop body in vec_select looked as

rp[i] = (ap[i] & mask) | (bp[i] & ~mask);

where mask is calculated as 0 - sel_a. No, no select in sight... Well, not really surprising, as I might as well worked on individual bits. But then I've changed type of sel_a argument to _Bool and there it was, select:

  %14 = load i64, i64* %13, align 8, !tbaa !2
  %16 = load i64, i64* %15, align 8, !tbaa !2
  %17 = select i1 %4, i64 %14, i64 %16
  store i64 %17, i64* %18, align 8, !tbaa !2

I've omitted some instructions from the above transcript, but left the most important ones, two loads. This is what we want, right? I mean for constant-time-ness we want two loads, some calculations, and store. So it looks proper. But what about the final machine code? Well, it didn't have a conditional branch, not in this case, but the compiler transformed the IR to a selection between pointers instead of values and ended up using one load, thus making code non-constant-time:-(

So a non-bool type appears more suitable for the purpose, but does it actually guarantee the constant-time-ness? As mentioned earlier, even though blst returns a non-bool type, it's Boolean by value, and formally speaking, if given the opportunity, compiler could figure it out and apply same optimizations as to bool type. Or is there some provision in the standard that would prevent the compiler?

Just in case for reference, the loop in vec_select subroutine is effectively dead, because all possible input lengths are covered by assembly vec_select_N subroutines. Not that they, vec_select_N, were explicitly motivated by this bool thing, but they effectively counter all imaginable problems.

@arnetheduck
Copy link
Author

Or is there some provision in the standard that would prevent the compiler?

I'd think often ends up being more of a practical barrier than a standard issue due to how c files typically are compiled in separate translation units - as long as compilation, optimization and translation to machine instructions is done in isolation, optimization information such as possible value constraints are not passed between the TU's - with LTO however, that changes in two ways: value constraints that stem from flow analysis (and not type information) can now be applied in interprocedural analysis and ABI constraints can be relaxed as symbols no longer need to conform to an ABI description which incidentally also allows constant specialization and more aggressive inlining to take place.

Conversely, when returning bool, the compiler adds instructions to clamp the value, if along a code path it cannot deterministically say that the value will be 1 or 0, it will add instructions accordingly to clamp it.

FWIW, we've switched off LTO for now for our crypto dependencies, including blst, but it's a bit of a blunt way to address the problem - what I'd ideally like to do is to extend our compiler to annotate the llvm IR it generates with specific optimization constraints.

@dot-asm
Copy link
Collaborator

dot-asm commented Nov 27, 2020

Couple of closing comments.

As it turns out the above mentioned (a & mask) | (b & ~mask) is not the only recognized candidate pattern for select opcode. I've tried to disincentivize clang (there are couple commits), and confirmed that it stopped generating select this time, but it's not an absolute guarantee for all the future. Again, only non-assembly platforms can suffer from this.

Secondly, as time progresses, the initial assertion about supported ABIs is no longer rock-solid. Because wasm doesn't have as well-defined ABI as more traditional platforms. As result, it's int that gets reinterpreted now. Yes, reinterpretation is still going on, for same reason, constant-time-ness [on C side], but it should be less controversial.

@dot-asm dot-asm closed this as completed Nov 27, 2020
zah added a commit to status-im/nim-blscurve that referenced this issue May 9, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants