Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Rand below should take a NonZero parameter #2519

Merged
merged 33 commits into from
Oct 4, 2024
Merged

Rand below should take a NonZero parameter #2519

merged 33 commits into from
Oct 4, 2024

Conversation

domenukk
Copy link
Member

@domenukk domenukk commented Sep 9, 2024

No description provided.

@domenukk
Copy link
Member Author

It looks more ugly but it makes API users more correct

@tokatoka
Copy link
Member

tokatoka commented Sep 10, 2024

this will have considerably bad impact on performance

@tokatoka
Copy link
Member

tokatoka commented Sep 10, 2024

mutators code is a very hot path, it's better not additional overheads..
also random_corpus_id affects splicing

@domenukk
Copy link
Member Author

this will have considerably bad impact on performance

No it's a zero cost abstraction and the compiler can even use the extra bit for other stuff

@domenukk
Copy link
Member Author

mutators code is a very hot path, it's better not additional overheads.. also random_corpus_id affects splicing

The behavior doesn't change. We can also unsafe unwrap if we feel confident about random_corpus_id...

@tokatoka
Copy link
Member

tokatoka commented Sep 10, 2024

The behavior doesn't change.

The behavor doesn't change, I know. But the assembly code changes.

I'm talking about the code that has expect() or Some(). It will add additional checks to check it is NULL or not, and that overhead is killing mutators.

@tokatoka
Copy link
Member

tokatoka commented Sep 10, 2024

I'm saying this because something similar happened before.
#1138

semantically same code, but the more "Rusty"-code had caused significant slowdown because it added useless instructions wasting CPU cycles.

@tokatoka
Copy link
Member

tokatoka commented Sep 10, 2024

also why can't below() take zero to begin with?
I don't see the problem with the current state

@domenukk
Copy link
Member Author

Right now below doesn't take 0, but it's implicit rather than explicit (with an debug_assert!).
What should below of 0 even be? MAX_INT?

@domenukk
Copy link
Member Author

FWIW the current behavior caused troubles in TNO-S3/WuppieFuzz#5 (comment)

@domenukk
Copy link
Member Author

Other option would be to add an additional below_including function that doesn't take NonZero

@tokatoka
Copy link
Member

Other option would be to add an additional below_including function that doesn't take NonZero

This is better. and all the hotpaths in libafl should use that instead.

@domenukk
Copy link
Member Author

Why do you assume NonZero is slower? It can likely be optimized better than the normal version

@tokatoka
Copy link
Member

tokatoka commented Sep 10, 2024

It can likely be optimized better than the normal version

I don't trust the optimization by Rust compiler. As I said, from what I disassembled last time in #1138, it doesn't optimize anything, but just puts useless bound checks just to slow down the stuff.
(Last time, the problem was using std::vec::splice(), it is performing far worse than doing unsafe copy-ing of the buffer. because it adds addtitional bounds checks.)

but if you make this compiles we can always count the instructions or the time spent during the mutations to see how they compare

@domenukk
Copy link
Member Author

image It does optimize this pretty well.

Another alternative is to specify that below of 0 is 0.
But that may lead to weird bugs for users (like panics when you use it for items.len() and then access items)

@tokatoka
Copy link
Member

It does optimize this pretty well.

in your example yes. but if you put if Some(x) == .. or expect() then there's no way NonZero is better

@domenukk
Copy link
Member Author

It does optimize this pretty well.

in your example yes. but if you put if Some(x) == .. or expect() then there's no way NonZero is better

Most cases that have if Some(x) == would otherwise need a length check or 0 check just above, so it really shouldn't matter

@tokatoka
Copy link
Member

I'll show you an example.
#2519 (comment)
This example doesn't have the overhead only because the function square() is actually taking the NonZeroUsize as the argument.
Also it is not similar to the use case of the changes in this PR.

In libafl, you are creating NonZero from usize, be it state.max_size() or input.size() or whatever, it ususally is usize
such as this line https://github.com/AFLplusplus/LibAFL/pull/2519/files#diff-955b4b9ddc30b6a7dcaaa86168ef04730876cbe9714fa570edc3118c740076e1R48 of this PR.

What happens if you create nonzero from usize is
With Nonzero:
https://godbolt.org/z/MTfGE1xsK

Without Nonzero:
https://godbolt.org/z/v59TWvMq3

        cmp     qword ptr [rsp + 272], 0
        sete    al
        test    al, 1
        jne     .LBB4_1
        jmp     .LBB4_2
.LBB4_1:
        lea     rdi, [rip + .L__unnamed_1]
        mov     rax, qword ptr [rip + core::option::unwrap_failed::hba6b08832f9ce30b@GOTPCREL]
        call    rax
.LBB4_2:
        mov     rax, qword ptr [rsp + 272]
        mov     qword ptr [rsp + 16], rax
        lea     rax, [rsp + 16]

THe difference is here, like this, It adds these branches. and I'm saying that even these several lines of additional lines is not good for the mutator.
Of course we should change if it is wrong, but we can simply avoid this overhead by naming below() to below_including().
so we should just keep the current behavior for the lib and swap it with below_including for avoiding users confusion

@domenukk
Copy link
Member Author

You need to optimize in godbolt else it's not useful.
In this case you have the +1 so the compiler can proove unwrap cannot fail. No overhead in this case.
Otherwise it could be a bug, and the unwrap would be the correct way.
Worst case we can always unwrap_unsafe if we want to make it go brrrr

@tokatoka
Copy link
Member

tokatoka commented Sep 13, 2024

NonZero with opt-level=3:
https://godbolt.org/z/axf9nvx4r

without NonZero with opt-level=3:
https://godbolt.org/z/fTeWzsfbz

Still the one without nonzero is better (in terms of the instruction count inside function w)
I mean you can't prove usize to NonZero in general..

@domenukk
Copy link
Member Author

We can always go back to the unsafe variant if we think we don't need the checks https://godbolt.org/z/rY5ebE9ve

@rmalmain
Copy link
Collaborator

Other option would be to add an additional below_including function that doesn't take NonZero

something like this? https://github.com/AFLplusplus/LibAFL/pull/2496/files#diff-8b8392ab89ae8517f8910eb579121b57031ada7370c32962b75d0cb62ab841a7R133

@tokatoka
Copy link
Member

tokatoka commented Sep 13, 2024

but your point is that "NonZero version could be optimized" which is not the case. (unless unwrap()_unsafe() is used)
i mean i still don't understand why nonzero is needed in most cases, and sure, we can change below() to take NonZero, but most of the code inside in LibAFL should use below_inclusive().

@rmalmain
Copy link
Collaborator

rmalmain commented Sep 13, 2024

otherwise, why not stick to debug_assert!?
it avoids the overhead in notation and potentially in terms of computation, no?

@domenukk
Copy link
Member Author

domenukk commented Sep 13, 2024

otherwise, why not stick to debug_assert!? it avoids the overhead in notation and potentially in terms of computation, no?

That's what we have now and it breaks at runtime instead of compile time. It's objectively worse.

Either we should explicitly allow 0 or explicitly forbid it. NonZero is the compile time way to explicitly forbid it.
If we allow it we should remove the asserts and manually check all users for bugs down the road

@rmalmain
Copy link
Collaborator

the other solution could be to have a wrapping type for NonZero which would hide the verbosity?
Maybe you could make this new type implement Deref and internally either unwrap or unwrap_unchecked depending on the build kind?

@tokatoka
Copy link
Member

Either we should explicitly allow 0 or explicitly forbid it. NonZero is the compile time way to explicitly forbid it.
If we allow it we should remove the asserts and manually check all users for bugs down the road

So I agree with this

Other option would be to add an additional below_including function that doesn't take NonZero

Expose below() as public function for user to use. for internal ones use below_inclusive()

@domenukk
Copy link
Member Author

domenukk commented Sep 13, 2024

We can just NonZero and unwrap_unchecked internally, it's the same in godbolt

@domenukk domenukk marked this pull request as ready for review September 26, 2024 12:34
@domenukk domenukk requested a review from tokatoka September 26, 2024 12:34
@domenukk
Copy link
Member Author

There are not too many constants around that need to be NonZero-ed.
I've used unsafe unwrap_unchecked() wherever possible, so there is no performance overhead in mutators.

@tokatoka
Copy link
Member

i need to add debug log to check it like last time

@@ -532,6 +533,11 @@ where
let rand_seed = state.rand_mut().next();
state.add_metadata::<MOpt>(MOpt::new(mutations.len(), swarm_num, rand_seed)?);
}
let Some(max_stack_pow) = NonZero::new(max_stack_pow) else {
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm actually not 100% sure if max_stack_pow of 0 could sometimes make sense(?)
I assumed that not.

@tokatoka
Copy link
Member

i'll measure on weekend

@domenukk
Copy link
Member Author

@tokatoka any news? I'd just merge this tbh, there shouldn't be any overhead

@domenukk
Copy link
Member Author

Didn't we even have CI that tests the performance diff?

@tokatoka
Copy link
Member

No, don't yet

@domenukk
Copy link
Member Author

Why?

@domenukk
Copy link
Member Author

domenukk commented Sep 30, 2024

It's the right thing to do ¯\_(ツ)_/¯

@domenukk
Copy link
Member Author

domenukk commented Oct 2, 2024

Poke @tokatoka

@domenukk domenukk merged commit 4fc136c into main Oct 4, 2024
100 checks passed
@domenukk domenukk deleted the nonzero branch October 4, 2024 00:16
rmalmain pushed a commit that referenced this pull request Oct 7, 2024
* Rand below should take a NonZero parameter

* More

* more

* More

* fix build

* bit of clippy

* more clippy

* more clippy

* More clippy

* More more

* more nonzero

* fix multipart

* Cleanup, more unsafe

* fix

* fix unicode

* clippy, fmt

* more

* More safer and more better

* MaxStackPow

* fix merge fails

* make random_slize_size faster

* fix

* more

* fix
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants