LLMP compression #46

Closed
tokatoka opened this issue Apr 4, 2021 · 16 comments

Comments

@tokatoka
Member

tokatoka commented Apr 4, 2021

Hi, I am interested in LLMP brotli compression and have been reading the LLMP-related code lately.

Does LLMP brotli compression refer to compressing (and decompressing) the events that are passed between the broker and its clients? In other words, would we compress large members of Event, such as Event::NewTestcase->observers_buf or Event::Log->message, using the brotli crate?

@vanhauser-thc
Member

That is a very bad idea, sorry :)
The slowdown will be huge.
But try it out yourself.
I tried it for the afl++ network proxy, and there it only made sense for the map, as it is 64kb and compresses down to only 1-2 TCP packets. Also, brotli is fast at compression, but you have to concentrate on decompression: you only have one entity compressing the data, but multiple entities decompressing the same data. gzip is the fastest decompression algo AFAIK.

@domenukk
Member

domenukk commented Apr 4, 2021

> I tried it for the afl++ network proxy, and there it only made sense for the map, as it is 64kb and compresses down to only 1-2 TCP packets. Also, brotli is fast at compression, but you have to concentrate on decompression: you only have one entity compressing the data, but multiple entities decompressing the same data. gzip is the fastest decompression algo AFAIK.

Good to know; we haven't benchmarked anything yet.
In that case, @tokatoka, maybe try to find a (preferably no_std) gzip lib?

The basic concept was to add a "compressed" flag to the llmp header, then, in send_buf, directly compress the complete payload into a (large) buffer allocated by llmp_alloc, and then add an llmp_shrink method to shrink the llmp buffer to the needed space after compression.
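
A minimal sketch of that flow in Rust (a Vec<u8> stands in for the allocated llmp buffer, miniz_oxide stands in for whichever compression crate we end up picking, and the flag/helper names are made up for illustration, not the actual LibAFL API):

```rust
// Assumed dependency: miniz_oxide (pure-Rust DEFLATE).
use miniz_oxide::deflate::compress_to_vec;

/// Hypothetical flag bit in the llmp message header.
const LLMP_FLAG_COMPRESSED: u32 = 0x1;

/// Simplified stand-in for an llmp message header.
struct LlmpMsgHeader {
    tag: u32,
    flags: u32,
    buf_len: usize,
}

/// Sketch of `send_buf`: reserve space for the worst case (the uncompressed
/// size), compress the payload into it, then shrink to what was actually used.
fn send_buf_compressed(tag: u32, payload: &[u8]) -> (LlmpMsgHeader, Vec<u8>) {
    // llmp_alloc would reserve `payload.len()` bytes on the shared map here.
    let mut msg_buf = Vec::with_capacity(payload.len());

    // Compress the complete payload directly into the fresh allocation.
    msg_buf.extend_from_slice(&compress_to_vec(payload, 6));

    // llmp_shrink would return the unused tail of the allocation to the page.
    msg_buf.shrink_to_fit();

    let header = LlmpMsgHeader {
        tag,
        flags: LLMP_FLAG_COMPRESSED,
        buf_len: msg_buf.len(),
    };
    (header, msg_buf)
}
```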

@tokatoka
Member Author

tokatoka commented Apr 4, 2021

Okay, I get the idea.
It might be more complicated than I had previously thought,
and since the GSoC deadline is coming up, I think I'll work on writing the proposal first and come back to this later.
(But I just wanted to have a concrete idea for LLMP compression 😄)

@domenukk
Member

domenukk commented Apr 4, 2021

I mean, if you want an easy way out, you could allocate a compression buffer internally and copy twice. But it's fine to write your proposal instead :)

@tokatoka
Member Author

tokatoka commented Apr 8, 2021

I've had a look at the details of LlmpSender, which, along with LlmpReceiver, is a core struct for managing the sending and receiving of LlmpMsg.

So my idea is that we can compress the buffer directly after the call to copy_to_nonoverlapping
(https://github.com/AFLplusplus/LibAFL/blob/main/libafl/src/bolts/llmp.rs#L879)
and then call llmp_shrink to update LlmpMsg->buf_len, buf_len_padded, LlmpPage->size_total, and LlmpPage->size_used.

My concern is that, since alloc_next decides whether to allocate a new page based on the buf_len before compression, the message after compression may actually fit into the previous page, which would waste shared memory
(last_msg + size_before_compression + eop >= page_end while last_msg + size_after_compression + eop < page_end).

Would it be better to allocate a compression buffer internally, since then we know the size after compression beforehand?
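
For comparison, a sketch of that compress-first alternative (again with made-up names and a Vec<u8> standing in for the llmp allocation): the extra copy buys us the exact post-compression size before any page decision is made.

```rust
use miniz_oxide::deflate::compress_to_vec;

/// Compress into a temporary internal buffer first, so that alloc_next
/// sees the post-compression size and never allocates a page it doesn't need.
fn send_buf_compress_first(payload: &[u8]) -> Vec<u8> {
    // One extra copy compared to compressing in place...
    let compressed = compress_to_vec(payload, 6);

    // ...but alloc_next(compressed.len()) can now run with the real size;
    // a Vec stands in for the allocated llmp message buffer.
    let mut msg_buf = Vec::with_capacity(compressed.len());
    msg_buf.extend_from_slice(&compressed);
    msg_buf
}
```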

@tokatoka
Member Author

tokatoka commented Apr 8, 2021

As for a no_std gzip lib, I've found this one:
https://crates.io/crates/compression
I'll try testing it to see how much impact it has on speed.
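
As a rough sanity check of the approach (this sketch uses miniz_oxide's compress_to_vec/decompress_to_vec as a stand-in for whichever crate we choose), a round trip on a mostly-zero, 64 KiB, coverage-map-like buffer:

```rust
use std::time::Instant;

use miniz_oxide::{deflate::compress_to_vec, inflate::decompress_to_vec};

fn main() {
    // A 64 KiB buffer that is mostly zeros, similar to a coverage map.
    let mut map = vec![0u8; 64 * 1024];
    for i in (0..map.len()).step_by(997) {
        map[i] = 1;
    }

    let start = Instant::now();
    let compressed = compress_to_vec(&map, 6);
    let compress_time = start.elapsed();

    let start = Instant::now();
    let decompressed = decompress_to_vec(&compressed).expect("decompression failed");
    let decompress_time = start.elapsed();

    assert_eq!(map, decompressed);
    println!(
        "64 KiB -> {} bytes (compress: {:?}, decompress: {:?})",
        compressed.len(),
        compress_time,
        decompress_time
    );
}
```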

tokatoka changed the title from "LLMP brotli compression" to "LLMP compression" on Apr 8, 2021
@tokatoka
Member Author

tokatoka commented Apr 12, 2021

Hello,
I've written test code that compresses LLMP_TAG_EVENT_TO_BOTH messages using gzip, since they are the messages most frequently passed between the clients and the broker:
3577a8f

I ran libfuzzer_libpng, recorded the exec/sec once the fuzzer's corpus hit its 450th entry, and repeated this 10 times.
The results (libfuzzer_libpng exec/sec, averaged over 10 runs):
with compression: 111129.3 exec/sec
without compression: 111803.8 exec/sec
I feel that's okay.
(After all, in this setup the fuzzer only compresses and sends a message when it has found something interesting.)

I'll make a PR after cleaning up my code.

@vanhauser-thc
Member

Just ensure that compression is configurable and that a client can send both, i.e. that there is a flag.
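
A sketch of what that flag could look like on both ends (the flag value, threshold, and helper names are placeholders, not LibAFL's real message layout); a client can keep sending uncompressed messages, and the receiver dispatches on the flag:

```rust
use std::borrow::Cow;

use miniz_oxide::{deflate::compress_to_vec, inflate::decompress_to_vec};

/// Hypothetical flag bit carried in the llmp message header.
const LLMP_FLAG_COMPRESSED: u32 = 0x1;
/// Hypothetical cutoff: compressing tiny payloads is not worth the CPU time.
const COMPRESS_THRESHOLD: usize = 1024;

/// Sender side: compress only when it is likely to pay off, and return
/// the flags that describe what was actually sent.
fn maybe_compress(payload: &[u8]) -> (u32, Cow<'_, [u8]>) {
    if payload.len() >= COMPRESS_THRESHOLD {
        (LLMP_FLAG_COMPRESSED, Cow::Owned(compress_to_vec(payload, 6)))
    } else {
        (0, Cow::Borrowed(payload))
    }
}

/// Receiver side: since a client may send both kinds, check the flag.
fn maybe_decompress(flags: u32, buf: &[u8]) -> Vec<u8> {
    if flags & LLMP_FLAG_COMPRESSED != 0 {
        decompress_to_vec(buf).expect("corrupted compressed llmp payload")
    } else {
        buf.to_vec()
    }
}
```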

@tokatoka
Member Author

tokatoka commented Apr 24, 2021

After a long struggle with debugging, I've found out that my implementation for the previous experiment was wrong...
The problem is that I forgot to call self.forward_msg(msg)?; after line 1657
(3577a8f).

I fixed it and tested the performance again under the same conditions as the previous experiment.
The result is:
[New Testcase #1] clients: 2, corpus: 450, objectives: 0, executions: 3462450, exec/sec: 32199
An exec/sec of 32199 is not acceptable performance.

As @vanhauser-thc pointed out, it slows the fuzzer down by a huge degree (to less than 1/4 on my machine?).
Since I don't have any good idea how to solve this problem, I'm closing this issue.

@domenukk
Member

domenukk commented Apr 24, 2021

Are you sure you don't want to push the current code somewhere so we can try to figure out improvements? My best guess is that you do more copies than needed. It would be a nice feature to have for network-connected fuzzers.

tokatoka reopened this on Apr 24, 2021
@vanhauser-thc
Member

Even for network-connected fuzzers you would not want that slowdown. It takes longer to compress and send fewer packets than to not compress and send more packets. It only makes sense if you can really, really compress tightly (e.g. 50k of only 'A'). But queue entries will usually not compress very much - between 10-25%. That is a lot of time for not many fewer packets sent.

@domenukk
Member

We send over the observer maps, too. These will mostly be 0s, so they compress extremely well, but they take up quite some storage in the broker map and also take longer to send. At least having the option (if only for low-memory devices) is nice.

@tokatoka
Member Author

tokatoka commented Apr 24, 2021

My judgement might have been wrong.
The first experiment's code was based on the main branch,
while the second experiment's code was based on the dev branch.
They differ a lot.
(In particular, in the second experiment I forgot to suppress stderr from the client and forgot to call taskset, while in the first one test.sh did that.)

I'll test it again.

@vanhauser-thc
Member

Observer maps can make sense - if they are large enough, but then I would only compress these.
I think having the compression code in there is fine. Everything sent should have the option to be compressed, and the receivers need visibility into whether it is compressed or not.

@tokatoka
Member Author

I've quickly run it 5 times to test exec/sec again.
This time, I compared the llmp_comp branch and the dev branch, both using the libfuzzer_libpng/test.sh that I added in the llmp_comp branch.

| run | comp (exec/sec) | dev (exec/sec) |
|-----|-----------------|----------------|
| 1   | 74000           | 76674          |
| 2   | 78734           | 76517          |
| 3   | 74409           | 104775         |
| 4   | 87335           | 95011          |
| 5   | 87975           | 68416          |

I'll open a draft PR to see how I can improve this.

By the way, I occasionally observe weird behaviour in the exec/sec with the current dev branch.
This is the log of the 3rd run on the dev branch:

[Stats #1] clients: 2, corpus: 442, objectives: 0, executions: 5636649, exec/sec: 68072
[Stats #1] clients: 2, corpus: 442, objectives: 0, executions: 5907168, exec/sec: 7710
[New Testcase #1] clients: 2, corpus: 443, objectives: 0, executions: 6068081, exec/sec: 143810
Received new Testcase from 0
[New Testcase #1] clients: 2, corpus: 444, objectives: 0, executions: 6116711, exec/sec: 142323
Received new Testcase from 0
[Stats #1] clients: 2, corpus: 444, objectives: 0, executions: 6172345, exec/sec: 138428
[Stats #1] clients: 2, corpus: 444, objectives: 0, executions: 6432736, exec/sec: 9272
[New Testcase #1] clients: 2, corpus: 445, objectives: 0, executions: 6569100, exec/sec: 132251
Received new Testcase from 0
[Stats #1] clients: 2, corpus: 445, objectives: 0, executions: 6702784, exec/sec: 128118
[Stats #1] clients: 2, corpus: 445, objectives: 0, executions: 6980582, exec/sec: 22280
[New Testcase #1] clients: 2, corpus: 446, objectives: 0, executions: 7183414, exec/sec: 120157
Received new Testcase from 0
[Stats #1] clients: 2, corpus: 446, objectives: 0, executions: 7257414, exec/sec: 117654
[New Testcase #1] clients: 2, corpus: 447, objectives: 0, executions: 7465295, exec/sec: 114705
Received new Testcase from 0
[Stats #1] clients: 2, corpus: 447, objectives: 0, executions: 7535187, exec/sec: 110472
[New Testcase #1] clients: 2, corpus: 448, objectives: 0, executions: 7606061, exec/sec: 112365
Received new Testcase from 0
[New Testcase #1] clients: 2, corpus: 449, objectives: 0, executions: 7636340, exec/sec: 110688
Received new Testcase from 0
[New Testcase #1] clients: 2, corpus: 450, objectives: 0, executions: 7803725, exec/sec: 108000
Received new Testcase from 0
[Stats #1] clients: 2, corpus: 450, objectives: 0, executions: 7811218, exec/sec: 104775
[Stats #1] clients: 2, corpus: 450, objectives: 0, executions: 8089826, exec/sec: 1248
[New Testcase #1] clients: 2, corpus: 451, objectives: 0, executions: 8322694, exec/sec: 102295
Received new Testcase from 0

Something is causing the exec/sec to fluctuate heavily, from 68072 down to 7710 and then up to 143810 again...?

tokatoka mentioned this issue on Apr 24, 2021
@vanhauser-thc
Member

> | run | comp (exec/sec) | dev (exec/sec) |
> |-----|-----------------|----------------|
> | 1   | 74000           | 76674          |
> | 2   | 78734           | 76517          |
> | 3   | 74409           | 104775         |
> | 4   | 87335           | 95011          |
> | 5   | 87975           | 68416          |

From this fluctuation I think we can see that this kind of measurement does not really tell us anything.
IMHO it would be better to add code that starts a timer when an entry is created and stops it when the entry has been sent to llmp, and measure what the average time is with and without compression; the same for reading from llmp.
Then you have the exact cost of compression, plus you can collect how much RAM is actually saved.
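
Something like this sketch could wrap the spot where an event is serialized (and optionally compressed) before being handed to llmp; the closure and field names are placeholders, not existing LibAFL APIs:

```rust
use std::time::{Duration, Instant};

/// Accumulates the cost of preparing and sending events over llmp.
#[derive(Default)]
struct SendStats {
    total_time: Duration,
    messages: u64,
    bytes_in: u64,
    bytes_out: u64,
}

impl SendStats {
    /// Time one "serialize (and maybe compress), then send" step.
    /// The closure returns (raw payload size, bytes actually sent).
    fn record<F: FnOnce() -> (usize, usize)>(&mut self, send: F) {
        let start = Instant::now();
        let (raw_len, sent_len) = send(); // placeholder for the real llmp send path
        self.total_time += start.elapsed();
        self.messages += 1;
        self.bytes_in += raw_len as u64;
        self.bytes_out += sent_len as u64;
    }

    fn report(&self) {
        println!(
            "msgs: {}, avg send time: {:?}, bytes saved: {}",
            self.messages,
            self.total_time / self.messages.max(1) as u32,
            self.bytes_in.saturating_sub(self.bytes_out),
        );
    }
}
```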

domenukk added a commit that referenced this issue Apr 29, 2021
* add compression

* modify event/llmp.rs

* rename to LLMP_TAG_COMPRESS

* remove compression code from bolts/llmp.rs

* add compress.rs

* handle compress & decompress in GzipCompress struct, compress if the size is large enough

* add code for benchmark

* remove LLMP_TAG_COMPRESS, use a flag instead

* cargo fmt

* rm test.sh

* passes the test

* comment benchmarks code out

* add recv_buf_with_flag()

* add the llmp_compress feature

* add send_buf, do not compile compression code if it's not used

* fix warning

* merged dev

* add error handling code

* doc for compress.rs

* remove tag from decompress

* rename every flag to flags

* fix some clippy.sh errors

* simplify recv_buf

* delete benchmark printf code

* cargo fmt

* fix doc

Co-authored-by: Dominik Maier <domenukk@gmail.com>
khang06 pushed a commit to khang06/LibAFL that referenced this issue Oct 11, 2022
cube0x8 added a commit to cube0x8/LibAFL that referenced this issue Feb 5, 2024
cube0x8 added a commit to cube0x8/LibAFL that referenced this issue Feb 5, 2024
domenukk pushed a commit that referenced this issue Feb 15, 2024
* fixing qemu-libafl-bridge #46

* cargo fmt

* updated QEMU revision

---------

Co-authored-by: Andrea Fioraldi <andreafioraldi@gmail.com>
domenukk added a commit that referenced this issue Feb 16, 2024
domenukk added a commit that referenced this issue Feb 16, 2024