Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Cheap reader for bytes slice #2261

Merged
merged 35 commits into from
Jul 30, 2024
Merged

Cheap reader for bytes slice #2261

merged 35 commits into from
Jul 30, 2024

Conversation

rmalmain
Copy link
Collaborator

the idea is to safely read over an input by chunks in a lightweight way (see the test for a simple example), by reusing BytesSubInput (that has been renamed to BytesSubInputMut since it contains a mutable reference).
it's still in draft state, i think there is room for improvement.

i tried to auto implement HasTargetBytes for all implementors of HasMutatorBytes but not sure if it is the right thing to do: could it make sense to have mutator bytes that are different from target bytes?

also, end_index now makes sure it does not go beyond the end of the subslice with a min, better this or should we return an error instead? (we could do something similar for start_index)

@rmalmain rmalmain marked this pull request as draft May 29, 2024 16:19
@rmalmain rmalmain requested a review from domenukk May 29, 2024 16:24
rmalmain added 2 commits May 30, 2024 11:48
* Added a checked version of the read to slice.
@rmalmain rmalmain marked this pull request as ready for review May 30, 2024 09:56
domenukk
domenukk previously approved these changes Jun 3, 2024
libafl/src/inputs/mod.rs Outdated Show resolved Hide resolved
/// Target bytes wrapper keeping track of the current read position.
/// Convenient wrapper when bytes must be split it in multiple subinputs.
#[derive(Debug)]
pub struct BytesReader<'a, I>
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I feel like std library functions for this use case exist?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Look at cursor, maybe?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, my first idea was actually to use Cursor, but it does not really offer an interface to extract a slice without any copy while moving the cursor forward, hence this implem. It has some good side effects, like the fine control on what to do when reaching the end of the reader and reuse OwnedSlice, which keeps things consistent with libafl api.

@domenukk
Copy link
Member

domenukk commented Jun 3, 2024

could it make sense to have mutator bytes that are different from target bytes?

I'd say a SubInput shouldn't really be sent to the target (?) for example

@rmalmain
Copy link
Collaborator Author

rmalmain commented Jun 3, 2024

I'd say a SubInput shouldn't really be sent to the target (?) for example

I agree on this, the reader was designed to be created and used by the target internally.

better doc.
/// - `Ok(Slice)` if the returned slice has `limit` bytes.
/// - `Err(Partial(slice))` if the returned slice has strictly less than `limit` bytes and is not empty.
/// - `Err(Empty)` if the reader was already at the end or `limit` equals zero.
pub fn read_to_slice(
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

next_sub_input_mut() or something like that?

}
}

impl<'a, I> BytesReader<'a, I>
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

InputSplitter or something? It's not really bytes specific, is it?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For now it is bytes-specific (even if it could be generic). It's because it creates a BytesSubInput internally, which requires the HasTargetBytes bound to get the OwnedSlice instance.
We can make it generic but not sure it would be so useful in practice?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Still it's not really Reading Bytes, is it? It's spitting byte Inputs, then

Copy link
Collaborator Author

@rmalmain rmalmain Jun 8, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

i'd say it's more like reading by chunk, close to what Read proposes: https://doc.rust-lang.org/std/io/trait.Read.html
this is why i called it like this initially. The difference is that this object is 0-copy (Read forces to copy in a buffer afaik), so normally this is more efficient.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

also, we could implement Seek: https://doc.rust-lang.org/std/io/trait.Seek.html
didn't do it since i don't really have the use, but it could be useful in some cases.


impl<'a, I> BytesReader<'a, I>
where
I: HasTargetBytes + HasLen,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why do we need TargetBytes here?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

To get the OwnedSlice from the input directly.
Another possibility is to take an OwnedSlice<T> as input to make this generic. Is it worth doing?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Shouldn't this operate on HasMutatorBytes though?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also, I wonder if HasTargetBytes shouldn't actually return a "normal" slice, instead of an OwnedSlice?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's more restricting than necessary; HasTargetBytes is enough.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

They happen to be the same for BytesInput but they are different for more abstract input types

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

("the thing we send to the target" vs "the thing we mutate")

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Want to change it / any reason not to?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

sorry, didn't have time to check this week.
i was intending to use this in libafl_qemu, where we receive TargetBytes and wrap it whis this object to simplify the process of putting inputs in registers for example. doesn't it sound like TargetBytes-related? or maybe we need this as well for MutatorBytes?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah.. The use case it not really mutator related then, right?
Might be somewhat independent (but maybe we can reuse some shared types then?)

@domenukk
Copy link
Member

What's the status here?
I think you don't want to mix the concept of MutatorBytes (bytes the mutator wants to mutate) and TargetBytes (bytes that the target will eat up) here.

@@ -96,13 +96,6 @@ impl HasMutatorBytes for BytesInput {
}
}

impl HasTargetBytes for BytesInput {
#[inline]
fn target_bytes(&self) -> OwnedSlice<u8> {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Shouldn't be a blanket impl, keep this one (IMHO)

@domenukk
Copy link
Member

domenukk commented Jul 2, 2024

@rmalmain ping

@rmalmain
Copy link
Collaborator Author

rmalmain commented Jul 3, 2024

@domenukk pong. i created an intermediate trait that must be implemented by HasMutatorBytes and HasTargetBytes. that way, we can use BytesReader for both. what do you think?

@domenukk
Copy link
Member

domenukk commented Jul 3, 2024

@domenukk pong. i created an intermediate trait that must be implemented by HasMutatorBytes and HasTargetBytes. that way, we can use BytesReader for both. what do you think?

This doesn't really make sense imho, the use case for mutator bytes and target bytes is quite different.
Instead, you should probably make the BytesReader be instantiated using either a u8 slice, or a OwnedSlice, depending on what you need

@domenukk
Copy link
Member

it feels a bit weird to have BytesSubInputMut that has an input as parent and BytesSlice with a byte slice as parent. shouldn't we unify this with the same type as parent @domenukk (and thus the struct naming)?

I mean we could have a BytesSliceMut and an extra BytesSubInput, they have different use case IMHO. BytesSubInput specificially exposes HasMutatorBytes. It could maybe wrap a BytesSliceMut then

libafl/src/inputs/mod.rs Outdated Show resolved Hide resolved
@rmalmain rmalmain changed the title Reader for inputs implementing HasTargetBytes Cheap reader for bytes slice Jul 17, 2024
@rmalmain
Copy link
Collaborator Author

i think it's good now, anything else @domenukk ?

///
/// An immutable version is available: [`BytesSlice`].
#[derive(Debug)]
pub struct BytesSliceMut<'a> {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

BytesMutSlice probably, to be consistent

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I mean, maybe we should change OwnedMutSlice to OwnedSliceMut? I think the name comes from MutPtr from the stdlib but in that case it could make sense to rename, anyway?
For now we should keep it consistent, though

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Aah we also have AsSliceMut so the OwnedMutSlice is actually the one we should change, let's do it for 0.14.

@domenukk
Copy link
Member

Some smaller questions remain, but all in all looks good now!

@domenukk
Copy link
Member

Ooops I broke it in 1371646 ...
@rmalmain take a look at what I tried to do (mainly move the reader etc into bolts and make it more generic), if yo udon't like it undo and merge.

@rmalmain
Copy link
Collaborator Author

i was actually thinking of moving it to bolts, things like BytesSlice look pretty generic already

@rmalmain
Copy link
Collaborator Author

btw i think you forgot a file, no?
libafl_bolts/subrange.rs

@domenukk domenukk merged commit c319fe2 into main Jul 30, 2024
21 checks passed
@domenukk domenukk deleted the bytes_reader branch July 30, 2024 11:46
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants