-
Notifications
You must be signed in to change notification settings - Fork 158
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Parse Strtab only once on Creation #275
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this is super interesting! so i think we need to figure out the breaking changes situation, whether it's worth it (i'm sort of ok with it, in particular the get
method return signature (from Ok(Some
to Some
seems probably ok, but the various methods removed/renamed doesn't seem necessary?, but maybe you can change my mind :)
otherwise this is looking good; only other concern i have is for perf, parsing entire table at once could have some perf consequences? But maybe not, majority of symbols might be looked up intiially when we construct the dyn sym table + the symbol table, i'm not sure; do you have any insight into this aspect?
lasty, i'll have to reread your explanation why doing an initial parse avoids the o(n^2) runtime when doing gets, because it's not clear to me how that would happen; do you happen to have an elf binary you can share that exhibits this quadratic behavior?
otherwise this is super cool, thank you for making the PR and writing up your findings, this is very interesting :) (and in case it's not clear, yes i think this PR is welcome, and modulo concerns above, i'm ok with merging something like this)
Uses cases which only request a few strings from a huge table will suffer, but even if you request every possible entry only once and there is overlap (shared suffices) then this already provides gains. Obviously, it would improve even further if an entry is requested multiple times.
The worst case is that the However, if the
We have multiple and I wouldn't want to publish them before this is resolved. |
Do tools like |
This reminds me of
Additionally, given |
instead of on demand when entries are requested.
@philipc That is a good question, but I will have to craft different ELF files because the current ones are not completely valid in unrelated areas and it makes these tools return early with a warning. @m4b I added two unit tests for UTF-8 parsing and the CI is passing, can you review it again? |
Could you clarify what you mean here? I.e., why would resolution of this issue enable you to publish / share the problematic binaries? It seems to me if you can't share them now (which is fine), you wouldn't be able to share them after, regardless of whether this issue is resolved. |
We would simply like to keep using the upstream repo, not a fork with our security patches applied. That is all. Once this is merged we can deploy a new version of our software packages and release the exploits as they would be disarmed by the time. By release I mean first to you specifically and later to the general public, as we would want other projects depending on this one to have a change to adopt as well. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ok so other than to_vec
being a breaking change, which if you feel strongly about, i guess i'm ok with it, the only other major concern i have is whether we're pessimizing 99% of binary's strtables in what seems like a very niche problem case.
as far as i understand it, the only security implications I see are a kind of DoS on accessing strtables for these malformed binaries, but it's not clear to me how that is impactful, other than a tool taking a long time to run? it would be good to get some further motivations here.
I'm not sure why, but I almost wish we had a variant constructor for Strtab that uses the pre-allocated form via a config we'd pass into parsing, and defaults to a non-preallocated form otherwise, but this is a much bigger PR. However, it might be worth doing, because a config for parsing options is something i've thought would be nice for a while.
with that in mind, i don't see anything in this PR that would prevent such a feature from landing in the future?
example:
enum StrtabKind {
Preallocated(Vec<&str>),
ZeroCopy
}
pub struct Strtab {
bytes: &[u8],
inner: StrtabKind
}
where get's dispatch depending on the enum variant accordingly. Anyway, just some food for thought, as this would allow users to opt into whether they want to pay for the preallocated penalty or not, at a programmatic, library level choice.
ok i've blabbed enough :D
that being said, i'm in favor of merging, unless anyone else has any other concerns?
pub unsafe fn from_raw(ptr: *const u8, len: usize, delim: u8) -> Strtab<'a> { | ||
Self::from_slice_unsafe(core::slice::from_raw_parts(ptr, len), 0, len, delim) | ||
} | ||
#[deprecated(since = "0.4.2", note = "Bad performance, use get_at() instead")] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
awesome
delim: ctx::StrCtx, | ||
bytes: &'a [u8], |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: layout here didn't need to change (bytes after delim in field declarations), this and some of the functions moving around did make this PR a bit hard to review, just fyi
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Was for consistency with the constructors (see line 30 of the prev version).
src/strtab.rs
Outdated
pub fn new(bytes: &'a [u8], delim: u8) -> Self { | ||
Strtab { | ||
/// Creates a `Strtab` directly without bounds check and without parsing it. | ||
pub fn from_slice_unsafe(bytes: &'a [u8], offset: usize, len: usize, delim: u8) -> Self { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
e.g., this function here makes a lot of diff noise; also does this function need to be pub
? one can construct a strtable now, but the vec isn't filled, which means any get_at
calls would fail, which feels like the strtab is "half constructed"; also it is better to not introduce any new pub apis unless necessary, since once they're pub, we can't take them back
src/strtab.rs
Outdated
/// Safely parses and gets a str reference from the backing bytes starting at byte `offset`. | ||
#[cfg(feature = "alloc")] | ||
/// Converts the string table to a vector of parsed strings. | ||
pub fn to_vec(&self) -> Vec<&'a str> { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
unfortunately this is still a breaking change, since return signature changed. i'm 50/50 on fence whether it's worth it (can just Ok wrap at end and it's non-breaking, but maybe it's fine to break?)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ironically, I was on the fence of breaking it even further by including the starting offset of each entry in the result.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ok so I've thought about this, and I think we should return Result
here, so that we don't have a breaking change, and if we switch to using the enum, it will work without changes; if we get that done i'm ok with merging this and making a minor release asap.
As a bonus, we can perhaps construct the vector on the fly if it's empty
(i.e., it was constructed via the unsafe method), but that is probably a niche usecase.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Both done, reverted to use the Result
return type and implemented a fallback.
/// Construct a strtab from a `ptr`, and a `size`, using `delim` as the delimiter | ||
/// | ||
/// This function creates a `Strtab` directly from a raw pointer and size | ||
pub unsafe fn from_raw(ptr: *const u8, len: usize, delim: u8) -> Strtab<'a> { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ah i see now why from_slice_unsafe
is pub; so i'm not sure this should be deprecated? i used this in the past, e.g., to construct strtables for a dynamic linker, which doesn't want to allocate (since it's already mapped into memory). i'm not sure what others are doing, but it would force someone to first call from raw_parts
on their pointer and len as you do here.
alternatively, it would be feasible to have from_slice_unsafe just get the ptr + len and pass it to this. however, it's probably better to get rid of / deprecate these unsafe apis, since it doesn't add much (the user can just call the unsafe from_raw_parts
themselves in their own unsafe context, so we push it onto them, which i think is fine).
so i guess what i'm saying is maybe it might be good to either:
- delete this
- deprecate as you do with warning it will be removed, then delete next release
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The deprecated marker is a few lines above (the comment) in 103.
Well, we have no data what the "usual" user of this crate is doing on average, so it is hard to optimize for.
Would not be hit at all or even profit from the pre-parsing.
I agree that many CLI tools some developer runs and cancels, if they take to long, are not affected by this.
Might not be so complicated, why not just default back to the original implementation of That being said, maybe |
I made the changes you requested, could you rerun the CI tests once more? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this is great, thanks for doing this :)
I believe the CI has run on your last change Anyway, thank you for your patience! I think this is ready to merge; perhaps we can do the enum variant I talked about earlier, but it's fine for now I believe; we can do a minor crates release as well, since we don't (i believe!) have any breaking changes. Thanks again for everyone's input here! |
Thank you as well. Once you release the minor crate version I will send you the exploits via email to m4b.github.io a gmail.com |
ok i've published 0.4.2; do note that i caught a breaking change in the Thanks for the PR! |
Together with @jackcmay I am working on a BPF VM, which utilizes goblin for loading user provided ELF files.
So, last week we encountered a problem with the way
Strtab::get()
works, tried to fix it outside, which did not work and now have found a fix which would require upstream changes. Furthermore, I think these changes can also help other projects depending on goblin which are exposed to potentially hand-crafted adversary / worst case inputs too.The problem
A
get
request performs a linear scan to find the separators /NULL
termination and possibly invalid UTF-8 variable length encodings on every single request. So it is possible to construct e.g. an ELF file which is only a few MiB in size but has massive overlap in theStrtab
. This way everything involving theStrtab
suddenly becomes O(n^2) in runtime complexity because of the overlap (as opposed to independent strings which can be amorphized to be constant length) and just reading the file can take minutes to hours.What did not work
We tried to fix it on our side by caching the results of
Strtab::get()
. Butgoblin/src/elf/dynamic.rs
Line 450 in b2b15a3
goblin/src/elf/mod.rs
Line 304 in 5440554
Strtab
on average. So still O(n^2) runtime complexity in total.What does work
The proposed fix in this PR instead moves the parsing from the
get
request to the constructor of theStrtab
, which in turn also moves the error handling with it. The trick is to use a binary search (in eachget
request) over the cached positions of the separators (found by one linear scan in the constructor). This is only done when dynamic (heap) allocations are enabled via#[cfg(feature = "alloc")]
, otherwise we default to the original behavior.Missing
I did not address or test UTF-8 support yet, so it is probably broken. I will invest more time into it if this PR looks otherwise promising.