Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Handle fileoff out of file boundary in load segment commandj when parsing Mach-O #172

Merged
merged 2 commits into from
Jul 1, 2019
Merged

Handle fileoff out of file boundary in load segment commandj when parsing Mach-O #172

merged 2 commits into from
Jul 1, 2019

Conversation

raindev
Copy link
Contributor

@raindev raindev commented Jun 29, 2019

When parsing BoringSSL's object files built on macOS I had goblin failing on some with bad offset x where x is outside of file boundary (pointing at the byte right after the end of the object). It might be a toolchain bug but since such binaries exist it would be useful if goblin could handle them. objdump does not fail to parse those files nor Go's Mach-O parser does. The easiest way to reproduce the problem is to build BoringSSL and run bingrep on cpu-aarch64-fuchsia.c.o (a copy in my Dropbox to save time) or fuchsia.c.o.

I've intentionally only checked fileoff being outside of file boundary and not fileoff + filesize to not hide problems when an object file is obviously malformed.

Handle gracefully the fileoff in load segment command being outside of
file boundary as it can occur in valid Mach-O files
@philipc
Copy link
Collaborator

philipc commented Jun 30, 2019

I prefer a minimal fix of only ignoring fileoff if filesize is 0. We can reconsider this later if we encounter files that have both invalid fileoff and non-zero filesize, but that is not the case for the example you have provided.

@willglynn
Copy link
Collaborator

Would it be preferable to make pub data: &[u8] into pub data: Result<&[u8]>? That's an interface change, but it correctly captures this new failure mode (where fileoff is incorrect) and correctly prompts (forces) callers to consider the case where a parsed Segment does not contain data.

@philipc
Copy link
Collaborator

philipc commented Jun 30, 2019

It's not a new failure mode. It's an existing failure mode that most (all?) consumers want to ignore because files like this exist in practice, and in the absence of a spec that says otherwise, I don't see any benefit to treating this as a failure.

correctly prompts (forces) callers to consider the case where a parsed Segment does not contain data

I don't think it does that, unless you're proposing to return an error if filesize is 0 too, which I would disagree with.

If we're considering interface changes then my preference would be to split the parsing:

  • Segment::from_32/from_64 read the values without interpreting them (i.e. delete Segment::data)
  • add a method that interprets fileoff and filesize (e.g fn Segment::data(&self) -> Result<&[u8]>).

But I think anything like that should be a separate PR. We should at least do an initial minimal fix of ignoring fileoff if filesize is 0.

@m4b
Copy link
Owner

m4b commented Jun 30, 2019

I think @philipc suggestion is way forward here; @raindev can you force push that change to your branch, and we can merge? 0 size data entries should have empty data; this might even be a bug in scroll attempting to pread a 0 size data array. Need to check. @willglynn I like your suggestion too, but for now I think we should just parse out the 0 array, and in future if enough robustness is required, move the result array into the field as you suggested, but we can hold off on that for now :)

@m4b
Copy link
Owner

m4b commented Jun 30, 2019

On the note of that object file, @raindev where did you get it? It's weird in a few other ways, e.g.:

     Running `target/debug/bingrep /home/m4b/Downloads/cpu-aarch64-fuchsia.c.o`
Mach-o OBJECT x86_64-little-endian @ 0x0:

LoadCommands(2):
   0 LC_SEGMENT_64
   1 LC_UNKNOWN

why is load command 1 LC_UNKNOWN? And why is the file named aarch64 but the header states its an x86_64 object file? (perhaps last one is unrelated, was just curious)

@raindev
Copy link
Contributor Author

raindev commented Jun 30, 2019

Relying on filesize being zero sounds like a better approach to me as well, I've added a commit with the change.
Regarding the file, it was produced by normal BoringSSL build from this source file (which is basically empty on x86_64):
https://github.com/google/boringssl/blob/master/crypto/cpu-aarch64-fuchsia.c

@philipc
Copy link
Collaborator

philipc commented Jun 30, 2019

@m4b Looks like it is LC_BUILD_VERSION, which is a relatively recent addition in LLVM.

@m4b m4b merged commit 7f248a1 into m4b:master Jul 1, 2019
@m4b
Copy link
Owner

m4b commented Jul 1, 2019

Ok new goblin is published, 0.23 (and bingrep, 0.7); should be able to parse this stuff without error, thanks for the PR @raindev !

@raindev
Copy link
Contributor Author

raindev commented Jul 1, 2019

Thank you for getting it in quickly!

@m4b
Copy link
Owner

m4b commented Jul 1, 2019

Thank you for the PR and identifying the issue !

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants