Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Write support for PE binaries #361

Merged
merged 15 commits into from
Nov 21, 2023
Merged

Write support for PE binaries #361

merged 15 commits into from
Nov 21, 2023

Conversation

RaitoBezarius
Copy link
Contributor

@RaitoBezarius RaitoBezarius commented Mar 11, 2023

Write support for PE binaries in a basic form

This implements reserialization for many in-memory structure with PE semantics.
It does not yet enable you to modify a PE and write it because you will need many modifications to the existing structure to avoid breaking the PE.

But you can already modify existing bytes this way.

I learned the hard way that this code may be insufficient to write the worst PE binaries out there (look the Corkami repositories on pocs and look for the PE there), I deem this to be out of scope for now as it is very hard to get all of this right.

Also, I don't think that identity rewrite should be a goal for anything else than "basic" and "simple" (read: nice) PEs, which can be accomplished in this PR with NixOS PE binaries for UEFI (and I tested a hundred of them).

Testing ways:

cargo run --example rewrite_pe /boot/EFI/Linux/some-uefi.efi /tmp/uefi.efi

As a follow-up, I will work on:

  • section manipulation to implement "objcopy"-like features
  • signature manipulate to implement "pesign"-like features

TODO:

  • Sanity checks for non-overlapping sections (debug_assert! or assert! ?)
  • Fixup no_std needs
  • COFF strings write support
  • Data directories write support
  • Attribute certificate write support
  • Header write support
  • COFF symbols write support (including aux records)
  • Import table write support
  • PE32 write support for OptionalHeader
  • Write section data in the correct location
  • Write attr cert in the correct location
  • Write (delay-load) import tables
  • .debug
  • .edata
  • .idata
  • .pdata
  • .reloc
  • .tls
  • Load Configuration Structure
  • .rsrc
  • Double check padding
  • Read/Write/Read tests

Out of scope :

  • Update checksums if necessary?
  • Simple test with a NixOS PE binary that we can rewrite effectively the same
  • Canonicalization/Minimization of PEs ? (out of scope)

Relevant issues/PRs:

#277
#357

@RaitoBezarius RaitoBezarius marked this pull request as draft March 11, 2023 16:51
@RaitoBezarius
Copy link
Contributor Author

@m4b you did say you didn't want more dependencies in goblin, are you okay with some of them in examples code? (to showcase how to use external crates for example?)

src/pe/mod.rs Outdated Show resolved Hide resolved
let inner = bytes.gread_with(offset, scroll::LE)?;
Ok(DataDirectory {
inner,
offset: *offset,
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could I have your feedback on this @m4b before I go further?
I am not sure how to compute offset except by getting at parsing time.

@RaitoBezarius

This comment was marked as outdated.

@RaitoBezarius

This comment was marked as outdated.

@RaitoBezarius

This comment was marked as outdated.

@RaitoBezarius
Copy link
Contributor Author

@m4b What do you expect from me for the testing strategy of this PR? Should I add various PE binaries or just some binaries and test rewriting them or test writing binaries from scratch for some features?

I feel like getting this feature to perfect polish will take some hoops and I am wondering if we can merge a "beta" version and stabilize it further.

src/pe/header.rs Outdated Show resolved Hide resolved
src/pe/header.rs Outdated Show resolved Hide resolved
src/pe/header.rs Outdated Show resolved Hide resolved
@m4b
Copy link
Owner

m4b commented May 29, 2023

@m4b What do you expect from me for the testing strategy of this PR? Should I add various PE binaries or just some binaries and test rewriting them or test writing binaries from scratch for some features?

I feel like getting this feature to perfect polish will take some hoops and I am wondering if we can merge a "beta" version and stabilize it further.

I'm ok with merging a beta version. I think I'd like to do a release before that, this way you can work off of master and hopefully have a full featured version ready for a crates release?

@m4b
Copy link
Owner

m4b commented May 29, 2023

And re binaries and testing this has always been a source of tension. For now I'd recommend checking in u8 slices in test code and not binaries as we have in other places.

@m4b
Copy link
Owner

m4b commented Jul 5, 2023

@RaitoBezarius friendly ping/triage wondering what work is left to take this about of WIP and getting something ready to merge? :)

@RaitoBezarius
Copy link
Contributor Author

@RaitoBezarius friendly ping/triage wondering what work is left to take this about of WIP and getting something ready to merge? :)

I decided with @baloo to take this PR for a super crash test in https://github.com/RaitoBezarius/goblin-signing/tree/signed-data where we start signing binaries arbitrarily combining everything. I guess it will be a good test if I can wire this down to our integration testing with UEFI.

@m4b
Copy link
Owner

m4b commented Jul 9, 2023

awesome sounds good to me and let me know how it goes; lots of good work here, don't want it to get stale :)

@Lunarequest
Copy link

is there any update on this? whats is left to be done(other then rebasing)

@RaitoBezarius
Copy link
Contributor Author

Life has been busy and I didn't want to fuck up this by merging too early, all is left is testing two usecases "end to end":

  • adding new sections or new section data, usecase: replacing systemd's ukify or objcopy for creating UKIs binaries in UEFI land
  • signing binaries via Rust cryptography ecosystem, usecase: HSM/TPM signature of SecureBoot binaries

I feel like there should be like a final push of work, one week or something, but I am on vacations and just after I will be busy under conferences, hopefully, I aim to close this before the end of September.

@m4b
Copy link
Owner

m4b commented Sep 2, 2023

Great thank you for the update !

@m4b
Copy link
Owner

m4b commented Sep 2, 2023

And looking forward to seeing this merged if wasn’t clear, thanks @Lunarequest for pinging in status !

@RaitoBezarius RaitoBezarius force-pushed the write-certs branch 4 times, most recently from ff7b1ad to aace8e5 Compare October 1, 2023 01:36
@RaitoBezarius
Copy link
Contributor Author

I'm soon there, I want to rework the commit history, look for an easy basic testing strategy and I want to go for the merge then.

@RaitoBezarius
Copy link
Contributor Author

Planned follow-ups:

  • add/remove/edit a section in a PE, this will require a superstructure called PEWriter as it is impossible to write a new section without relaying out all the PE's sections, example: "minimal objcopy" in Rust for PEs, application: replace ukify and other shenanigans.
  • add/remove/edit a signature in goblin-signing subcrate, example: "minimal sign this binary" in Rust using local keys, application: replace pesign and other shenanigans.

@RaitoBezarius RaitoBezarius force-pushed the write-certs branch 3 times, most recently from 5ab66a6 to b76248a Compare October 2, 2023 18:39
@RaitoBezarius RaitoBezarius force-pushed the write-certs branch 4 times, most recently from c445107 to 5f69b48 Compare November 20, 2023 12:30
Copy link
Owner

@m4b m4b left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is really amazing work @RaitoBezarius thank you so much! And thank you for committing the example too; tbh i'd really prefer not to have dev dependencies introduced, but it's fine I guess. I'll let this sit for a few hours, but otherwise I'm ready to merge, great stuff.

So as far as "feature completeness", would the PE pwrite implementation allow, e.g., writing out PE object files for a toy compiler backend, similar to what faerie did?

@@ -163,14 +262,24 @@ impl CoffHeader {
}

/// Return the COFF symbol table.
pub fn symbols<'a>(&self, bytes: &'a [u8]) -> error::Result<symbol::SymbolTable<'a>> {
pub fn symbols<'a>(&self, bytes: &'a [u8]) -> error::Result<Option<symbol::SymbolTable<'a>>> {
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm are you saying we should construct an empty symbol table and not an optional in this case? If yes, it might be worthwhile to not make this optional, as it might be changed back if we had the option of returning an empty symbol table / have scroll not crash on offset == 0

@@ -163,14 +262,24 @@ impl CoffHeader {
}

/// Return the COFF symbol table.
pub fn symbols<'a>(&self, bytes: &'a [u8]) -> error::Result<symbol::SymbolTable<'a>> {
pub fn symbols<'a>(&self, bytes: &'a [u8]) -> error::Result<Option<symbol::SymbolTable<'a>>> {
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I guess what i'm saying does the optionality signify important semantic information to the user that the thing you are trying to get might not "be there", as opposed to working around a scroll bug. If 0 signifies it isn't "there"/it isn't required, then let's keep it optional, yes.

@@ -3,7 +3,12 @@

// TODO: panics with unwrap on None for apisetschema.dll, fhuxgraphics.dll and some others
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

unrelated but i wonder if this comment still true

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's a good question, I don't have those Windows binaries on my hand, but I can check them.

/// has been deprecated.
pub symbols: Option<symbol::SymbolTable<'a>>,
/// The string table, they don't exist if COFF symbol table does not exist.
pub strings: Option<strtab::Strtab<'a>>,
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ok I think this comment answered my question about optionality, namely the strtab and symbol table is actually optional

@@ -171,7 +208,7 @@ impl ctx::SizeWith<scroll::Endian> for SectionTable {
}
}

impl ctx::TryIntoCtx<scroll::Endian> for SectionTable {
impl ctx::TryIntoCtx<scroll::Endian> for &SectionTable {
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

interesting, i wasn't aware you could do this; one longstanding issue in scroll (and major 1.0 blocker) was to allow pwrite impls for reference types, but it hadn't occurred to me the user could just write the TryIntoCtx impl in this manner, i'm not sure why it never was brought up in that issue...

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, I realized that there's almost no reason to give owned types to the serialization process given that you are really just reading those memory locations and discarding them, having always a reference makes them cheap to move around.

src/pe/symbol.rs Show resolved Hide resolved
src/strtab.rs Outdated Show resolved Hide resolved
@m4b
Copy link
Owner

m4b commented Nov 21, 2023

@baloo did you want to review? Otherwise I'm going to merge in an hour or so

@baloo
Copy link
Contributor

baloo commented Nov 21, 2023

I've tried to help @RaitoBezarius on this (he still deserves 120% of the credit here! But we've been bouncing off ideas on matrix). The code looks good to me.

@RaitoBezarius
Copy link
Contributor Author

This is really amazing work @RaitoBezarius thank you so much! And thank you for committing the example too; tbh i'd really prefer not to have dev dependencies introduced, but it's fine I guess. I'll let this sit for a few hours, but otherwise I'm ready to merge, great stuff.

Thank you so much, the next PR #381 is going to be also very good. :-)
stderrlog is not necessary per se, but I am not aware of an easier way to make the logs of log appear on stdout easily, we can remove it if you prefer.

So as far as "feature completeness", would the PE pwrite implementation allow, e.g., writing out PE object files for a toy compiler backend, similar to what faerie did?

So right now, it only focuses on PE executables, not PE object files (i.e. COFF), technically, if you had a PE object files library, you could write your COFF files and use this to assemble the final PE executable.

In the future, if someone adds COFF tables support (strings, symbols, etc.), you could probably write PE COFF files by filling the structures yourself, though, for an elegant and nice compiler, I recommend following the methodology in #381 where you have a meta structure PEBuilder or COFFBuilder and you assemble the final PE which you can write out via Scroll.

You need a lot of "layout"-ing when building such things from scratch, which can be a PITA when you want to focus on just giving a high level representation of your binary (here's my sections, this one is executable, attach some certificates here, etc.)

I am not planning to do the COFF thing in the near future because it creates a certain higher level of complexity and all I need is the PE executable stuff, but, I may come back later for this as I wonder if LLVM may someday adopt Rust and this library would probably be handy for a real compiler. :)

@RaitoBezarius
Copy link
Contributor Author

And @baloo more than helped me! He stayed with my ramblings about PE and we fixed a lot of stuff, if anything, he's definitely a co-author for me :).

Required to rematerialize a whole PE, otherwise, it is a corrupted PE.
…afely

It now returns a Cow as we need to transfer ownership whenever we have
virtual size > size of raw data, which forces us to pad with zeroes
according to the PE specification.
It can be used to zero-extend any buffer to make its length
aligned on some boundary, e.g. quadwords, for attribute certificates.

It is designed to be "zero cost" if you have no alignment to do, which
will result in no-allocation!
It enables a lossless conversion from 64 bits to 32 bits headers
except on platforms… that are strange.
This tucks `DataDirectory` with a Deref-style and expose `offset` for internal consumers (this crate)
and offers write support for data directories.
All that is needed is to import the allocation `Vec`.
This is a trivial rewriter that performs the identity transformation.
To further improve the debuggability, a non-overlapping sanity check is added as a
debug_assert!.
Previously, I (we?) thought that COFF symbols/strings were always there.

In fact, this is not the case and they are deprecated
according to https://learn.microsoft.com/en-us/windows/win32/debug/pe-format#coff-file-header-object-and-image

> The file offset of the COFF symbol table, or zero
> if no COFF symbol table is present. This value should be zero for an image
> because COFF debugging information is deprecated.
We had the write certificate logic in the `TryIntoCtx`, but it makes
sense to have it separated to let consumers call it manually.
}

/// Return the COFF string table.
pub fn strings<'a>(&self, bytes: &'a [u8]) -> error::Result<strtab::Strtab<'a>> {
pub fn strings<'a>(&self, bytes: &'a [u8]) -> error::Result<Option<strtab::Strtab<'a>>> {
// > The file offset of the COFF symbol table, or zero if no COFF symbol table is present.
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You can also see it here (the optionality) @m4b
Let me know if you prefer the doc in a different way

src/pe/data_directories.rs Show resolved Hide resolved
@m4b m4b merged commit 6d664c0 into m4b:master Nov 21, 2023
6 checks passed
@RaitoBezarius
Copy link
Contributor Author

Whew! What a PR, thank you @m4b for your patience. :)

@m4b
Copy link
Owner

m4b commented Jan 1, 2024

released in 0.8.0, happy new year!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants