Use PDB files in rewriting process #74

Open
avncharlie opened this issue Mar 12, 2024 · 3 comments

Comments

@avncharlie

I'm not sure where the best place to put this is between gtirb-pprinter, gtirb-rewriting and here, so please let me know and I can reopen this in the best repo.

While developing instrumentation using gtirb-rewriting, I would like to do this:

  • Use ddisasm on a PE and its PDB symbol file to generate a GTIRB IR file with symbol info
  • Instrument this IR with gtirb-rewriting (maybe even using the symbol info, e.g., instrumenting only a function with a specific name)
  • Use gtirb-pprinter to output a PE binary and corresponding PDB symbol file from the instrumented IR
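The two external tool invocations in the steps above can be sketched as follows. This is a minimal sketch, not a confirmed recipe: the `--ir` and `--binary` flags are written as I recall them from the ddisasm and gtirb-pprinter READMEs and should be checked against each tool's `--help`, and the helper name `build_pipeline` and the output file names are hypothetical. Neither command takes or produces a PDB, which is exactly the gap this issue asks about.

```python
import shlex
from pathlib import Path


def build_pipeline(pe_path: str, workdir: str = ".") -> list[list[str]]:
    """Return the external commands for the round trip, in order.

    Hypothetical helper: file names are illustrative, and the flags are
    assumptions to verify with `ddisasm --help` / `gtirb-pprinter --help`.
    """
    work = Path(workdir)
    stem = Path(pe_path).stem
    ir_path = work / f"{stem}.gtirb"
    rewritten_ir = work / f"{stem}.instrumented.gtirb"
    out_pe = work / f"{stem}.instrumented.exe"
    return [
        # Step 1: disassemble the PE into a GTIRB IR file (no PDB input).
        ["ddisasm", pe_path, "--ir", str(ir_path)],
        # Step 2 happens between these commands: a gtirb-rewriting script
        # (not shown) reads ir_path and writes rewritten_ir.
        # Step 3: print the instrumented IR back to a PE (no PDB output).
        ["gtirb-pprinter", "--ir", str(rewritten_ir), "--binary", str(out_pe)],
    ]


if __name__ == "__main__":
    for cmd in build_pipeline("app.exe", "out"):
        print(shlex.join(cmd))
```

Each returned list can be passed to `subprocess.run(cmd, check=True)` once the instrumentation step has written the rewritten IR.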

I haven't found a way to run instrumentation that preserves symbol information in the output PE binary. Is this possible?

@aeflores
Collaborator

Hi @avncharlie, this is an interesting idea!

Our tooling is missing a few pieces for this to be possible. You could run ddisasm on a PE binary and create a GTIRB file, but we don't have any utilities to parse PDBs and use their information. This could be done either (1) as a post-processing step that annotates the GTIRB with information from the PDB, or (2) by having ddisasm parse the PDB so that it can use the information for better disassembly. Option 1 would probably be simpler to implement, but ddisasm would not benefit from the PDB information. Option 2 would probably require more work but could potentially get you better results.
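To make option 1 a bit more concrete, here is a minimal sketch of the matching step only. It assumes the PDB has already been parsed into (name, RVA) records by some other tool and that the block start addresses have been read from the GTIRB; `PdbSymbol` and `match_symbols` are hypothetical names, and the sketch deliberately uses plain Python data rather than the real GTIRB API.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PdbSymbol:
    """Hypothetical record: a symbol name plus its relative virtual address."""
    name: str
    rva: int


def match_symbols(symbols, image_base, block_addresses):
    """Map known block addresses to the PDB names that land on them.

    Symbols whose address is not the start of a known block are returned
    separately so they can be reviewed instead of being silently dropped.
    """
    known = set(block_addresses)
    matched: dict[int, list[str]] = {}
    unmatched: list[PdbSymbol] = []
    for sym in symbols:
        # PE addresses are the preferred image base plus the RVA.
        addr = image_base + sym.rva
        if addr in known:
            matched.setdefault(addr, []).append(sym.name)
        else:
            unmatched.append(sym)
    return matched, unmatched


if __name__ == "__main__":
    syms = [PdbSymbol("main", 0x1000), PdbSymbol("stale", 0x2000)]
    print(match_symbols(syms, 0x140000000, [0x140001000]))
```

In a real post-processing pass, each matched name would then be attached to the corresponding block as a GTIRB symbol; that part is left out here because it depends on the GTIRB API.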

Once you have a GTIRB annotated with symbols, I think you should be able to use gtirb-rewriting to instrument it and gtirb-pprinter to generate a new PE. However, gtirb-pprinter cannot generate PDB files; that would be the second missing piece.
I am not sure how much effort this would be. I know llvm's support for PDB files (e.g. https://llvm.org/docs/CommandGuide/llvm-pdbutil.html) has been getting better, so using some of that might make things easier.

@XVilka

XVilka commented Mar 13, 2024

You could use the Rizin library to parse both PDB and DWARF (and maybe some other debugging-information formats in the future).

It is a C library and definitely smaller than LLVM, so using it is much easier.

@aeflores
Collaborator

aeflores commented Aug 6, 2024

It looks like the latest version of LIEF (https://lief.re/doc/stable/changelog.html#july-23th-2024) has some support for parsing PDBs and DWARF sections. Once we update ddisasm to the latest LIEF, using that information during disassembly should be much easier.
