feat: Support DWARF debug info in PE files #744

Merged
merged 9 commits into from
Jan 23, 2023
3 changes: 3 additions & 0 deletions CHANGELOG.md
@@ -7,18 +7,21 @@
- `PortablePdbDebugSession` now returns files referenced in the Portable PDB file. ([#729](https://github.com/getsentry/symbolic/pull/729))
- `PortablePdbDebugSession` now returns source files embedded in the Portable PDB file. ([#734](https://github.com/getsentry/symbolic/pull/734))
- Implement `symbolic_common::AsSelf` `for SourceMapCache` ([#742](https://github.com/getsentry/symbolic/pull/742))
- Debug information can now be retrieved from PE files with DWARF debug info. ([#744](https://github.com/getsentry/symbolic/pull/744))

**Breaking changes**:

- Demangling functionality is removed from C and Python bindings. ([#730](https://github.com/getsentry/symbolic/pull/730))
- The fields of `FileInfo` and the `compilation_dir` field on `FileEntry` are now private. ([#729](https://github.com/getsentry/symbolic/pull/729))
- `PortablePdbDebugSession` now has a lifetime parameter. ([#729](https://github.com/getsentry/symbolic/pull/729))
- `PeDebugSession` placeholder has been removed. ([#744](https://github.com/getsentry/symbolic/pull/744))

**Thank you**:

Features, fixes and improvements in this release have been contributed by:

- [@vaind](https://github.com/vaind)
- [@casept](https://github.com/casept)

## 10.2.1

2 changes: 1 addition & 1 deletion README.md
@@ -18,7 +18,7 @@ Symbolic provides the following functionality:
- Symbolication based on custom cache files (symcache)
- Symbol cache file generators from:
- Mach, ELF and PE symbol tables
- Mach and ELF embedded DWARF data
- Mach, ELF and PE embedded DWARF data
- PDB CodeView debug information
- Breakpad symbol files
- Demangling support
6 changes: 4 additions & 2 deletions symbolic-debuginfo/src/dwarf.rs
@@ -1,11 +1,13 @@
//! Support for DWARF debugging information, common to ELF and MachO.
//! In rare cases, PE files may contain it as well.
//!
//! The central element of this module is the [`Dwarf`] trait, which is implemented by [`ElfObject`]
//! and [`MachObject`]. The dwarf debug session object can be obtained via getters on those types.
//! The central element of this module is the [`Dwarf`] trait, which is implemented by [`ElfObject`],
//! [`MachObject`] and [`PeObject`]. The dwarf debug session object can be obtained via getters on those types.
//!
//! [`Dwarf`]: trait.Dwarf.html
//! [`ElfObject`]: ../elf/struct.ElfObject.html
//! [`MachObject`]: ../macho/struct.MachObject.html
//! [`PeObject`]: ../pe/struct.PeObject.html

use std::borrow::Cow;
use std::collections::BTreeSet;
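The module docs above describe one trait implemented by several container formats. A minimal sketch of that pattern follows. `DwarfLike`, `SectionData` and `ToyObject` are hypothetical stand-ins invented for illustration, not the crate's real `Dwarf` trait or types. Each container only implements section lookup, and shared logic lives in a provided method.

```rust
use std::borrow::Cow;

/// Hypothetical, simplified stand-in for a DWARF section handle.
pub struct SectionData<'data> {
    pub address: u64,
    pub data: Cow<'data, [u8]>,
}

/// Hypothetical, simplified stand-in for a `Dwarf`-style trait.
pub trait DwarfLike<'data> {
    /// Looks up a section by name without the leading dot (e.g. "debug_info").
    fn raw_section(&self, name: &str) -> Option<SectionData<'data>>;

    /// Provided method shared by every container format.
    fn has_section(&self, name: &str) -> bool {
        self.raw_section(name).map_or(false, |s| !s.data.is_empty())
    }
}

/// Toy container: (name, address, bytes) triples borrowed from a file buffer.
pub struct ToyObject<'data> {
    pub sections: Vec<(&'static str, u64, &'data [u8])>,
}

impl<'data> DwarfLike<'data> for ToyObject<'data> {
    fn raw_section(&self, name: &str) -> Option<SectionData<'data>> {
        // Container-specific naming: this toy format stores names with a leading dot.
        let wanted = format!(".{}", name);
        self.sections
            .iter()
            .find(|s| s.0 == wanted)
            .map(|&(_, address, data)| SectionData {
                address,
                data: Cow::Borrowed(data),
            })
    }
}
```

Under this shape, adding a new container format only requires a new `raw_section` implementation, which is roughly what the PR does for `PeObject`.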
14 changes: 1 addition & 13 deletions symbolic-debuginfo/src/object.rs
@@ -331,7 +331,7 @@ impl<'data> Object<'data> {
.map_err(ObjectError::transparent),
Object::Pe(ref o) => o
.debug_session()
.map(ObjectDebugSession::Pe)
.map(ObjectDebugSession::Dwarf)
.map_err(ObjectError::transparent),
Object::SourceBundle(ref o) => o
.debug_session()
@@ -446,7 +446,6 @@ pub enum ObjectDebugSession<'d> {
Breakpad(BreakpadDebugSession<'d>),
Dwarf(DwarfDebugSession<'d>),
Pdb(PdbDebugSession<'d>),
Pe(PeDebugSession<'d>),
Member

Just a NOTE: removing these types is considered a breaking change.

Since getsentry/publish#1700 is stuck and no one has tried to fix it yet, we can still squeeze this breaking change into the next release.

Please make sure to document the changes in the changelog though.

Contributor Author

I've updated the changelog.

SourceBundle(SourceBundleDebugSession<'d>),
PortablePdb(PortablePdbDebugSession<'d>),
}
@@ -464,7 +463,6 @@ impl<'d> ObjectDebugSession<'d> {
ObjectDebugSession::Breakpad(ref s) => ObjectFunctionIterator::Breakpad(s.functions()),
ObjectDebugSession::Dwarf(ref s) => ObjectFunctionIterator::Dwarf(s.functions()),
ObjectDebugSession::Pdb(ref s) => ObjectFunctionIterator::Pdb(s.functions()),
ObjectDebugSession::Pe(ref s) => ObjectFunctionIterator::Pe(s.functions()),
ObjectDebugSession::SourceBundle(ref s) => {
ObjectFunctionIterator::SourceBundle(s.functions())
}
@@ -480,7 +478,6 @@ impl<'d> ObjectDebugSession<'d> {
ObjectDebugSession::Breakpad(ref s) => ObjectFileIterator::Breakpad(s.files()),
ObjectDebugSession::Dwarf(ref s) => ObjectFileIterator::Dwarf(s.files()),
ObjectDebugSession::Pdb(ref s) => ObjectFileIterator::Pdb(s.files()),
ObjectDebugSession::Pe(ref s) => ObjectFileIterator::Pe(s.files()),
ObjectDebugSession::SourceBundle(ref s) => ObjectFileIterator::SourceBundle(s.files()),
ObjectDebugSession::PortablePdb(ref s) => ObjectFileIterator::PortablePdb(s.files()),
}
@@ -500,9 +497,6 @@ impl<'d> ObjectDebugSession<'d> {
ObjectDebugSession::Pdb(ref s) => {
s.source_by_path(path).map_err(ObjectError::transparent)
}
ObjectDebugSession::Pe(ref s) => {
s.source_by_path(path).map_err(ObjectError::transparent)
}
ObjectDebugSession::SourceBundle(ref s) => {
s.source_by_path(path).map_err(ObjectError::transparent)
}
@@ -537,7 +531,6 @@ pub enum ObjectFunctionIterator<'s> {
Breakpad(BreakpadFunctionIterator<'s>),
Dwarf(DwarfFunctionIterator<'s>),
Pdb(PdbFunctionIterator<'s>),
Pe(PeFunctionIterator<'s>),
SourceBundle(SourceBundleFunctionIterator<'s>),
PortablePdb(PortablePdbFunctionIterator<'s>),
}
@@ -556,9 +549,6 @@ impl<'s> Iterator for ObjectFunctionIterator<'s> {
ObjectFunctionIterator::Pdb(ref mut i) => {
Some(i.next()?.map_err(ObjectError::transparent))
}
ObjectFunctionIterator::Pe(ref mut i) => {
Some(i.next()?.map_err(ObjectError::transparent))
}
ObjectFunctionIterator::SourceBundle(ref mut i) => {
Some(i.next()?.map_err(ObjectError::transparent))
}
@@ -576,7 +566,6 @@ pub enum ObjectFileIterator<'s> {
Breakpad(BreakpadFileIterator<'s>),
Dwarf(DwarfFileIterator<'s>),
Pdb(PdbFileIterator<'s>),
Pe(PeFileIterator<'s>),
SourceBundle(SourceBundleFileIterator<'s>),
PortablePdb(PortablePdbFileIterator<'s>),
}
@@ -593,7 +582,6 @@ impl<'s> Iterator for ObjectFileIterator<'s> {
Some(i.next()?.map_err(ObjectError::transparent))
}
ObjectFileIterator::Pdb(ref mut i) => Some(i.next()?.map_err(ObjectError::transparent)),
ObjectFileIterator::Pe(ref mut i) => Some(i.next()?.map_err(ObjectError::transparent)),
ObjectFileIterator::SourceBundle(ref mut i) => {
Some(i.next()?.map_err(ObjectError::transparent))
}
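The `object.rs` changes above drop the dedicated `Pe` variants and route PE objects through the existing `Dwarf` variant. The sketch below models why the reviewer calls this a breaking change. All types are hypothetical, simplified stand-ins, not the crate's real definitions. A downstream exhaustive `match` that named the removed variant would stop compiling once it is gone.

```rust
/// Hypothetical, simplified stand-in for `ObjectDebugSession` after the change.
#[derive(Debug, PartialEq)]
pub enum Session {
    Dwarf(&'static str),
    Pdb(&'static str),
    // A `Pe(..)` variant used to sit here; removing it changes the public API.
}

/// Hypothetical, simplified stand-in for `Object`.
pub enum Object {
    Elf,
    Pe,
    Pdb,
}

impl Object {
    pub fn debug_session(&self) -> Session {
        match self {
            Object::Elf => Session::Dwarf("elf"),
            // PE now reuses the DWARF session instead of a dedicated no-op session.
            Object::Pe => Session::Dwarf("pe"),
            Object::Pdb => Session::Pdb("pdb"),
        }
    }
}

/// Downstream-style exhaustive match. Code written against the old enum would
/// have had an extra `Session::Pe(_)` arm, which no longer exists.
pub fn describe(session: &Session) -> &'static str {
    match session {
        Session::Dwarf(_) => "dwarf",
        Session::Pdb(_) => "pdb",
    }
}
```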
116 changes: 57 additions & 59 deletions symbolic-debuginfo/src/pe.rs
@@ -3,14 +3,15 @@
use std::borrow::Cow;
use std::error::Error;
use std::fmt;
use std::marker::PhantomData;

use gimli::RunTimeEndian;
use goblin::pe;
use thiserror::Error;

use symbolic_common::{Arch, AsSelf, CodeId, DebugId, Uuid};

use crate::base::*;
use crate::dwarf::*;
use crate::Parse;

pub use goblin::pe::exception::*;
@@ -59,7 +60,8 @@ fn is_pe_stub(pe: &pe::PE<'_>) -> bool {
/// container, [`PdbObject`]. The PE file contains a reference to the PDB and vice versa to verify
/// that the files belong together.
///
/// While in rare instances, PE files might contain debug information, this case is not supported.
/// In rare instances, PE files might contain debug information.
/// This is supported for DWARF debug information.
///
/// [`PdbObject`]: ../pdb/struct.PdbObject.html
pub struct PeObject<'data> {
@@ -108,7 +110,7 @@ impl<'data> PeObject<'data> {

/// The debug information identifier of this PE.
///
/// Since debug information is stored in an external
/// Since debug information is usually stored in an external
/// [`PdbObject`](crate::pdb::PdbObject), this identifier actually refers to the
/// PDB. While strictly the filename of the PDB would also be necessary to fully resolve
/// it, in most instances the GUID and age contained in this identifier are sufficient.
@@ -194,9 +196,10 @@ impl<'data> PeObject<'data> {

/// Determines whether this object contains debug information.
///
/// This is always `false`, as debug information is not supported for PE files.
/// This is usually not the case, except for PE files generated by some alternative
/// toolchains, which contain DWARF debug info.
pub fn has_debug_info(&self) -> bool {
false
self.section(".debug_info").is_some()
}

/// Determines whether this object contains embedded source.
@@ -209,9 +212,21 @@ impl<'data> PeObject<'data> {
false
}

/// Constructs a no-op debugging session.
pub fn debug_session(&self) -> Result<PeDebugSession<'data>, PeError> {
Ok(PeDebugSession { _ph: PhantomData })
/// Constructs a debugging session.
///
/// A debugging session loads certain information from the object file and creates caches for
/// efficient access to various records in the debug information. Since this can be quite a
/// costly process, try to reuse the debugging session as long as possible.
///
/// PE files usually don't have embedded debugging information,
/// but some toolchains (e.g. MinGW) generate DWARF debug info.
///
/// Constructing this session will also work if the object does not contain debugging
/// information, in which case the session will be a no-op. This can be checked via
/// [`has_debug_info`](struct.PeObject.html#method.has_debug_info).
pub fn debug_session(&self) -> Result<DwarfDebugSession<'data>, DwarfError> {
let symbols = self.symbol_map();
DwarfDebugSession::parse(self, symbols, self.load_address() as i64, self.kind())
}

/// Determines whether this object contains stack unwinding information.
@@ -229,6 +244,17 @@ impl<'data> PeObject<'data> {
&self.pe.sections
}

/// Returns the `SectionTable` for the section with this name, if present.
pub fn section(&self, name: &str) -> Option<SectionTable> {
for s in &self.pe.sections {
// Sections whose name cannot be decoded are skipped.
if s.name().map_or(false, |sect_name| sect_name == name) {
return Some(s.clone());
}
}
None
}
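The lookup above compares against the result of `name()` rather than the raw header field. One likely reason, as far as I understand the PE/COFF format, is that the header name field is only 8 bytes, so a name like `.debug_info` is stored as `/<decimal offset>` pointing into the COFF string table. The sketch below illustrates that resolution step. `resolve_section_name` is a hypothetical helper written for this note, not goblin's implementation.

```rust
/// Resolves a COFF section name.
///
/// `raw` is the 8-byte name field from the section header. `strtab` is the COFF
/// string table including its 4-byte size prefix (an assumption of this sketch).
pub fn resolve_section_name<'a>(raw: &'a [u8; 8], strtab: &'a [u8]) -> Option<&'a str> {
    // Short names are stored inline, padded with NUL bytes.
    let len = raw.iter().position(|&b| b == 0).unwrap_or(raw.len());
    let inline = std::str::from_utf8(&raw[..len]).ok()?;

    match inline.strip_prefix('/') {
        // Long names are stored as "/<decimal offset>" into the string table.
        Some(digits) => {
            let offset: usize = digits.parse().ok()?;
            let tail = strtab.get(offset..)?;
            let end = tail.iter().position(|&b| b == 0)?;
            std::str::from_utf8(&tail[..end]).ok()
        }
        None => Some(inline),
    }
}
```

Returning `Option` keeps malformed offsets or non-UTF-8 names from panicking, which mirrors how the lookup above skips sections whose name cannot be read.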

/// Returns exception data containing unwind information.
pub fn exception_data(&self) -> Option<&ExceptionData<'_>> {
if self.is_stub {
@@ -277,8 +303,8 @@ impl<'data> Parse<'data> for PeObject<'data> {
}

impl<'data: 'object, 'object> ObjectLike<'data, 'object> for PeObject<'data> {
type Error = PeError;
type Session = PeDebugSession<'data>;
type Error = DwarfError;
type Session = DwarfDebugSession<'data>;
type SymbolIterator = PeSymbolIterator<'data, 'object>;

fn file_format(&self) -> FileFormat {
@@ -357,54 +383,26 @@ impl<'data, 'object> Iterator for PeSymbolIterator<'data, 'object> {
}
}

/// Debug session for PE objects.
///
/// Since debug information in PE containers is not supported, this session consists of NoOps and
/// always returns empty results.
#[derive(Debug)]
pub struct PeDebugSession<'data> {
_ph: PhantomData<&'data ()>,
}

impl<'data> PeDebugSession<'data> {
/// Returns an iterator over all functions in this debug file.
pub fn functions(&self) -> PeFunctionIterator<'_> {
std::iter::empty()
}

/// Returns an iterator over all source files referenced by this debug file.
pub fn files(&self) -> PeFileIterator<'_> {
std::iter::empty()
}

/// Looks up a file's source contents by its full canonicalized path.
///
/// The given path must be canonicalized.
pub fn source_by_path(&self, _path: &str) -> Result<Option<Cow<'_, str>>, PeError> {
Ok(None)
impl<'data> Dwarf<'data> for PeObject<'data> {
fn endianity(&self) -> RunTimeEndian {
// According to https://reverseengineering.stackexchange.com/questions/17922/determining-endianness-of-pe-files-windows-on-arm,
// the only known platform running PE's with big-endian code is the Xbox360. Probably not worth handling.
RunTimeEndian::Little
}

fn raw_section(&self, name: &str) -> Option<DwarfSection<'data>> {
// Name is given without leading "."
let sect = self.section(&format!(".{}", name))?;
let start = sect.pointer_to_raw_data as usize;
let end = start + (sect.virtual_size as usize);
let dwarf_data: &'data [u8] = self.data.get(start..end)?;
let dwarf_sect = DwarfSection {
// TODO: What about 64-bit PE+? Still 32 bit?
address: u64::from(sect.virtual_address),
data: Cow::from(dwarf_data),
offset: u64::from(sect.pointer_to_raw_data),
align: 4096, // TODO: Does goblin expose this? For now, assume 4K page size
};
Some(dwarf_sect)
}
}
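The `raw_section` implementation above slices the file from `pointer_to_raw_data` for `virtual_size` bytes. A slightly more defensive variant is sketched below, assuming a hypothetical `SectionHeader` struct that carries only the three fields used here. It caps the length at `size_of_raw_data`, since a section's virtual size may exceed what is actually stored on disk, and it uses checked arithmetic for the end offset. This is an illustration of the idea, not the code that was merged.

```rust
/// Hypothetical, simplified subset of a PE section header.
pub struct SectionHeader {
    pub pointer_to_raw_data: u32,
    pub size_of_raw_data: u32,
    pub virtual_size: u32,
}

/// Returns the on-disk bytes of a section, or `None` if the range is invalid.
pub fn section_bytes<'d>(file: &'d [u8], sect: &SectionHeader) -> Option<&'d [u8]> {
    let start = sect.pointer_to_raw_data as usize;

    // Never read more than is stored on disk. A virtual size of zero is treated
    // as "unknown", in which case the raw size is used as-is.
    let len = if sect.virtual_size == 0 {
        sect.size_of_raw_data
    } else {
        sect.virtual_size.min(sect.size_of_raw_data)
    } as usize;

    // Guard against overflow before slicing; `get` handles out-of-bounds ranges.
    let end = start.checked_add(len)?;
    file.get(start..end)
}
```

Capping at the raw size trims alignment padding when the virtual size is smaller, and avoids reading into the following section when it is larger.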

impl<'session> DebugSession<'session> for PeDebugSession<'_> {
type Error = PeError;
type FunctionIterator = PeFunctionIterator<'session>;
type FileIterator = PeFileIterator<'session>;

fn functions(&'session self) -> Self::FunctionIterator {
self.functions()
}

fn files(&'session self) -> Self::FileIterator {
self.files()
}

fn source_by_path(&self, path: &str) -> Result<Option<Cow<'_, str>>, Self::Error> {
self.source_by_path(path)
}
}

/// An iterator over functions in a PE file.
pub type PeFunctionIterator<'s> = std::iter::Empty<Result<Function<'s>, PeError>>;

/// An iterator over source files in a PE file.
pub type PeFileIterator<'s> = std::iter::Empty<Result<FileEntry<'s>, PeError>>;
@@ -0,0 +1,15 @@
---
source: symbolic-debuginfo/tests/test_objects.rs
expression: "FilesDebug(&files[..10])"
---
/mxe/tmp-gcc-x86_64-w64-mingw32.static/gcc-11.3.0.build_/mingw-w64-v10.0.0/mingw-w64-crt/crt/crtexe.c
/mxe/tmp-gcc-x86_64-w64-mingw32.static/gcc-11.3.0.build_/mingw-w64-v10.0.0/mingw-w64-crt/crt/crtexe.c
/mxe/usr/x86_64-w64-mingw32.static/include/winnt.h
/mxe/usr/x86_64-w64-mingw32.static/include/psdk_inc/intrin-impl.h
/mxe/usr/x86_64-w64-mingw32.static/include/corecrt.h
/mxe/usr/x86_64-w64-mingw32.static/include/minwindef.h
/mxe/usr/x86_64-w64-mingw32.static/include/basetsd.h
/mxe/usr/x86_64-w64-mingw32.static/include/stdlib.h
/mxe/usr/x86_64-w64-mingw32.static/include/errhandlingapi.h
/mxe/usr/x86_64-w64-mingw32.static/include/processthreadsapi.h
