Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add some methods useful for locating debug symbols #39

Merged
merged 2 commits into from
Dec 19, 2017
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension


Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
1 change: 1 addition & 0 deletions Cargo.toml
Original file line number Diff line number Diff line change
Expand Up @@ -9,6 +9,7 @@ repository = "https://github.com/gimli-rs/object"

[dependencies]
goblin = "0.0.13"
uuid = "0.5.1"

[dev-dependencies]
memmap = "0.6"
Expand Down
9 changes: 8 additions & 1 deletion src/elf.rs
Original file line number Diff line number Diff line change
Expand Up @@ -3,7 +3,8 @@ use std::slice;

use goblin::{elf, strtab};

use {Machine, Object, ObjectSection, ObjectSegment, SectionKind, Symbol, SymbolKind, SymbolMap};
use {DebugFileInfo, Machine, Object, ObjectSection, ObjectSegment, SectionKind, Symbol, SymbolKind,
SymbolMap};

/// An ELF object file.
#[derive(Debug)]
Expand Down Expand Up @@ -148,6 +149,12 @@ where
fn is_little_endian(&self) -> bool {
self.elf.little_endian
}

fn has_debug_symbols(&self) -> bool {
self.section_data_by_name(".debug_info").is_some()
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not confident that this is correct. What if the file was generated with .debug_line only? What's the intended use for this? Should calllers only look for .gnu_debuglink if this returns false? I think it might be better to always use .gnu_debuglink if it exists.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actually, .debug_info exists even with -C debuginfo=1. Still might be better to always check for .gnu_debuglink though.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not 100% sure, honestly, except that .debug_info is a pretty strong signal. Calling strip on a binary will definitely remove .debug_info by default. This is mostly intended as a check for "can I load debug info from this binary, or do I need to look for an external symbol file?"

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is mostly intended as a check for "can I load debug info from this binary, or do I need to look for an external symbol file?"

There's also gdb "mini debug info" to consider: https://sourceware.org/gdb/onlinedocs/gdb/MiniDebugInfo.html#MiniDebugInfo

With this an executable might contain some minimal info (line table) but more data might be available elsewhere. gdb looks for direct debug info first; then tries separate debuginfo; then falls back to minidebug.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Following whatever gdb does here is probably the best bet for compatibility's sake.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks fine then.

}

fn debug_file_info(&self) -> Option<DebugFileInfo> { None }
}

impl<'data, 'file> Iterator for ElfSegmentIterator<'data, 'file> {
Expand Down
18 changes: 18 additions & 0 deletions src/lib.rs
Original file line number Diff line number Diff line change
Expand Up @@ -9,6 +9,7 @@
#![deny(missing_debug_implementations)]

extern crate goblin;
extern crate uuid;

use std::fmt;
use std::io::Cursor;
Expand All @@ -25,6 +26,8 @@ pub use pe::*;
mod traits;
pub use traits::*;

pub use uuid::Uuid;

/// An object file.
#[derive(Debug)]
pub struct File<'data> {
Expand Down Expand Up @@ -54,6 +57,13 @@ pub enum Machine {
X86_64,
}

/// Information from an object file that can be used to locate separate debug info.
#[derive(Debug, Clone, Copy)]
pub enum DebugFileInfo {
/// The UUID from a Mach-O `LC_UUID` load command.
MachOUuid(Uuid),
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wonder if it makes sense to have this be an associated type of the trait, something like:

impl Object for MachOFile {
    type DebugFileInfo = Uuid;

    fn get_debug_file_info(&self) -> Option<Uuid> { ... }
}

And then when we add support for GnuDebugLink for elf files it could have its own associated type as well.

For object files that don't have separate debug info, or we haven't implemented it yet, then we can use () as the associated type.

Thoughts on this approach?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure that returning an associated type is enough. The goal of the Object trait is to allow you to write cross-platform code that optionally allows you to only include support for the native platform. For debug_file_info, using an associated type for the return value is more useful if the return value can also be handled in cross-platform code, and the way we've handled that for other associated types is for each of them to satisfy a trait too. So the question is, what trait should this return value satisfy? And I don't have any good ideas for that.

Maybe the API should instead be fn debug_file_path(&self) -> Option<PathBuf>, and the implementations of this would use the moria crate (so moria would need to become independent of object).

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It gets a little twisty, certainly. I wasn't sure which way the dependency ought to go, just that I wanted a straightforward API for users. Having the APIs in object continue to be platform-agnostic from the outside seems sensible. One thing to consider is that moria is going to wind up dragging in a bunch of other dependencies because of what it wants to accomplish. In the future I might want to add APIs to locate debug symbols by fetching them from symbol servers over HTTP, which is going to pull in a whole world of things. Do we want all of that as dependencies on object, or would it be better to keep object simpler and have moria as an optional crate people can pull in to get that functionality?

The other way we can go with this is to provide a method on File that returns what's currently the FileInternal enum, to allow callers to delve into format-specific implementation details. Then we could just have methods like Mach::get_uuid and Elf::get_build_id which would be very straightforward.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think that exposing FileInternal would be any better for cross-platform code than the DebugFileInfo enum, and enum would be easier to use.

We do want to keep object dependencies simple. Even if we added debug_file_path, we would still need to expose the UUID for the more complex uses. So let's leave it how it is.

}

/// An iterator over the segments of a `File`.
#[derive(Debug)]
pub struct SegmentIterator<'data, 'file>
Expand Down Expand Up @@ -318,6 +328,14 @@ where
fn is_little_endian(&self) -> bool {
with_inner!(self.inner, FileInternal, |x| x.is_little_endian())
}

fn has_debug_symbols(&self) -> bool {
with_inner!(self.inner, FileInternal, |x| x.has_debug_symbols())
}

fn debug_file_info(&self) -> Option<DebugFileInfo> {
with_inner!(self.inner, FileInternal, |x| x.debug_file_info())
}
}

impl<'data, 'file> Iterator for SegmentIterator<'data, 'file> {
Expand Down
23 changes: 22 additions & 1 deletion src/macho.rs
Original file line number Diff line number Diff line change
Expand Up @@ -2,8 +2,11 @@ use std::fmt;
use std::slice;

use goblin::mach;
use goblin::mach::load_command::CommandVariant;
use uuid::Uuid;

use {Machine, Object, ObjectSection, ObjectSegment, SectionKind, Symbol, SymbolKind, SymbolMap};
use {DebugFileInfo, Machine, Object, ObjectSection, ObjectSegment, SectionKind, Symbol, SymbolKind,
SymbolMap};

/// A Mach-O object file.
#[derive(Debug)]
Expand Down Expand Up @@ -205,6 +208,24 @@ where
fn is_little_endian(&self) -> bool {
self.macho.header.is_little_endian()
}

fn has_debug_symbols(&self) -> bool {
self.section_data_by_name(".debug_info").is_some()
}

fn debug_file_info(&self) -> Option<DebugFileInfo> {
// Return the UUID from the `LC_UUID` load command, if one is present.
self.macho.load_commands.iter().filter_map(|lc| {
match lc.command {
CommandVariant::Uuid(ref cmd) => {
//TODO: Uuid should have a `from_array` method that can't fail.
Uuid::from_bytes(&cmd.uuid).ok()
.map(|uuid| DebugFileInfo::MachOUuid(uuid))
}
_ => None,
}
}).nth(0)
}
}

impl<'data, 'file> Iterator for MachOSegmentIterator<'data, 'file> {
Expand Down
12 changes: 11 additions & 1 deletion src/pe.rs
Original file line number Diff line number Diff line change
Expand Up @@ -3,7 +3,8 @@ use std::borrow;

use goblin::pe;

use {Machine, Object, ObjectSection, ObjectSegment, SectionKind, Symbol, SymbolKind, SymbolMap};
use {DebugFileInfo, Machine, Object, ObjectSection, ObjectSegment, SectionKind, Symbol, SymbolKind,
SymbolMap};

/// A PE object file.
#[derive(Debug)]
Expand Down Expand Up @@ -152,6 +153,15 @@ where
// characteristics flags, but these are obsolete.
true
}

#[inline]
fn has_debug_symbols(&self) -> bool {
// TODO: look at what the mingw toolchain does with DWARF-in-PE, and also
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

File a follow-up investigation issue and link it here?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Filed #40

// whether CodeView-in-PE still works?
false
}

fn debug_file_info(&self) -> Option<DebugFileInfo> { None }
}

impl<'data, 'file> Iterator for PeSegmentIterator<'data, 'file> {
Expand Down
9 changes: 8 additions & 1 deletion src/traits.rs
Original file line number Diff line number Diff line change
@@ -1,4 +1,4 @@
use {Machine, SectionKind, Symbol, SymbolMap};
use {DebugFileInfo, Machine, SectionKind, Symbol, SymbolMap};

/// An object file.
pub trait Object<'data, 'file> {
Expand Down Expand Up @@ -49,6 +49,13 @@ pub trait Object<'data, 'file> {

/// Return true if the file is little endian, false if it is big endian.
fn is_little_endian(&self) -> bool;

/// Return true if the file contains debug information sections, false if not.
fn has_debug_symbols(&self) -> bool;

/// Get `DebugFileInfo` that can be used to locate external debug symbols for the file
/// if present.
fn debug_file_info(&self) -> Option<DebugFileInfo>;
}

/// A loadable segment defined in an object file.
Expand Down