Rdr 3.0 - Welcome

Welcome to the rdr project.

MAJOR UPDATE

This project is completely finished, and as such, is no longer under active development. Once I get some spare time, I will publish this version to opam, fix the mach threads hack (or wait until Rust no longer uses the unix threads load command), and probably call it a day!

Of course, if anyone has any suggestions for improvement, pull requests can still be submitted and I'll probably merge them (but we know that's never going to happen), and I might add a feature every now and then. But I consider rdr stable enough that I use it on a day-to-day basis, and I simply don't have the time to implement some of the nicer features. I hope you enjoy it, and have fun with it!

rdr is now version 3.0, supporting tools like bin2json, which further supports tools like the silicon element suite. Here are some (new) features:

  • PE32 support
  • Unified export/import model using the Goblin binary format, a kind of IR for binaries
  • Disassemble symbols in a binary (as opposed to just symbols in the map). This is still experimental and very much hacky, and llvm-mc must be installed. I'll figure out a better way soon, or write my own x86-64 and ARM64 disassembler, cause I'm crazy.
  • Print the Goblin representation: rdr -g
  • A slightly better symbol tree
  • Import library resolution for ELF, which looks up the imported symbol for a binary using the symbol map/tree
  • Better byte-coverage printing in addition to more extensive coverage
  • Scan the binary with a hexadecimal scan string - no spaces or 0x. rdr --scan 5589e58b450839450c0f4d450c bin/pe/libbeef.dll or rdr --scan deadbeef bin/pe/libbeef.dll
  • Disassemble at a particular offset (experimental): rdr --do 0x51f bin/pe/libbeef.dll
  • Print the particular binary's version of a "section": section headers for ELF, segments for mach-o, and section tables for PE: rdr --sections bin/elf/deadbeef.elf
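
To make the --scan bullet above a bit more concrete, here is a minimal Python sketch of what scanning for a hex string amounts to. This is not rdr's OCaml implementation; scan_hex and the sample bytes are made up for illustration.

```python
def scan_hex(data: bytes, pattern_hex: str) -> list[int]:
    """Return every offset in data where the bytes encoded by pattern_hex occur."""
    # "deadbeef" -> b"\xde\xad\xbe\xef"; a leading "0x" raises ValueError,
    # which mirrors the "no 0x" rule for the scan string.
    needle = bytes.fromhex(pattern_hex)
    if not needle:
        return []
    offsets = []
    start = data.find(needle)
    while start != -1:
        offsets.append(start)
        # Advance by one byte so overlapping matches are reported too.
        start = data.find(needle, start + 1)
    return offsets


# Hypothetical input: two copies of de ad be ef separated by padding.
sample = b"\x00\xde\xad\xbe\xef\x90\x90\xde\xad\xbe\xef"
print([hex(o) for o in scan_hex(sample, "deadbeef")])
```

In practice you would read the whole file with open(path, "rb").read() and pass that in as data.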

rdr is an OCaml tool/library for doing cross-platform analysis of binaries. I typically use it for looking up symbol names, finding the address offset, and then running gdb or lldb to mess around (you should be using both if you even know what you're doing).

I also find that it's useful for resolving linking errors if you're trying to build some project, especially some random, misconfigured Xcode project, or what have you.

Basically it's the best, free, cross-platform reverse engineering tool out there.

See the usage section for a list of features.

Currently, only:

  • 64-bit ELF
  • 64-bit Mach-o (also will suck out the first 64-bit binary found in a fat universal binary)
  • 32-bit PE32

binaries are supported (64-bit PE32, i.e. PE32+ coming soon).

Also, 32-bit binaries aren't cool anymore; stop publishing reverse engineering tutorials on them (in *nix land at least: apparently Microsoft still publishes 32-bit binaries for general consumption).

Happily, the project has no dependencies (besides the standard libs and unix and str). I have switched to an oasis build system however, and it's awesome, but does add some slight extra complexity (not really). See the install section for more details.

Install

Easy (OPAM)

Install with OPAM: opam install rdr

Slightly Less Easy (Manual)

NOTE This will not build on 32-bit systems.

  • You must have OCaml and findlib installed, and OCaml must be at least version 4.02 (I use the Bytes module and ppx annotations). You can install findlib through your package manager; on Arch it's currently ocaml-findlib.
  • You must run make, or execute ocaml setup.ml -configure && ocaml setup.ml -build (especially if on 64-bit Windows) in the base project directory.
  • You may then sudo make install (or sudo ocaml setup.ml -install) to copy the rdr binary to your /usr/local/bin, in addition to installing the library with findlib. Or you can just mv the generated binary, main.native, wherever you want, with whatever name, if that's your fancy.

Usage

Essentially, rdr performs two tasks, and should probably be two programs.

Binary Analysis

The first is pointing rdr at a binary. Example:

rdr /usr/lib/libc.so.6

It should output something like: ELF X86_64 DYN @ 0x20920. Which is boring.

You can pass it various flags: -e for printing the exports found in the binary (see this post on ELF exports for what I'm counting as an "export"), -i for imports, etc. For mach-o and PE32 binaries, exporthood and importhood are clearly defined, so a blog post detailing this isn't necessary (unless you want a detailed analysis of the mach binary format).

Some examples:

  • rdr -v - prints the version
  • rdr -h - prints a help menu
  • rdr -h /usr/lib/libc.so.6 - prints the program headers, bookkeeping data, and other bureaucratic aspects of binaries specific to the format you're analyzing
  • rdr -f printf /usr/lib/libc.so.6 - searches the libc.so.6 binary for an exported symbol named exactly "printf", and if found, prints its binary offset and size (in bytes). Watch out for _ prefixed symbols in mach and compiler private symbols in ELF. Definitely watch out for funny ($) symbols, like in mach-o Objective C binaries; you'll need to quote the symbol name to escape them, otherwise bash gets mad. Future: regexp search with multiple results, and searching imports as well.
  • rdr -D -f printf /usr/lib/libc.so.6 - disassembles the printf symbol if it's found.
  • rdr -l /usr/lib/libc.so.6 - lists the dynamic libraries libc.so.6 explicitly depends on (I'm looking at you dlsym).
  • rdr -i /usr/lib/libc.so.6 - lists the imports the binary depends on. NOTE: when run on Linux ELF binaries, if a system map has been built, it will use that to resolve each import's library. Depending on your machine, this can add a slight delay; sorry about that. On mach-o and PE this extra lookup isn't necessary, since imports are required to state where they come from, because the format was built by sane people (more or less).
  • rdr -G /usr/lib/libz.so.1.2.8 - graphs the libraries, imports, and exports of libz.so.1.2.8; run dot -O -n -Tpng libz.so.1.2.8.gv to make a pretty picture. Does a simple, hackish check to see if dot is in your ${PATH}, and if so, runs the above dot command for you - you should probably just install it before you run this. See the examples for rdr output.
  • rdr -s /usr/lib/libc.so.6 - prints the nlist/strippable symbol table, if it exists. Crappy programs like nm only use the strippable symbol table, even for exports and imports.
  • rdr -v /usr/lib/libc.so.6 - prints everything; you have been warned.
  • rdr -c /usr/lib/libc.so.6 - prints the byte coverage rdr generated for the binary
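
The "is dot in your ${PATH}" check mentioned for -G can be sketched roughly like this in Python. This is only an illustration of the idea, not how rdr does it internally; tool_on_path and render_graph are hypothetical names, while the dot arguments are the ones quoted above.

```python
import shutil
import subprocess


def tool_on_path(name: str) -> bool:
    """True if an executable called name can be found via PATH."""
    return shutil.which(name) is not None


def render_graph(gv_path: str, tool: str = "dot") -> bool:
    """Run `dot -O -n -Tpng <file>` if the tool is available.

    Returns False (and does nothing) when the tool is missing.
    """
    if not tool_on_path(tool):
        return False
    subprocess.run([tool, "-O", "-n", "-Tpng", gv_path], check=True)
    return True


print("dot available:", tool_on_path("dot"))
```

Returning False instead of raising keeps the missing-tool case cheap to handle; the caller can then just tell the user to install graphviz.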

Symbol Map

rdr can create a "symbol map" for you, in ${HOME}/.rdr/. What's that you ask? It's a map from exported symbol name -> list of exported symbols, where symbol information is offset, size, exporting library, etc. In the future I will add tags to the symbol; I'll explain what that means when the time comes.

Put another way, this is a map from symbol names (the keys) to lists of symbol information, because the relation from symbol name to symbol information is not a function. To put that less technically: for any given symbol name, malloc for example, you can have multiple libraries which provide (export) that same exact symbol. It is a one-to-many relationship.
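
If it helps, the one-to-many shape can be sketched as a dictionary of lists. The Python below is just an illustration of the data structure, not rdr's actual representation; Export, add_export, and lookup are hypothetical names, and the sample entries reuse the printf values from the lookup example further down.

```python
from collections import defaultdict
from typing import NamedTuple


class Export(NamedTuple):
    offset: int   # offset of the symbol inside the library
    size: int     # size in bytes
    library: str  # path of the exporting library


# symbol name -> every export that carries that name
symbol_map = defaultdict(list)


def add_export(name: str, export: Export) -> None:
    symbol_map[name].append(export)


def lookup(name: str) -> list:
    # .get avoids creating an empty entry for names that were never exported.
    return symbol_map.get(name, [])


add_export("printf", Export(0x30F90, 334, "/usr/lib/libtsan.so.0.0.0"))
add_export("printf", Export(0x4ED10, 161, "/usr/lib/libc-2.22.so"))
add_export("printf", Export(0x60C00, 284, "/usr/lib/libasan.so.2.0.0"))

for e in lookup("printf"):
    print(f"{e.offset:x} printf ({e.size}) -> {e.library}")
```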

Nevertheless, with such a map, we can perform a variety of useful activities, like looking up a symbol's offset in a library, its size, etc.

Why hasn't this existed before? I don't know.

You build the map first by invoking:

rdr -b

Which defaults to scanning /usr/lib/ for things it considers "binaries". Basically, it works pretty well.

If you want to recursively search, you give it a directory (or supply none at all, and it uses the default, /usr/lib), and the -r flag:

rdr -b -r -d "/usr/lib /usr/local/lib"

Spaces or colons (':') in the -d string separate different directories; with -r set, it searches each recursively.
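
As a rough Python sketch of that behavior (assumed, not lifted from rdr's source): split the -d string on spaces or colons, then either list each directory or walk it recursively. split_dirs and candidate_files are hypothetical helpers.

```python
import os
import re


def split_dirs(spec: str) -> list[str]:
    """Split a -d style string on runs of spaces or colons."""
    return [d for d in re.split(r"[ :]+", spec) if d]


def candidate_files(dirs, recursive=False):
    """Yield file paths in each directory; descend into subdirectories if recursive."""
    for d in dirs:
        if recursive:
            for root, _subdirs, files in os.walk(d):
                for name in files:
                    yield os.path.join(root, name)
        else:
            for entry in os.scandir(d):
                if entry.is_file():
                    yield entry.path


print(split_dirs("/usr/lib /usr/local/lib"))
print(split_dirs("/usr/lib:/usr/local/lib"))
```

One consequence of this kind of splitting is that a directory whose path itself contains a space or a colon can't be expressed in the string.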

Be careful (patient); on slow machines, this can take a whole bunch of time, especially on Linux, where everything and their mother put their garbage in /usr/lib (I'm looking at you node). But on the bright side, if you're lucky enough to have one, on a recent MBP, it's so fast it can build the map in realtime, and then do a symbol lookup (I don't do that).

Anyway, after you've built the map, you can perform exact symbol lookups, for example:

$ rdr -m -f printf
searching /usr/lib/ for printf:
           30f90 printf (334) -> /usr/lib/libtsan.so.0.0.0 [libtsan.so.0]
           4ed10 printf (161) -> /usr/lib/libc-2.22.so [libc.so.6]
           60c00 printf (284) -> /usr/lib/libasan.so.2.0.0 [libasan.so.2]

Where the output format for each symbol is offset symbol_name (size) -> /path/to/exporting/library [alias]. The alias is important for ELF, as it allows import resolution in the analyzed binaries (basically what the dynamic linker does --- it's awesome).
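
If you want to post-process that output in a script, a line in this format can be picked apart with a regular expression. The Python below is a sketch that assumes the format stays exactly as described above and that names and paths contain no whitespace; Hit and parse_hit are hypothetical names.

```python
import re
from typing import NamedTuple, Optional


class Hit(NamedTuple):
    offset: int
    name: str
    size: int
    library: str
    alias: Optional[str]  # None when the line has no [alias] part


# offset symbol_name (size) -> /path/to/exporting/library [alias]
LINE = re.compile(
    r"^\s*(?P<offset>[0-9a-fA-F]+)\s+(?P<name>\S+)\s+\((?P<size>\d+)\)"
    r"\s+->\s+(?P<library>\S+)(?:\s+\[(?P<alias>[^\]]+)\])?\s*$"
)


def parse_hit(line: str) -> Optional[Hit]:
    """Parse one result line; return None for anything else (e.g. the header)."""
    m = LINE.match(line)
    if m is None:
        return None
    return Hit(int(m["offset"], 16), m["name"], int(m["size"]),
               m["library"], m["alias"])


print(parse_hit("           4ed10 printf (161) -> /usr/lib/libc-2.22.so [libc.so.6]"))
```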

If you find a symbol you admire, you can disassemble it by adding the -D flag, using llvm-mc. This is an experimental feature and subject to change (it'll definitely have to stay in though, cause it's awesome).

Again, I do a simple, hackish check to see if llvm-mc is in your ${PATH}, and if so, the program is run, otherwise an error message is printed. However, to quote a C idiom: "this behavior is undefined" if llvm-mc isn't installed and in your ${PATH}.

Example with llvm-mc correctly installed:

$ rdr -D -m -f printf
searching /usr/lib/ for printf:
           4f0a0 printf (161) -> /usr/lib/libc-2.21.so
	.text
	subq	$216, %rsp
	testb	%al, %al
	movq	%rsi, 40(%rsp)
	movq	%rdx, 48(%rsp)
	movq	%rcx, 56(%rsp)
	movq	%r8, 64(%rsp)
	movq	%r9, 72(%rsp)
	je	55
	movaps	%xmm0, 80(%rsp)
	movaps	%xmm1, 96(%rsp)
	movaps	%xmm2, 112(%rsp)
	movaps	%xmm3, 128(%rsp)
	movaps	%xmm4, 144(%rsp)
	movaps	%xmm5, 160(%rsp)
	movaps	%xmm6, 176(%rsp)
	movaps	%xmm7, 192(%rsp)
	leaq	224(%rsp), %rax
	movq	%rdi, %rsi
	leaq	8(%rsp), %rdx
	movq	%rax, 16(%rsp)
	leaq	32(%rsp), %rax
	movl	$8, 8(%rsp)
	movl	$48, 12(%rsp)
	movq	%rax, 24(%rsp)
	movq	3464671(%rip), %rax
	movq	(%rax), %rdi
	callq	-44329
	addq	$216, %rsp
	retq

If you don't like AT&T syntax (FYI you should probably become a real hacker and learn to read and understand both syntax flavors), the lack of options, and a host of other issues w.r.t. disassembly, then you're out of luck for now. Maybe make a pull request?

You can also graph the library dependencies (the .gv file is generated at build time in ${HOME}/.rdr/) with rdr -m -G. Currently, it creates a library_dependency.png file; in the future, this will be named after the map it was generated from, once named maps become a thing. Also, this .png will probably be enormous.

This can be useful if, for example, you collate a series of binaries and shared libraries into a directory, have rdr build a map from that directory, and want to graph their interrelated dependencies. If you want it to look up the correct /usr/lib deps, then the full command might be something like: rdr -b -G -d "$(pwd):/usr/lib/", and that map's dependency graph will be in ${HOME}/.rdr/lib_dependency_graph.png.

Finally, and again at build time, a stats file is generated from the system map in ${HOME}/.rdr/; this simply counts the number of times each symbol was imported across every binary analyzed when the system map was built (so with no -d directory specified, the default is /usr/lib/, and it counts every time some symbol x was imported in every binary found in /usr/lib). Expect this file to change, or various other statistical files to be created in the ${HOME}/.rdr/ directory.
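
The counting itself is nothing fancy. Here is a small Python sketch of the idea, under the assumption that a symbol is counted once per binary that imports it (the real stats file may count differently); import_stats and the sample binaries are invented for illustration.

```python
from collections import Counter


def import_stats(imports_by_binary: dict) -> Counter:
    """Count, for each symbol, how many binaries import it."""
    counts = Counter()
    for imports in imports_by_binary.values():
        # set() so a symbol listed twice by one binary is only counted once
        counts.update(set(imports))
    return counts


# Hypothetical data: binary path -> names of imported symbols
sample = {
    "/usr/lib/liba.so": ["malloc", "free", "printf"],
    "/usr/lib/libb.so": ["malloc", "memcpy"],
    "/usr/lib/libc_user.so": ["malloc", "free"],
}

for symbol, n in import_stats(sample).most_common(2):
    print(symbol, n)
```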

Once versioned/named maps are implemented, the stats will be per map.

There are also times when you will want to grep symbols, maybe because you only know part of the name.

For now, this facility is enabled by writing a flattened symbol map to disk with rdr -m -w. The file is named symbols and is written to ${HOME}/.rdr/; you can grep it to your heart's content. It is flattened because every element in the list of symbol information that a symbol maps to is written out as its own line.

So, for example, grep -w "malloc" ~/.rdr/symbols yields:

0x16a50 malloc (13) E -> /usr/lib/ld-2.21.so 
0x576f0 malloc (303) E -> /usr/lib/libasan.so.1.0.0 
0x7a7b0 malloc (394) E -> /usr/lib/libc-2.21.so 
0x346f0 malloc (137) E -> /usr/lib/libgvpr.so.2.0.0 
0x5f90 malloc (1543) E -> /usr/lib/libjemalloc.so.1 
0xb290 malloc (267) E -> /usr/lib/liblsan.so.0.0.0 
0x19c0 malloc (299) E -> /usr/lib/libmemusage.so 
0x1200 malloc (33) E -> /usr/lib/libtbbmalloc_proxy.so.2 
0x1210 malloc (33) E -> /usr/lib/libtbbmalloc_proxy_debug.so.2 
0x367a0 malloc (2395) E -> /usr/lib/libtcmalloc.so.4.2.6 
0x3a640 malloc (2395) E -> /usr/lib/libtcmalloc_and_profiler.so.4.2.6 
0x3d740 malloc (718) E -> /usr/lib/libtcmalloc_debug.so.4.2.6 
0x1d2b0 malloc (2395) E -> /usr/lib/libtcmalloc_minimal.so.4.2.6 
0x242a0 malloc (702) E -> /usr/lib/libtcmalloc_minimal_debug.so.4.2.6 
0x4d020 malloc (175) E -> /usr/lib/libtsan.so.0.0.0 
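
Plain grep matches anywhere on the line, including the library path. If you only want to match on the symbol name, a field-aware search is easy to script. The Python below is a sketch that assumes the line layout shown above (offset, name, size, kind, arrow, path) and names without whitespace; search_symbols is a hypothetical helper, and the second sample line is invented to show the difference.

```python
def search_symbols(lines, fragment: str) -> list:
    """Return lines whose symbol-name field (the second field) contains fragment."""
    hits = []
    for line in lines:
        fields = line.split()
        if len(fields) < 2:
            continue  # blank or malformed line
        if fragment in fields[1]:
            hits.append(line.rstrip())
    return hits


sample_lines = [
    "0x7a7b0 malloc (394) E -> /usr/lib/libc-2.21.so \n",
    # Invented line: the path mentions malloc, the symbol name does not.
    "0x1300 free (33) E -> /usr/lib/libtbbmalloc_proxy.so.2 \n",
    "0x5f90 malloc (1543) E -> /usr/lib/libjemalloc.so.1 \n",
]

for hit in search_symbols(sample_lines, "mall"):
    print(hit)
```

To run it against the real file, pass an open file handle for ~/.rdr/symbols (after expanding the ~) instead of sample_lines.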

Project Structure

Because I just knew you were going to ask, I made this sweet graphic, just for you:

[Figure: project deps]

Examples

  • rdr -G /usr/lib/libz.so.1.2.8: libz so hard
  • See my gallery for more inspiring images of what you can do with rdr