-
Notifications
You must be signed in to change notification settings - Fork 23
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
significantly improve instruction printing efficiency #34
Conversation
for mem size labels: add one new "BUG" entry at the start of the array so `mem_size` does not need to be adjusted before being used to look up a string from the `MEM_SIZE_STRINGS` array. it's hard to measure the direct benefit of this, but it shrinks codegen size by a bit and simplfies a bit of assembly.... for segment reporting changes: stos/scas/lods do not actually need special segment override logic. instead, set their use of `es` when decoded, if appropriate. this is potentially ambiguous; in non-64bit modes the sequence `26aa` would decode as `stos` with explicit `es` prefix. this is now identical to simply decoding `aa`, which now also reports that there is an explicit `es` prefix even though there is no prefix on tne instruction. on the other hand, the prefix-reported segment now more accurately describes the memory selector through which memory accesses will happen. seems ok?
it is almost always the case that self.prefixes.segment == Segment::DS, meaning testing for it first avoids checking `self.operands[op].is_memory()` later. this overall avoids a few instructions in the typical path, rather than checking `is_memory()` first (which would always be true in the places this function is called from)
testing against six opcodes to see if we should print rep or repnz is a bit absurd. they are relatively rare instructions, so this is a long sequence of never-taken tests. we can avoid the whole thing in the common case by testing if there is any kind of rep prefix at all.
the match on opcode should have been dce, match on operands would only matter if there was a bug
this reduces a `slice::contains` to a single bit test, and regroups prefix printing to deduplicate checks of the `rep` prefix seemingly this reduces instruction counts by about 1%, cycles by 0.3% or so.
the reasoning for *why* `visit_operand` is better here lives as doc comments on `visit_operand` itself: it avoids going from scattered operand details to `enum Operand` only to deconstruct the enum again. instead, branch arms can get codegen'd directly against `struct Instruction` layout.
write_2 will never actually be used, but im adapting it into contextualize in a... better way
`name()` returning a `[u8; 2]` is nice when there is a specializing and unrolling write implementation, whereas `&str` might not consistently unroll into a simple 2-byte copy (rather than loop). it'll look a little more reasonable soon, hopefully..
it turns out that yaxpeax-arch's notion of colorization has been broken from the start for systems that do markup without inline sequences (e.g. windows/cmd.exe before vt100 support)
if mem_size is ever out of bounds thats a severe bug on its own
fix 32-bit 66-prefixed ff /2 call not having 16-bit operands fix momentary regression in rendering `call` instructions to string
this is also checked by a new fuzz target
ffi/ still needs... much more work
Actually, I believe safer_unchecked is not needed anymore, this should be checked by default in debug builds nowadays. fn main() {
let slice = &[1];
unsafe {
let _ = slice.get_unchecked(2);
}
} Compiling playground v0.0.1 (/playground)
Finished `dev` profile [unoptimized + debuginfo] target(s) in 0.47s
Running `target/debug/playground`
thread 'main' panicked at library/core/src/panicking.rs:220:5:
unsafe precondition(s) violated: slice::get_unchecked requires that the index is within the slice
note: run with `RUST_BACKTRACE=1` environment variable to display a backtrace
thread caused non-unwinding panic. aborting. see: rust-lang/rust#120594 (before that, the checks existed, but were optimized out unless you rebuilt the stdlib since the std is built in release mode) The checking in the std is also a lot more complete than just some bounds checks. Still no replacement for miri, but tells you when you're doing something blatantly wrong without a huge compatibility/perf hit. |
oh! well that's great. serves me right for using an old rust for testing |
this is where much of iximeow/yaxpeax-arch#7 originated.
std::fmt
as a primary writing mechanism has.. some limitations:memcpy
llvm/llvm-project#87440and some more interesting more fundamental limitations - writing to a
T: fmt::Write
means implementations don't know if it's possible to write bytes in reverse order (useful for printing digits) or if it's OK to write too many bytes and then only advancelen
by the correct amount (useful for copying variable-length-but-short strings like register names). these are both perfectly fine to aString
orVec
, less fine to do to a file descriptor like stdout.at the same time,
Colorize
and traits depending on it are very broken, for reasons described in yaxpeax-arch.so, this adapts
yaxpeax-x86
to use the newDisplaySink
type for writing, with optimizations where appropriate and output spans for certain kinds of tokens - registers, integers, opcodes, etc. it's not a perfect replacement for Colorize-to-ANSI-supporting-outputs but it's more flexible and i think can be made right.along the way this completes the move of
safer_unchecked
out to yaxpeax-arch (ty @5225225 it's still so useful), cleans up some docs, and comes with a few new test cases.because of the major version bump of yaxpeax-arch, and because this removes most functionality of the Colorize impl - it prints the correct words, just without coloring - this is itself a major version bump to 2.0.0. yay! this in turn is a good point to change the
Opcode
enums from being tuple-like to struct-like, and i've done so in 1b8019d.full notes in CHANGELOG ofc. this is notes for myself when i'm trying to remember any of this in two years :)