Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

reg_read from xzr / wzr on Arm64 / AArch64 broken in Unicorn 2.1: "Invalid argument (UC_ERR_ARG)" #2019

Open
Montg0mery opened this issue Oct 1, 2024 · 1 comment

Comments

@Montg0mery
Copy link

Previously in Unicorn 2.0.1.post1, you could call reg_read on xzr and wzr on Arm64 / Aarch64. In Unicorn 2.1.0 / 2.1.1 you cannot.

I've bisected the code and it works in commit d7a806c and fails in the next commit 4055a5a

This points to #1835 (for #1831).

Looking at the code, I think it's because those registers aren't covered by any of the if statements and the default return value for unhandled registers was also changed from UC_ERR_OK to UC_ERR_ARG, which by itself makes sense.

Now, you may ask why do you need to read from xzr / wzr - but it didn't throw an exception before, and now it does. Code that steps through an emulation and reads the source registers of any ldr / mov instructions or similar before making a decision about whether to continue stepping further or stop will run into this bug, for example. We could special case it on the consumer side, but there's a lot of code that relied on the previous behaviour, so this is an unexpected compatibility change which would require lots of special handling if not fixed.

In my opinion these 2 registers should have their own if / case statement and be handled appropriately by returning 0 (width-appropriate depending on which register was requested).

Minimal testcase based on the existing sample_arm64.py code for the binding:

#!/usr/bin/env python3
# Sample code for ARM64 of Unicorn. Nguyen Anh Quynh <aquynh@gmail.com>
# Python sample ported by Loi Anh Tuan <loianhtuan@gmail.com>

from unicorn import *
from unicorn.arm64_const import *


# code to be emulated
ARM64_CODE = b"\xEF\x03\x1F\xAA" # mov x15, xzr

# memory address where emulation starts
ADDRESS    = 0x10000

# Test ARM64 reading from xzr register
def test_arm64():
    try:
        # Initialize ARM64 emulator in ARM mode
        mu = Uc(UC_ARCH_ARM64, UC_MODE_ARM)

        # map 4kB memory for this emulation
        mu.mem_map(ADDRESS, 4 * 1024)

        # write machine code to be emulated to memory
        mu.mem_write(ADDRESS, ARM64_CODE)

        # initialize machine register
        mu.reg_write(UC_ARM64_REG_X15, 0x4141414141414141)

        # emulate machine code in infinite time
        mu.emu_start(ADDRESS, ADDRESS + len(ARM64_CODE))

        print("Emulation done.\n")

        print("UC_ARM64_REG_X15 const = %u" % UC_ARM64_REG_X15)
        x15 = mu.reg_read(UC_ARM64_REG_X15)
        print("x15 = 0x%x\n" % x15)

        print("UC_ARM64_REG_XZR const = %u" % UC_ARM64_REG_XZR)
        xzr_value = mu.reg_read(UC_ARM64_REG_XZR)
        print("xzr = 0x%x" % xzr_value)

    except UcError as e:
        print("ERROR: %s" % e)


if __name__ == '__main__':
    test_arm64()

This used to return:

Emulation done.

UC_ARM64_REG_X15 const = 214
x15 = 0x0

UC_ARM64_REG_XZR const = 7
xzr = 0x0

But now returns:

Emulation done.

UC_ARM64_REG_X15 const = 214
x15 = 0x0

UC_ARM64_REG_XZR const = 7
ERROR: Invalid argument (UC_ERR_ARG)

And the same is true for wzr (const = 6).

Thanks for your hard work, hopefully that makes sense.

@wtdcode
Copy link
Member

wtdcode commented Oct 1, 2024

I got the similar feedbacks of this behavior change from various sources, which I admit I should have highlight in the changelog.

Generally, I’m thinking of printing a warning to stderr (probably in the incoming 2.1.2) and changing this to an error in next minor version like 2.2.0.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants