Need help analyzing Windows shellcode or attacks coming from Metasploit Framework or Cobalt Strike (or maybe other malicious or obfuscated code)? Do you need to automate tasks with simple scripting? Do you want help decrypting MSF-generated traffic by extracting keys from payloads?
REW-sploit is here to help Blue Teams!
Here is a quick demo:
An introduction to the tool was recorded at Insomni'hack 2022:
https://www.youtube.com/watch?v=-sjM0k0hvMU
Installation is very easy. I strongly suggest creating a dedicated Python virtual environment for it:
# python -m venv <your-env-path>/rew-sploit
# source <your-env-path>/rew-sploit/bin/activate
# git clone https://github.com/REW-sploit/REW-sploit.git
# cd REW-sploit
# pip install -U setuptools
# pip install -r requirements.txt
# ./apply_patch.py -f
# ./rew-sploit
If you prefer, you can use the Dockerfile. To create the image:
docker build -t rew-sploit/rew-sploit .
and then start it (sharing the /tmp/ folder):
docker run --rm -it --name rew-sploit -v /tmp:/tmp rew-sploit/rew-sploit
You will notice an apply_patch.py script in the installation sequence. It is required to apply a small patch to the speakeasy-emulator (https://github.com/fireeye/speakeasy/) to make it compatible with REW-sploit. You can easily revert the patch with ./apply_patch.py -r if required.
Optionally, you can also install Cobalt-Strike Parser:
# cd REW-sploit/extras
# git clone https://github.com/Sentinel-One/CobaltStrikeParser.git
NOTE: from version 0.4.2 I switched to using the latest commit of Speakeasy-Emulator instead of the stable release. If you want to use the stable release, use
pip install -r requirements_stable.txt
REW-sploit is based on a couple of great frameworks, Unicorn and speakeasy-emulator (but also other libraries). Thanks to everyone and thanks to the OSS movement!
In general, while Red Teams have a lot of tools that help them automate attacks, Blue Teams are a bit "tool-less". So I decided to build something to help Blue Team analysis.
REW-sploit can take a shellcode/DLL/EXE, emulate its execution, and give you a set of information to help you understand what is going on. Examples of extracted information are:
- API calls
- Encryption keys used by MSF payloads
- Decrypted 2nd stage coming from MSF
- Cobalt-Strike configurations (if the Cobalt-Strike Parser is installed)
You can find several examples of the current capabilities below:
- RC4 Keys Extraction
- RC4 Keys Extraction + PCAP 2nd stage decryption
- ChaCha Keys Extraction
- Meterpreter session Decryption (no RSA)
- Cobalt-Strike beacon Emulation
- Cobalt-Strike config Extraction
- Debugging options
- Dumping Threads
- Dumping Memory Allocations
You probably know the Donut package, which can create position-independent code (PIC) from EXE, DLL, VBScript and JScript. In order to evade detection, Donut enumerates API exports using hashes computed on every API name, as many PIC payloads do. This is very CPU intensive (especially in an emulated environment like REW-sploit).
So I implemented a sort of shortcut (changed in the 0.3.3 release) that unhooks some of the slowest parts of emulation when a Donut stub is detected.
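To see why hash-based export enumeration is expensive, here is a minimal sketch of the general technique. It uses the well-known ROR13 hash as a stand-in: Donut's own hash function is different, and the names `ror13_hash` and `resolve_by_hash` are hypothetical. The point is only that each lookup hashes export names one by one until a match is found, and under emulation every one of those steps costs many emulated (and hooked) instructions.

```python
def ror32(value: int, count: int) -> int:
    """Rotate a 32-bit value right by `count` bits."""
    value &= 0xFFFFFFFF
    return ((value >> count) | (value << (32 - count))) & 0xFFFFFFFF


def ror13_hash(name: str) -> int:
    """ROR13 hash of an API name (illustrative only, not Donut's algorithm)."""
    h = 0
    for ch in name.encode("ascii"):
        h = (ror32(h, 13) + ch) & 0xFFFFFFFF
    return h


def resolve_by_hash(exports, wanted_hash):
    """Walk the export names, hashing each one until the hash matches."""
    for name in exports:
        if ror13_hash(name) == wanted_hash:
            return name
    return None


exports = ["ExitProcess", "GetProcAddress", "LoadLibraryA", "VirtualAlloc"]
wanted = ror13_hash("VirtualAlloc")  # a payload would embed only this value
print(hex(wanted), "->", resolve_by_hash(exports, wanted))
```

A real DLL such as kernel32.dll exports well over a thousand names, so the inner loop runs many times for each API the payload resolves.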
Also, in order to complete the emulation correctly, you need to give Speakeasy the DLLs so that it can get the complete exports. To do so, copy the following DLLs
kernel32.dll
mscoree.dll
ole32.dll
oleaut32.dll
wininet.dll
into the Speakeasy folder winenv/decoys/amd64 and/or winenv/decoys/x86 (see the Speakeasy README for details). If you don't need them, don't leave the DLLs there; otherwise they can slow down emulation.
For Donut 1.0 you may also want to add the following DLLs
combase.dll
shell32.dll
ntdll.dll
This Shikata Ga Nai implementation works just fine most of the time. In some cases it fails with an invalid read, so I implemented Fixup #4 for it.
A new command, emulate_antidebug, has been added in version 0.4: it should help identify anti-debug tricks used in the analyzed code, so that you can patch them when executing in a real debugging environment. This is an example of what has been implemented:
[#] Call to QueryPerformanceCounter() at 0x414049
[#] IsDebuggerPresent() at 0x4157e7
[#] CheckRemoteDebuggerPresent() at 0x415828
[#] Suspect NtQueryInformationProcess() at 0x4158b3
[#] Suspect NtQuerySystemInformation() at 0x415a22
[#] Direct access to PEB!BeingDebugged at 0x415a6a
[#] Direct access to PEB!NtGlobalFlag at 0x415a9e
[#] Suspect access to HeapBase (may be used to access Flags and ForceFlags) at 0x415b17
[#] GetProcAddress() of CRSS.EXE at 0x415b75
[#] Exclusive CreateFileA() on current process at 0x415bc4
[#] Call to GetLocalTime() at 0x415bfd
[#] Call to GetSystemTime() at 0x415c20
[#] Call to GetTickCount() at 0x415c52
[#] Call to QueryPerformanceCounter() at 0x415ca6
[#] Call to timeGetTime() at 0x415cd7
[#] Call to VirtualProtect() on "Return Address" at 0x4120b9
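Because each finding ends with `at 0x<address>`, the output is easy to post-process, for example to build a list of addresses to patch or to set breakpoints on in a real debugger. The following is a minimal sketch, assuming the output has been saved as text in the format shown above; the function name `parse_antidebug` is hypothetical and not part of REW-sploit.

```python
import re

# Matches lines such as "[#] IsDebuggerPresent() at 0x4157e7"
FINDING_RE = re.compile(r"^\[#\]\s+(?P<what>.+?)\s+at\s+(?P<addr>0x[0-9a-fA-F]+)\s*$")


def parse_antidebug(text):
    """Return a list of (address, description) tuples sorted by address."""
    findings = []
    for line in text.splitlines():
        match = FINDING_RE.match(line.strip())
        if match:
            findings.append((int(match.group("addr"), 16), match.group("what")))
    return sorted(findings)


sample = """\
[#] Call to QueryPerformanceCounter() at 0x414049
[#] IsDebuggerPresent() at 0x4157e7
[#] Suspect access to HeapBase (may be used to access Flags and ForceFlags) at 0x415b17
[#] Call to VirtualProtect() on "Return Address" at 0x4120b9
"""

for address, what in parse_antidebug(sample):
    print(f"{address:#x}  {what}")
```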
The emulate_payload command has 3 options to dump the content of several artifacts:
-T, --thread Dump CreateThread API content from lpStartAddress
-W, --writefile Dump WriteFile API content
-M, --writemem Dump VirtualAlloc API allocated content
With these options you can dump the content of memory areas or files during emulation. For example, the --writemem option dumps allocated memory when it is accessed for reading or execution (for example with a JMP into the area). A common behavior of malicious code is to allocate memory, decrypt data into it, and then execute or use it for additional purposes; this option lets you capture that content after decryption.
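The idea behind --writemem can be sketched without an emulator: remember each allocated region, ignore writes, and snapshot the region when it is read or executed, which is when the decrypted content is typically in place. The class below is a hypothetical illustration and not REW-sploit's code; in the real tool these events come from emulator hooks, and dumping only on the first access is an assumption made here for simplicity.

```python
class AllocTracker:
    """Track allocated regions and snapshot each one on first read/execute."""

    def __init__(self):
        self.regions = {}  # base address -> bytearray backing the region
        self.dumps = {}    # base address -> bytes captured on first access

    def on_alloc(self, base, size):
        """Called when the emulated code allocates memory (e.g. VirtualAlloc)."""
        self.regions[base] = bytearray(size)

    def on_write(self, addr, data):
        """Writes (e.g. a decryption loop) update memory but never dump."""
        base = self._find(addr)
        if base is not None:
            offset = addr - base
            self.regions[base][offset:offset + len(data)] = data

    def on_read_or_exec(self, addr):
        """The first read/execute inside a region captures its current content."""
        base = self._find(addr)
        if base is not None and base not in self.dumps:
            self.dumps[base] = bytes(self.regions[base])

    def _find(self, addr):
        for base, mem in self.regions.items():
            if base <= addr < base + len(mem):
                return base
        return None


tracker = AllocTracker()
tracker.on_alloc(0x10000, 8)
encrypted = bytes(b ^ 0x41 for b in b"PAYLOAD!")
tracker.on_write(0x10000, encrypted)                           # copy encrypted blob
tracker.on_write(0x10000, bytes(b ^ 0x41 for b in encrypted))  # decrypt in place
tracker.on_read_or_exec(0x10000)                               # JMP into the region
print(tracker.dumps[0x10000])
```

Dumping at access time rather than at allocation time is what makes the captured content the decrypted one.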
Obviously, emulation slows everything down. Moreover, hooking every instruction in order to interact with the execution makes things even slower. In general this works fine with small shellcode, but it has some issues with complex code. That's why I added an option to turn off hooking to speed up execution:
emulate_payload -P <path_to_filename> -U 0
In this way you can get a picture of what the emulated code is doing (with API tracking), but nothing else will be done (no fixups, no key extraction, etc.). If you specify a value other than 0, hooking is re-enabled when the IP (instruction pointer) reaches the specified address (fixups are applied from the same address).
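The behavior of the -U value described above can be pictured with a toy loop. This is only a sketch of the described semantics, with a hypothetical function name and a list of addresses standing in for real execution; it is not how REW-sploit is implemented.

```python
def hooked_addresses(trace, unhook_until):
    """Walk a list of instruction addresses and return those seen by the hook.

    unhook_until == 0 keeps hooking off for the whole run; any other value
    re-enables hooking once the instruction pointer reaches that address.
    """
    hooking = False
    hooked = []
    for ip in trace:
        if not hooking and unhook_until != 0 and ip == unhook_until:
            hooking = True
        if hooking:
            hooked.append(ip)  # per-instruction work (fixups, key extraction)
    return hooked


trace = [0x1000, 0x1002, 0x1005, 0x1009]
print(hooked_addresses(trace, 0))       # hooking never re-enabled
print(hooked_addresses(trace, 0x1005))  # hooked from 0x1005 onward
```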
In some cases emulation was simply breaking, for different reasons. In others, the obfuscation used techniques that confused the emulation engine. So I implemented some ad-hoc fixups (you can enable them with the -F option of the emulate_payload command). Fixups are implemented in modules/emulate_fixups.py. Currently the following are available:
Unicorn issue #1092:
#
# Fixup #1
# Unicorn issue #1092 (XOR instruction executed twice)
# https://github.com/unicorn-engine/unicorn/issues/1092
# #820 (Incorrect memory view after running self-modifying code)
# https://github.com/unicorn-engine/unicorn/issues/820
# Issue: self-modifying code in the same Translated Block (16 bytes?)
# Yes, I know...this is a huge kludge... :-/
#
FPU emulation issue:
#
# Fixup #2
# The "fpu" related instructions (FPU/FNSTENV), used to recover EIP, sometimes
# returns the wrong addresses.
# In this case, I need to track the first FPU instruction and then place
# its address in STACK when FNSTENV is called
#
Trap Flag evasion:
#
# Fixup #3
# Trap Flag evasion technique
# https://unit42.paloaltonetworks.com/single-bit-trap-flag-intel-cpu/
#
# The call of the RDTSC with the trap flag enabled, cause an unhandled
# interrupt. Example code:
# pushf
# or dword [esp], 0x100
# popf
# rdtsc
#
# Any call to RDTSC with Trap Flag set will be intercepted and TF will
# be cleared
#
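The flag manipulation behind Fixup #3 comes down to one bit: the example code sets bit 8 of EFLAGS (0x100, the Trap Flag), and the fixup clears it before RDTSC runs. A minimal sketch of that bit arithmetic follows; the function names are hypothetical and the real fixup operates on emulator registers rather than plain integers.

```python
TRAP_FLAG = 0x100  # EFLAGS bit 8 (TF), the bit set by `or dword [esp], 0x100`


def trap_flag_set(eflags: int) -> bool:
    """True when the Trap Flag bit is set in the given EFLAGS value."""
    return bool(eflags & TRAP_FLAG)


def clear_trap_flag(eflags: int) -> int:
    """Return EFLAGS with TF cleared, leaving every other flag untouched."""
    return eflags & ~TRAP_FLAG & 0xFFFFFFFF


eflags = 0x202 | TRAP_FLAG            # what the pushf/or/popf sequence produces
if trap_flag_set(eflags):
    eflags = clear_trap_flag(eflags)  # applied when RDTSC is about to execute
print(hex(eflags))
```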
Too few values on stack:
#
# Fixup #4
# Stack too small (not enough values stored)
#
# Some obfuscator/evasion technique try to access some values on the stack
# (like for example SGN https://github.com/EgeBalci/sgn.git):
#
# cmovne ax, word ptr [esp + 0xfa]
#
# In this case the emulation fails with an "invalid_read" since ESP is too
# close to the top of the stack. This creates some 'fake' values.
#
The file modules/emulate_rules.py contains the YARA rules used to intercept the interesting parts of the code, in order to implement instrumentation. I tried to comment these sections as much as possible so that you can create your own rules (please share them with a pull request if you think they can help others). For example:
#
# Payload Name: [MSF] windows/meterpreter/reverse_tcp_rc4
# Search for : mov esi,dword ptr [esi]
# xor esi,0x<const>
# Used for : this xor instruction contains the constant used to
# encrypt the length of the payload that will be sent as 2nd
# stage
# Architecture: x32
#
yara_reverse_tcp_rc4_xor_32 = 'rule reverse_tcp_rc4_xor { \
strings: \
$opcodes_1 = { 8b 36 \
81 f6 ?? ?? ?? ?? } \
condition: \
$opcodes_1 }'
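To make the byte pattern concrete, here is a standard-library sketch that searches for the same opcodes with a regular expression and reads the 4 wildcard bytes as the XOR constant. It is an illustration only: REW-sploit uses YARA for the match, the function name `find_xor_constants` is hypothetical, and the sample bytes are made up. x86 stores immediates in little-endian order, hence the `<I` format.

```python
import re
import struct

# Same byte pattern as the YARA rule: 8b 36 81 f6 ?? ?? ?? ??
# re.DOTALL lets "." match any byte value, including 0x0a.
PATTERN = re.compile(rb"\x8b\x36\x81\xf6(.{4})", re.DOTALL)


def find_xor_constants(blob: bytes):
    """Return (offset, constant) for every occurrence of the pattern."""
    return [(m.start(), struct.unpack("<I", m.group(1))[0])
            for m in PATTERN.finditer(blob)]


# Made-up bytes: nop; mov esi,[esi]; xor esi,0x12345678; nop
sample = bytes.fromhex("90 8b36 81f6 78563412 90")

for offset, constant in find_xor_constants(sample):
    print(f"match at offset {offset}: xor constant {constant:#010x}")
```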
Please open an Issue if you find something that does not work or that can be improved. Thanks!