We will be doing this analysis in a Windows 10 VM with the flare-vm tools installed. Most of the tools listed below will come with flare-vm by default.
- CFF Explorer - Good tool to gloss through basic PE information
- dnSpy - For reversing .NET binaries
- HxD - Pretty good hex editor for windows
- IDA Pro - A must have. While the decompiler is not strictly necessary, it can make life much easier and we'll be using it extensively here
- Python 3 - Our weapon of choice. We'll also need to install some packages using the following pip command
pip install Pillow numpy pefile pycryptodome
- PowerShell - Comes with Windows by default. While we can survive with just Python, some actions involving the OS (e.g. listing files) are much simpler in PowerShell
The crackme is an executable called MBCrackme.exe.
When we run the application we see a GUI like this, containing a form with text fields asking for 3 different passwords and buttons labelled "Check!"
When we open it in CFF explorer, we see that the file type is a .NET assembly and it contains a directory called the .NET directory, which tells us that this is a .NET binary.
This means that we can analyze the binary in dnSpy which would allow us to see the C# code in the binary. Luckily for us, the variables in the code are properly named and aren't obfuscated.
Under the Form1 class, we see 3 important methods, button1_Click(object, EventArgs), button2_Click(object, EventArgs), and button3_Click(object, EventArgs), which are called when the corresponding form buttons are clicked.
Hence, this crackme challenge can be divided into 3 levels, each level corresponding to a password to find.
To find the first password, it's probably a good idea to look into the method that is called when the first button is clicked, button1_Click(object sender, EventArgs e).
private void button1_Click(object sender, EventArgs e)
{
if (this.textBox1.Text.Length == 0)
{
MessageBox.Show("Enter the password!");
return;
}
After checking that the submitted password is not blank (a check that is also present in the other 2 button click methods), it calls the decode(Bitmap, string) method from the same class. The arguments are the resource named mb_logo_star, which is actually the background image we see in the form GUI, and the first password we entered in the form.
// continued from button1_Click...
bool flag = false;
string text = this.textBox1.Text;
byte[] array = Form1.decode(Resources.mb_logo_star, text);
This method iterates through the pixels of the bitmap argument in column-major order. For each pixel, it takes the 3 least significant bits of the R channel, the 3 least significant bits of the G channel, and the 2 least significant bits of the B channel, combines them into a single byte b, then XORs b with a byte from the password_str argument. The results for all the pixels are returned as a byte array. This is a form of bitmap steganography, used to hide a payload that is encrypted with a simple repeating-key XOR cipher.
public static byte[] decode(Bitmap bm, string password_str)
{
byte[] bytes = Encoding.ASCII.GetBytes(password_str);
byte[] array = new byte[bm.Width * bm.Height];
int num = 0;
for (int i = 0; i < bm.Width; i++)
{
for (int j = 0; j < bm.Height; j++)
{
Color pixel = bm.GetPixel(i, j);
int num2 = Form1.keep_bits((int)pixel.R, 3);
int num3 = Form1.keep_bits((int)pixel.G, 3) << 3;
int num4 = Form1.keep_bits((int)pixel.B, 2) << 6;
byte b = (byte)(num2 | num3 | num4);
if (bytes.Length != 0)
{
b ^= bytes[num % bytes.Length];
}
array[num] = b;
num++;
}
}
return array;
}
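To make the bit layout concrete, here is a minimal Python sketch of how one pixel becomes one payload byte, mirroring the keep_bits calls and shifts in decode(). The name pack_pixel and the sample pixel values are made up for illustration.

```python
def pack_pixel(r: int, g: int, b: int) -> int:
    """Combine the low 3 bits of R, low 3 bits of G and low 2 bits of B into one byte."""
    low_r = r & 0b111           # ends up in bits 0-2 of the output byte
    low_g = (g & 0b111) << 3    # ends up in bits 3-5
    low_b = (b & 0b11) << 6     # ends up in bits 6-7
    return low_r | low_g | low_b

# Hypothetical pixel: R=0b10101101, G=0b11110010, B=0b00000111
print(hex(pack_pixel(0b10101101, 0b11110010, 0b00000111)))  # 0xd5
```

Since 3 + 3 + 2 = 8, every pixel carries exactly one byte of hidden data.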
After calling the decode() method and obtaining the decoded byte array from the resource, the original button1_Click() method truncates the array if it is larger than Form1.validSize_1, which is 241152, then computes the CRC32 checksum of the truncated array and compares it against Form1.validCrc32_1, which has the value of 2741486452 (0xA367C374).
// continued from button1_Click...
if (array.Length > Form1.validSize_1)
{
Array.Resize<byte>(ref array, Form1.validSize_1);
}
if (Crc32Algorithm.Compute(array) == Form1.validCrc32_1)
{
flag = true;
try
{
if (Form1.g_serverProcess == null || Form1.g_serverProcess.HasExited)
{
File.WriteAllBytes(this.g_serverPath, array);
flag = this.runProcess(this.g_serverPath);
}
}
// Exception handling and form GUI adjustments...
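The truncate-then-checksum step above can be sketched in Python. This assumes that Crc32Algorithm.Compute is the standard CRC-32 that zlib.crc32 also implements (the zlib check later in this writeup is consistent with that); the function name payload_is_valid is our own.

```python
import zlib

def payload_is_valid(decoded: bytes, valid_size: int, valid_crc32: int) -> bool:
    """Truncate to valid_size (like Array.Resize) and compare the CRC32."""
    truncated = decoded[:valid_size]
    return zlib.crc32(truncated) == valid_crc32

# Standard CRC-32 check value for the ASCII string "123456789"
print(payload_is_valid(b"123456789-extra-bytes", 9, 0xCBF43926))  # True
# The decimal and hex forms of validCrc32_1 are the same number
print(2741486452 == 0xA367C374)  # True
```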
Then, if the CRC32 checksums match, it dumps the decoded byte array into a file on disk whose path is given by this.g_serverPath. If we look into the definition of this.g_serverPath, we can see that the file being created is called level2.exe and is located in the temp directory.
private string g_serverPath = Path.Combine(Path.GetTempPath(), "level2.exe");
Then it calls the method runProcess(string) on that path, which starts a process with a hidden window from the executable at the given path. This suggests that the decoded data is an executable file.
private bool runProcess(string path)
{
if (Form1.g_serverProcess != null)
{
if (!Form1.g_serverProcess.HasExited)
{
Form1.g_serverProcess.Kill();
}
Form1.g_serverProcess.Close();
}
bool result = false;
try
{
Process process = new Process();
process.StartInfo.FileName = path;
process.StartInfo.WindowStyle = ProcessWindowStyle.Hidden;
result = process.Start();
Form1.g_serverProcess = process;
}
catch (Exception)
{
}
return result;
}
To summarize what we know so far,
- The password XOR-decrypts the ciphertext (embedded in the resource image) into an executable file
- The CRC32 checksum of this decrypted executable file is 0xA367C374
And we don't know
- The length of the password
- What kind of characters make up the password
The next few sections will talk about different methods to recover the password.
An obvious (but later shown to be infeasible) method would be to try brute forcing the password. This involves trying different values of the password, and verifying if the decrypted payload is correct by comparing the CRC32 checksum against 0xA367C374.
As mentioned previously, we don't know the character set or the length of the password, but we can make a few reasonable guesses. We know that we have to enter the password into the text box before it's used by the decode method, so the password would likely be made up of printable ASCII characters. As for the length, we can just try different values until we find a password of some length that matches.
First, we should extract the ciphertext from the image. To do that, we save the resource to disk (we'll call it mb_logo_star.png), then re-implement the extraction method without doing any XORing to get our ciphertext.
from PIL import Image
import numpy as np

def extract_data(image):
    """Pack the low bits of every pixel into one byte, column by column."""
    extracted = []
    # Outer loop over columns (x), inner loop over rows (y), like decode()
    for j in range(len(image[0])):
        for i in range(len(image)):
            r = 0b00000111 & image[i][j][0]
            g = (0b00000111 & image[i][j][1]) << 3
            b = (0b00000011 & image[i][j][2]) << 6
            extracted.append(r | g | b)
    return bytes(extracted)

image = np.asarray(Image.open('mb_logo_star.png'))  # 700 x 700 x 3
ciphertext = extract_data(image)[:241152]
Now we have the extracted ciphertext in the variable ciphertext. We also want to implement our XOR cipher
def xor(data, key):
    # Repeating-key XOR: the key wraps around every len(key) bytes
    return bytes([b ^ key[i % len(key)] for i, b in enumerate(data)])
Then we have what we need to start our brute force attack
import zlib
import string
import itertools

CHARSET = string.printable
for length in range(1, 50):
    print(f"Trying length {length}...")
    attempts = itertools.product(CHARSET, repeat=length)
    for attempt in attempts:
        # product() yields tuples of 1-character strings, so join and encode them
        key = ''.join(attempt).encode()
        if zlib.crc32(xor(ciphertext, key)) == 0xA367C374:
            print(f"Found password! {key}")
However, if we run it, we find that our script can barely get to passwords of length 3 before it grinds to a halt.
There are a few factors at play here. Firstly, the ciphertext is 241152 bytes long, so every call to xor(ciphertext, key) performs a lot of operations for each password attempt. The second, more pressing problem is that the search space grows exponentially with the password length. There are 100 printable ASCII characters (we can find this out with len(string.printable)), meaning that every additional character makes the search space 100 times bigger. As we later find out, the password was actually 49 characters long, which would mean trying up to 100 to the power of 49 (that's 1 followed by 98 zeros) passwords, roughly 18 orders of magnitude more than the commonly cited estimate of about 10^80 atoms in the observable universe.
This means that brute force is a no-go, and we should try something else.
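The arithmetic behind that conclusion can be checked quickly in Python. The guess rate below is an assumed, deliberately optimistic figure and not a measurement of our script.

```python
import string

charset_size = len(string.printable)   # 100 printable ASCII characters
key_length = 49

candidates = charset_size ** key_length
print(candidates == 10 ** 98)  # True

# Assumed rate: one billion password attempts per second
GUESSES_PER_SECOND = 10 ** 9
SECONDS_PER_YEAR = 365 * 24 * 60 * 60
years = candidates // (GUESSES_PER_SECOND * SECONDS_PER_YEAR)
print(f"about 10^{len(str(years)) - 1} years")  # about 10^81 years
```

Even at that rate, exhausting only the 49-character keys would take on the order of 10^81 years.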
A good idea would be to take a step back and take a look at our ciphertext for any clues. We can reuse the code we wrote previously in our brute-force attempt to get the ciphertext, and dump out the first few bytes to see if there's anything we can work with
>>> ciphertext[:100]
b'(;\xe3y\\leval_o\x91\x9a_a\xd4most_do.e_xor_pe_and_keep_going!easy_level_\x97ne_os\xd7as\xc0V\xa9N\xd6d\x13\xb5N&7\x19\x16\x7f\x11\x1c\x0b8\x19\x04\x08P<\x06\x01\x07\x01\x13\x01\x07\x04'
>>> ciphertext[:1000]
b"(;\xe3y\\leval_o\x91\x9a_a\xd4most_do.e_xor_pe_and_keep_going!easy_level_\x97ne_os\xd7as\xc0V\xa9N\xd6d\x13\xb5N&7\x19\x16\x7f\x11\x1c\x0b8\x19\x04\x08P<\x06\x01\x07\x01\x13\x01\x07\x04S\x0b*\x02E\x1f\x0bL\x1b =E2\x0e\x08\x08A~yU@one_xor\xfb\x9b\xdeL\x81\xe4\xb1\x1f\x8b\xef\xb00\xbf\xed\xba)\xeb\x8b\xf7$\x8b\xf9\xac\x1f\xe9\x89\xa6$\x07\xd5\xba.\xe0\xb3\xb0-\x9f\xe5\xa64\xed\x86\xbf/\xa1\xd5\xad/\xc0\xbd\xa1$\xae\xeb\xbb$\xed\x89\xb3$\x81\xd5\xb2/\xec\x82\xb3`\x80\xeb\xa69\xbf\xe6\xb16\xd7\xe6\x8a/*\x86\x8f \x88\xe7\xba30\xbcN/\x8f\xef\x8a8+\x91\x881\x84\xd5\xb4.66\x08\r\x85\xfa\x8a'oing!eas)\x1ale:di_\x80Mu>almost_d\x8fng^sn|Op\xa7_an\x96]keep_\x14zing1eas\x99_lev%l_\x7fne_clmist_doneYxor_pe_a\x9eg_kaep_goilga\xe4asi_luvel_\x7fneOalmostOdone_xor_pesSodckeep\x8fdo\x89og!easy_level_one_al\x8dlslQdonN^xsr_pe_and_keep_going!easy_leVNm_/ne_almost_do\x8ee_lnr_pe_and_keep_going!easy_lK\x02\x00\x14+one\xd8\xa1lmoct_d\xadne_|or_pe_and_keep\x7fgo\t@\x15E\x04\x15\x12y_\x1a=vel\xbfone\x05alm\xa9st_done_xor_pe\x1fan$q\x0f\x04\x11\x11_go}\xebe!e!ry_\x10gveL^one_almost_do.e_\xb8A\x00,\x02\x06_an\x84^kee\xa0\\gokng!\xf9bsy_level_one_!lm/]\x06:\x08\x00\re_`ar_p\x85\\ant_ke\xfbs_going!easy_l%ve._one_almost_done_xor_pe_and_keep_going!easy_level_one_almost_done_xor_pe_and_keep_going!easy_level_one_almost_done_xor_pe_and_keep_going!easy_level_one_almost_done_xor_pe_and_keep_going!easy_level_one_almost_done_xor_pe_and_keep_going!easy_level_one_almost_done_xor_pe_and_keep_going!easy_level_one_almos"
We can see the string easy_level_one_almost_done_xor_pe_and_keep_going! being repeated many times. This happens because the plaintext contains large blocks of null bytes, which leak the key into the ciphertext. Since plaintext XOR key = ciphertext, a plaintext byte of 0 gives ciphertext = 0 XOR key = key. The repetition comes from the key being repeated across the data, so wherever the plaintext contains a long run of null bytes (a very common occurrence in executables), we see the key repeated in the clear.
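The leak is easy to reproduce on toy data. In this sketch the key, the plaintext, and the helper most_common_block are all made up for illustration, and the key length is assumed to be known.

```python
from collections import Counter

def xor(data: bytes, key: bytes) -> bytes:
    return bytes(b ^ key[i % len(key)] for i, b in enumerate(data))

def most_common_block(data: bytes, size: int) -> bytes:
    """Return the most frequent size-byte substring of data."""
    counts = Counter(data[i:i + size] for i in range(len(data) - size + 1))
    return counts.most_common(1)[0][0]

# Toy plaintext: an 8-byte header followed by a long run of null bytes
toy_key = b"secret"
toy_plaintext = b"MZ\x90\x00hdr!" + b"\x00" * 60
toy_ciphertext = xor(toy_plaintext, toy_key)

print(toy_ciphertext[8:20])                  # b'cretsecretse'
print(most_common_block(toy_ciphertext, 6))  # b'cretse', a rotation of the key
```

The recovered block is a rotation of the key because the null run does not have to start at a multiple of the key length. In the real ciphertext, the trailing "!" makes it easy to see where the key ends.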
Hence, we try easy_level_one_almost_done_xor_pe_and_keep_going! as the key, and see that after decryption, our plaintext has the correct CRC32 checksum
>>> hex(zlib.crc32(xor(ciphertext, b"easy_level_one_almost_done_xor_pe_and_keep_going!")))
'0xa367c374'
Even though we already have the key, we can explore another method of decrypting the data for the sake of learning.
We know that the decoded file is an executable, which means that we have some information about what our decrypted plaintext is supposed to look like and so we can try using a known plaintext attack to get the XOR key.
This known plaintext attack vulnerability of XOR ciphers is described in its Wikipedia entry
In any of these ciphers, the XOR operator is vulnerable to a known-plaintext attack, since plaintext ⊕ ciphertext = key.
Windows executables follow a format called the Portable Executable (PE) format, which is described on MSDN here. They have a specific header format beginning with the famous magic bytes "MZ" and with other fields described here, but usually the first 0x3C bytes of most PE files are identical, so we can take those first bytes from any executable file we have lying around (like MBCrackme.exe) as our known plaintext, and XOR them with the start of our ciphertext to get the key
with open("MBCrackme.exe", "rb") as fp:
    known_plaintext = fp.read(0x3C)
# Only the first 0x3C bytes of the ciphertext have matching known plaintext
print(xor(ciphertext[:0x3C], known_plaintext))
which gives b'easy_level_one_almost_done_xor_pe_and_keep_going!easy_level_', clearly showing us our key.
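The recovered bytes contain the key followed by the start of its next repetition. A small helper (our own, not part of the original scripts) can find the key length by looking for the smallest period of that keystream.

```python
def smallest_period(stream: bytes) -> int:
    """Smallest p such that stream[i] == stream[i + p] wherever both exist."""
    for p in range(1, len(stream)):
        if all(stream[i] == stream[i + p] for i in range(len(stream) - p)):
            return p
    return len(stream)

recovered = b"easy_level_one_almost_done_xor_pe_and_keep_going!easy_level_"
period = smallest_period(recovered)
print(period)              # 49
print(recovered[:period])  # b'easy_level_one_almost_done_xor_pe_and_keep_going!'
```

This only works when the known plaintext is longer than the key, which is the case here (0x3C = 60 bytes against a 49-byte key).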
Now that we have our level 1 key, we can dump out the decrypted executable for analysis.
plaintext = xor(ciphertext, b"easy_level_one_almost_done_xor_pe_and_keep_going!")
with open('level2.exe', 'wb') as fp:
    fp.write(plaintext)
Opening the file in CFF Explorer, we see that it is not a .NET binary (sadly), so we have to look at it in IDA to find out what it does.
For the subsequent analysis in IDA, we will use a base address of 0x400000 when referring to subroutine addresses. By default, the image is loaded at that address, the reasons for which are explained in this article. However, when debugging, the image may be loaded at some other location like 0x1A0000 (usually because of ASLR; CFF Explorer can show us whether it is enabled), and this would cause all the addresses in our IDA database to be out of sync with the ones in this writeup. To change the base address that IDA uses to calculate the addresses, we can select "Edit"->"Segments"->"Rebase Program" under the menu bar.
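If we prefer not to rebase the database, translating an address by hand is simple arithmetic. A small sketch, where the runtime address is a hypothetical example that assumes the image was loaded at 0x1A0000:

```python
def rebase(address: int, old_base: int, new_base: int) -> int:
    """Translate an address from one image base to another."""
    return address - old_base + new_base

# Hypothetical address seen in the debugger with the image loaded at 0x1A0000
runtime_main = 0x1A1070
print(hex(rebase(runtime_main, 0x1A0000, 0x400000)))  # 0x401070
```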
We start our analysis at the main function at 0x401070, which first calls AddVectoredExceptionHandler to register the function at 0x401000 (which IDA has renamed Handler for us) as a vectored exception handler. We will examine what this Handler function does later when its use comes into play, but for now we will continue looking at what the main function does after that.
The main function then calls memset() to zero out 0x4E4B2 bytes of a stack variable that IDA has labelled Src, then calls sub_4011D0 with the following arguments
unk_414000 is a memory location in the .data section that contains strange data
If we look into sub_4011D0, we see that it gets the address of the PEB, then traverses the linked list in the structure to get one of the loaded modules (DLLs) in the process. Then, it passes the loaded module and a weird number 985953233 to the function sub_401250, which returns some sort of function pointer, and that returned function is called afterwards with most of the original arguments.
Even without looking into sub_401250, this behavior already looks very similar to API hashing, a technique used to obfuscate API calls, which makes static analysis difficult because we can't directly see which APIs are being called. It is a popular technique used by real malware like Dridex or Cobalt Strike, and it works by passing an API hash and a reference to a loaded module to an API resolving function (in this case sub_401250), which goes through all the APIs exported by that module, hashes their names, then compares each hash against the given one. If the resolver function finds an API which matches the given hash, it returns the address of that API.
Just so we have a clearer picture of what's going on before we analyze the resolver function sub_401250, we refer to some documentation (the structure is partially undocumented on MSDN so we have to refer to other sources) and find that Flink actually should have the pointer type _LDR_DATA_TABLE_ENTRY *, so we change the type (by pressing "Y") and find that v8 is from the field DllBase which points to the base address of the loaded module.
Now we can analyze sub_401250 with the correct pointer types (e.g. IMAGE_DOS_HEADER * for a1). After some renaming and cleaning up, we see the following code which represents the hashing algorithm
There are a few methods we can use to find out which APIs are being resolved.
The first uses dynamic analysis, where we just step over the API resolver function to see which API got resolved. This would probably be easier in our case, but sometimes it can be more troublesome especially if there are many resolved API calls or if extensive anti-debugging techniques are present.
The second method would be to re-implement the resolver function, which would allow us to know which APIs correspond to which hashes statically. This method also gives us the flexibility to write an IDAPython script to automatically resolve all calls within the binary if there are many such calls inside and resolving them individually would be too tedious.
Note that there are also existing tools like HashDB that you could explore, but we won't be using them today.
We would write something like this to re-implement the resolver function
import pefile

def rol4(x, n):
    # Rotate a 32-bit value left by n bits
    return ((x << n) | (x >> (32 - n))) & 0xFFFFFFFF

def hash_name(function_name):
    hash_value = 0xF00DF00D
    for b in function_name:
        hash_value = (rol4(hash_value, 5) ^ b) & 0xFFFFFFFF
    return hash_value

entry_export = [pefile.DIRECTORY_ENTRY["IMAGE_DIRECTORY_ENTRY_EXPORT"]]
exports = []
libraries = [
    'ntdll',
    'kernel32',
    'gdi32',
    'user32',
    'comctl32',
    'comdlg32',
    'ws2_32',
    'advapi32',
    'netapi32',
    'ole32',
    'winmm',
    'imm32',
    'bcrypt',
    'wmi',
]
for library in libraries:
    pe = pefile.PE(f'C:/windows/system32/{library}.dll')
    pe.parse_data_directories(directories=entry_export)
    exports += [e.name for e in pe.DIRECTORY_ENTRY_EXPORT.symbols if e.name]
    print(exports[-1])

# Export names are bytes, so iterating over them in hash_name() yields integers
hash_to_name = {hash_name(name): name for name in exports}
So now we have a dictionary hash_to_name that acts like the API resolver. If we give it our original hash 985953233, we find that it actually resolves to RtlDecompressBuffer.
>>> hash_to_name[985953233]
b'RtlDecompressBuffer'
Now that we know the API called, we realize that sub_4011D0 is in fact just a wrapper for RtlDecompressBuffer with the options COMPRESSION_FORMAT_LZNT1 | COMPRESSION_ENGINE_MAXIMUM, meaning that it just decompresses the 160345-byte buffer at unk_414000 using the LZNT1 algorithm, and places it into the stack variable Src.
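For a purely static alternative, the buffer could also be decompressed outside the process. The sketch below is our own minimal LZNT1 decompressor, written from the chunk layout in Microsoft's published description of the format (a 2-byte chunk header, flag bytes, and 16-bit offset/length tokens whose split depends on the position in the chunk). It has not been validated against the buffer at unk_414000, so treat it as an illustration rather than a drop-in replacement for RtlDecompressBuffer.

```python
import struct

def lznt1_decompress(data: bytes) -> bytes:
    out = bytearray()
    pos = 0
    while pos + 2 <= len(data):
        (header,) = struct.unpack_from("<H", data, pos)
        pos += 2
        if header == 0:
            break                          # end-of-stream marker
        size = (header & 0x0FFF) + 1       # size of the chunk data
        chunk = data[pos:pos + size]
        pos += size
        if not header & 0x8000:
            out += chunk                   # chunk stored uncompressed
            continue
        chunk_start = len(out)
        i = 0
        while i < len(chunk):
            flags = chunk[i]
            i += 1
            for bit in range(8):
                if i >= len(chunk):
                    break
                if not (flags >> bit) & 1:
                    out.append(chunk[i])   # literal byte
                    i += 1
                    continue
                (token,) = struct.unpack_from("<H", chunk, i)
                i += 2
                # The further into the chunk we are, the more bits go to the offset
                length_bits = 12
                p = len(out) - chunk_start - 1
                while p >= 0x10:
                    length_bits -= 1
                    p >>= 1
                length = (token & ((1 << length_bits) - 1)) + 3
                offset = (token >> length_bits) + 1
                for _ in range(length):
                    out.append(out[-offset])  # byte-wise copy allows overlap
    return bytes(out)

# Hand-built chunk: literal 'A', then a token copying 3 bytes from offset 1
print(lznt1_decompress(b"\x03\xb0\x02A\x00\x00"))  # b'AAAA'
```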
To extract the decompressed bytes, we run it dynamically by stepping over the call to RtlDecompressBuffer, then find the address of the output buffer (0x8F19DC in this case) and extract the file with the following IDAPython code
import ida_bytes
a = ida_bytes.get_bytes(0x8F19DC, 320690)
with open('decompressed_buf_2.bin', 'wb') as fp:
    fp.write(a)
Continuing with our analysis of main, after it decompresses the buffer, it makes another hash-resolved API call with a different hash 0xF4DD3DAD.
Our re-implemented resolver tells us that this is NtAllocateVirtualMemory
>>> hash_to_name[0xF4DD3DAD]
b'NtAllocateVirtualMemory'
and after some cleaning up, we can see that the decompressed data is run as code, telling us that the decompressed data is shellcode
However, when we take a look at the decompressed data in a hex editor, we find that it is a whole PE file with all the "MZ" and "PE" headers and not just some shellcode.
This is strange because loaders (level2.exe in this case) would normally have to do some PE header parsing to find the entry point before transferring control over, but this loader directly transfers control to the start of the file which is where the "MZ" header is at. If we run the loader dynamically and step into the call to the PE file, we find that the "MZ" header actually contains some valid instructions
If we take a closer look at the MZ header and do some digging online, we find this tool by hasherezade (the challenge author) which modifies a PE structure to make the header executable and move the reflective loading code into the PE itself, effectively "shellcodifying" the PE file, which is pretty neat. The MZ header of our decompressed PE file is very similar to the source code of this tool, which tells us that it's very likely this tool was used.
The main function just returns after the call to the shellcode / PE file, so we can conclude that the role of level2.exe is simply a loader that decompresses and loads a shellcodified PE file, and later analysis tells us that this PE file acts as some sort of server, so we will call it server.exe.
This binary is also not a .NET application, so we have to look at it in IDA again.
The main function allocates some memory, then calls sub_405E10, then sub_4061E0 with 2 arguments, one looking like a pipe name and the other being the number 1337, and lastly sub_4092A6 which just calls free.
Luckily for us, the API calls are not obfuscated like they were in the loader, so analysis is slightly easier.
sub_405E10 first places 34 DWORDs into an array on the stack, then calls sub_405CB0 for each one of them.
If we look at the code in sub_405CB0, it seems to be accessing some structure, and choosing which field to access based on some comparison. This behavior is a clue to us that there might be some sort of binary search tree (BST) structure involved here, since when we are searching for elements in a BST, we have to compare the current node against our search value then pick either the left or right nodes, which would be represented by different fields in our structure. More information about BSTs can be found here.
In addition, sub_405CB0 calls sub_405B50 which contains the string "map/set<T> too long", which is a smoking gun telling us that this is part of some C++ standard library function, more specifically std::set<int>. This set container uses a balanced binary tree to store its items internally. Rolf Rolles published an article which talks about reversing STL containers, which we can reference to create this struct
struct bbt_node
{
bbt_node *left;
bbt_node *parent;
bbt_node *right;
char field_C;
char is_terminal;
_BYTE gap_E[2];
int data;
};
and also conclude that sub_405CB0 is std::set::insert(), and that off_437E9C points to a std::set which contains the 34 numbers.
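The compare-then-pick-a-child pattern that gave the tree away looks like this in a minimal Python sketch. It is a plain, unbalanced BST with our own class and function names; the real std::set also rebalances on insert, which is omitted here.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class Node:
    data: int
    left: Optional["Node"] = None
    right: Optional["Node"] = None

def insert(root: Optional[Node], value: int) -> Node:
    """Insert value into the tree, ignoring duplicates like a set does."""
    if root is None:
        return Node(value)
    if value < root.data:
        root.left = insert(root.left, value)
    elif value > root.data:
        root.right = insert(root.right, value)
    return root

def contains(root: Optional[Node], value: int) -> bool:
    node = root
    while node is not None:
        if value == node.data:
            return True
        # The comparison decides which child pointer (struct field) is followed
        node = node.left if value < node.data else node.right
    return False

root = None
for h in (0xC81D63C9, 0x5B2839AC, 0x17DAD73F):  # three of the 34 values in the set
    root = insert(root, h)
print(contains(root, 0x5B2839AC), contains(root, 0xDEADBEEF))  # True False
```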
After adding the numbers to the set, it checks if any of the functions sub_402520, sub_402560, or sub_402320 returns 1, and if so executes the instruction int 3 which triggers a breakpoint exception.
The first function sub_402520 is relatively simple, only calling IsDebuggerPresent() and CheckRemoteDebuggerPresent() to check for any debuggers attached.
The second function sub_402560 calls IsBadReadPtr(0x7FFE0000, 0x3B8), then checks if 0x7FFE02D4 is set. A bit of searching gives us this very similar function (also from the challenge author), which tells us that 0x7FFE0000 is the address of the structure KUSER_SHARED_DATA. This means that 0x7FFE02D4 (offset 0x2D4) corresponds to the field KdDebuggerEnabled. In other words, this function checks whether a kernel debugger is enabled.
The last function sub_402320 takes the argument off_437E9C, which is our std::set structure. If we look inside, it calls CreateToolhelp32Snapshot then Process32First. After that, it takes the field szExeFile (the name of the executable file that the process is running; the .exe extension is not part of what gets hashed) and runs this hash algorithm, which is almost identical to the one used in the API hashing method, the only differences being the initial hash value and the fact that this one hashes the lowercased string.
Afterwards, it searches for the hash value in our set, and returns true if found. Then it calls Process32Next
then does the same thing in a loop. The function essentially scans the system for blacklisted processes, the hashes of which are found in the 34 numbers in our set. We can guess that the hashes are of analysis or sandbox tools, and this technique of evading analysis via system checks is used by real world malware, like ObliqueRAT which has an analysis process blacklist, or PlugX which checks for processes named "vmtoolsd".
Note that this function calls sub_4013E0 and sub_4014D0 which are both related to std::string. We know this because there are a lot of comparisons of some offset with the number 16, which is a byproduct of the short string optimization used by MSVC. The structure for std::string can be represented something like this (taken from Eleemosynator's writeup for the previous Malwarebytes crackme)
// Simplified layout of the std::string object
struct string_layout {
union contents_union {
char buffer[16];
char *data;
} contents; // offset 0x00
size_t size; // offset 0x10
size_t reserved; // offset 0x14
} ;
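To show why the number 16 keeps appearing, here is a minimal Python sketch that decodes such a 24-byte object from a 32-bit process. The read_memory callback is a hypothetical stand-in for whatever reads process memory (a debugger API, a memory dump, and so on).

```python
import struct

def read_msvc_string(obj: bytes, read_memory) -> bytes:
    """Decode a 24-byte, 32-bit MSVC std::string object.

    read_memory(address, size) is a hypothetical callback returning bytes.
    """
    size, reserved = struct.unpack_from("<II", obj, 0x10)
    if reserved < 16:
        # Short string: the characters live inline in the 16-byte buffer
        return obj[:size]
    # Long string: the first 4 bytes of the union hold a heap pointer
    (pointer,) = struct.unpack_from("<I", obj, 0)
    return read_memory(pointer, size)

# Short string stored inline (capacity 15)
short_obj = b"procexp".ljust(16, b"\x00") + struct.pack("<II", 7, 15)
print(read_msvc_string(short_obj, None))  # b'procexp'

# Long string stored on a fake heap at the made-up address 0x1000
fake_heap = {0x1000: b"hollows_hunter_long_name"}
long_obj = struct.pack("<I", 0x1000).ljust(16, b"\x00") + struct.pack("<II", 24, 31)
print(read_msvc_string(long_obj, lambda addr, n: fake_heap[addr][:n]))
```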
We now know that the three functions were just checking for signs of debugging, but it is still strange that the program would execute int 3 when it detects debugging. To answer this, we have to bring our attention back to the VEH Handler function in the loader level2.exe.
If the exception code is an access violation, the handler shows a failure message box and terminates the process. If the exception address is within the loaded binary and the exception is a breakpoint, it also terminates the process, which means that the int 3 instruction was just a way to terminate the process via the exception handler.
We now know that sub_405E10 is an anti-debugging function that terminates the process if debugging is detected. The next function called is sub_4061E0, which first creates a named mutex MB_Crackme_level2_mutex, and returns if it isn't able to create it. Real-world malware does this too, usually to ensure that only a single instance of the program is running.
Afterwards, it initializes the values in a structure on the stack with 2 fields, the pipe name and the address of sub_406290. We will label the fields in the struct as such
struct thread_struct
{
int value;
int subroutine;
};
Then it calls sub_4051E0 which creates a thread at sub_405210, passing in that struct as its parameters. It also calls sub_405A80 which creates another thread with the same struct with different values, but that part is for level 3 so we will focus on sub_405210 first.
sub_405210 calls CreateNamedPipeA to create a new named pipe at \\.\pipe\crackme_pipe, then reads from it, then creates a thread to sub_4032C0, then immediately waits for the thread to close, so it's basically a function call. sub_4032C0 reads from the pipe, then calls sub_406290 which it got from its thread struct argument. The result of calling sub_406290 is then written back to the same pipe, then it cleans up and closes itself.
sub_406290 hashes the string that was written to the pipe with the same hashing algorithm as the one used in the blacklisted-process check sub_402320, then checks if it's in the set at off_437E9C, which is the same set containing those 34 blacklisted process name hashes. If it is, it uses the string as an RC4 key to decrypt the following 29-byte-long string (represented in hex), then returns the decrypted string (to get written back to the pipe)
5a9558f17c6d62b5c2c68ad620f2f610d88fef4cd663468b1a0dbea251
We know it's RC4 because it calls the function sub_401200 which contains this block of code that creates the substitution box
which is a tell-tale sign of RC4, and we can find the KSA and PRGA after analyzing the rest of the function. More information about how to recognize RC4 constructs can be found in this video by OALabs.
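For reference, this is what those two phases look like in a compact pure-Python RC4, which makes it easier to match them against the decompiled code. It is a textbook implementation, not a transcription of sub_401200.

```python
def rc4(key: bytes, data: bytes) -> bytes:
    # KSA: start from the identity permutation 0..255 (the tell-tale loop),
    # then shuffle it with the key
    s = list(range(256))
    j = 0
    for i in range(256):
        j = (j + s[i] + key[i % len(key)]) & 0xFF
        s[i], s[j] = s[j], s[i]
    # PRGA: generate the keystream and XOR it with the data
    out = bytearray()
    i = j = 0
    for byte in data:
        i = (i + 1) & 0xFF
        j = (j + s[i]) & 0xFF
        s[i], s[j] = s[j], s[i]
        out.append(byte ^ s[(s[i] + s[j]) & 0xFF])
    return bytes(out)

# Widely published RC4 test vector
print(rc4(b"Key", b"Plaintext").hex())  # bbf316e8d940af0ad3
```

Encryption and decryption are the same operation, so the same function can be used to decrypt the 29-byte string once a candidate key is known.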
To find out what gets written into the pipe and what happens to the decrypted data, we have to switch our focus back to the original .NET executable MBCrackme.exe, at button2_Click.
private void button2_Click(object sender, EventArgs e)
{
if (this.textBox2.Text.Length == 0)
{
MessageBox.Show("Enter the password!");
return;
}
bool flag = false;
string pipeName = "crackme_pipe";
string text = this.textBox2.Text;
byte[] array = null;
try
{
NamedPipeClientStream namedPipeClientStream = new NamedPipeClientStream(".", pipeName);
namedPipeClientStream.Connect(1000);
StreamWriter streamWriter = new StreamWriter(namedPipeClientStream);
TextReader textReader = new StreamReader(namedPipeClientStream);
streamWriter.WriteLine(text);
streamWriter.Flush();
string s = textReader.ReadLine();
array = Encoding.ASCII.GetBytes(s);
if (Crc32Algorithm.Compute(array) == Form1.validCrc32_2)
{
flag = true;
}
}
// Exception and GUI handling...
It connects to the pipe and writes the level 2 password to it, then reads from it and checks if the CRC32 checksum of the response is equal to Form1.validCrc32_2, which is 0x1DC85E5D.
This means that the pipe serves as a communication channel between server.exe and MBCrackme.exe.
Tip: if you want to debug the application and write to the pipe, it takes only 5 lines to do it in powershell
$npipeClient = new-object System.IO.Pipes.NamedPipeClientStream('.', 'crackme_pipe')
$npipeClient.Connect()
$pipeWriter = new-object System.IO.StreamWriter($npipeClient)
$pipeWriter.Write('stuff to write to the pipe')
$pipeWriter.Flush()
To summarize what we know so far,
- There are 34 names of analysis processes that it compares hashes against
- The level 2 password is one of the analysis process names, but might not be lowercase itself
- The password is used as the RC4 key to decrypt a 29-byte long stack string
- The RC4 decrypted string has a CRC32 checksum of 0x1DC85E5D (after removing the trailing null byte)
Hence, to find out what the level 2 password could be, we have to first find out what processes are being blacklisted. To do this, we will take the executable names of as many analysis processes as we can, then hash their names and check if their hashes match any of the 34.
We can start with gathering all executable files on our system with this powershell line. Our flare-vm environment comes with many analysis tools that could give us a hit, so this is worth a shot.
Get-ChildItem -Path C:\ -Filter *.exe -Recurse -ErrorAction SilentlyContinue | ForEach-Object { $_.BaseName } | Out-File -Encoding ascii executables.txt
This GitHub repo also contains the names of some analysis processes, so we format it (remove the .exe extension) and add it to our list executables.txt.
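The same list can also be gathered from Python, which sidesteps any text-encoding differences between the shell and our scripts. This is a minimal sketch using only the standard library; the function name collect_exe_names is our own.

```python
import os

def collect_exe_names(root: str) -> list[str]:
    """Return the sorted, de-duplicated base names of all .exe files under root."""
    names = set()
    # os.walk skips directories it cannot read instead of raising
    for _dirpath, _dirnames, filenames in os.walk(root):
        for filename in filenames:
            stem, ext = os.path.splitext(filename)
            if ext.lower() == ".exe":
                names.add(stem)
    return sorted(names)

# Usage on the analysis VM (commented out because it scans the whole drive):
# with open("executables.txt", "w") as fp:
#     fp.write("\n".join(collect_exe_names("C:\\")))
```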
We write a script to test the hashes
def rol4(x, n):
    # Rotate a 32-bit value left by n bits
    return ((x << n) | (x >> (32 - n))) & 0xFFFFFFFF

def hash_name(text, initial_hash_value):
    hash_value = initial_hash_value
    for b in text:
        hash_value = (rol4(hash_value, 5) ^ b) & 0xFFFFFFFF
    return hash_value

HASHES = [
    0xC81D63C9, 0x5B2839AC, 0x17DAD73F, 0x72C7241C,
    0x58E483ED, 0x82134662, 0x34204667, 0x4CD53A71,
    0x34206499, 0xFFDEB191, 0x7AC6410B, 0xEA3503AA,
    0xCCFA2924, 0x3A09FFBC, 0x38EA0C1B, 0x58E479EC,
    0x1B964E1A, 0x707F9D9A, 0xF5A79701, 0x09F5473B,
    0xBA635AC6, 0x0BB18A65, 0x46119FD8, 0xFB7BF6AF,
    0x3F75D54B, 0x49110E9F, 0x5D9F9FD8, 0x5DCC9FD8,
    0x8293C33E, 0x5D112314, 0x9D9F8189, 0xC10AE786,
    0x67D8B725, 0x07FE9020,
]

with open('executables.txt', 'r') as fp:
    # One name per line; splitlines() keeps names that contain spaces intact
    executables = fp.read().splitlines()
executables = list(set(executables))

hash_to_name = {}
for executable in executables:
    # The server hashes the lowercased process name
    processed_name = executable.lower().encode()
    hash_value = hash_name(processed_name, 0xBADC0FFE)
    hash_to_name[hash_value] = executable

for hash_value in HASHES:
    if hash_value in hash_to_name:
        print(hex(hash_value), hash_to_name[hash_value])
    else:
        print(hex(hash_value), "not found")
We run it and get this mapping
> python .\match_hashes.py
0xc81d63c9 ollydbg
0x5b2839ac ProcessHacker
0x17dad73f tcpview
0x72c7241c autoruns
0x58e483ed autorunsc
0x82134662 filemon
0x34204667 procmon
0x4cd53a71 regmon
0x34206499 procexp
0xffdeb191 idaq
0x7ac6410b idaq64
0xea3503aa ImmunityDebugger
0xccfa2924 Wireshark
0x3a09ffbc dumpcap
0x38ea0c1b HookExplorer
0x58e479ec ImportREC
0x1b964e1a PETools
0x707f9d9a LordPE
0xf5a79701 SysInspector
0x9f5473b proc_analyzer
0xba635ac6 sysAnalyzer
0xbb18a65 sniff_hit
0x46119fd8 windbg
0xfb7bf6af joeboxcontrol
0x3f75d54b joeboxserver
0x49110e9f ResourceHacker
0x5d9f9fd8 x32dbg
0x5dcc9fd8 x64dbg
0x8293c33e Fiddler
0x5d112314 httpdebugger
0x9d9f8189 Vmwaretray
0xc10ae786 pe-sieve
0x67d8b725 hollows_hunter
0x7fe9020 pin
We then try each of them and see if it matches the CRC32 checksum, using this script
import Crypto.Cipher.ARC4
import zlib
possible_passwords = [
"ollydbg", "ProcessHacker", "tcpview", "autoruns",
"autorunsc", "filemon", "procmon", "regmon",
"procexp", "idaq", "idaq64", "ImmunityDebugger",
"Wireshark", "dumpcap", "HookExplorer", "ImportREC",
"PETools", "LordPE", "SysInspector", "proc_analyzer",
"sysAnalyzer", "sniff_hit", "windbg", "joeboxcontrol",
"joeboxserver", "ResourceHacker", "x32dbg", "x64dbg",
"Fiddler", "httpdebugger", "Vmwaretray", "pe-sieve",
"hollows_hunter", "pin",
]
# Try lowercase too
possible_passwords += [s.lower() for s in possible_passwords]
ciphertext = bytes.fromhex("5a9558f17c6d62b5c2c68ad620f2f610d88fef4cd663468b1a0dbea251")
for password in possible_passwords:
    # RC4 requires a password length of at least 5
    if len(password) < 5:
        continue
    cipher = Crypto.Cipher.ARC4.new(password.encode())
    decrypted_string = cipher.decrypt(ciphertext).rstrip(b'\x00')
    if zlib.crc32(decrypted_string) == 0x1DC85E5D:
        print("Found match!")
        print(f"Password: {password}")
        print(f"Decrypted: {decrypted_string}")
Running it gives us
> python .\find_password.py
Found match!
Password: ProcessHacker
Decrypted: b'we_are_good_to_go_to_level3!'
So we now know that the password for level 2 is ProcessHacker. We are good to go to level 3!
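The script above skips candidates shorter than five characters (such as pin) because of the library's key-length check, so a dependency-free RC4 can be handy for trying every name. The following is a minimal sketch of textbook RC4 in pure Python. The function name rc4 is ours, and it assumes that sub_401200 is standard RC4, which the successful decryption above suggests but which is not re-verified here.

```python
def rc4(key: bytes, data: bytes) -> bytes:
    """Textbook RC4: key scheduling followed by keystream XOR."""
    # Key-scheduling algorithm (KSA)
    s = list(range(256))
    j = 0
    for i in range(256):
        j = (j + s[i] + key[i % len(key)]) & 0xFF
        s[i], s[j] = s[j], s[i]
    # Pseudo-random generation algorithm (PRGA), XORed with the input
    i = j = 0
    out = bytearray()
    for byte in data:
        i = (i + 1) & 0xFF
        j = (j + s[i]) & 0xFF
        s[i], s[j] = s[j], s[i]
        out.append(byte ^ s[(s[i] + s[j]) & 0xFF])
    return bytes(out)


# Well-known public test vector: key "Key", plaintext "Plaintext"
print(rc4(b"Key", b"Plaintext").hex())  # bbf316e8d940af0ad3
```

Because RC4 is symmetric, the same function both encrypts and decrypts, and it accepts keys of any non-zero length.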
Quite a lot happened, so we have this figure to summarize this level
Now that we know what the server writes to the pipe, we can shift our attention back to the original MBCrackme.exe. Looking at the last part of button2_Click, it calls LoadNext.Load.
// button2_Click continued
if (flag)
{
this.button2.Enabled = false;
this.textBox2.Enabled = false;
this.button2.BackColor = Color.OldLace;
this.button3.Enabled = true;
this.textBox3.Enabled = true;
this.button3.BackColor = SystemColors.ActiveCaption;
MessageBox.Show("Level up!");
LoadNext.Load(Form1.g_serverProcess, array);
return;
}
This method base64-decodes LoadNext.EncArr, decrypts the result with AES (using the string we_are_good_to_go_to_level3! returned in level 2 as the password), decompresses it with gzip (LoadNext.DecompressBytes), loads the output with Assembly.Load, and finally runs the RunMe method of the Level3Bin.Class1 class.
public static int Load(Process process1, byte[] password)
{
try
{
Type type = Assembly.Load(LoadNext.DecompressBytes(AES.decryptContent(Convert.FromBase64String(LoadNext.EncArr), password))).GetType("Level3Bin.Class1");
object obj = Activator.CreateInstance(type);
Type[] types = new Type[]
{
typeof(Process)
};
MethodInfo method = type.GetMethod("RunMe", types);
object[] parameters = new object[]
{
process1
};
method.Invoke(obj, parameters);
}
// Exception handling...
This tells us that the decoded content after all those steps is some .NET binary.
AES.decryptContent hashes the returned password with SHA-256, feeds that hash and a fixed salt into PBKDF2 (1000 iterations) to derive an AES key and IV, then decrypts the content with AES in CBC mode.
public static byte[] decryptContent(byte[] fileContent, byte[] password)
{
MemoryStream memoryStream = new MemoryStream();
byte[] password2 = SHA256.Create().ComputeHash(password);
byte[] salt = new byte[]
{
5,
3,
3,
7,
8,
0,
0,
8
};
RijndaelManaged rijndaelManaged = new RijndaelManaged();
rijndaelManaged.KeySize = 256;
rijndaelManaged.BlockSize = 128;
Rfc2898DeriveBytes rfc2898DeriveBytes = new Rfc2898DeriveBytes(password2, salt, 1000);
rijndaelManaged.Key = rfc2898DeriveBytes.GetBytes(rijndaelManaged.KeySize / 8);
rijndaelManaged.IV = rfc2898DeriveBytes.GetBytes(rijndaelManaged.BlockSize / 8);
rijndaelManaged.Mode = CipherMode.CBC;
try
{
CryptoStream cryptoStream = new CryptoStream(memoryStream, rijndaelManaged.CreateDecryptor(), CryptoStreamMode.Write);
cryptoStream.Write(fileContent, 0, fileContent.Length);
cryptoStream.FlushFinalBlock();
cryptoStream.Close();
}
// Exception handling...
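The key derivation in decryptContent can be sketched with the Python standard library alone. This sketch relies on two properties of .NET's Rfc2898DeriveBytes: the three-argument constructor uses HMAC-SHA1, and consecutive GetBytes calls continue the same output stream, so the 32-byte key and 16-byte IV are simply the first 48 bytes of PBKDF2 output. The function name derive_key_iv is ours.

```python
import hashlib


def derive_key_iv(password: bytes):
    """Mirror decryptContent's key setup: SHA-256, then PBKDF2-HMAC-SHA1."""
    hashed_password = hashlib.sha256(password).digest()
    salt = bytes([5, 3, 3, 7, 8, 0, 0, 8])
    # 32 bytes of key followed by 16 bytes of IV, 1000 iterations
    stream = hashlib.pbkdf2_hmac("sha1", hashed_password, salt, 1000, dklen=48)
    return stream[:32], stream[32:]


key, iv = derive_key_iv(b"we_are_good_to_go_to_level3!")
print(len(key), len(iv))  # 32 16
```

Spelling out the hash function and iteration count makes it clear which library defaults a decryption script depends on.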
We want to decrypt the content, but the base64 string is too long to view completely in dnSpy, which cuts it off with a "[...string is too long...]" message at the end. To get the whole string, we have to right click and select "Edit IL instructions".
This way we can have access to the whole string under ldstr.
Afterwards we save it in a file called b64_encoded_string.txt, and write the following script to get the decoded data
import base64
import gzip
import hashlib
import Crypto.Cipher.AES
import Crypto.Util.Padding
import Crypto.Protocol.KDF
with open('b64_encoded_string.txt', 'r') as fp:
    text = fp.read()
encrypted_data = base64.b64decode(text)
salt = bytes([5, 3, 3, 7, 8, 0, 0, 8])
password = b'we_are_good_to_go_to_level3!'
password = hashlib.sha256(password).digest()
# 1000 iterations, matching the Rfc2898DeriveBytes call; 32-byte key + 16-byte IV
rfc_2898_bytes = Crypto.Protocol.KDF.PBKDF2(password, salt, dkLen=48, count=1000)
aes_key = rfc_2898_bytes[:32]
aes_iv = rfc_2898_bytes[32:]
aes_cipher = Crypto.Cipher.AES.new(aes_key, Crypto.Cipher.AES.MODE_CBC, aes_iv)
compressed_data = aes_cipher.decrypt(encrypted_data)
compressed_data = Crypto.Util.Padding.unpad(compressed_data, Crypto.Cipher.AES.block_size)
actual_data = gzip.decompress(compressed_data)
with open('decrypted_data.bin', 'wb') as fp:
    fp.write(actual_data)
The decrypted data turns out to be a DLL file, and is a .NET binary like we suspected
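The same check that CFF Explorer performs can be approximated in a few lines: a PE file is a .NET binary when its CLR runtime header data directory (index 14) is populated. The sketch below parses only the fields needed for that test. The function name has_clr_header is ours, and the parsing is deliberately minimal rather than a full PE validator.

```python
import struct

CLR_DIRECTORY_INDEX = 14  # IMAGE_DIRECTORY_ENTRY_COM_DESCRIPTOR


def has_clr_header(data: bytes) -> bool:
    """Return True if the buffer looks like a PE with a CLR (.NET) directory."""
    try:
        if data[:2] != b"MZ":
            return False
        (e_lfanew,) = struct.unpack_from("<I", data, 0x3C)
        if data[e_lfanew:e_lfanew + 4] != b"PE\x00\x00":
            return False
        opt = e_lfanew + 24  # skip the signature (4) and COFF header (20)
        (magic,) = struct.unpack_from("<H", data, opt)
        if magic == 0x10B:      # PE32
            dirs = opt + 96
        elif magic == 0x20B:    # PE32+
            dirs = opt + 112
        else:
            return False
        (count,) = struct.unpack_from("<I", data, dirs - 4)
        if count <= CLR_DIRECTORY_INDEX:
            return False
        rva, size = struct.unpack_from("<II", data, dirs + CLR_DIRECTORY_INDEX * 8)
        return rva != 0 and size != 0
    except struct.error:
        return False


# Example usage (file name taken from the script above):
# print(has_clr_header(open("decrypted_data.bin", "rb").read()))
```

Run against the native DLL that level3.dll drops later, the same function would be expected to return False.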
We rename the file level3.dll, then put it in dnSpy for analysis. We look at the function RunMe first because that's what's called by MBCrackme.exe.
public static int RunMe(Process process1)
{
try
{
string tempFileName = Class1.GetTempFileName("dat");
if (Class1.DropTheDll(tempFileName))
{
DllInj.InjectToProcess(process1, tempFileName);
}
}
catch (IOException ex)
{
MessageBox.Show(ex.Message, "Error!", MessageBoxButtons.OK, MessageBoxIcon.Hand);
}
return 0;
}
Class1.DropTheDll just drops a DLL file into a .dat file in the temp directory with a random file name, then DllInj.InjectToProcess injects the DLL into the server.exe process. This technique is called DLL injection, and is frequently used by real malware.
public static int InjectToProcess(Process targetProcess, string dllName)
{
IntPtr hProcess = DllInj.OpenProcess(1082, false, targetProcess.Id);
IntPtr procAddress = DllInj.GetProcAddress(DllInj.GetModuleHandle("kernel32.dll"), "LoadLibraryA");
IntPtr intPtr = DllInj.VirtualAllocEx(hProcess, IntPtr.Zero, (uint)((dllName.Length + 1) * Marshal.SizeOf(typeof(char))), 12288U, 4U);
UIntPtr uintPtr;
DllInj.WriteProcessMemory(hProcess, intPtr, Encoding.Default.GetBytes(dllName), (uint)((dllName.Length + 1) * Marshal.SizeOf(typeof(char))), out uintPtr);
DllInj.CreateRemoteThread(hProcess, IntPtr.Zero, 0U, procAddress, intPtr, 0U, IntPtr.Zero);
return 0;
}
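InjectToProcess writes the DLL path into memory allocated inside the target process and starts a remote thread at LoadLibraryA with that path as its argument. The numeric constants in the decompiled code are easier to read once they are decoded into their Windows flag names, which the following sketch does. The helper decode_flags and the table names are ours; the flag values are the documented Windows constants.

```python
PROCESS_ACCESS = {
    0x0002: "PROCESS_CREATE_THREAD",
    0x0008: "PROCESS_VM_OPERATION",
    0x0010: "PROCESS_VM_READ",
    0x0020: "PROCESS_VM_WRITE",
    0x0400: "PROCESS_QUERY_INFORMATION",
}
ALLOCATION_TYPE = {0x1000: "MEM_COMMIT", 0x2000: "MEM_RESERVE"}
PAGE_PROTECTION = {0x04: "PAGE_READWRITE", 0x40: "PAGE_EXECUTE_READWRITE"}


def decode_flags(value: int, names: dict) -> list:
    """Split a bitmask into known flag names, keeping unknown bits as hex."""
    found = [name for bit, name in sorted(names.items()) if value & bit]
    leftover = value & ~sum(names)
    if leftover:
        found.append(hex(leftover))
    return found


print(decode_flags(1082, PROCESS_ACCESS))    # OpenProcess access mask
print(decode_flags(12288, ALLOCATION_TYPE))  # VirtualAllocEx allocation type
print(decode_flags(4, PAGE_PROTECTION))      # VirtualAllocEx protection
```

The access mask is exactly the set of rights needed to allocate and write memory in another process and then create a thread in it.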
CFF explorer tells us that the injected DLL is not a .NET binary, so we will analyze it in IDA
Here we will use 0x10000000 as the base address when referring to addresses.
This DLL doesn't have any exports, only a DllMain.
Looking at the segments, we see two unusual segments, .detourc and .detourd, which are not standard in normal PE files. This will be useful later.
When loaded, the DLL outputs a debug message saying it is hooking some process, and when unloaded it outputs another debug string saying that it is unhooking the process.
sub_10004D00 seems to be acquiring some lock, then calls sub_10004520, which calls VirtualProtect to change the permissions of some memory block to RWX.
sub_10005260 suspends the thread it takes as the argument if it's not the current thread.
sub_10004570 is a wrapper for sub_10004590, and takes in 2 arguments, one being some global variable and the other being some function. If we cross-reference the global variables (pressing "X" in IDA), we can see that dword_1002AE60 points to CryptStringToBinaryA (initialized in sub_10001000), dword_1002AE64 points to GetCursorPos (initialized in sub_10001010), and dword_1002AE68 points to Sleep (initialized in sub_10001020). The initialization functions are all called in dllmain_crt_process_attach, which runs before DllMain.
Within sub_10004590, it calls many functions that check for certain jmp or nop opcodes in the subroutine given in the global variable, like sub_100049F0, which calls sub_100043C0 (which gets a jump destination relative to the first instruction), or sub_10003BA0, which returns the size of the first instruction if it is a nop or int 3.
This behavior suggests that some sort of API hooking is happening, more specifically inline hooking, because the hook function has to inspect the API function's instructions to set up the trampoline properly and make sure that the program doesn't crash.
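To make these checks concrete, here is a rough sketch of the kind of prologue inspection an inline hooking library performs on 32-bit x86 code. It is not the Detours implementation or a decompilation of the functions above. The names jmp_destination and filler_size are ours, and only the simplest encodings (jmp rel32, jmp rel8, single-byte nop and int 3) are handled.

```python
import struct


def jmp_destination(code: bytes, address: int):
    """Return the target of a leading relative jmp, or None if there is none."""
    if len(code) >= 5 and code[0] == 0xE9:        # jmp rel32
        (rel,) = struct.unpack_from("<i", code, 1)
        return (address + 5 + rel) & 0xFFFFFFFF
    if len(code) >= 2 and code[0] == 0xEB:        # jmp rel8
        (rel,) = struct.unpack_from("<b", code, 1)
        return (address + 2 + rel) & 0xFFFFFFFF
    return None


def filler_size(code: bytes) -> int:
    """Return 1 if the first instruction is a one-byte nop or int 3, else 0."""
    if code and code[0] in (0x90, 0xCC):
        return 1
    return 0


# A jmp rel32 with displacement 0x10 placed at 0x10001000
print(hex(jmp_destination(bytes.fromhex("e910000000"), 0x10001000)))  # 0x10001015
```

A hooking library needs this information because the bytes it overwrites with its own jmp must be relocated into the trampoline without changing their meaning.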
If we search online for information on the .detourc and .detourd sections, we find an instrumentation library by Microsoft called Detours that uses inline hooking. If we do more digging, we can find material written by the challenge author with the following example usage code
void hook_apis()
{
DetourTransactionBegin();
DetourUpdateThread(GetCurrentThread());
DetourAttach(&(PVOID&)pMessageBoxA, my_MessageBoxA);
DetourTransactionCommit();
}
void unhook_apis()
{
DetourTransactionBegin();
DetourUpdateThread(GetCurrentThread());
DetourDetach(&(PVOID&)pMessageBoxA, my_MessageBoxA);
DetourTransactionCommit();
}
It seems that sub_10004D00 is DetourTransactionBegin, sub_10005260 is DetourUpdateThread, and sub_10004570 is DetourAttach. This gives us even more confirmation that this DLL performs inline hooking. We will explore the modified behavior of the functions later when it comes into play.
For now, we will turn our attention back to the original binary MBCrackme.exe, specifically button3_Click.
TcpClient tcpClient = new TcpClient("127.0.0.1", 1337);
byte[] array = Encoding.ASCII.GetBytes(this.textBox3.Text);
NetworkStream stream = tcpClient.GetStream();
stream.Write(array, 0, array.Length);
array = new byte[256];
string text = string.Empty;
int count = stream.Read(array, 0, array.Length);
text = Encoding.ASCII.GetString(array, 0, count);
if (text.Length > 10)
{
this.label4.Text = text;
this.button3.BackColor = Color.OldLace;
this.textBox3.Enabled = false;
this.button3.Enabled = false;
}
MessageBox.Show(text);
It connects to the TCP socket at 127.0.0.1:1337, writes the third password to it, then reads the reply and shows it in a message box. If the reply is longer than 10 characters, it is also placed in label4 and the level 3 controls are disabled.
To find out how the listening socket gets set up and what gets written to it, we have to revisit server.exe, and take a look at the other thread at sub_4056F0 that was created by sub_405A80, which is called by sub_4061E0. It first creates a socket and prepares the structures to point to the socket address 127.0.0.1:1337.
Then it binds that address to the socket and listens on it
When it accepts a connection, it calls sub_406530, which first prepares the following 72-byte stack string (in hex representation)
7FB19BA3DBB87A983EE96B2FACC4405A420F905F5CF19CAB32791BF50CCAA306C4454A4AF61D592141DAF3C7BAEFEEA32D0D82451735D334CBDCC3D7B35B5EFA673FE269EF02415A
Afterwards, it takes the submitted password, then calls CryptStringToBinaryA with CRYPT_STRING_BASE64 as the dwFlags option.
Then, it takes the decoded string (in pbBinary) and for every byte, calls GetCursorPos and compares the x value against the decoded byte rotated
- left or right depending on whether the index is even or odd (left for even indices, right for odd ones), and
- by how much depending on the y value returned (y % 8 bits)
If the comparison succeeds, it replaces the decoded byte with the original byte rotated the other way by the same amount.
Then, it calls Sleep.
Afterwards, it uses the resulting bytes as an RC4 key (with the same RC4 function sub_401200 used in level 2) to decrypt the stack string prepared earlier.
This seems arbitrary because the cursor position could really be anywhere, but remember that these three APIs were hooked by the injected DLL, so it's time to analyze how their behavior changes after being hooked.
The proxy function (the function whose code is run instead of the hooked function) for CryptStringToBinaryA is sub_10002990 (from the injected DLL). It outputs some debug strings about the arguments, then calls the real CryptStringToBinaryA. Afterwards, it sets a global variable, which we will call g_counter, to 4.
sub_10002B10 is the proxy function for GetCursorPos. It doesn't actually call the real GetCursorPos function, but instead returns x and y values taken as bytes from the arrays byte_1002A000 and byte_1002A020, indexed by the value of g_counter.
The arrays contain non-ASCII values
sub_10002B60 is the proxy function for Sleep, and it simply increments g_counter.
Taken together, this means that the processing of the decoded base64 string is not arbitrary: it has nothing to do with our mouse position and is in fact deterministic.
Turning our attention back to the password processing function sub_4056F0 inside server.exe, we will assume that the comparison of the rotated byte to the x value matches every time (if it doesn't, the decoded byte can take any other value, which leaves us with no information). So, we will find the values of the decoded bytes that pass all the checks using a bit of algebra.
We will call the index i, the decoded byte ciphertext[i], and the new value that gets set plaintext[i]. For even i, if the comparison is to succeed, we must have
rol(ciphertext[i], y_val % 8) == x_val
plaintext[i] == ror(ciphertext[i], y_val % 8)
We can rotate right both sides of the first line to get
ciphertext[i] == ror(x_val, y_val % 8)
and then substitute it in the second line to get
plaintext[i] == ror(ror(x_val, y_val % 8), y_val % 8)
and since rotating right by the same value twice is the same as rotating right by twice the value,
plaintext[i] == ror(x_val, (y_val * 2) % 8)
For odd i, rol and ror swap roles, so we get plaintext[i] == rol(x_val, (y_val * 2) % 8) instead.
With that, we can write the following python script to get the plaintext
x_array = [
0x95, 0xb9, 0x63, 0x59, 0xdc, 0xb5, 0x58, 0xc6,
0x6c, 0x5f, 0x68, 0x6f, 0x6f, 0xad, 0xdc, 0x5f,
0x6d, 0x58, 0xda, 0x65, 0x5f, 0x58, 0xd7, 0x62,
0x69, 0x9d, 0xd7, 0x91, 0x96, 0x99, 0x66, 0x65,
0x9c,
]
y_array = [
0x83, 0x1b, 0x89, 0x20, 0x37, 0x8b, 0x57, 0xc6,
0x78, 0x74, 0x00, 0xc4, 0x48, 0x83, 0xdb, 0x7c,
0x48, 0x49, 0x8b, 0x48, 0xf8, 0x49, 0xff, 0x24,
0x74, 0x93, 0x53, 0x03, 0x4a, 0x03, 0xc0, 0x48,
]
def rol(x, n):
    return ((x << n) | (x >> (8 - n))) & 0xFF
def ror(x, n):
    return ((x >> n) | (x << (8 - n))) & 0xFF
ans = []
counter = 4
for i in range(33):
    x_val = x_array[counter % 33]
    y_val = y_array[counter % 32]
    if i % 2 == 0:
        plaintext = ror(x_val, (2 * y_val) % 8)
    else:
        plaintext = rol(x_val, (2 * y_val) % 8)
    ans.append(plaintext)
    counter += 1
print(bytes(ans))
which gives b'small_hooks_make_a_big_difference'.
And to get the flag, we just use it as the RC4 key to decrypt the stack string
import Crypto.Cipher.ARC4
password = b'small_hooks_make_a_big_difference'
ciphertext = bytes.fromhex("7FB19BA3DBB87A983EE96B2FACC4405A420F905F5CF19CAB32791BF50CCAA306C4454A4AF61D592141DAF3C7BAEFEEA32D0D82451735D334CBDCC3D7B35B5EFA673FE269EF02415A")
cipher = Crypto.Cipher.ARC4.new(password)
decrypted_string = cipher.decrypt(ciphertext).rstrip(b'\x00')
print(decrypted_string)
giving us flag{you_got_this_best_of_luck_in_reversing_and_beware_of_red_herrings}.
This figure summarizes what happened in this level
Even though we already got the flag, I want to briefly go through what were the possible values for the third password, for the sake of completeness.
There are actually a lot of valid values of the password here. The only condition for it to be correct is that it decodes to small_hooks_make_a_big_difference after all the rotating. This means that if we base64 encode that string itself, i.e. c21hbGxfaG9va3NfbWFrZV9hX2JpZ19kaWZmZXJlbmNl, then that would be a valid password. We could also have rotated the bytes and let the check correct it for us, so something like ua2wsWz1aPZvbZv1bbBbZV+wryaW7PqMpcxmZZOs3GOy would also work.
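Note that each byte of the decoded password can either be taken from the final string (small_hooks_make_a_big_difference) or be its rotated counterpart, which gives up to 2 possibilities per byte. Since the decoded password has 33 bytes, this gives at most 2 to the power of 33 (8589934592) possible values for password #3. The real count is lower, because a byte whose rotation amount is 0 is identical to its rotated counterpart.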
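The two example passwords can be reproduced, and checked against a model of the server-side logic, with a short sketch. It assumes the rules described earlier: rotation by y % 8, left for even indices and right for odd ones, with the hooked values starting at counter 4. The names server_transform, hooked_values and rotated_variant are hypothetical helpers of ours, not functions from the binaries.

```python
import base64

X_ARRAY = [
    0x95, 0xb9, 0x63, 0x59, 0xdc, 0xb5, 0x58, 0xc6, 0x6c, 0x5f, 0x68,
    0x6f, 0x6f, 0xad, 0xdc, 0x5f, 0x6d, 0x58, 0xda, 0x65, 0x5f, 0x58,
    0xd7, 0x62, 0x69, 0x9d, 0xd7, 0x91, 0x96, 0x99, 0x66, 0x65, 0x9c,
]
Y_ARRAY = [
    0x83, 0x1b, 0x89, 0x20, 0x37, 0x8b, 0x57, 0xc6, 0x78, 0x74, 0x00,
    0xc4, 0x48, 0x83, 0xdb, 0x7c, 0x48, 0x49, 0x8b, 0x48, 0xf8, 0x49,
    0xff, 0x24, 0x74, 0x93, 0x53, 0x03, 0x4a, 0x03, 0xc0, 0x48,
]


def rol(x, n):
    n %= 8
    return ((x << n) | (x >> (8 - n))) & 0xFF


def ror(x, n):
    n %= 8
    return ((x >> n) | (x << (8 - n))) & 0xFF


def hooked_values(i):
    """x value and rotation amount the hooked GetCursorPos yields for byte i."""
    counter = 4 + i
    return X_ARRAY[counter % 33], Y_ARRAY[counter % 32] % 8


def server_transform(decoded: bytes) -> bytes:
    """Model of the per-byte check: un-rotate a byte when the comparison hits."""
    out = bytearray(decoded)
    for i, byte in enumerate(decoded):
        x_val, n = hooked_values(i)
        if i % 2 == 0 and rol(byte, n) == x_val:
            out[i] = ror(byte, n)
        elif i % 2 == 1 and ror(byte, n) == x_val:
            out[i] = rol(byte, n)
    return bytes(out)


def rotated_variant(i):
    """The rotated byte that passes the comparison at index i."""
    x_val, n = hooked_values(i)
    return ror(x_val, n) if i % 2 == 0 else rol(x_val, n)


key = b"small_hooks_make_a_big_difference"
plain_password = base64.b64encode(key).decode()
rotated_bytes = bytes(rotated_variant(i) for i in range(len(key)))
rotated_password = base64.b64encode(rotated_bytes).decode()
# Bytes where the rotated form really differs from the final key byte
free_bytes = sum(1 for i in range(len(key)) if rotated_bytes[i] != key[i])

print(plain_password)
print(rotated_password)
print(free_bytes)
```

Both printed passwords pass through server_transform to the same RC4 key. With these arrays free_bytes comes out as 24, because nine positions have a rotation amount of 0, so under the stated assumptions there are 2 to the power of 24 distinct decoded byte strings that work.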
In this crackme, we saw techniques like
- Steganography
- Vectored exception handling
- API hash obfuscation
- "Shellcodified" PE files
- Anti-debugging system checks
- RC4 encryption
- The use of pipes and network sockets for inter process communication
- DLL injection
- Inline hooking
While analyzing all of them wasn't strictly necessary for getting the flag, they are used in real malware so there is utility in understanding how they are used. In addition, the code wasn't obfuscated much so this is a good place to get exposed to these techniques. Kudos to hasherezade for setting up this challenge!
flag{you_got_this_best_of_luck_in_reversing_and_beware_of_red_herrings}