
JLeow00/malwarebytes-crackme-3


Malwarebytes Crackme 2021

Tools and environment setup

We will be doing this analysis in a Windows 10 VM with the flare-vm tools installed. Most of the tools listed below will come with flare-vm by default.

  • CFF Explorer - Good tool for skimming through basic PE information
  • dnSpy - For reversing .NET binaries
  • HxD - Pretty good hex editor for Windows
  • IDA Pro - A must-have. While the decompiler is not strictly necessary, it can make life much easier and we'll be using it extensively here
  • Python 3 - Our weapon of choice. We'll also need to install some packages using the following pip command
pip install Pillow numpy pefile pycryptodome
  • PowerShell - Comes with Windows by default. While we can survive with just Python, some actions involving the OS (e.g. listing files) are much simpler in PowerShell

Initial analysis

The crackme is an executable called MBCrackme.exe.

When we run the application we see a GUI like this, containing a form with text fields asking for 3 different passwords and buttons labelled "Check!"

When we open it in CFF Explorer, we see that the file type is reported as a .NET assembly and that the file contains a .NET directory, which tells us that this is a .NET binary.

This means that we can analyze the binary in dnSpy, which lets us see the decompiled C# code. Luckily for us, the names in the code are meaningful and aren't obfuscated.

Under the Form1 class, we see 3 important methods, button1_Click(object, EventArgs), button2_Click(object, EventArgs), and button3_Click(object, EventArgs), which are called when the corresponding form buttons are clicked.

Hence, this crackme challenge can be divided into 3 levels, each level corresponding to a password to find.

Level 1

To find the first password, it's probably a good idea to look into the method that is called when the first button is clicked, button1_Click(object sender, EventArgs e).

private void button1_Click(object sender, EventArgs e)
{
	if (this.textBox1.Text.Length == 0)
	{
		MessageBox.Show("Enter the password!");
		return;
	}

After checking that the password submitted is not blank (which is also present in the other 2 button click methods), it calls the decode(Bitmap, string) method from the same class, passing in as arguments the resource named mb_logo_star, which is actually the background image we see in the form GUI, and the first password we entered in the form.

// continued from button1_Click...
bool flag = false;
string text = this.textBox1.Text;
byte[] array = Form1.decode(Resources.mb_logo_star, text);

This method iterates through the pixels of the bitmap in column-major order. For each pixel, it takes the 3 least significant bits of the R channel, the 3 least significant bits of the G channel, and the 2 least significant bits of the B channel, combines them into a single byte b, and XORs b with the next byte of the password_str argument (wrapping around when the password runs out). The result for all pixels is returned as a byte array. This is a form of bitmap steganography, used to hide a payload that is encrypted with a simple repeating-key XOR cipher.

public static byte[] decode(Bitmap bm, string password_str)
{
	byte[] bytes = Encoding.ASCII.GetBytes(password_str);
	byte[] array = new byte[bm.Width * bm.Height];
	int num = 0;
	for (int i = 0; i < bm.Width; i++)
	{
		for (int j = 0; j < bm.Height; j++)
		{
			Color pixel = bm.GetPixel(i, j);
			int num2 = Form1.keep_bits((int)pixel.R, 3);
			int num3 = Form1.keep_bits((int)pixel.G, 3) << 3;
			int num4 = Form1.keep_bits((int)pixel.B, 2) << 6;
			byte b = (byte)(num2 | num3 | num4);
			if (bytes.Length != 0)
			{
				b ^= bytes[num % bytes.Length];
			}
			array[num] = b;
			num++;
		}
	}
	return array;
}
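To make the bit layout concrete, here is a small Python sketch of what decode() computes for a single pixel before the XOR step. The helper name pack_pixel is ours, and it assumes that keep_bits(value, n) simply keeps the n least significant bits, which is consistent with how it is used above.

```python
def pack_pixel(r, g, b):
    """Combine the low bits of one RGB pixel into a single payload byte.

    Resulting layout, most significant bit first: BB GGG RRR
    """
    low_r = r & 0b111           # 3 low bits of red   -> bits 0..2
    low_g = (g & 0b111) << 3    # 3 low bits of green -> bits 3..5
    low_b = (b & 0b11) << 6     # 2 low bits of blue  -> bits 6..7
    return low_r | low_g | low_b

# A pixel whose channels end in 101, 010 and 11 yields 0b11_010_101
print(hex(pack_pixel(0b11111101, 0b00000010, 0b10000011)))  # 0xd5
```

Each pixel therefore carries exactly one byte of payload, which is why the output array has Width * Height entries.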

After calling the decode() method and obtaining the decoded byte array from the resource, the original button1_Click() method truncates the array if it is larger than Form1.validSize_1, which is 241152, then computes the CRC32 checksum of the truncated array and compares it against Form1.validCrc32_1, which has the value of 2741486452 (0xA367C374).

// continued from button1_Click...
if (array.Length > Form1.validSize_1)
{
	Array.Resize<byte>(ref array, Form1.validSize_1);
}
if (Crc32Algorithm.Compute(array) == Form1.validCrc32_1)
{
	flag = true;
	try
	{
		if (Form1.g_serverProcess == null || Form1.g_serverProcess.HasExited)
		{
			File.WriteAllBytes(this.g_serverPath, array);
			flag = this.runProcess(this.g_serverPath);
		}
	}
// Exception handling and form GUI adjustments...

Then, if the CRC32 hashes match, it dumps the decoded byte array into a file on disk whose path is given by this.g_serverPath. If we look into the definition of this.g_serverPath, we can see that the file being created is called level2.exe and is located in the temp directory.

private string g_serverPath = Path.Combine(Path.GetTempPath(), "level2.exe");

Then it calls the method runProcess(string) on that path which starts a hidden window process from an executable at the given argument path, which suggests that the decoded data was an executable file.

private bool runProcess(string path)
{
	if (Form1.g_serverProcess != null)
	{
		if (!Form1.g_serverProcess.HasExited)
		{
			Form1.g_serverProcess.Kill();
		}
		Form1.g_serverProcess.Close();
	}
	bool result = false;
	try
	{
		Process process = new Process();
		process.StartInfo.FileName = path;
		process.StartInfo.WindowStyle = ProcessWindowStyle.Hidden;
		result = process.Start();
		Form1.g_serverProcess = process;
	}
	catch (Exception)
	{
	}
	return result;
}

To summarize what we know so far,

  1. The password XOR-decrypts the ciphertext (embedded in the resource image) into an executable file
  2. The CRC32 checksum of this decrypted executable file is 0xA367C374

And we don't know

  1. The length of the password
  2. What kind of characters make up the password

The next few sections will talk about different methods to recover the password.

Brute force attack

An obvious (but, as we will see, infeasible) method would be to brute-force the password. This involves trying different values of the password, and verifying whether the decrypted payload is correct by comparing its CRC32 checksum against 0xA367C374.

As mentioned previously, we don't know the character set or the length of the password, but we can make a few reasonable guesses. We know that we have to enter the password into the text box before it's used by the decode method, so the password is likely made up of printable ASCII characters. As for the length, we can just try increasing lengths until we find a password that matches.

First, we should extract the ciphertext from the image. To do that, we first extract the resource from the binary and save it as mb_logo_star.png.

Then, we can re-implement the extraction method without doing any XORing to get our ciphertext.

from PIL import Image
import numpy as np

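def extract_data(image):
    # image is indexed as image[row][column], i.e. image[y][x]
    newarray = []
    # Walk the pixels column by column, matching the order used by decode()
    for j in range(len(image[0])):
        for i in range(len(image)):
            # Convert to int so the shifts are done on plain Python integers
            r = int(image[i][j][0]) & 0b00000111
            g = (int(image[i][j][1]) & 0b00000111) << 3
            b = (int(image[i][j][2]) & 0b00000011) << 6
            newarray.append(r | g | b)
    return bytes(newarray)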

image = np.asarray(Image.open('mb_logo_star.png')) # 700 x 700 x 3
ciphertext = extract_data(image)[:241152]

Now we have the extracted ciphertext in the variable ciphertext. We also want to implement our XOR cipher

def xor(data, key):
    return bytes([b ^ key[i % len(key)] for i, b in enumerate(data)])

Then we have what we need to start our brute force attack

import zlib
import string
import itertools

CHARSET = string.printable

for length in range(1, 50):
    print(f"Trying length {length}...")
    attempts = itertools.product(CHARSET, repeat=length)
    for attempt in attempts:
        key = ''.join(attempt).encode()
        if zlib.crc32(xor(ciphertext, key)) == 0xA367C374:
            print(f"Found password! {key}")

However, if we run it, we find that our script can barely get to trying passwords of length 3 before it starts to get stuck.

There are two factors at play here. Firstly, the ciphertext is 241152 bytes long, so every call to xor(ciphertext, key) followed by a CRC32 computation is expensive. The second, more pressing problem is that the search space grows exponentially with the password length. There are 100 characters in string.printable (we can check this with len(string.printable)), so every additional character makes the search space 100 times bigger. As we later find out, the password is actually 49 characters long, which means we would have had to try 100 to the power of 49 (that's 1 followed by 98 zeros) passwords of that length alone, many orders of magnitude more than the commonly estimated number of atoms in the observable universe (around 10 to the power of 80).
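The arithmetic is easy to check in Python. The rate of one billion attempts per second below is an assumed, very optimistic figure chosen only for illustration; our actual script is far slower than that.

```python
import string

charset_size = len(string.printable)   # 100 characters
password_length = 49                   # length of the real password

# Number of candidate passwords of exactly this length
candidates = charset_size ** password_length
print(f"{candidates:.0e}")             # 1e+98

# Time needed at an assumed 10^9 attempts per second
seconds_per_year = 60 * 60 * 24 * 365
years = candidates // (10**9 * seconds_per_year)
print(f"roughly 10^{len(str(years)) - 1} years")
```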

This means that brute-force is a no go, and we should try something else.

Leaked key in ciphertext

A good idea would be to take a step back and take a look at our ciphertext for any clues. We can reuse the code we wrote previously in our brute-force attempt to get the ciphertext, and dump out the first few bytes to see if there's anything we can work with

>>> ciphertext[:100]
b'(;\xe3y\\leval_o\x91\x9a_a\xd4most_do.e_xor_pe_and_keep_going!easy_level_\x97ne_os\xd7as\xc0V\xa9N\xd6d\x13\xb5N&7\x19\x16\x7f\x11\x1c\x0b8\x19\x04\x08P<\x06\x01\x07\x01\x13\x01\x07\x04'
>>> ciphertext[:1000]
b"(;\xe3y\\leval_o\x91\x9a_a\xd4most_do.e_xor_pe_and_keep_going!easy_level_\x97ne_os\xd7as\xc0V\xa9N\xd6d\x13\xb5N&7\x19\x16\x7f\x11\x1c\x0b8\x19\x04\x08P<\x06\x01\x07\x01\x13\x01\x07\x04S\x0b*\x02E\x1f\x0bL\x1b =E2\x0e\x08\x08A~yU@one_xor\xfb\x9b\xdeL\x81\xe4\xb1\x1f\x8b\xef\xb00\xbf\xed\xba)\xeb\x8b\xf7$\x8b\xf9\xac\x1f\xe9\x89\xa6$\x07\xd5\xba.\xe0\xb3\xb0-\x9f\xe5\xa64\xed\x86\xbf/\xa1\xd5\xad/\xc0\xbd\xa1$\xae\xeb\xbb$\xed\x89\xb3$\x81\xd5\xb2/\xec\x82\xb3`\x80\xeb\xa69\xbf\xe6\xb16\xd7\xe6\x8a/*\x86\x8f \x88\xe7\xba30\xbcN/\x8f\xef\x8a8+\x91\x881\x84\xd5\xb4.66\x08\r\x85\xfa\x8a'oing!eas)\x1ale:di_\x80Mu>almost_d\x8fng^sn|Op\xa7_an\x96]keep_\x14zing1eas\x99_lev%l_\x7fne_clmist_doneYxor_pe_a\x9eg_kaep_goilga\xe4asi_luvel_\x7fneOalmostOdone_xor_pesSodckeep\x8fdo\x89og!easy_level_one_al\x8dlslQdonN^xsr_pe_and_keep_going!easy_leVNm_/ne_almost_do\x8ee_lnr_pe_and_keep_going!easy_lK\x02\x00\x14+one\xd8\xa1lmoct_d\xadne_|or_pe_and_keep\x7fgo\t@\x15E\x04\x15\x12y_\x1a=vel\xbfone\x05alm\xa9st_done_xor_pe\x1fan$q\x0f\x04\x11\x11_go}\xebe!e!ry_\x10gveL^one_almost_do.e_\xb8A\x00,\x02\x06_an\x84^kee\xa0\\gokng!\xf9bsy_level_one_!lm/]\x06:\x08\x00\re_`ar_p\x85\\ant_ke\xfbs_going!easy_l%ve._one_almost_done_xor_pe_and_keep_going!easy_level_one_almost_done_xor_pe_and_keep_going!easy_level_one_almost_done_xor_pe_and_keep_going!easy_level_one_almost_done_xor_pe_and_keep_going!easy_level_one_almost_done_xor_pe_and_keep_going!easy_level_one_almost_done_xor_pe_and_keep_going!easy_level_one_almos"

We can see the string easy_level_one_almost_done_xor_pe_and_keep_going! repeated many times. This happens because the plaintext contains large blocks of null bytes, which leak the key into the ciphertext: since plaintext XOR key = ciphertext, wherever the plaintext is 0 we get ciphertext = 0 XOR key = key. The key is applied cyclically, so a long run of null bytes in the plaintext (a very common occurrence in executables) shows the key repeated in the clear.
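The observation above can also be automated instead of read off by eye. The following is a minimal sketch on toy data, not something we ran against the real ciphertext: it guesses the key length by looking for the shift at which the ciphertext most often matches itself, then takes the most common byte in each key position. It only works under the assumption that null bytes dominate the plaintext; the function names and the toy key are made up for this example.

```python
from collections import Counter

def xor(data, key):
    return bytes(b ^ key[i % len(key)] for i, b in enumerate(data))

def guess_key_length(ciphertext, max_length=64):
    """Return the shift at which the ciphertext most often equals itself."""
    def coincidences(shift):
        return sum(a == b for a, b in zip(ciphertext, ciphertext[shift:]))
    return max(range(1, max_length + 1), key=coincidences)

def recover_key(ciphertext, key_length):
    """Take the most common byte at each key position as the key byte."""
    return bytes(
        Counter(ciphertext[i::key_length]).most_common(1)[0][0]
        for i in range(key_length)
    )

# Toy data: a fake header, a long run of null bytes, and a short tail
toy_key = b"abcdefg"
toy_plaintext = b"MZ\x90\x00" + bytes(300) + b"tail"
toy_ciphertext = xor(toy_plaintext, toy_key)

length = guess_key_length(toy_ciphertext)
print(length, recover_key(toy_ciphertext, length))  # 7 b'abcdefg'
```

On real data the recovered key should still be confirmed independently, for example with the CRC32 check.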

Hence, we try easy_level_one_almost_done_xor_pe_and_keep_going! as the key, and see that after decryption, our plaintext has the correct CRC32 checksum

>>> hex(zlib.crc32(xor(ciphertext, b"easy_level_one_almost_done_xor_pe_and_keep_going!")))
'0xa367c374'

Known plaintext attack

Even though we already have the key, we can explore another method of decrypting the data for the sake of learning.

We know that the decoded file is an executable, which means that we have some information about what our decrypted plaintext is supposed to look like and so we can try using a known plaintext attack to get the XOR key.

This known plaintext attack vulnerability of XOR ciphers is described in its Wikipedia entry

In any of these ciphers, the XOR operator is vulnerable to a known-plaintext attack, since plaintext ⊕ ciphertext = key.

Windows executables follow a format called the Portable Executable (PE) format, which is described on MSDN here. They have a specific header format beginning with the famous magic bytes "MZ" and with other fields described here. In practice, the first 0x3C bytes (the DOS header fields that precede e_lfanew) are identical in most PE files produced by common toolchains, so we can take those bytes from any executable file we have lying around (like MBCrackme.exe) as our known plaintext, and XOR them with our ciphertext to get the key

with open("MBCrackme.exe", "rb") as fp:
    known_plaintext = fp.read(0x3C)

# Only the first 0x3C bytes are needed, so slice before XORing
print(xor(ciphertext[:0x3C], known_plaintext))

which gives b'easy_level_one_almost_done_xor_pe_and_keep_going!easy_level_', clearly showing us our key.
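The recovered bytes are the key repeated, so the key itself is the shortest prefix that generates the whole stream. A small sketch of that step is shown below; smallest_period is a helper name we made up. Note that it can only find the key if the known plaintext is longer than the key, otherwise no repetition is visible.

```python
def smallest_period(stream):
    """Return the smallest p with stream[i] == stream[i + p] for all valid i."""
    for p in range(1, len(stream)):
        if all(stream[i] == stream[i + p] for i in range(len(stream) - p)):
            return p
    return len(stream)

keystream = b"easy_level_one_almost_done_xor_pe_and_keep_going!easy_level_"
period = smallest_period(keystream)
print(period, keystream[:period])
# 49 b'easy_level_one_almost_done_xor_pe_and_keep_going!'
```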

Level 2

Now that we have our level 1 key, we can dump out the decrypted executable for analysis.

plaintext = xor(ciphertext, b"easy_level_one_almost_done_xor_pe_and_keep_going!")
with open('level2.exe', 'wb') as fp:
    fp.write(plaintext)

Opening the file in CFF Explorer, we see that it is not a .NET binary (sadly), so we have to look at it in IDA to find out what it does.

For the subsequent analysis in IDA, we will use a base address of 0x400000 when referring to subroutine addresses. By default, the image should be loaded at that address, the reasons for which are explained in this article. However, when debugging, the image may be loaded at some other base address like 0x1A0000 (usually because of ASLR; we can check whether it is enabled with CFF Explorer), which would cause all the addresses in our IDA database to be out of sync with the ones in this writeup. To change the base address that IDA uses to calculate the addresses, we can select "Edit"->"Segments"->"Rebase Program" under the menu bar.

Main

We start our analysis at the main function at 0x401070, which first calls AddVectoredExceptionHandler to register the function at 0x401000 (which IDA has renamed Handler for us) as a vectored exception handler. We will examine what this Handler function does later when its use comes into play, but for now we will continue looking at what the main function does after that.

The main function then calls memset() to zero out 0x4E4B2 bytes of a stack variable that IDA has labelled Src, then calls sub_4011D0 with the following arguments

unk_414000 is a memory location in the .data section that contains strange data

sub_4011D0

If we look into sub_4011D0, we see that it gets the address of the PEB, then traverses the linked list in the structure to get one of the loaded modules (DLLs) in the process. Then, it passes the loaded module and a weird number 985953233 to the function sub_401250, which returns a function pointer, and that returned function is called afterwards with most of the original arguments.

Even without looking into sub_401250, this behavior already looks very similar to API hashing, a technique used to obfuscate API calls, which makes static analysis difficult because we can't directly see which APIs are being called. It is a popular technique used by real malware like Dridex or Cobalt Strike, and it works by passing an API hash and a reference to a loaded module to an API resolving function (in this case sub_401250), which goes through all the APIs exported from that module, hashes their names, and compares each hash against the given one. If the resolver function finds an API which matches the given hash, it returns the address of that API.

Just so we have a clearer picture of what's going on before we analyze the resolver function sub_401250, we refer to some documentation (the structure is partially undocumented on MSDN so we have to refer to other sources) and find that Flink actually should have the pointer type _LDR_DATA_TABLE_ENTRY *, so we change the type (by pressing "Y") and find that v8 is from the field DllBase which points to the base address of the loaded module.

sub_401250

Now we can analyze sub_401250 with the correct pointer types (e.g. IMAGE_DOS_HEADER * for a1). After some renaming and cleaning up, we see the following code which represents the hashing algorithm

There are a few methods we can use to find out which APIs are being resolved.

The first uses dynamic analysis, where we just step over the API resolver function to see which API got resolved. This would probably be easier in our case, but sometimes it can be more troublesome especially if there are many resolved API calls or if extensive anti-debugging techniques are present.

The second method would be to re-implement the resolver function, which would allow us to know which APIs correspond to which hashes statically. This method also gives us the flexibility to write an IDAPython script to automatically resolve all calls within the binary if there are many such calls inside and resolving them individually would be too tedious.

Note that there are also existing tools like HashDB that you could explore, but we won't be using them today.

We would write something like this to re-implement the resolver function

import pefile

def rol4(x, n):
    return ((x << n) | (x >> (32 - n))) & 0xFFFFFFFF

def hash_name(function_name):
    hash_value = 0xF00DF00D
    for b in function_name:
        hash_value = (rol4(hash_value, 5) ^ b) & 0xFFFFFFFF
    return hash_value

entry_export = [pefile.DIRECTORY_ENTRY["IMAGE_DIRECTORY_ENTRY_EXPORT"]]
exports = []
libraries = [
    'ntdll',
    'kernel32',
    'gdi32',
    'user32',
    'comctl32',
    'comdlg32',
    'ws2_32',
    'advapi32',
    'netapi32',
    'ole32',
    'winmm',
    'imm32',
    'bcrypt',
    'wmi',
]

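for library in libraries:
    # fast_load skips the full parse; only the export directory is needed
    pe = pefile.PE(f'C:/windows/system32/{library}.dll', fast_load=True)
    pe.parse_data_directories(directories=entry_export)
    # Some DLLs may have no export directory at all, so guard against that
    export_dir = getattr(pe, 'DIRECTORY_ENTRY_EXPORT', None)
    if export_dir is None:
        continue
    # Exports that are ordinal-only have no name and cannot be hashed
    exports += [e.name for e in export_dir.symbols if e.name]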

hash_to_name = {hash_name(name):name for name in exports}

So now we have a dictionary hash_to_name that acts like the API resolver. If we give it our original hash 985953233, we find that it actually resolves to RtlDecompressBuffer.

>>> hash_to_name[985953233]
b'RtlDecompressBuffer'

Now that we know the API called, we realize that sub_4011D0 is in fact just a wrapper for RtlDecompressBuffer with the options COMPRESSION_FORMAT_LZNT1 | COMPRESSION_ENGINE_MAXIMUM, meaning that it just decompresses the 160345-byte buffer at unk_414000 using the LZNT1 algorithm, and places it into the stack variable Src.

To extract the decompressed bytes, we run it dynamically by stepping over the call to RtlDecompressBuffer, then find the address of the output buffer (0x8F19DC in this case) and extract the file with the following IDAPython code

a = ida_bytes.get_bytes(0x8F19DC, 320690)
with open('decompressed_buf_2.bin', 'wb') as fp:
    fp.write(a)
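As a static alternative to stepping over the call, the buffer could in principle be decompressed without running the loader at all. Below is a minimal pure-Python sketch of LZNT1 decompression as we understand the format from public descriptions. We have not validated it against the actual buffer at unk_414000, it does no error handling for malformed input, and the function names are our own.

```python
import struct

def lznt1_decompress_chunk(chunk):
    """Decompress the body of one compressed LZNT1 chunk."""
    out = bytearray()
    pos = 0
    while pos < len(chunk):
        flags = chunk[pos]
        pos += 1
        for bit in range(8):
            if pos >= len(chunk):
                break
            if not (flags >> bit) & 1:
                # Flag bit clear: the next byte is a literal
                out.append(chunk[pos])
                pos += 1
                continue
            # Flag bit set: 16-bit (offset, length) back-reference
            (token,) = struct.unpack_from("<H", chunk, pos)
            pos += 2
            # The offset/length bit split depends on how much was output
            length_mask, offset_shift = 0xFFF, 12
            written = len(out) - 1
            while written >= 0x10:
                length_mask >>= 1
                offset_shift -= 1
                written >>= 1
            length = (token & length_mask) + 3
            offset = (token >> offset_shift) + 1
            # Copy byte by byte so overlapping references repeat correctly
            for _ in range(length):
                out.append(out[-offset])
    return bytes(out)

def lznt1_decompress(data):
    """Decompress a sequence of LZNT1 chunks."""
    out = bytearray()
    pos = 0
    while pos + 2 <= len(data):
        (header,) = struct.unpack_from("<H", data, pos)
        if header == 0:
            break  # a zero header marks the end of the stream
        pos += 2
        size = (header & 0xFFF) + 1      # chunk body size in bytes
        body = data[pos:pos + size]
        pos += size
        # Bit 15 of the header says whether the chunk body is compressed
        out += lznt1_decompress_chunk(body) if header & 0x8000 else body
    return bytes(out)

# One compressed chunk: literal 'a' followed by a 5-byte back-reference
print(lznt1_decompress(bytes.fromhex("03b002610200")))  # b'aaaaaa'
```

To use it here, the 160345 compressed bytes would first have to be carved out of the .data section of level2.exe; the dynamic approach above avoids that step.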

Back to main

Continuing with our analysis of main, after it decompresses the buffer, it makes another hash-resolved API call with a different hash 0xF4DD3DAD

Our re-implemented resolver tells us that this is NtAllocateVirtualMemory

>>> hash_to_name[0xF4DD3DAD]
b'NtAllocateVirtualMemory'

and after some cleaning up, we can see that the decompressed data is run as code, telling us that the decompressed data is shellcode

However, when we take a look at the decompressed data in a hex editor, we find that it is a whole PE file with all the "MZ" and "PE" headers and not just some shellcode.

This is strange because loaders (level2.exe in this case) would normally have to do some PE header parsing to find the entry point before transferring control over, but this loader directly transfers control to the start of the file which is where the "MZ" header is at. If we run the loader dynamically and step into the call to the PE file, we find that the "MZ" header actually contains some valid instructions

If we take a closer look at the MZ header and do some digging online, we find this tool by hasherezade (the challenge author) which modifies a PE structure to make the header executable and move the reflective loading code into the PE itself, effectively "shellcodifying" the PE file, which is pretty neat. The MZ header of our decompressed PE file is very similar to the source code of this tool, which tells us that it's very likely this tool was used.

The main function just returns after the call to the shellcode / PE file, so we can conclude that the role of level2.exe is simply a loader that decompresses and loads a shellcodified PE file, and later analysis tells us that this PE file acts as some sort of server, so we will call it server.exe.

Analyzing server.exe

This binary is also not a .NET application, so we have to look at it in IDA again.

The main function allocates some memory, then calls sub_405E10, then sub_4061E0 with 2 arguments, one looking like a pipe name and the other being the number 1337, and lastly sub_4092A6 which just calls free.

Luckily for us, the API calls are not obfuscated like they were in the loader, so analysis is slightly easier.

sub_405E10 - Anti-debugging

sub_405E10 first places 34 DWORDs into an array in the stack, then calls sub_405CB0 for each one of them.

If we look at the code in sub_405CB0, it seems to be accessing some structure, and choosing which field to access based on some comparison. This behavior is a clue to us that there might be some sort of binary search tree (BST) structure involved here, since when we are searching for elements in a BST, we have to compare the current node against our search value then pick either the left or right nodes, which would be represented by different fields in our structure. More information about BSTs can be found here.

In addition, sub_405CB0 calls sub_405B50 which contains the string "map/set<T> too long", which is a smoking gun telling us that this is part of some C++ standard library function, more specifically std::set<int>. This set container uses a balanced binary tree to store its items internally. Rolf Rolles published an article which talks about reversing STL containers, which we can reference to create this struct

struct bbt_node
{
  bbt_node *left;
  bbt_node *parent;
  bbt_node *right;
  char field_C;
  char is_terminal;
  _BYTE gap_E[2];
  int data;
};

and also conclude that sub_405CB0 is std::set::insert(), and that off_437E9C points to a std::set which contains the 34 numbers.

After adding the numbers to the set, it checks if any of the functions sub_402520, sub_402560, or sub_402320 returns 1, and if so executes the instruction int 3, which triggers a breakpoint exception.

The first function sub_402520 is relatively simple, only calling IsDebuggerPresent() and CheckRemoteDebuggerPresent() to check for any debuggers attached.

The second function sub_402560 calls IsBadReadPtr(0x7FFE0000, 0x3B8), then checks if the byte at 0x7FFE02D4 is set. A bit of searching gives us this very similar function (also from the challenge author), which tells us that 0x7FFE0000 is the address of the structure KUSER_SHARED_DATA. This means that 0x7FFE02D4 (offset 0x2D4) corresponds to the field KdDebuggerEnabled, so this function checks whether a kernel debugger is enabled.

The last function sub_402320 takes the argument off_437E9C, which is our std::set structure. Inside, it calls CreateToolhelp32Snapshot and then Process32First. After that, it takes the field szExeFile (the name of the executable file that the process is running; the hashes turn out to be computed over the name without the .exe extension) and runs this hash algorithm, which is almost identical to the one used in the API hashing method, the only differences being the initial hash value and the fact that this one hashes the lowercased string.

Afterwards, it searches for the hash value in our set, and returns true if found. Then it calls Process32Next then does the same thing in a loop. The function essentially scans the system for blacklisted processes, the hashes of which are found in the 34 numbers in our set. We can guess that the hashes are of analysis or sandbox tools, and this technique of evading analysis via system checks is used by real world malware, like ObliqueRAT which has an analysis process blacklist, or PlugX which checks for processes named "vmtoolsd".

Note that this function calls sub_4013E0 and sub_4014D0 which are both related to std::string. We know this because there are a lot of comparisons of some offset with the number 16, which is a byproduct of the short string optimization used by MSVC. The structure for std::string can be represented something like this (taken from Eleemosynator's writeup for the previous Malwarebytes crackme)

// Simplified layout of the std::string object
struct string_layout {
	union contents_union {
		char	buffer[16];
		char	*data;
	}		contents;		// offset 0x00
	size_t		size;			// offset 0x10
	size_t		reserved;		// offset 0x14
} ;
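When inspecting such objects in a memory dump, the layout above can be decoded with a few lines of Python. This is only a sketch for the 32-bit layout shown here; read_std_string and the read_memory callback are hypothetical names introduced for illustration, and we assume that a reserved value below 16 means the characters are stored inline.

```python
import struct

def read_std_string(obj, read_memory):
    """Decode a 24-byte string_layout blob (32-bit layout).

    read_memory(address, size) is a caller-supplied callback that is only
    used when the characters live on the heap.
    """
    buffer, size, reserved = struct.unpack("<16sII", obj[:24])
    if reserved < 16:
        # Short string: the characters are stored inline in the buffer
        return buffer[:size]
    # Long string: the first 4 bytes of the union hold a pointer
    (data_ptr,) = struct.unpack_from("<I", buffer)
    return read_memory(data_ptr, size)

# Example with an inline string, so no memory callback is needed
short = b"procmon".ljust(16, b"\x00") + struct.pack("<II", 7, 15)
print(read_std_string(short, None))  # b'procmon'
```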

About that handler function

We now know that the three functions were just checking for signs of debugging, but it is still strange why it would execute int 3 if it detected debugging. To answer this, we have to bring our attention back to the VEH Handler function in the loader level2.exe.

If the exception code is an access violation, the handler shows a failure message box and terminates the process. If the exception is a breakpoint and the exception address is within the loaded binary, it also terminates the process, which means that the int 3 instruction is just a way to terminate the process via the exception handler.

Back to server.exe

We now know that sub_405E10 is an anti-debugging function that terminates the process if debugging is detected. The next function called is sub_4061E0, which first creates a named mutex MB_Crackme_level2_mutex, and returns if it isn't able to create it. Real-world malware does this too, usually to ensure that only a single instance of the program is running.

Afterwards, it initializes the values in a structure in the stack with 2 fields, the pipe name and the address of sub_406290. We will label the fields in the struct as such

struct thread_struct
{
  int value;
  int subroutine;
};

Then it calls sub_4051E0 which creates a thread at sub_405210, passing in that struct as its parameters. It also calls sub_405A80 which creates another thread with the same struct with different values, but that part is for level 3 so we will focus on sub_405210 first.

sub_405210 calls CreateNamedPipeA to create a new named pipe at \\.\pipe\crackme_pipe, then reads from it, then creates a thread to sub_4032C0, then immediately waits for the thread to close, so it's basically a function call. sub_4032C0 reads from the pipe, then calls sub_406290 which it got from its thread struct argument. The result of calling sub_406290 is then written back to the same pipe, then it cleans up and closes itself.

sub_406290 hashes the string that was written to the pipe with the same hashing algorithm as the one used in the blacklisted-process check sub_402320, then checks if it's in the set at off_437E9C, which is the same set containing those 34 blacklisted process name hashes. If it is, it uses the string as an RC4 key to decrypt the following 29-byte-long string (represented in hex), then returns the decrypted string (to get written back to the pipe)

5a9558f17c6d62b5c2c68ad620f2f610d88fef4cd663468b1a0dbea251

We know it's RC4 because it calls the function sub_401200, which contains this block of code that creates the substitution box

which is a tell-tale sign of RC4, and we can find the KSA and PRGA after analyzing the rest of the function. More information about how to recognize RC4 constructs can be found in this video by OALabs.
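For reference, this is what the two stages look like in a textbook RC4 implementation, written here in Python. It is a generic sketch of the standard algorithm rather than a transcription of sub_401200, but the 256-entry identity table and the swap loop are the pattern to look for in the disassembly.

```python
def rc4(key, data):
    """Encrypt or decrypt data with RC4 (the operation is symmetric)."""
    # Key-scheduling algorithm (KSA): start from the identity permutation
    # 0..255 and shuffle it using the key bytes
    s = list(range(256))
    j = 0
    for i in range(256):
        j = (j + s[i] + key[i % len(key)]) % 256
        s[i], s[j] = s[j], s[i]

    # Pseudo-random generation algorithm (PRGA): keep swapping entries and
    # XOR each data byte with one keystream byte
    i = j = 0
    out = bytearray()
    for byte in data:
        i = (i + 1) % 256
        j = (j + s[i]) % 256
        s[i], s[j] = s[j], s[i]
        out.append(byte ^ s[(s[i] + s[j]) % 256])
    return bytes(out)

# Widely published test vector
print(rc4(b"Key", b"Plaintext").hex())  # bbf316e8d940af0ad3
```

A plain implementation like this also accepts keys of any non-zero length, which can be handy when testing very short candidate keys.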

Button 2

To find out what gets written into the pipe and what happens to the decrypted data, we have to switch our focus back to the original .NET executable MBCrackme.exe, at button2_Click.

private void button2_Click(object sender, EventArgs e)
{
	if (this.textBox2.Text.Length == 0)
	{
		MessageBox.Show("Enter the password!");
		return;
	}
	bool flag = false;
	string pipeName = "crackme_pipe";
	string text = this.textBox2.Text;
	byte[] array = null;
	try
	{
		NamedPipeClientStream namedPipeClientStream = new NamedPipeClientStream(".", pipeName);
		namedPipeClientStream.Connect(1000);
		StreamWriter streamWriter = new StreamWriter(namedPipeClientStream);
		TextReader textReader = new StreamReader(namedPipeClientStream);
		streamWriter.WriteLine(text);
		streamWriter.Flush();
		string s = textReader.ReadLine();
		array = Encoding.ASCII.GetBytes(s);
		if (Crc32Algorithm.Compute(array) == Form1.validCrc32_2)
		{
			flag = true;
		}
	}
	// Exception and GUI handling...

It connects to the pipe and writes the level 2 password to it, then reads from it and checks if the CRC32 checksum of the response is equal to Form1.validCrc32_2, which is 0x1DC85E5D.

This means that the pipe serves as a communication channel between server.exe and MBCrackme.exe.

Tip: if you want to debug the application and write to the pipe yourself, it takes only 5 lines to do it in PowerShell

$npipeClient = new-object System.IO.Pipes.NamedPipeClientStream('.', 'crackme_pipe')
$npipeClient.Connect()
$pipeWriter = new-object System.IO.StreamWriter($npipeClient)
$pipeWriter.Write('stuff to write to the pipe')
$pipeWriter.Flush() 

Mapping hashes to analysis processes

To summarize what we know so far,

  1. There are 34 names of analysis processes that it compares hashes against
  2. The level 2 password is one of the analysis process names, but might not be lowercase itself
  3. The password is used as the RC4 key to decrypt a 29-byte long stack string
  4. The RC4 decrypted string has a CRC32 checksum of 0x1DC85E5D (after removing the trailing null byte)

Hence, to find out what the level 2 password could be, we have to first find out what processes are being blacklisted. To do this, we will take the executable names of as many analysis processes as we can, then hash their names and check if their hashes match any of the 34.

We can start by gathering the names of all executable files on our system with this PowerShell one-liner. Our flare-vm environment comes with many analysis tools that could give us a hit, so this is worth a shot.

Get-ChildItem *.exe -Recurse -Path C:\ | % { echo $_.BaseName } > executables.txt

This Github repo also contains the names of some analysis processes, so we format it (remove the .exe extension) and add it to our list executables.txt.

We write a script to test the hashes

def rol4(x, n):
    return ((x << n) | (x >> (32 - n))) & 0xFFFFFFFF

def hash_name(text, initial_hash_value):
    hash_value = initial_hash_value
    for b in text:
        hash_value = (rol4(hash_value, 5) ^ b) & 0xFFFFFFFF
    return hash_value

HASHES = [
    0xC81D63C9, 0x5B2839AC, 0x17DAD73F, 0x72C7241C,
    0x58E483ED, 0x82134662, 0x34204667, 0x4CD53A71,
    0x34206499, 0xFFDEB191, 0x7AC6410B, 0xEA3503AA,
    0xCCFA2924, 0x3A09FFBC, 0x38EA0C1B, 0x58E479EC,
    0x1B964E1A, 0x707F9D9A, 0xF5A79701, 0x09F5473B,
    0xBA635AC6, 0x0BB18A65, 0x46119FD8, 0xFB7BF6AF,
    0x3F75D54B, 0x49110E9F, 0x5D9F9FD8, 0x5DCC9FD8,
    0x8293C33E, 0x5D112314, 0x9D9F8189, 0xC10AE786,
    0x67D8B725, 0x07FE9020,
]

with open('executables.txt', 'r') as fp:
    executables = fp.read().split()

# Remove duplicate names
executables = list(set(executables))

hash_to_name = {}

for executable in executables:
    # The server hashes the lowercased process name
    processed_name = executable.lower().encode()
    hash_value = hash_name(processed_name, 0xBADC0FFE)
    hash_to_name[hash_value] = executable

for hash_value in HASHES:
    if hash_value in hash_to_name:
        print(hex(hash_value), hash_to_name[hash_value])
    else:
        print(hex(hash_value), "not found")

We run it and get this mapping

> python .\match_hashes.py
0xc81d63c9 ollydbg
0x5b2839ac ProcessHacker
0x17dad73f tcpview
0x72c7241c autoruns
0x58e483ed autorunsc
0x82134662 filemon
0x34204667 procmon
0x4cd53a71 regmon
0x34206499 procexp
0xffdeb191 idaq
0x7ac6410b idaq64
0xea3503aa ImmunityDebugger
0xccfa2924 Wireshark
0x3a09ffbc dumpcap
0x38ea0c1b HookExplorer
0x58e479ec ImportREC
0x1b964e1a PETools
0x707f9d9a LordPE
0xf5a79701 SysInspector
0x9f5473b proc_analyzer
0xba635ac6 sysAnalyzer
0xbb18a65 sniff_hit
0x46119fd8 windbg
0xfb7bf6af joeboxcontrol
0x3f75d54b joeboxserver
0x49110e9f ResourceHacker
0x5d9f9fd8 x32dbg
0x5dcc9fd8 x64dbg
0x8293c33e Fiddler
0x5d112314 httpdebugger
0x9d9f8189 Vmwaretray
0xc10ae786 pe-sieve
0x67d8b725 hollows_hunter
0x7fe9020 pin

We then try each of them and see if it matches the CRC32 checksum, using this script

import Crypto.Cipher.ARC4
import zlib

possible_passwords = [
    "ollydbg", "ProcessHacker", "tcpview", "autoruns", 
    "autorunsc", "filemon", "procmon", "regmon", 
    "procexp", "idaq", "idaq64", "ImmunityDebugger", 
    "Wireshark", "dumpcap", "HookExplorer", "ImportREC", 
    "PETools", "LordPE", "SysInspector", "proc_analyzer", 
    "sysAnalyzer", "sniff_hit", "windbg", "joeboxcontrol", 
    "joeboxserver", "ResourceHacker", "x32dbg", "x64dbg", 
    "Fiddler", "httpdebugger", "Vmwaretray", "pe-sieve", 
    "hollows_hunter", "pin", 
]

# Try lowercase too
possible_passwords += [s.lower() for s in possible_passwords]

ciphertext = bytes.fromhex("5a9558f17c6d62b5c2c68ad620f2f610d88fef4cd663468b1a0dbea251")

for password in possible_passwords:
    # RC4 requires a password length of at least 5
    if len(password) < 5:
        continue
    cipher = Crypto.Cipher.ARC4.new(password.encode())
    decrypted_string = cipher.decrypt(ciphertext).rstrip(b'\x00')
    if zlib.crc32(decrypted_string) == 0x1DC85E5D:
        print("Found match!")
        print(f"Password: {password}")
        print(f"Decrypted: {decrypted_string}")

Running it gives us

> python .\find_password.py
Found match!
Password: ProcessHacker
Decrypted: b'we_are_good_to_go_to_level3!'

So we now know that the password for level 2 is ProcessHacker. We are good to go to level 3!

Quite a lot happened, so we have this figure to summarize this level

Level 3

Now that we know what the server writes to the pipe, we can shift our attention back to the original MBCrackme.exe. Looking at the last part of button2_Click, it calls LoadNext.Load.

// button2_Click continued
if (flag)
{
	this.button2.Enabled = false;
	this.textBox2.Enabled = false;
	this.button2.BackColor = Color.OldLace;
	this.button3.Enabled = true;
	this.textBox3.Enabled = true;
	this.button3.BackColor = SystemColors.ActiveCaption;
	MessageBox.Show("Level up!");
	LoadNext.Load(Form1.g_serverProcess, array);
	return;
}

This method takes LoadNext.EncArr, then performs base64 decoding on it, then AES decryption (using the returned string we_are_good_to_go_to_level3! from level 2), then gzip decompression (LoadNext.DecompressBytes), before loading it using Assembly.Load and running the RunMe method under some Level3Bin.Class1 class.

public static int Load(Process process1, byte[] password)
{
	try
	{
		Type type = Assembly.Load(LoadNext.DecompressBytes(AES.decryptContent(Convert.FromBase64String(LoadNext.EncArr), password))).GetType("Level3Bin.Class1");
		object obj = Activator.CreateInstance(type);
		Type[] types = new Type[]
		{
			typeof(Process)
		};
		MethodInfo method = type.GetMethod("RunMe", types);
		object[] parameters = new object[]
		{
			process1
		};
		method.Invoke(obj, parameters);
	}
	// Exception handling...

This tells us that the decoded content after all those steps is some .NET binary.

AES.decryptContent uses a salt and the returned password to generate an AES key and IV using PBKDF2, then uses AES CBC mode to decrypt the content.

public static byte[] decryptContent(byte[] fileContent, byte[] password)
{
	MemoryStream memoryStream = new MemoryStream();
	byte[] password2 = SHA256.Create().ComputeHash(password);
	byte[] salt = new byte[]
	{
		5,
		3,
		3,
		7,
		8,
		0,
		0,
		8
	};
	RijndaelManaged rijndaelManaged = new RijndaelManaged();
	rijndaelManaged.KeySize = 256;
	rijndaelManaged.BlockSize = 128;
	Rfc2898DeriveBytes rfc2898DeriveBytes = new Rfc2898DeriveBytes(password2, salt, 1000);
	rijndaelManaged.Key = rfc2898DeriveBytes.GetBytes(rijndaelManaged.KeySize / 8);
	rijndaelManaged.IV = rfc2898DeriveBytes.GetBytes(rijndaelManaged.BlockSize / 8);
	rijndaelManaged.Mode = CipherMode.CBC;
	try
	{
		CryptoStream cryptoStream = new CryptoStream(memoryStream, rijndaelManaged.CreateDecryptor(), CryptoStreamMode.Write);
		cryptoStream.Write(fileContent, 0, fileContent.Length);
		cryptoStream.FlushFinalBlock();
		cryptoStream.Close();
	}
	// Exception handling...

We want to decrypt the content, but the base64 string is too long to view completely in dnSpy, which truncates it with a "[...string is too long...]" message at the end. To get the whole string, we have to right click and select "Edit IL instructions".

This way we can have access to the whole string under ldstr

Afterwards we save it in a file called b64_encoded_string.txt, and write the following script to get the decoded data

import base64
import gzip
import hashlib
import Crypto.Cipher.AES
import Crypto.Util.Padding
import Crypto.Protocol.KDF

with open('b64_encoded_string.txt', 'r') as fp:
    text = fp.read()

encrypted_data = base64.b64decode(text)
salt = bytes([5, 3, 3, 7, 8, 0, 0, 8])
password = b'we_are_good_to_go_to_level3!'
password = hashlib.sha256(password).digest()
# 1000 iterations, matching the Rfc2898DeriveBytes call (PBKDF2 with HMAC-SHA1)
rfc_2898_bytes = Crypto.Protocol.KDF.PBKDF2(password, salt, dkLen=48, count=1000)

aes_key = rfc_2898_bytes[:32]
aes_iv = rfc_2898_bytes[32:]

aes_cipher = Crypto.Cipher.AES.new(aes_key, Crypto.Cipher.AES.MODE_CBC, aes_iv)
compressed_data = aes_cipher.decrypt(encrypted_data)
compressed_data = Crypto.Util.Padding.unpad(compressed_data, Crypto.Cipher.AES.block_size)

actual_data = gzip.decompress(compressed_data)

with open('decrypted_data.bin', 'wb') as fp:
	fp.write(actual_data)

Decrypted DLL

The decrypted data turns out to be a DLL file, and is a .NET binary like we suspected
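The CFF Explorer check can also be scripted. The following is a minimal sketch using only the standard library, assuming the usual PE layout in which data directory entry 14 (the CLR runtime header) is non-empty for .NET assemblies. The function name `is_dotnet_pe` is made up for this example and is not part of any tool used here.

```python
import struct

def is_dotnet_pe(data):
    """Return True if data looks like a PE file with a CLR runtime header."""
    if len(data) < 0x40 or data[:2] != b"MZ":
        return False
    # e_lfanew at offset 0x3C points to the "PE\0\0" signature.
    (e_lfanew,) = struct.unpack_from("<I", data, 0x3C)
    if data[e_lfanew:e_lfanew + 4] != b"PE\0\0":
        return False
    optional_header = e_lfanew + 4 + 20  # signature + COFF file header
    (magic,) = struct.unpack_from("<H", data, optional_header)
    if magic == 0x10B:      # PE32
        directories = optional_header + 96
    elif magic == 0x20B:    # PE32+
        directories = optional_header + 112
    else:
        return False
    # NumberOfRvaAndSizes sits right before the data directory array.
    (count,) = struct.unpack_from("<I", data, directories - 4)
    if count <= 14:
        return False
    rva, size = struct.unpack_from("<II", data, directories + 14 * 8)
    return rva != 0 and size != 0
```

Calling it on the bytes of `decrypted_data.bin` should agree with what CFF Explorer shows, and the same check can be reused later on the dropped .dat file.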

We rename the file level3.dll, then put it in dnSpy for analysis. We look at the function RunMe first because that's what's called by MBCrackme.exe.

public static int RunMe(Process process1)
{
	try
	{
		string tempFileName = Class1.GetTempFileName("dat");
		if (Class1.DropTheDll(tempFileName))
		{
			DllInj.InjectToProcess(process1, tempFileName);
		}
	}
	catch (IOException ex)
	{
		MessageBox.Show(ex.Message, "Error!", MessageBoxButtons.OK, MessageBoxIcon.Hand);
	}
	return 0;
}

Class1.DropTheDll just drops a DLL file into a .dat file in the temp directory with a random file name, then DllInj.InjectToProcess injects the DLL into the server.exe process. This technique is called DLL injection, and is frequently used by real malware, examples of which can be found in that link.

public static int InjectToProcess(Process targetProcess, string dllName)
{
	IntPtr hProcess = DllInj.OpenProcess(1082, false, targetProcess.Id);
	IntPtr procAddress = DllInj.GetProcAddress(DllInj.GetModuleHandle("kernel32.dll"), "LoadLibraryA");
	IntPtr intPtr = DllInj.VirtualAllocEx(hProcess, IntPtr.Zero, (uint)((dllName.Length + 1) * Marshal.SizeOf(typeof(char))), 12288U, 4U);
	UIntPtr uintPtr;
	DllInj.WriteProcessMemory(hProcess, intPtr, Encoding.Default.GetBytes(dllName), (uint)((dllName.Length + 1) * Marshal.SizeOf(typeof(char))), out uintPtr);
	DllInj.CreateRemoteThread(hProcess, IntPtr.Zero, 0U, procAddress, intPtr, 0U, IntPtr.Zero);
	return 0;
}
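The magic numbers passed to OpenProcess and VirtualAllocEx are easier to read once decoded. Below is a small sketch that maps them back to their Windows SDK names; the helper `decode_flags` is hypothetical and the tables only list the flags needed here.

```python
# Flag values as defined in the Windows SDK headers (subset only).
PROCESS_ACCESS = {
    0x0002: "PROCESS_CREATE_THREAD",
    0x0008: "PROCESS_VM_OPERATION",
    0x0010: "PROCESS_VM_READ",
    0x0020: "PROCESS_VM_WRITE",
    0x0400: "PROCESS_QUERY_INFORMATION",
}
MEM_ALLOCATION = {0x1000: "MEM_COMMIT", 0x2000: "MEM_RESERVE"}
PAGE_PROTECTION = {0x04: "PAGE_READWRITE"}

def decode_flags(value, table):
    """Return the names of all known bits set in value, plus any leftover bits."""
    names = [name for bit, name in sorted(table.items()) if value & bit]
    leftover = value & ~sum(table)  # bits not covered by the table
    if leftover:
        names.append(hex(leftover))
    return names

print(decode_flags(1082, PROCESS_ACCESS))    # OpenProcess access mask
print(decode_flags(12288, MEM_ALLOCATION))   # VirtualAllocEx allocation type
print(decode_flags(4, PAGE_PROTECTION))      # VirtualAllocEx protection
```

So the loader asks for just enough rights to allocate memory in server.exe, write the DLL path into it, and start a remote thread at LoadLibraryA.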

Injected DLL

CFF explorer tells us that the injected DLL is not a .NET binary, so we will analyze it in IDA

Here we will use 0x10000000 as the base address when referring to addresses.

This DLL doesn't have any exports, only a DllMain.

Looking at the segments, we see two unusual segments .detourc and .detourd which are not standard in normal PE files. This will be useful later.

When loaded, the DLL outputs a debug message saying it is hooking some process, and during unloading it outputs another debug string saying that it is unhooking the process.

sub_10004D00 seems to be acquiring some lock, then calls sub_10004520 which calls VirtualProtect to change the permissions of some memory block to RWX.

sub_10005260 suspends the thread it takes as the argument if it's not the current thread

sub_10004570 is a wrapper for sub_10004590, and takes in 2 arguments, one being some global variable and the other being some function. If we cross-reference the global variables (pressing "X" in IDA), we can see that dword_1002AE60 points to CryptStringToBinaryA (initialized in sub_10001000), dword_1002AE64 points to GetCursorPos (initialized in sub_10001010), and dword_1002AE68 points to Sleep (initialized in sub_10001020). The initialization functions were all called in dllmain_crt_process_attach, which is run before DllMain is run.

Within sub_10004590, it calls many functions that check for certain jmp or nop opcodes in the subroutine given in the global variable, like sub_100049F0 that calls sub_100043C0 which gets a jump destination relative to the first instruction, or sub_10003BA0 which returns the size of the first instruction if it is a nop or int 3.

This behavior suggests that some sort of API hooking is happening, more specifically inline hooking, because the hook function would have to check the API function's instructions to set up the trampoline properly and make sure that the program doesn't crash.
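To make the idea of inline hooking concrete, here is a minimal sketch of the arithmetic behind a 5-byte x86 `jmp rel32` patch, which is the general technique and not a reconstruction of this DLL's exact code. The function names and the API address used in the usage lines are hypothetical.

```python
import struct

def make_jmp_rel32(patch_address, hook_address):
    """Build the 5 bytes of `jmp hook_address` as placed at patch_address."""
    # rel32 is measured from the end of the 5-byte instruction.
    displacement = (hook_address - (patch_address + 5)) & 0xFFFFFFFF
    return b"\xE9" + struct.pack("<I", displacement)

def jmp_rel32_target(patch_address, code):
    """Return the destination of a `jmp rel32` at patch_address, or None."""
    if len(code) < 5 or code[:1] != b"\xE9":
        return None
    (displacement,) = struct.unpack_from("<i", code, 1)
    return (patch_address + 5 + displacement) & 0xFFFFFFFF

# Hypothetical API entry point redirected to the proxy at 0x10002990.
patch = make_jmp_rel32(0x76001000, 0x10002990)
print(patch.hex(), hex(jmp_rel32_target(0x76001000, patch)))
```

A hooking library also has to copy the overwritten instructions into a trampoline so the original API can still be called, which is why the code inspects the first instructions of the target before patching.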

If we search online for any information for the .detourc and .detourd sections, we can find an instrumentation library by Microsoft called Detours that uses inline hooking. If we do more digging, we can find this, written by the challenge author, with the following example usage code

void hook_apis()
{
    DetourTransactionBegin();
    DetourUpdateThread(GetCurrentThread());
    DetourAttach(&(PVOID&)pMessageBoxA, my_MessageBoxA);
    DetourTransactionCommit();
}

void unhook_apis()
{
    DetourTransactionBegin();
    DetourUpdateThread(GetCurrentThread());
    DetourDetach(&(PVOID&)pMessageBoxA, my_MessageBoxA);
    DetourTransactionCommit();
}

It seems that sub_10004D00 is DetourTransactionBegin, sub_10005260 is DetourUpdateThread, and sub_10004570 is DetourAttach. This gives us even more confirmation that this DLL performs inline hooking. We will explore the modified behavior of the functions later when it comes into play.

Button 3

For now, we will turn our attention back to the original binary MBCrackme.exe, specifically button3_Click.

TcpClient tcpClient = new TcpClient("127.0.0.1", 1337);
byte[] array = Encoding.ASCII.GetBytes(this.textBox3.Text);
NetworkStream stream = tcpClient.GetStream();
stream.Write(array, 0, array.Length);
array = new byte[256];
string text = string.Empty;
int count = stream.Read(array, 0, array.Length);
text = Encoding.ASCII.GetString(array, 0, count);
if (text.Length > 10)
{
	this.label4.Text = text;
	this.button3.BackColor = Color.OldLace;
	this.textBox3.Enabled = false;
	this.button3.Enabled = false;
}
MessageBox.Show(text); 

It connects and writes the third password to the localhost TCP socket at port 1337, then receives back the reply and places it in the message box.
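For testing candidate passwords without clicking through the GUI, the same exchange can be scripted. This is a rough sketch that mirrors the C# above, assuming server.exe is already running with the DLL injected; `send_password` is a name invented for this example.

```python
import socket

def send_password(password, host="127.0.0.1", port=1337, timeout=5.0):
    """Send the password as ASCII and return the server's reply as text."""
    with socket.create_connection((host, port), timeout=timeout) as sock:
        sock.sendall(password.encode("ascii"))
        reply = sock.recv(256)  # button3_Click also reads into a 256-byte buffer
    return reply.decode("ascii", errors="replace")
```

As in the form code, a reply longer than 10 characters would indicate success.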

Server yet again

To find out how the listening socket gets set up and what gets written to it, we have to revisit server.exe, and take a look at the other thread at sub_4056F0 that was created by sub_405A80 which is called by sub_4061E0. It first creates a socket and prepares the structures to point to the socket address 127.0.0.1:1337.

Then it binds that address to the socket and listens on it

When it accepts a connection, it calls sub_406530, which first prepares the following 72-byte long stack string (in hex representation)

7FB19BA3DBB87A983EE96B2FACC4405A420F905F5CF19CAB32791BF50CCAA306C4454A4AF61D592141DAF3C7BAEFEEA32D0D82451735D334CBDCC3D7B35B5EFA673FE269EF02415A

Afterwards, it takes the submitted password, then calls CryptStringToBinaryA with CRYPT_STRING_BASE64 as the dwFlags option.

Then, it takes the decoded string (in pbBinary) and for every byte, calls GetCursorPos and then compares the x value against the decoded byte rotated

  1. left or right depending on whether the index is even or odd, and
  2. by how much depending on the y value returned

If the comparison succeeds, it sets the value of the decoded text to the original byte rotated the other way by the same amount.

Then, it calls Sleep.

Afterwards, it takes the resultant value of the decoded text as an RC4 key (using the same RC4 function sub_401200 used in level 2)

This seems arbitrary because the cursor position could really be anywhere, but remember that these three APIs were hooked by the injected DLL, so it's time to analyze how their behavior changes after being hooked.

Proxy functions

The proxy function (the function whose code is run instead of the hooked function) for CryptStringToBinaryA is sub_10002990 (from the injected DLL). It outputs some debug strings about the arguments, then calls the real CryptStringToBinaryA. Afterwards, it sets a global variable, which we will call g_counter, to 4.

sub_10002B10 is the proxy function for GetCursorPos. It doesn't actually call the real GetCursorPos function, but instead returns the x and y values as bytes from the arrays byte_1002A000 and byte_1002A020 based on the value of g_counter.

The arrays contain non-ASCII values

sub_10002B60 is the proxy function for Sleep, and it simply increments g_counter.

Taken together, this means that the processing of the decoded base64 string is not arbitrary at all: it has nothing to do with our mouse position and is in fact fully deterministic.
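The interplay of the three proxies can be modelled as a tiny state machine. The sketch below is an illustration only: the class and method names are invented, the tables are passed in by the caller, and the wrap-around indexing and the initial counter value are assumptions rather than something read from the disassembly.

```python
class HookedApis:
    """Toy model of the three proxy functions sharing g_counter."""

    def __init__(self, x_table, y_table):
        self.x_table = x_table
        self.y_table = y_table
        self.g_counter = 0  # assumed initial value; reset before use anyway

    def crypt_string_to_binary(self):
        # The real proxy forwards to CryptStringToBinaryA; only the
        # side effect on the counter is modelled here.
        self.g_counter = 4

    def get_cursor_pos(self):
        # Returns table bytes instead of the real cursor position.
        x = self.x_table[self.g_counter % len(self.x_table)]
        y = self.y_table[self.g_counter % len(self.y_table)]
        return x, y

    def sleep(self):
        # The proxy does not sleep, it just advances the counter.
        self.g_counter += 1

# Placeholder tables, not the real contents of the DLL.
hooks = HookedApis(x_table=[10, 11, 12, 13, 14, 15], y_table=[20, 21, 22, 23, 24])
hooks.crypt_string_to_binary()
first = hooks.get_cursor_pos()
hooks.sleep()
second = hooks.get_cursor_pos()
print(first, second)
```

Every "cursor position" the server sees therefore depends only on how many times Sleep has been called since the base64 decode.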

Finding our flag

Turning our attention back to the password processing function sub_406530 inside server.exe, we will assume that the comparison of the rotated byte against the x value succeeds for every byte (if it fails, the decoded byte can take any other value, which leaves us with no information). So, we will find the decoded bytes that pass all the checks using a bit of algebra.

We will call the index i, the decoded byte ciphertext[i], and the new value set plaintext[i]. For even i, if the comparison were to succeed, we have to have

rol(ciphertext[i], y_val % 8) == x_val
plaintext[i] == ror(ciphertext[i], y_val % 8)

We can rotate right both sides of the first line to get

ciphertext[i] == ror(x_val, y_val % 8)

and then substitute it in the second line to get

plaintext[i] == ror(ror(x_val, y_val % 8), y_val % 8)

and since rotating right by the same value twice is the same as rotating right by twice the value,

 plaintext[i] == ror(x_val, (y_val * 2) % 8)

For odd i, it's just rol instead of ror.
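The rotation identities used in this derivation are easy to confirm by brute force over all byte values. Here is a short sanity check, using the same 8-bit `rol` and `ror` definitions as the solver script:

```python
def rol(x, n):
    return ((x << n) | (x >> (8 - n))) & 0xFF

def ror(x, n):
    return ((x >> n) | (x << (8 - n))) & 0xFF

checked = 0
for x in range(256):
    for y_val in range(256):
        n = y_val % 8
        # Rotating one way and back is the identity, so the first
        # equation can be solved for ciphertext[i].
        assert ror(rol(x, n), n) == x
        # Rotating twice by n equals rotating once by (2 * y_val) % 8.
        assert ror(ror(x, n), n) == ror(x, (2 * y_val) % 8)
        assert rol(rol(x, n), n) == rol(x, (2 * y_val) % 8)
        checked += 1

print("identities hold for", checked, "combinations")
```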

With that, we can write the following python script to get the plaintext

x_array = [
    0x95, 0xb9, 0x63, 0x59, 0xdc, 0xb5, 0x58, 0xc6,
    0x6c, 0x5f, 0x68, 0x6f, 0x6f, 0xad, 0xdc, 0x5f,
    0x6d, 0x58, 0xda, 0x65, 0x5f, 0x58, 0xd7, 0x62,
    0x69, 0x9d, 0xd7, 0x91, 0x96, 0x99, 0x66, 0x65,
    0x9c,
]

y_array = [
    0x83, 0x1b, 0x89, 0x20, 0x37, 0x8b, 0x57, 0xc6,
    0x78, 0x74, 0x00, 0xc4, 0x48, 0x83, 0xdb, 0x7c,
    0x48, 0x49, 0x8b, 0x48, 0xf8, 0x49, 0xff, 0x24,
    0x74, 0x93, 0x53, 0x03, 0x4a, 0x03, 0xc0, 0x48,
]

def rol(x, n):
    return ((x << n) | (x >> (8 - n))) & 0xFF

def ror(x, n):
    return ((x >> n) | (x << (8 -n))) & 0xFF

ans = []
counter = 4
for i in range(33):
    x_val = x_array[counter % 33]
    y_val = y_array[counter % 32]
    if i % 2 == 0:
        plaintext = ror(x_val, (2 * y_val) % 8)
    else:
        plaintext = rol(x_val, (2 * y_val) % 8)
    ans.append(plaintext)
    counter += 1

print(bytes(ans))

which gives b'small_hooks_make_a_big_difference'.

And to get the flag, we just use it as the RC4 key to decrypt the stack string

import Crypto.Cipher.ARC4

password = b'small_hooks_make_a_big_difference'
ciphertext = bytes.fromhex("7FB19BA3DBB87A983EE96B2FACC4405A420F905F5CF19CAB32791BF50CCAA306C4454A4AF61D592141DAF3C7BAEFEEA32D0D82451735D334CBDCC3D7B35B5EFA673FE269EF02415A")

cipher = Crypto.Cipher.ARC4.new(password)
decrypted_string = cipher.decrypt(ciphertext).rstrip(b'\x00')

print(decrypted_string)

giving us flag{you_got_this_best_of_luck_in_reversing_and_beware_of_red_herrings}.

This figure summarizes what happened in this level

What was the third password

Even though we already got the flag, I want to briefly go through the possible values for the third password, for the sake of completeness.

There are actually a lot of valid values for the password here. The only condition for it to be correct is that it decodes to small_hooks_make_a_big_difference after all the rotating. This means that if we base64 encode that string itself, i.e. c21hbGxfaG9va3NfbWFrZV9hX2JpZ19kaWZmZXJlbmNl, then that would be a valid password. We could also have rotated the bytes and let the check correct them for us, so something like ua2wsWz1aPZvbZv1bbBbZV+wryaW7PqMpcxmZZOs3GOy would also work.

Note that each byte of the decoded password can either be the final byte itself (from small_hooks_make_a_big_difference) or its rotated counterpart, which gives at most 2 possibilities per byte. Since the decoded password has 33 bytes, this gives an upper bound of 2 to the power of 33 (8589934592) possible values for password #3. The true count is lower, because wherever y_val % 8 is 0 the rotation leaves the byte unchanged and the two choices coincide.
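To experiment with candidate passwords offline, the server-side check can be modelled in Python. This is a sketch under the same assumptions as the solver script above (same tables, same modulo indexing, counter starting at 4); the function names are invented and nothing here is lifted directly from server.exe.

```python
import base64

# x and y tables copied from the solver script above.
X = bytes.fromhex(
    "95b96359dcb558c6" "6c5f686f6faddc5f"
    "6d58da655f58d762" "699dd79196996665" "9c"
)
Y = bytes.fromhex(
    "831b8920378b57c6" "787400c44883db7c"
    "48498b48f849ff24" "749353034a03c048"
)
KEY = b"small_hooks_make_a_big_difference"

def rol(x, n):
    return ((x << n) | (x >> (8 - n))) & 0xFF

def ror(x, n):
    return ((x >> n) | (x << (8 - n))) & 0xFF

def schedule(length):
    """Yield (x_val, rotation, forward, backward) for each byte index."""
    counter = 4  # value set by the CryptStringToBinaryA proxy
    for i in range(length):
        forward, backward = (rol, ror) if i % 2 == 0 else (ror, rol)
        yield X[counter % len(X)], Y[counter % len(Y)] % 8, forward, backward
        counter += 1  # the Sleep proxy increments g_counter

def server_transform(decoded):
    """Model of the per-byte check applied to the base64-decoded password."""
    out = bytearray(decoded)
    for i, (x_val, n, forward, backward) in enumerate(schedule(len(out))):
        if forward(out[i], n) == x_val:
            out[i] = backward(out[i], n)
    return bytes(out)

def fully_rotated():
    """Pick, for every byte, the value that makes the comparison succeed."""
    return bytes(backward(x_val, n) for x_val, n, _, backward in schedule(len(KEY)))

plain_password = base64.b64encode(KEY).decode()
rotated_password = base64.b64encode(fully_rotated()).decode()
# Bytes where the rotated and unrotated candidates actually differ.
two_choice_bytes = sum(a != b for a, b in zip(KEY, fully_rotated()))

print(plain_password)
print(rotated_password)
print("bytes with two distinct choices:", two_choice_bytes)
```

Feeding either password through `server_transform(base64.b64decode(...))` should give back the RC4 key, and mixing rotated and unrotated bytes position by position yields the other valid variants.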

Conclusion

In this crackme, we saw techniques like

  • Steganography
  • Vectored exception handling
  • API hash obfuscation
  • "Shellcodified" PE files
  • Anti-debugging system checks
  • RC4 encryption
  • The use of pipes and network sockets for inter process communication
  • DLL injection
  • Inline hooking

While analyzing all of them wasn't strictly necessary for getting the flag, they are used in real malware so there is utility in understanding how they are used. In addition, the code wasn't obfuscated much so this is a good place to get exposed to these techniques. Kudos to hasherezade for setting up this challenge!

Flag

flag{you_got_this_best_of_luck_in_reversing_and_beware_of_red_herrings}
