Getting Started
Make sure to follow the instructions in the README file to build and link against the library. Once you have your project set up, follow these instructions to start using virt86.
virt86 exposes three major components:
- Platform: the entry point of a virtualization platform's functionality. You can check if the platform was initialized successfully or not, query the platform's capabilities and manage virtual machines.
- Virtual Machine: the top-level entity of a virtualization platform. Contains one or more virtual processors, zero or more host memory areas mapped to guest physical addresses and I/O handlers.
- Virtual Processor: the main component of a virtual machine that does the actual work. Virtual processors expose their state to the user, including almost all virtual registers.
Each of these components is modeled as a class: Platform, VirtualMachine and VirtualProcessor, respectively. The sections below describe how to use them.
The virt86/virt86.hpp header is the entry point of the library. Include it in your application to gain access to all of virt86's features.
All virt86 library code resides in the virt86 namespace. Platform-specific code resides in their own namespaces under virt86, such as virt86::haxm for HAXM or virt86::kvm for KVM. From now on, this guide will omit the namespace when referring to types defined in it for brevity and clarity.
The header files are thoroughly documented. If you want to know more about a specific feature, go ahead and open them. At the top of each file you'll find an overview of its contents and, in some cases, usage examples. The declarations are also well documented, so if you need to know something about a particular struct or method, go straight to its declaration.
virt86 supports four major hypervisor platforms, but not all of them may be available on your system. To help with that, the virt86/virt86.hpp header selectively includes the available platforms and exposes a fixed-size array of factory functions named virt86::PlatformFactories that can be used to retrieve their instances. Alternatively, you can check for the presence of the preprocessor macros VIRT86_HAXM_AVAILABLE, VIRT86_HVF_AVAILABLE, VIRT86_KVM_AVAILABLE or VIRT86_WHPX_AVAILABLE, which are defined and set to 1 if the respective platform is present, and undefined if not.
Once you know which platform to use, you'll want to get its singleton instance. All platforms expose their instances through the static method Instance():
auto& haxmPlatform = virt86::haxm::HaxmPlatform::Instance();
auto& hvfPlatform = virt86::hvf::HvFPlatform::Instance();
auto& kvmPlatform = virt86::kvm::KvmPlatform::Instance();
auto& whpxPlatform = virt86::whpx::WhpxPlatform::Instance();
or you can use one of the factories:
// Use the first available platform
// (not guaranteed to be successfully initialized!)
auto& platform = virt86::PlatformFactories[0]();
Platforms cannot be copied or moved into variables: their copy and move constructors are deleted. You may, however, hold a reference such as virt86::Platform& or auto&.
This is the only place you will need to know about the underlying hypervisor platform. All other usages are abstract (to the extent allowed by leaky abstractions) and will never refer to specific features of individual platforms.
Before you can use a platform, you'll need to check if it was properly initialized by checking the result of Platform::GetInitStatus(). It can be one of the following outcomes:
- PlatformInitStatus::OK: The platform was initialized successfully and is ready to be used.
- PlatformInitStatus::Unavailable: The platform is unavailable, possibly because the driver is not installed or is stopped.
- PlatformInitStatus::Unsupported: The platform is unsupported on the host machine, likely because the host CPU lacks certain features required by the hypervisor.
- PlatformInitStatus::Failed: Initialization failed for another reason, which may include issues with the driver.
For example, HAXM may be available on all platforms, but it requires the driver to be installed and running on the system. If virt86 cannot find the driver, GetInitStatus() will return PlatformInitStatus::Unavailable.
You might be interested in checking the platform's feature set before moving on. You can do so with the Platform::GetFeatures() method, which returns a struct indicating the features supported by the platform and the host's CPU.
For more information on platforms, feel free to read the virt86/platform/platform.hpp header file.
A Platform by itself is not very useful. In order to get any virtualization work done, you'll need to create virtual machines, and Platforms provide the method for doing that.
Having a Platform in hand, it's easy to create a virtual machine. The method Platform::CreateVM() does just that. It takes a VMSpecifications struct, which describes the specifications of the virtual machine.
Within that struct (defined in virt86/vm/specs.hpp) you'll find various parameters such as the set of extended VM exits to enable or a list of custom CPUID responses. Most of them depend on platform support and may be ignored if the platform doesn't provide it.
The most important parameter is the number of processors. A virtual machine cannot be created with zero processors.
The CreateVM method returns a populated std::optional if the virtual machine was successfully created. A typical pattern for creating a virtual machine looks like this:
VMSpecifications specs = { 0 };
specs.numProcessors = 1;
// Set other specifications here as desired
auto opt_vm = platform.CreateVM(specs);
if (!opt_vm) {
    // Virtual machine creation failed
    return;
}
auto& vm = opt_vm->get();
// vm is ready for use
The virtual processors will be automatically created and initialized as part of the virtual machine's initialization process. They can be retrieved with VirtualMachine::GetVirtualProcessor(index), where index is 0-based. You'll learn more about virtual processors later in this guide. For now, let's focus on other major features provided by VirtualMachines. You can find the declaration of the VirtualMachine class in virt86/vm/vm.hpp.
Most guests require physical memory to work with. In order to provide the virtual machine with physical memory, you'll first need to allocate a page-aligned block of memory on the host (using functions like VirtualAlloc or aligned_alloc) and then map it to an arbitrary portion of the guest's physical address space using the VirtualMachine::MapGuestMemory method. You'll need to specify the base physical address and the size of the guest memory area to be mapped, as well as a pointer to the page-aligned host memory block, and a set of flags for the memory range. These flags include the typical read/write/execute permissions and a special flag to enable dirty page tracking on platforms that support it. You'll learn more about this feature later in the guide.
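As a rough sketch of the host-side allocation step, the following uses std::aligned_alloc (C++17; MSVC does not provide it, so VirtualAlloc or _aligned_malloc would take its place there). The names kPageSize, RoundUpToPage and AllocateGuestMemory are hypothetical helpers, not part of virt86, and a 4 KiB page size is assumed.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstdlib>
#include <cstring>

// Assumption: 4 KiB pages, the x86 base page size.
constexpr std::size_t kPageSize = 4096;

// Rounds a byte count up to the next multiple of the page size.
constexpr std::size_t RoundUpToPage(std::size_t size) {
    return (size + kPageSize - 1) / kPageSize * kPageSize;
}

// Hypothetical helper: allocates a zero-filled, page-aligned host block.
// Returns nullptr on failure. Release the block with std::free once it is
// no longer mapped into any guest.
inline std::uint8_t* AllocateGuestMemory(std::size_t size) {
    assert(size > 0);
    // std::aligned_alloc requires the size to be a multiple of the alignment.
    const std::size_t rounded = RoundUpToPage(size);
    void* block = std::aligned_alloc(kPageSize, rounded);
    if (block == nullptr) {
        return nullptr;
    }
    std::memset(block, 0, rounded);
    return static_cast<std::uint8_t*>(block);
}
```

The returned pointer is what you would hand to VirtualMachine::MapGuestMemory together with the guest base address, the rounded size and the flags; check the declaration in virt86/vm/vm.hpp for the exact parameter order.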
Use VirtualMachine::UnmapGuestMemory to remove a portion of physical memory from the guest. Some platforms may allow you to partially unmap existing ranges, while others will only let you unmap entire regions, and there are platforms that don't support unmapping at all.
You can also modify the memory protection flags of a given range using VirtualMachine::SetGuestMemoryFlags. Again, this is an optional operation, and platforms that do support it may only allow you to modify entire ranges.
Guest physical memory can be managed at any time as long as no virtual processors are running. Attempting to manipulate physical memory while a virtual processor is running results in undefined behavior, which typically means a Blue Screen of Death or a kernel panic at best.
Always check the Platform's features and the return code of these methods to be sure the operation worked.
Now that you have enough tools under your belt, it's time to move on and begin the real work: running virtual processors.
As explained earlier, a virtual machine is built with at least one virtual processor. In order to retrieve it, you should follow the same approach as creating a virtual machine:
auto opt_vp = vm.GetVirtualProcessor(index);
if (!opt_vp) {
    // Virtual processor index is out of bounds
    return;
}
auto& vp = opt_vp->get();
Since you already know the number of virtual processors when you created the VM, you can skip the check entirely as long as you know your index is not out of bounds:
auto& vp = vm.GetVirtualProcessor(index)->get();
A virtual processor will be initialized to the standard 16-bit real mode reset vector, with CS:IP pointing to F000:FFF0, which corresponds to the physical memory address FFFFFFF0 (at reset, the hidden CS base is FFFF0000 rather than the usual selector times 16). In a typical machine, this is where the BIOS ROM resides. If you have already set up some code in that area, you can tell your virtual CPU to run by invoking VirtualProcessor::Run(). The method will return on a VM exit or error, but the return code won't tell you what the exit reason was. This information is available through VirtualProcessor::GetVMExitInfo().
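To put code at the reset vector, a ROM image is typically mapped so that it ends exactly at the 4 GiB boundary. The sketch below only does the address arithmetic and builds a tiny test image; RomBaseAddress, ResetVectorOffset and MakeHaltRom are hypothetical names, and the image would still have to be copied into a page-aligned host block before being mapped.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

constexpr std::uint64_t kFourGiB = 0x100000000ULL;
constexpr std::uint64_t kResetVector = 0xFFFFFFF0ULL;

// Guest physical base address at which a ROM of romSize bytes must be mapped
// so that its last byte sits just below the 4 GiB boundary.
constexpr std::uint64_t RomBaseAddress(std::uint64_t romSize) {
    return kFourGiB - romSize;
}

// Offset of the reset vector inside such a ROM image: 16 bytes from the end.
constexpr std::size_t ResetVectorOffset(std::size_t romSize) {
    return romSize - 16;
}

// Builds a ROM image filled with NOP (0x90) and a single HLT (0xF4) at the
// reset vector, so the first Run() should stop almost immediately.
inline std::vector<std::uint8_t> MakeHaltRom(std::size_t romSize) {
    assert(romSize >= 16 && romSize % 4096 == 0);
    std::vector<std::uint8_t> rom(romSize, 0x90);
    rom[ResetVectorOffset(romSize)] = 0xF4;
    return rom;
}
```

For a 64 KiB ROM this gives a base address of FFFF0000, and the HLT lands at FFFFFFF0, the address the virtual processor starts from.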
There are several possible VM exit reasons, some of which must be explicitly enabled in the VMSpecifications in order to be used. They are defined in the enum class VMExitReason:
- Normal: An uneventful VM exit, usually due to time slice expiration.
- Cancelled: Only used with WHPX; indicates that CPU execution was cancelled to inject an interrupt. You can treat this like a normal VM exit. It is used internally for testing purposes.
- Interrupt: Indicates that an interrupt window has opened. Again, this can be treated like a normal VM exit. Like Cancelled, it's used for tests.
- PIO: The VM exited to handle an IN or OUT instruction. I/O is handled by callbacks registered with the VirtualMachine, so this VM exit code is again purely informational.
- MMIO: Like PIO, but for memory accesses. When the guest code attempts to access unmapped memory, the VM will exit with this reason. MMIO handlers will take care of the actual work.
- Step: Single stepping completed successfully. You'll learn more about this VM exit in the guest debugging section below.
- SoftwareBreakpoint: A software breakpoint was hit. This is also related to guest debugging.
- HardwareBreakpoint: A hardware breakpoint was hit. This, too, is used with guest debugging.
- HLT: The virtual CPU executed the HLT instruction. The instruction pointer will be located at the instruction following HLT.
- CPUID: The virtual CPU executed the CPUID instruction. This must be explicitly enabled in the virtual machine's specifications by setting the ExtendedVMExit::CPUID flag in extendedVMExits.
- MSRAccess: The virtual CPU is reading or writing an MSR. Enabled by the ExtendedVMExit::MSRAccess flag.
- Exception: The virtual CPU raised an exception. Enabled by the ExtendedVMExit::Exception flag. The set of exceptions that may be captured is specified in VMSpecifications::exceptionExits. Some platforms may allow users to capture individual exceptions, while others are all-or-nothing.
- Shutdown: The VM is shutting down. Only a few hypervisors return this code, so don't rely on it for detecting a system shutdown. The HLT exit code is more reliable and is supported by all hypervisors.
- Error: An unrecoverable error occurred within the virtual machine.
- Unhandled: The VM exit reason returned by the hypervisor is unknown to the platform adapter. This may happen if the user installs a newer driver which provides new VM exit codes not handled by virt86.
A reasonable design for a program that runs multiple virtual processors in a virtual machine uses one thread per virtual processor, each pinned to its own host processor and running a loop that repeatedly executes Run() and checks the VM exit reason. In code:
void Emulator::VPThreadFunction() {
    // m_running can be set to false to stop the thread
    // m_vp is the VirtualProcessor reference
    while (m_running) {
        auto result = m_vp.Run();
        if (result != VPExecutionStatus::OK) {
            // The virtual processor failed to execute; send a signal to stop the entire emulator and exit thread
            StopEmulator();
            return;
        }
        auto& exitInfo = m_vp.GetVMExitInfo();
        switch (exitInfo.reason) {
        case VMExitReason::HLT:
            // Handle HLT
            break;
        case VMExitReason::Error:
            // The VM crashed; send a signal to stop the emulator and exit thread
            StopEmulator();
            return;
        // ... and so on
        }
    }
}
The virt86/vp/vp.hpp header file specifies the VirtualProcessor class. Now, let's move on to other features provided by virtual processors and virtual machines.
One of the most powerful features of a hypervisor is the ability to manipulate the virtual processor's registers. virt86's VirtualProcessor class has a set of methods for reading and writing registers: RegRead, RegWrite and RegCopy. The first two are self-explanatory, while the third copies the value from one register to another. All three methods come in two variants: one for individual registers, one for bulk operations.
Some platforms provide mechanisms to optimize bulk reads and writes; virt86 takes advantage of them when available. Other platforms may only provide access to the entire CPU register state, in which case virt86 caches the entire state, refreshes it only when registers are accessed through one of these methods, and writes it back to the virtual processor only if a register was modified.
Be aware that if you leave the registers in an invalid state, the virtual processor will fail to run and usually exit with VMExitReason::Error.
The set of registers supported by virt86 is defined by the enum class Reg and the register value is stored in a RegValue (both defined in virt86/vp/regs.hpp). The latter is a union of various types that match the register value structures. In order to read the GDTR, for example, you'll pass Reg::GDTR and a reference to a RegValue object where the value will be stored to VirtualProcessor::RegRead. The GDTR is a table register, so the value will be stored in the table field of the union, which contains the base and limit fields.
The VirtualProcessor class also provides access to the x87 control registers through the Get/SetFPUControl methods, and the MXCSR control and status register through the Get/SetMXCSR pair, as well as the MXCSR_MASK through similarly named methods. The corresponding data structures are defined in virt86/vp/fpregs.hpp.
Likewise, MSRs can be manipulated through the Get/SetMSR pair for individual registers, and Get/SetMSRs for bulk access.
You may wish to know the current execution and paging modes of the virtual CPU. The method GetExecutionMode returns a CPUExecutionMode indicating if the CPU is currently in real-address, virtual-8086, protected or IA-32e mode, based on the current state of the CR0.PE, RFLAGS.VM and EFER.LMA bits. Similarly, you can determine the current paging mode with GetPagingMode, which returns a CPUPagingMode based on the CR0.PG, CR4.PAE and EFER.LME bits.
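To make the bit tests concrete, here is a minimal sketch of how such a classification can be derived from the raw register values. It is not virt86's implementation: the enums ExecMode and PagingMode and both functions are hypothetical stand-ins, and virt86's own CPUExecutionMode and CPUPagingMode may use different enumerator names. The bit positions are the architectural ones.

```cpp
#include <cstdint>

// Hypothetical stand-ins for virt86's CPUExecutionMode and CPUPagingMode.
enum class ExecMode { RealAddress, Virtual8086, Protected, IA32e };
enum class PagingMode { None, ThirtyTwoBit, PAE, FourLevel };

// Architectural bit positions.
constexpr std::uint64_t CR0_PE = 1ULL << 0;
constexpr std::uint64_t CR0_PG = 1ULL << 31;
constexpr std::uint64_t CR4_PAE = 1ULL << 5;
constexpr std::uint64_t EFER_LME = 1ULL << 8;
constexpr std::uint64_t EFER_LMA = 1ULL << 10;
constexpr std::uint64_t RFLAGS_VM = 1ULL << 17;

constexpr ExecMode ClassifyExecutionMode(std::uint64_t cr0, std::uint64_t rflags,
                                         std::uint64_t efer) {
    if ((cr0 & CR0_PE) == 0) return ExecMode::RealAddress;
    if ((efer & EFER_LMA) != 0) return ExecMode::IA32e;
    if ((rflags & RFLAGS_VM) != 0) return ExecMode::Virtual8086;
    return ExecMode::Protected;
}

constexpr PagingMode ClassifyPagingMode(std::uint64_t cr0, std::uint64_t cr4,
                                        std::uint64_t efer) {
    if ((cr0 & CR0_PG) == 0) return PagingMode::None;
    if ((cr4 & CR4_PAE) == 0) return PagingMode::ThirtyTwoBit;
    if ((efer & EFER_LME) == 0) return PagingMode::PAE;
    return PagingMode::FourLevel;
}

// Compile-time checks of two corner cases.
static_assert(ClassifyExecutionMode(0, 0, 0) == ExecMode::RealAddress);
static_assert(ClassifyPagingMode(CR0_PG, CR4_PAE, EFER_LME) == PagingMode::FourLevel);
```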
GDT and IDT entries can be manipulated through the Get/SetGDTEntry and Get/SetIDTEntry methods. These methods will access the guest's physical memory and take into account the values of the GDTR and IDTR registers in order to retrieve the corresponding entries. You may also use ReadSegment to read the parameters of a segment based on its selector and the current state of the guest.
These methods are provided mostly for convenience and testing purposes, since the guest will not appreciate having their state modified externally.
Physical memory can be directly accessed through the buffer pointers provided to the virtual machine's memory mapping method, but the virtual machine object also provides a pair of methods to read and write to physical memory using the guest's address mappings, which eliminates the need to do pointer arithmetic on the buffers: MemRead and MemWrite.
The virtual machine object can also be used to read and write to guest linear memory with LMemRead and LMemWrite. Those methods take into account the current CPU paging mode and the relevant data structures in memory as well as the value of the CR3 register if paging is enabled. They support all paging modes available so far:
- No paging (when CR0.PG = 0), in which case linear addresses are translated directly to physical addresses;
- 32-bit paging (when CR0.PG = 1 and CR4.PAE = 0), where address translation goes through at most two levels: PDEs and PTEs;
- PAE paging (when CR0.PG = 1, CR4.PAE = 1 and EFER.LME = 0), where address translation goes through at most three levels: PDPTEs, PDEs and PTEs; and
- 4-level paging (when CR0.PG = 1, CR4.PAE = 1 and EFER.LME = 1), where address translation goes through at most four levels: PML4Es, PDPTEs, PDEs and PTEs.
You can also translate a linear address to a physical address using LinearToPhysical. This is, in fact, the same method used by the linear memory read and write methods to compute physical addresses.
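The first step of any such translation is splitting the linear address into table indices. The sketch below shows that split for 32-bit and 4-level paging, assuming 4 KiB pages only (large pages are ignored); the struct and function names are hypothetical and this is not how virt86 necessarily structures its own walk.

```cpp
#include <cstdint>

// Hypothetical result types for the index split (4 KiB pages assumed).
struct Paging32Indices {
    std::uint32_t pde;     // bits 31:22, index into the page directory
    std::uint32_t pte;     // bits 21:12, index into the page table
    std::uint32_t offset;  // bits 11:0, byte offset within the page
};

struct Paging4LevelIndices {
    std::uint32_t pml4e;   // bits 47:39
    std::uint32_t pdpte;   // bits 38:30
    std::uint32_t pde;     // bits 29:21
    std::uint32_t pte;     // bits 20:12
    std::uint32_t offset;  // bits 11:0
};

constexpr Paging32Indices Split32(std::uint32_t linear) {
    return {(linear >> 22) & 0x3FFu, (linear >> 12) & 0x3FFu, linear & 0xFFFu};
}

constexpr Paging4LevelIndices Split4Level(std::uint64_t linear) {
    return {static_cast<std::uint32_t>((linear >> 39) & 0x1FF),
            static_cast<std::uint32_t>((linear >> 30) & 0x1FF),
            static_cast<std::uint32_t>((linear >> 21) & 0x1FF),
            static_cast<std::uint32_t>((linear >> 12) & 0x1FF),
            static_cast<std::uint32_t>(linear & 0xFFF)};
}

// Example: linear address C0101234 under 32-bit paging.
static_assert(Split32(0xC0101234u).pde == 0x300);
static_assert(Split32(0xC0101234u).pte == 0x101);
static_assert(Split32(0xC0101234u).offset == 0x234);
```

Each index selects an entry in the table found at the previous level, starting from the table that CR3 points to; the entry's physical frame address plus the next index (or the final offset) gives the next address to read.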
In order to handle I/O and MMIO operations, the virtual machine exposes a set of methods to register I/O callbacks: RegisterIORead/WriteCallback and RegisterMMIORead/WriteCallback. The callbacks are invoked automatically when an I/O or MMIO operation is executed by a virtual processor. If a nullptr callback is specified, a default no-op handler is used instead.
You may also provide a context pointer through RegisterIOContext, which is passed down to registered callback handlers. The context pointer is usually a pointer to an instance of a class that your application uses to handle I/O, or to a context struct if your design is more C-like.
The callback functions are defined as follows:
using IOReadFunc_t = uint32_t(*)(void *context, uint16_t port, size_t size);
using IOWriteFunc_t = void(*)(void *context, uint16_t port, size_t size, uint32_t value);
using MMIOReadFunc_t = uint64_t(*)(void *context, uint64_t address, size_t size);
using MMIOWriteFunc_t = void(*)(void *context, uint64_t address, size_t size, uint64_t value);
While you can certainly register I/O callbacks at any time, a well-designed application will only set the handlers once at virtual machine creation time and manage I/O from there.
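As an illustration of the context pointer pattern, here is a small sketch of a pair of port I/O handlers matching the callback types above. The DebugConsole device, its port number 0xE9 and the "unclaimed ports read as all ones" convention are assumptions made for the example, not virt86 behavior.

```cpp
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string>

// Callback types as declared in this guide.
using IOReadFunc_t = uint32_t(*)(void *context, uint16_t port, size_t size);
using IOWriteFunc_t = void(*)(void *context, uint16_t port, size_t size, uint32_t value);

// Hypothetical device: a write-only debug console that collects bytes
// written to port 0xE9.
struct DebugConsole {
    static constexpr uint16_t kPort = 0xE9;
    std::string output;
};

// Read handler: no port is claimed, so every read returns all ones for the
// requested access size (an assumed convention for this sketch).
uint32_t HandleIORead(void *context, uint16_t port, size_t size) {
    (void)context;
    (void)port;
    assert(size == 1 || size == 2 || size == 4);
    return size == 4 ? 0xFFFFFFFFu : (1u << (size * 8)) - 1;
}

// Write handler: recovers the device from the context pointer and appends
// single-byte writes to the console port; other writes are ignored.
void HandleIOWrite(void *context, uint16_t port, size_t size, uint32_t value) {
    auto *console = static_cast<DebugConsole *>(context);
    if (port == DebugConsole::kPort && size == 1) {
        console->output.push_back(static_cast<char>(value & 0xFF));
    }
}

// Plain functions convert to the callback types without casts.
constexpr IOReadFunc_t kReadHandler = HandleIORead;
constexpr IOWriteFunc_t kWriteHandler = HandleIOWrite;
```

In an application, the two handlers would be passed to the I/O read and write callback registration methods and a pointer to a DebugConsole instance to RegisterIOContext; the instance must outlive the virtual machine.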
One optional feature that can provide a boost to performance for certain applications is the ability to retrieve the dirty page bitmap of a portion of physical memory. The dirty page bitmap consists of a bitfield where each bit represents a single 4 KiB page and, when set, indicates that data in that particular page has been modified. This helps applications locate changes in memory very quickly when they need to synchronize memory areas from system memory to devices, particularly when dealing with systems that have a Unified Memory Architecture (UMA), where the guest's physical memory is shared with the GPU (a common feature of modern video game consoles).
The VirtualMachine class contains two methods for managing the dirty page bitmap: QueryDirtyPages and ClearDirtyPages. The first is used to retrieve and clear the dirty bitmap of a given region of memory. The second clears the region without retrieving the bitmap. The bitmap array must be large enough to contain all the bits of the specified physical memory range. The bitmapSize parameter is given in uint64_t units, that is, the size of the uint64_t array that will contain the bitmap.
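Sizing the bitmap and walking it afterwards is simple arithmetic, sketched below. DirtyBitmapWords and DirtyPageAddresses are hypothetical helpers, and the bit order (bit N of word W describes page W * 64 + N, least significant bit first) is an assumption that should be checked against the header documentation.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

constexpr std::uint64_t kPageSize = 4096;  // one bit per 4 KiB page

// Number of uint64_t words needed to hold one bit per page of the region.
constexpr std::size_t DirtyBitmapWords(std::uint64_t regionSize) {
    const std::uint64_t pages = (regionSize + kPageSize - 1) / kPageSize;
    return static_cast<std::size_t>((pages + 63) / 64);
}

// Returns the guest physical addresses of all pages whose bit is set.
// Assumption: bit N of word W describes page W * 64 + N from baseAddress.
inline std::vector<std::uint64_t> DirtyPageAddresses(
        const std::vector<std::uint64_t>& bitmap, std::uint64_t baseAddress) {
    assert(baseAddress % kPageSize == 0);
    std::vector<std::uint64_t> addresses;
    for (std::size_t word = 0; word < bitmap.size(); ++word) {
        std::uint64_t bits = bitmap[word];
        // Shift the word right until no set bits remain.
        for (unsigned bit = 0; bits != 0; ++bit, bits >>= 1) {
            if ((bits & 1) != 0) {
                addresses.push_back(baseAddress + (word * 64 + bit) * kPageSize);
            }
        }
    }
    return addresses;
}
```

A 256 MiB region, for instance, spans 65536 pages and therefore needs a 1024-word array.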
Note that some platforms may not allow you to query the dirty bitmap of a smaller portion of a mapped GPA range. This is denoted by the feature partialDirtyBitmap being false. In that case, you will need to specify the full mapped GPA range for the query.
The final set of features we're going to cover in this guide is guest debugging. When a platform supports guest debugging, the following operations are available on a VirtualProcessor:
- Single stepping
- Enabling and disabling software breakpoints (INT 3)
- Setting and clearing hardware breakpoints (debug registers)
Single stepping is the simplest of the guest debugging features. All you need to do is invoke Step() instead of Run() on your virtual processor. The CPU will run only one instruction and the VM exit reason will be VMExitReason::Step, unless some other event occurred.
Software breakpoints can be enabled by invoking EnableSoftwareBreakpoints(true) on a virtual processor. If the guest code attempts to execute the INT 3 instruction while software breakpoints are enabled, the VM will exit with VMExitReason::SoftwareBreakpoint and the instruction pointer will not move until the breakpoint is removed or software breakpoints are disabled. The guest will never have a chance to handle INT 3 instructions while software breakpoints are enabled.
A debugger can use this feature to implement software breakpoints by first making a backup of a particular byte in guest memory, then writing the byte 0xCC to that location (corresponding to the INT 3 instruction) and enabling software breakpoints. After running the virtual processor, if the VM exit reason indicates that a software breakpoint was hit, the debugger can determine if the instruction pointer points to the location of one of its breakpoints. If it does, when the user decides to continue execution of the guest program, the debugger will write the original byte back, run a single step, write the INT 3 instruction again and continue with normal execution (unless another software breakpoint was hit). If it does not match any of the breakpoints set by the debugger, it can temporarily disable software breakpoints, step, then reenable breakpoints to give the guest a chance to handle the INT 3 instruction. When the user decides to remove the breakpoint, the byte can be reverted permanently.
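The bookkeeping described above can be sketched as a small table of saved bytes. BreakpointTable is a hypothetical class, and a plain byte vector stands in for guest memory; a real debugger would read and write guest memory through the virtual machine's memory access methods and pass a callable that invokes Step() on the virtual processor.

```cpp
#include <cassert>
#include <cstdint>
#include <map>
#include <vector>

constexpr std::uint8_t kInt3 = 0xCC;  // opcode of the INT 3 instruction

// Hypothetical debugger-side bookkeeping for software breakpoints.
class BreakpointTable {
public:
    explicit BreakpointTable(std::vector<std::uint8_t>& memory) : m_memory(memory) {}

    // Saves the original byte and plants INT 3 at the address.
    void Set(std::uint64_t address) {
        assert(address < m_memory.size());
        if (m_saved.count(address) != 0) return;  // already set
        m_saved[address] = m_memory[address];
        m_memory[address] = kInt3;
    }

    // Restores the original byte permanently.
    void Remove(std::uint64_t address) {
        auto it = m_saved.find(address);
        if (it == m_saved.end()) return;
        m_memory[address] = it->second;
        m_saved.erase(it);
    }

    // True if the INT 3 at this address was planted by the debugger.
    bool Owns(std::uint64_t address) const { return m_saved.count(address) != 0; }

    // Resumes over an owned breakpoint: restore the byte, let the caller
    // run exactly one instruction, then plant INT 3 again.
    template <typename StepFn>
    void StepOver(std::uint64_t address, StepFn&& singleStep) {
        assert(Owns(address));
        m_memory[address] = m_saved.at(address);
        singleStep();
        m_memory[address] = kInt3;
    }

private:
    std::vector<std::uint8_t>& m_memory;
    std::map<std::uint64_t, std::uint8_t> m_saved;
};
```

When the exit reason is a software breakpoint and Owns() returns false for the reported address, the INT 3 belongs to the guest, which is the case where the debugger temporarily disables software breakpoints and steps.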
It is also possible to use hardware breakpoints via the debug registers. The VirtualProcessor class offers a method for conveniently setting up hardware breakpoints: SetHardwareBreakpoints. It takes a structure containing the specifications for all four possible hardware breakpoints and takes care of mapping its values into the corresponding DR# registers. When the virtual processor runs and hits one such breakpoint, the VM will exit with VMExitReason::HardwareBreakpoint. Once you're done working with hardware breakpoints, invoke the VirtualProcessor::ClearHardwareBreakpoints() method to reset the debug registers.
For both software and hardware breakpoints, you can retrieve the last breakpoint address hit with the GetBreakpointAddress method. If a breakpoint was hit as of the most recent virtual processor execution, the method returns OK and fills in the address in the specified pointer; otherwise it returns VPOperationStatus::BreakpointNeverHit.
This covers the most important features of virt86 and should give you a firm grasp on how to use the library.
Have fun using virt86!