System Architecture

Derek Gerstmann edited this page Oct 27, 2023 · 3 revisions

General Overview

Halide consists of a compiler stack (packaged as a compiled C++ library), as well as a runtime system (pre-assembled as LLVM bitcode, and dynamically linked into a library at code generation time).

The compiler can generate code dynamically via "Just-in-Time" (JIT) compilation, or in advance via "Ahead-of-Time" (AOT) compilation. For stability and performance, AOT compilation is the preferred path for integrating the generated code into other projects. For tutorials, experimentation, and testing, JIT compilation requires less tooling and offers a simpler "write code and run it" experience.

Front-End

Halide provides a C++ API for creating and manipulating objects which, in turn, construct a directed acyclic graph (DAG) of functions (the function graph) that is represented internally as "Front-End IR" (FIR) (see src/IR.h). Python bindings are also provided for all public C++ API methods.

When the function graph is compiled (via the realize() or compile_to_XXX() methods), the FIR is analyzed, simplified, and then lowered into native code via several lowering stages (see src/Lower.cpp).
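The "build a graph, simplify it, then emit something lower-level" flow described above can be sketched with a toy expression tree in plain C++. This is not Halide's IR and does not use any Halide classes: `Node`, `make_const`, `make_var`, `make_binary`, `simplify`, and `emit` are hypothetical names chosen for illustration, the only simplification shown is constant folding, and the textual output merely stands in for code generation.

```cpp
#include <cassert>
#include <memory>
#include <string>

// Toy expression node. NOT Halide's IR; an illustration only.
struct Node {
    enum Kind { Const, Var, Add, Mul };
    Kind kind = Const;
    int value = 0;                // used by Const nodes
    std::string name;             // used by Var nodes
    std::shared_ptr<Node> a, b;   // operands of Add and Mul nodes
};
using Expr = std::shared_ptr<Node>;

Expr make_const(int v) {
    auto n = std::make_shared<Node>();
    n->kind = Node::Const;
    n->value = v;
    return n;
}

Expr make_var(const std::string &name) {
    auto n = std::make_shared<Node>();
    n->kind = Node::Var;
    n->name = name;
    return n;
}

Expr make_binary(Node::Kind kind, Expr a, Expr b) {
    assert(kind == Node::Add || kind == Node::Mul);
    assert(a && b);
    auto n = std::make_shared<Node>();
    n->kind = kind;
    n->a = a;
    n->b = b;
    return n;
}

// Simplification pass: fold operations whose operands are both constants,
// working bottom-up through the tree.
Expr simplify(const Expr &e) {
    if (e->kind == Node::Const || e->kind == Node::Var) return e;
    Expr a = simplify(e->a);
    Expr b = simplify(e->b);
    if (a->kind == Node::Const && b->kind == Node::Const) {
        return make_const(e->kind == Node::Add ? a->value + b->value
                                               : a->value * b->value);
    }
    return make_binary(e->kind, a, b);
}

// Emission pass: flatten the tree into text, standing in for code generation.
std::string emit(const Expr &e) {
    switch (e->kind) {
    case Node::Const: return std::to_string(e->value);
    case Node::Var:   return e->name;
    case Node::Add:   return "(" + emit(e->a) + " + " + emit(e->b) + ")";
    case Node::Mul:   return "(" + emit(e->a) + " * " + emit(e->b) + ")";
    }
    return "";
}
```

For example, building `x + (2 * 3)` with these helpers and passing it through `simplify` folds the constant product, so `emit` yields `(x + 6)` rather than `(x + (2 * 3))`.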

Back-End

Once all of the lowering stages have succeeded, Halide leverages LLVM for native code generation (see src/CodeGen_LLVM.cpp). This involves compiling the lowered statements into instructions that will execute on the host target, as well as weaving in external calls to the runtime.

Runtime System

The runtime system is composed of a very thin, driver-like interface that each backend needs to implement for handling memory allocation, native compilation, kernel execution, memory transfers, and device synchronization (see src/runtime/HalideRuntime.h). System calls for the various platforms (Posix, Windows, macOS, iOS, Android, etc.) are implemented in a similar fashion. All of these functions are pre-assembled into LLVM bitcode files (.ll) via clang, and embedded within the Halide compiler library as part of the Halide build. These bitcode modules are then linked into a target-specific runtime library during code generation (see src/LLVM_Runtime_Linker.cpp). This allows the Halide compiler to cross-compile to any supported target on any supported platform.
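As a rough illustration of what a "thin, driver-like interface" can look like, the sketch below models a backend as a table of function pointers and implements it once for plain host memory. It is a simplified stand-in, not the actual declarations in src/runtime/HalideRuntime.h: `ToyDeviceInterface`, `make_host_backend`, and `double_on_device` are hypothetical names, and the "kernel" is hard-coded to double each element.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <cstring>
#include <vector>

// Hypothetical, simplified driver table (not the real Halide runtime API).
// Each backend supplies its own implementation of these entry points.
struct ToyDeviceInterface {
    void *(*device_malloc)(std::size_t bytes);
    void (*device_free)(void *handle);
    void (*copy_to_device)(void *handle, const void *src, std::size_t bytes);
    void (*copy_to_host)(void *dst, const void *handle, std::size_t bytes);
    void (*run_kernel)(void *handle, std::size_t count);
    void (*device_sync)();
};

// A "backend" that simply uses host memory, to keep the sketch runnable.
ToyDeviceInterface make_host_backend() {
    ToyDeviceInterface d{};
    d.device_malloc = [](std::size_t bytes) -> void * { return std::malloc(bytes); };
    d.device_free = [](void *handle) { std::free(handle); };
    d.copy_to_device = [](void *handle, const void *src, std::size_t bytes) {
        std::memcpy(handle, src, bytes);
    };
    d.copy_to_host = [](void *dst, const void *handle, std::size_t bytes) {
        std::memcpy(dst, handle, bytes);
    };
    d.run_kernel = [](void *handle, std::size_t count) {
        int *data = static_cast<int *>(handle);
        for (std::size_t i = 0; i < count; ++i) data[i] *= 2;  // toy kernel
    };
    d.device_sync = []() {};  // nothing to wait for on the host
    return d;
}

// Caller code is written against the table only, so it does not change
// when a different backend is plugged in.
std::vector<int> double_on_device(const ToyDeviceInterface &dev,
                                  const std::vector<int> &input) {
    assert(!input.empty());
    const std::size_t bytes = input.size() * sizeof(int);
    void *handle = dev.device_malloc(bytes);
    assert(handle != nullptr);
    dev.copy_to_device(handle, input.data(), bytes);
    dev.run_kernel(handle, input.size());
    dev.device_sync();
    std::vector<int> output(input.size());
    dev.copy_to_host(output.data(), handle, bytes);
    dev.device_free(handle);
    return output;
}
```

The point of the indirection is that `double_on_device` never names a concrete backend; swapping in a different table would redirect allocation, transfers, and execution without touching the calling code.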

The generated runtime libraries are target-specific, but they are generated with weak symbols. For systems that support weak linking, this allows multiple runtimes to be mixed together, with the active one chosen based on the Halide target selected in the running process.
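For readers unfamiliar with weak linking, here is a minimal sketch of the mechanism itself, assuming a GCC- or Clang-compatible toolchain (the `weak` attribute is a compiler extension, not standard C++). The function `toy_runtime_name` and the helper `active_runtime` are hypothetical and are not part of the Halide runtime.

```cpp
#include <cassert>
#include <string>

// Weak default definition (GCC/Clang extension). If another object file in
// the same link provides a regular (strong) definition of toy_runtime_name,
// the linker uses that one instead and no duplicate-symbol error is raised.
extern "C" __attribute__((weak)) const char *toy_runtime_name() {
    return "default-runtime";
}

// Callers do not need to know which definition the linker picked.
std::string active_runtime() {
    const char *name = toy_runtime_name();
    assert(name != nullptr);
    return std::string(name);
}
```

Built on its own, `active_runtime()` reports the weak default. Linking in a second translation unit that defines `toy_runtime_name` without the attribute would replace it, which is the property that lets several runtime libraries coexist in one binary.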

Generators

To simplify AOT compilation, a C++ template class is provided that defines a common interface for Generators (see tutorial 15). The generator interface handles all of the input, output, and parameter declarations, as well as the algorithm and schedule definitions, in a standalone class. A pipeline registration system is used to map the template class definition to a specific generator instance. Multiple generator classes can be compiled together (with tools/RunGenMain.cpp) to create a specialized binary executable capable of generating code for any registered pipeline for any supported Halide target. There is an equivalent interface for Python as well (see README_python.md).
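The registration idea (a name mapped to a factory, so one executable can produce any registered pipeline) can be sketched in plain C++ as follows. This is a toy model of the pattern under stated assumptions, not Halide's Generator API: `ToyGenerator`, `registry`, `Registrar`, `Blur`, and `generate_by_name` are all hypothetical names, and "generating code" is reduced to returning a string.

```cpp
#include <cassert>
#include <functional>
#include <map>
#include <memory>
#include <string>
#include <utility>

// Hypothetical common interface that every toy generator implements.
struct ToyGenerator {
    virtual ~ToyGenerator() = default;
    virtual std::string generate(const std::string &target) = 0;
};

using Factory = std::function<std::unique_ptr<ToyGenerator>()>;

// Global name -> factory map. The function-local static is constructed on
// first use, so it is ready even during static initialization of registrars.
std::map<std::string, Factory> &registry() {
    static std::map<std::string, Factory> r;
    return r;
}

// Constructing a Registrar at namespace scope registers a generator
// before main() runs.
struct Registrar {
    Registrar(const std::string &name, Factory f) {
        registry()[name] = std::move(f);
    }
};

// One concrete toy generator and its registration.
struct Blur : ToyGenerator {
    std::string generate(const std::string &target) override {
        return "blur for " + target;
    }
};
static Registrar register_blur("blur", [] {
    return std::unique_ptr<ToyGenerator>(new Blur());
});

// What a driver executable would do: look up by name, instantiate, generate.
std::string generate_by_name(const std::string &name, const std::string &target) {
    auto it = registry().find(name);
    assert(it != registry().end() && "unknown generator name");
    return it->second()->generate(target);
}
```

A driver built this way needs no compile-time knowledge of individual generators: adding another one means adding another class plus one `Registrar` object, and the lookup code stays unchanged.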