Refactor kernel compilation #7002

PGZXB · 2022-12-28T16:16:31Z

(This is part of "Refactor kernel compilation and launch", which aims to fully separate kernel compilation from kernel launching.)

Brief Introduction

Move everything unrelated to kernel compilation out of lang::Kernel.
Use one lang::CompileConfig per compilation instead of one lang::CompileConfig per lang::Program.
Introduce lang::CompiledKernelData and lang::KernelCompiler to unify the compilation result and the kernel compilation interfaces.
Introduce lang::KernelCompilationManager to unify the cache implementation on ALL backends, covering both the online cache and the offline cache.
Reimplement the JIT in the Python frontend and remove the frontend's online cache.
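
The relationship between these pieces can be sketched in Python. This is only a toy model, not the C++ implementation: the class names mirror the ones proposed above, but every field, method name (`make_key`, `load_or_compile`), the key derivation, and the dictionary standing in for on-disk cache files are assumptions made for illustration.

```python
import hashlib
from dataclasses import dataclass
from typing import Callable, Dict


@dataclass(frozen=True)
class CompileConfig:
    # Hypothetical subset of options; one instance is passed per compilation.
    arch: str = "vulkan"
    offline_cache: bool = True


@dataclass(frozen=True)
class CompiledKernelData:
    # Backend-independent wrapper around a compilation result.
    arch: str
    payload: bytes


class KernelCompilationManager:
    """Toy manager: one cache implementation shared by every backend."""

    def __init__(self, compiler: Callable[[str, CompileConfig], CompiledKernelData]):
        self._compiler = compiler
        self._online: Dict[str, CompiledKernelData] = {}   # in-memory cache
        self._offline: Dict[str, CompiledKernelData] = {}  # stands in for files on disk
        self.compile_count = 0

    @staticmethod
    def make_key(kernel_src: str, config: CompileConfig) -> str:
        # Assumption: the key depends on the kernel source and the target arch only.
        return hashlib.sha256(f"{config.arch}\n{kernel_src}".encode()).hexdigest()

    def load_or_compile(self, kernel_src: str, config: CompileConfig) -> CompiledKernelData:
        key = self.make_key(kernel_src, config)
        if key in self._online:
            return self._online[key]
        if config.offline_cache and key in self._offline:
            data = self._offline[key]
        else:
            data = self._compiler(kernel_src, config)
            self.compile_count += 1
            if config.offline_cache:
                self._offline[key] = data
        self._online[key] = data
        return data


def fake_compile(kernel_src: str, config: CompileConfig) -> CompiledKernelData:
    # Placeholder for a backend-specific KernelCompiler.
    return CompiledKernelData(arch=config.arch, payload=kernel_src.encode())


manager = KernelCompilationManager(fake_compile)
first = manager.load_or_compile("return a + b", CompileConfig())
second = manager.load_or_compile("return a + b", CompileConfig())
other = manager.load_or_compile("return a + b", CompileConfig(arch="cpu"))
```

Because the configuration is an argument of `load_or_compile` rather than state owned by the program, the same kernel can be compiled under different configurations, and launching can consume the returned `CompiledKernelData` without knowing how it was produced.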

Prototype

An incomplete implementation of this proposal: see [refactor] Refactor kernel compilation (1st prototype, WIP, Draft, Don't merge) #6819 or PGZXB:dev-refactor-kernelCR-1st-prototype

Provide `@ti.pkernel`

NOTE: @ti.pkernel is only an experiment built on the prototype to verify the effect of this proposal. It is NOT a feature request.

@ti.pkernel = @ti.kernel + CompileConfig, which allows the user to specify different compile configuration options for each kernel.

Implementation

See PGZXB:dev-pkernel-based_on-refactorKCaL_1st_prototype

Example

import taichi as ti

ti.init(arch=ti.vulkan, offline_cache=True, print_preprocessed_ir=False)

@ti.kernel
def K1(a: ti.i32, b: ti.i32) -> ti.i32:
  return a + b


@ti.pkernel(offline_cache=False, print_preprocessed_ir=True)
def PK1(a: ti.i32, b: ti.i32) -> ti.i32:
  return a * b

# Not allowed before finishing "Refactor kernel launch"
# @ti.pkernel(arch=ti.cpu) # ti.cpu != ti.vulkan
# def PK2(a: ti.i32, b: ti.i32) -> ti.i32:
#   return a * b

K1(10, 10)
PK1(10, 10)
# PK2(10, 100)
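
The idea "@ti.pkernel = @ti.kernel + CompileConfig" can be modeled without Taichi. The sketch below is not the prototype's implementation: `pkernel`, `GLOBAL_CONFIG`, and the `compile_config` attribute are hypothetical stand-ins, and the arch check only mimics the restriction noted in the example.

```python
import functools
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class CompileConfig:
    # Hypothetical subset of the options used in the example above.
    arch: str = "vulkan"
    offline_cache: bool = True
    print_preprocessed_ir: bool = False


# Stands in for the configuration that ti.init(...) would set up.
GLOBAL_CONFIG = CompileConfig()


def pkernel(**overrides):
    """Attach a per-kernel CompileConfig: global defaults plus overrides."""

    def decorator(fn):
        # replace() raises TypeError for option names that do not exist.
        config = replace(GLOBAL_CONFIG, **overrides)
        if config.arch != GLOBAL_CONFIG.arch:
            # Mirrors the "not allowed yet" case in the example above.
            raise ValueError("per-kernel arch must match the global arch")

        @functools.wraps(fn)
        def wrapper(*args):
            # A real implementation would compile with `config` here.
            return fn(*args)

        wrapper.compile_config = config
        return wrapper

    return decorator


@pkernel(offline_cache=False, print_preprocessed_ir=True)
def pk1(a, b):
    return a * b


result = pk1(10, 10)
```

Options that are not overridden keep their global values, so `pk1.compile_config.arch` stays `"vulkan"` while the two overridden flags differ from the global configuration.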

To-Do List

Related work

Maybe start replacing pybind11 with XXXXXX, beginning with the APIs for kernel compilation and launch, since some of them will be refactored and standardized in this proposal.

…vm backends codegen (taichi-dev#7153) Issue: taichi-dev#7002 Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>

…pass::constant_fold (taichi-dev#7159) Issue: taichi-dev#7002

### Brief Summary

1. The dependencies on `Program::this_thread_config()` in `irpass::constant_fold` were removed;
2. The race condition of `Program::config` (concurrent write) was killed, so we can remove the multi-thread version `Program::this_thread_config()`. I will do it in next PR;
3. The `Program::compile(Kernel *kernel)` was refactored to `Program::compile(const CompileConfig &compile_config, Kernel *kernel)`, which is a temporary solution because I will introduce new classes to provide compilation interfaces (see taichi-dev#7002 for more information).

Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>

…taichi-dev#7199) Issue: taichi-dev#7002 Removed multi-thread version `Program::this_thread_config()` (see taichi-dev#7159 (comment)) Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>

…hi-dev#7209) Issue: taichi-dev#7002

### Brief Summary

Removed dependencies on `Program::this_thread_config()` in `lang::Function` compilation (AST->IR part)

* Push off the compilation of `lang::Function`: Introduce the `irpass::compile_called_function(IRNode *root, const CompileConfig &config)`, which compiles the AST/IR of `Function`s called in `root` to the final IR.

…ichi-dev#7243) Issue: taichi-dev#7002, taichi-dev#7159 (comment)

---------

Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>

…rt2) (taichi-dev#7253) Issue: taichi-dev#7002

---------

Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>

…ed_offline_cache_key (taichi-dev#7287) Issue: taichi-dev#7286

* taichi-dev#7286: Part of taichi-dev#7002

---------

Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>

… in KernelCodeGen (taichi-dev#7289) Issue: taichi-dev#7286

* This PR: Part of removing `KernelCodeGen::prog`
* removing `KernelCodeGen::prog`: Part of taichi-dev#7286
* taichi-dev#7286: Part of taichi-dev#7002

---------

Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>

…Graph::run (taichi-dev#7288) Issue: taichi-dev#7286

* taichi-dev#7286: Part of taichi-dev#7002

---------

Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>

…#7391) Issue: taichi-dev#7002

### Brief Summary

Disable `ASTSerializer::allow_undefined_visitor`:

1. Enabling `ASTSerializer::allow_undefined_visitor` is dangerous;
2. Prepare for introducing `KernelCompilationManager`.

…er (taichi-dev#7371) Issue: taichi-dev#7002

### Brief Summary

* Impl `spirv::CompiledKernelData`
* Introduce `lang::KernelCompiler` & Impl `spirv::KernelCompiler`

p.s. The `KernelCompiler` is not used yet. In the next PR, I will introduce `KernelCompilationManager`, which depends on `CompiledKernelData` and `KernelCompiler`.

---------

Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>

Issue: taichi-dev#7002, taichi-dev#4401

### Brief Summary

This PR:

1. Introduced `KernelCompilationManager` to unify implementation of the Offline Cache;
2. Used `KernelCompilationManager` re-impl JIT, Offline Cache on gfx backends (vulkan, metal, dx11, opengl);
3. Removed the `gfx::CacheManager`.

---------

Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>

…chi-dev#7425) Issue: taichi-dev#7002 ### Brief Summary The functions were used by old implementation of offline cache, which are unnecessary now.

Issue: taichi-dev#7002

### Brief Summary

#### Fixed bug reported by https://github.com/taichi-dev/taichi/actions/runs/4346468687/jobs/7592576659 :

```
Error in atexit._run_exitfuncs:
Traceback (most recent call last):
  File "tests/run_tests.py", line 233, in print_and_remove
    os.listdir(os.path.join(tmp_cache_file_path, subdir)))
NotADirectoryError: [Errno 20] Not a directory: '/var/folders/1r/fk_r0s1d4ss8m19rg11bq1j80000gp/T/tmpcnyh7mz1/Tbdbfefe5b5fe6db167e250d15bb1c4cc797b2beee4fa7a09be9a6cff8c060101-metal.tic'
```

#### Quick repro on master:

`python run_tests.py -v -t4 -a vulkan,cpu abs --with-offline-cache`

```
...
Error in atexit._run_exitfuncs:
Traceback (most recent call last):
  File "..\taichi\tests\run_tests.py", line 233, in print_and_remove
    os.listdir(os.path.join(tmp_cache_file_path, subdir)))
NotADirectoryError: [WinError 267] The directory name is invalid. : 'C:\\Users\\xxx\\AppData\\Local\\Temp\\tmpcs7z9cpq\\T1571a996babd2f9b7a24eeba9d8450fd767cf939b8d717941f629550302d6408-vulkan.tic'
```

#### On this branch (fixed the bug):

`python run_tests.py -v -t4 -a vulkan,cpu abs --with-offline-cache`

```
...
Summary of testing the offline cache:
Simple statistics: {'llvm': 19, '*.tic': 18}
Size of cache files: 646.04 KB
```
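
The traceback shows `os.listdir` being called on a `.tic` file that sits next to the backend subdirectories in the temporary cache directory. One way to avoid that class of error is to check each entry before listing it. The helper below is a hypothetical sketch of such a guard, not the code in `tests/run_tests.py`; the function name and the directory layout it assumes are illustrative only.

```python
import os
import tempfile


def collect_cache_stats(cache_root):
    """Count cache files without assuming every entry is a directory."""
    stats = {}
    for name in sorted(os.listdir(cache_root)):
        path = os.path.join(cache_root, name)
        if os.path.isdir(path):
            # Backend subdirectory (e.g. "llvm"): count the files inside it.
            stats[name] = len(os.listdir(path))
        elif name.endswith(".tic"):
            # Top-level cache file: never pass it to os.listdir.
            stats["*.tic"] = stats.get("*.tic", 0) + 1
    return stats


# Small demonstration on a throwaway directory with made-up file names.
with tempfile.TemporaryDirectory() as root:
    os.mkdir(os.path.join(root, "llvm"))
    open(os.path.join(root, "llvm", "kernel.ll"), "w").close()
    open(os.path.join(root, "T0000-vulkan.tic"), "w").close()
    demo_stats = collect_cache_stats(root)
```

The returned dictionary has the same shape as the "Simple statistics" line in the fixed output, with one entry per subdirectory and a single `'*.tic'` counter.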

…() (taichi-dev#7540) Issue: taichi-dev#7002 ### Brief Summary It is challenging to implement the function `CompiledKernelData::size()`, particularly for the `llvm::CompiledKernelData` class, as it is not possible to obtain the size of the `llvm::Module` without dumping it.

…LlvmAotModuleBuilder (taichi-dev#7714) Issue: taichi-dev#7002 ### Brief Summary --------- Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>

Issue: taichi-dev#7002

### Brief Summary

### 🤖 Generated by Copilot at f2ee059

Improve kernel compilation manager efficiency by avoiding redundant cache loading.

### Walkthrough

### 🤖 Generated by Copilot at f2ee059

* Add a check for existing kernel data to avoid redundant loading ([link](https://github.com/taichi-dev/taichi/pull/7741/files?diff=unified&w=0#diff-b7662dbf5bcf20f4b99048f4f2405316c0ba037c0722273522efaad72d256ef0L219-R228))

Issue: taichi-dev#7002

### Brief Summary

### 🤖 Generated by Copilot at 8d2c768

This pull request improves the error handling and debugging of the kernel compilation and caching process using the LLVM backend. It adds a `check` function to the `CompiledKernelData` class that verifies the LLVM module and tasks, and uses it to assert and report the status of the data.

### Walkthrough

### 🤖 Generated by Copilot at 8d2c768

* Add a new error enum value and message for the case when the CompiledKernelData is broken ([link](https://github.com/taichi-dev/taichi/pull/7743/files?diff=unified&w=0#diff-5e2472488f9620231c8e3d6a2c0413742e2b42424691991f4a85af81832af3f4R95), [link](https://github.com/taichi-dev/taichi/pull/7743/files?diff=unified&w=0#diff-70ec30330946b7543dcf0b458091cfc6128a5cb5460041db64144b84722de4abR184-R185))
* Implement a check function for the CompiledKernelData class that verifies the LLVM module and the tasks stored in the data using the LLVM verifier ([link](https://github.com/taichi-dev/taichi/pull/7743/files?diff=unified&w=0#diff-3986d4b2137cab0463bb950113c0ddc44cfdf3baff237da6286c42c478309c3bR3), [link](https://github.com/taichi-dev/taichi/pull/7743/files?diff=unified&w=0#diff-3986d4b2137cab0463bb950113c0ddc44cfdf3baff237da6286c42c478309c3bR30-R43), [link](https://github.com/taichi-dev/taichi/pull/7743/files?diff=unified&w=0#diff-e54a1ed2c2d35f9357e4dfbbbc1224e70a95280b43d4a59f0017dc2c6163dca0R55-R56))
* Add assertions and checks for the result of the check function after compiling or loading a kernel using the LLVM backend, and modify the debug message to include the cache filename ([link](https://github.com/taichi-dev/taichi/pull/7743/files?diff=unified&w=0#diff-b7662dbf5bcf20f4b99048f4f2405316c0ba037c0722273522efaad72d256ef0R178), [link](https://github.com/taichi-dev/taichi/pull/7743/files?diff=unified&w=0#diff-b7662dbf5bcf20f4b99048f4f2405316c0ba037c0722273522efaad72d256ef0L266-R275))

Issue: taichi-dev#7002

### Brief Summary

### 🤖 Generated by Copilot at 5923f65

Refactor compiled function management for kernels in `PyTaichi`. Use `compiled_kernels` attribute of each kernel instead of `compiled_functions` attribute of `PyTaichi`.

### Walkthrough

### 🤖 Generated by Copilot at 5923f65

* Remove the `compiled_functions` attribute from the `PyTaichi` class and use the `compiled_kernels` attribute of each kernel instead ([link](https://github.com/taichi-dev/taichi/pull/7867/files?diff=unified&w=0#diff-99744c5ae5f6a754d6f68408fdc64fb0d6097216518a7f3d1ef43ffe12599577L316))
* Update the `clear_compiled_functions` method of the `PyTaichi` class to clear the `compiled_kernels` attribute of each kernel ([link](https://github.com/taichi-dev/taichi/pull/7867/files?diff=unified&w=0#diff-99744c5ae5f6a754d6f68408fdc64fb0d6097216518a7f3d1ef43ffe12599577L339-R339))
* Update the `get_num_compiled_functions` method of the `PyTaichi` class to sum up the lengths of the `compiled_kernels` attribute of each kernel ([link](https://github.com/taichi-dev/taichi/pull/7867/files?diff=unified&w=0#diff-99744c5ae5f6a754d6f68408fdc64fb0d6097216518a7f3d1ef43ffe12599577L354-R357))

---------

Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>

taichi-gardener added this to Taichi Lang Dec 28, 2022

github-project-automation bot moved this to Untriaged in Taichi Lang Dec 28, 2022

PGZXB self-assigned this Dec 28, 2022

PGZXB added the refactor Refactor of API or codebases label Dec 28, 2022

This was referenced Dec 28, 2022

Support JIT Offline Cache for Taichi #4401

Open

[misc] Remove unnecessary CompileConfig::lazy_compilation #7009

Merged

feisuzhu moved this from Untriaged to In Progress in Taichi Lang Dec 30, 2022

PGZXB added a commit that referenced this issue Jan 5, 2023

[refactor] Remove offloaded parameter of Program::compile() (#7045)

875c14a

Issue: #7002

feisuzhu pushed a commit to feisuzhu/taichi that referenced this issue Jan 5, 2023

[refactor] Remove offloaded parameter of Program::compile() (taichi-d…

b4d8782

…ev#7045) Issue: taichi-dev#7002

PGZXB added a commit that referenced this issue Jan 5, 2023

[refactor] Remove dependencies on Program::current_ast_builder() in C…

8066f43

…++ side (#7044) Issue: #7002

PGZXB added a commit to PGZXB/taichi that referenced this issue Jan 5, 2023

[refactor] Remove dependencies on Program::current_ast_builder() in C…

50f0aa9

…++ side (taichi-dev#7044) Issue: taichi-dev#7002

PGZXB added a commit that referenced this issue Jan 5, 2023

[refactor] Remove unnecessary IRNode::kernel (#7047)

edb8afa

Issue: #7002

PGZXB added a commit that referenced this issue Jan 5, 2023

[refactor] Remove ir parameter of KernelCodeGen::KernelCodeGen(Kernel…

1a7a5ef

… *kernel, IRNode *ir) (#7046) Issue: #7002

PGZXB added a commit that referenced this issue Jan 9, 2023

[refactor] Remove unnecessary parameter of irpass::scalarize (#7087)

539d2a5

Issue: #7002

PGZXB added a commit that referenced this issue Jan 9, 2023

[refactor] Remove unnecessary Kernel::arch (#7074)

04150ac

Issue: #7002

PGZXB added a commit that referenced this issue Jan 9, 2023

[refactor] Remove Program::current_ast_builder() (#7075)

c24e6ae

Issue: #7002 Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>

PGZXB added a commit that referenced this issue Jan 10, 2023

[refactor] Move Kernel::lower() outside the taichi::lang::Kernel (#7048)

b376493

Issue: #7002

PGZXB added a commit that referenced this issue Jan 11, 2023

[refactor] Remove dependencies on Program::this_thread_config() in gf…

9aea100

…x::run_codegen (#7089) Issue: #7002

PGZXB added a commit that referenced this issue Jan 11, 2023

[refactor] Remove dependencies on Program::this_thread_config() in co…

35a0e5b

…degen_cc.cpp (#7088) Issue: #7002

quadpixels pushed a commit to quadpixels/taichi that referenced this issue May 13, 2023

[refactor] Remove dependencies on Program::this_thread_config() in so…

b9b72e8

…me tests (taichi-dev#7155) Issue: taichi-dev#7002

quadpixels pushed a commit to quadpixels/taichi that referenced this issue May 13, 2023

[refactor] Remove Kernel::offload_to_executable (taichi-dev#7210)

9b11485

Issue: taichi-dev#7002

quadpixels pushed a commit to quadpixels/taichi that referenced this issue May 13, 2023

[refactor] Introduce lang::CompiledKernelData (taichi-dev#7340)

8f51476

Issue: taichi-dev#7002

quadpixels pushed a commit to quadpixels/taichi that referenced this issue May 13, 2023

[misc] Remove AotModuleParams::enable_lazy_loading (taichi-dev#7424)

d017521

Issue: taichi-dev#7002 ### Brief Summary The member was used by old implementation of offline cache, which is unnecessary now.

quadpixels pushed a commit to quadpixels/taichi that referenced this issue May 13, 2023

[bug] Fix offline_cache::clean_offline_cache_files (ti cache clean) (t…

a2cc98b

…aichi-dev#7426) Issue: taichi-dev#7002 ### Brief Summary Support cleaning `.tic` files.

quadpixels pushed a commit to quadpixels/taichi that referenced this issue May 13, 2023

[misc] Implement KernelCompialtionManager::clean_offline_cache (taich…

885a689

…i-dev#7515) Issue: taichi-dev#7002

quadpixels pushed a commit to quadpixels/taichi that referenced this issue May 13, 2023

[refactor] Let KernelCompilationManager manage kernel compilation in …

4f54877

…gfx::AotModuleBuilderImpl (taichi-dev#7715) Issue: taichi-dev#7002, taichi-dev#6520 (comment) ### Brief Summary

quadpixels pushed a commit to quadpixels/taichi that referenced this issue May 13, 2023

[refactor] Split Program::compile() (taichi-dev#7847)

6df8ee7

Issue: taichi-dev#7002 ### Brief Summary `Program::compile()` => `Program::compile_kernel()` + `Program::launch_kernel()`

PGZXB closed this as completed Sep 14, 2023

github-project-automation bot moved this from In Progress to Done in Taichi Lang Sep 14, 2023
