From 6d6bfc94223f4b927675abe8b77006c12a03162b Mon Sep 17 00:00:00 2001
From: Vissidarte-Herman <93570324+Vissidarte-Herman@users.noreply.github.com>
Date: Tue, 20 Sep 2022 11:30:25 +0800
Subject: [PATCH] Update accelerate_pytorch.md

Minor format updates.
---
 docs/lang/articles/get-started/accelerate_pytorch.md | 8 ++++----
 1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/docs/lang/articles/get-started/accelerate_pytorch.md b/docs/lang/articles/get-started/accelerate_pytorch.md
index 0d1676103f171..2c94768a590fb 100644
--- a/docs/lang/articles/get-started/accelerate_pytorch.md
+++ b/docs/lang/articles/get-started/accelerate_pytorch.md
@@ -94,10 +94,10 @@ As the following table shows, the PyTorch kernel takes 30.392 ms[1] to complete
 
 `torch_pad()` launches 58 CUDA kernels, whilst Taichi compiles all computation into one CUDA kernel. The fewer the CUDA kernels, the less GPU launch overhead is incurred. Moreover, the Taichi kernel manages to save a lot more redundant memory operations than the PyTorch kernel. The GPU launch overhead and the redundant memory operations are the potential source for optimization and acceleration.
 
-| Kernel function | Average time (ms) | CUDA kernels launched (number) |
-| --------------- | ----------------- | ------------------------------ |
-| `torch_pad()`   | 30.392            | 58                             |
-| `ti_pad()`      | 0.267             | 1                              |
+| Kernel function  | Average time (ms)  | CUDA kernels launched (number)  |
+| :--------------- | :----------------- | :------------------------------ |
+| `torch_pad()`    | 30.392             | 58                              |
+| `ti_pad()`       | 0.267              | 1                               |
 
 > - GPU: RTX3090
 > - PyTorch version: v1.12.1; Taichi version: v1.1.0
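
For readers who want to try the comparison the table describes, the sketch below is a minimal, self-contained benchmark, not the article's exact `torch_pad()`/`ti_pad()` code. The tile size, output size, and the index-gather padding used for the PyTorch baseline are illustrative assumptions, so the printed timings will not reproduce the table's numbers. It only demonstrates the mechanism: the PyTorch version runs as a chain of separate CUDA kernels with intermediate memory traffic, while the Taichi version fuses the whole computation into one kernel.

```python
import time

import taichi as ti
import torch

ti.init(arch=ti.gpu)

# Illustrative sizes; the article's actual workload may differ.
H, W = 4096, 4096   # padded output size
th, tw = 32, 32     # tile size

tile = torch.rand(th, tw, device="cuda")
out = torch.empty(H, W, device="cuda")


def torch_pad(tile: torch.Tensor) -> torch.Tensor:
    # Periodic padding via an op chain: each arange, modulo, and
    # advanced-indexing step is at least one separate CUDA kernel
    # launch plus a full pass over memory for its temporary result.
    rows = torch.arange(H, device="cuda") % th
    cols = torch.arange(W, device="cuda") % tw
    return tile[rows][:, cols]


@ti.kernel
def ti_pad(tile: ti.types.ndarray(), out: ti.types.ndarray()):
    # The whole computation compiles into a single CUDA kernel:
    # each output element is written exactly once, no temporaries.
    for i, j in ti.ndrange(H, W):
        out[i, j] = tile[i % th, j % tw]


def bench(fn, sync, iters: int = 100) -> float:
    """Average wall-clock time per call in milliseconds."""
    fn()    # warm up (also triggers Taichi's JIT compilation)
    sync()  # drain pending GPU work before starting the clock
    t0 = time.perf_counter()
    for _ in range(iters):
        fn()
    sync()
    return (time.perf_counter() - t0) * 1000 / iters


print("torch_pad:", bench(lambda: torch_pad(tile), torch.cuda.synchronize), "ms")
print("ti_pad:   ", bench(lambda: ti_pad(tile, out), ti.sync), "ms")
```

The per-call kernel counts can be checked with a CUDA profiler, for example by running the script under Nsight Systems (`nsys profile python <script>.py`) and inspecting the kernel trace for each function.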