An alpha implementation of the bucketed top-k algorithm using a priority queue.
Requires: Python 3.11, CUDA toolkit 12.1, Ninja (`ninja-build`).

```sh
pip install git+https://github.com/graphcore-research/pytorch-approx-topk.git
```
Usage (note that kernel compilation on first use may take a while):

```python
from approx_topk.priority_queue import topk as approx_topk
import torch

x = torch.randn(128, int(2**20), device="cuda")
values, indices = approx_topk(x, k=int(2**16), dim=-1, j=2, k_mult=1)
```
Note that `j` is $k_b$, the number of elements taken from each bucket, and `k_mult` is a multiplier on the number of buckets.
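As a quick sanity check, the approximate result can be compared against `torch.topk`. The snippet below is a hypothetical sketch (the tensor shapes, variable names, and recall computation are ours, not part of the library): it measures per-row recall, i.e. the fraction of the exact top-k indices that the approximate call recovers.

```python
import torch

from approx_topk.priority_queue import topk as approx_topk

x = torch.randn(32, int(2**16), device="cuda")
k = 256

approx_values, approx_indices = approx_topk(x, k=k, dim=-1, j=2, k_mult=1)
exact_values, exact_indices = torch.topk(x, k=k, dim=-1)

# Mark which positions in each row the approximate call selected, then look up
# the exact top-k positions in that mask to get the per-row recall.
selected = torch.zeros_like(x, dtype=torch.bool)
selected.scatter_(-1, approx_indices, torch.ones_like(approx_indices, dtype=torch.bool))
recall = selected.gather(-1, exact_indices).float().mean()
print(f"mean top-k recall: {recall.item():.3f}")
```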
Repository highlights:

- `approx_topk/`: PyTorch library code
  - `priority_queue.py`: custom priority queue implementation (also `priority_queue.cu`)
  - `bucket_argmax.py`: $k_b=1$ torch & triton implementations (see the sketch after this list)
- `benchmarks/`: benchmarking scripts
  - `measure_speed.py`: main benchmarking script for measuring runtime/bandwidth as in Figure 1
- `notebooks/`: experimental results notebooks (including work-in-progress results)
  - `20240912-benchmarks-3090.ipynb`: example of visualising memory bandwidth results
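For intuition, here is a minimal pure-PyTorch sketch of the $k_b=1$ bucketed approach, assuming the reduced dimension splits evenly into k buckets. The function name and details below are ours for illustration; this is not the library's optimised Triton/CUDA code.

```python
import torch


def bucketed_topk_kb1(x: torch.Tensor, k: int) -> tuple[torch.Tensor, torch.Tensor]:
    """Hypothetical sketch: approximate top-k by taking the max of each of k buckets."""
    n = x.shape[-1]
    assert n % k == 0, "sketch assumes the dimension divides evenly into k buckets"
    bucket_size = n // k
    # View the last dimension as (k buckets, bucket_size elements per bucket).
    buckets = x.reshape(*x.shape[:-1], k, bucket_size)
    # One candidate per bucket: its maximum value and bucket-local position.
    values, local_indices = buckets.max(dim=-1)
    # Convert bucket-local positions back into indices of the original dimension.
    offsets = torch.arange(k, device=x.device) * bucket_size
    return values, local_indices + offsets


x = torch.randn(128, int(2**20), device="cuda")
values, indices = bucketed_topk_kb1(x, k=int(2**16))
```

Because each bucket contributes exactly one candidate, some of the true top-k elements can be missed when several of them land in the same bucket; that is the trade-off the bucketed approximation makes for parallelism.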
To set up the environment, install the dependencies:

- CUDA toolkit 12.1
- Ninja (`ninja-build`)
- Python 3.11
- Python Poetry

Then run `poetry install --with benchmarks`.
Copyright (c) 2024 Graphcore Ltd and Oscar Key. Licensed under the MIT License.