A PyTorch implementation of k-means that can run on the GPU
import torch
import numpy as np
from kmeans_pytorch import kmeans
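# data: 1000 two-dimensional points drawn from a scaled normal distribution
data_size, dims, num_clusters = 1000, 2, 3
x = np.random.randn(data_size, dims) / 6
x = torch.from_numpy(x)
# kmeans: returns a cluster index per sample and the cluster centers
cluster_ids_x, cluster_centers = kmeans(
    X=x, num_clusters=num_clusters, distance='euclidean', device=torch.device('cuda:0')
)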
See example.ipynb for a more elaborate example.
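The quick-start call hard-codes `cuda:0`, which raises an error on a machine without a visible CUDA device. A minimal sketch of a fallback follows; `pick_device` is a hypothetical helper name and is not part of kmeans_pytorch.

```python
import torch


def pick_device():
    # Hypothetical helper, not part of kmeans_pytorch: use the first CUDA
    # device when PyTorch can see one, otherwise fall back to the CPU.
    if torch.cuda.is_available():
        return torch.device("cuda:0")
    return torch.device("cpu")


device = pick_device()
# Same shape and scale as the quick-start data, created directly in torch.
x = (torch.randn(1000, 2) / 6).to(device)
print(device, x.shape)
```

The resulting `device` can then be passed as the `device` argument of `kmeans` in place of `torch.device('cuda:0')`.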
- PyTorch version >= 1.0.0
- Python version >= 3.6
Install with pip:
pip install kmeans-pytorch
Installing from source
To install from source and develop locally:
git clone https://github.com/subhadarship/kmeans_pytorch
cd kmeans_pytorch
pip install --editable .
See cpu_vs_gpu.ipynb for a comparison between CPU and GPU.
- useful when clustering a large number of samples
- utilizes the GPU for faster matrix computations
- supports Euclidean and cosine distances (for now)
- This implementation closely follows the style of this
- Documentation is done using the awesome theme jekyllbook
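To illustrate the two distance options listed above, here is a rough sketch of how pairwise Euclidean and cosine distances can be written as batched tensor operations, followed by a single nearest-center assignment step. This is an illustration under stated assumptions, not the library's actual code: the function names `pairwise_euclidean`, `pairwise_cosine`, and `assign_clusters` are hypothetical.

```python
import torch


def pairwise_euclidean(x, centers):
    # Squared Euclidean distance between every sample and every center.
    # x: (n, d), centers: (k, d) -> result: (n, k)
    diff = x.unsqueeze(1) - centers.unsqueeze(0)
    return (diff ** 2).sum(dim=-1)


def pairwise_cosine(x, centers):
    # Cosine distance, defined here as 1 - cosine similarity.
    x_n = torch.nn.functional.normalize(x, dim=-1)
    c_n = torch.nn.functional.normalize(centers, dim=-1)
    return 1.0 - x_n @ c_n.T


def assign_clusters(x, centers, distance="euclidean"):
    # Hypothetical helper: index of the nearest center for each sample.
    dist_fn = {"euclidean": pairwise_euclidean, "cosine": pairwise_cosine}[distance]
    return dist_fn(x, centers).argmin(dim=1)


# Tiny hand-checkable example; the same code runs on CUDA tensors
# if x and centers are moved to a GPU device first.
x = torch.tensor([[1.0, 0.1], [0.1, 1.0], [0.9, 0.2]])
centers = torch.tensor([[1.0, 0.0], [0.0, 1.0]])
ids_euclidean = assign_clusters(x, centers, "euclidean")
ids_cosine = assign_clusters(x, centers, "cosine")
print(ids_euclidean.tolist(), ids_cosine.tolist())
```

Both distance functions produce the full sample-by-center matrix in one operation, which is the kind of computation that benefits from running on a GPU when the number of samples is large.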