tests
.bumpversion.cfg
.dockerignore
.gitignore
Dockerfile
README.md
build.sh
captain.yml
distributed_kmeans.py
publish.sh
requirements-dev.txt
requirements.txt
slack.json

README.md

Python distributed k-means clustering

An implementation of distributed k-means clustering in Python. It uses Single-Shot Decentralized Lloyd.

Clustering is parametrized through the environment variable MODEL_PARAM_n_clusters. The final number of clusters also depends on the number of nodes: the total number of output clusters is floor(n_clusters * n_nodes / 2).
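The formula above can be checked with a few lines of Python. This is only an illustrative sketch: the helper names read_n_clusters and total_output_clusters, and the fallback value of 3, are assumptions made for the example and are not taken from distributed_kmeans.py.

```python
import os


def read_n_clusters(env=os.environ, default=3):
    # Hypothetical helper: reads MODEL_PARAM_n_clusters from the environment.
    # The default of 3 is an assumption made for this example only.
    return int(env.get("MODEL_PARAM_n_clusters", default))


def total_output_clusters(n_clusters, n_nodes):
    # floor(n_clusters * n_nodes / 2), as stated in the README.
    return (n_clusters * n_nodes) // 2


if __name__ == "__main__":
    n_clusters = read_n_clusters()
    for n_nodes in (1, 2, 3):
        print(n_nodes, "node(s):", total_output_clusters(n_clusters, n_nodes))
```

Note that with a single node the formula halves the requested number of clusters, so n_clusters should be chosen with the expected number of nodes in mind.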

Usage

The compute command has two modes:

compute --mode intermediate
compute --mode aggregate --job-ids 1 2 3

Intermediate mode calculates clusters on a single node. Aggregate mode merges the clusters produced by the listed intermediate jobs, choosing merges with the least merging error (e.g. smallest distance between centroids).
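To make the aggregation idea concrete, here is a minimal sketch of a greedy merge. It is not the code in distributed_kmeans.py: it assumes each intermediate job reports its clusters as (centroid, point_count) pairs, uses squared Euclidean distance between centroids as the merging error, and replaces the closest pair with their count-weighted mean until the target number of clusters is reached. The function names are hypothetical.

```python
from itertools import combinations


def squared_distance(a, b):
    # Squared Euclidean distance between two centroids (assumed merging error).
    return sum((x - y) ** 2 for x, y in zip(a, b))


def merge_pair(c1, n1, c2, n2):
    # Count-weighted mean of two centroids and their combined point count.
    total = n1 + n2
    centroid = tuple((x * n1 + y * n2) / total for x, y in zip(c1, c2))
    return centroid, total


def aggregate(clusters, n_output):
    # clusters: list of (centroid, point_count) pairs pooled from all nodes.
    if n_output < 1:
        raise ValueError("n_output must be at least 1")
    clusters = list(clusters)
    while len(clusters) > n_output:
        # Pick the pair of clusters whose centroids are closest.
        i, j = min(
            combinations(range(len(clusters)), 2),
            key=lambda p: squared_distance(clusters[p[0]][0], clusters[p[1]][0]),
        )
        merged = merge_pair(*clusters[i], *clusters[j])
        clusters = [c for k, c in enumerate(clusters) if k not in (i, j)]
        clusters.append(merged)
    return clusters


if __name__ == "__main__":
    node_1 = [((0.0, 0.0), 10), ((10.0, 10.0), 10)]
    node_2 = [((0.0, 2.0), 30), ((10.0, 12.0), 10)]
    # Two nodes with n_clusters = 2 give floor(2 * 2 / 2) = 2 output clusters.
    print(aggregate(node_1 + node_2, n_output=2))
```

Weighting by point count keeps a merged centroid close to the larger of the two clusters. A different merging error, such as one that also accounts for cluster sizes, could be swapped in through the key function.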

Build (for contributors)

Run: ./build.sh

Integration Test (for contributors)

Run: captain test

Publish (for contributors)

Run: ./publish.sh

Unit tests (for contributors)

WARNING: unit tests can fail nondeterministically with "AttributeError: can't set attribute" because of an error in the Python 3 port of Titus.

Create a symlink in python-distributed-kmeans that points to the mip_helper module from python-mip:

ln -s ~/projects/python-base-docker-images/python-mip/mip_helper/mip_helper mip_helper

Run the unit tests:

find . -name \*.pyc -delete
(cd tests; docker-compose run test_suite -x --ff --capture=no)
