Proof-of-concept EM algorithm implementation that uses prior knowledge of per-point probabilities on 2D points to train a multivariate Gaussian Mixture Model (GMM).
The known probability of each point is used as a weight during normalization in the maximization step. When the sample count is low, the square root of the probability p can be used instead of p as an optimization.
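The weighted maximization step described above can be sketched roughly as follows. This is a minimal NumPy sketch, not the code from main.py; the function and argument names (`weighted_m_step`, `prior_p`, `use_sqrt`) are illustrative:

```python
import numpy as np

def weighted_m_step(points, resp, prior_p, use_sqrt=False):
    """Sketch of an M-step where each point's known probability acts as a weight.

    points:  (n, 2) samples
    resp:    (n, k) responsibilities from the E-step
    prior_p: (n,)   known probability of each point (the prior knowledge)
    """
    w = np.sqrt(prior_p) if use_sqrt else prior_p   # optional sqrt for low sample counts
    rw = resp * w[:, None]                          # responsibilities scaled by the weights
    nk = rw.sum(axis=0)                             # effective mass per component
    means = (rw.T @ points) / nk[:, None]
    covs = []
    for k in range(resp.shape[1]):
        d = points - means[k]
        covs.append((rw[:, k, None] * d).T @ d / nk[k])
    weights = nk / nk.sum()                         # mixture weights sum to 1
    return weights, means, np.array(covs)
```

The only change from a standard M-step is the extra factor `w` in the normalization; with `prior_p` all ones it reduces to the usual update.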
The expectation step is unchanged. Test setup:
- K-Means with random point initialization
- Low maximum iteration count (default: 5)
- Low number of training points (default: 20)
- Comparison against the reference EM implementation in OpenCV
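The K-Means initialization with random points can be sketched like this. A self-contained NumPy sketch under the assumptions above (random samples as starting centers, few iterations); the name `kmeans_init` and the `seed` parameter are illustrative:

```python
import numpy as np

def kmeans_init(points, k, iters=5, seed=None):
    """Random-point K-Means: pick k samples as starting centers, refine briefly."""
    rng = np.random.default_rng(seed)
    # random point initialization: k distinct samples become the initial centers
    centers = points[rng.choice(len(points), size=k, replace=False)]
    for _ in range(iters):  # low max-iterations, matching the default setup
        # assign each point to its nearest center
        d = np.linalg.norm(points[:, None] - centers[None], axis=2)
        labels = d.argmin(axis=1)
        # move each center to the mean of its assigned points
        for j in range(k):
            mask = labels == j
            if mask.any():
                centers[j] = points[mask].mean(axis=0)
    return centers, labels
```

The resulting centers would serve as the initial component means before the first E-step.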
Run:
./main.py

Store:
./main.py --save test.json

Replay:
./main.py --load test.json

Run large test:
./compare.py
Because more information is used to approximate the incomplete data, the weighted variant gives slightly better results than the reference algorithm, especially in a sparsely sampled environment.
Keep in mind, however, that with a low iteration count the initial guess from K-Means plays a big role.
- Initial: the desired distribution that was used to sample the red dots.
- OpenCV-EM: the reference EM algorithm from OpenCV.
- Weighted-EM: the enhanced version that uses the probabilities during normalization.
At a low sampling rate some Gaussian components can become faint. Taking the square root of the probability can bring them back to the front.
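A toy illustration of why the square root helps (the probability values here are made up): sqrt compresses the dynamic range of the weights, so points belonging to faint components still contribute noticeably in the M-step.

```python
import numpy as np

p = np.array([0.02, 0.5, 1.0])  # example point probabilities (illustrative values)
w = np.sqrt(p)                  # sqrt compresses the dynamic range
# the ratio between strongest and weakest weight shrinks from 50x to ~7x
print(p.max() / p.min(), w.max() / w.min())
```

With plain p, the faint component's points are down-weighted by a factor of 50 relative to the strongest; after the square root that factor drops to about 7.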