This repository provides Python implementations for Non-negative Matrix Factorization (NMF) using the Multiplicative Update (MU) algorithm. Two initialization methods are supported: random initialization and Non-negative Double Singular Value Decomposition (NNDSVD). NMF is a matrix factorization technique used in various fields, including topic modeling, collaborative filtering, and dimensionality reduction.
Non-negative Matrix Factorization (NMF) is a family of linear algebra algorithms used for identifying the latent structure within data represented as a non-negative matrix. For a comprehensive overview, check out this YouTube tutorial.
Given an input matrix ( A ) and a rank ( k ), NMF approximates ( A ) as the product of two non-negative matrices ( W ) and ( H ), where ( W ) and ( H ) are ( k )-dimensional factors.
Objective Function: The goal is to minimize the following function:
There are several approaches to solving this minimization problem. This implementation uses the Multiplicative Update Method, introduced by Lee and Seung in 1999. For more details, refer to this paper.
The Multiplicative Update is an iterative method and can be sensitive to the initializations of ( W ) and ( H ). To improve convergence, this repository includes the NNDSVD Method, an SVD-based initialization introduced by C. Boutsidis and E. Gallopoulos in 2007. More details can be found in this paper.
- Random Initialization: Initializes the factor matrices ( W ) and ( H ) with random values.
- NNDSVD Initialization: Provides a smarter initialization using Singular Value Decomposition (SVD) to improve convergence.
- Multiplicative Update Algorithm: Performs iterative updates to minimize the Frobenius norm between the input matrix ( A ) and the product of ( W ) and ( H ).
- Configurable Parameters: Allows specifying the rank of factorization, maximum iterations, and initialization mode.
- Track Convergence: Returns the Frobenius norm at each iteration to monitor convergence.
-
random_initialization(A, rank)
: Randomly initializes matrices ( W ) and ( H ). -
nndsvd_initialization(A, rank)
: Initializes ( W ) and ( H ) using NNDSVD for improved factorization. -
multiplicative_update(A, k, max_iter, init_mode='random')
: Performs the NMF algorithm with multiplicative updates and tracks the convergence.
- NumPy: The code relies on NumPy for matrix operations.
- Topic Modeling: Decompose a term-document matrix to identify topics in text.
- Collaborative Filtering: Factorize a user-item matrix for recommendation systems.
- Dimensionality Reduction: Reduce the dimensionality of large datasets while preserving non-negativity.
Feel free to explore and modify the code to suit your NMF needs!
To get started with the NMF implementation, you can use the following code examples:
This example demonstrates how to use the random initialization method for NMF:
import numpy as np
from nmf_module import random_initialization, multiplicative_update
# Generate a non-negative input matrix A
A = np.random.rand(10, 10)
# Define the rank for factorization
rank = 3
# Initialize W and H randomly
W_rand, H_rand = random_initialization(A, rank)
# Perform NMF using Multiplicative Update with random initialization
W_mu, H_mu, norms = multiplicative_update(A, rank, max_iter=100, init_mode='random')
print("Randomly Initialized W:\n", W_mu)
print("Randomly Initialized H:\n", H_mu)
print("Convergence Norms:\n", norms)
This example demonstrates how to perform NNDSVD initialization and run the multiplicative update for NMF:
import numpy as np
from nmf_module import nndsvd_initialization, multiplicative_update
# Generate a non-negative input matrix A
A = np.random.rand(10, 10)
# Define the rank for factorization
rank = 3
# Initialize W and H using NNDSVD
W_nndsvd, H_nndsvd = nndsvd_initialization(A, rank)
# Perform NMF using Multiplicative Update with NNDSVD initialization
W_mu, H_mu, norms = multiplicative_update(A, rank, max_iter=100, init_mode='nndsvd')
# Display the results
print("NNDSVD Initialized W:\n", W_mu)
print("NNDSVD Initialized H:\n", H_mu)
print("Convergence Norms:\n", norms)