Python classes for performing a weighted principal component analysis (PCA).
A class for performing a weighted principal component analysis (WPCA) using the singular value decomposition (SVD).
See this blog post for more details on how the algorithm functions.
import numpy as np
from WeightedPCA import WPCA
m, n = 8, 4
A = np.random.randint(0, 10, size=(m, n))
# Repeats (or weightings) of the rows of A
W = np.random.randint(1, 10, size=m)
wpca = WPCA().fit(A, W)
TA = np.repeat(wpca.transform(A), W, axis=0)
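To illustrate why row weights can be read as repeats, here is a minimal NumPy sketch. It is not the WPCA class itself: the helper name `weighted_pca_svd` is hypothetical, and the sketch assumes the weights enter through a weighted mean and a square-root scaling of the centred rows before the SVD. Under that assumption, the result matches an ordinary PCA computed on the explicitly repeated rows.

```python
import numpy as np

def weighted_pca_svd(A, w):
    """Hypothetical helper: weighted PCA of the rows of A via the SVD."""
    A = np.asarray(A, dtype=float)
    w = np.asarray(w, dtype=float)
    # Weighted mean of the rows
    mean = (w[:, None] * A).sum(axis=0) / w.sum()
    # Scale each centred row by sqrt(w) so that scaled.T @ scaled
    # equals the weighted scatter matrix
    scaled = np.sqrt(w)[:, None] * (A - mean)
    _, s, vt = np.linalg.svd(scaled, full_matrices=False)
    return mean, s, vt  # rows of vt are the principal axes

rng = np.random.default_rng(0)
m, n = 8, 4
A = rng.integers(0, 10, size=(m, n))
W = rng.integers(1, 10, size=m)

mean_w, s_w, vt_w = weighted_pca_svd(A, W)

# Reference: unweighted PCA on the explicitly repeated rows
R = np.repeat(A, W, axis=0).astype(float)
mean_r = R.mean(axis=0)
_, s_r, vt_r = np.linalg.svd(R - mean_r, full_matrices=False)

# Same mean and same singular values; the axes agree up to sign
print(np.allclose(mean_w, mean_r), np.allclose(s_w, s_r))
```

The weighted version only ever decomposes an m x n matrix, whereas the repeated version grows with the sum of the weights.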
A class for performing a weighted principal component analysis (WPCA) using the eigenvalue decomposition of either XX* or X*X. If not specified, the approach is chosen to be the more efficient of the two. The method supports sparse matrices and can efficiently decompose highly rectangular m x n matrices (i.e. when m << n or m >> n).
See this other blog post for more details on the two methods.
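The choice between the two decompositions can be sketched as follows. This is an illustration under stated assumptions, not the EigenPCA implementation: the function name `eigen_pca_axes` is hypothetical, the input is dense and unweighted, and "more efficient" is assumed to mean decomposing the smaller of the two square matrices.

```python
import numpy as np

def eigen_pca_axes(X, n_components):
    """Hypothetical helper: leading principal axes of X via XX* or X*X."""
    X = np.asarray(X, dtype=float)
    Xc = X - X.mean(axis=0)
    m, n = Xc.shape
    if m < n:
        # Few rows: decompose the m x m matrix Xc Xc^T
        vals, U = np.linalg.eigh(Xc @ Xc.T)
        idx = np.argsort(vals)[::-1][:n_components]
        vals, U = vals[idx], U[:, idx]
        # Recover the axes as v = Xc^T u / sigma
        # (assumes the selected eigenvalues are strictly positive)
        V = Xc.T @ U / np.sqrt(vals)
    else:
        # Few columns: decompose the n x n matrix Xc^T Xc directly
        vals, V = np.linalg.eigh(Xc.T @ Xc)
        idx = np.argsort(vals)[::-1][:n_components]
        vals, V = vals[idx], V[:, idx]
    return vals, V  # eigenvalues (descending) and axes as columns

rng = np.random.default_rng(1)
wide = rng.normal(size=(6, 40))   # m << n
tall = rng.normal(size=(40, 6))   # m >> n
vals_wide, V_wide = eigen_pca_axes(wide, 3)
vals_tall, V_tall = eigen_pca_axes(tall, 3)
print(V_wide.shape, V_tall.shape)
```

In both branches the eigenvalues are the squared singular values of the centred matrix, so the two routes give the same axes up to sign; only the size of the matrix being decomposed differs.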
from scipy.sparse import random
from EigenPCA import EigenPCA
Z = random(1000000, 1000, density=0.005, format='csc')
EP = EigenPCA(n_components=4)
X = EP.fit_transform(Z).A
Note: sparse.random for the csc format is somewhat slow and memory hungry.
%timeit X = EP.fit_transform(Z).A
1.87 s ± 40.1 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)
Further, when running the above, there is no significant increase in memory usage.