Releases · epigen/unsupervised_analysis

12 Sep 14:34

sreichl

v3.0.0

239f630

v3.0.0 - Snakemake 8 compatible

Breaking change: Requires Snakemake >= v8.
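When upgrading, it can help to check the installed Snakemake version before running the workflow. The sketch below is illustrative only: `meets_minimum` is a hypothetical helper and not part of this workflow. Inside a Snakefile, Snakemake's own `snakemake.utils.min_version` is the usual way to enforce a minimum version.

```python
import re


def meets_minimum(version, minimum=(8, 0, 0)):
    """Return True if a dotted version string is at least `minimum`.

    Hypothetical helper for illustration; only the first three numeric
    components are compared, so pre-release suffixes are ignored.
    """
    numbers = [int(n) for n in re.findall(r"\d+", version)[:3]]
    numbers += [0] * (3 - len(numbers))  # pad "8" or "8.1" to three parts
    return tuple(numbers) >= tuple(minimum)


if __name__ == "__main__":
    # The version string would typically come from `snakemake --version`.
    for candidate in ("7.32.4", "8.20.3"):
        print(candidate, meets_minimum(candidate))
```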

Full Changelog: v2.0.0...v3.0.0

30 Jun 13:54

sreichl

v2.0.0

1f3993a

v2.0.0 - Performance improvements

Enhancements and new features

PCA: n_components and svd_solver can now be configured to improve performance.
Heatmap: performance improvements
- distance matrix calculation done by pdist from scipy and parallelized for observations and features
- hierarchical clustering using fastcluster
- observations can be downsampled using configuration n_observations
- top features can be selected by variability using configuration n_features
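The effect of the n_observations and n_features settings can be sketched as follows. This is a minimal standard-library illustration, not the workflow's implementation: the function name `subsample_and_select`, the use of variance as the variability measure, the random seed, and the order of the two steps are all assumptions. The distance matrix and clustering steps (pdist, fastcluster) are not shown.

```python
import random
import statistics


def subsample_and_select(data, n_observations, n_features, seed=42):
    """Downsample rows, then keep the most variable columns.

    data: list of observations, each a list of numeric feature values.
    Returns the reduced matrix plus the kept row and column indices.
    """
    rng = random.Random(seed)

    # Downsample observations only if more are present than requested.
    if n_observations < len(data):
        keep_rows = sorted(rng.sample(range(len(data)), n_observations))
    else:
        keep_rows = list(range(len(data)))
    rows = [data[i] for i in keep_rows]

    # Rank features by variance (assumed variability measure).
    n_cols = len(rows[0])
    variances = [
        statistics.pvariance([row[j] for row in rows]) for j in range(n_cols)
    ]
    ranked = sorted(range(n_cols), key=lambda j: variances[j], reverse=True)
    keep_cols = sorted(ranked[:n_features])  # preserve original column order

    reduced = [[row[j] for j in keep_cols] for row in rows]
    return reduced, keep_rows, keep_cols


if __name__ == "__main__":
    toy = [[1, 0, 5], [1, 10, 6], [1, 20, 7], [1, 30, 8]]
    reduced, rows, cols = subsample_and_select(toy, n_observations=3, n_features=2)
    print(rows, cols, reduced)
```

Reducing the matrix before computing pairwise distances matters because the distance matrix grows quadratically with the number of observations.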

The documentation was updated accordingly.

Bug fixes and other performance improvements are not mentioned.

Full Changelog: v1.1.0...v2.0.0

25 Jun 09:48

sreichl

v1.1.0

483f23f

v1.1.0 - small enhancements and bug fixes

Enhancements and new features

Additional PCA diagnostics: Visualization of the top 10 loadings per principal component using lollipop plots.
Internal cluster index calculation is now optional, because it is very compute-intensive.
Enable plotting of all features using the keyword "ALL".
Enhance Snakemake report using labels.
Switch from panels to solo plots.
Switch to data.table usage for accelerated read/write in R.

The documentation was updated accordingly.

Bug fixes and performance improvements are not mentioned.

Full Changelog: v1.0.1...v1.1.0

08 Oct 12:30

sreichl

v1.0.1

728031c

v1.0.1 - update author ORCID

Full Changelog: v1.0.0...v1.0.1

04 Oct 08:10

sreichl

v1.0.0

f273a0b

v1.0.0 - unsupervised analysis now includes cluster analysis methods

enhancements

added a config flag for 2D plot coord_fixed() option

new features

Clustering
- Leiden algorithm
- Clustification: an ML-based clustering approach that iteratively merges clusters based on misclassification
Clustree analysis and visualization
Cluster Validation
- External cluster indices are determined by comparing all clustering results with all categorical metadata
- Internal cluster indices are determined for each clustering and [metadata_of_interest]
- Multiple-criteria decision-making (MCDM) using TOPSIS for ranking clustering results by internal indices
Visualization
- all clustering results as 2D and interactive 2D & 3D plots for all available embeddings/projections.
- external cluster indices as hierarchically clustered heatmaps, aggregated in one panel.
- internal cluster indices as one heatmap with clusterings and selected metadata sorted by TOPSIS ranking from top to bottom, and cluster indices split by type (cost/benefit functions to be minimized/maximized).
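The TOPSIS ranking of clustering results by internal indices can be sketched as below. This is a generic textbook TOPSIS with vector normalization and equal weights, not necessarily what the workflow does. The function `topsis_rank`, the clustering names, and the two example criteria (one benefit index to maximize, one cost index to minimize) are hypothetical; the release notes do not say which internal indices are used.

```python
import math


def topsis_rank(scores, is_benefit, weights=None):
    """Rank alternatives with TOPSIS.

    scores: dict mapping alternative name -> list of criterion values.
    is_benefit: list of bools, True if the criterion should be maximized.
    Returns (names sorted best to worst, closeness score per name).
    """
    names = list(scores)
    n_criteria = len(is_benefit)
    weights = weights or [1.0 / n_criteria] * n_criteria

    # Vector-normalize each criterion column, then apply the weights.
    norms = [
        math.sqrt(sum(scores[n][j] ** 2 for n in names)) or 1.0
        for j in range(n_criteria)
    ]
    weighted = {
        n: [weights[j] * scores[n][j] / norms[j] for j in range(n_criteria)]
        for n in names
    }

    # Ideal and anti-ideal points depend on the criterion type.
    columns = [[weighted[n][j] for n in names] for j in range(n_criteria)]
    ideal = [max(c) if b else min(c) for c, b in zip(columns, is_benefit)]
    anti = [min(c) if b else max(c) for c, b in zip(columns, is_benefit)]

    # Relative closeness to the ideal point: 1 is best, 0 is worst.
    closeness = {}
    for n in names:
        d_best = math.dist(weighted[n], ideal)
        d_worst = math.dist(weighted[n], anti)
        total = d_best + d_worst
        closeness[n] = d_worst / total if total else 0.0

    ranking = sorted(names, key=lambda n: closeness[n], reverse=True)
    return ranking, closeness


if __name__ == "__main__":
    # Hypothetical values: [benefit-type index, cost-type index]
    example = {
        "clustering_A": [0.62, 0.80],
        "clustering_B": [0.48, 1.10],
        "metadata_X": [0.10, 2.50],
    }
    ranking, closeness = topsis_rank(example, is_benefit=[True, False])
    print(ranking)
```

Splitting the criteria into cost and benefit types is what lets a single score combine indices that point in opposite directions.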

documentation

add scRNA-seq analysis section to the documentation
update the documentation accordingly (Software, Methods, Features, Examples)
update report to include all new feature outputs
update rulegraph

Bug fixes and performance improvements are not mentioned.

Full Changelog: v0.2.0...v1.0.0

12 Oct 13:45

sreichl

v0.2.0

61e1328

v0.2.0 - enhancements, new features and a full example added

enhancements

2D metadata plots: up to 10 columns per row, coordinates are fixed on both axes, numeric color scheme blue to red with midpoint 0 in grey

new features

2D feature plots: specify features of interest whose values in the data will be highlighted in the 2D plots (motivated by bioinformatics, e.g. highlighting expression levels of marker genes)
densMAP support: local density preserving regularization as an additional dimensionality reduction method
additional PCA diagnostics:
- pairs: sequential pair-wise PCs for up to 10 PCs using scatter- and density-plots colored by metadata_of_interest
- loadings: showing the magnitude and direction of the 10 most influential features for each PC combination
interactive 2D and 3D visualizations (self-contained HTML files) of all projections and embeddings including widgets to color by categorical and numerical metadata, respectively
hierarchically clustered heatmaps of scaled data (z-score) with configured distance metrics and clustering methods (all combinations are computed), and annotated with metadata_of_interest
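The z-score scaling applied before the heatmaps can be sketched as follows. This is a minimal standard-library illustration under stated assumptions: scaling is done per feature (column), the population standard deviation is used, and constant features are mapped to 0. The function name `zscore_columns` is hypothetical and the workflow may differ on any of these points.

```python
import statistics


def zscore_columns(rows):
    """Scale each column to mean 0 and standard deviation 1.

    rows: list of observations, each a list of numeric feature values.
    Constant columns are set to 0.0 to avoid division by zero.
    """
    n_cols = len(rows[0])
    scaled_cols = []
    for j in range(n_cols):
        col = [row[j] for row in rows]
        mean = statistics.fmean(col)
        sd = statistics.pstdev(col)  # population standard deviation (assumed)
        scaled_cols.append([(v - mean) / sd if sd > 0 else 0.0 for v in col])
    # Transpose back to observation-by-feature layout.
    return [list(values) for values in zip(*scaled_cols)]


if __name__ == "__main__":
    print(zscore_columns([[1, 10], [3, 10]]))
```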

documentation

add a minimal example, using the digits dataset from sklearn, to show configuration, results, and report (.test/ folder)
update the documentation accordingly (Software, Methods, Features, Examples)
update report to include all new feature outputs (apart from interactive plots)
update rulegraph

Bug fixes and performance improvements are not mentioned.

Full Changelog: v0.1.0...v0.2.0

22 Sep 13:25

sreichl

v0.1.0

01849e4

v0.1.0 - first stable version with PCA, UMAP and 2D visualizations

skip empty metadata columns in 2D plots
