Feature/exe 1769 dsb implementation #147

ptajvar · 2024-05-31T13:59:26Z

Description

Added dsb_normalization as part of the new normalization package in the analysis stage. dsb normalization has been shown to compensate effectively for noise in expression data.

Fixes: #(issue number)

Type of change

Please delete options that are not relevant.

New feature (non-breaking change which adds functionality)

How Has This Been Tested?

It has been tested against a similar function already implemented in R, and the results are almost identical (a slight variation is expected, since the Gaussian mixture model stage is non-deterministic).
A corresponding test function has been added so that future changes to the function can be checked against the current behaviour.

PR checklist:

This comment contains a description of changes (with reason).
I have performed a self-review of my own code
My changes generate no new warnings
I have checked my code and documentation and corrected any misspellings
I have documented any significant changes to the code in CHANGELOG.md

codecov · 2024-05-31T14:46:44Z

Codecov Report

Attention: Patch coverage is 94.59459%, with 2 lines in your changes missing coverage. Please review.

Project coverage is 81.83%. Comparing base (7c69253) to head (9562e38).

Files	Patch %	Lines
src/pixelator/analysis/normalization/__init__.py	94.59%	2 Missing ⚠️

Additional details and impacted files

@@            Coverage Diff             @@
##              dev     #147      +/-   ##
==========================================
+ Coverage   81.80%   81.83%   +0.02%     
==========================================
  Files         118      119       +1     
  Lines        6568     6605      +37     
==========================================
+ Hits         5373     5405      +32     
- Misses       1195     1200       +5

johandahlberg

Excellent! I have focused on the Python aspects here. I think @maxkarlsson is more qualified to comment on the actual implementation of the method.

Very nice and clear code. I had some minor suggestions.

src/pixelator/analysis/normalization/__init__.py

johandahlberg · 2024-06-03T06:32:49Z

src/pixelator/analysis/normalization/__init__.py

+def dsb_normalize(
+    raw_expression: pd.DataFrame, isotype_controls: Union[List, None] = None
+):
+    """empty-droplet-free method as implemented in the dsb package.


I think we should add a reference to the original package here.

Mulè, M.P., Martins, A.J. & Tsang, J.S. Normalizing and denoising protein expression data from droplet-based single cell profiling. Nat Commun 13, 2099 (2022).
https://doi.org/10.1038/s41467-022-29356-8

tests/analysis/normalization/test_normalization.py

maxkarlsson

Looks great! I have some comments and a control question about the PCA step. I have a little trouble judging whether the linear regression is equivalent, mostly because I am not used to the syntax. I would suggest that we generate data with both versions of the normalization and verify that the results are nearly identical. Great and quick work!

I also made some commits with terminology changes. Please have a look and see if you agree with the changes.
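
One way to approach the equivalence check is to write the regression step in plain NumPy and compare its output with the other implementation on the same data. The sketch below is only an illustration: the helper name `regress_out`, the synthetic data, and the choice to center the confounder so that marker means are preserved are all assumptions, and it is not confirmed to match `_regress_out_confounder` or the R code.

```python
import numpy as np
import pandas as pd


def regress_out(abundance: pd.DataFrame, confounder: np.ndarray) -> pd.DataFrame:
    """Remove the linear effect of one confounder from every marker column.

    Hypothetical helper for illustration; not the PR's implementation.
    """
    # Center the confounder so that subtracting its effect keeps each marker's mean.
    centered = confounder - confounder.mean()
    # Design matrix: intercept plus the centered confounder.
    design = np.column_stack([np.ones(len(centered)), centered])
    # One least-squares fit per marker column; coef has shape (2, n_markers).
    coef, *_ = np.linalg.lstsq(design, abundance.to_numpy(), rcond=None)
    # Subtract only the confounder term, leaving the intercept in place.
    corrected = abundance.to_numpy() - np.outer(centered, coef[1])
    return pd.DataFrame(corrected, index=abundance.index, columns=abundance.columns)


# Synthetic data: two markers contaminated by a shared noise factor.
rng = np.random.default_rng(1)
noise = rng.normal(size=300)
abundance = pd.DataFrame(
    {
        "marker_a": rng.normal(size=300) + 2.0 * noise,
        "marker_b": rng.normal(size=300) - 0.5 * noise,
    }
)
corrected = regress_out(abundance, noise)
```

After the correction, each marker should be uncorrelated with the confounder, which gives a simple property to assert in addition to a direct comparison against reference output.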

src/pixelator/analysis/normalization/__init__.py

maxkarlsson · 2024-06-03T08:55:21Z

src/pixelator/analysis/normalization/__init__.py

+        current_axis = dataframe.loc[i, :] if axis == 0 else dataframe.loc[:, i]
+        gmm = gmm.fit(current_axis.to_frame())
+        background[i] = np.min(gmm.means_)
+        scores[i] = np.abs(gmm.means_[1] - gmm.means_[0]) / np.sum(gmm.covariances_)


What is done on this line? Is the score for a marker calculated as the number of standard deviations between the positive population mean and the negative population mean?

Yes! Right now the score is not used anywhere, since we do the baseline subtraction for all markers, but it could later be used to check whether a negative population exists.
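
For readers following the exchange above, here is a minimal, self-contained sketch of the idea: fit a two-component Gaussian mixture to one marker, take the lower mean as the background, and compute a separation score. It is not the PR's code. The function name `background_and_separation` and the synthetic data are hypothetical. Note that for one-dimensional input scikit-learn's `covariances_` holds variances, so this sketch takes a square root to express the score in standard-deviation units; using the mean of the two variances as a pooled value is an assumption made for illustration.

```python
import numpy as np
import pandas as pd
from sklearn.mixture import GaussianMixture


def background_and_separation(values: pd.Series, seed: int = 0) -> tuple:
    """Return (background, score) for one marker. Hypothetical helper."""
    # A fixed random_state keeps the fit reproducible on a given platform.
    gmm = GaussianMixture(n_components=2, random_state=seed)
    gmm.fit(values.to_frame())
    means = gmm.means_.ravel()
    # With the default covariance_type="full", covariances_ has shape (2, 1, 1)
    # for a single feature, i.e. one variance per component.
    variances = gmm.covariances_.ravel()
    background = float(means.min())
    # Assumption: pool the two variances by averaging, then take the square root.
    pooled_sd = float(np.sqrt(variances.mean()))
    score = float(abs(means[1] - means[0]) / pooled_sd)
    return background, score


# Synthetic marker with a negative population near 1 and a positive one near 4.
rng = np.random.default_rng(0)
marker = pd.Series(
    np.concatenate([rng.normal(1.0, 0.3, 500), rng.normal(4.0, 0.3, 500)]),
    name="marker",
)
background, score = background_and_separation(marker)
```

A low score would suggest that the two fitted components overlap heavily, which is the situation where a distinct negative population may not exist.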

maxkarlsson · 2024-06-03T08:58:03Z

src/pixelator/analysis/normalization/__init__.py

+
+def _get_background_abundance(dataframe: pd.DataFrame, axis=0):
+    """Fit a double gaussian distribution to the abundance data and return the mean 
+    of the first gaussian as an estimation of the background level."""


We also return some scores, right?

maxkarlsson · 2024-06-03T09:01:07Z

src/pixelator/analysis/normalization/__init__.py

+    log_abundance = log_abundance - marker_background
+    component_background, _ = _get_background_abundance(log_abundance, axis=0)
+
+    if isotype_controls is not None:


I think we should force the user to input isotypes, since dsb relies on those.
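
A requirement like this could be enforced with a small up-front check. The sketch below is an assumption about how such a guard might look, not the change that was eventually committed; the helper name `validate_isotype_controls`, the use of `ValueError`, and the marker names in the example are all hypothetical.

```python
from typing import List, Optional, Sequence

import pandas as pd


def validate_isotype_controls(
    abundance: pd.DataFrame, isotype_controls: Optional[Sequence[str]]
) -> List[str]:
    """Ensure at least one isotype control is given and present in the data."""
    # Reject both None and an empty sequence.
    if not isotype_controls:
        raise ValueError("At least one isotype control is required.")
    # Every named control must be a column of the abundance table.
    missing = [name for name in isotype_controls if name not in abundance.columns]
    if missing:
        raise ValueError(f"Isotype controls not found in data: {missing}")
    return list(isotype_controls)


# Hypothetical marker names, for illustration only.
example = pd.DataFrame({"CD3": [1.0, 2.0], "mIgG1": [0.5, 0.4]})
controls = validate_isotype_controls(example, ["mIgG1"])
```

Failing early with a clear message avoids silently running a variant of the method that skips the isotype-based correction.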

maxkarlsson · 2024-06-03T09:03:13Z

src/pixelator/analysis/normalization/__init__.py

+    if isotype_controls is not None:
+        control_signals = log_abundance.loc[:, isotype_controls]
+        control_signals["component_background"] = component_background
+        pheno = PCA(n_components=1).fit_transform(control_signals)


Are variables in control_signals scaled to unit variance before PCA?
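
The question matters because PCA on unscaled columns is dominated by whichever column has the largest variance. A minimal sketch of scaling before PCA, assuming scikit-learn's `StandardScaler` is an acceptable way to do it (the helper name `first_pc_of_controls` and the synthetic columns are hypothetical, and this is not the PR's code):

```python
import numpy as np
import pandas as pd
from sklearn.decomposition import PCA
from sklearn.preprocessing import StandardScaler


def first_pc_of_controls(control_signals: pd.DataFrame) -> np.ndarray:
    """Standardize each column, then return the first principal component."""
    # Zero mean and unit variance per column, so no column dominates by scale.
    scaled = StandardScaler().fit_transform(control_signals)
    return PCA(n_components=1).fit_transform(scaled).ravel()


# Synthetic controls that share one noise factor but differ widely in scale.
rng = np.random.default_rng(0)
shared = rng.normal(size=200)
control_signals = pd.DataFrame(
    {
        "isotype_a": shared + rng.normal(scale=0.1, size=200),
        "isotype_b": 100.0 * shared + rng.normal(scale=10.0, size=200),
        "component_background": 0.01 * shared + rng.normal(scale=0.001, size=200),
    }
)
noise_factor = first_pc_of_controls(control_signals)
```

The sign of a principal component is arbitrary, so any downstream comparison against a reference should be made on the absolute correlation or after aligning signs.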

maxkarlsson

Looks great!

ptajvar added 3 commits May 31, 2024 14:33

Adding the dsb_normalize function.

aa3724e

added test for dsb normalization.

d1efd86

Updated CHANGELOG.md

510f24d

ptajvar requested review from johandahlberg and maxkarlsson May 31, 2024 13:59

ptajvar self-assigned this May 31, 2024

ptajvar added 3 commits May 31, 2024 16:07

Add tolerance for test_dsb_normalize.

28dd804

Trying re-initializing the gmm at each run.

5d9bf3f

Trying to shift dsb normalization values based on a fixed component f…

90d38dd

…or test.

johandahlberg approved these changes Jun 3, 2024

ptajvar and others added 6 commits June 3, 2024 09:59

Added citation to the dsb package.

d3f81e7

Added dsb normalization test case with no isotype.

3825ad4

Simplified _get_baseline_expression implementation.

4a40848

Increased dsb_normalization test tolerances to account for different …

1ea377a

…randomizations for the same seed across platforms.

Update __init__.py

5ad5ba7

Changed terminology somewhat. expression --> abundance limma --> _regress_out_confounder baseline --> background

Update CHANGELOG.md

703a227

maxkarlsson reviewed Jun 3, 2024

maxkarlsson and others added 2 commits June 3, 2024 11:20

fix: linting

769a372

Normalizing the control signals before PCA in dsb_normalize. Forcing …

9562e38

…dsb_normalize to be called with at least one isotype.

ptajvar requested a review from maxkarlsson June 3, 2024 11:00

maxkarlsson approved these changes Jun 3, 2024

ptajvar merged commit 206c806 into dev Jun 3, 2024
15 checks passed

ptajvar deleted the feature/exe-1769-dsb-implementation branch June 3, 2024 11:19
