Validation and refactoring of differential expression analysis #388

Open
wants to merge 41 commits into
base: development

JuliaS92
Collaborator

@JuliaS92 JuliaS92 commented Dec 16, 2024

This is a base branch for a major effort to refactor the differential expression analysis.

Currently this covers a skeleton for a new 3-tier class system for all differential expression analysis:

  1. DifferentialExpressionAnalysis: enforces unified input and output
  2. DifferentialExpressionAnalysis_TwoGroups: shared validation of groups, but test-specific functionality
  3. Concrete statistical test, e.g. DifferentialExpressionAnalysis_TTest
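
A minimal sketch of how such a three-tier hierarchy could look. This is not the code in this PR: the nested-dict `Data` type, the method name `_validate_parameters`, and the class `DifferentialExpressionAnalysis_MeanDifference` (a simple mean difference standing in for a real statistical test) are assumptions made for illustration only.

```python
from abc import ABC, abstractmethod
from statistics import mean
from typing import Dict, List

# Hypothetical stand-in for the real input: feature -> sample -> intensity.
Data = Dict[str, Dict[str, float]]


class DifferentialExpressionAnalysis(ABC):
    """Tier 1: enforces unified input and output."""

    def __init__(self, data: Data) -> None:
        if not data:
            raise ValueError("Input data must not be empty.")
        self.data = data

    def perform(self, **kwargs) -> Dict[str, float]:
        self._validate_parameters(**kwargs)
        result = self._perform(**kwargs)
        # Unified output: exactly one value per input feature.
        if set(result) != set(self.data):
            raise ValueError("Result must cover exactly the input features.")
        return result

    @abstractmethod
    def _validate_parameters(self, **kwargs) -> None:
        """Check the parameters before running the test."""

    @abstractmethod
    def _perform(self, **kwargs) -> Dict[str, float]:
        """Run the statistical test."""


class DifferentialExpressionAnalysis_TwoGroups(DifferentialExpressionAnalysis, ABC):
    """Tier 2: group validation shared by all two-group tests."""

    def _validate_parameters(
        self, group1: List[str], group2: List[str], **kwargs
    ) -> None:
        if not group1 or not group2:
            raise ValueError("Both groups need at least one sample.")
        if set(group1) & set(group2):
            raise ValueError("A sample must not be in both groups.")
        # Every feature must have a value for every requested sample.
        for feature, values in self.data.items():
            missing = (set(group1) | set(group2)) - set(values)
            if missing:
                raise KeyError(f"{feature}: unknown samples {sorted(missing)}")


class DifferentialExpressionAnalysis_MeanDifference(
    DifferentialExpressionAnalysis_TwoGroups
):
    """Tier 3: a concrete test (mean difference instead of a real t-test)."""

    def _perform(
        self, group1: List[str], group2: List[str], **kwargs
    ) -> Dict[str, float]:
        return {
            feature: mean([values[s] for s in group2])
            - mean([values[s] for s in group1])
            for feature, values in self.data.items()
        }


data = {"P1": {"a": 1.0, "b": 3.0, "c": 5.0, "d": 7.0}}
analysis = DifferentialExpressionAnalysis_MeanDifference(data)
result = analysis.perform(group1=["a", "b"], group2=["c", "d"])
```

Only the third tier has to know anything about the statistics; the public `perform` entry point and the group checks are inherited unchanged.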

If you review at this stage, please check in particular:

  • whether validation and passing of parameters is sufficiently robust and not overengineered (already a lot better after Magnus's review)
  • the TODOs added in other files, to see how I envision integrating this with the rest of the tool
  • whether any of the branches below are not yet supported by the code

Checklist to do before branching off:

  • Write tests
  • Talk to an AI about this
  • Check if this is compatible with all statistical tests that are supposed to run in the end
  • The only exception is the multicova, since this needs to be refactored internally first.
  • Check if this easily interfaces with UI and plotting

Branches that can then be based on this:

  • Run and store in Dataset
  • Recreate Volcano plot from result, significance and dataset
  • UI integration via gui/utils/analysis
  • Deprecate old functionality (DifferentialExpressionAnalysis without plot), keep thin wrapper for multicova until refactoring.
  • Individual statistical test classes, including actual test of the static functions (not included for the t-test here on purpose, will be a separate branch).

@JuliaS92 JuliaS92 requested a review from mschwoer December 16, 2024 11:22
@JuliaS92 JuliaS92 marked this pull request as ready for review December 17, 2024 15:26
Contributor

@mschwoer mschwoer left a comment

Looks quite good already!

Resolved review threads (collapsed):
alphastats/dataset/dataset.py
alphastats/tl/differential_expression_analysis.py (7 threads, outdated)
tests/tl/test_differential_expression_analysis.py (3 threads, outdated)
Comment on lines 97 to 107
        for parameter in kwargs:
            if parameter not in self._allowed_parameters():
                raise TypeError(
                    f"Parameter {parameter} should not be provided for this analysis. Accepted keyword arguments to perform are {', '.join(self._allowed_parameters())}."
                )

    def _allowed_parameters(self) -> List[str]:
        """Method returning a list of allowed parameters for the analysis to avoid calling tests with additional parameters."""
        perform_signature = inspect.signature(self._perform)
        parameters = list(perform_signature.parameters.keys())
        return [parameter for parameter in parameters if parameter != "kwargs"]
Contributor

@mschwoer mschwoer Dec 20, 2024

can be removed, python does that for you :-)

(from `for parameter in kwargs:` ... to `return [para..`)

Collaborator Author

Not if I pass kwargs to _perform though, right? Which I need to do to match the signature of the abstract method? The idea was that by adding this here I enforce kwargs to be empty without relying on every implementation of _perform to contain that check.
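
The point of this exchange can be reproduced in a few lines. This is a sketch under stated assumptions, not the PR's code: the classes `Analysis` and `TTest` and the `test_type` parameter are hypothetical. Python only rejects unexpected keyword arguments when the callee has no `**kwargs`; once `_perform` accepts `**kwargs` to match the abstract signature, a misspelled argument is swallowed silently unless the base class checks it. The sketch filters on `inspect.Parameter.VAR_KEYWORD` rather than on the literal name `"kwargs"`, which is a small variation on the snippet under review.

```python
import inspect
from abc import ABC, abstractmethod
from typing import List


class Analysis(ABC):  # hypothetical stand-in for the base class
    def perform(self, **kwargs) -> str:
        # Reject anything the concrete _perform does not name explicitly.
        allowed = self._allowed_parameters()
        for parameter in kwargs:
            if parameter not in allowed:
                raise TypeError(
                    f"Parameter {parameter} is not accepted. "
                    f"Allowed: {', '.join(allowed)}."
                )
        return self._perform(**kwargs)

    def _allowed_parameters(self) -> List[str]:
        # The signature of a bound method already excludes `self`.
        signature = inspect.signature(self._perform)
        return [
            name
            for name, parameter in signature.parameters.items()
            if parameter.kind is not inspect.Parameter.VAR_KEYWORD
        ]

    @abstractmethod
    def _perform(self, **kwargs) -> str:
        """Run the concrete analysis."""


class TTest(Analysis):  # hypothetical concrete test
    def _perform(self, test_type: str = "independent", **kwargs) -> str:
        return test_type


test = TTest()
# Calling _perform directly: the misspelled argument is silently ignored.
silently_ignored = test._perform(test_typ="paired")
# Going through perform: the same typo is caught by the base class.
try:
    test.perform(test_typ="paired")
    caught = False
except TypeError:
    caught = True
```

With the check in the base class, no individual `_perform` implementation has to remember to verify that its `kwargs` are empty.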

Resolved review threads (collapsed, outdated):
alphastats/tl/differential_expression_analysis.py
tests/tl/test_differential_expression_analysis.py
Comment on lines 309 to 314
            **valid_parameter_input_two_groups,
            **{
                DeaParameters.TEST_TYPE: DeaTestTypes.INDEPENDENT,
                DeaParameters.FDR_METHOD: "fdr_bh",
                DeaParameters.ISLOG2TRANSFORMED: False,
            },
Contributor

could we pass those parameters explicitly?
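
For illustration, the two call styles compared side by side. The string values of the `DeaParameters` constants, the standalone `perform` function, and its keyword names are assumptions; the real constants in the PR may differ.

```python
class DeaParameters:  # hypothetical values; the real constants may differ
    TEST_TYPE = "test_type"
    FDR_METHOD = "fdr_method"
    ISLOG2TRANSFORMED = "is_log2_transformed"


def perform(*, test_type: str, fdr_method: str, is_log2_transformed: bool) -> dict:
    """Hypothetical stand-in that just echoes its keyword arguments."""
    return {
        "test_type": test_type,
        "fdr_method": fdr_method,
        "is_log2_transformed": is_log2_transformed,
    }


# Style used in the test under review: constants as keys, unpacked with **.
via_dict = perform(
    **{
        DeaParameters.TEST_TYPE: "independent",
        DeaParameters.FDR_METHOD: "fdr_bh",
        DeaParameters.ISLOG2TRANSFORMED: False,
    }
)

# Style suggested by the reviewer: plain keyword arguments.
explicit = perform(
    test_type="independent",
    fdr_method="fdr_bh",
    is_log2_transformed=False,
)
```

Both calls are equivalent at runtime; the explicit form is easier to read and lets editors and type checkers see the argument names.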

Contributor

@mschwoer mschwoer left a comment

LGTM! this should be a solid basis for all the upcoming analysis!


3 participants