DRY refactoring #60

charles-cowart · 2023-05-12T17:19:31Z

The goal is to refactor the code so that there is minimal duplication, so that it is easily extensible to metatranscriptomics and future Assay types, and so that all of the functionality is broken out of the main plugin function and can be tested properly.

Some Additional modifications added.

test works as intended locally

antgonza

Thank you @charles-cowart, I have some concerns about the possibility of duplication of code; please let me know what you think.

.gitignore

qp_klp/AmpliconStep.py

qp_klp/MetagenomicStep.py

qp_klp/klp.py

qp_klp/tests/test_amplicon_step.py

charles-cowart · 2023-05-18T19:23:19Z

@antgonza I've pushed a commit of quick updates based on your feedback, thanks! I haven't responded to all of your concerns yet, but I will shortly.

antgonza

Thank you for the comments before. Now, a general question, if you are "extracting" the general functions from Amplicon/Metagenomic Step and adding them to the base class Step, does it really make sense to have a klp_util.py file? I mean basically you have to access method X in Amplicon/Metagenomic Step, that will call X method in Step and then do X in klp_util.py; no? Would it be better to remove one step and just go either Amplicon/Metagenomic Step -> klp_util.py or Amplicon/Metagenomic Step -> Step and remove klp_util.py?

.gitignore

qp_klp/AmpliconStep.py

qp_klp/tests/test_amplicon_step.py

Removed configuration.json. Additional updates based on testing and feedback.

charles-cowart · 2023-05-20T00:54:16Z

> Thank you for the comments before. Now, a general question, if you are "extracting" the general functions from Amplicon/Metagenomic Step and adding them to the base class Step, does it really make sense to have a klp_util.py file? I mean basically you have to access method X in Amplicon/Metagenomic Step, that will call X method in Step and then do X in klp_util.py; no? Would it be better to remove one step and just go either Amplicon/Metagenomic Step -> klp_util.py or Amplicon/Metagenomic Step -> Step and remove klp_util.py?

We're definitely on the same page. The idea was to incorporate klp_util.py into the base Step class. I left them as is for the time being, just because so many other things had changed already.
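
To make the two-hop layout concrete, here is a minimal sketch of what folding a utility function into the base class could look like. All names (`_describe`, `describe`, `run_id`) are hypothetical and are not taken from qp_klp; the point is only that subclasses call the base class directly, with no separate utility module in between.

```python
# Hypothetical sketch: the names below are illustrative, not from qp_klp.


class Step:
    """Base class that owns the shared logic directly."""

    def __init__(self, run_id):
        self.run_id = run_id

    def _describe(self, assay):
        # In a three-layer layout this body would sit in a separate
        # utility module; here it is one hop away from its callers.
        return f"{assay} run {self.run_id}"


class Amplicon(Step):
    def describe(self):
        return self._describe("Amplicon")


class Metagenomic(Step):
    def describe(self):
        return self._describe("Metagenomic")
```

With this shape, a test can exercise the shared logic through either subclass or through `Step` itself.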

antgonza · 2023-05-22T12:35:53Z

> We're definitely on the same page. The idea was to incorporate klp_util.py into the base Step class. I left them as is for the time being, just because so many other things had changed already.

I think this will be the only blocker before merging this initial PR and this would be the base for future ones, could you do this now before waiting for a better time?

antgonza

Some questions.

qp_klp/Amplicon.py

qp_klp/Step.py

qp_klp/Metagenomic.py

qp_klp/klp.py

antgonza

A few more comments.

antgonza · 2023-05-22T23:49:05Z

qp_klp/Amplicon.py

+            job_output = [join(raw_fastq_files_path, x) for x in
+                          listdir(raw_fastq_files_path)]
+            job_output = [x for x in job_output if isfile(x) and x.endswith(
+                'fastq.gz') and not basename(x).startswith('Undetermined')]


I guess these 2 can be merged, no?

I think it might be better to keep them separate: the x passed to isfile(x) in the second comprehension needs to be the full path created by join(raw_fastq_files_path, x) in the first. If we merged them, we'd have something like:

job_output = [join(raw_fastq_files_path, x) for x in listdir(raw_fastq_files_path) if isfile(join(raw_fastq_files_path, x)) and x.endswith('fastq.gz') and not basename(x).startswith('Undetermined')]

The join() would have to be present twice.
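
For comparison, here is a sketch of the two-step form next to one possible single-pass form that builds each full path only once by iterating over an inner generator. The function names `two_step`, `single_pass`, and `_keep` are made up for this illustration; the filter conditions are the ones from the diff above.

```python
from os import listdir
from os.path import basename, isfile, join


def _keep(path):
    # Same filter as the diff: a regular file, ending in fastq.gz,
    # whose file name does not start with 'Undetermined'.
    return (isfile(path) and path.endswith('fastq.gz')
            and not basename(path).startswith('Undetermined'))


def two_step(raw_fastq_files_path):
    # The form used in the diff: build full paths, then filter them.
    job_output = [join(raw_fastq_files_path, x)
                  for x in listdir(raw_fastq_files_path)]
    return [x for x in job_output if _keep(x)]


def single_pass(raw_fastq_files_path):
    # One comprehension; the inner generator calls join() once per entry.
    return [p for p in (join(raw_fastq_files_path, x)
                        for x in listdir(raw_fastq_files_path))
            if _keep(p)]
```

Whether the single-pass form reads better than two short comprehensions is a judgment call; both return the same paths.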

antgonza · 2023-05-22T23:49:50Z

qp_klp/Amplicon.py

+    def generate_prep_file(self):
+        config = self.pipeline.configuration['seqpro']
+
+        seqpro_path = config['seqpro_path'].replace('seqpro', 'seqpro_mf')


What's the plan to remove seqpro_mf?

It's an easy enough change on this side of things, plus a little bit of refactoring in the metapool package. I can make the change by the end of the week, in between other things.

antgonza · 2023-05-22T23:55:12Z

qp_klp/Metagenomic.py

+        config = self.pipeline.configuration['bcl-convert']
+        job = super()._convert_bcl_to_fastq(config,
+                                            self.pipeline.sample_sheet.path)
+        self.fsr.write(job.audit(self.pipeline.get_sample_ids()), 'ConvertJob')


It looks like something similar to this line is called at the end of each method; should it be more general? I mean something like self.failed_samples_audit('ConvertJob'), which would in turn call something like self.write(job.audit(self.pipeline.get_sample_ids()), audit_name).

I tried this out, but it seems like it will just create another function something like the following:
def _failed_samples_audit(self, job, step_name):
    self.fsr.write(job.audit(self.pipeline.get_sample_ids()), step_name)

It doesn't really do much. The meat of each audit is in each Job class, while the structure and writing of each failed samples record is in the FailedSamplesRecord class.
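
A small runnable sketch of that division of labor follows. `FakeJob` and `FakeFailedSamplesRecord` are stand-ins invented for this example, and `Step` here holds a plain list of sample IDs instead of a pipeline object; the real Job and FailedSamplesRecord classes may behave differently.

```python
# Stand-in classes, assumed for illustration only; the real Job and
# FailedSamplesRecord classes live elsewhere and may differ.
class FakeJob:
    def __init__(self, failed):
        self.failed = set(failed)

    def audit(self, sample_ids):
        # The job decides which samples count as failed.
        return [s for s in sample_ids if s in self.failed]


class FakeFailedSamplesRecord:
    def __init__(self):
        self.records = {}

    def write(self, failed_ids, step_name):
        # The record decides how failures are stored.
        self.records[step_name] = list(failed_ids)


class Step:
    def __init__(self, sample_ids):
        self.sample_ids = sample_ids
        self.fsr = FakeFailedSamplesRecord()

    def _failed_samples_audit(self, job, step_name):
        # The helper itself is a single forwarding call.
        self.fsr.write(job.audit(self.sample_ids), step_name)
```

The helper adds a name but no logic, which is the trade-off described above.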

qp_klp/Step.py

qp_klp/klp.py

qp_klp/tests/test_amplicon_step.py

antgonza

A few more comments.

* DRY refactoring (#60)
* WIP DRY refactoring
  Goal is to refactor code so that there is minimal duplication, easily extensible to enable metatranscriptomics and future Assay types, and breakout all of the functionality from the main plugin function so they can be tested properly.
* Forgot to add new Step files to DRY refactor.
  Some Additional modifications added.
* Laying out new test classes
* Added support for base Step tests
* Ensure output_dir is created
* bugfix
* Temporarily set mg-scripts dependency to development version
* .gitignore fix
* First test w/pseudo job-submission added.
* CI Fixes
* Fix for CI
* debug CI.
  test works as intended locally
* debugging CI
* ci debug
* ci debug
* ci debug
* Added test for base QCJob
* Added MetagenomicStep tests
* Updated testing infrastructure
* Added AmpliconStep tests
* flake8
* setup now points to merged mg-scripts.
* Updated code to create fake files in a searchable path.
* Bugfix
* flake8
* Easy updates based on feedback
* Removed configuration.json
  Removed configuration.json. Additional updates based on testing and feedback.
* Merged klp_util.py into Step and klp.py based on feedback
* Removed test_klp_util.py
* Updates based on testing
* Updates based on feedback
* Changes based on feedback
* Add new file
* Changes based on feedback
* bugfix
* Testing updates (#61)
* Updates based on testing on qiita-rc
* Updates based on feedback
* Updates based on feedback
* Removed sn_tid_map_by_project as a constructor parameter for Step()
* flake8
* Updates based on feedback
* Added test for get_special_map()
* Added test_get_tube_ids_from_qiita()
* Split BaseStepTests into two halves.
  Tests in BaseStepTests were being inherited by child BaseStepTests and run two more times (one for Amplicon and one for Metagenomic). Partitioning out general tests into a new child class allows all children to inherit configuration, but not inherit tests.
* Updates based on feedback
* Updates based on feedback
* Migrated pipeline calls into a single function
  Migrated pipeline calls into a single function to preserve their sequence, offer potential alternatives, and include a debug option to not update Qiita.
* Adding coveralls to project
* bugfix
* test coveralls
* Updating coveralls
* Testing codecov integration
* testing codecov
* Testing codecov

---------

Co-authored-by: Charles Cowart <42684307+charles-cowart@users.noreply.github.com>
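
The test-inheritance issue mentioned in the "Split BaseStepTests" commit can be sketched with plain `unittest`. The class names other than `BaseStepTests` are assumptions for this example, as is the `config` dictionary; the idea is that the base class carries only shared setup, while general tests sit in their own child class and therefore run once.

```python
import unittest


class BaseStepTests(unittest.TestCase):
    # Shared configuration only; no test_* methods are defined here,
    # so subclasses inherit setUp() but no tests.
    def setUp(self):
        self.config = {"output_dir": "out"}


class BasicStepTests(BaseStepTests):
    # General tests live in their own child class and run exactly once.
    def test_config_present(self):
        self.assertIn("output_dir", self.config)


class AmpliconTests(BaseStepTests):
    def test_amplicon(self):
        self.assertEqual(self.config["output_dir"], "out")


class MetagenomicTests(BaseStepTests):
    def test_metagenomic(self):
        self.assertEqual(self.config["output_dir"], "out")


def count_tests():
    # Count how many test cases the loader discovers across all classes.
    loader = unittest.TestLoader()
    suites = [loader.loadTestsFromTestCase(cls)
              for cls in (BaseStepTests, BasicStepTests,
                          AmpliconTests, MetagenomicTests)]
    return unittest.TestSuite(suites).countTestCases()
```

Had `test_config_present` been defined on `BaseStepTests` itself, each of the other classes would have inherited it and the loader would have collected it once per class.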

This code was originally reviewed and approved as part of the following PRs: #60, #61. The code was refactored to be more extensible to new Assay types. Also, monolithic unit tests that required testing on Qiita-RC were removed. The new unit tests are more numerous and test smaller, self-contained functionality. Qiita is no longer needed for testing; a FakeQiita class is now used to emulate canned Qiita API queries. Job submission and SLURM responses are also emulated with fake binaries.
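
As a rough illustration of the canned-response idea, the sketch below shows a fake client that answers from a dictionary. It is not the project's FakeQiita class: the constructor, the `calls` list, the `get_sample_names` helper, and the URL are all hypothetical and exist only for this example.

```python
class FakeQiita:
    """Hypothetical stand-in that returns canned responses keyed by URL."""

    def __init__(self, canned):
        self._canned = dict(canned)
        self.calls = []

    def get(self, url):
        # Record the call so tests can assert on what was requested.
        self.calls.append(("GET", url))
        if url not in self._canned:
            raise KeyError(f"no canned response for {url}")
        return self._canned[url]


def get_sample_names(qclient, study_id):
    # Code under test depends only on the client's get() method,
    # so a fake can be passed in place of a live connection.
    # The URL below is made up for this example.
    return sorted(qclient.get(f"/hypothetical/study/{study_id}/samples"))
```

Because the fake is injected as an argument, the same function can run against a real client in production and against canned data in tests.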

WIP DRY refactoring

0be276d

Goal is to refactor code so that there is minimal duplication, easily extensible to enable metatranscriptomics and future Assay types, and breakout all of the functionality from the main plugin function so they can be tested properly.

charles-cowart requested a review from antgonza May 12, 2023 17:19

charles-cowart added 24 commits May 15, 2023 20:51

Forgot to add new Step files to DRY refactor.

d287bdb

Some Additional modifications added.

Laying out new test classes

135780c

Added support for base Step tests

d5196c2

Ensure output_dir is created

773c9ba

bugfix

f199189

Temporarily set mg-scripts dependency to development version

6fd27d2

.gitignore fix

2ba2bc6

First test w/pseudo job-submission added.

b740b0d

CI Fixes

01d2f3d

Fix for CI

537cc7e

debug CI.

6dffa20

test works as intended locally

debugging CI

1117f5a

ci debug

65a04ca

ci debug

715716e

ci debug

b6fde09

Added test for base QCJob

7f7a6ef

Added MetagenomicStep tests

70301a6

Updated testing infrastructure

b889ec3

Added AmpliconStep tests

fe186de

flake8

3df5f25

setup now points to merged mg-scripts.

547d3d1

Updated code to create fake files in a searchable path.

b2e48a6

Bugfix

2f638b0

flake8

3e788ec

charles-cowart changed the base branch from main to dev May 18, 2023 17:14

antgonza requested changes May 18, 2023


Easy updates based on feedback

5784752

antgonza reviewed May 19, 2023


.gitignore (resolved)

qp_klp/AmpliconStep.py (outdated, resolved)

qp_klp/AmpliconStep.py (outdated, resolved)

qp_klp/tests/test_amplicon_step.py (outdated, resolved)

Removed configuration.json

1c8f792

Removed configuration.json. Additional updates based on testing and feedback.

charles-cowart added 3 commits May 22, 2023 10:05

Merged klp_util.py into Step and klp.py based on feedback

d9a6efa

Removed test_klp_util.py

0d82aee

Updates based on testing

8392d24

antgonza reviewed May 22, 2023


charles-cowart added 3 commits May 22, 2023 15:27

Updates based on feedback

f3779cd

Changes based on feedback

3d5a223

Add new file

245effb

antgonza reviewed May 23, 2023


charles-cowart added 2 commits May 22, 2023 18:01

Changes based on feedback

1c1e762

bugfix

b2840f2

charles-cowart changed the title ~~WIP DRY refactoring~~ DRY refactoring May 23, 2023

antgonza merged commit e26c59b into qiita-spots:dev May 23, 2023
