somalier breaks when renaming sample names #1234

Closed
leonorpalmeira opened this issue Jun 22, 2020 · 3 comments
Labels
bug: module (Bug in a MultiQC module)

Comments

leonorpalmeira commented Jun 22, 2020

We are using MultiQC to parse all our QC data, and we have run into an issue when integrating the tool somalier:

The following invocation runs fine and the output html is correctly formed:

$MultiQC somalier.* -n  MultiQC_report_test 
[INFO   ]         multiqc : This is MultiQC v1.9
[INFO   ]         multiqc : Template    : default
[INFO   ]         multiqc : Searching   : /filepath/QC/somalier.groups.tsv
[INFO   ]         multiqc : Searching   : /filepath/QC/somalier.html
[INFO   ]         multiqc : Searching   : /filepath/somalier.pairs.tsv
[INFO   ]         multiqc : Searching   : /filepath/somalier.samples.tsv
[INFO   ]        somalier : Found 465 reports
[INFO   ]         multiqc : Compressing plot data
[INFO   ]         multiqc : Report      : MultiQC_report_test.html
[INFO   ]         multiqc : Data        : MultiQC_report_test_data
[INFO   ]         multiqc : MultiQC complete

When we add a YAML config file containing a regex that cleans our sample names, we get this error:

$MultiQC --force -n MultiQC_report_test somalier* -c MultiQC_outputs/multiQC_config.yaml
[INFO   ]         multiqc : This is MultiQC v1.9
[INFO   ]         multiqc : Template    : default
[INFO   ]         multiqc : Report title: HUMANOMICS v3.0 Quality Control
[INFO   ]         multiqc : Searching   : /filepath/QC/somalier.groups.tsv
[INFO   ]         multiqc : Searching   : /filepath/QC/somalier.html
[INFO   ]         multiqc : Searching   : /filepath/QC/somalier.pairs.tsv
[INFO   ]         multiqc : Searching   : /filepath/QC/somalier.samples.tsv
[ERROR  ]         multiqc : Oops! The 'somalier' MultiQC module broke... 
  Please copy the following traceback and report it at https://github.com/ewels/MultiQC/issues 
  If possible, please include a log file that triggers the error - the last file found was:
    somalier.pairs.tsv
============================================================
Module somalier raised an exception: Traceback (most recent call last):
  File "/usr/local/lib/python3.6/dist-packages/multiqc/multiqc.py", line 569, in run
    output = mod()
  File "/usr/local/lib/python3.6/dist-packages/multiqc/modules/somalier/somalier.py", line 64, in __init__
    self.somalier_data[s_name] = parsed_data[s_name]
KeyError: '99-200622-1234'
============================================================
[WARNING]         multiqc : No analysis results found. Cleaning up..
[INFO   ]         multiqc : MultiQC complete

And indeed, if we remove the three regex lines from our YAML file, we recover the initial, correct behavior:

$MultiQC multiqc --force -n MultiQC_report_test somalier* -c MultiQC_outputs/multiQC_config_somalier.yaml 
[INFO   ]         multiqc : This is MultiQC v1.9
[INFO   ]         multiqc : Template    : default
[INFO   ]         multiqc : Report title: HUMANOMICS v3.0 Quality Control
[INFO   ]         multiqc : Searching   : /filepath/QC/somalier.groups.tsv
[INFO   ]         multiqc : Searching   : /filepath/QC/somalier.html
[INFO   ]         multiqc : Searching   : /filepath/QC/somalier.pairs.tsv
[INFO   ]         multiqc : Searching   : /filepath/QC/somalier.samples.tsv
[INFO   ]        somalier : Found 465 reports
[INFO   ]         multiqc : Compressing plot data
[WARNING]         multiqc : Deleting    : MultiQC_report_test.html   (-f was specified)
[WARNING]         multiqc : Deleting    : MultiQC_report_test_data   (-f was specified)
[INFO   ]         multiqc : Report      : MultiQC_report_test.html
[INFO   ]         multiqc : Data        : MultiQC_report_test_data
[INFO   ]         multiqc : MultiQC complete

Here are the three lines I had to remove from the yaml file in order to get MultiQC to run properly:

extra_fn_clean_exts:
    - type: regex_keep
      pattern: '^[0-9]{2}-[0-9]{6}-[0-9]{4}'

Is it possible for you to look into this and give us an idea of what is happening?
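
To illustrate what the `regex_keep` rule above is meant to do, here is a minimal Python sketch. It is not MultiQC's actual implementation: it assumes that `regex_keep` keeps only the part of the sample name matched by the pattern, and both the helper `keep_match` and the raw name `99-200622-1234_S1_L001` are hypothetical.

```python
import re

# Pattern copied from the YAML config above.
PATTERN = r"^[0-9]{2}-[0-9]{6}-[0-9]{4}"


def keep_match(s_name, pattern=PATTERN):
    """Return only the part of s_name matched by pattern, or s_name unchanged.

    Hypothetical helper that mimics the assumed behaviour of a
    regex_keep cleaning rule; it is not MultiQC's own code.
    """
    match = re.search(pattern, s_name)
    return match.group() if match else s_name


# Hypothetical raw sample name with a suffix after the ID.
print(keep_match("99-200622-1234_S1_L001"))
# A name that does not match the pattern is left as it is.
print(keep_match("control_sample"))
```

With this rule, the first name is reduced to `99-200622-1234`, the same key that appears in the `KeyError` above.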

ewels added the bug: core label (Bug in the main MultiQC code) on Jun 22, 2020
ewels changed the title from ["Oops! The 'somalier' MultiQC module broke..." problem with regexp within yaml config file] to [somalier breaks when renaming sample names] on Jun 22, 2020
Member

ewels commented Jun 22, 2020

Thanks for the detailed bug report @leonorpalmeira - usually when problems like this happen it's because the module code is written in such a way that it doesn't expect the sample names to change. It will likely require an update to the main MultiQC code.
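
A minimal sketch of that failure mode, assuming the module stores parsed rows under the raw sample name and later looks them up with the cleaned one. The dictionary contents, the raw name, the `depth` field and the `clean` helper are all hypothetical; only the lookup inside the `try` block mirrors the line shown in the traceback.

```python
import re

PATTERN = r"^[0-9]{2}-[0-9]{6}-[0-9]{4}"  # from the reporter's config


def clean(s_name):
    """Hypothetical stand-in for sample-name cleaning with a regex_keep rule."""
    match = re.search(PATTERN, s_name)
    return match.group() if match else s_name


# Data parsed from the file is keyed by the raw name (hypothetical values).
parsed_data = {"99-200622-1234_S1": {"depth": 30.0}}

somalier_data = {}
for raw_name in list(parsed_data):
    s_name = clean(raw_name)
    try:
        # Same shape as the failing line: the cleaned key is not in parsed_data.
        somalier_data[s_name] = parsed_data[s_name]
    except KeyError as err:
        print(f"KeyError: {err}")
        # Reading with the raw key and storing under the cleaned key works.
        somalier_data[s_name] = parsed_data[raw_name]

print(somalier_data)
```

The point of the sketch is only that the key used for reading and the key used for storing must not be confused once cleaning changes the name.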

@leonorpalmeira
Author

Thanks @ewels for your quick answer! For now, we will work around this by dropping the regex from the YAML config when processing the somalier files. Let us know when a fix is available.

ewels added the bug: module label (Bug in a MultiQC module) and removed the bug: core label (Bug in the main MultiQC code) on Dec 28, 2020
ewels closed this as completed in 744d666 on Mar 31, 2021
Member

ewels commented Mar 31, 2021

Sorry it took me so long to get to this @leonorpalmeira - I think it should now be fixed in the development branch on master.

It was slightly fiddly as Somalier has concatenated sample names (P1234*P1235) so the code needed to be a little more clever in how it was cleaning them. Then as I predicted it was a minor refactor to avoid the changing key causing an error.
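
One way to picture the cleaning of concatenated names, as a sketch rather than the code from the actual fix: split the name on `*`, clean each part separately, and join the parts again. The helpers `clean` and `clean_pair` and the raw names are hypothetical; the `*` separator is taken from the `P1234*P1235` example above.

```python
import re

PATTERN = r"^[0-9]{2}-[0-9]{6}-[0-9]{4}"  # from the reporter's config


def clean(s_name):
    """Hypothetical stand-in for sample-name cleaning with a regex_keep rule."""
    match = re.search(PATTERN, s_name)
    return match.group() if match else s_name


def clean_pair(pair_name, sep="*"):
    """Clean each sample name in a concatenated pair, then re-join them."""
    return sep.join(clean(part) for part in pair_name.split(sep))


# Hypothetical concatenated name for a pair of samples.
raw = "99-200622-1234_S1*99-200622-5678_S2"

# Cleaning the whole string keeps only the first ID, so the pair is lost.
print(clean(raw))
# Cleaning part by part keeps both IDs.
print(clean_pair(raw))
```

Cleaning the whole string at once would collapse every pair that shares a first sample into one key, which is why the parts are handled individually here.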

Anyway, if you get a chance please give it a test and let me know how you get on 👍🏻
