Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Change in "extra_fn_clean_exts" behavior #1264

Closed
skchronicles opened this issue Aug 4, 2020 · 3 comments
Closed

Change in "extra_fn_clean_exts" behavior #1264

skchronicles opened this issue Aug 4, 2020 · 3 comments

Comments

@skchronicles
Copy link

skchronicles commented Aug 4, 2020

Hey @ewels,

I hope all is going well, and you're staying safe in these strange times!

Description of bug:
I am in the process of testing a newer version of MultiQC (v1.9) after hearing about some of its newer features. Our current pipelines are using an older version of MultiQC (v1.4).

I provided MultiQC/1.9 the same config file we are using in the older version (1.4), and I noticed that extra_fn_clean_exts was no longer cleaning the directory names when providing the -d flag.

At first, I thought something was perhaps wrong with the regular expression but no, it is fine:
image

I took a peek at the source code for cleaning the filenames.

In version (1.4), it looks like you're adding the directory names prior to cleaning the sample names:

https://github.com/ewels/MultiQC/blob/baefc2e5658ea81b9f4f1f0367f0be68fec7e05e/multiqc/modules/base_module.py#L180-L191

Screen grab

image

In version (1.9), the order of operations are flipped. It looks like you're cleaning the sample names prior to prepending the directory names:

https://github.com/ewels/MultiQC/blob/a2597e36afd2d0686975f3354b1873cc631abe5b/multiqc/modules/base_module.py#L220-L248

Screen grab

image

Because of this, I do not have as much fine-grain control over cleaning a given string (path + sample name).

MultiQC Error log:

No errors, just unexpected behavior moving from version 1.4 to 1.9. I cannot use extra_fn_clean_exts to clean the entire sample string (path + sample name) when providing the -d flag. This is due to problems described above.

File that triggers the error:

MultiQC run details (please complete the following):

  • Command used to run MultiQC:
  • MultiQC Version: 1.9
  • Operating System: centOS7
  • Python Version: Python 3.8.2
  • Method of MultiQC installation: Docker

Additional context

No rush, and thank you for looking into this!

@birnbera
Copy link

I'd like to second this issue as it makes it difficult to get consistent sample naming in the report when using modules that get their sample names in different ways.

@mtandon09
Copy link

Yes! Getting extra_fn_clean_exts to work again would be super helpful!

@ewels
Copy link
Member

ewels commented Mar 29, 2021

Hi all,

So after looking into this a bit, it seems that I changed this in 7948ffa in response to a bug report in #848 (it went out quite a while ago - release v1.7 in 2018). But I guess that for you guys this was a feature, not a bug.. 😉

I just had a play with that example again and I can't find any difference in the output for that setup with the directory prepend code in either location. But a lot has changed since then, so maybe something else in the codebase has solved the issue a different way.

Anyway, given in that my minimal tests putting it back to prepend first then clean didn't have any negative effect, I'm happy to put it back to how it was 👍🏻

Phil

@ewels ewels closed this as completed in 9277f2e Mar 29, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

4 participants