Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Escaped characters handled inconsistently depending on how they are input #7167

Open
aofarrel opened this issue Jun 26, 2023 · 0 comments
Open

Comments

@aofarrel
Copy link

aofarrel commented Jun 26, 2023

Based on a post I made on the openWDL slack; was asked to make a ticket here.

quick summary

Cromwell handles / in strings inconsistently. In some cases, it is dropped without throwing an error, in other cases it will cause an error immediately. If the string is in the WDL file itself, womtool does not detect any issues with it but it will not be handled as expected as runtime.

use case and how to reproduce

goleft indexcov defaults to this value for --excludePattern:
"^chrEBV$|_random$|Un_|^HLA|_alt$|hap$"

So I set String excludePattern = "^chrEBV$|_random$|Un_|^HLA|_alt$|hap$" in my WDL. That passes miniwdl check and womtool. But...

  • Terra will accept ^chrEBV$|^NC|_random$|Un_|^HLA\-|_alt$|hap\d$ as a variable default or as hardcoded variable, but will handle it incorrectly -- it will not error, but it will be changed into ^chrEBV$|^NC|_random$|Un_|^HLA-|_alt$|hapd$
  • Terra will not accept ^chrEBV$|^NC|_random$|Un_|^HLA\-|_alt$|hap\d$ as an input variable via JSON; it will fail to import
  • Terra will not accept ^chrEBV$|^NC|_random$|Un_|^HLA\-|_alt$|hap\d$ as an input variable if entered manually; it will throw token recognition error in the workflow menu and not allow you to submit
  • Terra will accept the escaped version ^chrEBV$|^NC|_random$|Un_|^HLA\\-|_alt$|hap\\d$ as an input if entered manually or hardcoded, and will interpret it as ^chrEBV$|^NC|_random$|Un_|^HLA\-|_alt$|hap\d$

Only tested via Terra-Cromwell, as I was previously told local-Cromwell is a lower development priority.

expected behavior

  1. A user inputting a string as a variable vs that exact same string being a hardcoded default should be handled the same way.
  2. If Cromwell is supposed to handle / by requiring they be escaped as //, that should be documented if it isn't already.
  3. womtool should throw a warning when it sees a hardcoded variable/default with a / inside of it, and that warning should guide the user as to how it will be interpreted at runtime.
  4. The same workflow running in Cromwell and miniwdl should not result in different interpretations of a given string. The WDL spec should be updated to stop inconsistencies like this from happening, since it currently does not seem to give guidance on this particular issue. (Linking WDL 1.0 spec since that's what Cromwell currently supports.)

Personally I'd prefer if strings were interpreted literally, ie / does not need to be escaped, but that might conflict with the typical JSON standard. Consistency is more important than my mild distaste for escaping.

related issues

#3990 (comment)

miniwdl comparison (not necessarily the ideal, just to illustrate point 4 on expected behavior)

  • miniwdl will accept ^chrEBV$|^NC|_random$|Un_|^HLA\-|_alt$|hap\d$ as a variable default or as hardcoded variable, and will handle it literally as written, as long as the workflow author followed shellcheck's recommendation and used quotes in the command section (ie --excludePatt "~{excludePattern}")
  • miniwdl will also accept escaping the characters as a variable default/hardcode, and interprets ^chrEBV$|^NC|_random$|Un_|^HLA\\-|_alt$|hap\\d$ as ^chrEBV$|^NC|_random$|Un_|^HLA\-|_alt$|hap\d$
  • miniwdl will not accept ^chrEBV$|^NC|_random$|Un_|^HLA\-|_alt$|hap\d$ as an input variable via JSON because JSON parsers are just kind of built like that. the two problematic parts are - and \d. However, you can get around this by escaping those slashes, ie ^chrEBV$|^NC|_random$|Un_|^HLA\\-|_alt$|hap\\d$ does get interpreted as ^chrEBV$|^NC|_random$|Un_|^HLA\-|_alt$|hap\d$

In other words, miniwdl and Cromwell are both internally inconsistent, and inconsistent compared to each other. miniwdl does not parse variables internally consistently as a consequence of its JSON parser. Cromwell is also internally inconsistent, but additionally can't handle the unescaped string as a default/hardcoded variable even those those don't go thru a JSON parser.

screenshot example

escaped character example
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

1 participant