toil-cwl-runner can ignore LoadListingRequirement on a Workflow and always use no_listing #5104

Closed
stxue1 opened this issue Sep 26, 2024 · 1 comment · Fixed by #5149

stxue1 commented Sep 26, 2024

May be related to #5099
Given this workflow:

#!/usr/bin/env cwl-runner
cwlVersion: v1.2
class: Workflow
requirements:
  InlineJavascriptRequirement: {}
  LoadListingRequirement:
    loadListing: shallow_listing
  StepInputExpressionRequirement: {}
inputs:
  input_directory:
    type: Directory
outputs:
  output_file:
    type: File
    outputSource: echo/out
steps:
  echo:
    run:
      class: CommandLineTool
      requirements:
        LoadListingRequirement:
          loadListing: deep_listing
      baseCommand: echo
      inputs:
        message:
          type: string
          inputBinding: {}
        dir:
          type: Directory
      outputs:
        out:
          type: stdout
    in:
      dir: input_directory
      message: 
        valueFrom: $(JSON.stringify(inputs.dir))
    out: [out]

With this JSON input file:

{
    "input_directory": {"class": "Directory", "location": "directory"}
}

And a directory named "directory" in the current working directory, laid out as:

(venv3.12) heaucques@pop-os:~/Documents/toil$ tree directory
directory
├── directory
│   └── file2.txt
└── file.txt

1 directory, 2 files

After running the command:

toil-cwl-runner shallow_listing_workflow.cwl shallow_listing.json > json.txt && jq . $(jq -r .output_file.path json.txt)

I get this Directory object with no listing (this is what the valueFrom expression at workflow scope saw):

{
  "class": "Directory",
  "location": "toildir:eyJkaXJlY3RvcnkiOiB7ImZpbGUyLnR4dCI6ICJ0b2lsZmlsZTowOjA6ZmlsZXMvbm8tam9iL2ZpbGUtMWUyYjUzNjYxODQ2NDVjZjg3OTc3NTE5MDJkZTU3ZmEvZmlsZTIudHh0In0sICJmaWxlLnR4dCI6ICJ0b2lsZmlsZTowOjA6ZmlsZXMvbm8tam9iL2ZpbGUtNDhlYjIxNTc2MzhiNDhmNGE5YmU2NmJjODIzNWMzMTQvZmlsZS50eHQifQ==",
  "basename": "directory"
}

The expression is supposed to run with shallow_listing, but appears to be running with no_listing.
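
As a side check, the toildir: location above looks like base64-encoded JSON that maps names to Toil file IDs, so the directory contents seem to be known even though no listing is emitted. Here is a minimal sketch of decoding such a location, assuming that encoding (inferred from the value above, not from Toil's documentation). `decode_toildir` is a hypothetical helper, and the sample value is made up:

```python
import base64
import json


def decode_toildir(location: str) -> dict:
    """Decode a 'toildir:' location, assuming a base64-encoded JSON payload."""
    prefix = "toildir:"
    if not location.startswith(prefix):
        raise ValueError(f"not a toildir location: {location!r}")
    return json.loads(base64.b64decode(location[len(prefix):]))


# Made-up sample in the same shape as the value above (not real Toil file IDs).
sample_contents = {
    "directory": {"file2.txt": "toilfile:example-2"},
    "file.txt": "toilfile:example-1",
}
sample_location = "toildir:" + base64.b64encode(
    json.dumps(sample_contents).encode()
).decode()

decoded = decode_toildir(sample_location)
print(sorted(decoded))  # top-level names in the encoded directory
```

Decoding the location from the output above the same way gives a top-level file.txt and a nested directory containing file2.txt, which matches the tree.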

Running the same workflow with cwltool gives the expected shallow listing:

{
  "class": "Directory",
  "location": "file:///home/heaucques/Documents/toil/directory",
  "basename": "directory",
  "listing": [
    {
      "class": "Directory",
      "location": "file:///home/heaucques/Documents/toil/directory/directory",
      "basename": "directory"
    },
    {
      "class": "File",
      "location": "file:///home/heaucques/Documents/toil/directory/file.txt",
      "basename": "file.txt",
      "size": 0
    }
  ]
}
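
For reference, here is a simplified sketch of what the three loadListing values are expected to produce for a local directory. This is an illustration written for this issue, not cwltool or Toil code; `directory_object` is a hypothetical helper, and the temporary tree mirrors the one above:

```python
import os
import tempfile
from pathlib import Path


def directory_object(path, load_listing="no_listing"):
    """Build a CWL-style Directory dict for `path` (simplified illustration)."""
    path = Path(path)
    obj = {
        "class": "Directory",
        "location": path.resolve().as_uri(),
        "basename": path.name,
    }
    if load_listing == "no_listing":
        return obj
    if load_listing not in ("shallow_listing", "deep_listing"):
        raise ValueError(f"unknown loadListing value: {load_listing!r}")
    # deep_listing recurses; shallow_listing lists one level only.
    child_mode = "deep_listing" if load_listing == "deep_listing" else "no_listing"
    listing = []
    for entry in sorted(os.scandir(path), key=lambda e: e.name):
        child = Path(entry.path)
        if entry.is_dir():
            listing.append(directory_object(child, child_mode))
        else:
            listing.append({
                "class": "File",
                "location": child.resolve().as_uri(),
                "basename": entry.name,
                "size": entry.stat().st_size,
            })
    obj["listing"] = listing
    return obj


# Recreate the tree from this issue in a temporary directory.
with tempfile.TemporaryDirectory() as tmp:
    top = Path(tmp) / "directory"
    (top / "directory").mkdir(parents=True)
    (top / "file.txt").touch()
    (top / "directory" / "file2.txt").touch()

    none = directory_object(top, "no_listing")
    shallow = directory_object(top, "shallow_listing")
    deep = directory_object(top, "deep_listing")
```

With shallow_listing the nested directory appears in the listing but has no listing of its own, as in the cwltool output above; with deep_listing it would also list file2.txt.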

┆Issue is synchronized with this Jira Story
┆Issue Number: TOIL-1650

stxue1 commented Sep 27, 2024

I think cwltoil picks up the correct loadListing value, but that value is overridden later:

toil/src/toil/cwl/cwltoil.py, lines 4023 to 4027 in a698f45:

# make sure this doesn't add listing items; if shallow_listing is
# selected, it will discover dirs one deep and then again later on
# (probably when the cwltool builder gets ahold of the job in the
# CWL job's run()), producing 2+ deep listings instead of only 1.
builder.loadListing = "no_listing"
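
If that is the cause, any expression evaluated after this assignment sees no_listing regardless of the workflow's LoadListingRequirement. One possible way to keep the import step from adding listings without losing the requested value is to scope the override. The following is a minimal sketch with a stand-in builder class; `FakeBuilder` and `temporary_load_listing` are hypothetical names, not Toil or cwltool code, and this is not necessarily how the problem was actually fixed:

```python
from contextlib import contextmanager


class FakeBuilder:
    """Stand-in for an object that carries a `loadListing` attribute."""

    def __init__(self, load_listing):
        self.loadListing = load_listing


@contextmanager
def temporary_load_listing(builder, value):
    """Set builder.loadListing to `value` inside the block, then restore it."""
    saved = builder.loadListing
    builder.loadListing = value
    try:
        yield builder
    finally:
        # Restore the value requested by the workflow, even on error.
        builder.loadListing = saved


builder = FakeBuilder("shallow_listing")
with temporary_load_listing(builder, "no_listing"):
    during = builder.loadListing  # override is active only in this block
after = builder.loadListing  # requested value is visible again afterwards
```

Here `during` is "no_listing" and `after` is "shallow_listing", so a later expression would see the value the workflow asked for.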
