Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Conda lock files #5827

Open
wants to merge 17 commits into
base: master
Choose a base branch
from
Open

Conda lock files #5827

wants to merge 17 commits into from

Conversation

edmundmiller
Copy link
Contributor

@edmundmiller edmundmiller commented Jun 18, 2024

Closes #2193

Follow up to #4080. This will add conda lock files whenever the environment.yml gets updated, and then pass that to wave.

@edmundmiller edmundmiller self-assigned this Jun 18, 2024
@ewels ewels linked an issue Jun 19, 2024 that may be closed by this pull request
@ewels ewels changed the base branch from master to batch_update_staging June 19, 2024 13:15
@edmundmiller
Copy link
Contributor Author

Status update:

This PR doesn't really do anything for us until we adopt wave.

Main issue was the GitHub action can't commit the lock file back for whatever permissions issue.

@edmundmiller edmundmiller changed the base branch from batch_update_staging to master August 8, 2024 17:14
@edmundmiller edmundmiller force-pushed the lock-files branch 5 times, most recently from 13df3e3 to a1bb7d5 Compare August 8, 2024 17:57
@edmundmiller edmundmiller marked this pull request as ready for review August 8, 2024 18:05
@pinin4fjords
Copy link
Member

This is pretty frickin' awesome, lock files have been on my xmas list for ages.

My only questions surround how these lock files will actually get used. As-is. the conda and container directives are unchanged for this module, so behaviour in use will be unchanged.

We spoke on Slack about how I thought (and others disagreed) that we should also change the container directive to use what Wave spits back. I also wonder how we could make it so that people deploying via Conda could get a static environment built off the lock file?

@pinin4fjords
Copy link
Member

@adamrtalbot raises the excellent point that lock files will be architecture-specific. So maybe what we need is actually a conda-lock-amd64.txt, conda-lock-arm64.txt? Then workflows would need to be run with architecture-specific configurations referencing these?

@adamrtalbot
Copy link
Contributor

@adamrtalbot raises the excellent point that lock files will be architecture-specific. So maybe what we need is actually a conda-lock-amd64.txt, conda-lock-arm64.txt? Then workflows would need to be run with architecture-specific configurations referencing these?

Can you embed multiple architectures in a lock file? The docs kinda say you can: https://github.com/conda/conda-lock?tab=readme-ov-file#platform-specification

@pinin4fjords
Copy link
Member

pinin4fjords commented Aug 12, 2024

Can you embed multiple architectures in a lock file? The docs kinda say you can: https://github.com/conda/conda-lock?tab=readme-ov-file#platform-specification

YAAAAS!

 conda-lock -f environment.yml -p osx-64 -p linux-64
Spec hash already locked for ['osx-arm64']. Skipping solve.
Locking dependencies for ['linux-64', 'osx-64']...
INFO:conda_lock.conda_solver:linux-64 using specs ['bioconda::multiqc=1.23']
...

Good man. We would probably want to put them in the environment.yml, given that cross-platform compatibility will vary.

- name: Run conda-lock
run: |
rm --force "${{steps.lockfile-name.outputs.result}}"
conda-lock lock $MAMBA --file "${{ matrix.files }}" --kind lock --platform linux-64 --lockfile "${{steps.lockfile-name.outputs.result}}"
Copy link
Member

@pinin4fjords pinin4fjords Aug 12, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So, I tested why the lock file didn't work, and I think it's because of the default unified lock files. To get a lock file compatible with Nextflow (which just does like this), you have to render it to a platform specific one:

conda-lock render -p osx-64  
Rendering lockfile(s) for osx-64...
 - Install lock using : conda create --name YOURENV --file conda-osx-64.lock

We can also just do:

conda-lock --kind explicit -f environment.yml

... which automatically renders to all the specified platforms.

In any case, for nextflow as-is, we will need platform-specific lockfiles.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Awesome! I didn't really want to have 4+ lock files floating around. I didn't know this was a possibility, great find!

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm afraid we're stuck with those 4 lock files, unless we do that rendering dynamically somehow.

@pinin4fjords
Copy link
Member

pinin4fjords commented Aug 12, 2024

@adamrtalbot raises the excellent point that lock files will be architecture-specific. So maybe what we need is actually a conda-lock-amd64.txt, conda-lock-arm64.txt? Then workflows would need to be run with architecture-specific configurations referencing these?

Can you embed multiple architectures in a lock file? The docs kinda say you can: https://github.com/conda/conda-lock?tab=readme-ov-file#platform-specification

@adamrtalbot unfortunately I just discovered that the reason @edmundmiller 's new conda declaration didn't work is probably because of unified configs. So absent a fix to Nextflow we'll be needing platform-specific ones.

Edit: Paolo nixed my idea for supporting unified lock files. So we are definitely limited to the platform-specific ones.

Copy link
Member

@pinin4fjords pinin4fjords left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So basically I think we'll need to do like this

- name: Run conda-lock
run: |
rm --force "${{steps.lockfile-name.outputs.result}}"
conda-lock lock $MAMBA --file "${{ matrix.files }}" --kind lock --platform linux-64 --lockfile "${{steps.lockfile-name.outputs.result}}"
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
conda-lock lock $MAMBA --file "${{ matrix.files }}" --kind lock --platform linux-64 --lockfile "${{steps.lockfile-name.outputs.result}}"
conda-lock lock $MAMBA --file "${{ matrix.files }}" --kind explicit --platform linux-64 --lockfile "${{steps.lockfile-name.outputs.result}}"
mv conda-linux-64.lock conda-lock-linux-64.yml

Nextflow needs it to be a .yml, but use of --kind explicit means we get .lock files, so we'll need to do some file renaming.

Platforms should probably be set in the environment.ymls actually.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we just do...

Suggested change
conda-lock lock $MAMBA --file "${{ matrix.files }}" --kind lock --platform linux-64 --lockfile "${{steps.lockfile-name.outputs.result}}"
conda-lock lock $MAMBA --file "${{ matrix.files }}" --kind lock --platform linux-64 --lockfile "${{steps.lockfile-name.outputs.result}}".yml

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Doesn't look like that works 😞

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The explicit thing is important. It makes you a lock file for each platform which is consumable by conda env create, and therefore by Nextflow.

But by doing explicit you lose control over the naming of the multiple outputs, and Nextflow needs .yml, hence the rename hack I suggested.

git config push.default upstream
git add .
git status
git commit -m "[automated] autogenerated conda-lock file"
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
git commit -m "[automated] autogenerated conda-lock file"
git commit -m "[automated] autogenerated conda-lock files"

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It should just be a single commit per lock file I think?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

One lock file per platform

@@ -2,7 +2,7 @@ process BOWTIE2_ALIGN {
tag "$meta.id"
label 'process_high'

conda "${moduleDir}/environment.yml"
conda "${moduleDir}/conda-lock.yml"
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
conda "${moduleDir}/conda-lock.yml"
conda "${moduleDir}/conda-lock-linux-64.yml"

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(I'm obviously not proposing we actually change modules for platform specificity, just illustrating).

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think tracking them at this point might keep us sane though 😅

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Because the lock files will be platform specific they can't go directly on the modules. I think we'd need platform-specific configs at the workflow level that referenced all the lockfiles.

@@ -15,3 +15,4 @@ __pycache__
*.pyo
*.pyc
.github/renovate.json5
**/conda-lock.yml
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
**/conda-lock.yml
**/conda-lock*.yml

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Use Conda lock files new module: shinyngs/static_exploratory
4 participants