Python dependency gets wrong category name when pip is only used in dev dependencies. #275

Open
ghost opened this issue Nov 7, 2022 · 16 comments

Comments

@ghost

ghost commented Nov 7, 2022

Hi,

I have an environment file environment.yml:

category: main
dependencies:
  - python=3.10.6

And a dev environment dev-environment.yml:

category: dev
dependencies:
- pip:
   - numpy

When I create the lock file (conda-lock lock -f environment.yml -f dev-environment.yml --check-input-hash --mamba) the python dependency will now have category: dev. This breaks conda-lock install --no-dev since python will not be installed.

Probably something in the pip code overwrites the category of python.

@maresb
Contributor

maresb commented Nov 7, 2022

Hi, I'm having trouble reproducing this. What's your conda-lock version?

BTW, a few minor suggestions:

  1. You may have default channels configured in your ~/.condarc, but to ensure portability, it's best to be explicit and include

    channels:
    - conda-forge

    (we should probably warn about this, since I'm getting a nasty-looking and confusing traceback)

  2. Typically one doesn't pin the patch version for something which adheres reasonably well to semver like Python. (In other words, pin python=3.10 and let conda-lock select the latest patch version.)

@mariusvniekerk
Collaborator

You can also try adding python as a dependency in dev-environment.yml; that will probably make things behave correctly.

@maresb
Contributor

maresb commented Nov 7, 2022

Last time I tried adding something as a dev and main dep it led to it not being installed in main. That's why we need to turn category: str into categories: list[str].
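
To make the difference between `category: str` and `categories: list[str]` concrete, here is a minimal sketch. It is not conda-lock's actual data model: the classes `SingleCategoryPackage` and `MultiCategoryPackage` and the helper `selected` are hypothetical names, and the "install if any category was requested" rule is an assumption for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class SingleCategoryPackage:
    """Hypothetical model: exactly one category per locked package."""
    name: str
    category: str = "main"


@dataclass
class MultiCategoryPackage:
    """Hypothetical model: a package may belong to several categories."""
    name: str
    categories: list[str] = field(default_factory=list)


def selected(pkg_categories: list[str], requested: set[str]) -> bool:
    """Assumed rule: install a package if any of its categories was requested."""
    return any(c in requested for c in pkg_categories)


# With a single field, declaring python as both main and dev keeps only the last value.
single = SingleCategoryPackage("python")
for declared in ["main", "dev"]:
    single.category = declared  # the later assignment overwrites the earlier one

# With a list, both memberships survive.
multi = MultiCategoryPackage("python")
for declared in ["main", "dev"]:
    if declared not in multi.categories:
        multi.categories.append(declared)

print(selected([single.category], {"main"}))  # False: python would be skipped
print(selected(multi.categories, {"main"}))   # True
```

With the list-based model, a main-only install still picks up a package that is also needed by dev.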

@ghost
Author

ghost commented Nov 8, 2022

Hi, thanks for the quick responses. I created an environment closer to my use-case to reproduce the problem.

environment.yml:

name: conda_lock_bug_demo
category: main
platforms:
  - linux-64
channels:
  - conda-forge
  - anaconda
  - pytorch
  # We want to have a reproducible setup, so we don't want default channels,
  # which may be different for different users. All required channels should
  # be listed explicitly here.
  - nodefaults
dependencies:
  - python=3.10
  - conda-lock=1.2

dev-environment.yml:

name: conda_lock_bug_demo
category: dev
platforms:
  - linux-64
channels:
  - conda-forge
  - anaconda
  - pytorch
  # We want to have a reproducible setup, so we don't want default channels,
  # which may be different for different users. All required channels should
  # be listed explicitly here.
  - nodefaults
dependencies:
   # Dev dependencies
  - pytest

  - pip:
    - seaborn

In an environment running python 3.10.4, installing conda-lock using:

pip install conda-lock[pip_support]==1.2.1

Creating the lock file using:

conda-lock lock -f environment.yml -f dev-environment.yml --check-input-hash --mamba --lockfile "conda-lock.yml"

This will create a lock file where the python dependency has the incorrect dev category:

- category: dev
  dependencies:
    bzip2: '>=1.0.8,<2.0a0'
    ld_impl_linux-64: '>=2.36.1'
    libffi: '>=3.4.2,<3.5.0a0'
    libgcc-ng: '>=12'
    libnsl: '>=2.0.0,<2.1.0a0'
    libsqlite: '>=3.39.2,<4.0a0'
    libuuid: '>=2.32.1,<3.0a0'
    libzlib: '>=1.2.12,<1.3.0a0'
    ncurses: '>=6.3,<7.0a0'
    openssl: '>=3.0.5,<4.0a0'
    readline: '>=8.1.2,<9.0a0'
    tk: '>=8.6.12,<8.7.0a0'
    tzdata: ''
    xz: '>=5.2.6,<5.3.0a0'
  hash:
    md5: 98d77e6496f7516d6b3c508f71c102fc
    sha256: 51858b574a043bd0f7225880ecb11624c0545ef04865f848cd5a54c487bc637f
  manager: conda
  name: python
  optional: true
  platform: linux-64
  url: https://conda.anaconda.org/conda-forge/linux-64/python-3.10.6-ha86cf86_0_cpython.tar.bz2
  version: 3.10.6

For this example I first tried installing numpy as the pip dependency. In that scenario the problem did not occur. Possibly something goes wrong when installing a pip dependency that has sub-requirements.
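
A minimal sketch of why this entry breaks a main-only install, assuming the installer simply keeps entries whose category is main. That filtering rule is an assumption for illustration, not conda-lock's actual code; `main_only` is a hypothetical helper, and the entries other than python are made up.

```python
def main_only(entries):
    """Keep entries a main-only install would select (assumed rule: category == 'main')."""
    return [e["name"] for e in entries if e["category"] == "main"]


# Trimmed entries; only python's 'dev' label is taken from the report above.
entries = [
    {"name": "python", "category": "dev", "manager": "conda"},
    {"name": "conda-lock", "category": "main", "manager": "conda"},  # hypothetical
    {"name": "seaborn", "category": "dev", "manager": "pip"},        # hypothetical
]

print(main_only(entries))  # ['conda-lock']
```

Because python carries only the dev label, it drops out of the selection even though environment.yml lists it as a main dependency.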

@mariusvniekerk
Collaborator

I would recommend not including conda-lock in your lockfile itself. That's a rather weird antipattern.

@maresb
Contributor

maresb commented Nov 8, 2022

> I would recommend not including conda-lock in your lockfile itself. That's a rather weird antipattern.

Yes, but it's normal to have conda-lock in dev-environment.yml right? (Especially if one uses micromamba to create the environment from the lockfile.)

@mariusvniekerk
Collaborator

mariusvniekerk commented Nov 8, 2022

I tend to just install conda-lock with pipx or condax (or directly into the base conda environment) and have it live entirely in its own environment. It's a cli tool so no need to import it or mix it with other stuff

maresb added a commit to maresb/conda-lock that referenced this issue Nov 8, 2022
@ghost
Author

ghost commented Nov 8, 2022

Makes sense, but I don't think having conda-lock in the environment is what causes python to end up in dev?

@voodoo11

voodoo11 commented Nov 8, 2022

I can confirm the issue exists. It is not only a python problem but also applies to other dependencies. Having conda-lock in the environment is unrelated.

@voodoo11

voodoo11 commented Nov 9, 2022

> Last time I tried adding something as a dev and main dep it led to it not being installed in main. That's why we need to turn category: str into categories: list[str].

In my understanding, categories are reliable only if the whole dependency trees of the categories are disjoint (which hardly ever happens in the real world). First, no dependency, direct or indirect, can have two categories, so it is not possible to select an arbitrary subset of categories when installing the environment. Second, a category can be overwritten in a non-deterministic manner, which makes it impossible to create a hierarchy of categories.
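
The overwrite problem can be sketched as follows. This is a toy model, not conda-lock's actual algorithm: `assign_single_category` is a hypothetical function, and the dependency graph is invented to resemble the report (a dev-only pip package that needs python). It only shows that when each package stores one category, the final label depends on the order in which the trees are walked.

```python
def assign_single_category(roots, deps):
    """Walk each category's dependency tree, storing ONE category per package.

    roots: list of (category, package) pairs, in processing order.
    deps: mapping from package name to its direct dependencies.
    """
    category = {}
    for cat, root in roots:
        stack = [root]
        seen = set()
        while stack:
            pkg = stack.pop()
            if pkg in seen:
                continue
            seen.add(pkg)
            category[pkg] = cat  # overwrites whatever an earlier tree wrote
            stack.extend(deps.get(pkg, []))
    return category


# Hypothetical graph: seaborn (dev) depends on python, which main also requests.
deps = {"python": ["openssl"], "seaborn": ["python"], "openssl": []}

main_first = assign_single_category([("main", "python"), ("dev", "seaborn")], deps)
dev_first = assign_single_category([("dev", "seaborn"), ("main", "python")], deps)

print(main_first["python"])  # dev
print(dev_first["python"])   # main
```

The same inputs yield different labels for python depending only on processing order, which is the kind of instability described above.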

@maresb
Contributor

maresb commented Nov 10, 2022

Regarding categories:, I proposed a change to the lockfile format in mamba-org/mamba#1209 (comment). The response was positive, but there doesn't seem to be any movement yet towards an implementation.

@maresb
Contributor

maresb commented Nov 11, 2022

Regarding categories I just had a rough idea how we might be able to move the concept forward from purely the conda-lock side, posted as #278. I'm not sure if it's viable, but if it works it'd be fairly simple to implement.

@Ben-Habermeyer

I seem to be experiencing this issue, specifically when locking pip requirements defined in a pyproject.toml file where dev dependencies are installed in the [tool.poetry.dev-dependencies] section.

What I observe is that transitive dependencies managed with pip are installed in the dev category only, so that when only main dependencies are installed into an environment, the code raises a ModuleNotFoundError.

Example, requirement X and dev requirement Y both depend on Z, but Z gets installed in dev only.
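
The X, Y, Z case can be written out as a small sketch of the outcome one would want, where each package records every category whose tree reaches it. This is an illustration under assumed names (`collect_categories` is hypothetical), not how conda-lock computes categories.

```python
from collections import defaultdict


def collect_categories(roots, deps):
    """Return {package: set of categories whose dependency tree reaches it}."""
    categories = defaultdict(set)
    for cat, root in roots:
        stack = [root]
        seen = set()
        while stack:
            pkg = stack.pop()
            if pkg in seen:
                continue
            seen.add(pkg)
            categories[pkg].add(cat)  # accumulate instead of overwriting
            stack.extend(deps.get(pkg, []))
    return dict(categories)


# X is a main requirement, Y is a dev requirement, and both depend on Z.
deps = {"X": ["Z"], "Y": ["Z"], "Z": []}
result = collect_categories([("main", "X"), ("dev", "Y")], deps)

print(sorted(result["Z"]))  # ['dev', 'main']
```

Here Z belongs to both main and dev, so a main-only install would still include it.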

I have only observed this issue since moving to the new lockfile format (conda-lock >=2).

Is the only solution to assemble two separate environments for "dev" and "no dev", where "dev" contains all the dependencies of the latter plus the dev requirements (and therefore not use the categories feature)?

@maresb
Contributor

maresb commented Jul 31, 2024

Hi @Ben-Habermeyer, in practice I've usually managed to work around the issue by ensuring that main dependencies are never repeated as dev dependencies and then occasionally adding any missing transitive dependencies as explicit main dependencies as needed. It's pretty ugly and fragile but usually gets the job done.
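
One way to find which transitive dependencies need to be added explicitly is sketched below. It assumes lockfile package entries have been loaded into a list of dicts with `name`, `category`, and `dependencies` keys, as in the lockfile excerpt earlier in this thread. `missing_from_main` is a hypothetical helper, and the sketch ignores platforms, package managers, and name differences between conda and pip.

```python
def missing_from_main(entries, main_roots):
    """List packages reachable from the main roots that are not labelled 'main'."""
    by_name = {e["name"]: e for e in entries}
    missing, seen, stack = [], set(), list(main_roots)
    while stack:
        name = stack.pop()
        if name in seen or name not in by_name:
            continue
        seen.add(name)
        entry = by_name[name]
        if entry["category"] != "main":
            missing.append(name)
        # 'dependencies' maps dependency name -> version spec; only names matter here.
        stack.extend(entry.get("dependencies", {}))
    return sorted(missing)


# Toy entries: x is main, y is dev, and their shared dependency z was labelled dev.
entries = [
    {"name": "x", "category": "main", "dependencies": {"z": ""}},
    {"name": "y", "category": "dev", "dependencies": {"z": ""}},
    {"name": "z", "category": "dev", "dependencies": {}},
]

print(missing_from_main(entries, ["x"]))  # ['z']
```

Anything the function reports is a candidate to list as an explicit main dependency.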

Since you're working with a Poetry-based project, I'd highly recommend looking into pixi. Pixi is basically Poetry for conda. It does pypi dependencies a lot more robustly and is being developed actively by a team that I really admire. There are a few differences: it's project-based rather than global-environment-based. Even though you're working with a pyproject.toml I'd recommend using pixi.toml for better conceptual separation even though they're equivalent.

@Ben-Habermeyer

> Hi @Ben-Habermeyer, in practice I've usually managed to work around the issue by ensuring that main dependencies are never repeated as dev dependencies and then occasionally adding any missing transitive dependencies as explicit main dependencies as needed. It's pretty ugly and fragile but usually gets the job done.
>
> Since you're working with a Poetry-based project, I'd highly recommend looking into pixi. Pixi is basically Poetry for conda. It does pypi dependencies a lot more robustly and is being developed actively by a team that I really admire. There are a few differences: it's project-based rather than global-environment-based. Even though you're working with a pyproject.toml I'd recommend using pixi.toml for better conceptual separation even though they're equivalent.

Thanks @maresb, yes I have been looking at #278, #434 and mamba-org/mamba#1209 as well - seems to be exactly what I am experiencing.

Yeah, essentially what I have is a bioinformatics Python package whose dependencies are locked using pyproject.toml. Conda-only, non-Python packages such as bwa are listed under [tool.conda-lock.dependencies], Python dependencies under [tool.poetry.dependencies], and dev dependencies under [tool.poetry.dev-dependencies]. Those which can only be installed via pip (such as ones from our internal pypi) are specified with {"source" = "pypi"}.

Previously (conda lock v1) conda-lock would generate a multiplatform lock file with main and dev dependencies separated - dev would be installed for local development (things like linters, pytest etc.) and main would be installed on the production image.

But as others and I have explained, there now seems to be a transitive dependency issue with categories, where the transitive dependencies (seemingly limited to pip dependencies?) are listed only in the dev category.

I have tried your suggestion of adding explicit transitive dependencies with some success. For non-pyproject.toml-based requirements, I have also used https://github.com/conda-incubator/conda-env-builder to define environment YAML files which inherit from one another (in this example, dev would inherit from main).

I will also take a look at pixi as you suggested. For the time being do you suggest following #273 as the root issue when a fix may be addressed in conda-lock? Thanks again.

@maresb
Contributor

maresb commented Aug 1, 2024

> I will also take a look at pixi as you suggested. For the time being do you suggest following #273 as the root issue when a fix may be addressed in conda-lock? Thanks again.

Good question, there's quite a web of issues and PRs at the moment. I'd recommend following #300.
