Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Install and import from kerchunk (not fsspec-reference-maker) #324

Merged
merged 5 commits into from
Mar 11, 2022

Conversation

cisaacstern
Copy link
Member

I believe the fact that upstream-dev tests are failing, despite an upstream fix in kerchunk may be related to the fact that we never migrated from installing and importing from the the fsspec-reference-maker namespace.

As a result, when we install the upstream dev requirements, we get a kerchunk installation in addition to an fsspec-reference-maker installation. Obviously, this is problematic.

I hope migrating to the kerchunk namespace for installation and import may solve this.

@review-notebook-app
Copy link

Check out this pull request on  ReviewNB

See visual diffs & provide feedback on Jupyter Notebooks.


Powered by ReviewNB

@cisaacstern
Copy link
Member Author

cisaacstern commented Mar 11, 2022

Status update:

Re: kerchunk, even though there is a Conda Forge feedstock available, I've kept it as a pip dependency, because installing from conda-forge alongside pynio (another one of our dependencies) causes problems.

The minimal reproducer is to build and activate an environment from

# kerchunk-plus-pynio.yml

name: kerchunk-plus-pynio
channels:
  - conda-forge
dependencies:
  - python=3.8
  - kerchunk>=0.0.4
  - pynio

with

$ conda env create --file kerchunk-plus-pynio.yml

Then call

(kerchunk-plus-pynio) $ python3 -c "import Nio"

which replicates this line from the xarray pynio backend, and raises

Traceback (most recent call last):
  File "<string>", line 1, in <module>
  File "/miniconda3/envs/kerchunk-plus-pynio-conda/lib/python3.8/site-packages/PyNIO/Nio.py", line 83, in <module>
    from _nio import *
ImportError: dlopen(/miniconda3/envs/kerchunk-plus-pynio-conda/lib/python3.8/site-packages/PyNIO/_nio.cpython-38-darwin.so, 0x0002): Library not loaded: @rpath/libtbb.dylib
  Referenced from: /miniconda3/envs/kerchunk-plus-pynio-conda/lib/libtiledb.dylib
  Reason: tried: '/miniconda3/envs/kerchunk-plus-pynio-conda/lib/libtbb.dylib' (no such file), '/miniconda3/envs/kerchunk-plus-pynio-conda/lib/libtbb.dylib' (no such file), '/miniconda3/envs/kerchunk-plus-pynio-conda/lib/libtbb.dylib' (no such file), '/miniconda3/envs/kerchunk-plus-pynio-conda/lib/libtbb.dylib' (no such file), '/miniconda3/envs/kerchunk-plus-pynio-conda/lib/libtbb.dylib' (no such file), '/miniconda3/envs/kerchunk-plus-pynio-conda/lib/libtbb.dylib' (no such file), '/miniconda3/envs/kerchunk-plus-pynio-conda/lib/libtbb.dylib' (no such file), '/miniconda3/envs/kerchunk-plus-pynio-conda/lib/python3.8/site-packages/PyNIO/../../../libtbb.dylib' (no such file), '/miniconda3/envs/kerchunk-plus-pynio-conda/lib/libtbb.dylib' (no such file), '/miniconda3/envs/kerchunk-plus-pynio-conda/lib/libtbb.dylib' (no such file), '/miniconda3/envs/kerchunk-plus-pynio-conda/lib/libtbb.dylib' (no such file), '/miniconda3/envs/kerchunk-plus-pynio-conda/lib/python3.8/site-packages/PyNIO/../../../libtbb.dylib' (no such file), '/miniconda3/envs/kerchunk-plus-pynio-conda/bin/../lib/libtbb.dylib' (no such file), '/miniconda3/envs/kerchunk-plus-pynio-conda/bin/../lib/libtbb.dylib' (no such file), '/usr/local/lib/libtbb.dylib' (no such file), '/usr/lib/libtbb.dylib' (no such file)

Installing kerchunk from pip avoids this problem, for some reason.

Is this worth raising on https://github.com/conda-forge/kerchunk-feedstock? I ask because the xarray docs indicate that pynio is no longer actively maintained so maybe chasing pynio compatibility is not a relevant concern, and the solution is for us to just drop the pynio dependency?

cc @martindurant @rabernat

@martindurant
Copy link
Contributor

Since the original fsspec-reference-maker surely has no users, I could remove it from PyPI, if that helps; but it's more involved fro conda-forge. I can also remove the alias package from kerchunk - but you won't notice that unless I make a new release (which would bring combine2 with it, since I should merge that).

@rabernat
Copy link
Contributor

For a fix for the nbmake issue, see pydata/xarray#6350.

@cisaacstern
Copy link
Member Author

@martindurant, this PR resolves our fsspec-reference-maker/kerchunk namespace issues by moving completely to the kerchunk namespace, so no package index changes needed on my account. My main question was whether or not you care that conda apparently cannot resolve a functional environment with kerchunk + pynio both installed from conda-forge.

@martindurant
Copy link
Contributor

kerchunk's deps are extremely minimal, so I don't see how that's causing a problem.

requirements:
  host:
    - pip
    - python >=3.7
  run:
    - fsspec
    - python >=3.7
    - ujson

@cisaacstern
Copy link
Member Author

cisaacstern commented Mar 11, 2022

Since ujson and pynio both have C components and the traceback in #324 (comment) indicates a missing .dylib ... maybe it's ujson?

Edit: I will test this and report back.

@martindurant
Copy link
Contributor

Don't know, maybe? It says the lib was missing, though, not some symbol clash. For reference, libtbb is some multiprocessing abstraction.

Could try explicitly listing ujson with some specific version, but that's shooting in the dark.

@cisaacstern
Copy link
Member Author

For a fix for the nbmake issue, see pydata/xarray#6350

Thanks Ryan. 4fa6fea also fixes this, because it bumps to nbclient>=0.5.13, which includes the jupyter/nbclient#209 bugfix.

@cisaacstern
Copy link
Member Author

cisaacstern commented Mar 11, 2022

Martin, is the removal of MultiZarrToZarr.xarray_open_kwargs in kerchunk main intended as a breaking change? We're now failing here due to that

@martindurant
Copy link
Contributor

Yes, this is unavoidable. I am looking at this repo now for the fix. The signature is now:

    def __init__(self, path, coo_map=None, concat_dims=None, coo_dtypes=None,
                 identical_dims=None,
                 target_options=None, remote_protocol=None, remote_options=None,
                 inline_threshold=500, preprocess=None, postprocess=None):

where probably the only content of xarray_open_kwargs was concat_dims, which now becomes simply concat_dims by itself.

@cisaacstern
Copy link
Member Author

Re: kerchunk/pynio incompatibility, this is actually a ujson/pynio incompatibility as this environment breaks with the same traceback reported in #324 (comment):

name: ujson-pynio
channels:
  - conda-forge
dependencies:
  - python=3.8
  - pynio
  - ujson

I'm a bit out of my depth in debugging this, but my hunch is that pynio being unmaintained is the problem, and maybe we want to just drop the pynio dependency?

@martindurant
Copy link
Contributor

Certainly, ujson is common enough that it could easily creep into the environment via some other dependency in another package later, so if we don't need pynio...

@cisaacstern
Copy link
Member Author

The pynio question is being tracked in #325 so the only remaining question here is how to handle the upstream kerchunk API change. (Note that for some reason the 3.8, upstream-dev builds just passed, but that is a fluke of caching or somesuch thing. I re-ran these tests locally against kerchunk@main and of course they also fail due to the API change.)

@cisaacstern
Copy link
Member Author

Per offline conversation with @rabernat, I'll merge this as-is, with the knowledge that after the next kerchunk release, we'll need to pin kerchunk to >={next release} and change our MultiZarrToZarr call signature.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants