
[ci] Dask tests on Linux failing #4285

Closed
jameslamb opened this issue May 14, 2021 · 10 comments

@jameslamb (Collaborator)

Description

Created from #4283 (comment).

Python CI jobs in many Linux environments are failing with the following error for all Dask tests.

>       from distributed.protocol.core import dumps_msgpack
E       ImportError: cannot import name 'dumps_msgpack' from 'distributed.protocol.core' (/root/miniconda/envs/test-env/lib/python3.7/site-packages/distributed/protocol/core.py)

examples:

Additional Comments

It seems dumps_msgpack was removed in distributed version 2021.04.1, in dask/distributed#4677.

https://github.com/dask/distributed/blob/4637099e2548a963197fdcc04e563401f77adef5/docs/source/changelog.rst#2021041

So I think the root issue is that dask 2021.4.0 is not compatible with distributed 2021.4.1.
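As a quick sanity check of that hypothesis (a minimal sketch; it assumes it is run inside the failing test environment), the two installed versions can be printed side by side:

# sketch: print the installed dask and distributed versions next to each other
python -c "import dask, distributed; print(dask.__version__, distributed.__version__)"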

This looks very similar to the situation in #4054, where incompatible versions of dask and distributed were uploaded to the official Anaconda channel several days apart.

I can see the following versions getting installed in the failing builds:

dask               pkgs/main/noarch::dask-2021.4.0-pyhd3eb1b0_0
dask-core          pkgs/main/noarch::dask-core-2021.4.0-pyhd3eb1b0_0
distributed        pkgs/main/linux-64::distributed-2021.4.1-py39h06a4308_0
...
dask-2021.4.0              |     pyhd3eb1b0_0           5 KB
dask-core-2021.4.0         |     pyhd3eb1b0_0         670 KB
distributed-2021.4.1       |   py37h06a4308_0         1.0 MB

I'm not sure how we're ending up with distributed 2021.4.1 though, or why this just started breaking. It seems dask, dask-core, and distributed haven't been updated in a few weeks, and the latest version for all of them is 2021.4.0.
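One way to double-check which builds the defaults channel is actually advertising (a minimal sketch; output formatting depends on the conda version):

# sketch: query distributed builds from the defaults channel only
conda search distributed --override-channels --channel defaults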

https://anaconda.org/anaconda/dask


https://anaconda.org/anaconda/dask-core


https://anaconda.org/anaconda/distributed


@jameslamb (Collaborator, Author)

I do see that distributed was updated to 2021.4.1 on conda-forge about 9 hours ago. Are this project's CI jobs somehow getting that version from conda-forge?

https://anaconda.org/conda-forge/distributed
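One way to confirm which channel the installed package actually came from (a sketch, assuming the CI conda environment is named test-env, as in the traceback above):

# sketch: show the source channel of the installed distributed package
conda list --name test-env --show-channel-urls distributed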


@jameslamb (Collaborator, Author)

I've just restarted the failing Linux_Latest builds on https://dev.azure.com/lightgbm-ci/lightgbm-ci/_build/results?buildId=10070&view=results (from #4283 ).

If they fail with the same error as documented above, that would rule out the possibility that distributed 2021.4.1 was uploaded to the main Anaconda channel and then taken down. I think that's very unlikely but just trying to narrow things down.

@jameslamb (Collaborator, Author)

jameslamb commented May 14, 2021

I tried to replicate this with Docker tonight, running the following commands from the root of the repo (everything after the docker run line is run inside the container).

docker run \
  -v $(pwd):/opt/LightGBM \
  -w /opt/LightGBM \
  --env COMPILER=clang \
  --env BUILD_DIRECTORY=/opt/LightGBM \
  --env DEBIAN_FRONTEND='noninteractive' \
  --env IN_UBUNTU_LATEST_CONTAINER='true' \
  --env OS_NAME='linux' \
  --env SETUP_CONDA='true' \
  --env PYTHON_VERSION=3.6 \
  --env TASK=regular \
  -it ubuntu:latest \
  /bin/bash

apt update -y
apt install -y sudo

export CONDA=${HOME}/miniconda
export LGB_VER=$(head -n 1 VERSION.txt)
export PATH=${CONDA}/bin:${PATH}

./.ci/setup.sh

Here is some of the output of conda config --show. I don't see conda-forge listed.

channels:
  - defaults
custom_channels:
  pkgs/main: https://repo.anaconda.com
  pkgs/r: https://repo.anaconda.com
  pkgs/pro: https://repo.anaconda.com
custom_multichannels:
  defaults:
    - https://repo.anaconda.com/pkgs/main
    - https://repo.anaconda.com/pkgs/r
  local:
default_channels:
  - https://repo.anaconda.com/pkgs/main
  - https://repo.anaconda.com/pkgs/r
Full output of 'conda config --show':
add_anaconda_token: True
add_pip_as_python_dependency: True
aggressive_update_packages:
  - ca-certificates
  - certifi
  - openssl
allow_conda_downgrades: False
allow_cycles: True
allow_non_channel_urls: False
allow_softlinks: False
always_copy: False
always_softlink: False
always_yes: True
anaconda_upload: None
auto_activate_base: True
auto_stack: 0
auto_update_conda: True
bld_path:
changeps1: False
channel_alias: https://conda.anaconda.org
channel_priority: flexible
channels:
  - defaults
client_ssl_cert: None
client_ssl_cert_key: None
clobber: False
conda_build: {}
create_default_packages: []
croot: /root/miniconda/conda-bld
custom_channels:
  pkgs/main: https://repo.anaconda.com
  pkgs/r: https://repo.anaconda.com
  pkgs/pro: https://repo.anaconda.com
custom_multichannels:
  defaults:
    - https://repo.anaconda.com/pkgs/main
    - https://repo.anaconda.com/pkgs/r
  local:
debug: False
default_channels:
  - https://repo.anaconda.com/pkgs/main
  - https://repo.anaconda.com/pkgs/r
default_python: 3.8
default_threads: None
deps_modifier: not_set
dev: False
disallowed_packages: []
download_only: False
dry_run: False
enable_private_envs: False
env_prompt: ({default_env})
envs_dirs:
  - /root/miniconda/envs
  - /root/.conda/envs
error_upload_url: https://conda.io/conda-post/unexpected-error
execute_threads: 1
extra_safety_checks: False
force: False
force_32bit: False
force_reinstall: False
force_remove: False
ignore_pinned: False
json: False
local_repodata_ttl: 1
migrated_channel_aliases: []
migrated_custom_channels: {}
non_admin_enabled: True
notify_outdated_conda: True
offline: False
override_channels_enabled: True
path_conflict: clobber
pinned_packages: []
pip_interop_enabled: False
pkgs_dirs:
  - /root/miniconda/pkgs
  - /root/.conda/pkgs
proxy_servers: {}
quiet: False
remote_backoff_factor: 1
remote_connect_timeout_secs: 9.15
remote_max_retries: 3
remote_read_timeout_secs: 60.0
repodata_fns:
  - current_repodata.json
  - repodata.json
repodata_threads: None
report_errors: None
restore_free_channel: False
rollback_enabled: True
root_prefix: /root/miniconda
safety_checks: warn
sat_solver: pycosat
separate_format_cache: False
shortcuts: True
show_channel_urls: None
signing_metadata_url_base: https://repo.anaconda.com/pkgs/main
solver_ignore_timestamps: False
ssl_verify: True
subdir: linux-64
subdirs:
  - linux-64
  - noarch
target_prefix_override:
track_features: []
unsatisfiable_hints: True
unsatisfiable_hints_check_depth: 2
update_modifier: update_specs
use_index_cache: False
use_local: False
use_only_tar_bz2: False
verbosity: 0
verify_threads: 1
whitelist_channels: []

@jameslamb (Collaborator, Author)

Aha! Looking at repo.anaconda.com, I do see that distributed 2021.04.1 was uploaded earlier today.

https://repo.anaconda.com/pkgs/main/linux-64/
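For anyone who wants to reproduce that check without the browser, the channel's repodata can be grepped directly (a rough sketch; the index is large, so this takes a moment):

# sketch: list distributed package filenames in the defaults linux-64 repodata
curl -s https://repo.anaconda.com/pkgs/main/linux-64/repodata.json \
  | grep -o '"distributed-[^"]*\.tar\.bz2"' | sort -u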


So I guess I misunderstood the relationship between those web pages like https://anaconda.org/anaconda/distributed or https://anaconda.org/main/distributed and what the "default" channels are.


Given that we have agreed in #4054 not to use conda-forge, I think the options to fix these CI jobs are:

  1. Just wait until dask 2021.4.1 is released to the default conda channels
    • NOTE: it was released to PyPI and conda-forge on April 23; not sure how long it will take to reach the default conda channels
  2. Pin distributed and dask to 2021.4.0 in CI (a sketch of what that pin could look like follows this list)
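A rough sketch of what option 2 could look like in the CI setup (the environment name matches the traceback above; where exactly the pin would live in our scripts is still to be decided):

# sketch: pin dask and distributed to the same known-good version in CI
conda install -q -y -n test-env dask=2021.4.0 distributed=2021.4.0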

@StrikerRUS which option do you prefer? Or is there some other option you can think of that I haven't considered?

@StrikerRUS (Collaborator)

StrikerRUS commented May 14, 2021

Given that dask and distributed are tied to each other and in fact should be installed with the same version number, I think we should pin their versions in our CI and bump them from time to time. Besides, it looks like their releases are quite frequent.

Another option would be to get them from PyPI directly. But that option should be investigated for installation time and possible dependency conflicts. And of course it is suitable only if these packages are updated simultaneously on PyPI; otherwise we gain nothing by switching.

@jameslamb (Collaborator, Author)

alright. I'll pin them for now to unblock CI but we can leave this repo open to track the work of maybe switching to PyPI.

dask and distributed are released very frequently, as you mentioned, so I'd like to avoid the maintenance work of updating pins if possible.

I've also opened dask/distributed#4819 asking the distributed maintainers if they can influence when these packages are published on the Anaconda main channels.

@StrikerRUS (Collaborator)

> I'll pin them for now to unblock CI but we can leave this repo open to track the work of maybe switching to PyPI.

Agreed.

> so I'd like to avoid the maintenance work of updating pins if possible.

We can create requirements_dask.txt and set up Dependabot to update this file if we end up using pip later.
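A minimal sketch of how that could look (the file name and the exact pins are hypothetical at this point):

# sketch: a pinned requirements file that Dependabot could keep up to date,
# installed with pip instead of conda in the CI scripts
cat > requirements_dask.txt <<'EOF'
dask==2021.4.0
distributed==2021.4.0
EOF
pip install -r requirements_dask.txt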

@jameslamb (Collaborator, Author)

ha sorry I meant "leave this issue" open, not "leave this repo". Typing too fast this morning.

> We can create requirements_dask.txt and set up Dependabot to update this file if we end up using pip later.

Oh, great idea! I've definitely seen a setup like that be successful in other projects.

@github-actions (bot)

This issue has been automatically locked since there has not been any recent activity since it was closed. To start a new related discussion, open a new issue at https://github.com/microsoft/LightGBM/issues including a reference to this.

github-actions bot locked as resolved and limited conversation to collaborators on Aug 23, 2023