Set up reproducible Python environments with conda-lock #2901

Merged 47 commits on Oct 21, 2023
Commits
3fc5951
Add a conda-lock setup for discussion.
zaneselvans Sep 27, 2023
5e82997
Move python-snappy into project.dependencies in pyproject.toml
zaneselvans Sep 28, 2023
da594f1
Remove sphinx-autoapi from pypi deps, and no longer required snappy-p…
zaneselvans Oct 12, 2023
a9bde5a
Update conda-lock file
zaneselvans Oct 13, 2023
ce62f2d
Switch to using conda-forge version of recordlinkage v0.16
zaneselvans Oct 13, 2023
beb0f35
Update conda-lock.yml now that all dependencies are available on cond…
zaneselvans Oct 14, 2023
08a58c0
Consolidate conda env files under environments/ dir
zaneselvans Oct 19, 2023
cdde302
Add a GitHub action to relock dependencies
zaneselvans Oct 19, 2023
468ded1
Quote the pip install command
zaneselvans Oct 19, 2023
fbe5f0f
Remove pip install of pudl from environment.yml
zaneselvans Oct 19, 2023
06b45ed
Rename workflow
zaneselvans Oct 20, 2023
1adb606
Only build lockfile from pyproject.toml, don't install extras.
zaneselvans Oct 20, 2023
2fd2dad
Just install conda-lock, not pudl, before running conda-lock.
zaneselvans Oct 20, 2023
2ce49cf
install conda-lock with pip
zaneselvans Oct 20, 2023
fb77b54
Move all remaining dev-environment.yml deps to pyproject.toml
zaneselvans Oct 20, 2023
41d9f3d
Add other platforms; make draft PR against dev.
zaneselvans Oct 20, 2023
54ad813
Comment out dev base branch for now.
zaneselvans Oct 20, 2023
b81e69a
Remove pandas extras and recordlinkage deps from pyproject.toml
zaneselvans Oct 20, 2023
846e905
Use conda-lock --micromamba rather than --mamba
zaneselvans Oct 20, 2023
4bbfc41
Update lockfile
zaneselvans Oct 20, 2023
f4141d3
Don't specify grpcio, or specific recordlinkage version
zaneselvans Oct 20, 2023
e644a0b
Update lockfile
zaneselvans Oct 20, 2023
887e60a
Render platform-specific environment files in github action
zaneselvans Oct 20, 2023
6b0c10d
Fix paths relative to environments directory
zaneselvans Oct 20, 2023
c99388d
Update lockfile
zaneselvans Oct 20, 2023
ebfbfc8
Add some comment notes to workflow
zaneselvans Oct 20, 2023
45ecbd0
Render environment for Read The Docs.
zaneselvans Oct 20, 2023
d1f29ec
Use environment not explicit rendered lockfile
zaneselvans Oct 20, 2023
402af3a
Add readthedocs specific sphinx extension
zaneselvans Oct 20, 2023
db4502a
Don't render explicit conda env for RTD since it can't read it.
zaneselvans Oct 20, 2023
09592c8
Build linux-aarch64 lockfile. Use conda-lock.yml in workflows.
zaneselvans Oct 20, 2023
c670207
Comment out non-working linux-aarch64 platform for now.
zaneselvans Oct 20, 2023
ad488ff
Switch to using rendered lockfiles.
zaneselvans Oct 20, 2023
dd210ea
Remove deprecated environment files
zaneselvans Oct 20, 2023
1647bd1
Update lockfile
zaneselvans Oct 20, 2023
ad1f46b
Switch to using a micromamba docker image
zaneselvans Oct 20, 2023
26f3041
Install git into the docker image.
zaneselvans Oct 20, 2023
400a67f
Use micromamba and unrendered multi-platform lockfile.
zaneselvans Oct 20, 2023
8f490a7
Add main category to micromamba environment creation.
zaneselvans Oct 20, 2023
a4986d6
Use conda-lock not base as env name
zaneselvans Oct 20, 2023
02dddd3
Update lockfile
zaneselvans Oct 21, 2023
1c981af
Add a conda-lock setup for discussion.
zaneselvans Oct 21, 2023
0cd0ebb
Move python-snappy into project.dependencies in pyproject.toml
zaneselvans Sep 28, 2023
9ae6973
Re-render all conda lockfiles
zaneselvans Oct 21, 2023
eebc98c
Update lockfile
zaneselvans Oct 21, 2023
dd3d2e3
[pre-commit.ci] auto fixes from pre-commit.com hooks
pre-commit-ci[bot] Oct 21, 2023
f954409
Merge pull request #2967 from catalyst-cooperative/update-lockfile
zaneselvans Oct 21, 2023
27 changes: 9 additions & 18 deletions .github/workflows/tox-pytest.yml
@@ -35,13 +35,10 @@ jobs:
- name: Install Conda environment using mamba
uses: mamba-org/setup-micromamba@v1
with:
environment-file: test/test-environment.yml
environment-file: environments/conda-lock.yml
environment-name: pudl-dev
cache-environment: true
condarc: |
channels:
- conda-forge
- defaults
channel_priority: strict
create-args: --category main dev docs test datasette

- name: Log environment details
run: |
@@ -78,13 +75,10 @@ jobs:
- name: Install Conda environment using mamba
uses: mamba-org/setup-micromamba@v1
with:
environment-file: test/test-environment.yml
environment-file: environments/conda-lock.yml
environment-name: pudl-dev
cache-environment: true
condarc: |
channels:
- conda-forge
- defaults
channel_priority: strict
create-args: --category main dev docs test datasette

- name: Log environment details
run: |
@@ -131,13 +125,10 @@ jobs:
- name: Install Conda environment using mamba
uses: mamba-org/setup-micromamba@v1
with:
environment-file: test/test-environment.yml
environment-file: environments/conda-lock.yml
environment-name: pudl-dev
cache-environment: true
condarc: |
channels:
- conda-forge
- defaults
channel_priority: strict
create-args: --category main dev docs test datasette

- name: Log environment details
run: |
77 changes: 77 additions & 0 deletions .github/workflows/update-lockfile.yml
@@ -0,0 +1,77 @@
---
name: update-lockfile

on:
  workflow_dispatch:
  #schedule:
  # - cron: "0 9 * * 1-5" # Weekdays at 9AM UTC
  #pull_request:
  # paths:
  # - "pyproject.toml"

jobs:
  conda-lock:
    # Don't run scheduled job on forks.
    if: (github.event_name == 'schedule' && github.repository == 'catalyst-cooperative/pudl') || (github.event_name != 'schedule')
    defaults:
      run:
        shell: bash -l {0}
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      # If running on a schedule, run on dev.
      # If running from workflow_dispatch, run on whatever the chosen branch/ref was.
      # with:
      # ref: dev
      - name: Install Micromamba
        uses: mamba-org/setup-micromamba@v1
        with:
          environment-name: conda-lock
          create-args: >-
            python=3.11
            conda-lock
      - name: Run conda-lock to recreate lockfile from scratch
        run: |
          cd environments
          rm conda-lock.yml
          conda-lock \
            --micromamba \
            --file=../pyproject.toml \
            --lockfile=conda-lock.yml
          conda-lock render \
            --kind explicit \
            --kind env \
            --dev-dependencies \
            --extras docs \
            --extras datasette \
            conda-lock.yml
          conda-lock render \
            --kind env \
            --extras docs \
            --platform linux-64 \
            --filename-template "readthedocs-{platform}.conda.lock" \
            conda-lock.yml
          cd ..
      - name: Open a pull request
        uses: peter-evans/create-pull-request@v5
        with:
          # # The default GITHUB_TOKEN doesn't allow other workflows to trigger.
          # # Thus if there are tests to be run, they won't be run. For more info,
          # # see the note under
          # # <https://github.com/peter-evans/create-pull-request#action-inputs>.
          # # One possible workaround is to specify a Personal Access Token (PAT).
          # # This PAT should have read-write permissions for "Pull Requests"
          # # and read-write permissions for "Contents".
          # token: ${{ secrets.GH_PAT_FOR_PR }}
          commit-message: Update lockfile
          title: Update Lockfile
          body: >
            This pull request relocks the dependencies with conda-lock.
            It is triggered by [update-lockfile](https://github.com/catalyst-cooperative/pudl/blob/main/.github/workflows/update-lockfile.yml).
          branch: update-lockfile
          labels: dependencies, conda-lock
          reviewers: zaneselvans
          delete-branch: true
          # base: dev
          draft: true
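
To try the relock locally before relying on the workflow, the three conda-lock invocations above could be wrapped roughly as follows. This is a minimal sketch, not part of the PR: the flags are copied from update-lockfile.yml, while the helper names `relock_commands` and `relock` and the `dry_run` switch are hypothetical and exist only for illustration.

```python
"""Dry-run sketch of the relock steps in update-lockfile.yml."""
import shlex
import subprocess
from pathlib import Path


def relock_commands(pyproject: str = "../pyproject.toml",
                    lockfile: str = "conda-lock.yml") -> list[list[str]]:
    """Return the conda-lock invocations from the workflow, in order."""
    return [
        # 1. Solve the lockfile from pyproject.toml using micromamba.
        ["conda-lock", "--micromamba",
         f"--file={pyproject}", f"--lockfile={lockfile}"],
        # 2. Render explicit and env files, including dev deps and extras.
        ["conda-lock", "render", "--kind", "explicit", "--kind", "env",
         "--dev-dependencies", "--extras", "docs", "--extras", "datasette",
         lockfile],
        # 3. Render the docs-only linux-64 environment for Read the Docs.
        ["conda-lock", "render", "--kind", "env", "--extras", "docs",
         "--platform", "linux-64",
         "--filename-template", "readthedocs-{platform}.conda.lock",
         lockfile],
    ]


def relock(env_dir: Path = Path("environments"),
           dry_run: bool = True) -> list[str]:
    """Print each command; run it inside env_dir only if dry_run is False."""
    if not dry_run:
        # Mirrors `rm conda-lock.yml` so the lockfile is rebuilt from scratch.
        (env_dir / "conda-lock.yml").unlink(missing_ok=True)
    rendered = []
    for cmd in relock_commands():
        line = shlex.join(cmd)
        rendered.append(line)
        print(line)
        if not dry_run:
            subprocess.run(cmd, cwd=env_dir, check=True)
    return rendered
```

Calling `relock()` only prints the commands; `relock(dry_run=False)` would execute them and assumes conda-lock and micromamba are already installed and that the current directory is the repository root.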
9 changes: 3 additions & 6 deletions .github/workflows/zenodo-cache-sync.yml
@@ -47,13 +47,10 @@ jobs:
- name: Install Conda environment using mamba
uses: mamba-org/setup-micromamba@v1
with:
environment-file: test/test-environment.yml
environment-file: environments/conda-lock.yml
environment-name: pudl-dev
cache-environment: true
condarc: |
channels:
- conda-forge
- defaults
channel_priority: strict
create-args: --category main dev docs test datasette

- name: Log environment details
run: |
4 changes: 2 additions & 2 deletions .gitignore
@@ -23,8 +23,8 @@ codecov.sh
.env_pudl/
*wheel-metadata
dask-worker-space*
devtools/user-requirements.txt
devtools/user-environment.yml
environments/user-requirements.txt
environments/user-environment.yml
.vscode/*
commit.txt
devtools/profiles/
4 changes: 1 addition & 3 deletions .readthedocs.yaml
@@ -14,7 +14,7 @@ build:

# Define the python environment using conda / mamba
conda:
environment: docs/docs-environment.yml
environment: environments/readthedocs-linux-64.conda.lock.yml

# Build documentation in the docs/ directory with Sphinx
sphinx:
@@ -27,5 +27,3 @@ python:
install:
- method: pip
path: .
extra_requirements:
- doc
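
The new Read the Docs environment path appears to be the output of the `--filename-template` render step in update-lockfile.yml. A minimal sketch of how that name comes together, assuming conda-lock substitutes `{platform}` in the template and appends `.yml` for `--kind env` output (this is inferred from the path in this diff, not confirmed here). The helper `rendered_env_filename` is hypothetical.

```python
def rendered_env_filename(template: str, platform: str,
                          suffix: str = ".yml") -> str:
    """Expand a {platform} filename template and add the assumed suffix."""
    # Only the {platform} placeholder is handled in this sketch.
    return template.format(platform=platform) + suffix


name = rendered_env_filename("readthedocs-{platform}.conda.lock", "linux-64")
print(f"environments/{name}")
```

If the assumption holds, the printed path matches the `environment:` value set in .readthedocs.yaml.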
35 changes: 0 additions & 35 deletions devtools/environment.yml

This file was deleted.

16 changes: 0 additions & 16 deletions docker-compose.yml

This file was deleted.

49 changes: 20 additions & 29 deletions docker/Dockerfile
@@ -1,63 +1,54 @@
FROM condaforge/mambaforge:23.3.1-1
FROM mambaorg/micromamba:1.5.1

USER root

# Install curl and jq
# awscli requires unzip, less, groff and mandoc
# hadolint ignore=DL3008
RUN apt-get update && apt-get install --no-install-recommends -y curl jq unzip less groff mandoc \
&& apt-get clean \
&& rm -rf /var/lib/apt/lists/*
RUN apt-get update && \
apt-get install --no-install-recommends -y git curl jq unzip less groff mandoc && \
apt-get clean && \
rm -rf /var/lib/apt/lists/*

# Configure gsutil authentication
# hadolint ignore=DL3059
RUN printf '[GoogleCompute]\nservice_account = default' > /etc/boto.cfg

# Install awscli2
RUN curl "https://awscli.amazonaws.com/awscli-exe-linux-x86_64.zip" -o "awscliv2.zip" && unzip awscliv2.zip && ./aws/install

# Create a non-root user inside the container
# hadolint ignore=DL3059
RUN useradd -Ums /bin/bash catalyst

ENV CONTAINER_HOME=/home/catalyst
RUN curl "https://awscli.amazonaws.com/awscli-exe-linux-x86_64.zip" -o "awscliv2.zip" && \
unzip awscliv2.zip && \
./aws/install

# Switch to being the catalyst user and go into the copied repo
USER catalyst
# Switch back to being non-root user and get into the home directory
USER $MAMBA_USER
ENV CONTAINER_HOME=/home/$MAMBA_USER
WORKDIR ${CONTAINER_HOME}

ENV CONDA_PREFIX=${CONTAINER_HOME}/env
ENV PUDL_REPO=${CONTAINER_HOME}/pudl
ENV CONDA_RUN="conda run --no-capture-output --prefix ${CONDA_PREFIX}"
ENV PYTHON_VERSION="3.11"
ENV CONDA_RUN="micromamba run --prefix ${CONDA_PREFIX}"

ENV CONTAINER_PUDL_WORKSPACE=${CONTAINER_HOME}/pudl_work
ENV PUDL_INPUT=${CONTAINER_PUDL_WORKSPACE}/data
ENV PUDL_INPUT=${CONTAINER_PUDL_WORKSPACE}/input
ENV PUDL_OUTPUT=${CONTAINER_PUDL_WORKSPACE}/output
ENV DAGSTER_HOME=${CONTAINER_PUDL_WORKSPACE}/dagster_home

# Create data input/output directories
RUN mkdir -p ${PUDL_INPUT} ${PUDL_OUTPUT} ${DAGSTER_HOME}

# Create a conda environment based on the specification in the repo
COPY test/test-environment.yml test/test-environment.yml
RUN mamba create --copy --prefix ${CONDA_PREFIX} --yes python=${PYTHON_VERSION} && \
# Then we can use mamba env update, which can parse the environment.yml file:
mamba env update --prefix ${CONDA_PREFIX} --file test/test-environment.yml && \
conda clean -afy


COPY environments/conda-lock.yml environments/conda-lock.yml
RUN micromamba create --prefix ${CONDA_PREFIX} --yes --category main dev docs test datasette --file environments/conda-lock.yml && \
micromamba clean -afy
# Copy the cloned pudl repository into the user's home directory
COPY --chown=catalyst:catalyst . ${CONTAINER_HOME}
COPY --chown=${MAMBA_USER}:${MAMBA_USER} . ${CONTAINER_HOME}

# TODO(rousik): The following is a workaround for sudden breakage where conda
# can't find libraries contained within the environment. It's unclear why!
ENV LD_LIBRARY_PATH=${CONDA_PREFIX}/lib
# We need information from .git to get version with setuptools_scm so we mount that
# directory without copying it into the image.
RUN --mount=type=bind,source=.git,target=${PUDL_REPO}/.git \
${CONDA_RUN} pip install --no-cache-dir -e './[dev,doc,test,datasette]' && \
${CONDA_RUN} pip install --no-cache-dir --editable . && \
# Run the PUDL setup script so we know where to read and write data
${CONDA_RUN} pudl_setup


# Run the unit tests:
CMD ["conda", "run", "--no-capture-output", "--prefix", "${CONDA_PREFIX}", "pytest", "test/unit"]
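
Both the Dockerfile and the workflows pass `--category main dev docs test datasette` when creating the environment from the multi-platform lockfile. The sketch below illustrates the kind of selection this implies, assuming each entry in the lockfile's `package` list carries `name`, `platform`, and `category` fields. That schema, the sample data, and the helper `select_packages` are illustrative assumptions; micromamba's actual resolution may differ in detail.

```python
from collections.abc import Iterable

# Hypothetical, hand-written stand-in for a parsed conda-lock.yml.
# Real lockfiles contain many more fields (versions, URLs, hashes).
SAMPLE_LOCK = {
    "package": [
        {"name": "python", "platform": "linux-64", "category": "main"},
        {"name": "python", "platform": "osx-64", "category": "main"},
        {"name": "pytest", "platform": "linux-64", "category": "test"},
        {"name": "sphinx", "platform": "linux-64", "category": "docs"},
        {"name": "datasette", "platform": "linux-64", "category": "datasette"},
    ]
}


def select_packages(lock: dict, platform: str,
                    categories: Iterable[str]) -> list[str]:
    """Return sorted package names for one platform and a set of categories."""
    wanted = set(categories)
    return sorted(
        pkg["name"]
        for pkg in lock["package"]
        if pkg["platform"] == platform and pkg["category"] in wanted
    )


# Selecting only "main" leaves out the test, docs, and datasette packages.
print(select_packages(SAMPLE_LOCK, "linux-64", ["main"]))
print(select_packages(SAMPLE_LOCK, "linux-64",
                      ["main", "dev", "docs", "test", "datasette"]))
```

Keeping every category in one lockfile and choosing among them at install time is what lets a single `environments/conda-lock.yml` serve CI, the Docker image, and the docs build.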
12 changes: 0 additions & 12 deletions docs/docs-environment.yml

This file was deleted.
