Commit
This commit does not belong to any branch on this repository, and may belong to a fork outside of the repository.
Set up reproducible Python environments with conda-lock (#2901)
* Add a conda-lock setup for discussion.
* Move python-snappy into project.dependencies in pyproject.toml
* Remove sphinx-autoapi from pypi deps, and no longer required snappy-python
* Switch to using conda-forge version of recordlinkage v0.16
* Update conda-lock.yml now that all dependencies are available on conda-forge
* Consolidate conda env files under environments/ dir
* Add a GitHub action to relock dependencies
* Quote the pip install command
* Remove pip install of pudl from environment.yml
* Rename workflow
* Only build lockfile from pyproject.toml, don't install extras.
* Just install conda-lock, not pudl, before running conda-lock.
* install conda-lock with pip
* Move all remaining dev-environment.yml deps to pyproject.toml
* Add other platforms; make draft PR against dev.
* Comment out dev base branch for now.
* Remove pandas extras and recordlinkage deps from pyproject.toml
* Use conda-lock --micromamba rather than --mamba
* Don't specify grpcio, or specific recordlinkage version
* Render platform-specific environment files in github action
* Fix paths relative to environments directory
* Add some comment notes to workflow
* Render environment for Read The Docs.
* Use environment not explicit rendered lockfile
* Add readthedocs specific sphinx extension
* Don't render explicit conda env for RTD since it can't read it.
* Build linux-aarch64 lockfile. Use conda-lock.yml in workflows.
* Comment out non-working linux-aarch64 platform for now.
* Switch to using rendered lockfiles.
* Remove deprecated environment files
* Switch to using a micromamba docker image
* Install git into the docker image.
* Use micromamba and unrendered multi-platform lockfile.
* Add main category to micromamba environment creation.
* Use conda-lock not base as env name
* Add a conda-lock setup for discussion.
* Move python-snappy into project.dependencies in pyproject.toml
* Remove sphinx-autoapi from pypi deps, and no longer required snappy-python
* Add linux-aarch64 platform back into conda-lock settings.
1 parent 0a1a125 commit 01096a6
Showing 22 changed files with 35,060 additions and 154 deletions.
@@ -0,0 +1,77 @@
---
name: update-lockfile

on:
  workflow_dispatch:
  #schedule:
  # - cron: "0 9 * * 1-5" # Weekdays at 9AM UTC
  #pull_request:
  # paths:
  # - "pyproject.toml"

jobs:
  conda-lock:
    # Don't run scheduled job on forks.
    if: (github.event_name == 'schedule' && github.repository == 'catalyst-cooperative/pudl') || (github.event_name != 'schedule')
    defaults:
      run:
        shell: bash -l {0}
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
        # If running on a schedule, run on dev.
        # If running from workflow_dispatch, run on whatever the chosen branch/ref was.
        # with:
        # ref: dev
      - name: Install Micromamba
        uses: mamba-org/setup-micromamba@v1
        with:
          environment-name: conda-lock
          create-args: >-
            python=3.11
            conda-lock
      - name: Run conda-lock to recreate lockfile from scratch
        run: |
          cd environments
          rm conda-lock.yml
          conda-lock \
            --micromamba \
            --file=../pyproject.toml \
            --lockfile=conda-lock.yml
          conda-lock render \
            --kind explicit \
            --kind env \
            --dev-dependencies \
            --extras docs \
            --extras datasette \
            conda-lock.yml
          conda-lock render \
            --kind env \
            --extras docs \
            --platform linux-64 \
            --filename-template "readthedocs-{platform}.conda.lock" \
            conda-lock.yml
          cd ..
      - name: Open a pull request
        uses: peter-evans/create-pull-request@v5
        with:
          # # The default GITHUB_TOKEN doesn't allow other workflows to trigger.
          # # Thus if there are tests to be run, they won't be run. For more info,
          # # see the note under
          # # <https://github.com/peter-evans/create-pull-request#action-inputs>.
          # # One possible workaround is to specify a Personal Access Token (PAT).
          # # This PAT should have read-write permissions for "Pull Requests"
          # # and read-write permissions for "Contents".
          # token: ${{ secrets.GH_PAT_FOR_PR }}
          commit-message: Update lockfile
          title: Update Lockfile
          body: >
            This pull request relocks the dependencies with conda-lock.
            It is triggered by [update-lockfile](https://github.com/catalyst-cooperative/pudl/blob/main/.github/workflows/update-lockfile.yml).
          branch: update-lockfile
          labels: dependencies, conda-lock
          reviewers: zaneselvans
          delete-branch: true
          # base: dev
          draft: true
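
A note on two pieces of logic in the workflow above: the job-level `if:` guard and the `--filename-template` value. The sketch below restates both in Python for readers who want to check the behavior by hand. It is an illustration only, not code from this commit. The function names `should_run` and `render_filename` are hypothetical, and the template substitution is assumed to behave like Python's `str.format` for the `{platform}` placeholder.

```python
"""Sketch of two pieces of logic from the update-lockfile workflow (assumptions noted)."""

# Repository name copied from the workflow's `if:` line.
UPSTREAM_REPO = "catalyst-cooperative/pudl"


def should_run(event_name: str, repository: str) -> bool:
    """Mirror the job-level guard: scheduled runs only fire on the upstream repo.

    Any non-schedule trigger (for example workflow_dispatch) is always allowed.
    """
    scheduled_upstream = event_name == "schedule" and repository == UPSTREAM_REPO
    return scheduled_upstream or event_name != "schedule"


def render_filename(template: str, platform: str) -> str:
    """Fill the {platform} placeholder.

    Assumption: substitution works like str.format; this is not conda-lock's code.
    """
    return template.format(platform=platform)


if __name__ == "__main__":
    # Scheduled run on a hypothetical fork: the guard evaluates to False.
    print(should_run("schedule", "someone/pudl-fork"))
    # Manual run on the same fork: the guard evaluates to True.
    print(should_run("workflow_dispatch", "someone/pudl-fork"))
    # File name expected for the Read the Docs render step under the assumption above.
    print(render_filename("readthedocs-{platform}.conda.lock", "linux-64"))
```

Because the `schedule` trigger is commented out in this version of the workflow, only the second branch of the guard is reachable for now; the first branch matters once the cron entry is re-enabled.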
@@ -1,63 +1,54 @@
FROM condaforge/mambaforge:23.3.1-1
FROM mambaorg/micromamba:1.5.1

USER root

# Install curl and js
# awscli requires unzip, less, groff and mandoc
# hadolint ignore=DL3008
RUN apt-get update && apt-get install --no-install-recommends -y curl jq unzip less groff mandoc \
    && apt-get clean \
    && rm -rf /var/lib/apt/lists/*
RUN apt-get update && \
    apt-get install --no-install-recommends -y git curl jq unzip less groff mandoc && \
    apt-get clean && \
    rm -rf /var/lib/apt/lists/*

# Configure gsutil authentication
# hadolint ignore=DL3059
RUN printf '[GoogleCompute]\nservice_account = default' > /etc/boto.cfg

# Install awscli2
RUN curl "https://awscli.amazonaws.com/awscli-exe-linux-x86_64.zip" -o "awscliv2.zip" && unzip awscliv2.zip && ./aws/install

# Create a non-root user inside the container
# hadolint ignore=DL3059
RUN useradd -Ums /bin/bash catalyst

ENV CONTAINER_HOME=/home/catalyst
RUN curl "https://awscli.amazonaws.com/awscli-exe-linux-x86_64.zip" -o "awscliv2.zip" && \
    unzip awscliv2.zip && \
    ./aws/install

# Switch to being the catalyst user and go into the copied repo
USER catalyst
# Switch back to being non-root user and get into the home directory
USER $MAMBA_USER
ENV CONTAINER_HOME=/home/$MAMBA_USER
WORKDIR ${CONTAINER_HOME}

ENV CONDA_PREFIX=${CONTAINER_HOME}/env
ENV PUDL_REPO=${CONTAINER_HOME}/pudl
ENV CONDA_RUN="conda run --no-capture-output --prefix ${CONDA_PREFIX}"
ENV PYTHON_VERSION="3.11"
ENV CONDA_RUN="micromamba run --prefix ${CONDA_PREFIX}"

ENV CONTAINER_PUDL_WORKSPACE=${CONTAINER_HOME}/pudl_work
ENV PUDL_INPUT=${CONTAINER_PUDL_WORKSPACE}/data
ENV PUDL_INPUT=${CONTAINER_PUDL_WORKSPACE}/input
ENV PUDL_OUTPUT=${CONTAINER_PUDL_WORKSPACE}/output
ENV DAGSTER_HOME=${CONTAINER_PUDL_WORKSPACE}/dagster_home

# Create data input/output directories
RUN mkdir -p ${PUDL_INPUT} ${PUDL_OUTPUT} ${DAGSTER_HOME}

# Create a conda environment based on the specification in the repo
COPY test/test-environment.yml test/test-environment.yml
RUN mamba create --copy --prefix ${CONDA_PREFIX} --yes python=${PYTHON_VERSION} && \
    # Then we can use mamba env update, which can parse the environment.yml file:
    mamba env update --prefix ${CONDA_PREFIX} --file test/test-environment.yml && \
    conda clean -afy


COPY environments/conda-lock.yml environments/conda-lock.yml
RUN micromamba create --prefix ${CONDA_PREFIX} --yes --category main dev docs test datasette --file environments/conda-lock.yml && \
    micromamba clean -afy
# Copy the cloned pudl repository into the user's home directory
COPY --chown=catalyst:catalyst . ${CONTAINER_HOME}
COPY --chown=${MAMBA_USER}:${MAMBA_USER} . ${CONTAINER_HOME}

# TODO(rousik): The following is a workaround for sudden breakage where conda
# can't find libraries contained within the environment. It's unclear why!
ENV LD_LIBRARY_PATH=${CONDA_PREFIX}/lib
# We need information from .git to get version with setuptools_scm so we mount that
# directory without copying it into the image.
RUN --mount=type=bind,source=.git,target=${PUDL_REPO}/.git \
    ${CONDA_RUN} pip install --no-cache-dir -e './[dev,doc,test,datasette]' && \
    ${CONDA_RUN} pip install --no-cache-dir --editable . && \
    # Run the PUDL setup script so we know where to read and write data
    ${CONDA_RUN} pudl_setup


# Run the unit tests:
CMD ["conda", "run", "--no-capture-output", "--prefix", "${CONDA_PREFIX}", "pytest", "test/unit"]
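
The `micromamba create` line above passes `--category main dev docs test datasette` together with the multi-platform `conda-lock.yml`, which suggests that packages in the lockfile are selected by category for the current platform. A minimal sketch of that kind of filter follows. It is not micromamba's implementation: the record layout (`name`, `platform`, `category`) is an assumption about the lockfile, the package names are placeholders, and `select_packages` is a hypothetical helper.

```python
"""Sketch: selecting lockfile packages by platform and category (hypothetical data)."""

# Hypothetical, heavily trimmed stand-in for package entries in a multi-platform
# lockfile. Field names are an assumption and package names are placeholders.
PACKAGES = [
    {"name": "pkg-core", "platform": "linux-64", "category": "main"},
    {"name": "pkg-core", "platform": "osx-arm64", "category": "main"},
    {"name": "pkg-lint", "platform": "linux-64", "category": "dev"},
    {"name": "pkg-sphinx", "platform": "linux-64", "category": "docs"},
    {"name": "pkg-extra", "platform": "linux-64", "category": "other"},
]


def select_packages(packages, platform, categories):
    """Return sorted names of packages for one platform in any requested category."""
    wanted = set(categories)
    return sorted(
        pkg["name"]
        for pkg in packages
        if pkg["platform"] == platform and pkg["category"] in wanted
    )


if __name__ == "__main__":
    # Same category list as the Dockerfile's micromamba create command.
    categories = ["main", "dev", "docs", "test", "datasette"]
    print(select_packages(PACKAGES, "linux-64", categories))
```

Under this reading, leaving `main` out of the category list would drop the core dependencies, which is consistent with the commit message entry "Add main category to micromamba environment creation."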