Refactor fv3 runtime modules and image construction (#185)
As part of the effort to adjust the runfile for the one_step jobs, I realized that I could share the following functionality:
1. the docker image built by the prognostic run workflow
1. the `online_modules` directory used there.

After talking to @frodre and @oliverwm1, I decided to pull these routines into the top-level fv3net workspace. This involved the following changes:

- Build and version docker images at the top level
  - Start versioning fv3net for the purposes of building docker images
  - Move docker image construction for the `prognostic_run` and `fv3net` images to the `/docker` folder.
  - Delete the old contents of `/docker`, which seem obsolete (last touched Oct 2019).
  - Add rules `make push_images` and `make build_images`
  - Update the README.md
  - Update the YAMLs in the workflow directories
- Move `workflows/prognostic_c48_run/online_modules` into `fv3net/runtime`. This module will now be available to other workflows.
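
To illustrate the kind of helper that moves into `fv3net/runtime`, here is a minimal standalone sketch of the config accessor. It is not the module's actual code: only the `__delattr__` line of `dotdict` is visible in this diff, so the `__getattr__` and `__setattr__` lines are assumptions, and the function takes an already-parsed mapping instead of reading `fv3config.yml` with `yaml.safe_load`, so the snippet runs with the standard library alone. The example config and its bucket URL are hypothetical.

```python
class dotdict(dict):
    """Dictionary whose keys can also be read, set and deleted as attributes."""

    # Assumed: only the __delattr__ line appears in this commit's diff.
    __getattr__ = dict.get
    __setattr__ = dict.__setitem__
    __delattr__ = dict.__delitem__


def get_runfile_config(config):
    """Return the ``scikit_learn`` section of a parsed config mapping.

    Simplification: the function in the diff opens ``fv3config.yml`` itself;
    here the parsed mapping is passed in directly.
    """
    return dotdict(config["scikit_learn"])


# Hypothetical config. The key names follow the workflow's fv3config.yml,
# but nesting both of them under ``scikit_learn`` is an assumption.
example = {
    "scikit_learn": {
        "model": "gs://example-bucket/sklearn_model.pkl",  # placeholder URL
        "zarr_output": "diags.zarr",
    }
}

args = get_runfile_config(example)
print(args.model)
print(args.zarr_output)
```

Because `__getattr__` is mapped to `dict.get` in this sketch, reading a key that is absent returns `None` rather than raising `AttributeError`.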

Other changes:

* Refactor online_modules to fv3net
* Create us.gcr.io/vcm-ml/prognostic_run:v0.1.0
* Refactor us.gcr.io/vcm-ml/fv3net image build code
* Add __version__ to fv3net init
* update prognostic_run_diags configuration
* update readme
* pin pandas version to 1.0.1 (this is incompatible with xarray 0.15)
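
The image naming scheme introduced here can be sketched in Python as follows. This is only an illustration of what the Makefile's `build_image_%` pattern rule and push rule expand to for the two images in this commit; the function names are hypothetical and nothing below is part of fv3net.

```python
REGISTRY = "us.gcr.io/vcm-ml"          # registry prefix used in the Makefile
IMAGES = ("fv3net", "prognostic_run")  # the two images named in this commit


def image_tag(image, version):
    """Full image reference, e.g. us.gcr.io/vcm-ml/fv3net:v0.1.0."""
    return f"{REGISTRY}/{image}:{version}"


def build_command(image, version):
    """Command the `build_image_%` pattern rule runs for one image.

    The pattern stem (`$*` in Make) is the image name; it selects both the
    Dockerfile under docker/<image>/ and the name of the resulting tag.
    """
    return [
        "docker", "build",
        "-f", f"docker/{image}/Dockerfile",
        ".",
        "-t", image_tag(image, version),
    ]


def push_command(image, version):
    """Command the push rule runs for one image."""
    return ["docker", "push", image_tag(image, version)]


if __name__ == "__main__":
    for image in IMAGES:
        print(" ".join(build_command(image, "v0.1.0")))
        print(" ".join(push_command(image, "v0.1.0")))
```

Note the `v` prefix: the Makefile sets `VERSION = v0.1.0`, while `fv3net.__version__` and the wheel file names use `0.1.0`.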
nbren12 authored Mar 21, 2020
1 parent 48d14e2 commit 598cb26
Showing 22 changed files with 88 additions and 143 deletions.
11 changes: 11 additions & 0 deletions HISTORY.rst
@@ -0,0 +1,11 @@
=======
History
=======

Current
-------

0.1.0 (2020-03-20)
------------------

* First release of fv3net
22 changes: 17 additions & 5 deletions Makefile
@@ -3,6 +3,7 @@
#################################################################################
# GLOBALS #
#################################################################################
VERSION = v0.1.0
ENVIRONMENT_SCRIPTS = .environment-scripts
PROJECT_DIR := $(shell dirname $(realpath $(lastword $(MAKEFILE_LIST))))
BUCKET = [OPTIONAL] your-bucket-for-syncing-data (do not include 's3://')
@@ -22,8 +23,22 @@ endif
#################################################################################
# COMMANDS #
#################################################################################
build_image:
docker build . -t $(IMAGE) -t $(GCR_IMAGE)
.PHONY: wheels build_images push_image
wheels:
pip wheel --no-deps .
pip wheel --no-deps external/vcm

# pattern rule for building docker images
build_image_%:
docker build -f docker/$*/Dockerfile . -t us.gcr.io/vcm-ml/$*:$(VERSION)

build_image_prognostic_run: wheels

build_images: build_image_fv3net build_image_prognostic_run

push_image:
docker push us.gcr.io/vcm-ml/fv3net:$(VERSION)
docker push us.gcr.io/vcm-ml/prognostic_run:$(VERSION)

enter: build_image
docker run -it -v $(shell pwd):/code \
@@ -33,9 +48,6 @@ enter: build_image
# -e GOOGLE_APPLICATION_CREDENTIALS=/google_creds.json \
# -v $(HOME)/.config/gcloud/application_default_credentials.json:/google_creds.json \
push_image: build_image
docker push $(GCR_IMAGE)


## Make Dataset
.PHONY: data update_submodules create_environment overwrite_baseline_images
13 changes: 13 additions & 0 deletions README.md
@@ -72,6 +72,18 @@ The main data processing pipelines for this project currently utilize Google Clo
Dataflow and Kubernetes with Docker images. Run scripts to deploy these workflows
along with information can be found under the `workflows` directory.

## Building the fv3net docker images

The workflows use a pair of common images:

|Image| Description|
|-----|------------|
| `us.gcr.io/vcm-ml/prognostic_run` | fv3gfs-python with minimal fv3net and vcm installed |
| `us.gcr.io/vcm-ml/fv3net` | fv3net image with all dependencies including plotting |

These images can be built and pushed to GCR using `make build_images` and
`make push_images`, respectively.

## Dataflow

Dataflow jobs run in a "serverless" style where data is piped between workers who
@@ -117,6 +129,7 @@ If you get an error `Could not create workflow; user does not have write access
trying to submit the dataflow job, do `gcloud auth application-default login` first and then retry.



## Deploying on k8s with fv3net

Docker images with the python-wrapped model and fv3run are available from the
30 changes: 0 additions & 30 deletions docker/Dockerfile.kubernetes

This file was deleted.

22 changes: 0 additions & 22 deletions docker/download_inputdata.sh

This file was deleted.

10 changes: 3 additions & 7 deletions Dockerfile → docker/fv3net/Dockerfile
@@ -9,7 +9,6 @@ ENV PROJECT_NAME=fv3net
USER root
RUN apt-get update && apt-get install -y gfortran
ADD environment.yml $FV3NET/
ADD Makefile $FV3NET/
ADD .environment-scripts $ENVIRONMENT_SCRIPTS
RUN fix-permissions $FV3NET
WORKDIR $FV3NET
@@ -20,21 +19,18 @@ ENV PATH=/opt/conda/envs/fv3net/bin:$PATH
RUN bash $ENVIRONMENT_SCRIPTS/build_environment.sh $PROJECT_NAME
RUN jupyter labextension install @pyviz/jupyterlab_pyviz

# Add rest of fv3net directory
USER root
ADD . $FV3NET
# install gcloud sdk
RUN cd / && \
curl https://dl.google.com/dl/cloudsdk/channels/rapid/downloads/google-cloud-sdk-284.0.0-linux-x86_64.tar.gz |\
tar xz
ENV PATH=/google-cloud-sdk/bin:${PATH}
#RUN /google-cloud-sdk/bin/gcloud init

# Add rest of fv3net directory
ADD . $FV3NET

RUN fix-permissions $FV3NET
USER $NB_UID

# RUN gcloud init

# setup the local python packages

RUN bash $ENVIRONMENT_SCRIPTS/install_local_packages.sh $PROJECT_NAME
18 changes: 0 additions & 18 deletions docker/install_gcloud.sh

This file was deleted.

8 changes: 8 additions & 0 deletions docker/prognostic_run/Dockerfile
Original file line number Diff line number Diff line change
@@ -0,0 +1,8 @@
FROM us.gcr.io/vcm-ml/fv3gfs-python:v0.2.1


COPY docker/prognostic_run/requirements.txt /tmp/requirements.txt
RUN pip3 install -r /tmp/requirements.txt
COPY fv3net-0.1.0-py3-none-any.whl /wheels/fv3net-0.1.0-py3-none-any.whl
COPY vcm-0.1.0-py3-none-any.whl /wheels/vcm-0.1.0-py3-none-any.whl
RUN pip3 install --no-deps /wheels/fv3net-0.1.0-py3-none-any.whl && pip3 install /wheels/vcm-0.1.0-py3-none-any.whl
5 changes: 5 additions & 0 deletions docker/prognostic_run/requirements.txt
Original file line number Diff line number Diff line change
@@ -0,0 +1,5 @@
scikit-learn==0.22.1
dask
joblib
zarr
scikit-image
1 change: 1 addition & 0 deletions environment.yml
@@ -18,6 +18,7 @@ dependencies:
- h5netcdf
- h5py>=2.10
- hypothesis
- pandas=1.0.1
- intake
- intake-xarray
- metpy
2 changes: 1 addition & 1 deletion external/fv3config
2 changes: 2 additions & 0 deletions fv3net/__init__.py
@@ -2,3 +2,5 @@

TOP_LEVEL_DIR = pathlib.Path(__file__).parent.parent.absolute()
COARSENED_DIAGS_ZARR_NAME = "gfsphysics_15min_coarse.zarr"

__version__ = "0.1.0"
3 changes: 3 additions & 0 deletions fv3net/runtime/__init__.py
@@ -0,0 +1,3 @@
from . import sklearn_interface as sklearn
from .state_io import init_writers, append_to_writers, CF_TO_RESTART_MAP
from .config import get_runfile_config, get_namelist
Original file line number Diff line number Diff line change
@@ -10,7 +10,7 @@ class dotdict(dict):
__delattr__ = dict.__delitem__


def get_config():
def get_runfile_config():
with open("fv3config.yml") as f:
config = yaml.safe_load(f)
return dotdict(config["scikit_learn"])
Original file line number Diff line number Diff line change
@@ -3,10 +3,12 @@
from sklearn.externals import joblib
from sklearn.utils import parallel_backend

import state_io
from . import state_io

__all__ = ["open_model", "predict", "update"]

def open_sklearn_model(url):

def open_model(url):
# Load the model
with fsspec.open(url, "rb") as f:
return joblib.load(f)
@@ -30,17 +32,3 @@ def update(model, state, dt):
)

return state_io.rename_to_orig(updated), state_io.rename_to_orig(tend)


if __name__ == "__main__":
import sys

state_path = sys.argv[1]
model = open_sklearn_model(sys.argv[2])

with open(state_path, "rb") as f:
data = state_io.load(f)

tile = data[0]
preds = update(model, tile, dt=1)
print(preds)
File renamed without changes.
2 changes: 1 addition & 1 deletion workflows/end_to_end/full-workflow-config.yaml
@@ -74,7 +74,7 @@ experiment:
restart_file_dir:
from: one_step_run
ic_timestep: "20160801.001500"
docker_image: us.gcr.io/vcm-ml/prognostic-run-orchestration
docker_image: us.gcr.io/vcm-ml/prognostic_run:v0.1.0
--model_url:
from: train_sklearn_model
--prog_config_yml: workflows/prognostic_c48_run/prognostic_config.yml
10 changes: 0 additions & 10 deletions workflows/prognostic_c48_run/Dockerfile

This file was deleted.

25 changes: 7 additions & 18 deletions workflows/prognostic_c48_run/Makefile
@@ -1,4 +1,4 @@
IMAGE=test-image
IMAGE = us.gcr.io/vcm-ml/prognostic_run:v0.1.0
KEY_ARGS= -v $(GOOGLE_APPLICATION_CREDENTIALS):/key.json \
-e GOOGLE_APPLICATION_CREDENTIALS=/key.json
LOCAL_DIR_ARGS = -w /code -v $(shell pwd):/code
@@ -7,26 +7,10 @@ RUN_ARGS = --rm $(KEY_ARGS) $(LOCAL_DIR_ARGS) $(IMAGE)
RUN_INTERACTIVE = docker run -ti $(RUN_ARGS)
RUN ?= docker run $(RUN_ARGS)
SKLEARN_MODEL = gs://vcm-ml-data/test-annak/ml-pipeline-output/2020-01-17_rf_40d_run.pkl
FV3CONFIG = fv3config.yml
FV3NET_VERSION ?=2020-01-23-prognostic-rf
FV3CONFIG = gs://vcm-ml-data/end-to-end-experiments/2020-02-26-physics-off/annak-prognostic-physics-off-1773255e/prognostic_run_prognostic_yaml_adjust_prognostic_config.yml_ic_timestep_20160801.001500_docker_image_prognostic-run-orchestration/job_config/fv3config.yml

all: sklearn_run

fv3net-0.1.0-py3-none-any.whl:
pip wheel --no-deps git+ssh://git@github.com/VulcanClimateModeling/fv3net.git@$(FV3NET_VERSION)

build: fv3net-0.1.0-py3-none-any.whl
docker build . -t $(IMAGE)

fv3net-local:
pip wheel --no-deps ../../.

vcm-local:
pip wheel --no-deps ../../external/vcm

build_local: fv3net-local vcm-local
docker build . -t $(IMAGE)

dev:
$(RUN_INTERACTIVE) bash

@@ -36,9 +20,14 @@ test_run_sklearn: state.pkl
state.pkl:
fv3run --dockerimage test-image --runfile save_state_runfile.py $(FV3CONFIG) save_state/
cp save_state/state.pkl .

sklearn_run_local: #rundir
fv3run --dockerimage $(IMAGE) --runfile sklearn_runfile.py $(FV3CONFIG) rundir

sklearn_run: #rundir
fv3run --dockerimage us.gcr.io/vcm-ml/prognostic-run-orchestration --runfile sklearn_runfile.py $(FV3CONFIG) ../../scratch/rundir

clean:
rm -rf net_precip net_heating/ PW

.PHONY: fv3net vcm build dev sklearn_run
3 changes: 1 addition & 2 deletions workflows/prognostic_c48_run/fv3config.yml
@@ -1,5 +1,4 @@
scikit_learn:
model: gs://vcm-ml-data/test-annak/ml-pipeline-output/2020-01-17_rf_40d_run.pkl
scikit_learn: model:gs://vcm-ml-data/end-to-end-experiments/2020-02-26-physics-off/annak-prognostic-physics-off/train_sklearn_model_train-config-file_example_base_rf_training_config.yml_delete-local-results-after-upload_False/sklearn_model.pkl
zarr_output: diags.zarr
data_table: default
diag_table: gs://vcm-ml-data/2020-01-15-noahb-exploration/2hr_strong_dampingone_step_config/C48/20160805.000000/diag_table