Skip to content

Commit

Permalink
Merge branch 'main' of https://github.com/dask/dask-cloudprovider int…
Browse files Browse the repository at this point in the history
…o fly
  • Loading branch information
jacobtomlinson committed Sep 16, 2024
2 parents 88143c1 + 6ca172e commit 5ce0bad
Show file tree
Hide file tree
Showing 29 changed files with 2,673 additions and 36 deletions.
4 changes: 2 additions & 2 deletions .github/workflows/ci.yml
Original file line number Diff line number Diff line change
Expand Up @@ -9,7 +9,7 @@ jobs:
fail-fast: true
matrix:
os: ["ubuntu-latest"]
python-version: ["3.9", "3.10"]
python-version: ["3.10", "3.11", "3.12"]

steps:
- name: Checkout source
Expand Down Expand Up @@ -50,7 +50,7 @@ jobs:
uses: conda-incubator/setup-miniconda@v2
with:
miniconda-version: "latest"
python-version: "3.9"
python-version: "3.12"

- name: Run import tests
shell: bash -l {0}
Expand Down
6 changes: 3 additions & 3 deletions .github/workflows/release.yml
Original file line number Diff line number Diff line change
Expand Up @@ -10,13 +10,13 @@ jobs:
- name: Checkout source
uses: actions/checkout@v2

- name: Set up Python 3.9
- name: Set up Python 3.12
uses: actions/setup-python@v1
with:
python-version: 3.9
python-version: 3.12

- name: Install pypa/build
run: python -m pip install build wheel
run: python -m pip install build wheel setuptools

- name: Build distributions
shell: bash -l {0}
Expand Down
4 changes: 2 additions & 2 deletions .pre-commit-config.yaml
Original file line number Diff line number Diff line change
@@ -1,12 +1,12 @@
repos:
- repo: https://github.com/psf/black
rev: 22.3.0
rev: 23.10.1
hooks:
- id: black
language_version: python3
exclude: versioneer.py
- repo: https://github.com/pycqa/flake8
rev: 3.9.2
rev: 6.1.0
hooks:
- id: flake8
language_version: python3
2 changes: 1 addition & 1 deletion .readthedocs.yml
Original file line number Diff line number Diff line change
Expand Up @@ -19,4 +19,4 @@ submodules:
build:
os: ubuntu-22.04
tools:
python: "3"
python: "3.11"
8 changes: 5 additions & 3 deletions README.rst
Original file line number Diff line number Diff line change
Expand Up @@ -24,6 +24,8 @@ Dask Cloud Provider
:alt: Conda Forge


Native Cloud integration for Dask. This library intends to allow people to
create dask clusters on a given cloud provider with no set up other than having
credentials.
Native Cloud integration for Dask.

This library provides tools to enable Dask clusters to more natively integrate with the cloud.
It includes cluster managers to create dask clusters on a given cloud provider using native resources,
plugins to more closely integrate Dask components with the cloud platform they are running on and documentation to empower all folks running Dask on the cloud.
2 changes: 1 addition & 1 deletion ci/environment-3.9.yml → ci/environment-3.11.yml
Original file line number Diff line number Diff line change
Expand Up @@ -3,7 +3,7 @@ channels:
- defaults
- conda-forge
dependencies:
- python=3.9
- python=3.11
- nomkl
- pip
# Dask
Expand Down
38 changes: 38 additions & 0 deletions ci/environment-3.12.yml
Original file line number Diff line number Diff line change
@@ -0,0 +1,38 @@
name: dask-cloudprovider-test
channels:
- defaults
- conda-forge
dependencies:
- python=3.12
- nomkl
- pip
# Dask
- dask
# testing / CI
- flake8
- ipywidgets
- pytest
- pytest-asyncio
- black >=20.8b1
- pyyaml
# dask dependencies
- cloudpickle
- toolz
- cytoolz
- numpy
- partd
# distributed dependencies
- click >=6.6
- msgpack-python
- psutil >=5.0
- six
- sortedcontainers !=2.0.0,!=2.0.1
- tblib
- tornado >=5
- zict >=0.1.3
# `event_loop_policy` change See https://github.com/dask/distributed/pull/4212
- pytest-asyncio >=0.14.0
- pytest-timeout
- pip:
- git+https://github.com/dask/dask.git@main
- git+https://github.com/dask/distributed@main
6 changes: 4 additions & 2 deletions ci/scripts/test_imports.sh
Original file line number Diff line number Diff line change
Expand Up @@ -3,9 +3,9 @@ set -o errexit


test_import () {
echo "Create environment: python=3.9 $1"
echo "Create environment: python=3.12 $1"
# Create an empty environment
conda create -q -y -n test-imports -c conda-forge python=3.9
conda create -q -y -n test-imports -c conda-forge python=3.12
conda activate test-imports
pip install -e .[$1]
echo "python -c '$2'"
Expand All @@ -20,3 +20,5 @@ test_import "azure" "import dask_cloudprovider.azure"
test_import "digitalocean" "import dask_cloudprovider.digitalocean"
test_import "gcp" "import dask_cloudprovider.gcp"
test_import "fly" "import dask_cloudprovider.fly"
test_import "ibm" "import dask_cloudprovider.ibm"
test_import "openstack" "import dask_cloudprovider.openstack"
16 changes: 12 additions & 4 deletions dask_cloudprovider/aws/ecs.py
Original file line number Diff line number Diff line change
Expand Up @@ -371,7 +371,7 @@ class Scheduler(Task):
Any extra command line arguments to pass to dask-scheduler, e.g. ``["--tls-cert", "/path/to/cert.pem"]``
Defaults to `None`, no extra command line arguments.
kwargs: Dict()
kwargs:
Other kwargs to be passed to :class:`Task`.
See :class:`Task` for parameter info.
Expand Down Expand Up @@ -413,7 +413,7 @@ class Worker(Task):
scheduler: str
The address of the scheduler
kwargs: Dict()
kwargs:
Other kwargs to be passed to :class:`Task`.
"""

Expand Down Expand Up @@ -484,6 +484,10 @@ class ECSCluster(SpecCluster, ConfigMixin):
The docker image to use for the scheduler and worker tasks.
Defaults to ``daskdev/dask:latest`` or ``rapidsai/rapidsai:latest`` if ``worker_gpu`` is set.
cpu_architecture: str (optional)
Runtime platform CPU architecture
Defaults to ``X86_64``.
scheduler_cpu: int (optional)
The amount of CPU to request for the scheduler in milli-cpu (1/1024).
Expand Down Expand Up @@ -678,7 +682,7 @@ class ECSCluster(SpecCluster, ConfigMixin):
mounted in worker tasks. This setting controls whether volumes are also mounted in the scheduler task.
Default ``False``.
**kwargs: dict
**kwargs:
Additional keyword arguments to pass to ``SpecCluster``.
Examples
Expand Down Expand Up @@ -712,6 +716,7 @@ def __init__(
fargate_workers=None,
fargate_spot=None,
image=None,
cpu_architecture="X86_64",
scheduler_cpu=None,
scheduler_mem=None,
scheduler_port=8786,
Expand Down Expand Up @@ -758,6 +763,7 @@ def __init__(
self._fargate_workers = fargate_workers
self._fargate_spot = fargate_spot
self.image = image
self._cpu_architecture = cpu_architecture.upper()
self._scheduler_cpu = scheduler_cpu
self._scheduler_mem = scheduler_mem
self._scheduler_port = scheduler_port
Expand Down Expand Up @@ -1223,6 +1229,7 @@ async def _create_scheduler_task_definition_arn(self):
if self._volumes and self._mount_volumes_on_scheduler
else [],
requiresCompatibilities=["FARGATE"] if self._fargate_scheduler else [],
runtimePlatform={"cpuArchitecture": self._cpu_architecture},
cpu=str(self._scheduler_cpu),
memory=str(self._scheduler_mem),
tags=dict_to_aws(self.tags),
Expand Down Expand Up @@ -1297,6 +1304,7 @@ async def _create_worker_task_definition_arn(self):
],
volumes=self._volumes if self._volumes else [],
requiresCompatibilities=["FARGATE"] if self._fargate_workers else [],
runtimePlatform={"cpuArchitecture": self._cpu_architecture},
cpu=str(self._worker_cpu),
memory=str(self._worker_mem),
tags=dict_to_aws(self.tags),
Expand Down Expand Up @@ -1388,7 +1396,7 @@ class FargateCluster(ECSCluster):
Parameters
----------
kwargs: dict
kwargs:
Keyword arguments to be passed to :class:`ECSCluster`.
Examples
Expand Down
2 changes: 0 additions & 2 deletions dask_cloudprovider/azure/tests/test_azurevm.py
Original file line number Diff line number Diff line change
Expand Up @@ -66,7 +66,6 @@ def inc(x):
@skip_without_credentials
@pytest.mark.external
async def test_create_cluster_sync():

with AzureVMCluster() as cluster:
with Client(cluster) as client:
cluster.scale(1)
Expand All @@ -84,7 +83,6 @@ def inc(x):
@skip_without_credentials
@pytest.mark.external
async def test_create_rapids_cluster_sync():

with AzureVMCluster(
vm_size="Standard_NC12s_v3",
docker_image="rapidsai/rapidsai:cuda11.0-runtime-ubuntu18.04-py3.9",
Expand Down
30 changes: 29 additions & 1 deletion dask_cloudprovider/cloudprovider.yaml
Original file line number Diff line number Diff line change
Expand Up @@ -15,6 +15,7 @@ cloudprovider:
scheduler_timeout: "5 minutes" # Length of inactivity to wait before closing the cluster

image: "daskdev/dask:latest" # Docker image to use for non GPU tasks
cpu_architecture: "X86_64" # Runtime platform CPU architecture
gpu_image: "rapidsai/rapidsai:latest" # Docker image to use for GPU tasks
cluster_name_template: "dask-{uuid}" # Template to use when creating a cluster
cluster_arn: "" # ARN of existing ECS cluster to use (if not set one will be created)
Expand Down Expand Up @@ -126,4 +127,31 @@ cloudprovider:
memory_mb: 1024 # Memory in MB to use for the scheduler and workers
cpus: 1 # Number of CPUs to use for the scheduler and workers
app_name: null # Name of Fly app to use. If it is blank, a random name will be generated.
org_slug: "personal" # Organization slug to use. If it is blank, the personal organization will be used.
org_slug: "personal" # Organization slug to use. If it is blank, the personal organization will be used.

ibm:
api_key: null
image: "ghcr.io/dask/dask:latest"
region: us-east
project_id: null
scheduler_cpu: "1.0"
scheduler_mem: 4G
scheduler_timeout: 600 # seconds
worker_cpu: "2.0"
worker_mem: 8G
worker_threads: 1

openstack:
region: "RegionOne" # The name of the region where resources will be allocated in OpenStack. List available regions using: `openstack region list`.
size: null # Openstack flavors define the compute, memory, and storage capacity of computing instances. List available flavors using: `openstack flavor list`
auth_url: null # The authentication URL for the OpenStack Identity service (Keystone). Example: https://cloud.example.com:5000
application_credential_id: null # The application credential id created in OpenStack. Create application credentials using: openstack application credential create
application_credential_secret: null # The secret associated with the application credential ID for authentication.
auth_type: "v3applicationcredential" # The type of authentication used, typically "v3applicationcredential" for using OpenStack application credentials.
network_id: null # The unique identifier for the internal/private network in OpenStack where the cluster VMs will be connected. List available networks using: `openstack network list`
image: null # The OS image name or id to use for the VM. List available images using: `openstack image list`
keypair_name: null # The name of the SSH keypair used for instance access. Ensure you have created a keypair or use an existing one. List available keypairs using: `openstack keypair list`
security_group: null # The security group name that defines firewall rules for instances. List available security groups using: `openstack security group list`
external_network_id: null # The ID of the external network used for assigning floating IPs. List available external networks using: `openstack network list --external`
create_floating_ip: false # Specifies whether to assign a floating IP to each instance, enabling external access. Set to `True` if external connectivity is needed.
docker_image: "daskdev/dask:latest" # docker image to use
30 changes: 22 additions & 8 deletions dask_cloudprovider/gcp/instances.py
Original file line number Diff line number Diff line change
Expand Up @@ -4,6 +4,7 @@
import json

import sqlite3
from typing import Optional, Any, Dict

import dask
from dask.utils import tmpfile
Expand Down Expand Up @@ -106,11 +107,9 @@ def __init__(
self.instance_labels = _instance_labels

self.general_zone = "-".join(self.zone.split("-")[:2]) # us-east1-c -> us-east1

self.service_account = service_account or self.config.get("service_account")

def create_gcp_config(self):

subnetwork = f"projects/{self.network_projectid}/regions/{self.general_zone}/subnetworks/{self.network}"
config = {
"name": self.name,
Expand Down Expand Up @@ -205,7 +204,6 @@ def create_gcp_config(self):
return config

async def create_vm(self):

self.cloud_init = self.cluster.render_process_cloud_init(self)

self.gcp_config = self.create_gcp_config()
Expand Down Expand Up @@ -496,6 +494,8 @@ class GCPCluster(VMCluster):
service_account: str
Service account that all VMs will run under.
Defaults to the default Compute Engine service account for your GCP project.
service_account_credentials: Optional[Dict[str, Any]]
Service account credentials to create the compute engine Vms
Examples
--------
Expand Down Expand Up @@ -589,10 +589,10 @@ def __init__(
debug=False,
instance_labels=None,
service_account=None,
service_account_credentials: Optional[Dict[str, Any]] = None,
**kwargs,
):

self.compute = GCPCompute()
self.compute = GCPCompute(service_account_credentials)

self.config = dask.config.get("cloudprovider.gcp", {})
self.auto_shutdown = (
Expand Down Expand Up @@ -644,20 +644,34 @@ def __init__(


class GCPCompute:
"""Wrapper for the ``googleapiclient`` compute object."""
"""
Wrapper for the ``googleapiclient`` compute object.
def __init__(self):
Attributes
----------
service_account_credentials: Optional[dict]
Service account credentials to create the compute engine Vms
"""

def __init__(self, service_account_credentials: Optional[dict[str, Any]] = None):
self.service_account_credentials = service_account_credentials or {}
self._compute = self.refresh_client()

def refresh_client(self):

if os.environ.get("GOOGLE_APPLICATION_CREDENTIALS", False):
import google.oauth2.service_account # google-auth

creds = google.oauth2.service_account.Credentials.from_service_account_file(
os.environ["GOOGLE_APPLICATION_CREDENTIALS"],
scopes=["https://www.googleapis.com/auth/cloud-platform"],
)
elif self.service_account_credentials:
import google.oauth2.service_account # google-auth

creds = google.oauth2.service_account.Credentials.from_service_account_info(
self.service_account_credentials,
scopes=["https://www.googleapis.com/auth/cloud-platform"],
)
else:
import google.auth.credentials # google-auth

Expand Down
2 changes: 0 additions & 2 deletions dask_cloudprovider/gcp/tests/test_gcp.py
Original file line number Diff line number Diff line change
Expand Up @@ -76,7 +76,6 @@ async def test_create_cluster():
async with GCPCluster(
asynchronous=True, env_vars={"FOO": "bar"}, security=True
) as cluster:

assert cluster.status == Status.running

cluster.scale(2)
Expand Down Expand Up @@ -132,7 +131,6 @@ async def test_create_rapids_cluster():
auto_shutdown=True,
bootstrap=False,
) as cluster:

assert cluster.status == Status.running

cluster.scale(1)
Expand Down
1 change: 1 addition & 0 deletions dask_cloudprovider/ibm/__init__.py
Original file line number Diff line number Diff line change
@@ -0,0 +1 @@
from .code_engine import IBMCodeEngineCluster
Loading

0 comments on commit 5ce0bad

Please sign in to comment.