diff --git a/.vsts/pipeline.yml b/.vsts/pipeline.yml
index 2d3e231a..159ceb19 100644
--- a/.vsts/pipeline.yml
+++ b/.vsts/pipeline.yml
@@ -246,18 +246,33 @@ jobs:
       set -o pipefail
       docker version
       docker login "$(docker.servername)" -u="$(docker.username)" -p="$(docker.password)"
+      export DOCKER_CLI_EXPERIMENTAL=enabled
       singularity_version=$(grep -m1 _SINGULARITY_VERSION convoy/misc.py | cut -d "'" -f 2)
-      echo "Replicating Singularity verison $singularity_version images to MCR"
-      dhImage="alfpark/singularity:${singularity_version}-mnt"
-      docker pull "$dhImage"
-      mcrImage="$(docker.servername)/public/azure-batch/shipyard:${singularity_version}-singularity-mnt"
-      docker tag "$dhImage" "$mcrImage"
-      docker push "$mcrImage"
-      dhImage="alfpark/singularity:${singularity_version}-mnt-resource"
-      docker pull "$dhImage"
-      mcrImage="$(docker.servername)/public/azure-batch/shipyard:${singularity_version}-singularity-mnt-resource"
-      docker tag "$dhImage" "$mcrImage"
-      docker push "$mcrImage"
+      echo "Replicating Singularity version $singularity_version images to MCR"
+      chkImage=mcr.microsoft.com/azure-batch/shipyard:${singularity_version}-singularity-mnt
+      set +e
+      if docker manifest inspect "$chkImage"; then
+        echo "$chkImage exists, skipping replication"
+      else
+        set -e
+        dhImage="alfpark/singularity:${singularity_version}-mnt"
+        mcrImage="$(docker.servername)/public/azure-batch/shipyard:${singularity_version}-singularity-mnt"
+        docker pull "$dhImage"
+        docker tag "$dhImage" "$mcrImage"
+        docker push "$mcrImage"
+      fi
+      chkImage=mcr.microsoft.com/azure-batch/shipyard:${singularity_version}-singularity-mnt-resource
+      set +e
+      if docker manifest inspect "$chkImage"; then
+        echo "$chkImage exists, skipping replication"
+      else
+        set -e
+        dhImage="alfpark/singularity:${singularity_version}-mnt-resource"
+        mcrImage="$(docker.servername)/public/azure-batch/shipyard:${singularity_version}-singularity-mnt-resource"
+        docker pull "$dhImage"
+        docker tag "$dhImage" "$mcrImage"
+        docker push "$mcrImage"
+      fi
     displayName: Replicate Singularity Container Images
     condition: and(succeeded(), ne(variables['ARTIFACT_CLI'], ''))
 - template: ./pyenv.yml
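The gating above relies on `docker manifest inspect`, which is still an experimental CLI feature in Docker 19.03 and therefore needs `DOCKER_CLI_EXPERIMENTAL=enabled` (hence the new export). A minimal standalone sketch of the same probe-then-replicate pattern, with placeholder registry and image names rather than the actual pipeline values:

```bash
#!/usr/bin/env bash
# Sketch: replicate an image to a target registry only if the target tag
# does not already exist. All names below are illustrative placeholders.
set -o pipefail

# docker manifest is experimental in Docker 19.03
export DOCKER_CLI_EXPERIMENTAL=enabled

src_image="example/source:1.0"
dst_image="registry.example.com/public/target:1.0"

# docker manifest inspect exits non-zero when the tag is absent, so
# relax exit-on-error around the probe only
set +e
if docker manifest inspect "$dst_image" > /dev/null 2>&1; then
    echo "$dst_image exists, skipping replication"
else
    set -e
    docker pull "$src_image"
    docker tag "$src_image" "$dst_image"
    docker push "$dst_image"
fi
```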
diff --git a/CHANGELOG.md b/CHANGELOG.md
index 70008a87..04b9d608 100644
--- a/CHANGELOG.md
+++ b/CHANGELOG.md
@@ -2,6 +2,66 @@
 
 ## [Unreleased]
 
+## [3.8.0] - 2019-08-13
+### Added
+- Revamped Singularity support, including support for Singularity 3,
+SIF images, and pull support from ACR registries for SIF images via ORAS.
+Please see the global and jobs configuration docs for more information.
+([#146](https://github.com/Azure/batch-shipyard/issues/146))
+- New MPI interface in jobs configuration for seamless multi-instance task
+executions with automatic configuration for SR-IOV RDMA VM sizes with support
+for popular MPI runtimes including OpenMPI, MPICH, Intel MPI, and MVAPICH
+([#287](https://github.com/Azure/batch-shipyard/issues/287))
+- Support for Hb/Hc SR-IOV RDMA VM sizes
+([#277](https://github.com/Azure/batch-shipyard/issues/277))
+- Support for NC/NV/H Promo VM sizes
+- Support for user-specified job preparation and release tasks on the host
+([#202](https://github.com/Azure/batch-shipyard/issues/202))
+- Support for conditional output data
+([#230](https://github.com/Azure/batch-shipyard/issues/230))
+- Support for bring your own public IP addresses on Batch pools.
+Please see the pool configuration doc and the
+[Virtual Networks and Public IPs guide](docs/64-batch-shipyard-byovnet.md)
+for more information.
+- Support for Shared Image Gallery for custom images
+- Support for CentOS HPC 7.6 native conversion
+- Additional Slurm configuration options
+- New recipes: mpiBench across various configurations,
+OpenFOAM-Infiniband-OpenMPI, OSUMicroBenchmarks-Infiniband-MVAPICH
+
+### Changed
+- **Breaking Change:** the `singularity_images` property in the global
+configuration has been modified to accommodate Singularity 3 support.
+Please see the global configuration doc for more information.
+([#146](https://github.com/Azure/batch-shipyard/issues/146))
+- **Breaking Change:** the `gpu` property in the jobs configuration has
+been changed to `gpus` to accommodate the new native GPU execution
+support in Docker 19.03. Please see the jobs configuration doc for
+more information.
+([#293](https://github.com/Azure/batch-shipyard/issues/293))
+- `pool images` commands now support Singularity
+- Non-native task execution is now proxied via script
+([#235](https://github.com/Azure/batch-shipyard/issues/235))
+- Batch Shipyard images have been migrated to the Microsoft Container Registry
+([#278](https://github.com/Azure/batch-shipyard/issues/278))
+- Updated Docker CE to 19.03.1
+- Updated blobxfer to 1.9.0
+- Updated LIS to 4.3.3
+- Updated NC/ND driver to 418.67, NV driver to 430.30
+- Updated Batch Insights to 1.3.0
+- Updated dependencies to latest, where applicable
+- Updated Python to 3.7.4 for pre-built binaries
+- Updated Docker images to use Alpine 3.10
+- Various recipe updates to showcase the new MPI schema, HPLinpack and HPCG
+updates to SR-IOV RDMA VM sizes
+
+### Fixed
+- Cargo Batch service client update missed
+([#274](https://github.com/Azure/batch-shipyard/issues/274),
+[#296](https://github.com/Azure/batch-shipyard/issues/296))
+- Premium File Shares were not enumerating correctly with AAD
+([#294](https://github.com/Azure/batch-shipyard/issues/294))
+- Per-job autoscratch setup failing for more than 2 nodes
+
 ### Removed
 - Python 3.4 support
 
@@ -1532,7 +1592,8 @@ transfer is disabled
 #### Added
 - Initial release
 
-[Unreleased]: https://github.com/Azure/batch-shipyard/compare/3.7.1...HEAD
+[Unreleased]: https://github.com/Azure/batch-shipyard/compare/3.8.0...HEAD
+[3.8.0]: https://github.com/Azure/batch-shipyard/compare/3.7.1...3.8.0
 [3.7.1]: https://github.com/Azure/batch-shipyard/compare/3.7.0...3.7.1
 [3.7.0]: https://github.com/Azure/batch-shipyard/compare/3.6.1...3.7.0
 [3.6.1]: https://github.com/Azure/batch-shipyard/compare/3.6.0...3.6.1
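The second breaking change tracks Docker 19.03's native GPU support: the engine now accepts a `--gpus` flag directly instead of requiring the separate nvidia-docker2 runtime. For orientation, the raw Docker CLI behavior the renamed `gpus` property maps onto looks like the following sketch (image tag is illustrative; this is not necessarily the exact invocation Batch Shipyard generates):

```bash
# Docker 19.03+ native GPU execution; requires the NVIDIA container
# toolkit on the host, but no nvidia-docker2 runtime registration.
docker run --rm --gpus all nvidia/cuda:10.1-base nvidia-smi

# limit the container to two GPUs
docker run --rm --gpus 2 nvidia/cuda:10.1-base nvidia-smi
```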
diff --git a/README.md b/README.md
index a03f26c2..8c612a78 100644
--- a/README.md
+++ b/README.md
@@ -1,8 +1,6 @@
 [![Build Status](https://azurebatch.visualstudio.com/batch-shipyard/_apis/build/status/batch-shipyard-CI)](https://azurebatch.visualstudio.com/batch-shipyard/_build/latest?definitionId=11)
 [![Build Status](https://travis-ci.org/Azure/batch-shipyard.svg?branch=master)](https://travis-ci.org/Azure/batch-shipyard)
 [![Build status](https://ci.appveyor.com/api/projects/status/3a0j0gww57o6nkpw/branch/master?svg=true)](https://ci.appveyor.com/project/alfpark/batch-shipyard)
-[![Docker Pulls](https://img.shields.io/docker/pulls/alfpark/batch-shipyard.svg)](https://hub.docker.com/r/alfpark/batch-shipyard)
-[![Image Layers](https://images.microbadger.com/badges/image/alfpark/batch-shipyard:latest-cli.svg)](http://microbadger.com/images/alfpark/batch-shipyard)
 
 # Batch Shipyard
 dashboard
@@ -28,13 +26,21 @@ in Azure, independent of any integrated Azure Batch functionality.
 [Kata Containers](https://katacontainers.io/) tuned for Azure Batch compute
 nodes
 * Automated deployment of container images required for tasks to compute
 nodes
+* Support for container registries including
+[Azure Container Registry](https://azure.microsoft.com/services/container-registry/)
+for both Docker and Singularity images (ORAS), other Internet-accessible
+public and private registries, and support for
+the [Sylabs Singularity Library](https://cloud.sylabs.io/library) and
+[Singularity Hub](https://singularity-hub.org/)
 * Transparent support for GPU-accelerated container applications on both
 [Docker](https://github.com/NVIDIA/nvidia-docker) and Singularity on
 [Azure N-Series VM instances](https://docs.microsoft.com/azure/virtual-machines/linux/sizes-gpu)
-* Support for Docker Registries including
-[Azure Container Registry](https://azure.microsoft.com/services/container-registry/),
-other Internet-accessible public and private registries, and support for
-the [Singularity Hub](https://singularity-hub.org/) Container Registry
+* Transparent assist for running Docker and Singularity containers utilizing
+Infiniband/RDMA on HPC Azure VM instances including
+[A-Series](https://docs.microsoft.com/azure/virtual-machines/linux/sizes-hpc),
+[H-Series](https://docs.microsoft.com/azure/virtual-machines/linux/sizes-hpc),
+[Hb/Hc-Series](https://docs.microsoft.com/azure/virtual-machines/linux/sizes-hpc),
+and [N-Series](https://docs.microsoft.com/azure/virtual-machines/linux/sizes-gpu)
 
 ### Data Management and Shared File Systems
 * Comprehensive
 [data movement](https://batch-shipyard.readthedocs.io/en/latest/70-batch-shipyard-data-movement/)
@@ -90,13 +96,8 @@ to accommodate MPI and multi-node cluster applications packaged as Docker or
 Singularity containers on compute pools with automatic job completion and
 task termination
 * Seamless, direct high-level configuration support for popular MPI runtimes
-including OpenMPI, MPICH, MVAPICH, and Intel MPI
-* Transparent assist for running Docker and Singularity containers utilizing
-Infiniband/RDMA for MPI on HPC low-latency Azure VM instances including
-[A-Series](https://docs.microsoft.com/azure/virtual-machines/linux/sizes-hpc),
-[H-Series](https://docs.microsoft.com/azure/virtual-machines/linux/sizes-hpc),
-[Hb/Hc-Series](https://docs.microsoft.com/azure/virtual-machines/linux/sizes-hpc),
-and [N-Series](https://docs.microsoft.com/azure/virtual-machines/linux/sizes-gpu)
+including OpenMPI, MPICH, MVAPICH, and Intel MPI with automatic configuration
+for Infiniband, including SR-IOV RDMA VM sizes
 * Seamless integration with Azure Batch job, task and file concepts along
 with full pass-through of the
 [Azure Batch API](https://azure.microsoft.com/documentation/articles/batch-api-basics/)
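The registries bullet above now advertises SIF image pull from ACR via ORAS alongside the Sylabs Library and Singularity Hub. A hedged sketch of what such pulls look like from the Singularity CLI (registry, repository, and tag are placeholders; `oras://` URI support requires Singularity 3.1+):

```bash
# Pull a SIF image that was pushed to an OCI registry (e.g. ACR) as an
# ORAS artifact; all names here are illustrative, not real endpoints.
singularity pull myimage.sif oras://myregistry.azurecr.io/sif/myimage:1.0

# library:// and shub:// sources remain available as well
singularity pull library://library/default/alpine:latest
singularity pull shub://vsoch/hello-world
```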
@@ -111,9 +112,11 @@ tasks at set intervals
 * Support for
 [Low Priority Compute Nodes](https://docs.microsoft.com/azure/batch/batch-low-pri-vms)
 * Support for deploying Batch compute nodes into a specified
 [Virtual Network](https://batch-shipyard.readthedocs.io/en/latest/64-batch-shipyard-byovnet/)
+and pre-defined public IP addresses
 * Automatic setup of SSH or RDP users to all nodes in the compute pool and
 optional creation of SSH tunneling scripts to Docker Hosts on compute nodes
 * Support for [custom host images](https://batch-shipyard.readthedocs.io/en/latest/63-batch-shipyard-custom-images/)
+including Shared Image Gallery
 * Support for
 [Windows Containers](https://docs.microsoft.com/virtualization/windowscontainers/about/)
 on compliant Windows compute node pools with the ability to activate
 [Azure Hybrid Use Benefit](https://azure.microsoft.com/pricing/hybrid-benefit/)
@@ -134,8 +137,8 @@ and
 [iOS](https://itunes.apple.com/us/app/microsoft-azure/id1219013620?mt=8)
 app. Simply request a Cloud Shell session and type `shipyard` to invoke the
 CLI;
-no installation is required. Try Batch Shipyard now from your browser:
-[![Launch Cloud Shell](https://shell.azure.com/images/launchcloudshell.png "Launch Cloud Shell")](https://shell.azure.com)
+no installation is required. Try Batch Shipyard now
+[in your browser](https://shell.azure.com).
 
 ## Documentation and Recipes
 Please refer to the
diff --git a/convoy/version.py b/convoy/version.py
index 241affc6..1aed36d9 100644
--- a/convoy/version.py
+++ b/convoy/version.py
@@ -22,4 +22,4 @@
 # FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER
 # DEALINGS IN THE SOFTWARE.
 
-__version__ = '3.7.1'
+__version__ = '3.8.0'
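Since Cloud Shell ships the `shipyard` CLI preinstalled, a session can go straight from the browser link above to pool and job submission. A sketch of typical first commands, assuming a hypothetical `~/shipyard-config` directory holding the YAML configuration files:

```bash
# inside an Azure Cloud Shell session; ~/shipyard-config is a placeholder
# directory containing credentials.yaml, config.yaml, pool.yaml, jobs.yaml
shipyard pool add --configdir ~/shipyard-config
shipyard jobs add --configdir ~/shipyard-config
```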
diff --git a/docs/64-batch-shipyard-byovnet.md b/docs/64-batch-shipyard-byovnet.md
index 2770d5f0..50ffbfbe 100644
--- a/docs/64-batch-shipyard-byovnet.md
+++ b/docs/64-batch-shipyard-byovnet.md
@@ -39,24 +39,9 @@ at least the `Virtual Machine Contributor` role permission or a
 * `Microsoft.Network/publicIPAddresses/read`
 * `Microsoft.Network/publicIPAddresses/join/action`
 
-## Public IPs
-For pools that are not internode communication enabled, more than 1 public IP
-and load balancer may be created for the pool. If you are not bringing your
-own public IPs, they are allocated in the subscription that has allocated the
-virtual network. If you are not bringing your own public IPs, ensure that
-the sufficient Public IP quota has been granted for the subscription of the
-virtual network (and is sufficient for any pool resizes that may occur).
+## Virtual Networks
 
-If you are bringing your own public IPs, you must supply a sufficient number
-of public IPs in the pool configuration for the maximum number of compute
-nodes you intend to deploy for the pool. The current requirements are
-1 public IP per 50 dedicated nodes or 20 low priority nodes.
-
-Note that enabling internode communication is not recommended unless
-running MPI (multinstance) jobs as this will restrict the upper-bound
-scalability of the pool.
-
-## `virtual_network` Pool configuration
+### `virtual_network` Pool configuration
 To deploy Batch compute nodes into a subnet within a Virtual Network
 that you specify, you will need to define the `virtual_network` property
 in the pool configuration file. The template is:
@@ -100,7 +85,24 @@ on-premises, then you may have to add to that subnet. Please follow the
 instructions found in this
 [document](https://docs.microsoft.com/azure/batch/batch-virtual-network#user-defined-routes-for-forced-tunneling).
 
-## `public_ips` Pool configuration
+## Public IPs
+For pools that are not internode communication enabled, more than 1 public IP
+and load balancer may be created for the pool. If you are not bringing your
+own public IPs, they are allocated in the subscription that has allocated the
+virtual network. If you are not bringing your own public IPs, ensure that
+sufficient Public IP quota has been granted for the subscription of the
+virtual network (and is sufficient for any pool resizes that may occur).
+
+If you are bringing your own public IPs, you must supply a sufficient number
+of public IPs in the pool configuration for the maximum number of compute
+nodes you intend to deploy for the pool. The current requirements are
+1 public IP per 50 dedicated nodes or 20 low priority nodes.
+
+Note that enabling internode communication is not recommended unless
+running MPI (multi-instance) jobs as this will restrict the upper-bound
+scalability of the pool.
+
+### `public_ips` Pool configuration
 To deploy Batch compute nodes with pre-defined public IPs that you specify,
 you will need to define the `public_ips` property in the pool configuration
 file. The template is:
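The sizing rule in the relocated section (1 public IP per 50 dedicated nodes or 20 low priority nodes) reduces to ceiling division. A quick sketch under the stated ratios; whether dedicated and low priority nodes share load balancers is a service-side detail, so this conservatively sums the two requirements:

```bash
#!/usr/bin/env bash
# Sketch: estimate how many bring-your-own public IPs a pool needs,
# given the documented ratios of 1 IP per 50 dedicated or 20 low
# priority nodes. Node counts are illustrative defaults.
dedicated=${1:-100}    # maximum dedicated nodes planned for the pool
low_priority=${2:-50}  # maximum low priority nodes planned for the pool

# ceiling division: (n + d - 1) / d
ips_dedicated=$(( (dedicated + 49) / 50 ))
ips_low_priority=$(( (low_priority + 19) / 20 ))

echo "public IPs required: $(( ips_dedicated + ips_low_priority ))"
```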