A curated list of cloud HPC resources.
- Solution
- Management Tool
- IaaS-Server
- IaaS-Network
- IaaS-Storage
- IaaS-Image
- PaaS
- CAE and EDA ISV
- SaaS
- Job Scheduler
- Blog, Documentation, YouTube
- Repository
- Amazon Web Services
- Google GCP
- HUAWEI Cloud
- IBM Cloud
- Microsoft Azure
- Naver NCP
- Oracle OCI
- Samsung SDS SCP
- Alibaba E-HPC - Alibaba Cloud's computing service for resource management, job submission, performance analysis, and VNC in the E-HPC console.
- AWS ParallelCluster - Open source cluster management tool for deploying and managing HPC clusters (Repository).
- AWS ParallelCluster UI - Front-end for AWS ParallelCluster.
- Azure CycleCloud - Secure and flexible cloud HPC and Big Compute environments.
- Azure HPC OnDemand Platform - Azure-based HPC cluster solution with features like Terraform, Ansible, Packer integration, job scheduling, autoscaling, and monitoring (Repository, Marketplace).
- CloudyCluster - Turn-key cloud HPC elastic orchestration with a familiar HPC look and feel.
- Cluster in the Cloud - Multi-cloud solution that uses Terraform for infrastructure setup, Ansible for software configuration, and Slurm with custom Python scripts for dynamic node management in cloud-based HPC environments.
- Cluster Toolkit - Google Cloud's open-source software for deploying AI/ML and high-performance computing environments on GCP, featuring customizable Terraform modules and Packer integration (Repository).
- Flight Environment - The Flight User Suite for improved HPC access through CLI tools, the Flight Web Suite as a web interface for HPC end-users, and the Flight Admin Tools for administrative HPC environment configuration.
- HPC-NOW - Platform that aims to simplify the process of starting and managing HPC workloads in the cloud.
- JedAI Cloud - Optimized HPC stacks by Define Tech that enable easy cluster management and on-demand HPC through pre-integrated solutions, delivering bare metal infrastructure, virtualized services, and containerized apps via a single management interface.
- KT Cloud HPC - KT Cloud's HPC management product integrating Altair's solutions.
- Magic Castle - Multi-cloud HPC cluster solution that leverages Terraform and Puppet for deployment, featuring job scheduling with Slurm and over 3000 research software applications.
- Microsoft HPC Pack - Tool for creating and managing HPC clusters, enabling the use of Windows or Linux nodes on-premises and cloud resources in Azure.
- OCI HPC Cluster - Automated HPC cluster deployment on OCI.
- OCI HPC File System (HFS) - Solution for deploying various HPC file servers on OCI.
- SCP HPC Cluster - HPC cluster environment on SCP.
- Scyld Cloud Manager - Comprehensive management platform to cloud-enable Enterprise HPC.
- TrinityX - Next-gen open-source HPC, AI, and cloud platform offering customizable installations with efficient provisioning, SLURM/OpenPBS, OpenHPC, and more for modern cluster management.
- Amazon EC2 Hpc7g - HPC-optimized instances powered by AWS Graviton3E processors.
- Amazon EC2 Hpc7a - HPC-optimized instances powered by 4th Generation AMD EPYC processors.
- Amazon EC2 Hpc6id - HPC-optimized instances powered by 3rd Generation Intel Xeon Scalable processors.
- Amazon EC2 P5 - GPU instances powered by NVIDIA H100 GPUs.
- Amazon EC2 P4 - GPU instances powered by NVIDIA A100 (80 GB, 40 GB) GPUs.
- Amazon EC2 P3 - GPU instances powered by NVIDIA V100 GPUs.
- Amazon EC2 G5 - GPU instances powered by NVIDIA A10G GPUs and 2nd Generation AMD EPYC processors.
- Azure HBv4-series - HPC-optimized instances powered by 4th Generation AMD EPYC processors.
- Azure HBv3-series - HPC-optimized instances powered by 3rd Generation AMD EPYC processors.
- Azure HBv2-series - HPC-optimized instances powered by 2nd Generation AMD EPYC processors.
- Azure HB-series - HPC-optimized instances powered by 1st Generation AMD EPYC processors.
- Azure HC-series - HPC-optimized instances powered by 1st Generation Intel Xeon Scalable processors.
- Azure HX-series - Optimized instances for workloads that require significant memory capacity, with twice the memory capacity of HBv4.
- Azure NDm H100 v5-series - GPU instances powered by NVIDIA H100 GPUs.
- Azure NDm A100 v4-series - GPU instances powered by NVIDIA A100 (80 GB) GPUs and 3rd Generation AMD EPYC processors.
- Azure NC A100 v4-series - GPU instances powered by NVIDIA A100 (40 GB) GPUs and 3rd Generation AMD EPYC processors.
- Azure NCv3-series - GPU instances powered by NVIDIA V100 GPUs.
- Azure NCasT4_v3-series - GPU instances powered by NVIDIA T4 GPUs and 2nd Generation AMD EPYC CPUs.
- GCP H3 machine-series - CPU instances powered by 4th Generation Intel Xeon Scalable processors.
- GCP C2D machine-series - CPU instances powered by 3rd Generation AMD EPYC processors.
- GCP C2 machine-series - CPU instances powered by 2nd Generation Intel Xeon Scalable processors.
- GCP A3 machine-series - GPU instances powered by NVIDIA H100 GPUs.
- GCP A2 machine-series - GPU instances powered by NVIDIA A100 (80 GB, 40 GB) GPUs.
- GCP G2 machine-series - GPU instances powered by NVIDIA L4 GPUs.
- Super Computing Cluster - Based on ECS Bare Metal Instance powered by Alibaba Cloud, utilizes high-speed RDMA-based connections to enhance network performance and acceleration ratio in large-scale clusters, providing high-bandwidth and low-latency networks.
- Azure InfiniBand - RDMA capable HB-series and N-series VMs communicate over the InfiniBand network.
- Compute Clusters (Cluster Networks with Instance Pools) - Group of high performance computing (HPC), GPU, or optimized instances that are connected with a high-bandwidth, ultra low-latency network.
- Elastic Fabric Adapter - Network interface for Amazon EC2 instances that enables customers to run applications requiring high levels of inter-node communications at scale.
- Amazon FSx for Lustre - Fully managed shared storage with the scalability and performance of the popular Lustre file system.
- Amazon FSx for OpenZFS - Fully managed shared storage built on the popular OpenZFS file system.
- Azure HPC Cache - File caching for HPC on Azure.
- Azure Managed Lustre - Managed, pay-as-you-go file system for high-performance computing (HPC) and AI workloads.
- Azure NetApp Files - Enterprise-grade Azure file shares, powered by NetApp.
- GCP File Store - High-performance, fully managed file storage.
- GCP Parallel Store - Based on Intel DAOS and delivers up to 6.3x greater read throughput performance compared to competitive Lustre scratch offerings.
- Azhpc-images - Installation scripts for HPC images in Azure Marketplace, specifically CentOS-HPC, Ubuntu-HPC, and AlmaLinux-HPC.
- Flight Solo - HPC-ready, platform-agnostic image approach to deploying HPC resources powered by alcesflight.
- GCP HPC-ready VM - CentOS 7.9 or Rocky Linux 8 based VM image that is optimized for tightly coupled HPC workloads (Marketplace CentOS 7, Marketplace Rocky Linux 8).
- HPCBOX - Desktop-centric, intelligent workflow cloud HPC platform for automating and executing your application pipelines (Marketplace).
- HPC Pack 2019 (Cloud Infrastructure Services) - Microsoft HPC Pack 2019 image powered by Cloud Infrastructure Services (Marketplace (Azure), Marketplace (AWS), Marketplace (GCP)).
- NVIDIA Virtual Machine Images - Operating system environment for running NVIDIA GPU accelerated software in the cloud.
- AWS Batch - Fully managed batch computing service.
- AWS Parallel Computing Service - Managed service for HPC cluster deployment and scaling on AWS using Slurm.
- Batch(Azure) - Cloud-scale job scheduling and compute management.
- Batch(GCP) - Fully managed batch service to schedule, queue, and execute batch jobs on Google's infrastructure.
- Batch Compute - Cloud service for massive simultaneous batch processing on Alibaba Cloud.
- Covalent - Pythonic workflow orchestration platform for scaling workloads from your laptop to any compute backend (Repository).
- Amazon DCV - High-performance remote display protocol that provides customers with a secure way to deliver remote desktops and application streaming.
- NI SP EF Portal - Unified interface to submit jobs for both on-premises and cloud workflows.
- Research and Engineering Studio - Open source, easy-to-use web-based portal for administrators to create and manage secure cloud-based research and engineering environments on AWS.
- Rntier Cloud - R&D cloud platform enabling easy and quick access to complex HPC simulations, vGPU-based remote 3D design, and multi-GPU deep learning environments via a web browser.
- Scyld Cloud Central™ - Fully managed, cloud-based, end-to-end solution for high performance computing that makes it easier and faster for end-users, developers, and data scientists to deploy pure HPC, pure AI, and converged HPC/AI workloads on high-performance clusters.
- Scyld ClusterWare - Intelligent suite of management functionality, including node provisioning, image customization, and cluster monitoring, while serving as a platform for additional software and schedulers.
- Scyld Cloud Workstation - Unparalleled performance and a breadth of features that allow it to stand out as a solution for remote access.
- Skypilot - Framework for running LLMs, AI, and batch jobs on any cloud, offering maximum cost savings, highest GPU availability, and managed execution (Repository).
- CloudHPC - On-demand cloud computing for CAE engineering simulations powered by CFD FEA SERVICE.
- dicehub - Real-time collaborative CFD (Computational Fluid Dynamics) simulation platform that simplifies your engineering workflow, offers massive parallel scaling, and runs in a web browser.
- EPIC - Web-based platform created by Zenotech, primarily for CFD applications, which also includes Zenotech's ZCFD.
- Kaleidosim - Browser-based access to HPC software through advanced cloud orchestration technology.
- Luminary Cloud - A cloud-based, pay-per-use SaaS simulation platform with a fast, GPU-powered, cloud-native CFD solver and comprehensive high-fidelity capabilities.
- Nimbix - A comprehensive cloud computing solution powered by Atos, offering access to the HyperHub Application Marketplace with over 1,000 high-performance applications and workflows for diverse industries (Repository).
- OnScale Solve - The cloud engineering simulation platform built by engineers for engineers.
- Rescale - Hybrid-cloud platform offering turnkey HPC with extensive (1000+) ecosystem integrations and API connections to major PLM, SPDM, and data storage systems (Repository).
- Sabalcore - User-friendly, pay-as-you-go high performance computing cloud service with a full-featured, light-weight client that doesn't require a browser.
- Scala Computing - Optimized, automated cloud-based HPC resource management platform with integrated network simulation and EDA tools, offering flexible, on-demand computing, secure workflows, and global infrastructure access.
- Simscale - Cloud-based computer-aided engineering (CAE) software for computational fluid dynamics, finite element analysis, and thermal simulations, using open source codes in its backend (Repository(SDK)).
- SyncHPC - Powerful and flexible hybrid HPC and VDI management platform that provides a comprehensive solution for managing high-performance computing (HPC) and Virtual Desktop Infrastructure (VDI) resources.
- TAESUNG Cloud - Ansys applications offered as a cloud-based SaaS.
- 3DEXPERIENCE platform on the cloud - Complete suite of industry-leading apps and software (CATIA, SIMULIA, DELMIA, 3DEXCITE, etc.) powered by Dassault Systèmes.
- Altair One - Cloud Gateway offering dynamic and collaborative access to simulation and data analytics technology, along with scalable HPC and cloud resources.
- Altair Unlimited - A turnkey, state-of-the-art private appliance available in both on-premises and cloud-based formats, offering unlimited access to a wide range of Altair HyperWorks solver software.
- Ansys Access on Microsoft Azure - Cloud-based simulation solution available on the Azure Marketplace, offering fast, scalable access to Ansys applications (Marketplace).
- Ansys Cloud Direct - Cloud-based interactive workstations and HPC clusters, with flexible licensing that can be accessed from desktop.
- Ansys Gateway by AWS - Cloud-based solution for managing Ansys Simulation & CAD/CAE developments via a web browser.
- Cadence OnCloud Platform - SaaS software platform for all your system design and simulation needs that can operate on any hardware, removing the requirement to run and maintain expensive infrastructure hardware.
- Cloud Passport - Cloud-ready tools powered by Cadence that have been optimized for use in customers' own cloud environment.
- Managed Cloud Service - EDA-optimized platform powered by Cadence that provides a fully integrated and proven cloud environment to jump-start product design, verification, and implementation.
- Palladium and Protium Cloud - Emulation and prototyping offering powered by Cadence that provides pre-silicon hardware system verification and debug.
- Simcenter Cloud HPC - Part of the Xcelerator as a Service (XaaS) offering powered by Siemens, providing increased flexibility and scalability for CFD simulations with no additional setup needed.
- Synopsys Cloud - Platform that enables delivery of EDA tools, IP and infrastructure for end-to-end chip design through a browser.
- Altair Access - HPC Job Submission Portal for Researchers and Engineers.
- Altair Control - HPC Administrator's Control Center for Managing, Optimizing, and Forecasting Resources with seamless cloud bursting capabilities.
- Altair Grid Engine - Distributed Resource Management and Optimization.
- Altair HPCWorks - High-Performance Computing (HPC) and Cloud Platform by Altair.
- Altair NavOps - Cloud Migration, Automation, and Spend Management for HPC.
- Altair PBS-Professional - Industry-leading Workload Manager and Job Scheduler for HPC and High-throughput Computing.
- IBM Spectrum LSF Suites - Workload management platform and job scheduler for HPC with dynamic HPC cloud support for all major cloud providers (Repository).
- Slurm on Google Cloud Platform - Open-source software solution that enables setting up Slurm clusters on Google Cloud Platform with ease.
- Slurm Power Saving Guide - Guide to suspending and resuming nodes as needed, with support for cloud integration with providers like AWS, GCP, and Azure for workload management and cloud bursting.
- Day 1 HPC - AWS engineering's HPC community site.
- HPC Tech Shorts - Day 1 HPC YouTube Channel.
- Azure HPC - Easy automation scripts for building an HPC environment in Azure.
- Cloud MPI - Collection of scripts for optimizing MPI performance in tightly coupled HPC workloads on GCP Compute Engine.
- Dynamic EC2 budget control - Dynamic EC2 core allocation limit for each business unit (BU), automatically adapted according to spending over a past time frame (e.g. one week) on AWS ParallelCluster.
- HPC Recipes for AWS - Example recipes that demonstrate how to build HPC systems using AWS ParallelCluster, Research and Engineering Studio on AWS, and other AWS products.