Simplify HPC and Batch workloads on Azure
Fred Park 074ca55a1c
Extend retries on package manager contention
2019-11-01 15:00:58 +00:00
.github Fix nvidia-docker2 installation 2018-07-20 09:14:10 -07:00
.vsts Minor post-release fixes 2019-08-14 15:13:28 +00:00
cargo Update drivers and dependencies 2019-09-11 17:51:13 +00:00
cascade MCR migration 2019-08-14 03:23:03 +00:00
config_templates Bring your own Public IP support 2019-08-14 03:23:09 +00:00
contrib Update docs for Shared Image Gallery support 2019-06-27 20:49:37 +00:00
convoy Add FI_PROVIDER to Intel MPI ofi fabrics 2019-10-25 17:55:33 +00:00
docs Warn/error when mixing auto_scratch with autoscale 2019-10-22 16:58:38 +00:00
federation Update drivers and dependencies 2019-09-11 17:51:13 +00:00
heimdall Update drivers and dependencies 2019-09-11 17:51:13 +00:00
images Minor post-release fixes 2019-08-14 15:13:28 +00:00
recipes Fix remotefs bootstrap 2019-10-09 15:48:40 +00:00
schemas Bring your own Public IP support 2019-08-14 03:23:09 +00:00
scripts Extend retries on package manager contention 2019-11-01 15:00:58 +00:00
site-extension Update Function App guide 2019-10-01 15:55:41 +00:00
slurm Update drivers and dependencies 2019-09-11 17:51:13 +00:00
.gitattributes Add AppVeyor build 2017-08-10 10:29:22 -07:00
.gitignore Allow CentOS 7.3 on NC/NV 2017-07-06 11:12:05 -07:00
.travis.yml Minor post-release fixes 2019-08-14 15:13:28 +00:00
CHANGELOG.md Fix native output_data to observe full remote path 2019-09-13 19:56:57 +00:00
CODE_OF_CONDUCT.md Update docs 2017-08-29 08:04:30 -07:00
CONTRIBUTING.md Update docs 2017-08-29 08:04:30 -07:00
LICENSE Add dummy README 2016-07-18 08:15:56 -07:00
README.md Tag for 3.8.0 release 2019-08-14 03:23:09 +00:00
THIRD_PARTY_NOTICES.txt Minor post-release fixes 2019-08-14 15:13:28 +00:00
appveyor.yml Fix AppVeyor issue 2019-10-23 15:10:02 +00:00
install.cmd Update dependencies 2018-11-05 11:24:08 -08:00
install.sh Update build to Python 3.7.1 2018-10-30 14:24:31 -07:00
mkdocs.yml Bring your own Public IP support 2019-08-14 03:23:09 +00:00
req_nodeps.txt Update dependencies 2018-11-05 11:24:08 -08:00
requirements.txt Update drivers and dependencies 2019-09-11 17:51:13 +00:00
shipyard.py Fix prefix filter on task factory remote_path 2019-08-29 16:53:47 +00:00

README.md

Batch Shipyard

Batch Shipyard is a tool to help provision, execute, and monitor container-based batch processing and HPC workloads on Azure Batch. Batch Shipyard supports both Docker and Singularity containers. No experience with the Azure Batch SDK is needed; run your containers with easy-to-understand configuration files. All Azure regions are supported, including non-public Azure regions.

Additionally, Batch Shipyard provides the ability to provision and manage entire standalone remote file systems (storage clusters) in Azure, independent of any integrated Azure Batch functionality.

Major Features

Container Runtime and Image Management

Data Management and Shared File Systems

Monitoring

Open Source Scheduler Integration

Azure Ecosystem Integration

Azure Batch Integration and Enhancements

  • Federation support: enables unified, constraint-based scheduling to collections of heterogeneous pools, including across multiple Batch accounts and Azure regions
  • Support for simple, scenario-based pool autoscale and autopool to dynamically scale and control computing resources on demand
  • Support for Task Factories with the ability to generate tasks based on parametric (parameter) sweeps, randomized input, file enumeration, replication, and custom Python code-based generators
  • Support for multi-instance tasks to accommodate MPI and multi-node cluster applications packaged as Docker or Singularity containers on compute pools with automatic job completion and task termination
  • Seamless, direct high-level configuration support for popular MPI runtimes including OpenMPI, MPICH, MVAPICH, and Intel MPI with automatic configuration for InfiniBand, including SR-IOV RDMA VM sizes
  • Seamless integration with Azure Batch job, task and file concepts along with full pass-through of the Azure Batch API to containers executed on compute nodes
  • Support for Azure Batch task dependencies allowing complex processing pipelines and DAGs
  • Support for merge or final task specification that automatically depends on all other tasks within the job
  • Support for job schedules and recurrences for automatic execution of tasks at set intervals
  • Support for live job and job schedule migration between pools
  • Support for Low Priority Compute Nodes
  • Support for deploying Batch compute nodes into a specified Virtual Network and with pre-defined public IP addresses
  • Automatic setup of SSH or RDP users on all nodes in the compute pool and optional creation of SSH tunneling scripts to Docker Hosts on compute nodes
  • Support for custom host images including Shared Image Gallery
  • Support for Windows Containers on compliant Windows compute node pools with the ability to activate Azure Hybrid Use Benefit if applicable
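
As a rough illustration of two features listed above (parametric sweep task factories and a merge task that depends on every other task in the job), the following Python sketch expands a command template over the Cartesian product of integer ranges and then appends a final task that depends on all generated tasks. It is not Batch Shipyard's implementation and does not use its configuration format: the function names, the task dictionary layout, the task id scheme, and the `train.py`/`collect.py` commands are all hypothetical and chosen only for the example.

```python
from itertools import product


def expand_sweep(command_template, ranges):
    """Yield one command per point in the Cartesian product of the ranges.

    `ranges` is a list of (start, stop, step) tuples; `stop` is exclusive,
    following Python's built-in range(). This layout is an assumption made
    for the sketch, not a Batch Shipyard data structure.
    """
    axes = [range(start, stop, step) for start, stop, step in ranges]
    for point in product(*axes):
        # Positional placeholders {0}, {1}, ... are filled per sweep point.
        yield command_template.format(*point)


def build_job(command_template, ranges, merge_command):
    """Return (tasks, merge_task) for a sweep followed by a merge step."""
    tasks = [
        {"id": f"task-{i:05d}", "command": cmd, "depends_on": []}
        for i, cmd in enumerate(expand_sweep(command_template, ranges))
    ]
    # The merge task lists every generated task as a dependency, so a
    # scheduler would only run it after all sweep tasks have completed.
    merge_task = {
        "id": "merge-task",
        "command": merge_command,
        "depends_on": [task["id"] for task in tasks],
    }
    return tasks, merge_task


# Hypothetical commands: a 3 x 2 sweep yields six tasks plus one merge task.
tasks, merge = build_job(
    "python train.py --lr-index {0} --seed {1}",
    [(0, 3, 1), (10, 30, 10)],
    "python collect.py",
)

for task in tasks + [merge]:
    print(task["id"], "|", task["command"], "|", len(task["depends_on"]), "deps")
```

In Batch Shipyard itself, sweeps and merge tasks are declared in the jobs configuration file rather than written as code; see the project documentation for the actual property names.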

Installation

Local Installation

Please see the installation guide for more information regarding the various local installation options and requirements.

Azure Cloud Shell

Batch Shipyard is integrated directly into Azure Cloud Shell, so you can run any Batch Shipyard workload from your web browser or from the Microsoft Azure app for Android and iOS.

Simply request a Cloud Shell session and type shipyard to invoke the CLI; no installation is required. Try Batch Shipyard now in your browser.

Documentation and Recipes

Please refer to the Batch Shipyard Documentation on Read the Docs.

Visit the Batch Shipyard Recipes section for various sample container workloads using Azure Batch and Batch Shipyard.

Batch Shipyard Compute Node Host OS Support

Batch Shipyard is currently compatible with popular Azure Batch supported Marketplace Linux VMs, compliant Linux custom images, and native Azure Batch Windows Server with Containers VMs. Please see the platform image support documentation for more information specific to Batch Shipyard support of compute node host operating systems.

Change Log

Please see the Change Log for project history.


Please see this project's Code of Conduct and Contributing guidelines.